I’ve spent a lot of time struggling to get this board to work. My issue is that it isn’t recognising its own onboard PSRAM… I’ve contacted the seller but… China.
Once I get it to see the PSRAM (which I think is simply an ESPHome code/config thing) it should work.
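For anyone in the same spot, this is a minimal sketch of the PSRAM part of an ESPHome config, in case it really is just a config thing. It assumes the ESP-IDF framework and a 2MB quad-SPI PSRAM part; 8MB (R8) modules are usually octal instead. The board name is only a placeholder, so treat every value here as an assumption to check against your own hardware.

```yaml
# Sketch only: the values below are assumptions, not a confirmed working config.
esp32:
  board: esp32-s3-devkitc-1   # placeholder board name; use the one matching your hardware
  framework:
    type: esp-idf             # assumed framework for this sketch

psram:
  mode: quad                  # assumed for 2MB (R2) parts; 8MB (R8) parts are typically octal
  speed: 80MHz                # assumed clock; lower it if the board is unstable
```

If the setting matches the chip, the ESPHome log should report the PSRAM at boot. If the mode is wrong, the device will likely either fail to start or keep reporting no PSRAM.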
In the meantime I’ve ordered some of these which apparently do work. I’ll be able to confirm once they arrive.
with the microphone-only version of the firmware from here.
I have tried the firmware with the speaker and it appears not to recognise the wake word with that firmware. I have not had much time to play with the speaker yet. I have also ordered some N16R8 S3s, as the memory size is apparently important. This is not my field of expertise but I am making progress.
That is very similar to the one I’m trying to get working (ESP32-S3FH4R2) but with no luck. BigBobbas has been helping me but the ESPHome log shows the device not seeing that it has PSRAM…
UPDATE: I re-tried using the same GPIO as this example and it works now. There was obviously some strange conflict with the GPIO I had selected. So now I can safely say that the boards I linked earlier do work.
Thanks! The developer who mentioned that it didn’t work on any S3 with PSRAM did mention memory differences between models being one of the issues. I thought it might be that different versions had memory from different manufacturers or something (not my area of expertise either), but it sounds like he meant that it requires a specific amount of PSRAM: that board only has 2MB, with 8MB being ideal and 4MB possibly enough. I’ll stick to 8MB, as the price difference is maybe 2 dollars, if there is even an option to choose the same model with different amounts.
Also thanks to the other posters and links, it’s good to know the devkit and wroom-1 appear to work as long as they have enough PSRAM. It really sounds like that’s the deciding factor but obviously more boards need to be tested. They did mention to post any boards/models users get working. I’m guessing Discord is the place to post that information if you do get it to work on a board that hasn’t already been confirmed. Thanks again!
I’ve no idea if this is only happening on my device, but the wake word stops responding over time. This isn’t a new issue; it was happening on the old build without local wake word.
It appears to happen over time and I have to restart the device. I’ve not seen it reported on the issue trackers, so I’m leaning towards it being an issue with my s3box3.
There’s also an audible “pop” every now and then. I assume this is the microphone becoming active and is normal, but it may or may not be related.
Well, I just ordered one. I’ll let everyone know how it works out. On paper it should work, but we all know that doesn’t always work out. I just happened to search Amazon and they have them in the US store for the same price. My main issue was ordering from AliExpress and having to deal with a return if it didn’t work, but Amazon will take anything back, so if it doesn’t work I’ll just send it back for a refund. Only 7 left in stock. Not sure about the UK store.
If you live in the UK then you can buy an ESP32-S3-BOX-3 from my store.
I only have a limited number and no idea how popular they are going to be. I hope people will understand it’s the best price I can do it for, with all of the effort that’s gone into the site etc.
I attempted to compile the code for the s3-box-3 and the compile terminated with the following error. Any ideas?
...
Compiling .pioenvs/esp32-voice-node-5a9788/src/esphome/components/micro_wake_word/micro_wake_word.o
Compiling .pioenvs/esp32-voice-node-5a9788/src/esphome/components/network/util.o
In file included from src/esphome/components/micro_wake_word/micro_wake_word.cpp:1:
src/esphome/components/micro_wake_word/micro_wake_word.h:19:10: fatal error: tensorflow/lite/core/c/common.h: No such file or directory
#include <tensorflow/lite/core/c/common.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
*** [.pioenvs/esp32-voice-node-5a9788/src/esphome/components/micro_wake_word/micro_wake_word.o] Error 1
========================= [FAILED] Took 288.57 seconds =========================
Removing all the YAML and trying again fixed it and it compiled. After adding my esp32-s3 to HA, should I expect to be able to add it to my assist pipeline at the bottom under wake word? It just says I don’t have a wake word engine set up yet.
This is mostly not my work, and the included .h files used for Arduino by Espressif, specifically for the korvo-1, could use some attention to detail. It works for the S3 korvo-1 though.
I just got my Box3, and when it’s working I like it. It does seem to cut off or misunderstand words, though. For instance, when I say “turn on the den” I get an error saying something like “can’t find the device Din”.
My main concern, however, is that the wake word “Ok Nabu” only works around 20% of the time on the first try, and I usually have to say it 3-5 times to get it to wake.
I’m still surprised about all the satellite talk when basically everyone has a mobile phone and most have tablets. So why spend cash on some satellites if all we really need is wake word support in the companion app?
Would even work on Android TVs, Android cars etc. etc.
No cost, lots of processing power and readily available everywhere.
I guess you do not have kids or other family members who do not always carry their phone everywhere (if they even have one, which I am sure most small kids do not). Even though I carry my phone almost all the time, I personally still use our existing Google Nest / Google smart speakers a lot for hands-free voice control.
I believe the most common use cases are when and where hands-free operation makes sense, for example in the kitchen while your hands are busy: controlling lights (brightness or ON/OFF), setting and operating timers, reminders or alarms, adding stuff to shopping lists or to-do lists, and music controls.
Regardless, there are several use cases that appeal to mainstream users, and that is why Google and Amazon have each sold more than 500,000 Google Nest / Google Home and Amazon Echo / Alexa smart speakers so far.
Check out the results of this wish list poll once you have done it yourself:
Not sure what you are trying to show me with that poll. It is about features, not hardware. And wake word support is one of the top priorities there.
I do not have kids, but I do not need kids to know that I have multiple Android and iOS devices in my household. And I do not need to be the owner of the iOS devices to be able to use them for voice control.
So the point is that if the companion app supported wake words, anybody in my home could enter any room that has any Android or iOS device lying around and give voice commands.
The device just needs to be in hearing distance.
And you are quoting sales for Echo, Alexa etc. of 500 k. 500 k units is not that much. And very few people here want to share all their data with big tech.
So a lot of people are buying more or less expensive satellite hardware. They are fun to play with but they have little future. They will lie around somewhere in a year or two because they are too bulky or too slow. Or because people realize that they need to buy one per room because they are not as mobile as all our Android and iOS phones and tablets.
So, sure, you can buy lots of dedicated hardware for a task that really just needs a mic and speaker. Or you could use what everybody already owns; most people even have old devices lying around (I still have my Samsung S2 and S6 edge). So my wife and I currently own more than 6 perfectly fine “satellites” in the form of phones and tablets, just waiting to be used as local voice controls.
Cost for 6 ESP S3 boxes? A couple of hundred euros. For what? A big, bulky satellite with inferior screen and sound compared to a mobile phone or tablet.
Alex, you do what works best for you. It’s great if you prefer to use the Android app rather than set up satellite devices. That is why there are several options.
Personally I find that getting my phone out, logging in to the phone, and starting the app before I can turn a light on or off … is more painful than getting out of my chair and walking to the light switch. But speaking a command seems so easy … all I need is for it to work reliably.
The idea is to use wake words on the locked phone.
Or have devices like old phones or tablets remain unlocked.
Or speaking to my TV while I am watching.
You can already control all your devices from the locked phone by using the tiles. Now it is “just” necessary to add wake words.
And the computing power of a 50 € tablet is much higher than that of a 50 € ESP device. The mic and speaker are also better. Imagine just hanging a bunch of Fire HD tablets on your walls and speaking to them. It would look much nicer than ESP devices and offer nice big screens and great touch control.