ESP32 S3 Box 3: Why is this so difficult?!

I had a startlingly easy go of it with my BOX unit (non S3). A couple weeks ago, I did the following:

  • Selected my device type

  • Hit Connect, selected proper COM port, & hit Install.

A couple minutes later, I had a working voice assist satellite device, with working mic, speaker, and cutesy display.

I immediately asked it “Hey Nabu, turn off XX lights”
And much to my surprise, without doing any other config, IT DID IT!
(+ other voice requests)

I haven’t really played with it much since, & don’t know if it’ll have any issues with newest ESPHome code.

But I was absolutely flabbergasted that it came up so easily.

I’m still doing HA voice control here with Alexa and/or Siri. But I’m now convinced that HA Voice Assist is much farther along than I had thought, & I can seriously think about starting to move to it anytime.

You can find all the YAML code that it’s using, via the GitHub link near bottom of page – firmware/voice-assistant at main · esphome/firmware · GitHub

Follow-up: If anyone uses the Ready Made Projects to build a voice satellite with M5Stack Atom Echo device, I’d have to agree that unit is more toy like. Almost un-hearable speaker volume. And no wake word – you have to push its button for Push-to-Talk functionality (apparently not enough horsepower to process wake word on device). And mic didn’t always seem to pick up voices clearly.

But even so, it IS a functional voice assist satellite, which worked immediately after loading firmware. (And is cheap way to experiment)

Thanks for that! I tried that but that ended up breaking the microphone.

I’m in the death spiral of attempting to install as well. I’ve followed the instructions here:

ESP32-S3-BOX voice assistant - Home Assistant (home-assistant.io)

which don’t work as there is never an option to set the WIFI credentials.

I tried this: firmware/wake-word-voice-assistant/esp32-s3-box-3.yaml at main · esphome/firmware · GitHub but it was rejected as being too large.

Anyone with a suggestion on how to proceed with the ESP32-S3-BOX-3 variant?

I agree, it shouldn’t be this hard. This is a bad look for the Home Assistant/ESPHome ecosystem.

Look here

I had a similar issue on Arch Linux. So this may not be relevant to you. I found that after the device got flashed the permissions on the USB reverted back and I didn’t have access anymore. As a hack I opened a background terminal and did:

while :; do chmod 666 /dev/ttyACM0; sleep 2; done

Replace ACM0 with your device name. Keep it running until you’re all done, then press Ctrl+C.

FWIW, I couldn’t get this to work using Windows 11… it did work using @pepe59’s suggestion on Ubuntu/Chrome. No clue as to why.

I think you need to pin ESPHome to 2024.4.x to fix it.
It works in my branch under GitHub - D3SOX/ESPHome-firmware: Holds firmware configuration files for projects that the ESPHome team provides.

This is really disappointing. There is a lot of blabla going on about voice assist, so I bought this very nice ESP32-S3-BOX-3 (the M5Stack Echo was a rather disappointing experience), got the YAML code into ESPHome in HA, and I ended up with a non-working speaker. It does process voice commands, not as well as Google, but that’s OK for now.
While compiling the code, the log screen is flooded with messages / warnings / errors. How can this go unnoticed by the creators?
I followed the advice from @smcnaught to switch back to ESP framework version 4.4.6; that works for me, for now :)

Hello, where exactly should I write these few lines that restore ESPHome to the previous version?
Do they go at the beginning or the end of /homeassistant/esphome/esp32-s3-box-3-5a93fc.yaml?
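
For reference, the framework downgrade discussed above might look roughly like the excerpt below. This is only a minimal sketch, assuming the device YAML already contains an `esp32:` block built on the ESP-IDF framework. The `board` value is an assumption and should stay as whatever your existing config uses; the only change this thread actually describes is the `version` line.

```yaml
# Hypothetical excerpt from the device YAML; merge into the existing esp32: block
esp32:
  board: esp32s3box        # assumption: keep the board your config already uses
  framework:
    type: esp-idf          # assumption: the voice-assistant config uses ESP-IDF
    version: 4.4.6         # framework version reported to work in this thread
```

Top-level YAML keys are not order-sensitive, so the position in the file should not matter, as long as the file ends up with a single `esp32:` block rather than two.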

The source at ESPHome-firmware/voice-assistant/esp32-s3-box-3.yaml at 0fe1bcec60bfc415dc793357783a28685e95dc1e · D3SOX/ESPHome-firmware · GitHub is already patched; use that one.

Thanks! Do I need to add anything myself, like the Wi-Fi password?

I finally got my ESP32-S3-BOX-3 and whacked the standard yaml on with only two changes:

micro_wake_word_model: hey_jarvis

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

I have 3 separate issues:

  1. Speaker doesn’t work. Known issue and I’ll try downgrading firmware to 4.4.6 as recommended above.

  2. Microphone is pretty bad. I tried talking to it from 2m away in a quiet room and it didn’t hear me. I had to be 0.5m away and even then it didn’t hear me sometimes. Is this also a bug in recent firmware?

  3. It takes aaaaaaaaaaages to acknowledge my command. I know my back-end (whisper & piper) is fine because if I trigger Assist on my phone and say “turn on lights”, it takes 3-4s at most. When I do it via my ESP32-S3-BOX-3, it takes around 20s! Could this be a configuration issue or just the way it is?

Appreciate any guidance with this!

EDIT: Setting ESP32 framework version to 4.4.6 broke my microphone too, until I downgraded my ESPHome docker container to version 2024.4.2. Now both mic and speaker work, and it takes 4-5s to complete a simple command, which is comparable to using my phone.
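
For anyone running ESPHome under Docker Compose, the container downgrade described in this edit could be sketched as below. This is an assumption-laden example, not the poster's actual setup: the service name, the `./config` path, and the networking options are illustrative, and the image tag simply mirrors the 2024.4.2 version mentioned here.

```yaml
# Hypothetical docker-compose.yml; service name and paths are illustrative
services:
  esphome:
    image: ghcr.io/esphome/esphome:2024.4.2   # pinned version instead of the default latest tag
    volumes:
      - ./config:/config                      # assumption: local directory holding the device YAML files
    network_mode: host
    restart: unless-stopped
```

Pinning the tag means a later image pull will not silently move the container back to a newer ESPHome release.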

So the primary issue I’m left with is the wake word. It just doesn’t seem responsive at all, I always have to say it about 5-6 times even when only 20cm away. Any tips for this?

Any updates on this? I just set one up and no sound.

Edit - Has sound but extremely muted. I can’t figure out how to adjust volume, nor use as a smart speaker.

Late June and I’m stuck in exactly the same frustrating position. Bought an S3-Box-3 because HA were effectively pushing it on the site. Voice Assistant initially worked, then after an hour or so the microphone stopped responding. I managed to get the temperature, humidity and (terrible) presence sensors working, but nothing I do seems to bring back a functional microphone. So frustrating.

Same here, 3 of them “bricked” and waiting for the fixed upgrade.
Hopefully it’s not a hardware problem.

Look at this firmware, it’s great:

BigBobbas/ESP32-S3-Box3-Custom-ESPHome: Custom ESPHome config for ESP32-S3-Box-3 with sensors and touchscreen (github.com)

Much better, now there is a touchscreen with controls.
Thank you, will play with it and configure buttons.

Voice however is not working still, I’m beginning to think the HASS / Openwakeword or something there is bugged at the moment on my side.

Either way, I always wanted it to be a control screen primarily. Was disappointed to see that the HASS Voice firmware was a pure screen only with no templates like this firmware has.
Appreciated.

I’ve been having the same issues. In their demos it seems to work flawlessly; I wonder if there are other things running that are taking up memory.
Additionally, when I changed the wake word model it did this weird thing where it wouldn’t respond to ‘hey Jarvis’ alone. I had to say ‘ok Nabu’ right after saying ‘hey Jarvis’ to get it to start listening; if I kept repeating either one of those on its own, it wouldn’t work.