ESP32 S3 Box 3: Why is this so difficult?!

I’m not sure I’ve done any project in HA that has caused more teeth gritting than getting this box to work! I’m working on getting voice assistant cranking. I’ve made lots of progress, even getting it working flawlessly with ChatGPT! I just need to start on a dedicated voice assistant box. I’m starting with this Box 3 before I move into the more exciting realm of ripping apart my Google Home pucks.

I got the box bootstrapped and on my wifi by the usual means with no real fuss. I installed the Box 3 specific firmware with the short-and-sweet YAML that has the include from GitHub and my wifi secrets linking.

The box was deaf out of the gate. The microphone wasn’t working correctly. I wandered through searches and found some things to add in after pulling the whole config YAML down to a local copy. I got it to compile successfully, AND the microphone worked once! I thought I was on my way. Then on a reboot, no more microphone.

I toiled away for days after that with little change. I am also plagued by a very routine occurrence: I give the install command and the compile crashes. I do nothing except hit retry and the compile happily proceeds.
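For anyone compiling from the ESPHome command line rather than the dashboard, the "hit retry" routine can be sketched as a small wrapper. This is only an illustration: the `retry` function and the YAML filename in the usage comment are made up for the example, and it assumes the `esphome` CLI is installed. It does nothing about whatever is causing the crash.

```shell
#!/bin/sh
# retry MAX CMD [ARGS...]
# Run CMD until it succeeds or MAX attempts have been used.
# Hypothetical helper for illustration, not part of ESPHome.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example usage (the filename is a placeholder for your own config):
# retry 3 esphome compile esp32-s3-box-3.yaml
```

If the compile still fails after several attempts, the first failing log is probably more useful to post than the retry count.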

An ESPHome update came out yesterday with lots of fixes, so I thought I’d give it a shot since I had tried everything else. Voilà! The microphone works! It hears me, processes the speech-to-text and responds…but…now there is no speaker! I see it sending the reply .wav and the Box 3 playing it, but no sound comes out. I reverted to the “pure stock” version that is on GitHub but got the same result.

Given the relatively small number of conversations on this, others seem to have gotten this sucker working. Can anyone give me some advice? I’m at my wits’ end. I’m not a newbie at ESPHome; I am running esp32 powermon and ratgdo without a single issue. It’s just the Box that is getting me!

I’m sure the first question will be “what’s your config yaml?” So here it is:

  substitutions:
    name: esp32-s3-box-3-5aacd8
    friendly_name: ESP32 S3 Box 3
  packages:
    esphome.voice-assistant: github://esphome/firmware/wake-word-voice-assistant/esp32-s3-box-3.yaml@main
  esphome:
    name: ${name}
    name_add_mac_suffix: false
    friendly_name: ${friendly_name}

(followed by the api key and wifi references of course)

Any advice appreciated!

I did have it working, but after the latest ESPHome upgrade to 2024.5.0 it has broken. I have not troubleshot the issue yet, but when I find the solution I will post an update here.

Others are reporting the issues as well with no documented solution as of yet.

Thanks for the reply. Late yesterday I started to see people with the same problem. That actually makes me happy because I know it’s not something that I borked. In that regard, misery loves company. I’ll hang out and wait for updates.

I used the Willow add-on. Problem free so far.

I don’t know why you guys are wasting your time on this. It’s all good if you like science experiments. The HA voice stuff just is NOT ready for prime time. Try Willow. I wrote up a detailed doc explaining how to get it working and IT WORKS. Literally 15 minutes to set it up…

Use the link Stiltjack posted.

I think you are misunderstanding what my ask is. Willow still needs a voice “box” for you to talk to. I already have the voice assistant part of HA working just fine, I’m struggling with that box.

ESP32 S3 Box 3 is the voice box. Speech recognition is very good indeed.

No speaker sound here as well, just a “click”.
Watch the log while compiling; there seem to be a lot of issues still to be solved.

Short-term fix: downgrading to an older version of esp-idf:

  esp32:
    board: esp32s3box
    flash_size: 16MB
    framework:
      type: esp-idf
      version: 4.4.6

I had a startlingly easy go of it with my BOX unit (non S3). A couple of weeks ago, I did the following:

  • Selected my device type

  • Hit Connect, selected proper COM port, & hit Install.

A couple minutes later, I had a working voice assist satellite device, with working mic, speaker, and cutesy display.

I immediately asked it “Hey Nabu, turn off XX lights”
And much to my surprise, without doing any other config, IT DID IT!
(+ other voice requests)

I haven’t really played with it much since, & don’t know if it’ll have any issues with newest ESPHome code.

But I was absolutely flabbergasted that it came up so easily.

I’m still doing HA voice control here with Alexa and/or Siri. But I’m now convinced that HA Voice Assist is much farther along than I had thought, & I can seriously think about starting to move to it anytime.

You can find all the YAML code that it’s using, via the GitHub link near bottom of page – firmware/voice-assistant at main · esphome/firmware · GitHub

Follow-up: If anyone uses the Ready Made Projects to build a voice satellite with the M5Stack Atom Echo device, I’d have to agree that unit is more toy-like. The speaker volume is almost inaudible. There is no wake word, so you have to push its button for Push-to-Talk functionality (apparently there is not enough horsepower to process the wake word on the device). And the mic didn’t always seem to pick up voices clearly.

But even so, it IS a functional voice assist satellite, which worked immediately after loading firmware. (And it is a cheap way to experiment.)

Thanks for that! I tried it, but it ended up breaking the microphone.

I’m in the death spiral of attempting to install as well. I’ve followed the instructions here:

ESP32-S3-BOX voice assistant - Home Assistant (

which don’t work as there is never an option to set the WIFI credentials.

I tried this: firmware/wake-word-voice-assistant/esp32-s3-box-3.yaml at main · esphome/firmware · GitHub but it was rejected as being too large.

Anyone with a suggestion on how to proceed with the ESP32-S3-BOX-3 variant?

I agree, it shouldn’t be this hard. This is a bad look for the Home Assistant/ESPHome ecosystem.

Look here

I had a similar issue on Arch Linux, so this may not be relevant to you. I found that after the device got flashed, the permissions on the USB device reverted and I didn’t have access anymore. As a hack I opened a background terminal and did:

while :; do chmod 666 /dev/ttyACM0; sleep 2; done

Replace ACM0 with your device name. Keep it running until you’re all done, then press Ctrl+C.
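A less hacky route is to check whether your user can open the port at all, and if not, fix the group membership once instead of re-running chmod. A rough sketch follows. The `can_use_port` helper is made up for this example, and the `uucp` group is an assumption based on how Arch normally assigns serial devices (other distros often use `dialout`).

```shell
#!/bin/sh
# can_use_port DEVICE
# Succeeds only if DEVICE exists and the current user can read and write it.
# Hypothetical helper for illustration.
can_use_port() {
    [ -e "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}

# Example usage (device path taken from the post above):
# if can_use_port /dev/ttyACM0; then
#     echo "port is accessible"
# else
#     # Assumed fix on Arch: serial devices are normally owned by group uucp.
#     echo "try: sudo usermod -aG uucp \"\$USER\" and log in again" >&2
# fi
```

Group membership survives the device re-enumerating after a flash, which is likely why the chmod has to be repeated in the loop version.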