How to diagnose problems with Voice Assistant support

I’ve been struggling with this off and on for a couple weeks now, and so far I’ve had no luck with anything but finding threads on here and Reddit of people having similar problems with no follow-ups, so… hopefully someone has ideas.

The problem I’m having is that none of my ESPHome devices that are meant to be endpoints for an HA voice assistant work. This is across a couple of different ESP32-S3 units and an older ESP32-WROOM. All three have microphones that work fine via either ESP-IDF or Arduino, so I know (at least in broad strokes) that my ESPHome config for each board is correct. All of them show data being read by the mics in 64-byte chunks, so I2S is working properly. All of them log that the voice assistant component is starting STT, but there’s never any logging from the STT components, and there are no results in HA or on the device.

The HA assistant debug logging shows the voice session – so the call to start it is making it to the API server. The debug screen shows the speech-to-text component, but with no text result:

metadata:
  language: en-US
  format: wav
  codec: pcm
  bit_rate: 16
  sample_rate: 16000
  channel: 1

But I can’t find anything documenting how to tell what is actually going on. I can’t tell whether the generated ESPHome code is doing something wrong with the I2S mic and not getting any audio, whether the audio is corrupted because of a format issue, whether it is even attempting to stream the audio to HA, or whether HA received the audio and failed to recognize it. It seems like an end-to-end black box that either works or doesn’t, with no way to really tell why when it doesn’t.

The assistant works fine when triggered via the web interface or the Android app.

So, basically, I’m trying to find out if I can:

  • Save the data from a microphone component to a file on an SD card so I can actually see what is in it.
  • Log the data from the microphone (as wordy as that would be) so I can see if it is getting anything.
  • See the length of the data that was sent and/or received by HA.
  • Get any debugging at all on any of these things so I can figure out where the breakdown is.
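
To make the second bullet concrete: rather than dumping raw bytes, a per-chunk summary would already show whether the mic is picking anything up. Here is a rough Python sketch of what I mean. It assumes the chunks are 16-bit signed little-endian mono PCM (matching the metadata above), and `summarize_chunk` is just a name I made up, not anything from ESPHome or HA.

```python
import math
import struct


def summarize_chunk(chunk: bytes) -> dict:
    """Summarize one chunk of 16-bit signed little-endian mono PCM.

    The sample format is an assumption based on the pipeline metadata
    (bit_rate: 16, channel: 1); adjust it if the mic delivers something else.
    """
    if len(chunk) % 2:
        raise ValueError("chunk length must be a multiple of 2 bytes")
    count = len(chunk) // 2
    samples = struct.unpack(f"<{count}h", chunk)
    # Peak shows whether anything is there; RMS gives a rough loudness.
    peak = max((abs(s) for s in samples), default=0)
    rms = math.sqrt(sum(s * s for s in samples) / count) if count else 0.0
    return {"samples": count, "peak": peak, "rms": rms}


if __name__ == "__main__":
    # A 64-byte chunk of pure silence versus a short synthetic 440 Hz tone.
    silent = bytes(64)
    tone = struct.pack(
        "<32h",
        *[int(8000 * math.sin(2 * math.pi * 440 * n / 16000)) for n in range(32)],
    )
    print(summarize_chunk(silent))
    print(summarize_chunk(tone))
```

If every chunk came back with a peak of 0, or with the same value throughout, that would point at the mic or I2S setup rather than at HA.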

Hell, even an example debug log for a VA request that worked would be nice. The documentation doesn’t give any examples of a working request, so it’s hard to tell if there’s anything missing on the ESPHome side. I find it strange that there’s no logging when it is sending things to HA, which makes me wonder if there should be and it just isn’t happening for some reason.

Anyway, if anyone has any suggestions, it’s greatly appreciated!

I can’t look up the URL for you at the moment, but do a Google search for: HA voice assistant troubleshooting.
At the end of that page you will find details about saving your recordings.
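
Once you have a saved recording, something like the sketch below can tell you whether it is silent, too short, or in an unexpected format. It assumes the recordings are ordinary WAV files and compares the header with the metadata from your debug screen (16 kHz, 16-bit, mono). `inspect_wav` and `EXPECTED` are names I made up for this example.

```python
import array
import sys
import wave

# Expected format, taken from the pipeline metadata in the original post.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate": 16000}


def inspect_wav(path: str) -> dict:
    """Report header fields, duration and peak level of a WAV file."""
    with wave.open(path, "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": wav.getframerate(),
            "frames": wav.getnframes(),
        }
        raw = wav.readframes(info["frames"])

    rate = info["sample_rate"]
    info["duration_s"] = info["frames"] / rate if rate else 0.0
    info["format_ok"] = all(info[key] == value for key, value in EXPECTED.items())

    # Peak is only computed for 16-bit data; WAV samples are little-endian.
    peak = 0
    if info["sample_width_bytes"] == 2:
        samples = array.array("h")
        samples.frombytes(raw)
        if sys.byteorder == "big":
            samples.byteswap()
        peak = max((abs(s) for s in samples), default=0)
    info["peak"] = peak
    return info


if __name__ == "__main__":
    # Usage: python inspect_wav.py path/to/recording.wav
    for key, value in inspect_wav(sys.argv[1]).items():
        print(f"{key}: {value}")
```

A duration near zero suggests the device never streamed audio, a peak of 0 suggests the mic data was silent, and `format_ok: False` points at a format mismatch.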