Speaker buffer error when using voice assistant

Hello everyone! Working with the new voice assistant features, I have been trying to set up a custom ESPHome-based voice assistant that uses wake words. I can get it to respond to me, but sometimes, when the response sentence is longer than 1 to 3 words, the ESP32 prints an error to its log:

 [voice_assistant:293]: Speaker buffer full.

Any tips on how to implement a fix for this issue? I can provide any info needed.


I’m getting the same message. Seems we are not alone: https://community.home-assistant.io/t/m5stack-atom-echo-diverse-error-messages/633340
I have not found any solution to this problem. My guess is that the relatively low amount of RAM on the ESP32 is causing this.
My ESP32 setup is also not very stable. Sometimes it works for a while and then suddenly it stops working. Hoping that a future update will fix these.

Same here, replies are pretty scratchy and very unstable.
Good thing I only got one of them.

Same here, mostly for longer answers like “Entschuldigung, das habe ich nicht verstanden” (“Sorry, I didn’t understand that”, the German answer when the query could not be handled). The end of the sentence is cut off in this case.
It’s fairly reproducible since it happens nearly every time, but occasionally a long response plays fine, and occasionally a short one fails too.
I also tried using media player instead of speaker, but it won’t work together with the wake word. I guess the CPU is too busy, because the media player does work when I turn off the wake word.

Maybe it will be better in the next version, I think there’s a change in the response format to handle some other issue.


I got it to work with the “media_player” component, but the audio is twice as bad as with the “speaker” component. Same symptoms, it just stutters twice as much through the whole playback.

If it’s the same issue I was having, the workaround I found for now is to disable the wake word when the TTS starts playing and re-activate it after it ends. It goes from unbearable to perfectly clear.

I initially tried to do this directly in ESPHome, but quickly found it was easier to simply bypass it altogether. I built a script in HA that does all the logic, and I call it from within on_tts_start in the ESPHome voice_assistant, from which I also removed the media_player parameter (HA handles the TTS output).
I haven’t noticed any kind of “slowdown” because of this wrapper, and I’ve been using it for several weeks now. It works well for me.
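
For readers who want to see the general shape of this workaround, here is a rough sketch of the ESPHome side. It is not the poster’s actual config: the script name `script.voice_tts_wrapper`, the `message` variable, and the ids `va`, `mic` and `use_wake_word` are all assumptions made for illustration. The idea is that a template switch exposes the wake word to Home Assistant, and `on_tts_start` hands the response text to an HA script, which can then turn the switch off, play the TTS, and turn the switch back on when playback is done.

```yaml
# Sketch only: all ids and the script name are hypothetical.
voice_assistant:
  id: va
  microphone: mic            # assumes a microphone with id "mic" is defined elsewhere
  use_wake_word: true
  # No speaker/media_player entry here: Home Assistant handles the TTS output.
  on_tts_start:
    # Inside this trigger, "x" holds the response text.
    - homeassistant.service:
        service: script.voice_tts_wrapper   # hypothetical HA script
        data_template:
          message: "{{ message }}"
        variables:
          message: |-
            return x;

switch:
  # Lets Home Assistant (and the wrapper script) toggle wake word listening.
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);
```

For the service call to reach Home Assistant, the device has to be allowed to make service calls in the ESPHome integration’s options in HA. Newer ESPHome releases also offer `homeassistant.action` as the name for this call, so check which one your version expects.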

I posted the ESPHome config, as well as the script in question, here on GitHub (more readable than posting large portions of code here). PS: Make sure to switch to the main branch for the latest updates, since the community’s software rewrites GitHub links to the day-of-post revision.

Did anyone get this to work with the M5Stack Atom Echo following the “voice assistant for 13 dollars” guide from the Home Assistant website?

It is the same for me as described here:
It only works for a short time, then becomes choppy or no longer works at all. I have reinstalled it several times and followed the instructions exactly.
It runs smoothly on the mobile app.
I have since switched back to using HA manually, which is definitely faster and more reliable. (Probably also because I can’t always remember the right words :wink: )

I have noticed that using OpenAI for the conversation agent cuts that error way down.
I can get some pretty long responses from ChatGPT without it throwing that error. I really don’t think it’s an issue with hardware.
The satellites I have built are pretty reliable now.

Same issue using the HA agents, Piper + Whisper, on an ESP32-S3-Korvo-1.

I’ve managed to almost eliminate this.
First, all of the satellites are on the IoT network, their own wireless network.
Second, I gave all of them and the HA server priority access on the network.
I believe this is a streaming issue and not a hardware issue.

@Rich37804 Thanks for that update. What type of wireless router/access point are you using?

TP-Link Deco

Thank you, Rich. I’ve been seriously considering the TP-Link Deco series recently. The Asus AiMesh firmware has been far from reliable or performant in my case over the last several months. :frowning_face: