Continued conversation on ESP32 with Voice Assistant (ChatGPT)

Good evening,
Maybe I didn't understand how it works or how it should be configured… I have an ESP32 with a microphone configured for Assist (with ChatGPT as the conversation agent).
I saw that an update for the "continued conversation" feature of Assist has just been released.
But of course it doesn't work for me on my ESP32.

How do I configure it so that it can continue the conversation after a question?
Is it possible?

It MUST be an actual VPE at the present time, not any generic ESP.

1 Like

Thank you very much for answering me…
So, it's not possible…

Just to know: what is an "actual VPE"?

It has to be the Home Assistant Voice Preview Edition (VPE), at least currently.

2 Likes

Just a note on this for anyone else stumbling on this thread… It is not a hardware limitation or anything that requires the Voice Assistant PE. The assist pipeline will handle it on the latest version of HA; it just takes adding the logic to the firmware. When the TTS response ends with a "?", the ESP device has to stop listening for the wake word and just start streaming the audio up to HA, basically skipping the "wake" step. I almost have my firmware on the Atom Echo working. The last issue I'm fighting is that the device starts listening too quickly and picks up the actual TTS response playing over my speaker. If someone were not using an external speaker/Sonos, what I'm working through now would not be an issue.
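In ESPHome terms, the core of the "skip the wake step" idea is roughly this (a minimal sketch, not the exact firmware from this thread; hardware config is omitted and the 2-second delay is just a placeholder for the external-speaker timing problem described above):

```yaml
globals:
  - id: continue_convo     # set to true elsewhere when the spoken reply ends with "?"
    type: bool
    initial_value: 'false'

voice_assistant:
  # microphone / speaker hardware config omitted
  on_end:
    - if:
        condition:
          lambda: 'return id(continue_convo);'
        then:
          - delay: 2s                              # rough guard so the mic doesn't re-hear its own TTS
          - lambda: 'id(continue_convo) = false;'  # one follow-up turn per question
          - voice_assistant.start:                 # re-enter the pipeline, skipping the wake word
```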

2 Likes

I have a version working consistently on my Atom Echos.
https://github.com/bofisher/esphome-continue-conv

4 Likes

Hi, can you describe the installation in more detail? This is exactly what I have been looking for.

I put a little more detail in the GitHub repo, but this is really just a POC. Beyond a specific question, it will likely take some experience with customizing firmware.

Here is the updated repo. I added a few things and improved the delay time while TTS plays over the external media_player.

Yes, thanks…

This was just an old post… but everything has been fixed. I hope I can test your setup on the Atom Echo too.

Thanks for posting your code; I was checking out your GitHub. I have some Atom Echos and an S3 Box and was also having problems with continued conversation not working, but I'm using the Home-LLM (Local LLM Conversation) integration.

What happens is the LLM (I'm using Anthropic) comes back with a question, but then the device doesn't stay open for me to answer; I have to say "Hey Jarvis" again each time to respond.

Is there a specific section of your code I could experiment with and drop in? I am using Bobba’s custom firmware right now for my S3.

Any help would be awesome and thanks!

Take a look at the continue_convo bool, the check_if_question script, and the "on_end". It checks for a "?" at the end of the TTS transcript. When you initiate an ask_question, it should also end with a "?". The continue-conversation logic in my firmware will handle both the continued-conversation and the ask_question scenario. Basically, in "on_end" it just triggers "voice_assistant.start" when the "?" is detected. That will start the voice pipeline again without the wake word.
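For reference, those pieces fit together roughly like this (a sketch following the names above; the actual repo may differ, and hardware config is again omitted):

```yaml
globals:
  - id: continue_convo
    type: bool
    initial_value: 'false'

script:
  # Sets the flag when the spoken reply ends with "?"; this covers both a
  # normal follow-up question from the agent and an ask_question prompt.
  - id: check_if_question
    parameters:
      text: string
    then:
      - lambda: |-
          id(continue_convo) = !text.empty() && text.back() == '?';

voice_assistant:
  on_tts_start:
    # x holds the text that is about to be spoken
    - script.execute:
        id: check_if_question
        text: !lambda 'return x;'
  on_end:
    - if:
        condition:
          lambda: 'return id(continue_convo);'
        then:
          - lambda: 'id(continue_convo) = false;'
          - voice_assistant.start:   # restart the pipeline without the wake word
```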

The LLM integration should not matter, as all of that is wrapped and handled by the voice pipeline. One thing that confused me at first is that ask_question doesn't really use the LLM. From an automation you define the question to ask. Then the voice pipeline handles the TTS and your response, but it's the automation that processes the response, not the LLM. Take a look at the blueprint example that Home Assistant released for ask_question. I would recommend starting with yes/no questions.
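For example, a yes/no ask_question automation could look roughly like this (the satellite and light entities are made up, and the exact response fields are worth checking against the official blueprint; here the matched answer is assumed to come back in the response variable's id field):

```yaml
automation:
  - alias: "Ask before turning off the lights"
    triggers:
      - trigger: time
        at: "23:00:00"
    actions:
      - action: assist_satellite.ask_question
        data:
          entity_id: assist_satellite.atom_echo   # hypothetical satellite entity
          question: "Should I turn off the living room lights?"
          answers:
            - id: "yes"
              sentences: ["yes", "sure", "go ahead"]
            - id: "no"
              sentences: ["no", "leave them on"]
        response_variable: reply
      - if:
          - condition: template
            value_template: "{{ reply.id == 'yes' }}"
        then:
          - action: light.turn_off
            target:
              entity_id: light.living_room        # hypothetical light
```

The firmware side doesn't change for this: the question the satellite speaks ends with "?", so the same continue_convo logic keeps the device listening for the answer.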

Perfect and thanks, this is exactly what I needed to understand!