Voice Assistant: Follow-Up Responses

visionintegrations · February 24, 2024, 11:19pm

Hey All,

Hopefully this hasn’t been asked yet, though I couldn’t find it if it has.

When using a voice assistant with openwakeword detection and the Extended OpenAI Conversation integration, how can we add the ability to continue a conversation, or give a follow-up response or question in the same conversation? Currently it seems that every time you say the wake word, it starts a brand new conversation and forgets the previous interaction. What I’m after is similar to Alexa waiting a few seconds for a follow-up question or instruction without you having to repeat the wake word. Ideally, all of the interactions could be kept in some memory or storage so that the voice assistant could learn about its environment over time. That would be amazing…but one step at a time!

Thank you and if there is anything I can assist with, let me know. I’m still quite green in my dev and AI skills but I’m slowly learning.

daywalker03 · February 25, 2024, 1:42am

Voted. And I support this feature request as far as it can go.

daveloper · April 23, 2024, 1:13pm

Looking for this exact use case.

mr_tbot · May 2, 2024, 6:34am

1000% back this.

memyselfandm · May 3, 2024, 6:02am

This will make Wyoming Satellite the killer app for Home Assistant. An extension of this would be allowing the Wyoming device to be woken by a service call instead of the wake word.
An example use case: I have a smart button remote and want to set the long-press action to wake my voice assistant when I’m not able to, or don’t want to, use the wake word.

schniepp · June 13, 2024, 3:55am

I think there are two separate issues here: (1) the assistant remembering the previous details from the conversation, and (2) needing to repeat the wake word to initiate any voice input from the user.

For me, (1) is the bigger issue right now. I would be happy to say the wake word every time to continue the conversation. And (1) already works when I use Assist via typing (the Assist button in Home Assistant). So somehow the assistant is invoked differently by the “typing interface” and the voice interface.

formatBCE · June 13, 2024, 4:13am

This “remembering” is done via the “conversation ID” field. You can call the “conversation.process” service several times, passing the conversation ID from the previous call, and assistants that support it will have the context of previous requests.
AFAIK Assist doesn’t support conversation ID (it’s always null, even if you pass it), and I don’t yet see how it could.
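
For anyone who wants to try the conversation ID chaining from a script, here is a minimal sketch in Python. It assumes the conversation integration's REST endpoint (`/api/conversation/process`) and a long-lived access token. `HA_URL`, `TOKEN`, and the `ConversationSession` helper are made-up names for illustration, not part of Home Assistant. Whether context is actually kept depends on the agent you use.

```python
import json
import urllib.request

HA_URL = "http://homeassistant.local:8123"  # assumption: replace with your instance URL
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"      # assumption: placeholder, not a real token


def post_to_home_assistant(payload):
    """Send one request to the conversation endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        f"{HA_URL}/api/conversation/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


class ConversationSession:
    """Hypothetical helper: remembers the conversation ID between calls."""

    def __init__(self, send=post_to_home_assistant, agent_id=None):
        self.send = send              # transport function, swappable for testing
        self.agent_id = agent_id      # optional: which conversation agent to use
        self.conversation_id = None   # unknown until the first reply arrives

    def ask(self, text):
        payload = {"text": text}
        if self.agent_id is not None:
            payload["agent_id"] = self.agent_id
        if self.conversation_id is not None:
            # Pass the ID from the previous reply so the agent can keep context.
            payload["conversation_id"] = self.conversation_id
        result = self.send(payload)
        # Agents without context support may return null (None) here.
        self.conversation_id = result.get("conversation_id")
        return result
```

Usage would be something like `session = ConversationSession()`, then `session.ask("Turn on the lights in the kitchen")` followed by `session.ask("Actually I meant the office")`. The second call carries the ID returned by the first.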

itispip · July 6, 2024, 12:07pm

It seems AlexxIT’s StreamAssist might be the solution? It allows you to call the Stream.run service to start directly from the STT phase, or even to start by letting the AI ask you a question and then reply to it.

Daniel_Reimer · August 25, 2024, 1:54pm

Until now I thought the same: that saying the wake word again causes it to start a new conversation, with everything before forgotten. Because of that, I have run into a dead end more than once where OpenAI asked something back and I was not able to reply. A few minutes ago I asked Assist to switch on my Yamaha receiver and LG TV, and it asked back which room this receiver might be in. No clue why it asked that with only this ONE receiver, but that is not the problem here. I said the wake word again and just replied “the one in living room”, and it happily switched on the TV and receiver. So it was still in the former conversation and knew what it needed to do its job. The only thing I would still love to see is the assistant immediately waiting for my reply instead of making me say the wake word again, but all in all I was surprised that this worked so well.

formatBCE · August 25, 2024, 3:30pm

The OpenAI integration has a context window (it is actually manageable, AFAIK).
However, Assist doesn’t. We want it.
Also, it would be great to make it listen right away after the question, eliminating the need for the wake word.

pjn77 · September 1, 2024, 9:34am

I’ve wanted the same thing, so I am working on modifying the default voice assistant code so that the voice assistant turns wake word detection off once it finishes processing the initial command, and starts listening again for the next thing said.

An idle timer and an HA automation reset it after 20 seconds so it waits for the wake word again. I’m planning to make that part of the ESPHome config instead.

I’ve got a thread about it here, but I think this is achievable with the currently available tools.
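
The follow-up window logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the ESPHome or Home Assistant code. The `FollowUpWindow` class and its method names are hypothetical, and the 20 second timeout is taken from the post.

```python
import time


class FollowUpWindow:
    """Hypothetical helper: decides whether the wake word is currently required."""

    def __init__(self, timeout_seconds=20.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock        # injectable clock, makes the logic testable
        self._opened_at = None    # None means: waiting for the wake word

    def command_finished(self):
        """Call when a command has been processed; (re)starts the idle timer."""
        self._opened_at = self.clock()

    def needs_wake_word(self):
        """Return True if the window was never opened or has timed out."""
        if self._opened_at is None:
            return True
        if self.clock() - self._opened_at >= self.timeout_seconds:
            # Idle timer expired: fall back to wake word detection.
            self._opened_at = None
            return True
        return False
```

Each processed command calls `command_finished()` again, so a chain of follow-ups keeps the window open, and 20 seconds of silence closes it.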

MichiTD · September 9, 2024, 9:41am

This function would be great.

mgc8 · November 3, 2024, 11:19pm

I’m also interested in having this functionality in the Voice Assist pipeline. It’s already there in the app: when using the assistant from there, even with voice, it keeps the context of the conversation, and you can do things such as “Turn on the lights in the kitchen” → “The lights are now on” → “Ah, no, actually I meant the office”, and it works just fine. With the Wyoming-style voice integration, however, there seems to be no context saved anywhere. Are these completely different implementations, or could we perhaps re-use the pipeline from the app?

saschaabraham · November 23, 2024, 8:31am

I used the Gemini integration via Wyoming Satellite today, and there is context awareness. I changed the heating temperature of a room to 10 degrees. After that, I said to switch it back to 20 degrees without providing the room name. It worked flawlessly.

Now the part with listening again without the wake word needs to be implemented…

mgc8 · March 27, 2025, 10:09pm

That’s amazing, just tested it as well and indeed it works nicely. Not sure when the change was implemented (as I’ve updated quite a few components lately), but it’s certainly a great improvement!

The “listen again” part might be tricky, since we’re dealing with disparate components, and it’s possible the “listening” part could kick in before the assistant itself had stopped talking, which leads to confusion; I’m avoiding this in my case by setting a long “refractory-seconds” period for wyoming-satellite, but that’s not ideal. And if it were actually listening automatically, it would need to somehow identify the user request as such without mis-triggering (what if there’s a TV show or music playing in the background?)… Not sure what the best way to solve this would be, except for much smarter voice detection models?

saschaabraham · March 28, 2025, 8:11am

With the 2025.4 beta it is able to listen again if the LLM returns a question mark at the end of its text. So it’s a kind of follow-up. It would be nice to have this in general…
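
As a rough illustration of that rule (not the actual Home Assistant implementation, which I have not checked), the decision can be as small as this. `should_listen_again` is a made-up name.

```python
def should_listen_again(response_text):
    """Return True if the assistant's reply ends with a question mark."""
    # Ignore trailing whitespace so "Which room? " still counts as a question.
    return response_text.rstrip().endswith("?")
```

So a reply like “Which room is the receiver in?” would reopen the microphone, while “The lights are now on.” would go back to waiting for the wake word.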

gshpychka · April 4, 2025, 9:43pm

But we can’t set the conversation_id in the assist_satellite.start_conversation action, right? So it would be a brand new conversation

turboc · July 18, 2025, 5:03pm

It’s still not working for me. I ask it to turn on the “office lights”. It responds that it can’t do that right now and asks whether I would like to turn on the “office lights”. I say “Hey Jarvis, yes”, but it then just wants to know what I want to do. I just installed everything today, so everything should be current.