I recently picked up a couple of Voice Preview Edition devices and set them up, but I'm running into a weird issue: when I ask one device a question verbally, the response is sent to each of the satellites in sequence. In other words, if I ask device A a question, it processes the request, and when it's ready to respond, it responds on device B. Once it's done there, it then repeats the response on device A.
A bit about my setup: I'm running Piper and faster-whisper (and Ollama) in containers on my Unraid server so they have access to better hardware. The voice assistant is set up as:
I'm assuming there's some underlying TTS configuration that makes sure the response is sent back to the device that was activated, but I'm not quite sure where to look, or whether it's an issue on the HA side or on the remote Piper side. Any insight would be appreciated!
I should have mentioned that this seems to happen when I'm using faster-whisper and Piper locally in HA while using Ollama on the server. So maybe it's an issue with the model's response and it needs some sort of special prompt?
I have seen this problem recently from others, either here on the community forum or on Discord, and there is a theory that some LLM models are using the "broadcast to the satellites" tool when responding. A possible fix is to unexpose the satellites to the LLM (or change models).
Note: you won't be able to use broadcast at all if you state it that way. Instead, say something like: you may not use broadcast tools UNLESS SPECIFICALLY REQUESTED BY THE USER USING A REQUEST CONTAINING THE WORD BROADCAST.
Emphasis mine to show the part that has to be there - not that you have to shout at your LLM. Just specifically stating when and when NOT to use the broadcast tools should do it, while still retaining the function.
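To make that concrete, here's a rough sketch of how the instruction could be pasted into the conversation agent's prompt field in HA (the surrounding lines are generic placeholder boilerplate I made up; only the broadcast sentence comes from the advice above - adjust wording to taste):

```text
You are a voice assistant for Home Assistant.
Answer questions truthfully and keep responses brief.
You may not use broadcast tools UNLESS SPECIFICALLY REQUESTED
BY THE USER USING A REQUEST CONTAINING THE WORD BROADCAST.
```

The idea is that the model still has the broadcast tool available, but the prompt constrains it to only call that tool when the user explicitly asks to broadcast, instead of using it to deliver ordinary answers.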
Anyone have any other suggestions? I'm set up pretty much exactly the same as the OP, except I'm using the regular llama3.2 model. I've tried everything here, and it's still responding on my 2nd PE device first and then on the one I asked the question on.