Output Voice Assistant to other Audio Device

Hi there,

I set up Wyoming Piper successfully on my Docker host.
There is no speaker connected to it.

Is it possible to grab Piper's response in Home Assistant and redirect it to, for example, a Sonos speaker?

Or is the audio generally sent to the device which triggered the voice assistant?

Nobody replied to this, but I see it as an immensely crucial part of the pipeline.

E.g. a Pi satellite handles openwakeword and the inbound audio that the VA processes.

The response from the Home Assistant VA (or Ollama), instead of coming back to the Pi satellite, plays through a Google or Sonos speaker.

I came to this because the Eleven Labs TTS API key integration works for Browser or HA Android app VA usage, but no audio payload makes it back to the satellite, so no ElevenLabs voice. E.g.

==> /tmp/oww.log <==
DEBUG:wyoming_openwakeword.handler:Receiving audio from client: 5468517240778

==> /tmp/sat.log <==
DEBUG:root:Event(type='synthesize', data={'text': ' My name is Mistral, a voice assistant for Home Assistant. I can help you control your home devices and provide real-time information about the current state of your devices using the GetLiveContext tool. For general knowledge questions not related to the home, I will answer truthfully from internal knowledge.', 'voice': {'name': 'Xb7hH8MSUJpSbSDYk0k2'}}, payload=None)

Note payload=None (and no audio playback)

But thinking about it, sending it to a Google Home or another central speaker like a Sonos would be ideal (though seemingly not possible), because I'd rather not buy a speaker for the Pi, even if ElevenLabs TTS played back there.

How to send TTS reply from Voice Assistant to a separate speaker and be able to change it in the UI - ESPHome - Home Assistant Community

change tts.piper to your TTS engine

I don't quite follow your (accepted) answer in the other thread. My entity is "Studio speaker". The YAML isn't saving:

"Message malformed: required key not provided @ data['command']"

alias: "ollamaspeak"
description: "Captures a conversation, processes it with a local LLM, and speaks the response."
trigger:
  - platform: conversation
condition: []
action:
  # Action 1: Process the user's speech with your Llama3 agent
  - service: conversation.process
    data:
      agent_id: conversation.llama3_2 
      text: "{{ trigger.sentence }}"
    response_variable: rep

  # Action 2: Send the text response back to the conversation interface
  - set_conversation_response: "{{ rep.response.speech.plain.speech }}"

  # Action 3: Speak the response out loud using your TTS service
  - service: tts.speak
    target:
      entity_id: tts.piper
    data:
      cache: false
      media_player_entity_id: media_player.studio_speaker
      message: "{{ rep.response.speech.plain.speech }}"
mode: single
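For what it's worth, that "required key not provided @ data['command']" error seems to come from the conversation trigger itself: in the current Home Assistant schema it requires a command sentence pattern. A minimal sketch of a trigger that should validate (the bare `{question}` wildcard matches any sentence and captures it into `trigger.slots.question`):

```yaml
triggers:
  - trigger: conversation
    # required key: the sentence pattern to match;
    # a lone wildcard captures the whole utterance
    command: "{question}"
```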

Also, it's not clear to me why we're using an automation. Shouldn't piping TTS audio from the VA to a speaker be in the VA settings?

alias: "ollamaspeak"
description: ""
triggers:
  - trigger: conversation
    command: "{question}"
conditions: []
actions:
  - action: conversation.process
    metadata: {}
    data:
      agent_id: conversation.llama3_2  #change this
      text: "{{ trigger.slots.question }}"
    response_variable: rep
  - set_conversation_response: "{{ rep.response.speech.plain.speech }}"
    enabled: true
  - action: tts.speak
    metadata: {}
    data:
      cache: false
      media_player_entity_id: media_player.studio_speaker
      #media_player_entity_id: "{{ states('input_select.changeout') }}"
      message: "{{ rep.response.speech.plain.speech }}"
    target:
      entity_id: tts.piper
mode: single
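The commented-out `media_player_entity_id` line above assumes an `input_select` helper named `changeout` exists, so the target speaker can be switched from the UI. A minimal sketch of such a helper in configuration.yaml (the option entities are placeholders from this thread):

```yaml
# configuration.yaml — helper to pick the TTS target speaker from the UI
input_select:
  changeout:
    name: TTS output speaker
    options:
      - media_player.studio_speaker
      - media_player.bedroom_speaker
    initial: media_player.studio_speaker
```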

try this

I don't understand what you meant.

What's the name of your conversation agent in your Ollama integration?

Ollama Mistral. However, I'm trying with the default HA VA. It's not "listening" to either.

Honestly though, my Pi satellite is so buggy. I’m switching to phone/browser.

Spent days and days on this.

The name should look like this: "agent_id: conversation.yourllm"

I have that sorted.

Logs on Satellite:

DEBUG:root:Detection(name='hey_jarvis_v0.1', timestamp=19631700570901, speaker=None)
DEBUG:root:Streaming audio
DEBUG:root:Event(type='run-pipeline', data={'start_stage': 'asr', 'end_stage': 'tts', 'restart_on_end': False, 'snd_format': {'rate': 22050, 'width': 2, 'channels': 1}}, payload=None)
DEBUG:root:Event(type='transcript', data={'text': ' How can you help me today?'}, payload=None)
INFO:root:Waiting for wake word

==> /tmp/oww.log <==
DEBUG:wyoming_openwakeword.handler:Receiving audio from client: 19306016890030

==> /tmp/sat.log <==
DEBUG:root:Event(type='synthesize', data={'text': 'I am a voice assistant designed for Home Assistant, capable of controlling devices and providing real-time information about the current state of your smart home. You can ask me to turn on or off various appliances like lights, locks, fans, and more by specifying their name and domain or area.\n\nFor instance: "Hey Mistral, please turn on the kitchen light" or "Hey Mistral, is the studio lamp currently on?"\n\nIf you have any other questions or need assistance with something else, feel free to ask!', 'voice': {'name': '123456'}}, payload=None)

Payload=None (Ollama/Mistral VA with Elevenlabs TTS)

Is your automation supposed to hijack that and send it to my Google Home speaker?

On the Android app it still plays back on the phone, not the Google Home speaker.

Sorry, I can't help you. I just ran a few tests and everything works as intended.

OK, this automation is actually triggering, so we have progress. It just says "Done" on the satellite, and the Google speaker works with Piper, but not ElevenLabs.

alias: ollamaspeak
description: ""
triggers:
  - trigger: conversation
    command: "{question}"
conditions: []
actions:
  - action: conversation.process
    metadata: {}
    data:
      agent_id: conversation.ollama_conversation
      text: "{{ trigger.slots.question }}"
    response_variable: rep
  - set_conversation_response: Stand by
    enabled: true
  - data:
      cache: false
      message: "{{ rep.response.speech.plain.speech }}"
      media_player_entity_id: media_player.bedroom_speaker
    action: tts.speak
    target:
      entity_id: tts.piper
mode: single

The issue with getting the studio speaker working was that it had gone offline. I tried another speaker and it's working. A shame that it's Piper only, though, since the main point is that ElevenLabs is not working with the satellite, only the mobile app and browser.

This is all VERY buggy, but thank you for your help. Using automations to intercept the TTS is not something I'd have thought of.

The "Done" problem happens when the automation selects a different conversation agent than the one you're speaking to, e.g. conversation.llm in the automation, but you speak to conversation.home_assistant.

And it has inexplicably stopped working, having changed nothing.
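When it stops working with nothing changed, one way to isolate whether TTS itself or the automation broke is to call tts.speak by hand from Developer Tools → Actions (YAML mode). A sketch, reusing the entities from this thread; if this plays, the automation side is the problem:

```yaml
# Developer Tools → Actions: manual tts.speak call to test the TTS path
action: tts.speak
target:
  entity_id: tts.piper
data:
  media_player_entity_id: media_player.bedroom_speaker
  message: "Manual test from developer tools"
  cache: false
```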