It took me a long time digging through different documentation sources to figure this out, so here it is for anyone who wants to send the spoken response from the ESPHome Voice Assistant to a Home Assistant media player (not a media player defined in ESPHome). Here is the fragment of my esphome.yaml:
Note that the service:, data:, data_template:, and variables: fields are properties of the ESPHome homeassistant.service action.
The entity_id:, media_content_type:, and media_content_id: fields are the specific properties of the Home Assistant media_player.play_media service (from Developer Tools → Services).
The voice assistant speech output is documented as: "A URL containing the audio response is available to automations as the variable x".
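For readers who want something concrete to start from, here is a minimal sketch of how those fields might fit together. It is not the original poster's exact config: the entity id media_player.my_media_player, the microphone id my_microphone, and the template variable name tts_url are placeholders, and the use of the on_tts_end trigger is an assumption based on the quoted description of x as a URL.

```yaml
# Sketch only: replace the placeholder ids with your own.
voice_assistant:
  microphone: my_microphone        # hypothetical id of an existing microphone component
  on_tts_end:                      # assumed trigger; here x is the URL of the audio response
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.my_media_player   # placeholder Home Assistant media player
          media_content_type: music
        data_template:
          media_content_id: "{{ tts_url }}"         # filled in from the variable below
        variables:
          tts_url: return x;                        # hypothetical variable name
```

The data_template/variables pair is used here only because the post mentions those fields; passing the URL directly in data: with a lambda should be equivalent.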
Does this really work for you? I have tried nearly everything. I was only able to get my Nest to play the URL: it says "http…" or my URL, but not the sound file itself. The same happens on my Android TV and every other media player. Can anyone help?
Well,
it did work for a while, but no more. I still get the text-to-speech returned, but it does not output sound on any of my speakers. The debug output is as follows:
And if I click on the “Play Audio” link below the debug output, it plays sound on my computer or on my phone. I’ve upgraded ESPHome several times since I posted this solution, so I guess some more experimentation is in order…
Unfortunately this does not work for me. When I try it in the Developer Tools I have to add a target, but I can’t add one to the call in ESPHome.
Also, “x” contains the URL of the raw audio file / stream, not the response text. So even if it worked, I would expect to hear something like “http…”?
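One way to check what x actually holds on a given setup is to log it from both triggers. The sketch below assumes, as far as I can tell from the ESPHome voice_assistant documentation, that x is the response text in on_tts_start and the audio URL in on_tts_end; my_microphone is a placeholder id.

```yaml
# Sketch only: logs the value of x in each trigger so you can compare them.
voice_assistant:
  microphone: my_microphone        # hypothetical id of an existing microphone component
  on_tts_start:
    - logger.log:
        format: "on_tts_start x = %s"   # expected to be the response text
        args: ['x.c_str()']
  on_tts_end:
    - logger.log:
        format: "on_tts_end x = %s"     # expected to be the audio URL
        args: ['x.c_str()']
```

If that assumption holds, a service that speaks a message (such as tts.speak) would belong in on_tts_start, and a service that plays a file (such as media_player.play_media) would belong in on_tts_end.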
The output on the Home Assistant media player works for me, but at the same time the speaker of the Atom Echo speaks the same line as well.
Is there any way to stop it doing this or at least mute that small speaker? (without desoldering it)
A more comprehensive solution, thanks to Amrit Prabhu of smarthomecircle.com, which shows how to direct the tts output to a local voice assistant:
# For an internal voice assistant, use tts.speak to send to tts.piper
#
on_tts_start: # this is required to play the output on a media player
  - homeassistant.service:
      service: tts.speak
      data:
        media_player_entity_id: media_player.my_media_player # replace this with your media player entity id
        message: !lambda 'return x;'
        entity_id: tts.piper # replace this with your piper tts entity id
#
# For a cloud-based voice assistant, use tts.cloud_say to send to Home Assistant Cloud
#
on_tts_start:
  # send the tts response to a home assistant media player
  - homeassistant.service:
      service: tts.cloud_say
      data:
        entity_id: media_player.my_media_player # replace this with your media player entity id
        message: !lambda 'return x;'
It wasn’t working here either. After days of searching I finally found the cause: I just needed to grant the ESPHome device permission to make Home Assistant service calls. You can do this in the device configuration. Hope this helps.
Using “wrong config” only works temporarily; after a short while the I2S buffer runs out and breaks the pipeline until it restarts. I’m sort of lucky that I also have a fried Echo, or so I thought, but it’s only the speaker that is dead, so that fixed the problem for me. Still, we do need an option to define the output device; not everyone has a half-fried Echo…
I also thought of setting the speaker volume to 0, but that’s not an option for the speaker component. Sadly, I think we just have to wait and hope they start thinking OUT of the assistant box and realize that input and output don’t have to be the same device…
Yes, I just got it working using the on_tts_start example in this thread, although I’m using an Amazon Echo rather than a HomePod. For an Echo, one must set the public URL for accessing HA in the Alexa Media Player integration. Does HomePod have something similar?
Here is my ESPHome yaml for a NodeMCU-ESP32S board. Note that this is NOT original; it is the code for the Atom M5 box plus the code in this thread. It still needs a few tweaks, as it still uses the attached speaker. If I exclude the speaker from the voice_assistant declaration, the board reboots before sending the text to the Echo. I also think the esp-adf PR5230 is invalid now, but somehow the latest esp-adf code was downloaded to my system and it builds OK. I don’t really know how that happened… At the end of the day, this whole pipeline needs some more formal treatment by people who know better.
I’m new to Home Assistant and ESPHome, and I’m not sure where the yaml goes in my config, or if I need to add other keys.
Does the on_tts_start key need to be nested within another key? If so, how would I know what key that is, and what other keys are required to be nested within that key?
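As a rough guide to where the fragments in this thread sit, on_tts_start (and on_tts_end) are automation triggers of the voice_assistant component, so they are nested one level under voice_assistant: in the device's ESPHome YAML. The skeleton below is a sketch under that assumption; my_microphone and the entity ids are placeholders, and a real config also needs the board, Wi-Fi, and microphone sections.

```yaml
# Sketch only: shows nesting, not a complete device config.
api:                               # the Home Assistant API connection used for service calls

voice_assistant:
  microphone: my_microphone        # hypothetical id of a microphone defined elsewhere in the file
  on_tts_start:                    # trigger nested directly under voice_assistant
    - homeassistant.service:       # action nested under the trigger
        service: tts.speak
        data:
          media_player_entity_id: media_player.my_media_player   # placeholder
          message: !lambda 'return x;'
          entity_id: tts.piper                                    # placeholder TTS entity
```

The list of keys a component accepts is given on that component's page in the ESPHome documentation, which is the usual way to find out what can be nested where.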
Not sure if you got this working, but I just added the on_tts_end part. I don’t have on_tts_start in my config and it works perfectly with my Korvo-1, see below. Don’t forget to configure the device to allow it to make action (service) calls, as stated a few posts up…