It took me a long time looking at different documentation sources to figure this out, so here it is for anyone who wants to know how to output the response speech from the ESPHome Voice Assistant to a Home Assistant Media Player (not a media player in ESPHome). Here is the fragment of my esphome.yaml:
Note that the service:, data:, data_template:, and variables: fields are properties of the ESPHome homeassistant.service action,
and that entity_id:, media_content_type:, and media_content_id: are the specific properties of the Home Assistant media_player.play_media service (from Developer Tools → Services).
The voice assistant speech output is described in the documentation as: "A URL containing the audio response is available to automations as the variable x".
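For reference, a minimal sketch of what such a fragment could look like. It assumes the on_tts_end trigger (the one documented as providing the audio URL in x), a hypothetical entity id media_player.my_media_player, and a made-up variable name tts_url; adapt all three to your own setup.

```yaml
voice_assistant:
  # microphone and the other keys of your existing config are omitted here
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.my_media_player  # hypothetical entity id
          media_content_type: music
        data_template:
          media_content_id: "{{ tts_url }}"  # filled in from the variable below
        variables:
          tts_url: |-
            return x;  // x holds the URL of the audio response
```

The static values go under data:, while the URL is passed through variables: so that the Home Assistant template in data_template: can pick it up.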
Does this really work for you? I have tried nearly everything. I was only able to get my Nest to speak the URL: she says “http…” or my URL, but does not play the sound file itself. The same happens on my Android TV and on every other media player. Can anyone help?
Well,
it did work for a while, but no more. I still get the text-to-speech returned, but it does not output sound on any of my speakers. The debug output is as follows:
And if I click on the “Play Audio” link below the debug output, it plays sound on my computer or on my phone. I’ve upgraded ESPHome several times since I posted this solution, so I guess some more experimentation is in order…
Unfortunately this does not work for me. When I try it in the developer tools I have to add a target, but I can’t add it to the call in ESPHome.
Also, “x” contains the URL of the raw audio file / stream, not the response text. So even if it worked, I would expect to hear something like “http…”?
The output on the Home Assistant media player works for me, but at the same time the speaker of the Atom Echo speaks the same line as well.
Is there any way to stop it doing this or at least mute that small speaker? (without desoldering it)
A more comprehensive solution, thanks to Amrit Prabhu of smarthomecircle.com, which shows how to direct the tts output to a local voice assistant:
# For an internal voice assistant, use tts.speak to send to tts.piper
#
on_tts_start: # this is required to play the output on a media player
  - homeassistant.service:
      service: tts.speak
      data:
        media_player_entity_id: media_player.my_media_player # replace this with your media player entity id
        message: !lambda 'return x;'
        entity_id: tts.piper # replace this with your piper tts entity id
#
# For a cloud-based voice assistant, use tts.cloud_say to send to Home Assistant Cloud
#
on_tts_start:
  # send the tts response to a Home Assistant media player
  - homeassistant.service:
      service: tts.cloud_say
      data:
        entity_id: media_player.my_media_player # replace this with your media player entity id
        message: !lambda 'return x;'
It wasn’t working here either. After days of searching I finally found the cause: I just needed to grant the ESPHome device permission to make Home Assistant service calls. You can do this in the device configuration. Hope this helps.
Using the “wrong config” only works temporarily: after a short while the I2S buffer runs out and breaks the pipeline until the device restarts. I’m sort of lucky in that I also have a fried Echo, or so I thought; it turns out only the speaker is dead, which fixed that problem for me. Still, we do need an option to define the output device, since not everyone has a half-fried Echo…
I also thought of setting the speaker volume to 0, but that’s not an option for the speaker. Sadly, I think we just have to wait and hope they start thinking OUT of the assistant box and realize input and output don’t have to be the same device…
Yes - I just got it working using the on_tts_start example in this thread, although I’m using an Amazon Echo rather than a HomePod. For an Echo, one must set the public URL for accessing HA in the Alexa MediaPlayer integration. Does HomePod have something similar?
Here is my ESPHome yaml for a NodeMCU-ESP32S board - note that this is NOT original, but the code for the Atom M5 box plus the code in this thread. It does still need a few tweaks: this still uses the attached speaker. If I exclude the speaker from the voice_assistant declaration, the board reboots before sending the text to the Echo. I also think the esp-adf PR5230 is invalid now, but somehow the latest esp-adf code was downloaded to my system and it builds ok. I don’t really know how that happened… At the end of the day, this whole pipeline needs some more formal treatment by people who know better.
I’m new to Home Assistant and ESPHome, and I’m not sure where the yaml goes in my config, or if I need to add other keys.
Does the on_tts_start key need to be nested within another key? If so, how would I know what key that is, and what other keys are required to be nested within that key?
Not sure if you got this working, but I just added the on_tts_end part; I don’t have on_tts_start in my config and it works perfectly with my Korvo-1, see below. Don’t forget to configure the device to allow it to make action (service) calls, as stated a few posts up…
Hi all! First off, thanks to everyone in this thread, as this was what enabled me to get output to my GHome speakers when setting up my new Atom Echos.
That said, some things have evolved in recent times, specifically with regard to being able to easily disable the onboard speakers of the Atom Echo and S3 Box so the Assist responses only come out of your chosen speaker: the !remove statement. So, in the end, this is my current fully working version of the config edit for the voice_assistant: block, which also includes some tweaks for the Atom Echo’s microphone to allow Assist an easier time understanding you in more situations beyond total silence. With the below noise and vol tweaks, my Atoms can still hear me clearly from an entire room away most of the time:
voice_assistant:
  noise_suppression_level: 4 # increase noise suppression to 3 -or- 4 from default 2 for better sound floor suppression
  volume_multiplier: 5.0 # increase multiplier from 2.0 to 5.0 to give the mic a little boost...going above 5.0 with Atom Echo resulted in distorted audio for me
  speaker: !remove # remove the default 'echo_speaker' entry so VA doesn't use internal speaker at all. NOTE: THIS ALSO DISABLES SOUND FOR TIMERS but LED will still flash on finish
  on_tts_start: # this gets the TTS pipeline started earlier than 'on_tts_end' and reduces response delay for the user, but might not work for Amazon Echos
    - homeassistant.service:
        service: tts.cloud_say
        data:
          entity_id: media_player.my_speaker_2
          message: !lambda 'return x;'
Instead of rewriting the entire on_timer_finished: block to push the timer audio to the media_player, I’ll probably just expose the timer_ringing switch to HA and automate off of that, as I would like to be able to cancel the timers from an HA notification on my phone, as well, instead of just the button on the Atom. This isn’t a priority for me right now, as I don’t use timers very often. I’m just glad to be able to hear my Atom’s responses now without the faint crackly echo of the, well, Echo. haha
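A rough sketch of what that Home Assistant automation might look like, assuming the timer_ringing switch has been exposed and shows up under the hypothetical entity id switch.atom_echo_timer_ringing (the alias and the message text are placeholders as well):

```yaml
automation:
  - alias: "Announce finished Atom Echo timer"  # placeholder name
    trigger:
      - platform: state
        entity_id: switch.atom_echo_timer_ringing  # hypothetical entity id
        to: "on"
    action:
      - service: tts.cloud_say
        data:
          entity_id: media_player.my_speaker_2
          message: "Your timer has finished."
```

A phone notification with a cancel action could be added as a further step in the same automation; turning the switch off from HA would then be the way to stop the ringing.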
Hey guys!! I’m new to the forum. While trying to get the sound to come out only on my Google speaker, I found a solution: I commented out the speaker line, and it worked without error.