Unfortunately this does not work for me. When I try it in the Developer Tools I have to add a target, but I can't add one to the call in ESPHome.
Also, “x” contains the URL of the raw audio file / stream, not the response text. So even if it worked, I would expect to hear something like “http…”?
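As the examples later in this thread show, x holds the response text inside on_tts_start but the generated audio URL inside on_tts_end, so the two triggers take different service calls. A minimal sketch (media_player.my_media_player is a placeholder):
on_tts_start:  # x = the response text
  - homeassistant.service:
      service: tts.cloud_say
      data:
        entity_id: media_player.my_media_player
        message: !lambda 'return x;'
on_tts_end:  # x = URL of the rendered audio stream
  - homeassistant.service:
      service: media_player.play_media
      data:
        entity_id: media_player.my_media_player
        media_content_id: !lambda 'return x;'
        media_content_type: music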
It is strange. I am using the same yaml and it works as expected without a single problem.
Here is the voice assistant portion of my ESPHome config; did I put the Home Assistant media player in the correct spot?
voice_assistant:
id: va
microphone: echo_microphone
speaker: echo_speaker
noise_suppression_level: 2
auto_gain: 31dBFS
volume_multiplier: 2.0
vad_threshold: 3
on_listening:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
brightness: 100%
effect: pulse
on_tts_start:
- light.turn_on:
id: led
blue: 0%
red: 0%
green: 100%
brightness: 100%
effect: pulse
- homeassistant.service:
service: tts.cloud_say
data:
entity_id: media_player.announcements
data_template:
message: "{{ my_stt }}"
variables:
my_stt: return x;
on_end:
- delay: 100ms
- wait_until:
not:
speaker.is_playing:
- script.execute: reset_led
on_error:
- light.turn_on:
id: led
blue: 0%
red: 100%
green: 0%
brightness: 100%
effect: none
- delay: 1s
- script.execute: reset_led
- script.wait: reset_led
- lambda: |-
if (code == "wake-provider-missing" || code == "wake-engine-missing") {
id(use_wake_word).turn_off();
}
on_client_connected:
- if:
condition:
switch.is_on: use_wake_word
then:
- voice_assistant.start_continuous:
- script.execute: reset_led
on_client_disconnected:
- if:
condition:
switch.is_on: use_wake_word
then:
- voice_assistant.stop:
- light.turn_off: led
The output on the Home Assistant media player works for me, but at the same time the speaker of the Atom Echo speaks the same line as well.
Is there any way to stop it from doing this, or at least mute that small speaker (without desoldering it)?
I simply “broke” the configuration of the speaker in my M5Stack Atom Echo, so it no longer plays audio:
speaker:
- platform: i2s_audio
id: echo_speaker
# i2s_dout_pin: GPIO22
# dac_type: external
# mode: mono
dac_type: internal # wrong config to mute speaker
mode: left # wrong config to mute speaker
Then adding this to my voice_assistant gives me output only on my Sonos speaker:
on_tts_end:
- homeassistant.service:
service: media_player.play_media
data:
media_content_id: !lambda 'return x;'
media_content_type: audio/mpeg
entity_id: media_player.wz_sonos
In an ideal world I would be able to use the built-in mic in the Sonos speaker… but hey, you can't have everything I guess.
A more comprehensive solution, thanks to Amrit Prabhu of smarthomecircle.com, which shows how to direct the TTS output to either a local or a cloud-based voice assistant:
# For an internal voice assistant, use tts.speak to send to tts.piper
#
on_tts_start: # this is required to play the output on a media player
- homeassistant.service:
service: tts.speak
data:
media_player_entity_id: media_player.my_media_player #replace this with your media player entity id
message: !lambda 'return x;'
entity_id: tts.piper #replace this with your piper tts entity id.
#
# For a cloud-based voice assistant, use tts.cloud_say to send to Home Assistant Cloud
#
on_tts_start:
# send the tts response on a home assistant media player
- homeassistant.service:
service: tts.cloud_say
data:
entity_id: media_player.my_media_player #replace this with your media player entity id
message: !lambda 'return x;'
Wasn't working here either. After days of searching I finally found that I just needed to grant the ESPHome device permission to make Home Assistant service calls. You can do this in the device's configuration in the ESPHome integration. Hope this helps.
Using the “wrong config” trick only works temporarily; after a short while the I2S buffer runs out and breaks the pipeline until it restarts. I'm sort of lucky in that I also have a fried Echo (or I thought it was, but only its speaker is dead), so that solved the problem for me. We still need an option to define the output device, though; not everyone has half-fried Echos…
I also thought of setting the speaker volume to 0, but sadly that's not an option for speaker. I think we just have to wait and hope they start thinking OUT of the assistant box and realize input and output don't have to be the same device…
Has anyone still got this working?
If so, can you please post your config, as I'm having no luck at all.
My HomePod pauses the currently playing media, but nothing plays.
Yes - I just got it working using the on_tts_start example in this thread, although I'm using an Amazon Echo rather than a HomePod. For an Echo, one must set the public URL for accessing HA in the Alexa Media Player integration. Does HomePod have something similar?
Here is my ESPHome YAML for a NodeMCU-ESP32S board. Note that this is NOT original, but a mix of the code for the M5 Atom box and the code in this thread. It still needs a few tweaks: it still uses the attached speaker, and if I exclude the speaker from the voice_assistant declaration, the board reboots before sending the text to the Echo. I also think the esp-adf PR 5230 is invalid now, but somehow the latest esp-adf code was downloaded to my system and it builds OK; I don't really know how that happened… At the end of the day, this whole pipeline needs some more formal treatment by people who know better.
esphome:
name: nodemcu-esp-32s
friendly_name: NodeMCU ESP-32S
esp32:
board: esp32dev
framework:
type: esp-idf
version: recommended
logger:
# level: VERBOSE
# Enable Home Assistant API
api:
encryption:
key: <key>
ota:
password: <key>
wifi:
ssid: !secret wifi_ssid
password: !secret wifi_password
# Enable fallback hotspot (captive portal) in case wifi connection fails
ap:
ssid: "Nodemcu-Esp-32S Fallback Hotspot"
password: "T18wj5zTcPbM"
captive_portal:
web_server:
port: 80
button:
- platform: factory_reset
id: factory_reset_btn
name: Factory reset
light:
- platform: status_led
id: gpio2_light
name: "Status led"
pin: GPIO2
- platform: esp32_rmt_led_strip
id: led
name: None
disabled_by_default: true
entity_category: config
pin: GPIO22
default_transition_length: 0s
chipset: WS2812
num_leds: 1
rgb_order: grb
rmt_channel: 0
effects:
- pulse:
name: "Slow Pulse"
transition_length: 250ms
update_interval: 250ms
min_brightness: 50%
max_brightness: 100%
- pulse:
name: "Fast Pulse"
transition_length: 100ms
update_interval: 100ms
min_brightness: 50%
max_brightness: 100%
i2s_audio:
- id: i2s_out
i2s_lrclk_pin: GPIO26
i2s_bclk_pin: GPIO27
- id: i2s_in
i2s_lrclk_pin: GPIO19
i2s_bclk_pin: GPIO18
speaker:
- platform: i2s_audio
id: echo_speaker
dac_type: external
i2s_audio_id: i2s_out
i2s_dout_pin: GPIO14
mode: mono
microphone:
- platform: i2s_audio
adc_type: external
pdm: false
id: echo_microphone
i2s_audio_id: i2s_in
i2s_din_pin: GPIO23
voice_assistant:
id: va
microphone: echo_microphone
speaker: echo_speaker
noise_suppression_level: 2
auto_gain: 31dBFS
# volume_multiplier: 2.0
vad_threshold: 3
on_listening:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
effect: "Slow Pulse"
on_stt_vad_end:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
effect: "Fast Pulse"
on_tts_start:
- light.turn_on:
id: led
blue: 50%
red: 0%
green: 50%
brightness: 100%
effect: none
- homeassistant.service:
service: tts.speak
data:
media_player_entity_id: media_player.office_black #replace this with your media player entity id
message: !lambda 'return x;'
entity_id: tts.piper #replace this with your piper tts entity id.
on_end:
- delay: 100ms
- wait_until:
not:
speaker.is_playing:
- script.execute: reset_led
# on_tts_end:
# - homeassistant.service:
# service: media_player.play_media
# data:
# entity_id: media_player.office_black
# media_content_id: !lambda 'return x;'
# media_content_type: music
# announce: "true"
# - script.execute: reset_led
on_error:
- light.turn_on:
id: led
red: 100%
green: 0%
blue: 0%
brightness: 100%
effect: none
- delay: 1s
- script.execute: reset_led
on_client_connected:
- if:
condition:
switch.is_on: use_wake_word
then:
- voice_assistant.start_continuous:
- script.execute: reset_led
on_client_disconnected:
- if:
condition:
switch.is_on: use_wake_word
then:
- voice_assistant.stop:
- light.turn_off: led
binary_sensor:
- platform: gpio
pin:
number: GPIO0
inverted: true
name: Button
disabled_by_default: true
entity_category: diagnostic
id: echo_button
on_multi_click:
- timing:
- ON for at least 250ms
- OFF for at least 50ms
then:
- if:
condition:
switch.is_off: use_wake_word
then:
- if:
condition: voice_assistant.is_running
then:
- voice_assistant.stop:
- script.execute: reset_led
else:
- voice_assistant.start:
else:
- voice_assistant.stop
- delay: 1s
- script.execute: reset_led
- script.wait: reset_led
- voice_assistant.start_continuous:
- timing:
- ON for at least 10s
then:
- button.press: factory_reset_btn
switch:
- platform: restart
name: "Restart"
- platform: template
name: Use wake word
id: use_wake_word
optimistic: true
restore_mode: RESTORE_DEFAULT_ON
entity_category: config
on_turn_on:
- lambda: id(va).set_use_wake_word(true);
- if:
condition:
not:
- voice_assistant.is_running
then:
- voice_assistant.start_continuous
- script.execute: reset_led
on_turn_off:
- voice_assistant.stop
- lambda: id(va).set_use_wake_word(false);
- script.execute: reset_led
- platform: template
name: Use listen light
id: use_listen_light
optimistic: true
restore_mode: RESTORE_DEFAULT_ON
entity_category: config
on_turn_on:
- script.execute: reset_led
on_turn_off:
- script.execute: reset_led
script:
- id: reset_led
then:
- if:
condition:
- switch.is_on: use_wake_word
- switch.is_on: use_listen_light
then:
- light.turn_on:
id: led
red: 100%
green: 89%
blue: 71%
brightness: 60%
effect: none
else:
- light.turn_off: led
external_components:
- source: github://pr#5230
components:
- esp_adf
refresh: 0s
esp_adf:
I’m new to Home Assistant and ESPHome, and I’m not sure where the yaml goes in my config, or if I need to add other keys.
Does the on_tts_start key need to be nested within another key? If so, how would I know what key that is, and what other keys are required to be nested within that key?
Thanks in advance for any help.
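For anyone with the same question: on_tts_start / on_tts_end are triggers that sit directly under the top-level voice_assistant: block of the device's ESPHome YAML (the same block that already lists your microphone and speaker), as the full configs above show. A minimal sketch with placeholder entity IDs:
voice_assistant:
  microphone: echo_microphone
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.my_media_player  # placeholder media player
          media_content_id: !lambda 'return x;'
          media_content_type: music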
Not sure if you got this working, but I just add the on_tts_end part; I don't have on_tts_start in my config and it works perfectly with my Korvo-1, see below. Don't forget to configure the device to allow it to make action (service) calls, as stated a few posts up…
on_tts_stream_start:
- lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
on_tts_end:
- homeassistant.service:
service: media_player.play_media
data:
entity_id: media_player..output_speaker
media_content_id: !lambda 'return x;'
media_content_type: music
announce: "true"
on_tts_stream_end:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
on_end:
Hi all! First off, thanks to everyone in this thread, as this was what enabled me to get output to my GHome speakers when setting up my new Atom Echos.
That said, some things have evolved recently, specifically with regard to being able to easily disable the onboard speakers of the Atom Echo and S3 Box so the Assist responses only come out of your chosen speaker: the !remove statement. So, in the end, this is my current fully working version of the config edit for the voice_assistant: block, which also includes some tweaks for the Atom Echo's microphone to give Assist an easier time understanding you in situations beyond total silence. With the noise and volume tweaks below, my Atoms can still hear me clearly from an entire room away most of the time:
voice_assistant:
noise_suppression_level: 4 # increase noise suppression to 3 -or- 4 from default 2 for better sound floor suppression
volume_multiplier: 5.0 # increase multiplier from 2.0 to 5.0 to give the mic a little boost...going above 5.0 with Atom Echo resulted in distorted audio for me
speaker: !remove # remove the default 'echo_speaker' entry so VA doesn't use internal speaker at all. NOTE: THIS ALSO DISABLES SOUND FOR TIMERS but LED will still flash on finish
on_tts_start: # this gets the TTS pipeline started earlier than 'on_tts_end' and reduces response delay for the user, but might not work for Amazon Echos
- homeassistant.service:
service: tts.cloud_say
data:
entity_id: media_player.my_speaker_2
message: !lambda 'return x;'
Instead of rewriting the entire on_timer_finished: block to push the timer audio to the media player, I'll probably just expose the timer_ringing switch to HA and automate off of that, since I would also like to be able to cancel timers from an HA notification on my phone instead of just the button on the Atom. This isn't a priority for me right now, as I don't use timers very often. I'm just glad to be able to hear my Atom's responses now without the faint crackly echo of the, well, Echo. haha
Hey guys!! I'm new to the forum. I found a solution while trying to get the sound to come out only on my Google speaker: I commented out a line in the speaker config, and it worked without any errors.
I tried nearly all the variations here, but all I get as output on my Amazon Echo device is that it tells me something about an HTTPS URL, not the real response.
Any idea?
Config looks like this:
on_tts_start:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
brightness: 100%
effect: none
- homeassistant.service:
service: tts.cloud_say
data:
entity_id: media_player.echo_wohnzimmer_2 #replace this with your media player entity id
message: !lambda 'return x;'
I finally got around to exposing timer_ringing (indirectly via new switch) so I could automate off of it. Now, via automations, I get a TTS announcement and a notification on my phone when a timer is finished.
Here is my current complete set of customizations for anyone wondering:
voice_assistant:
# Adjust Mic parameters for better understanding depending on room environment
noise_suppression_level: 2 # 1-4
volume_multiplier: 5.0 #1.0-6.0 (higher than 6 will result in major distortion)
# Don't use Atom's speaker at all
speaker: !remove
# Output response as a TTS to a chosen speaker
on_tts_start:
- homeassistant.service:
service: tts.cloud_say
data:
entity_id: media_player.den_speaker_2
message: !lambda 'return x;'
# I want to know when an Atom loses connection to HA, so blink the light fast red
on_client_disconnected:
then:
- voice_assistant.stop: {}
- micro_wake_word.stop: {}
- light.turn_on:
id: led
red: 1.0
green: 0.0
blue: 0.0
brightness: 1.0
effect: Fast Pulse
state: true
# Expose a restart button to HA so the Atom can be remotely rebooted
# (can fix a stuck pipeline or unstable wifi connection in multi-AP/mesh environments)
button:
- platform: restart
id: restart_btn
name: Reboot
disabled_by_default: false
icon: mdi:restart-alert
entity_category: config
device_class: restart
# Expose a new switch to HA to indicate timer_ringing AND ability to
# toggle it back to an off state (acknowledges the timer, same as pressing
# Atom's front button); automate using this switch
switch:
- platform: template
name: Timer Ringing
optimistic: true
lambda: |-
if (id(timer_ringing).state) {
return true;
} else {
return false;
}
turn_off_action:
- switch.turn_off: timer_ringing
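The HA-side automation isn't shown here, but a sketch of the kind of automation described above (TTS announcement plus a phone notification when the timer rings) could look like the following; the switch, TTS, and notify names are placeholders for your own setup:
alias: Atom Echo timer finished
trigger:
  - platform: state
    entity_id: switch.atom_echo_timer_ringing  # the exposed "Timer Ringing" switch
    to: "on"
action:
  - service: tts.cloud_say  # or tts.speak with your TTS entity
    data:
      entity_id: media_player.den_speaker_2
      message: "Your timer is finished."
  - service: notify.mobile_app_my_phone  # placeholder notify service
    data:
      message: "Atom Echo timer finished."
mode: single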
I believe Alexa devices work differently because the audio has to be sent to Amazon before being sent back down to the Echo (don’t get me started on Echo devices and Amazon’s data hoovering; suffice it to say don’t send anything you don’t want Amazon to aggressively use against you).
Config for an Atom Echo with TTS sent to a separate media player.
Uses tts.cloud_say, so it will require Home Assistant Cloud.
It automatically sets the volume to 20% for voice responses and returns to the previous volume after the response.
voice_assistant:
id: va
microphone: echo_microphone
speaker: echo_speaker
noise_suppression_level: 2
auto_gain: 31dBFS
volume_multiplier: 2.0
vad_threshold: 3
on_listening:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
brightness: 100%
effect: pulse
on_tts_start:
- light.turn_on:
id: led
blue: 0%
red: 0%
green: 100%
brightness: 100%
effect: pulse
- homeassistant.service:
service: media_player.volume_set
data:
entity_id: media_player.living_room_speaker # Replace with your media player entity_id
volume_level: 0.2 # Set volume to 20%
- homeassistant.service:
service: tts.cloud_say
data:
entity_id: media_player.living_room_speaker
data_template:
message: "{{ tts_message }}"
variables:
tts_message: return x;
on_end:
- light.turn_off: led
- homeassistant.service:
service: media_player.volume_set
data_template:
entity_id: media_player.living_room_speaker
volume_level: "{{ states('media_player.living_room_speaker.attributes.volume_level') }}"
- delay: 100ms
- wait_until:
not:
speaker.is_playing:
- script.execute: reset_led
on_error:
- light.turn_on:
id: led
blue: 0%
red: 100%
green: 0%
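One caveat with the volume-restore step above: by the time on_end runs, the player's current volume is already the lowered 20%, so reading it back just re-applies 0.2. A possible workaround (a sketch, not tested against this exact config) is to snapshot the media player with HA's scene.create before lowering the volume and restore the snapshot afterwards; the relevant voice_assistant: triggers would look roughly like this:
on_tts_start:
  - homeassistant.service:
      service: scene.create
      data:
        scene_id: before_tts  # creates scene.before_tts as a snapshot
        snapshot_entities: media_player.living_room_speaker
  - homeassistant.service:
      service: media_player.volume_set
      data:
        entity_id: media_player.living_room_speaker
        volume_level: 0.2
on_end:
  - homeassistant.service:
      service: scene.turn_on
      data:
        entity_id: scene.before_tts  # restores the snapshotted volume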
Here's a config that uses Piper instead of tts.cloud_say for local TTS.
I run Piper and faster-whisper via Docker on a separate server. You can simply use the Piper add-on if your hardware is decent, or run it as a container for very fast responses. As of 2024.12 my response times are a second or less.
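For reference, a docker-compose sketch for the two Wyoming containers; the image names and default ports are the standard rhasspy ones, and the voice/model choices are just examples:
services:
  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium  # example voice
    ports:
      - "10200:10200"  # default Wyoming Piper port
    volumes:
      - ./piper-data:/data
    restart: unless-stopped
  faster-whisper:
    image: rhasspy/wyoming-whisper
    command: --model small-int8 --language en  # example model
    ports:
      - "10300:10300"  # default Wyoming Whisper port
    volumes:
      - ./whisper-data:/data
    restart: unless-stopped
Point the Wyoming integration in Home Assistant at those two ports; the ESPHome side then looks like this: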
voice_assistant:
id: va
microphone: echo_microphone
speaker: echo_speaker
noise_suppression_level: 2
auto_gain: 31dBFS
volume_multiplier: 2.0
vad_threshold: 3
on_listening:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
brightness: 100%
effect: pulse
on_tts_start:
- light.turn_on:
id: led
blue: 0%
red: 0%
green: 100%
brightness: 100%
effect: pulse
- homeassistant.service:
service: media_player.volume_set
data:
entity_id: media_player.living_room_speaker # Replace with your media player entity_id
volume_level: 0.2 # Lower volume to 20%
- homeassistant.service:
service: tts.piper_say
data:
entity_id: media_player.living_room_speaker # Replace with your media player entity_id
message: "{{ tts_message }}"
variables:
tts_message: return x;
on_end:
- light.turn_off: led
- homeassistant.service:
service: media_player.volume_set
data_template:
entity_id: media_player.living_room_speaker
volume_level: "{{ states('media_player.living_room_speaker.attributes.volume_level') }}"
- delay: 100ms
- wait_until:
not:
speaker.is_playing:
- script.execute: reset_led
on_error:
- light.turn_on:
id: led
blue: 0%
red: 100%
green: 0%
For me, using this config, the wake word works, but then it just gets stuck in “Assist satellite → Processing”, with no errors in the logs. Before adding this to the config, the Atom Echo was working fine; it only had the bad speaker sound.