Hi everyone,
I recently picked up an M5Stack Atom Echo S3R over the holidays.
It differs from the original Atom Echo in two major ways:
No RGB LED (there is a green one behind the reset button, but as far as I can tell it isn't exposed to the ESP32 via any pin)
Much larger PSRAM
The lack of an LED means a typical voice pipeline now needs to play some sort of ping/ding sound so the user knows the wake word was heard. ESPHome has some great options for this in its components. Overall, the device is really good, with some minor quirks I'm documenting here so we can continue improving voice.
The Echo S3R only has one I2S bus (not two like the Voice PE), so the speaker and microphone take turns on the bus, walkie-talkie style. The public yaml for this device (esphome-yaml/common/atom-echos3r-satellite-base.yaml at 2fd326380b3ee362ddeaa0f101b1a77c195bd393 · m5stack/esphome-yaml · GitHub) works, but has some quirks. It's easy enough to play a sound over the I2S bus to a speaker in the "on_wake_word_detected" section, but we need to clear the bus before the next call to voice_assistant.start re-occupies it (the STT step uses the bus for the microphone). If we don't clear the bus after playing the sound, the voice_assistant component retries occupying it every 1 second, which is far too long for interactive voice commands. The public yaml plays a ding sound followed by a 300ms delay, which may or may not be long enough to clear the bus.
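The upstream approach looks roughly like this (a paraphrased sketch of the public yaml linked above, not a verbatim copy; the exact file id and action names may differ):

```yaml
# Sketch of the public yaml's approach: play the ding, then rely on a
# fixed 300ms delay being long enough for the I2S bus to free up before
# voice_assistant.start tries to claim it for the microphone.
micro_wake_word:
  on_wake_word_detected:
    - media_player.speaker.play_on_device_media_file:
        media_file: wake_word_triggered_sound
        announcement: true
    - delay: 300ms
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
```

If the speaker hasn't actually released the bus by the time the fixed delay expires, the voice_assistant component falls back to its 1-second retry loop, which is where the sluggishness comes from.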
A more reliable option is to forcibly stop the media_player and the speaker in rapid succession and then wait for the bus to clear, all inside the micro_wake_word component. This lets the voice_assistant component start capturing audio for STT very quickly after the wake word ding. We also get the chance to simply trim audio that's too long: the ding from the Voice PE project is 1 second long by default, and this config trims it to 250ms, which feels much more natural in my testing.
on_wake_word_detected:
  - script.execute:
      id: play_sound
      priority: true
      sound_file: !lambda return id(wake_word_triggered_sound);
  - delay: 250ms
  - media_player.stop:
  - speaker.stop:
  - wait_until:
      condition:
        speaker.is_stopped: i2s_speaker
  - voice_assistant.start:
      wake_word: !lambda return wake_word;
My testing also revealed that this won't work inside the voice_assistant component's own "on_wake_word_detected" option. Stopping either the microphone or the speaker from within the voice_assistant pipeline seemed to stop the pipeline itself, so this wake word ding config only works with on-device micro wake words.
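For contrast, this is the shape of the variant that did not work for me — a sketch of what to avoid, not a recommendation:

```yaml
# Does NOT work in my testing: stopping the speaker (or microphone) from
# inside the voice_assistant pipeline's own trigger caused the pipeline
# to stop itself before STT could start.
voice_assistant:
  on_wake_word_detected:
    - script.execute:
        id: play_sound
        priority: true
        sound_file: !lambda return id(wake_word_triggered_sound);
    - delay: 250ms
    - media_player.stop:
    - speaker.stop:  # <- this tears down the running pipeline
```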
Overall, really fun project, with a nice new device.
markss (Mark)
January 15, 2026, 2:34am
Hi @afloat5271, I'm so glad I found your topic!
I too picked up an Atom Echo S3R over the holidays but have not had the same success you have had in getting it working.
Would you mind sharing your complete YAML file including your changes in this thread please?
Sure! Here ya go. Please note that I’m using a custom “hey tater” wake word.
substitutions:
  # Phases of the Voice Assistant
  # The voice assistant is ready to be triggered by a wake word
  voice_assist_idle_phase_id: "1"
  # The voice assistant is listening for a voice command
  voice_assist_listening_phase_id: "2"
  # The voice assistant is currently processing the command
  voice_assist_thinking_phase_id: "3"
  # The voice assistant is replying to the command
  voice_assist_replying_phase_id: "4"
  # The voice assistant is not ready
  voice_assist_not_ready_phase_id: "10"
  # The voice assistant encountered an error
  voice_assist_error_phase_id: "11"
  # Muted phase
  voice_assist_muted_phase_id: "12"
  # Finished timer phase
  voice_assist_timer_finished_phase_id: "20"
esphome:
  name: m5echos3r
  friendly_name: m5echos3r
  on_boot:
    - priority: 600
      then:
        - delay: 30s
        - if:
            condition:
              lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;
    - priority: -100
      then:
        media_player.speaker.play_on_device_media_file:
          media_file: wake_word_triggered_sound
          announcement: false
# Enable logging
logger:
  # level: VERBOSE

# Enable Home Assistant API
api:
  encryption:
    key: ""

ota:
  - platform: esphome
    password: ""

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "M5Echos3R Fallback Hotspot"
    password: ""
esp32:
  board: esp32s3box
  flash_size: 8MB
  cpu_frequency: 240MHz
  framework:
    type: esp-idf
    ## Note: Disable these configurations if you face the boot loop issue.
    sdkconfig_options:
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_ESP32S3_INSTRUCTION_CACHE_32KB: "y"
      # Moves instructions and read-only data from flash into PSRAM on boot.
      # Enabling both allows instructions to execute during a flash operation
      # without needing to be placed in IRAM.
      # Considerably speeds up mWW at the cost of using more PSRAM.
      CONFIG_SPIRAM_RODATA: "y"
      CONFIG_SPIRAM_FETCH_INSTRUCTIONS: "y"
      CONFIG_BT_ALLOCATION_FROM_SPIRAM_FIRST: "y"
      CONFIG_BT_BLE_DYNAMIC_ENV_MEMORY: "y"
      CONFIG_MBEDTLS_EXTERNAL_MEM_ALLOC: "y"
      CONFIG_MBEDTLS_SSL_PROTO_TLS1_3: "y"  # TLS 1.3 support isn't enabled by default in IDF 5.1.5

psram:
  mode: octal
  speed: 80MHz
button:
  - platform: factory_reset
    id: factory_reset_btn
    internal: true

binary_sensor:
  - platform: gpio
    pin:
      number: GPIO41
      mode: INPUT_PULLUP
      inverted: true
    id: user_button
    internal: true
    on_multi_click:
      - timing:
          - ON for at least 50ms
          - OFF for at least 50ms
        then:
          - switch.turn_off: timer_ringing
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn
# I2C Bus Configuration
i2c:
  sda: GPIO45
  scl: GPIO0
  scan: false
  id: i2c0

# I2S Bus Configuration
i2s_audio:
  - id: i2s_audio_bus
    i2s_lrclk_pin: GPIO3
    i2s_bclk_pin: GPIO17
    i2s_mclk_pin: GPIO11

audio_dac:
  - platform: es8311
    id: es8311_dac
    bits_per_sample: 16bit
    sample_rate: 48000
microphone:
  - platform: i2s_audio
    id: i2s_mic
    sample_rate: 16000
    i2s_din_pin: GPIO4
    bits_per_sample: 16bit
    adc_type: external

speaker:
  - platform: i2s_audio
    id: i2s_speaker
    i2s_dout_pin: GPIO48
    dac_type: external
    sample_rate: 48000
    bits_per_sample: 16bit
    channel: left
    audio_dac: es8311_dac
    buffer_duration: 100ms
media_player:
  - platform: speaker
    name: None
    id: speaker_media_player
    volume_min: 0.5
    volume_max: 0.8
    announcement_pipeline:
      speaker: i2s_speaker
      format: FLAC
      sample_rate: 48000
      num_channels: 1  # Atom Echo S3R only has one output channel
    files:
      - id: wake_word_triggered_sound
        file: https://github.com/esphome/home-assistant-voice-pe/raw/dev/sounds/wake_word_triggered.flac
      - id: timer_finished_sound
        file: https://github.com/esphome/home-assistant-voice-pe/raw/dev/sounds/timer_finished.flac
      - id: error_cloud_expired
        file: https://github.com/esphome/home-assistant-voice-pe/raw/dev/sounds/error_cloud_expired.mp3
    on_announcement:
      # Stop the wake word (mWW or VA) if the mic is capturing
      - if:
          condition:
            - microphone.is_capturing:
          then:
            - script.execute: stop_wake_word
      # Ensure VA stops before moving on
      - if:
          condition:
            - lambda: return id(wake_word_engine_location).state == "In Home Assistant";
          then:
            - wait_until:
                - not:
                    voice_assistant.is_running:
      # Since VA isn't running, this is user-initiated media playback. Draw the mute display
      - if:
          condition:
            not:
              voice_assistant.is_running:
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    on_idle:
      # Since VA isn't running, this is the end of user-initiated media playback. Restart the wake word.
      - if:
          condition:
            not:
              voice_assistant.is_running:
          then:
            - script.execute: start_wake_word
            - script.execute: set_idle_or_mute_phase
micro_wake_word:
  id: mww
  microphone: i2s_mic
  stop_after_detection: false
  models:
    - model: https://github.com/TaterTotterson/microWakeWords/raw/main/microWakeWords/hey_tater.json
      id: hey_tater
    - model: https://github.com/kahrendt/microWakeWord/releases/download/stop/stop.json
      id: stop
      internal: true
  vad:
  on_wake_word_detected:
    - script.execute:
        id: play_sound
        priority: true
        sound_file: !lambda return id(wake_word_triggered_sound);
    - delay: 250ms
    - media_player.stop:
    - speaker.stop:
    - wait_until:
        condition:
          speaker.is_stopped: i2s_speaker
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
voice_assistant:
  id: va
  microphone: i2s_mic
  media_player: speaker_media_player
  micro_wake_word: mww
  use_wake_word: false
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
  on_tts_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
  on_end:
    # Wait a short amount of time to see if an announcement starts
    - wait_until:
        condition:
          - media_player.is_announcing:
        timeout: 0.5s
    # Announcement is finished and the I2S bus is free
    - wait_until:
        - and:
            - not:
                media_player.is_announcing:
            - not:
                speaker.is_playing:
    # Restart only mWW if enabled; streaming wake words automatically restart
    - if:
        condition:
          - lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - lambda: id(va).set_use_wake_word(false);
          - micro_wake_word.start:
    - script.execute: set_idle_or_mute_phase
  on_error:
    # Only set the error phase if the error code differs from duplicate_wake_up_detected or stt-no-text-recognized
    # These two are ignored for a better user experience
    - if:
        condition:
          and:
            - lambda: return !id(init_in_progress);
            - lambda: return code != "duplicate_wake_up_detected";
            - lambda: return code != "stt-no-text-recognized";
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    # If the error code is cloud-auth-failed, serve a local audio file guiding the user.
    - if:
        condition:
          - lambda: return code == "cloud-auth-failed";
        then:
          - script.execute:
              id: play_sound
              priority: true
              sound_file: !lambda return id(error_cloud_expired);
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - script.execute: start_wake_word
    - script.execute: set_idle_or_mute_phase
  on_client_disconnected:
    - script.execute: stop_wake_word
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
  on_timer_finished:
    - switch.turn_on: timer_ringing
    - wait_until:
        media_player.is_announcing:
    - lambda: id(voice_assistant_phase) = ${voice_assist_timer_finished_phase_id};
script:
  # Starts either mWW or the streaming wake word, depending on the configured location
  - id: start_wake_word
    then:
      - if:
          condition:
            and:
              - not:
                  - voice_assistant.is_running:
              - lambda: return id(wake_word_engine_location).state == "On device";
          then:
            - lambda: id(va).set_use_wake_word(false);
            - micro_wake_word.start:
      - if:
          condition:
            and:
              - not:
                  - voice_assistant.is_running:
              - lambda: return id(wake_word_engine_location).state == "In Home Assistant";
          then:
            - lambda: id(va).set_use_wake_word(true);
            - voice_assistant.start_continuous:
  # Stops either mWW or the streaming wake word, depending on the configured location
  - id: stop_wake_word
    then:
      - if:
          condition:
            lambda: return id(wake_word_engine_location).state == "In Home Assistant";
          then:
            - lambda: id(va).set_use_wake_word(false);
            - voice_assistant.stop:
      - if:
          condition:
            lambda: return id(wake_word_engine_location).state == "On device";
          then:
            - micro_wake_word.stop:
  # Set the voice assistant phase to idle or muted, depending on whether the software mute switch is activated
  - id: set_idle_or_mute_phase
    then:
      - if:
          condition:
            switch.is_off: mute
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          else:
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  - id: play_sound
    parameters:
      priority: bool
      sound_file: "audio::AudioFile*"
    then:
      - lambda: |-
          if (priority) {
            id(speaker_media_player)
                ->make_call()
                .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_STOP)
                .set_announcement(true)
                .perform();
          }
          if ((id(speaker_media_player).state != media_player::MediaPlayerState::MEDIA_PLAYER_STATE_ANNOUNCING) || priority) {
            id(speaker_media_player)->play_file(sound_file, true, false);
          }
      - script.execute: stop_wake_word
switch:
  - platform: gpio
    name: Speaker Enable
    pin: GPIO18
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    disabled_by_default: true
    internal: true
  - platform: template
    name: Mute
    id: mute
    icon: "mdi:microphone-off"
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - microphone.unmute:
      - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    on_turn_on:
      - microphone.mute:
      - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  - platform: template
    id: timer_ringing
    optimistic: true
    internal: true
    restore_mode: ALWAYS_OFF
    on_turn_off:
      # Turn off the repeat mode and disable the pause between playlist items
      - lambda: |-
          id(speaker_media_player)
              ->make_call()
              .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_OFF)
              .set_announcement(true)
              .perform();
          id(speaker_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 0);
      # Stop playing the alarm
      - media_player.stop:
          announcement: true
    on_turn_on:
      # Turn on the repeat mode and pause for 1000 ms between playlist items/repeats
      - lambda: |-
          id(speaker_media_player)
              ->make_call()
              .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_ONE)
              .set_announcement(true)
              .perform();
          id(speaker_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 1000);
      - media_player.speaker.play_on_device_media_file:
          media_file: timer_finished_sound
          announcement: true
      - delay: 15min
      - switch.turn_off: timer_ringing
select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    icon: "mdi:account-voice"
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - wait_until:
                lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
            - if:
                condition:
                  lambda: return x == "In Home Assistant";
                then:
                  - micro_wake_word.stop
                  - delay: 500ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - lambda: id(va).set_use_wake_word(true);
                        - voice_assistant.start_continuous:
            - if:
                condition:
                  lambda: return x == "On device";
                then:
                  - lambda: id(va).set_use_wake_word(false);
                  - voice_assistant.stop
                  - delay: 500ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - micro_wake_word.start
  - platform: template
    name: "Wake word sensitivity"
    optimistic: true
    initial_option: Slightly sensitive
    restore_value: true
    entity_category: config
    options:
      - Slightly sensitive
      - Moderately sensitive
      - Very sensitive
    on_value:
      # Sets specific wake word probabilities computed for each particular model.
      # Note: probability cutoffs are set as a quantized uint8 value; each comment gives the corresponding floating-point cutoff.
      # False Accepts per Hour values are tested against all units and channels from the Dinner Party Corpus.
      # These cutoffs apply only to the specific models included in the firmware: [email protected], hey_jarvis@v2, hey_mycroft@v2
      lambda: |-
        if (x == "Slightly sensitive") {
          id(hey_tater).set_probability_cutoff(250);  // 0.97 -> 0.563 FAPH on DipCo (manifest's default)
        } else if (x == "Moderately sensitive") {
          id(hey_tater).set_probability_cutoff(245);  // 0.92 -> 0.939 FAPH on DipCo
        } else if (x == "Very sensitive") {
          id(hey_tater).set_probability_cutoff(222);  // 0.83 -> 1.502 FAPH on DipCo
        }
globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}