Play sound when Wakeword detected

spectralMachina · July 3, 2024, 4:42pm

Managed to get the file down to 7kb, and it seems to play okay — unfortunately, still having an issue actually getting the playback to happen. Sometimes it starts okay, but then after finishing the response it flicks straight back into “assist mode” and flashes the lights, causing a loop.

Normally though, it hears the wake word, plays the sound, and then flashes red after my voice command has finished and struggles to play back the response.

Here’s my config, along with some error logs:

substitutions:
  name: <NAME_HERE>
  friendly_name: Atom Echo
packages:
  m5stack.atom-echo-voice-assistant: github://esphome/firmware/voice-assistant/m5stack-atom-echo.yaml@main
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
api:
  encryption:
    key: <KEY HERE>

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

external_components:
  - source: github://jesserockz/esphome-components
    components: [file]

file:
  - id: ok_sound
    file: awake.raw

voice_assistant:
  on_wake_word_detected:
    - switch.turn_on: play_wakeword_sound
    - wait_until:
        not:
          - voice_assistant.is_running
    - lambda: id(echo_speaker).play(id(ok_sound), sizeof(id(ok_sound)));
    - delay: 1s
    - wait_until:
        not:
          speaker.is_playing:
    - voice_assistant.start_continuous
  on_stt_end:
    - if:
        condition:
          switch.is_on: play_wakeword_sound
        then:
          - switch.turn_off: play_wakeword_sound

switch:
  - platform: template
    name: Play Wakeword sound
    id: play_wakeword_sound
    optimistic: true
    internal: true
    restore_mode: RESTORE_DEFAULT_OFF
    on_turn_on:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);
    on_turn_off:
      - delay: 5s
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous

[17:34:45][D][voice_assistant:636]: Wake word detected
[17:34:45][D][switch:012]: 'Play Wakeword sound' Turning ON.
[17:34:45][D][switch:055]: 'Play Wakeword sound': Sending state ON
[17:34:45][D][voice_assistant:620]: Signaling stop...
[17:34:45][D][voice_assistant:504]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[17:34:45][D][voice_assistant:510]: Desired state set to IDLE
[17:34:45][D][voice_assistant:627]: Event Type: 3
[17:34:45][D][voice_assistant:641]: STT started
[17:34:45][D][voice_assistant:504]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[17:34:45][D][light:036]: 'Atom Echo' Setting:
[17:34:45][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[17:34:45][D][light:109]:   Effect: 'Slow Pulse'
[17:34:45][D][esp-idf:000]: I (60569) I2S: DMA queue destroyed
[17:34:45]
[17:34:45][D][voice_assistant:504]: State changed from STOPPING_MICROPHONE to IDLE
[17:34:45][D][esp-idf:000][speaker_task]: I (60581) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8
[17:34:45]
[17:34:45][D][i2s_audio.speaker:203]: Starting I2S Audio Speaker
[17:34:45][D][i2s_audio.speaker:206]: Started I2S Audio Speaker
[17:34:46][D][esp-idf:000][speaker_task]: I (60908) I2S: DMA queue destroyed
[17:34:46]
[17:34:46][D][i2s_audio.speaker:210]: Stopping I2S Audio Speaker
[17:34:46][D][i2s_audio.speaker:222]: Stopped I2S Audio Speaker
[17:34:46][I][safe_mode:041]: Boot seems successful; resetting boot loop counter
[17:34:46][D][esp32.preferences:114]: Saving 2 preferences to flash...
[17:34:46][D][esp32.preferences:143]: Saving 2 preferences to flash: 0 cached, 2 written, 0 failed
[17:34:46][D][voice_assistant:504]: State changed from IDLE to START_MICROPHONE
[17:34:46][D][voice_assistant:510]: Desired state set to START_PIPELINE
[17:34:46][D][voice_assistant:221]: Starting Microphone
[17:34:46][D][voice_assistant:504]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[17:34:46][D][esp-idf:000]: I (61586) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

...

[17:37:51][W][i2s_audio.speaker:042]: Called start while task has been already created.
[17:37:51][W][i2s_audio.speaker:042]: Called start while task has been already created.
[17:37:51][E][voice_assistant:804]: Cannot receive audio, buffer is full
[17:37:51][W][i2s_audio.speaker:042]: Called start while task has been already created.
[17:37:51][W][i2s_audio.speaker:042]: Called start while task has been already created.
[17:37:51][W][i2s_audio.speaker:042]: Called start while task has been already created.

I tried with no delay, a 0.5s delay, and a 1s delay; all of them currently produce the same “assist start” > “flash red” > “fail to respond” loop.

spectralMachina · July 3, 2024, 4:50pm

And here’s a full log from restarting the device, speaking the wakeword, and asking it to turn on the bedroom light (which it misheard as “battery light”):

[17:47:04][D][voice_assistant:504]: State changed from STREAMING_MICROPHONE to WAIT_FOR_VAD
[17:47:04][D][voice_assistant:510]: Desired state set to WAITING_FOR_VAD
[17:47:04][D][voice_assistant:245]: Waiting for speech...
[17:47:04][D][voice_assistant:504]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[17:47:04][D][voice_assistant:258]: VAD detected speech
[17:47:04][D][voice_assistant:504]: State changed from WAITING_FOR_VAD to START_PIPELINE
[17:47:04][D][voice_assistant:510]: Desired state set to STREAMING_MICROPHONE
[17:47:04][D][voice_assistant:275]: Requesting start...
[17:47:04][D][voice_assistant:504]: State changed from START_PIPELINE to STARTING_PIPELINE
[17:47:04][D][voice_assistant:525]: Client started, streaming microphone
[17:47:04][D][voice_assistant:504]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[17:47:04][D][voice_assistant:510]: Desired state set to STREAMING_MICROPHONE
[17:47:04][D][light:036]: 'Atom Echo' Setting:
[17:47:04][D][light:051]:   Brightness: 60%
[17:47:04][D][light:059]:   Red: 100%, Green: 89%, Blue: 71%
[17:47:04][D][voice_assistant:627]: Event Type: 1
[17:47:04][D][voice_assistant:630]: Assist Pipeline running
[17:47:04][D][voice_assistant:627]: Event Type: 9
[17:47:09][D][voice_assistant:627]: Event Type: 10
[17:47:09][D][voice_assistant:636]: Wake word detected
[17:47:09][D][switch:012]: 'Play Wakeword sound' Turning ON.
[17:47:09][D][switch:055]: 'Play Wakeword sound': Sending state ON
[17:47:09][D][voice_assistant:620]: Signaling stop...
[17:47:09][D][voice_assistant:504]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[17:47:09][D][voice_assistant:510]: Desired state set to IDLE
[17:47:09][D][voice_assistant:627]: Event Type: 3
[17:47:09][D][voice_assistant:641]: STT started
[17:47:09][D][voice_assistant:504]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[17:47:09][D][light:036]: 'Atom Echo' Setting:
[17:47:09][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[17:47:09][D][light:109]:   Effect: 'Slow Pulse'
[17:47:09][D][esp-idf:000]: I (55511) I2S: DMA queue destroyed
[17:47:09]
[17:47:09][D][voice_assistant:504]: State changed from STOPPING_MICROPHONE to IDLE
[17:47:09][D][esp-idf:000][speaker_task]: I (55523) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8
[17:47:09]
[17:47:09][D][i2s_audio.speaker:203]: Starting I2S Audio Speaker
[17:47:09][D][i2s_audio.speaker:206]: Started I2S Audio Speaker
[17:47:09][D][esp-idf:000][speaker_task]: I (55850) I2S: DMA queue destroyed
[17:47:09]
[17:47:09][D][i2s_audio.speaker:210]: Stopping I2S Audio Speaker
[17:47:09][D][i2s_audio.speaker:222]: Stopped I2S Audio Speaker
[17:47:09][D][voice_assistant:504]: State changed from IDLE to START_MICROPHONE
[17:47:09][D][voice_assistant:510]: Desired state set to START_PIPELINE
[17:47:09][D][voice_assistant:221]: Starting Microphone
[17:47:09][D][voice_assistant:504]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[17:47:09][D][esp-idf:000]: I (56026) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4
[17:47:09]
[17:47:09][D][voice_assistant:504]: State changed from STARTING_MICROPHONE to START_PIPELINE
[17:47:09][D][voice_assistant:275]: Requesting start...
[17:47:09][D][voice_assistant:504]: State changed from START_PIPELINE to STARTING_PIPELINE
[17:47:09][D][voice_assistant:525]: Client started, streaming microphone
[17:47:09][D][voice_assistant:504]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[17:47:09][D][voice_assistant:510]: Desired state set to STREAMING_MICROPHONE
[17:47:09][D][voice_assistant:627]: Event Type: 1
[17:47:09][D][voice_assistant:630]: Assist Pipeline running
[17:47:09][D][voice_assistant:627]: Event Type: 3
[17:47:09][D][voice_assistant:641]: STT started
[17:47:09][D][light:036]: 'Atom Echo' Setting:
[17:47:09][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[17:47:11][D][voice_assistant:627]: Event Type: 11
[17:47:11][D][voice_assistant:781]: Starting STT by VAD
[17:47:12][D][voice_assistant:627]: Event Type: 12
[17:47:12][D][voice_assistant:785]: STT by VAD end
[17:47:12][D][voice_assistant:504]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[17:47:12][D][voice_assistant:510]: Desired state set to AWAITING_RESPONSE
[17:47:12][D][voice_assistant:504]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[17:47:12][D][light:036]: 'Atom Echo' Setting:
[17:47:12][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[17:47:12][D][light:109]:   Effect: 'Fast Pulse'
[17:47:12][D][esp-idf:000]: I (58467) I2S: DMA queue destroyed
[17:47:12]
[17:47:12][D][voice_assistant:504]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[17:47:15][I][safe_mode:041]: Boot seems successful; resetting boot loop counter
[17:47:15][D][esp32.preferences:114]: Saving 2 preferences to flash...
[17:47:15][D][esp32.preferences:143]: Saving 2 preferences to flash: 0 cached, 2 written, 0 failed
[17:47:16][D][voice_assistant:627]: Event Type: 4
[17:47:16][D][voice_assistant:655]: Speech recognised as: " ."
[17:47:16][D][switch:016]: 'Play Wakeword sound' Turning OFF.
[17:47:16][D][switch:055]: 'Play Wakeword sound': Sending state OFF
[17:47:16][D][voice_assistant:627]: Event Type: 5
[17:47:16][D][voice_assistant:660]: Intent started
[17:47:16][D][voice_assistant:627]: Event Type: 6
[17:47:16][D][voice_assistant:627]: Event Type: 7
[17:47:16][D][voice_assistant:683]: Response: "Sorry, I couldn't understand that"
[17:47:16][D][light:036]: 'Atom Echo' Setting:
[17:47:16][D][light:051]:   Brightness: 100%
[17:47:16][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[17:47:16][D][light:109]:   Effect: 'None'
[17:47:16][D][voice_assistant:627]: Event Type: 98
[17:47:16][D][voice_assistant:768]: TTS stream start
[17:47:16][D][esp-idf:000][speaker_task]: I (63323) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8
[17:47:16]
[17:47:16][D][voice_assistant:504]: State changed from IDLE to START_MICROPHONE
[17:47:16][D][voice_assistant:510]: Desired state set to START_PIPELINE
[17:47:16][D][voice_assistant:221]: Starting Microphone
[17:47:16][D][voice_assistant:504]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[17:47:16][D][esp-idf:000][speaker_task]: I (63430) I2S: DMA queue destroyed
[17:47:16]
[17:47:19][D][voice_assistant:627]: Event Type: 4
[17:47:19][D][voice_assistant:655]: Speech recognised as: " Turn on the battery light"
[17:47:19][D][voice_assistant:627]: Event Type: 5
[17:47:19][D][voice_assistant:660]: Intent started
[17:47:20][D][voice_assistant:627]: Event Type: 6
[17:47:20][D][voice_assistant:627]: Event Type: 7
[17:47:20][D][voice_assistant:683]: Response: "Sorry, I am not aware of any device called battery"
[17:47:20][D][light:036]: 'Atom Echo' Setting:
[17:47:20][D][light:051]:   Brightness: 100%
[17:47:20][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[17:47:20][W][i2s_audio.speaker:042]: Called start while task has been already created.
[17:47:20][D][voice_assistant:627]: Event Type: 98
[17:47:20][D][voice_assistant:768]: TTS stream start
[17:47:20][D][voice_assistant:627]: Event Type: 8
[17:47:20][D][voice_assistant:703]: Response URL: "http://<IP_ADDR>:8123/api/tts_proxy/c53d8f663e718f89133e7251914040faa6866ede_en-gb_799a32846e_tts.piper.wav"
[17:47:20][D][voice_assistant:504]: State changed from STARTING_MICROPHONE to STREAMING_RESPONSE
[17:47:20][D][voice_assistant:510]: Desired state set to STREAMING_RESPONSE
[17:47:20][D][voice_assistant:627]: Event Type: 2
[17:47:20][D][voice_assistant:717]: Assist Pipeline ended
[17:47:20][D][light:036]: 'Atom Echo' Setting:
[17:47:20][D][light:051]:   Brightness: 60%
[17:47:20][D][light:059]:   Red: 100%, Green: 89%, Blue: 71%
[17:47:21][W][i2s_audio.speaker:042]: Called start while task has been already created.
[17:47:21][W][i2s_audio.speaker:042]: Called start while task has been already created.
[17:47:21][W][i2s_audio.speaker:042]: Called start while task has been already created.

umglurf · July 4, 2024, 1:52am

Hi

Not sure why it won’t work for you; I suspect there is some race condition happening. I tried to simplify the setup a bit and got it working with just the following (no need for the on_stt_end handler or the play_wakeword_sound switch):

  on_wake_word_detected:
    - voice_assistant.stop:
    - lambda: id(va).set_use_wake_word(false);
    - wait_until:
        not:
          - voice_assistant.is_running:
    - lambda: id(echo_speaker).play(id(ok_sound), sizeof(id(ok_sound)));
    - wait_until:
        not:
          speaker.is_playing:
    - voice_assistant.start_continuous:
    - delay: 10s
    - lambda: id(va).set_use_wake_word(true);

You can also try even longer delays before re-enabling the wake word. This whole setup is quite hacky; it would be nice if a wake word sound were added properly to the voice assistant component in ESPHome.

umglurf · July 4, 2024, 2:34am

Inspired by this comment, I was able to get it working without the stop/start of voice assistant.

  on_wake_word_detected:
    - microphone.stop_capture:
    - wait_until:
        not:
          microphone.is_capturing:
    - lambda: id(echo_speaker).play(id(ok_sound), sizeof(id(ok_sound)));
  on_listening:
    - delay: 600ms
    - wait_until:
        not:
          speaker.is_playing:
    - microphone.capture:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        effect: "Slow Pulse"

spectralMachina · July 4, 2024, 10:05pm

This one’s worked an absolute treat! Seems to be holding up alright after a few test runs.

Thank you so much for your assistance on getting this to work, it’s really appreciated! It’s such a shame this isn’t built-in functionality, I agree — a basic feature of pretty much every voice assistant is some kind of noise to say they’re listening, so to see it missing from ESPHome is quite frustrating.

Your code is working great for me as a stop-gap though, so thank you again for helping to diagnose and solve this issue!

Merc · July 13, 2024, 12:06pm

Hi @umglurf,
would you mind sharing your full configuration file?

Thanks,

Merc

umglurf · July 13, 2024, 1:11pm

Hi @Merc, here’s the full configuration file

---
substitutions:
  name: "m5stack-atom-echo"
  friendly_name: "M5Stack Atom Echo"

esphome:
  name: "${name}"
  friendly_name: "${friendly_name}"
  name_add_mac_suffix: true
  project:
    name: m5stack.atom-echo-voice-assistant
    version: "1.0"
  min_version: 2024.6.0

file:
  - id: ok_sound
    file: googleok-sound-beep.raw
  - id: timer_finished_wave_file
    file: https://github.com/esphome/firmware/raw/main/voice-assistant/sounds/timer_finished.wav

esp32:
  board: m5stack-atom
  framework:
    type: esp-idf

logger:
api:
  encryption:
    key: "XX"
ota:
  platform: esphome

dashboard_import:
  package_import_url: github://esphome/firmware/voice-assistant/m5stack-atom-echo.yaml@main

wifi:
  ssid: "XX"
  password: "XX"

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

i2s_audio:
  i2s_lrclk_pin: GPIO33
  i2s_bclk_pin: GPIO19

microphone:
  - platform: i2s_audio
    id: echo_microphone
    i2s_din_pin: GPIO23
    adc_type: external
    pdm: true

speaker:
  - platform: i2s_audio
    id: echo_speaker
    i2s_dout_pin: GPIO22
    dac_type: external
    mode: mono

voice_assistant:
  id: va
  microphone: echo_microphone
  speaker: echo_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  vad_threshold: 3
  on_wake_word_detected:
    - microphone.stop_capture:
    - wait_until:
        not:
          microphone.is_capturing:
    - lambda: id(echo_speaker).play(id(ok_sound), sizeof(id(ok_sound)));
  on_listening:
    - delay: 600ms
    - wait_until:
        not:
          speaker.is_playing:
    - microphone.capture:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        effect: "Slow Pulse"
  on_stt_vad_end:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        effect: "Fast Pulse"
  on_tts_start:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: none
  on_end:
    - delay: 100ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led
        red: 100%
        green: 0%
        blue: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
  on_client_connected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:
          - script.execute: reset_led
  on_client_disconnected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.stop:
          - light.turn_off: led
  on_timer_finished:
    - voice_assistant.stop:
    - switch.turn_on: timer_ringing
    - wait_until:
        not:
          microphone.is_capturing:
    - light.turn_on:
        id: led
        red: 0%
        green: 100%
        blue: 0%
        brightness: 100%
        effect: "Fast Pulse"
    - while:
        condition:
          switch.is_on: timer_ringing
        then:
          - lambda: id(echo_speaker).play(id(timer_finished_wave_file), sizeof(id(timer_finished_wave_file)));
          - delay: 1s
    - wait_until:
        not:
          speaker.is_playing:
    - light.turn_off: led
    - switch.turn_off: timer_ringing
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:
          - script.execute: reset_led

binary_sensor:
  - platform: gpio
    pin:
      number: GPIO39
      inverted: true
    name: Button
    disabled_by_default: true
    entity_category: diagnostic
    id: echo_button
    on_multi_click:
      - timing:
          - ON for at least 50ms
          - OFF for at least 50ms
        then:
          - if:
              condition:
                switch.is_on: timer_ringing
              then:
                - switch.turn_off: timer_ringing
              else:
                - if:
                    condition:
                      switch.is_off: use_wake_word
                    then:
                      - if:
                          condition: voice_assistant.is_running
                          then:
                            - voice_assistant.stop:
                            - script.execute: reset_led
                          else:
                            - voice_assistant.start:
                    else:
                      - voice_assistant.stop
                      - delay: 1s
                      - script.execute: reset_led
                      - script.wait: reset_led
                      - voice_assistant.start_continuous:
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn

light:
  - platform: esp32_rmt_led_strip
    id: led
    name: None
    disabled_by_default: true
    entity_category: config
    pin: GPIO27
    default_transition_length: 0s
    chipset: SK6812
    num_leds: 1
    rgb_order: grb
    rmt_channel: 0
    effects:
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

script:
  - id: reset_led
    then:
      - if:
          condition:
            - switch.is_on: use_wake_word
            - switch.is_on: use_listen_light
          then:
            - light.turn_on:
                id: led
                red: 100%
                green: 89%
                blue: 71%
                brightness: 60%
                effect: none
          else:
            - light.turn_off: led

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);
      - script.execute: reset_led
  - platform: template
    name: Use listen light
    id: use_listen_light
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - script.execute: reset_led
  - platform: template
    id: timer_ringing
    optimistic: true
    internal: true
    restore_mode: ALWAYS_OFF
    on_turn_on:
      - delay: 15min
      - switch.turn_off: timer_ringing
    on_turn_off:
      - script.execute: reset_led


external_components:
  - source: github://pr#5230
    components:
      - esp_adf
    refresh: 0s
  - source: github://jesserockz/esphome-components
    components: [file]

esp_adf:

Merc · July 13, 2024, 3:00pm

Hi @umglurf,
awesome. Thanks a lot for the quick reply and for sharing the code.
It took a little while to get the sound file I am using into the correct format, but now it works. It’s so helpful to get some response from the Echo without having to look for the pulsating light.

One thing that I am never sure about with the ESP devices: what happens when the next ESP update is released? Will it overwrite all the settings in the config file and remove the wakeword sound again?

Thanks again!

Merc

umglurf · July 13, 2024, 3:43pm

Hi @Merc, glad to hear it worked for you.
If you update through Home Assistant, you will lose the changes. When the new timer feature was released, I looked through the published config file, applied the new changes to my own config, and then built and uploaded the firmware myself.

Merc · July 13, 2024, 4:29pm

Thanks umglurf,
I thought something like that might be the case.
I hope that sooner or later the sound will be included as an option in the official release.

Cheers,

Merc

thiagobruch · July 15, 2024, 3:31pm

Hi @umglurf, thanks a lot for the config file.

Sorry if this is a stupid question, but I wanted to check if it would be possible to not receive the answer via the M5Echo Speaker.

My wife uses Alexa for everything (turning on lights, AC, etc.) and used to use it for the Shopping List as well. However, Amazon shut down the Shopping List API, so I can no longer “transfer the shopping list to HA”.
My solution was to use the HA Assistant to add things to the Shopping List directly (and let my wife keep using Alexa for everything else).

Right now, items are being added to the shopping list and the answer is played on both the M5 Echo and Alexa. Is there a way to keep the listening beep (googleok-sound-beep.raw) but suppress the response (i.e. “added butter”)?

Thank you

umglurf · July 16, 2024, 5:49am

Hi @thiagobruch

I don’t think there are any options for this out of the box; maybe someone else knows? A few ideas that might work: first, set speaker in the voice_assistant section to a non-valid id, for example no_speaker. Second, create a new voice assistant config in Home Assistant, set text-to-speech in it to none, and assign it to the M5 Echo in the device settings. Third, if you are using AI, modify the prompt and tell it not to respond when adding to the shopping list.
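
A variation on the speaker idea, as an untested sketch only: rather than pointing speaker at an id that does not exist (which ESPHome may reject at validation), leave the speaker key out of voice_assistant entirely. This assumes speaker is optional in voice_assistant and that the device then skips playing the spoken response. The speaker component itself stays defined, so the wake sound can still be played from the lambda. All ids and pins are taken from the full config posted earlier in the thread.

```yaml
# Untested sketch: keep the speaker component so the wake sound can still be
# played from a lambda, but do not hand it to voice_assistant.
speaker:
  - platform: i2s_audio
    id: echo_speaker
    i2s_dout_pin: GPIO22
    dac_type: external
    mode: mono

voice_assistant:
  id: va
  microphone: echo_microphone
  # speaker: echo_speaker   # assumption: omitting this stops response playback on the device
  on_wake_word_detected:
    - microphone.stop_capture:
    - wait_until:
        not:
          microphone.is_capturing:
    # The beep is sent to the speaker directly, independent of voice_assistant.
    - lambda: id(echo_speaker).play(id(ok_sound), sizeof(id(ok_sound)));
```

The other handlers (on_listening, on_end, and so on) would stay as they are in the full config.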

ThreepE0 · July 16, 2024, 10:30pm

I’ve found that commenting out all the speaker stuff really helps to stabilize things. I didn’t like using the esp speaker for feedback, so I switched to using browser_mod if I’m at my computer, a sonos speaker, or another network endpoint.

I’m likely to delete my fork some time soon, but here’s a link for a config that is working for me at the moment.

Note: you’ll need to upload a file to your home assistant’s local media and grab the content ID for that. Mine is ack.mp3 in this example. Also I have the tts entity specified in the config, but that isn’t currently doing anything. Was just experimenting with having tts say “yes?” instead of playing a sound as was described at the top of this post. The sound seems a bit neater/cleaner and faster.

I do notice that on the odd occasion, the sound being played gets picked up on the Echo, and I get an error regarding “no speech detected.” But that happens far less often than the myriad other strange crashes and issues (audio buffer full, reboots, etc) I had when the internal speaker was enabled.

substitutions:
  name: m5stack-atom-echo-studio
  friendly_name: M5Stack Atom Echo Studio
  media_player_entity: "media_player.lantern"
  tts_entity: "tts.tortoise"

packages:
  m5stack.atom-echo-voice-assistant: github://ThreepE0/firmware/voice-assistant/m5stack-atom-echo3.yaml@2871dddff012e20420b4ee77afaf5369948ba93b?v=23
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
api:
  encryption:
    key: your_encryption_key
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

voice_assistant:
  on_wake_word_detected:
    - homeassistant.service:
         service: media_player.play_media
         data:
           entity_id: ${media_player_entity}
           media_content_id: "media-source://media_source/local/ack.mp3"
           media_content_type: music
           announce: "true"

Thanks to everyone here in this thread that helped me piece this together using their examples.

m4v3r1ck · October 9, 2024, 5:59pm

Since updating to ESPHome 2024.10.0b1 I can’t update any of my Atom Echoes anymore.

INFO ESPHome 2024.10.0b1
INFO Reading configuration /config/esphome/m5stack-atom-echo-803494.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
INFO Updating https://github.com/jesserockz/esphome-components.git@None
INFO Unable to import component file: No module named 'magic'
Failed config

file: [source /data/packages/********/voice-assistant/m5stack-atom-echo.yaml:305]
  
  Component not found: file.
  - id: timer_finished_wave_file
    file: |-
      https://github.com/esphome/firmware/raw/main/voice-assistant/sounds/timer_finished.wav

PS: Running Clean Build Files produces the same error.

m4v3r1ck · October 17, 2024, 12:50pm

Resolved with latest ESPHome (beta) 2024.10.0 and latest HA:

Core 2024.10.2
Supervisor 2024.10.2
Operating System 13.2

======================== [SUMMARY] ========================
  - /config/esphome/esp32-s3-box-3.yaml: SUCCESS
  - /config/esphome/everything-presence-lite-301838.yaml: SUCCESS
  - /config/esphome/m5stack-atom-echo-803494.yaml: SUCCESS
  - /config/esphome/m5stack-atom-echo-80b520.yaml: SUCCESS

Wetzel402 · November 20, 2024, 8:42pm

Hey all,

I’m trying this on my build with a Wemos D1 Mini, MAX98357, ICS43434, and Adafruit 1314 speaker, and I find my wake sound is heavily distorted. Has anyone else run into this, and were you able to resolve it?

Modifying auto gain and volume multiplier doesn’t seem to help.

external_components:
  - source: github://jesserockz/esphome-components
    components: [file]

file:
  - id: wake_word_triggered_sound
    file: sounds_wake_word_triggered.wav
  - id: timer_finished_sound
    file: voice-assistant_sounds_timer_finished.wav

micro_wake_word:
  models:
    - model: hey_jarvis
  on_wake_word_detected:
    - wait_until:
        not:
          microphone.is_capturing:
    - lambda: id(spk).play(id(wake_word_triggered_sound), sizeof(id(wake_word_triggered_sound)));
    - delay: 500ms
    - voice_assistant.start: 

voice_assistant:
  microphone: mic
  noise_suppression_level: 2
  #auto_gain: 31dBFS
  #volume_multiplier: 2.0
  speaker: spk
  id: va

Edit: I find some files don’t play correctly at all. Instead of being distorted, I just get a burst of static for the duration of the sound file.

Edit2: I found an official page on playing audio, and following it solved my issue: the sound now comes through perfectly. It’s just a bit more work.

https://esphome.io/guides/audio_clips_for_i2s.html
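
For anyone landing here later, a rough sketch of what that approach can look like with the ids from the config above. It assumes the clip has been exported as header-less signed 16-bit PCM (mono, at the sample rate the speaker is configured for) and turned into a list of bytes as the guide describes. The bytes below are placeholders, not a real sound, and the wait_until step is an addition borrowed from the earlier posts in this thread rather than something the guide requires.

```yaml
micro_wake_word:
  models:
    - model: hey_jarvis
  on_wake_word_detected:
    # Placeholder bytes only; paste the byte list generated from your own clip.
    - speaker.play:
        id: spk
        data: [0x00, 0x00, 0x10, 0x00, 0x20, 0x00, 0x10, 0x00]
    # Let the clip finish before the assistant starts listening.
    - wait_until:
        not:
          speaker.is_playing:
    - voice_assistant.start:
```

Because the audio is embedded as raw samples, there is no WAV header for the speaker to play back as noise, which may be what caused the burst of static.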