[Voice Assistant] Have assistant reply on a different media player

I am very excited by the new wake word feature, part of Year of the Voice, and have managed to set up a working demo. However, I find it limiting that the reply comes from the device that hears the command. This means that as I transition from Google Home Minis to Home Assistant, I will have to either buy expensive speakerphones plus Raspberry Pis or build large ESP32 devices with built-in speakers. Most of my rooms already have some sort of media player in them.

Ideally, I would like to have an ESP32 microphone in every room (which could also do other tasks like presence detection) and then stream the voice assistant responses directly to the much superior media players that already exist in each room.

Edit 2: @tetele has shown that this is possible on ESPHome by editing your config (Year of the Voice - Chapter 5 - #25 by tetele). However, it remains not possible/not done on dedicated VA hardware satellites.

That really would be great.

Me too!

I just bought a bunch of M5 Stack AtomS3U to be used as BLE proxies.

They have a mic, button, and LED, but no speaker.

Most rooms I planned to put these in have decent media players already.

+1 I have ESP32s in most rooms along with Sonos speakers.

+1 The ESP32 devices I’m using have crappy speakers. Would like to be able to control where the reply is sent, depending on the device receiving the command.

Edit: I’m sorry about the spam, didn’t realise there was a vote option… shame on me.

Click vote and stop spamming people

Related workaround:

Related workaround for esphome devices:

This thread describes how to achieve it on a home assistant satellite:

I will keep this feature request open as:

  1. This functionality is undocumented
  2. This functionality is not configurable via the UI

Hello, I bought one of the M5Stack AtomS3U boards and was wondering if you had a YAML config that got it working with HA as a voice control unit with wake words? I am brand new to ESP devices and bought this one because it was available from Amazon, but I also bought some boards and parts to build custom units and learn this new skill :slightly_smiling_face: I found an esp32-s3-box-3.yaml, but it doesn’t work because, I believe, it references a different pinout. So if you have a working one, I would love to get a copy. Comparing yours with the online one will also help me learn how YAML configuration works with these devices, as I am learning that too. Thank you in advance.

It’s not just you :roll_eyes:.

I’ve had luck with ESPHome, but the speech detection wants esp_adf, which got me into trouble with the ESP32-S3. I just figured out that esp32/framework/sdkconfig_options needs to be an “S3” option, or else there’s liable to be a build issue.

This is a work in progress; the mic doesn’t seem to do anything. But the light and button are working :stuck_out_tongue:.

# see https://github.com/esphome/firmware/blob/main/voice-assistant/m5stack-atom-echo.yaml
esphome:
  name: m5atom-test
  friendly_name: M5Atom-Test
  min_version: 2023.10.0
  on_boot:
    - priority: -100
      then:
        - wait_until: api.connected
        - delay: 1s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:

esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32_S3_KORVO2_V3_BOARD: y

# Enable logging
logger:
  level: VERY_VERBOSE

# Enable Home Assistant API
api:
  encryption:
    key: "<redacted>"

ota:
  password: "<redacted>"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

i2s_audio:
  i2s_lrclk_pin: GPIO38 # mic_clk https://docs.m5stack.com/en/core/AtomS3U

microphone:
  - platform: i2s_audio
    id: builtin_mic
    adc_type: external
    i2s_din_pin: GPIO39 # mic_data https://docs.m5stack.com/en/core/AtomS3U
    pdm: true

voice_assistant:
  id: va
  microphone: builtin_mic
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  vad_threshold: 3
  on_listening:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: pulse
  on_tts_start:
    - light.turn_on:
        id: led
        blue: 0%
        red: 0%
        green: 100%
        brightness: 100%
        effect: pulse
  on_end:
    - delay: 100ms
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" ||
          code == "wake-engine-missing")
        {
          id(use_wake_word).turn_off();
        }

binary_sensor:
  - platform: gpio
    pin:
      number: GPIO41 # BUTTON SW https://docs.m5stack.com/en/core/AtomS3U
      inverted: true
    name: Button
    disabled_by_default: true
    entity_category: diagnostic
    id: echo_button
    on_click:
      - if:
          condition:
            switch.is_off: use_wake_word
          then:
            - if:
                condition: voice_assistant.is_running
                then:
                  - voice_assistant.stop:
                  - script.execute: reset_led
                else:
                  - voice_assistant.start:
          else:
            - voice_assistant.stop
            - delay: 1s
            - script.execute: reset_led
            - script.wait: reset_led
            - voice_assistant.start_continuous:

light:
  - platform: esp32_rmt_led_strip
    id: led
    name: None
    disabled_by_default: true
    entity_category: config
    pin: GPIO35 # WS2812 DIN https://docs.m5stack.com/en/core/AtomS3U
    default_transition_length: 0s
    chipset: ws2812
    num_leds: 1
    rgb_order: grb
    rmt_channel: 0
    effects:
      - pulse:
          transition_length: 250ms
          update_interval: 250ms

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: none
          else:
            - light.turn_off: led

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);
      - script.execute: reset_led

external_components:
  - source: github://pr#5230
    components:
      - esp_adf
    refresh: 0s

esp_adf:

ESPHome, in general, is super cool. You (probably) won’t regret picking up a few boards, even if you are having a rocky start now.

[edit: My (long neglected) config YAML was wrong in a different way; I changed it back to report the real stumbling block I ran into.]
[edit #2: I replaced the non-compiling YAML and error; that’s not really useful to the community.]

I now use this config on ESPHome to achieve this:

substitutions:
  name: m5stack-atom-echo-b836b0
  friendly_name: M5Stack Atom Echo b836b0
# Pin the version until https://github.com/esphome/firmware/pull/233 is fixed/merged
packages:
  m5stack.atom-echo-voice-assistant: github://esphome/firmware/voice-assistant/m5stack-atom-echo.yaml@cd57ca6f951d44cc5bf61de124a15b349ef1f9a4
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
api:
  encryption:
    key: +someapikey=

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

# HarvsG's customisations
speaker:
  - platform: i2s_audio
    id: !extend echo_speaker
    i2s_dout_pin: GPIO21 # <- It is actually on 22, so this disables the speaker

voice_assistant:
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.study_speaker # <- this is hard-coded
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
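
Since the target media player is hard-coded above, one way to make the same config reusable across rooms might be to move the entity ID into a substitution. This is only a sketch under assumptions: the substitution name `tts_media_player` is made up for illustration, and the rest is copied from the config above. It also assumes the device has been allowed to make Home Assistant service calls in the ESPHome integration options.

```yaml
substitutions:
  name: m5stack-atom-echo-b836b0
  friendly_name: M5Stack Atom Echo b836b0
  # Hypothetical substitution name (not from the original config);
  # set it to the media player in the same room as this device.
  tts_media_player: media_player.study_speaker

voice_assistant:
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          # Replaced by ESPHome at compile time with the value above
          entity_id: ${tts_media_player}
          # x is the URL of the generated TTS audio passed to on_tts_end
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
```

With this, each room's device would only need a different `tts_media_player` value; the reply target is still fixed at compile time, so it does not address the request for a UI-configurable option.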

Here’s a link to three versions I’ve been fiddling with