Far-field satellite with an elegant 3D-printed enclosure

No problem at all. I think I found an odd issue, although it’s not that odd now that I think about it. At one point I kept getting a “no wake word” error message. It was because I had stopped OpenWakeWord. While I have never seen it fall back to OpenWakeWord, the yaml is written so that it does if microWakeWord doesn’t work. Since I had stopped OpenWakeWord, THAT is where the error came from. It would be nice if someone could verify this, though. All you have to do is stop OpenWakeWord, then open the logs in ESPHome on the Korvo over wifi and try to use it. I just want a sanity check, but as soon as I started OpenWakeWord again it started working, so it appears to be checking both.

I will be printing another case, especially since I missed the mark on the first hole for the 3.5mm jack, so there are now 2 holes right next to each other. I might not have to scale it up, but at a minimum a right-angle cable is going to be needed to route it under or around the board so it comes out where the micro USB power is, and I simply don’t have a 3.5mm to 3.5mm cable with a right-angle plug. I don’t have any powered speakers (mine are passive and need an amp), so I ordered a 5 dollar amp off Ali and it works surprisingly well. It obviously doesn’t get super loud, but it has BT, so music on it sounded better than expected and it is more than enough for a voice assistant.

Also, I probably spent way too much time playing around with the new sentence/response automations. If you haven’t heard about them: create a new automation, choose Sentence as the trigger type and type whatever you want to say, then for the action search for “response” or “reply” and type in what you want it to say back. It takes 2 minutes if you know what the sentence and response are going to be. They also added a kid’s voice as one of the options (AnaNeural) and it sounds super creepy. I’m going to have some fun freaking friends out with that because there are no filters. The kid’s voice just sounds like a creepy young girl to me. It reminds me of those twin girls in The Shining for some reason, which is probably why it sounds so creepy to me.
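For anyone who prefers YAML over the UI, a sentence/response automation might look roughly like the sketch below. The alias, the sentence, and the reply are made-up placeholders, and it assumes a Home Assistant version that has the sentence (conversation) trigger and the set_conversation_response action. Newer releases may show slightly different key names when you view the automation as YAML in the editor.

```yaml
# Sketch only: alias, sentence, and reply text are placeholders, not from this thread.
- alias: "Custom sentence with spoken reply"
  trigger:
    # Fires when Assist hears (or is sent) exactly this sentence
    - platform: conversation
      command:
        - "is the front door haunted"
  action:
    # Text that the voice assistant speaks back
    - set_conversation_response: "Only on Tuesdays."
```

This would go in automations.yaml, or you can paste it into the YAML view of a new automation.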

Weird. With this new yaml I’m getting a “not enough space” error when installing. I wonder if the Alexa microWakeWord model is bigger than the hey_jarvis one? I’ll try that and report back.

Same with the hey_jarvis wake word. Is there a way to determine exactly which Korvo I have? On the back of it, it says ESP32-Korvo v1.1, so I presume that’s the one you have? I’m unsure why it’s stating I don’t have space. Are you using an SD card?

I had that issue. You are going to have to hook it up to a computer. I created a bin file for backup and it’s only 2MB. It seems like ESPHome creates partitions based on the bin file size, and when I originally installed it there was basically nothing in the config. I also have the board set as esp32s3box, so that is probably another reason. You need to change any references to the S3 board as well.

The config also points to another repository, so the parts below will need to be changed. I don’t think you have to specify a board for esp_adf, but I saw some configs that did. That, and the board under esp32 is esp32s3box. Under components it needs to point to the V1.1 repository; the one it currently points to is specifically for the S3. You do have the V1.1 version and not the korvo-1 (S3) version, correct? Before I added the flash size part below I got the same message, and once I added it, it wouldn’t install over wifi because the ROM had to be formatted. Also comment out or remove the line below; it is specific to the repo I am pointing to.

I can take a look at the 1.1 yaml and fix it but not until later tonight and I can’t test it if I do.

CONFIG_ESP32_S3_KORVO1_BOARD


The board and component source need to be changed:

esp32:
  board: esp32s3box
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
      CONFIG_ESP32_S3_KORVO1_BOARD: "y"
    components:
      - name: esp32_s3_korvo1_board
        #source: github://espressif/components/hardware_driver@main
        source: github://abmantis/esphome_custom_audio_boards@main
        refresh: 0s

Then the esp_adf part:

esp_adf:
  board: esp32s3korvo1

No SD card, and the V1.1 has 16MB of flash and 8MB of PSRAM. On the bottom of mine it says korvo-1 V5. They both use the same mic/LED top board, which says Korvo V1.1, but if the bottom board also says Korvo v1.1 then it’s not the korvo-1. Both work, though, because that particular chip has PSRAM, which seems to be the main requirement.

Are you doing it over wifi or connected to a computer? I have to go, but I can see where that original code was causing issues: you’re pointing to different repos for some hardware/codecs, so that is what needs to be fixed. They have the voice pipeline starting in the reset_led script, which is executed four or five times, if not more, in the latest yaml above for the V1.1.

Yeah, my firmware bin is only 2MB as well. I tried flashing it both over wifi and connected to my machine; both failed. I’m certain this is not the S3 variant of the Korvo. I’m not sure I’m following you with regard to what I need to change. This stuff is a bit over my head. What would I change those bolded portions to? In my previous yaml I have board: esp-wrover-kit and a bunch of external_components, but nothing under esp32.components.

This is what I’m trying now, and I get a compilation error about pins in use:

substitutions:
  name: alexa-living-room
  friendly_name: Alexa_Living_Room

  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"

esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  project:
    name: esphome.voice-assistant
    version: "1.0"
  min_version: 2023.11.5
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            red: 0%
            blue: 0%
            green: 100%
            brightness: 100%
            effect: random
        - if:
            condition:
              lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;

esp32:
  board: esp-wrover-kit #GUESSING HERE BASED ON PREVIOUS SETTINGS
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
      CONFIG_ESP32_S3_KORVO1_BOARD: "y"
    components:
      - name: esp32_korvo_v1_1_board #GUESSING HERE
        source: github://espressif/components/hardware_driver@main #SWAPPED THIS OUT
        #source: github://abmantis/esphome_custom_audio_boards@main
        refresh: 0s

psram:
  mode: octal
  speed: 80MHz

external_components:
  - source: github://pr#5230
    components: esp_adf

ota:
  password: "REDACTED"

logger:
api:
  encryption:
     key: "REDACTED"
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - delay: 20ms
            - ble.disable:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 0%
          green: 100%
          brightness: 50%
          effect: connecting              
  on_client_disconnected:
    then:
      - ble.enable
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting

# dashboard_import:
#   package_import_url: github://esphome/firmware/voice-assistant/esp32-s3-korvo1.yaml@main

wifi:
  ssid: !secret wifi_ssid 
  password: !secret wifi_password
  ap:
  on_connect:
    then:
      - delay: 20ms # Gives time for improv results to be transmitted
      - ble.disable:
      - delay: 5s
  on_disconnect:
    then:
      - ble.enable:

improv_serial:

esp32_improv:
  authorizer: none

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

esp_adf:
  # board: esp32s3korvo1 #COMMENTED OUT

microphone:
  - platform: esp_adf
    id: korvo_mic

speaker:
  - platform: esp_adf
    id: korvo_speaker

micro_wake_word:
  model: alexa 
# model: okay_nabu
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
    - light.turn_on:
        id: led_ring      
        red: 30%
        green: 30%
        blue: 70%
        brightness: 100%
        effect: wakeword
  

voice_assistant:
  id: voice_asst
  microphone: korvo_mic
  speaker: korvo_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 3.0
  vad_threshold: 3
  on_listening:     
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};    
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: pulse
  #  - script.execute: reset_led    
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 50%
        green: 50%
        brightness: 100%        
        effect: working
  #  - script.execute: reset_led    
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 50%
        red: 50%
        green: 0%
        brightness: 100%        
        effect: pulse
    #- script.execute: reset_led    
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - delay: 300ms
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 20%        
        effect: connecting
   # - script.execute: reset_led    
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
                - script.execute: reset_led
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  on_client_connected:
    - if:
        condition:
          switch.is_off: mute
        then:
          - wait_until:
              not: ble.enabled
          - voice_assistant.start_continuous:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - script.execute: reset_led
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false;
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    # - lambda: |-
    #     if (code == "wake-provider-missing" || code == "wake-engine-missing") {
    #       id(use_wake_word).turn_off();
    #     }

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: connecting
          else:
            - light.turn_off: led_ring

switch:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO38
    
    name: "${friendly_name} Speaker Mute"
    restore_mode: ALWAYS_ON

  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_asst).set_use_wake_word(true);
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - lambda: id(voice_asst).set_use_wake_word(false);
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};

  - platform: restart
    name: "${friendly_name} Restart"

  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      #- script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led

globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    is_rgbw: true
    rgb_order: GRB    
    pin: GPIO19
    num_leds: 12
    rmt_channel: 0
    chipset: WS2812
    name: "${friendly_name} Light"
    default_transition_length: 1s
    effects:
      - addressable_scan:
          name: "led_12"
          move_interval: 10ms
          scan_width: 12
      - pulse:
          name: "pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "wakeword"
          colors:
            - red: 0%
              green: 0%
              blue: 100%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "connecting"
          colors:
            - red: 40%
              green: 30%
              blue: 30%
              num_leds: 12
          add_led_interval: 50ms
          reverse: true

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
    on_multi_click:
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn    
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    on_press:
      - voice_assistant.start_continuous:
  - platform: template
    name: "${friendly_name} Record" 
    id: btn_record
    on_press:
      #- voice_assistant.start_continuous:
      - lambda: id(voice_asst).set_use_wake_word(true);
          
sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 1.01
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.3
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF

I tried commenting out just the esp_adf.board line, with otherwise the same configuration as yours above, and it installed but is throwing a ton of errors:

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
invalid header: 0x41000a6d
invalid header: 0x41000a6d
invalid header: 0x41000a6d
invalid header: 0x41000a6d
invalid header: 0x41000a6d
invalid header: 0x41000a6d
invalid header: 0x41000a6d
ets Jul 29 2019 12:21:46

I’m trying out a few other configurations, but I honestly have no idea what I’m doing. lol.

Yeah, I was lucky in that the repo I was pointing to already had the GPIO pins defined for everything. This looks more like it’s pointing to repos for codecs. Here, give this a try; I think it will work, but I can’t test it. Change name and friendly name to whatever you want, add the API key if using encryption, and make sure the wifi secret format is the same. I commented out the line where I set a static IP for mine, so if you’re using DHCP leave it commented out. If using a static IP, uncomment the “use_address” line and set the IP. Please note this is a mixture of what I found on another forum post for the korvo-1 and the link below for the korvo v1.1…

substitutions:
  name: esp32-korvo-1
  friendly_name: esp32-korvo-1
  
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
  
  micro_wake_word_model: okay_nabu
 
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  project:
    name: esphome.voice-assistant
    version: "1.0"
  min_version: 2023.11.5
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            red: 0%
            blue: 0%
            green: 100%
            brightness: 100%
            effect: random
        - if:
            condition:
              lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;
              
    sdkconfig_options:
      CONFIG_IDF_TARGET_ESP32: y
      CONFIG_ESPTOOLPY_FLASHMODE_QIO: y
      CONFIG_ESPTOOLPY_FLASHFREQ_80M: y
      CONFIG_ESPTOOLPY_FLASHSIZE_16MB: y
      CONFIG_PARTITION_TABLE_CUSTOM: y
      CONFIG_PARTITION_TABLE_CUSTOM_FILENAME: "default_16MB.csv" #"partitions_esp32.csv"
      CONFIG_PARTITION_TABLE_FILENAME: "default_16MB.csv" #"partitions_esp32.csv"
      CONFIG_PARTITION_TABLE_OFFSET: "0x8000"
      CONFIG_ESP32_DEFAULT_CPU_FREQ_240: y
      CONFIG_ESP32_SPIRAM_SUPPORT: y
      CONFIG_SPIRAM_SPEED_80M: y
      CONFIG_ESP_SYSTEM_PANIC_SILENT_REBOOT: y
      CONFIG_I2S_ENABLE_DEBUG_LOG: y
#psram:
#  mode: octal
#  speed: 80MHz
external_components:
  - source: github://rpatel3001/esphome@es8311
    components: [ es8311 ]
  - source: github://rpatel3001/esphome@es7210
    components: [ es7210 ]
  - source: github://pr#5230
    components:
      - esp_adf     

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: 
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - delay: 20ms
            - ble.disable:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 0%
          green: 100%
          brightness: 50%
          effect: connecting              
  on_client_disconnected:
    then:
      - ble.enable
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting    

wifi:
  ssid: !secret wifi_ssid 
  password: !secret wifi_password
  #use_address: 192.168.x.x
  on_connect:
    then:
      - delay: 20ms # Gives time for improv results to be transmitted
      - ble.disable:
      - delay: 5s
  on_disconnect:
    then:
      - ble.enable:

improv_serial:

esp32_improv:
  authorizer: none

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

i2c:
  sda: GPIO19 #GPIO1
  scl: GPIO32 #GPIO2
  scan: true
  frequency: 400kHz

es8311:
  address: 0x18

es7210:
  address: 0x40

output:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO12 #GPIO38

i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22 #GPIO41 #ws
    i2s_bclk_pin: GPIO25 #GPIO40 #clk
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true
  - id: mic_adc
    i2s_lrclk_pin: GPIO26 #GPIO9 #ws
    i2s_bclk_pin: GPIO27 #GPIO10 #clk
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13 #GPIO39
    mode: mono

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36 #GPIO11
    pdm: false
    
micro_wake_word:
  model: hey_jarvis 
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
    - light.turn_on:
        id: led_ring      
        red: 30%
        green: 30%
        blue: 70%
        brightness: 100%
        effect: wakeword    
    
  voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 3.0
 # use_wake_word: false  
 on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  #  - script.execute: reset_led    
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: working
    #- script.execute: reset_led    
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - delay: 300ms
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 20%        
        effect: pulse
    - script.execute: reset_led    
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
                - script.execute: reset_led
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  on_client_connected:
    - if:
        condition:
          switch.is_off: mute
        then:
          - wait_until:
              not: ble.enabled
          - voice_assistant.start_continuous:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - script.execute: reset_led
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false;
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};;

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_off: mute
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: connecting
          else:
            - light.turn_off: led_ring   

  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_asst).set_use_wake_word(true);
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - lambda: id(voice_asst).set_use_wake_word(false);
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            
  - platform: restart
    name: "korvo restart"            
            
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led

globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}


light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO33 #GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 40ms
          reverse: false

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    on_press:
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          brightness: 100%
          effect: "Wakeword"
#    on_release:
#      - voice_assistant.stop:
#      - light.turn_off:
#          id: led_ring

sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 39 #8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF
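For the static IP case mentioned before the config, the wifi block might end up looking something like the sketch below. All addresses are placeholders for your own network. As far as I understand it, use_address only tells ESPHome where to reach the device for logs and OTA, while manual_ip is the part that makes the device itself use a fixed address, so you may only need one of the two depending on how your router hands out addresses.

```yaml
# Sketch only: every address below is a placeholder, adjust to your network.
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Address ESPHome uses to connect to the device (logs, OTA uploads)
  use_address: 192.168.1.50
  # Optional: have the device configure a fixed address instead of using DHCP
  manual_ip:
    static_ip: 192.168.1.50
    gateway: 192.168.1.1
    subnet: 255.255.255.0
```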

I think that is the original config that I was using and had the issues with. Or did you modify it using what you’ve learned from the original GitHub issue you linked?

Yeah, it doesn’t seem like this is valid yaml. I’ll look closer tomorrow. Gummy kickin in. :rofl:

OK, this compiles and installs, but it’s not responding to any wake words. I had to do a bit of formatting and added a couple of pointers for lighting effects. It looks like I get as far as the rainbow effect spinning around the LEDs. It might also be the random effect.

Edit: OK, it is showing up in HA and responds to changing settings in HA. I added the select section to toggle where the wake word detection happens, and if I toggle back and forth so that it’s on device, it responds to my wake word but doesn’t kick off the action. Maybe that’ll help get things figured out. I also changed the project version to 1.0 and min_version to 2023.11.5 like you had in your yaml above. I’m unsure if that has anything to do with it, but the original from the GitHub issue had what is shown below (2.0 and 2023.12.8).

Edit 2: Looks like there was a missing lambda in the wake word section. I updated the config below.

substitutions:
  name: alexa-living-room
  friendly_name: "Alexa Living Room"
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
  micro_wake_word_model: alexa
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  project:
    name: esphome.voice-assistant
    version: "2.0"
  min_version: 2023.12.8
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            red: 0%
            blue: 0%
            green: 100%
            brightness: 100%
            effect: random
        - if:
            condition:
              lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;
esp32:
  board: esp-wrover-kit
  flash_size: 16MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_IDF_TARGET_ESP32: y
      CONFIG_ESPTOOLPY_FLASHMODE_QIO: y
      CONFIG_ESPTOOLPY_FLASHFREQ_80M: y
      CONFIG_ESPTOOLPY_FLASHSIZE_16MB: y
      CONFIG_PARTITION_TABLE_CUSTOM: y
      CONFIG_PARTITION_TABLE_CUSTOM_FILENAME: default_16MB.csv
      CONFIG_PARTITION_TABLE_FILENAME: default_16MB.csv
      CONFIG_PARTITION_TABLE_OFFSET: "0x8000"
      CONFIG_ESP32_DEFAULT_CPU_FREQ_240: y
      CONFIG_ESP32_SPIRAM_SUPPORT: y
      CONFIG_SPIRAM_SPEED_80M: y
      CONFIG_ESP_SYSTEM_PANIC_SILENT_REBOOT: y
      CONFIG_I2S_ENABLE_DEBUG_LOG: y
psram:
  mode: octal
  speed: 80MHz
external_components:
  - source: github://rpatel3001/esphome@es8311
    components:
      - es8311
  - source: github://rpatel3001/esphome@es7210
    components:
      - es7210
  - source: github://pr#5230
    components:
      - esp_adf
logger:

ota:
  password: 425e052cb19d6228eb28eefe88323296

api:
  encryption:
    key: jdmqLzc1DtEZxDFibBWFqx3XNy+alBqo5W2la4mutHA=
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - delay: 20ms
            - ble.disable:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 0%
          green: 100%
          brightness: 50%
          effect: connecting
  on_client_disconnected:
    then:
      - ble.enable
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  use_address: 192.168.1.180
  on_connect:
    then:
      - delay: 20ms
      - ble.disable:
      - delay: 5s
  on_disconnect:
    then:
      - ble.enable:

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "${name} Hotspot"
    password: "3U5roBZH^Gnwh7"

improv_serial:
esp32_improv:
  authorizer: none
button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset
i2c:
  sda: GPIO19
  scl: GPIO32
  scan: true
  frequency: 400kHz
es8311:
  address: 24
es7210:
  address: 64
output:
  - platform: gpio
    id: pa_ctrl
    pin:
      number: GPIO12
      ignore_strapping_warning: true
i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22
    i2s_bclk_pin: GPIO25
    i2s_mclk_pin:
      number: GPIO0
      allow_other_uses: true
      ignore_strapping_warning: true
  - id: mic_adc
    i2s_lrclk_pin: GPIO26
    i2s_bclk_pin: GPIO27
    i2s_mclk_pin:
      number: GPIO0
      allow_other_uses: true
      ignore_strapping_warning: true
speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13
    mode: mono
microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36
    pdm: false
micro_wake_word:
  model: ${micro_wake_word_model}
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word; #Was missing !lambda
    - light.turn_on:
        id: led_ring
        red: 30%
        green: 30%
        blue: 70%
        brightness: 100%
        effect: wakeword
voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 3
  on_listening:     
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};    
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: pulse
  #  - script.execute: reset_led
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: working
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - delay: 300ms
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 20%
        effect: pulse
    - script.execute: reset_led
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
                - script.execute: reset_led
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  on_client_connected:
    - if:
        condition:
          switch.is_off: mute
        then:
          - wait_until:
              not: ble.enabled
          - voice_assistant.start_continuous:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - script.execute: reset_led
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - lambda: id(init_in_progress) = false;
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_off: mute
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: connecting
          else:
            - light.turn_off: led_ring

switch:
  # - platform: gpio
  #   id: pa_ctrl
  #   pin: GPIO38
    
  #   name: "${friendly_name} Speaker Mute"
  #   restore_mode: ALWAYS_ON

  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_asst).set_use_wake_word(true);
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - lambda: id(voice_asst).set_use_wake_word(false);
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  - platform: restart
    name: korvo restart
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led
select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - wait_until:
          lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
      - if:
          condition:
            lambda: return x == "In Home Assistant";
          then:
            - micro_wake_word.stop
            - delay: 500ms
            - if:
                condition:
                  switch.is_off: mute
                then:
                  - lambda: id(voice_asst).set_use_wake_word(true);
                  - voice_assistant.start_continuous:
      - if:
          condition:
            lambda: return x == "On device";
          then:
            - lambda: id(voice_asst).set_use_wake_word(false);
            - voice_assistant.stop
            - delay: 500ms
            - micro_wake_word.start   
globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}
light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: ${friendly_name} Light
    pin: GPIO33
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - addressable_rainbow:
          name: connecting
          speed: 10
          width: 50
      - random:
          name: random
          transition_length: 5s
          update_interval: 7s
      - pulse:
          name: pulse
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: working
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: wakeword
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 40ms
          reverse: false
binary_sensor:
  - platform: template
    name: ${friendly_name} Volume Up
    id: btn_volume_up
  - platform: template
    name: ${friendly_name} Volume Down
    id: btn_volume_down
  - platform: template
    name: ${friendly_name} Set
    id: btn_set
  - platform: template
    name: ${friendly_name} Play
    id: btn_play
  - platform: template
    name: ${friendly_name} Mode
    id: btn_mode
  - platform: template
    name: ${friendly_name} Record
    id: btn_record
    on_press:
      - voice_assistant.start: null
      - light.turn_on:
          id: led_ring
          brightness: 100%
          effect: wakeword
sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 39
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF
    
[08:53:36][D][light:036]: 'Alexa Living Room Light' Setting:
[08:53:36][D][light:051]:   Brightness: 50%
[08:53:36][D][light:059]:   Red: 0%, Green: 100%, Blue: 0%
[08:53:36][C][logger:166]: Logger:
[08:53:36][C][logger:167]:   Level: DEBUG
[08:53:36][C][logger:169]:   Log Baud Rate: 115200
[08:53:36][C][logger:170]:   Hardware UART: UART0
[08:53:36][C][i2c.idf:061]: I2C Bus:
[08:53:36][C][i2c.idf:062]:   SDA Pin: GPIO19
[08:53:36][C][i2c.idf:063]:   SCL Pin: GPIO32
[08:53:36][C][i2c.idf:064]:   Frequency: 400000 Hz
[08:53:36][C][i2c.idf:067]:   Recovery: bus successfully recovered
[08:53:36][I][i2c.idf:077]: Results from i2c bus scan:
[08:53:36][I][i2c.idf:083]: Found i2c device at address 0x18
[08:53:36][I][i2c.idf:083]: Found i2c device at address 0x40
[08:53:36][C][gpio.output:010]: GPIO Binary Output:
[08:53:36][C][gpio.output:011]:   Pin: GPIO12
[08:53:36][C][template.select:065]: Template Select 'Wake word engine location'
[08:53:36][C][template.select:066]:   Update Interval: 60.0s
[08:53:36][C][template.select:069]:   Optimistic: YES
[08:53:36][C][template.select:070]:   Initial Option: On device
[08:53:36][C][template.select:071]:   Restore Value: YES
[08:53:36][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[08:53:36][C][esp32_rmt_led_strip:176]:   Pin: 33
[08:53:36][C][esp32_rmt_led_strip:177]:   Channel: 0
[08:53:36][C][esp32_rmt_led_strip:202]:   RGB Order: GRB
[08:53:36][C][esp32_rmt_led_strip:203]:   Max refresh rate: 0
[08:53:36][C][esp32_rmt_led_strip:204]:   Number of LEDs: 12
[08:53:36][C][template.binary_sensor:028]: Template Binary Sensor 'Alexa Living Room Volume Up'
[08:53:36][C][template.binary_sensor:028]: Template Binary Sensor 'Alexa Living Room Volume Down'
[08:53:36][C][template.binary_sensor:028]: Template Binary Sensor 'Alexa Living Room Set'
[08:53:36][C][template.binary_sensor:028]: Template Binary Sensor 'Alexa Living Room Play'
[08:53:36][C][template.binary_sensor:028]: Template Binary Sensor 'Alexa Living Room Mode'
[08:53:36][C][template.binary_sensor:028]: Template Binary Sensor 'Alexa Living Room Record'
[08:53:36][C][light:103]: Light 'Alexa Living Room Light'
[08:53:36][C][light:105]:   Default Transition Length: 0.0s
[08:53:36][C][light:106]:   Gamma Correct: 2.80
[08:53:36][C][template.switch:068]: Template Switch 'Use wake word'
[08:53:36][C][template.switch:091]:   Restore Mode: restore defaults to ON
[08:53:36][C][template.switch:057]:   Optimistic: YES
[08:53:36][C][template.switch:068]: Template Switch 'Mute'
[08:53:36][C][template.switch:091]:   Restore Mode: restore defaults to OFF
[08:53:36][C][template.switch:057]:   Optimistic: YES
[08:53:36][C][psram:020]: PSRAM:
[08:53:36][C][psram:021]:   Available: YES
[08:53:36][C][psram:024]:   Size: 4095 KB
[08:53:36][C][factory_reset.button:011]: Factory Reset Button 'Factory reset'
[08:53:36][C][factory_reset.button:011]:   Icon: 'mdi:restart-alert'
[08:53:37][C][restart:068]: Restart Switch 'korvo restart'
[08:53:37][C][restart:070]:   Icon: 'mdi:restart'
[08:53:37][C][restart:091]:   Restore Mode: always OFF
[08:53:37][C][adc:097]: ADC Sensor 'button_adc'
[08:53:37][C][adc:097]:   Device Class: 'voltage'
[08:53:37][C][adc:097]:   State Class: 'measurement'
[08:53:37][C][adc:097]:   Unit of Measurement: 'V'
[08:53:37][C][adc:097]:   Accuracy Decimals: 2
[08:53:37][C][adc:107]:   Pin: GPIO39
[08:53:37][C][adc:122]:  Attenuation: 11db
[08:53:37][C][adc:142]:   Update Interval: 0.015s
[08:53:37][C][esp32_ble:379]: ESP32 BLE: bluetooth stack is not enabled
[08:53:37][C][esp32_ble_server:200]: ESP32 BLE Server:
[08:53:37][C][esp32_improv.component:261]: ESP32 Improv:
[08:53:37][C][esp32_improv.component:266]:   Status Indicator: 'NO'
[08:53:37][C][mdns:115]: mDNS:
[08:53:37][C][mdns:116]:   Hostname: alexa-living-room-9c8afc
[08:53:37][C][ota:096]: Over-The-Air Updates:
[08:53:37][C][ota:097]:   Address: 192.168.1.180:3232
[08:53:37][C][ota:100]:   Using Password.
[08:53:37][C][ota:103]:   OTA version: 2.
[08:53:37][C][api:139]: API Server:
[08:53:37][C][api:140]:   Address: 192.168.1.180:6053
[08:53:37][C][api:142]:   Using noise encryption: YES
[08:53:37][C][improv_serial:032]: Improv Serial:
[08:53:37][C][micro_wake_word:057]: microWakeWord:
[08:53:37][C][micro_wake_word:058]:   Wake Word: alexa
[08:53:37][C][micro_wake_word:059]:   Probability cutoff: 0.660
[08:53:37][C][micro_wake_word:060]:   Sliding window size: 10
[08:53:37][C][es8311:167]: ES8311 Audio Codec:
[08:53:37][C][es8311:168]:   Use MCLK: YES
[08:53:37][C][es7210:035]: ES7210 Audio Codec:
[08:53:51][D][micro_wake_word:362]: Wake word sliding average probability is 0.700 and most recent probability is 1.000
[08:53:51][D][micro_wake_word:128]: Wake Word Detected
[08:53:51][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[08:53:51][D][micro_wake_word:134]: Stopping Microphone
[08:53:51][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[08:53:51][D][esp-idf:000]: I (30406) I2S: DMA queue destroyed

[08:53:51][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE
[08:53:51][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[08:53:51][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[08:53:51][D][light:036]: 'Alexa Living Room Light' Setting:
[08:53:51][D][light:051]:   Brightness: 100%
[08:53:51][D][light:059]:   Red: 43%, Green: 43%, Blue: 100%
[08:53:51][D][light:109]:   Effect: 'Wakeword'
[08:53:51][D][voice_assistant:118]: microphone not running
[08:53:51][D][voice_assistant:202]: Requesting start...
[08:53:51][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[08:53:51][D][voice_assistant:437]: Client started, streaming microphone
[08:53:51][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE
[08:53:51][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[08:53:51][D][voice_assistant:155]: Starting Microphone
[08:53:51][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[08:53:51][D][esp-idf:000]: I (30509) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[08:53:51][D][esp-idf:000]: I (30518) I2S: I2S1, MCLK output by GPIO0

[08:53:51][D][voice_assistant:523]: Event Type: 1
[08:53:51][D][voice_assistant:526]: Assist Pipeline running
[08:53:51][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[08:53:51][D][voice_assistant:523]: Event Type: 3
[08:53:51][D][voice_assistant:537]: STT started
[08:53:51][D][esp-idf:000]: I (30560) wifi:
[08:53:51][D][esp-idf:000]: <ba-add>idx:1 (ifx:0, 46:ed:00:12:08:0e), tid:7, ssn:0, winSize:64
[08:53:51][D][esp-idf:000]: 

[08:54:16][D][esp32.preferences:114]: Saving 1 preferences to flash...
[08:54:16][D][esp32.preferences:143]: Saving 1 preferences to flash: 1 cached, 0 written, 0 failed

Hmmm. I know what went wrong but not sure how it happened. The i2c addresses should have a 0x in front of the number, and i2c has to have an address for each device. I don’t believe any other interface needs that, but that’s the first error in the logs. Because of the missing 0x in front of the address it doesn’t see either the speaker or the microphone. Add that and it should run.

You should also comment out the PSRAM part, or else comment out 2 of the sdkconfig options at the top. The third and fourth ones from the bottom of the list define that it has PSRAM and what speed it runs at. I just know it works that way, which is why I combined 2 of the configs from that link. One clearly defined the partition size so there wouldn’t be any issues with it saying there wasn’t enough space on the ROM.
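A minimal sketch of the two options, assuming the same esp-wrover-kit board and esp-idf framework as the config above. The other sdkconfig options are left out for brevity, and the idea is to pick one approach, not both.

```yaml
esp32:
  board: esp-wrover-kit
  framework:
    type: esp-idf
    sdkconfig_options:
      # These two (fourth and third from the bottom of the original list)
      # already declare that PSRAM is present and what speed it runs at.
      CONFIG_ESP32_SPIRAM_SUPPORT: y
      CONFIG_SPIRAM_SPEED_80M: y

# Option 1: keep the two sdkconfig lines above and comment out the component.
#psram:
#  mode: octal
#  speed: 80MHz

# Option 2 (alternative): remove the two sdkconfig lines above and keep the
# psram: block uncommented instead.
```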

It also has that microWakeWord substitution. I thought I had commented it out, but obviously I didn’t. You had mentioned wanting to use okay_jarvis and I hadn’t tried using that substitution, although it should work either way. So really all you need to do is put a 0x in front of the two addresses. The below would work using OpenWakeWord but shows what I mean.

I’m just confused because I copied and pasted from Notepad++ to here, so I don’t know why it was removed… Probably something I did. That, and comment out or remove that substitution, but it should work with zero issues the way you have it above. The only issue now is the address, like in the error. I’m pretty sure the one difference between the “mode” options for PSRAM is that all chips with 8MB use octal while versions with 2MB use a different kind, can’t remember the name.

You have just 40 and 18 in your config. It should be 0x18 and 0x40.
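A sketch of just the i2c and codec sections, using the pins from the config above and the two addresses the bus scan reported in the logs. The es8311 and es7210 keys come from the external components in that config, so their exact schema is an assumption here.

```yaml
i2c:
  sda: GPIO19
  scl: GPIO32
  scan: true          # the scan in the logs found devices at 0x18 and 0x40
  frequency: 400kHz

es8311:
  address: 0x18       # hex; numerically the same as decimal 24

es7210:
  address: 0x40       # hex; numerically the same as decimal 64
```

Since 0x18 is 24 and 0x40 is 64 in decimal, a tool that rewrites integers could plausibly turn the hex form into the decimal values seen in the posted config. Whether that is what happened here is a guess.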

EDIT: They have the correct addresses in what I originally posted. Not sure if they were manually removed. If not, and you’re using something like Notepad on Windows, then download something like Notepad++, because Notepad and code don’t work well together. The encoding can get messed up, among other things. You can also download plugins like compareDiff that compare 2 files and show every difference between them.

The logs show it has PSRAM but is only seeing 4MB. I’m betting it will see all 8MB if you comment it out. Maybe not, because the odd thing is they don’t make an S3 model with 4MB of PSRAM; it’s either 2MB or 8MB. The chip you have has 8MB. There are 16MB and 32MB versions on paper, but only for people buying in bulk.

CONFIG_IDF_TARGET_ESP32: y
CONFIG_ESP32_SPIRAM_SUPPORT: y
CONFIG_SPIRAM_SPEED_80M: y

[08:53:36][I][i2c.idf:083]: Found i2c device at address 0x18
[08:53:36][I][i2c.idf:083]: Found i2c device at address 0x40

I ran the yaml through a linter and I’m wondering if the linter removed the 0x stuff or something? Thanks for pointing those out. I’ll see how it goes and report back.

Edit: Making some progress. Now I gotta debug this:

[09:08:30][C][api:142]:   Using noise encryption: YES
[09:08:30][C][improv_serial:032]: Improv Serial:
[09:08:30][C][micro_wake_word:057]: microWakeWord:
[09:08:30][C][micro_wake_word:058]:   Wake Word: alexa
[09:08:30][C][micro_wake_word:059]:   Probability cutoff: 0.660
[09:08:30][C][micro_wake_word:060]:   Sliding window size: 10
[09:08:30][C][es8311:167]: ES8311 Audio Codec:
[09:08:30][C][es8311:168]:   Use MCLK: YES
[09:08:30][C][es8311:170]:   Failed to initialize!
[09:08:30][C][es7210:035]: ES7210 Audio Codec:
[09:08:30][C][es7210:037]:   Failed to initialize!

Those are both tied to the addresses missing the 0x stuff, but they also point to 2 components in the external_components section. The other difference here is that I changed the board to esp-wrover-kit from the s3box one you are using. Should I try arduino or some other setting there?

Edit 2: For some reason I had different addresses. Went back to 0x18 and 0x40 from 24 and 64 and I don’t see the error anymore.

Edit 3: Ok, this is ridiculous. I had my HA docker container using a bridged network and flipped it to host, and now things are working as far as connecting and executing commands. Below is the config. I still haven’t gotten around to why it’s not starting properly without a reset press, but this is progress.
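On the Docker change, before the ESPHome config itself: a minimal Compose sketch of a Home Assistant container on host networking. The service name, container name, volume path and image tag are assumptions for illustration, not taken from the setup described here.

```yaml
# docker-compose.yml (illustrative; names and paths are assumptions)
services:
  homeassistant:
    container_name: homeassistant
    image: ghcr.io/home-assistant/home-assistant:stable
    volumes:
      - ./config:/config
    restart: unless-stopped
    # Host networking: the container shares the host's network stack,
    # so no "ports:" mappings are used with this mode.
    network_mode: host
```

A likely reason this helps is that the satellite sends its audio to a port that was never published in bridge mode, but that explanation is an assumption and was not confirmed in this thread.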

substitutions:
  name: kitchen-alexa
  friendly_name: Kitchen Alexa
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
  micro_wake_word_model: alexa
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  min_version: 2023.12.8
  platformio_options:
    board_build.flash_mode: dio
  project:
    name: esphome.voice-assistant
    version: "2.0"
  on_boot:
    - priority: -100
      then:
        - light.turn_on:
            id: led_ring
            blue: 0%
            red: 100%
            green: 0%
            effect: Fast Pulse
        - delay: 1s
        - wait_until:
            condition:
              wifi.connected:
        - light.turn_on:
            id: led_ring
            blue: 0%
            red: 100%
            green: 50%
            effect: Slow Pulse
        - wait_until: 
            condition:
              api.connected
        - lambda: id(init_in_progress) = false;
        - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        - script.execute: reset_led
esp32:
  board: esp-wrover-kit
  flash_size: 16MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_IDF_TARGET_ESP32: y
      CONFIG_ESPTOOLPY_FLASHMODE_QIO: y
      CONFIG_ESPTOOLPY_FLASHFREQ_80M: y
      CONFIG_ESPTOOLPY_FLASHSIZE_16MB: y
      CONFIG_PARTITION_TABLE_CUSTOM: y
      CONFIG_PARTITION_TABLE_CUSTOM_FILENAME: "default_16MB.csv" #"partitions_esp32.csv"
      CONFIG_PARTITION_TABLE_FILENAME: "default_16MB.csv" #"partitions_esp32.csv"
      CONFIG_PARTITION_TABLE_OFFSET: "0x8000"
      CONFIG_ESP32_DEFAULT_CPU_FREQ_240: y
      CONFIG_ESP32_SPIRAM_SUPPORT: y
      CONFIG_SPIRAM_SPEED_80M: y
      CONFIG_ESP_SYSTEM_PANIC_SILENT_REBOOT: y
      CONFIG_I2S_ENABLE_DEBUG_LOG: y
#psram:
#  mode: octal
#  speed: 80MHz
external_components:
  - source: github://rpatel3001/esphome@es8311
    components: [ es8311 ]
  - source: github://rpatel3001/esphome@es7210
    components: [ es7210 ]
  - source: github://pr#5230
    components:
      - esp_adf

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "[redacted]"

ota:
  password: "[redacted]"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  use_address: 192.168.1.180
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Korvo-1 Fallback Hotspot"
    password: "[redacted]"

captive_portal:

i2c:
  - id: bus
    sda: GPIO19
    scl: GPIO32
    scan: true
    frequency: 400kHz

es8311:
  address: 0x18

es7210:
  address: 0x40

output:
  - platform: gpio
    id: pa_ctrl
    pin:
      number: GPIO12
      ignore_strapping_warning: true
i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22 
    i2s_bclk_pin: GPIO25 
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true
  - id: mic_adc
    i2s_lrclk_pin: GPIO26 
    i2s_bclk_pin: GPIO27 
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true

esp_adf:

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13
    mode: mono

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36
    pdm: false


micro_wake_word:
  model: ${micro_wake_word_model}  #okay_nabu
  on_wake_word_detected:
    then:
      - voice_assistant.start:
          wake_word: !lambda 'return wake_word;'

voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 3.0
  vad_threshold: 3

  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: pulse
    # - script.execute: reset_led
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 50%
        green: 50%
        brightness: 100%        
        effect: working
    # - script.execute: reset_led
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 100%
        brightness: 60%
        effect: working
  # on_stt_end: 
  #   - homeassistant.service:
  #       service: media_player.play_media
  #       data:
  #         entity_id: media_player.ke_ting
  #         media_content_id: !lambda return x;
  #         media_content_type: music
  #         announce: "true"

  on_tts_stream_start:
    - output.turn_on: pa_ctrl
    - delay: 100ms
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 50%
        red: 50%
        green: 0%
        brightness: 100%        
        effect: pulse
    # - script.execute: reset_led
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - delay: 300ms
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 20%        
        effect: connecting
  on_end:
    - wait_until:
        not:
          speaker.is_playing:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: reset_led
    - if:
        condition:
          and:
            - switch.is_off: mute
            - lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:

  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          # - script.execute: reset_led
          - delay: 2s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
                - script.execute: reset_led
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: reset_led

  on_client_connected:
    - if:
        condition:
          switch.is_off: mute
        then:
          - if:
              condition:
                lambda: return id(wake_word_engine_location).state == "In Home Assistant";
              then:
                - lambda: id(voice_asst).set_use_wake_word(true);
                - voice_assistant.start_continuous:
          - if:
              condition:
                lambda: return id(wake_word_engine_location).state == "On device";
              then:
                - micro_wake_word.start
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false;
    - script.execute: reset_led

  on_client_disconnected:
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "In Home Assistant";
        then:
          - lambda: id(voice_asst).set_use_wake_word(false);
          - voice_assistant.stop:
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - micro_wake_word.stop
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: reset_led


script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: connecting
          else:
            - light.turn_off: led_ring

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO33 #GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - addressable_scan:
          name: "led_12"
          move_interval: 10ms
          scan_width: 12
      - pulse:
          name: "pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "wakeword"
          colors:
            - red: 0%
              green: 0%
              blue: 100%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "connecting"
          colors:
            - red: 40%
              green: 30%
              blue: 30%
              num_leds: 12
          add_led_interval: 50ms
          reverse: true

switch:
  # - platform: gpio
  #   id: pa_ctrl
  #   pin: GPIO38
  #   name: "${friendly_name} Speaker Mute"
  #   restore_mode: ALWAYS_ON

  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      #- script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led
  
  
  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_asst).set_use_wake_word(true);
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - lambda: id(voice_asst).set_use_wake_word(false);
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};

  - platform: restart
    name: "${name} Restart"
select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - wait_until:
          lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
      - if:
          condition:
            lambda: return x == "In Home Assistant";
          then:
            - micro_wake_word.stop
            - delay: 500ms
            - if:
                condition:
                  switch.is_off: mute
                then:
                  - lambda: id(voice_asst).set_use_wake_word(true);
                  - voice_assistant.start_continuous:
      - if:
          condition:
            lambda: return x == "On device";
          then:
            - lambda: id(voice_asst).set_use_wake_word(false);
            - voice_assistant.stop
            - delay: 500ms
            - micro_wake_word.start

globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
    publish_initial_state : True
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
    publish_initial_state : True
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
    publish_initial_state : True
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
    publish_initial_state : True
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    publish_initial_state : True
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    publish_initial_state : True
    on_press:
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 0%
          green: 100%
          brightness: 100%
          effect: "wakeword"
#    on_release:
#      - voice_assistant.stop:
#      - output.turn_off: pa_ctrl
#      - light.turn_off:
#          id: led_ring
sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 39 #8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF
   

Two things I see that I would change. While I don’t know if it’s a requirement for YAML, I have never seen a configuration file that didn’t have at least one blank line between domains (light, switch, etc.), meaning anything that doesn’t start with a space at the beginning of the line. A comment doesn’t count as a blank line. Also, I would change the formatting for the MLBK part and codec to how it’s formatted in this post. You should just need to copy and paste the below into the config, or add a blank line between domains. I’m going to Google that because I’m unsure if it’s a requirement, but YAML is very specific about spacing so I’m betting it is.

Are you flashing over WiFi, and if so, can I get a full copy of the logs? Start selecting a few lines up from the bottom, then hold Shift and press PgUp, unless it lets you download them. I always get a “not allowed” message or something similar.

Edit: it appears that a blank line used to be required but no longer is, though it still seems like best practice. Without the full logs it’s hard to say why the I2C device and codec both fail. It also never hurts to clean the build files and start from scratch; stale build files can cause issues when you change things around.
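
On the blank-line question, here is a minimal sketch (made-up entries, not taken from the config above) of two top-level domains written back to back with no blank line between them. As far as I know, YAML only cares about indentation here, so this should parse the same as the spaced-out version; the blank line is purely for readability.

```yaml
# Hypothetical example: two top-level domains with no blank line between them.
# Indentation, not blank lines, is what tells YAML where one mapping ends.
switch:
  - platform: restart
    name: "Example Restart"
select:
  - platform: template
    name: "Example select"
    optimistic: true
    options:
      - "A"
      - "B"
```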

i2c:
  sda: GPIO19 #GPIO1
  scl: GPIO32 #GPIO2
  scan: true
  frequency: 400kHz

es8311:
  address: 0x18

es7210:
  address: 0x40

output:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO12 #GPIO38

i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22 #GPIO41 #ws
    i2s_bclk_pin: GPIO25 #GPIO40 #clk
    i2s_mclk_pin: GPIO0 #GPIO42
  - id: mic_adc
    i2s_lrclk_pin: GPIO26 #GPIO9 #ws
    i2s_bclk_pin: GPIO27 #GPIO10 #clk
    i2s_mclk_pin: GPIO0 #GPIO20

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13 #GPIO39
    mode: mono

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36 #GPIO11
    pdm: false

Yeah, as far as I can tell, it’s all valid YAML. I don’t believe spacing is necessary. The codec parts throw an error about the pins being used in other places, which is why I have the properties broken out like this:

i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22 
    i2s_bclk_pin: GPIO25 
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true
  - id: mic_adc
    i2s_lrclk_pin: GPIO26 
    i2s_bclk_pin: GPIO27 
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true

The codecs were failing because I was using the incorrect pin values, so I think we’re good to go there. Now I just need to determine how I can keep the device working/listening. The setup you provided allowed me to get the device working after reboot though, which is a step in the right direction. It just seems like it’s still getting stuck somewhere after a while. I’m flashing over WiFi as well. The initial flash required me to do it over UART, but after that it’s working fine over WiFi. I do believe there is something wrong with the on_error scenario. If it fails once, it doesn’t seem to continue listening. If it succeeds, it does, so I think that’s a good pointer in the logic there. Will look more closely this weekend.
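
For the on_error case, here is a rough sketch of one thing to try. It assumes the voice assistant id is voice_asst and that the mute switch, the wake_word_engine_location select and the reset_led script are the ones from the config posted earlier. It is untested on this board: after an error it waits for the pipeline to stop, then restarts on-device wake word detection so the device keeps listening.

```yaml
voice_assistant:
  id: voice_asst
  # ... existing microphone/speaker options stay as they are ...
  on_error:
    # Assumption: "message" is the error text ESPHome passes to this trigger.
    - logger.log:
        format: "Voice assistant error: %s"
        args: ['message.c_str()']
    - script.execute: reset_led
    - if:
        condition:
          and:
            - switch.is_off: mute
            - lambda: return id(wake_word_engine_location).state == "On device";
        then:
          # Wait for the failed pipeline to wind down before grabbing the mic again.
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:
```

If the wake word engine is set to "In Home Assistant" instead, the else branch would presumably need voice_assistant.start_continuous rather than micro_wake_word.start.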

Also, I’m seeing an error with piping out to an external HA media player.

It seems as though with microWakeWord, media_content_id: !lambda 'return x;' is the actual STT response and not a URL pointing to the speaker’s WAV file. I wonder if instead of doing media_player.play_media, I can do a STT service call instead. I’ll test that out.

Logger: pysqueezebox.player
Source: components/squeezebox/media_player.py:496
First occurred: 2:49:59 PM (1 occurrences)
Last logged: 2:49:59 PM

Timed out waiting for playlist_urls to have value [{'url': ' Turn on the living room lights.'}]

Edit: Sweet, I got some audio going to my speaker now, but it’s effectively just what I asked it to do and not a response, which is odd. It used to be “Ok. Turned off the switch” or something along those lines. It would seem that return x is in fact the STT, and now I feel like an idiot. I am working in the on_stt_end action and not the on_tts_end section… lol, brainfart.

This works for audio feedback via HA media player:

  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.the_kitchen
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
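
To see the difference between the two triggers, here is a small sketch that just logs what x holds in each one. As I understand the ESPHome docs, x in on_stt_end is the recognized text and x in on_tts_end is the URL of the generated audio, which matches the “Speech recognised as” and “Response URL” lines in the logs below. The log message formats are made up for illustration.

```yaml
voice_assistant:
  # ... rest of the voice_assistant config unchanged ...
  on_stt_end:
    # x is the transcribed request, e.g. " Turn on the living room lights."
    - logger.log:
        format: "STT text: %s"
        args: ['x.c_str()']
  on_tts_end:
    # x is the URL of the spoken response, which a media player can fetch
    - logger.log:
        format: "TTS URL: %s"
        args: ['x.c_str()']
```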

Here are the logs:

INFO ESPHome 2024.4.0
INFO Reading configuration /config/kitchen-alexa.yaml...
WARNING GPIO12 is a strapping PIN and should only be used for I/O with care.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins
INFO Generating C++ source...
INFO Updating https://github.com/espressif/[email protected]
INFO Updating submodules (components/esp-sr, components/esp-adf-libs) for https://github.com/espressif/[email protected]
INFO Updating https://github.com/espressif/[email protected]
INFO Updating https://github.com/espressif/esp-tflite-micro@None
INFO Compiling app...
Processing kitchen-alexa (board: esp-wrover-kit; framework: espidf; platform: platformio/[email protected])
--------------------------------------------------------------------------------
HARDWARE: ESP32 240MHz, 320KB RAM, 4MB Flash
 - framework-espidf @ 3.40406.240122 (4.4.6) 
 - tool-cmake @ 3.16.4 
 - tool-ninja @ 1.7.1 
 - toolchain-esp32ulp @ 2.35.0-20220830 
 - toolchain-xtensa-esp32 @ 8.4.0+2021r2-patch5
Reading CMake configuration...
Dependency Graph
|-- noise-c @ 0.1.4
Compiling .pioenvs/kitchen-alexa/src/main.o
Linking .pioenvs/kitchen-alexa/firmware.elf
/config/.esphome/platformio/packages/toolchain-xtensa-esp32/bin/../lib/gcc/xtensa-esp32-elf/8.4.0/../../../../xtensa-esp32-elf/bin/ld: missing --end-group; added as last command line option
RAM:   [=         ]  12.0% (used 39168 bytes from 327680 bytes)
Flash: [==        ]  19.9% (used 1616317 bytes from 8126464 bytes)
Building .pioenvs/kitchen-alexa/firmware.bin
Creating esp32 image...
Successfully created esp32 image.
esp32_create_combined_bin([".pioenvs/kitchen-alexa/firmware.bin"], [".pioenvs/kitchen-alexa/firmware.elf"])
Wrote 0x19c250 bytes to file /config/.esphome/build/kitchen-alexa/.pioenvs/kitchen-alexa/firmware-factory.bin, ready to flash to offset 0x0
========================= [SUCCESS] Took 23.21 seconds =========================
INFO Successfully compiled program.
INFO Connecting to 192.168.1.180
INFO Uploading /config/.esphome/build/kitchen-alexa/.pioenvs/kitchen-alexa/firmware.bin (1622608 bytes)
Uploading: [============================================================] 100% Done...

INFO Upload took 6.52 seconds, waiting for result...
INFO OTA successful
INFO Successfully uploaded program.
INFO Starting log output from 192.168.1.180 using esphome API
INFO Successfully connected to kitchen-alexa @ 192.168.1.180 in 7.136s
INFO Successful handshake with kitchen-alexa @ 192.168.1.180 in 0.060s
[14:41:44][I][app:100]: ESPHome version 2024.4.0 compiled on Apr 19 2024, 14:41:04
[14:41:44][I][app:102]: Project esphome.voice-assistant version 2.0
[14:41:44][C][wifi:580]: WiFi:
[redacted]
[14:41:44][C][logger:166]: Logger:
[14:41:44][C][logger:167]:   Level: DEBUG
[14:41:44][C][logger:169]:   Log Baud Rate: 115200
[14:41:44][C][logger:170]:   Hardware UART: UART0
[14:41:44][C][i2c.idf:075]: I2C Bus:
[14:41:44][C][i2c.idf:076]:   SDA Pin: GPIO19
[14:41:44][C][i2c.idf:077]:   SCL Pin: GPIO32
[14:41:44][C][i2c.idf:078]:   Frequency: 400000 Hz
[14:41:44][C][i2c.idf:084]:   Recovery: bus successfully recovered
[14:41:44][I][i2c.idf:094]: Results from i2c bus scan:
[14:41:44][I][i2c.idf:100]: Found i2c device at address 0x18
[14:41:44][I][i2c.idf:100]: Found i2c device at address 0x40
[14:41:44][C][gpio.output:010]: GPIO Binary Output:
[14:41:44][C][gpio.output:011]:   Pin: GPIO12
[14:41:44][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[14:41:44][C][esp32_rmt_led_strip:176]:   Pin: 33
[14:41:44][C][esp32_rmt_led_strip:177]:   Channel: 0
[14:41:44][C][esp32_rmt_led_strip:202]:   RGB Order: GRB
[14:41:44][C][esp32_rmt_led_strip:203]:   Max refresh rate: 0
[14:41:44][C][esp32_rmt_led_strip:204]:   Number of LEDs: 12
[14:41:44][C][template.select:065]: Template Select 'Wake word engine location'
[14:41:44][C][template.select:066]:   Update Interval: 60.0s
[14:41:44][C][template.select:069]:   Optimistic: YES
[14:41:44][C][template.select:070]:   Initial Option: On device
[14:41:44][C][template.select:071]:   Restore Value: YES
[14:41:44][C][template.binary_sensor:028]: Template Binary Sensor 'Kitchen Alexa Volume Up'
[14:41:44][C][template.binary_sensor:028]: Template Binary Sensor 'Kitchen Alexa Volume Down'
[14:41:45][C][template.binary_sensor:028]: Template Binary Sensor 'Kitchen Alexa Set'
[14:41:45][C][template.binary_sensor:028]: Template Binary Sensor 'Kitchen Alexa Play'
[14:41:45][C][template.binary_sensor:028]: Template Binary Sensor 'Kitchen Alexa Mode'
[14:41:45][C][template.binary_sensor:028]: Template Binary Sensor 'Kitchen Alexa Record'
[14:41:45][D][micro_wake_word:177]: State changed from IDLE to START_MICROPHONE
[14:41:45][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:41:45][D][light:051]:   Brightness: 100%
[14:41:45][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[14:41:45][D][micro_wake_word:115]: Starting Microphone
[14:41:45][D][micro_wake_word:177]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[14:41:45][D][esp-idf:000]: I (8419) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[14:41:45][D][esp-idf:000]: I (8428) I2S: I2S1, MCLK output by GPIO0

[14:41:45][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[14:41:45][C][light:103]: Light 'Kitchen Alexa Light'
[14:41:45][C][light:105]:   Default Transition Length: 0.0s
[14:41:45][C][light:106]:   Gamma Correct: 2.80
[14:41:45][C][template.switch:068]: Template Switch 'Use wake word'
[14:41:45][C][template.switch:091]:   Restore Mode: restore defaults to ON
[14:41:45][C][template.switch:057]:   Optimistic: YES
[14:41:45][C][template.switch:068]: Template Switch 'Mute'
[14:41:45][C][template.switch:091]:   Restore Mode: restore defaults to OFF
[14:41:45][C][template.switch:057]:   Optimistic: YES
[14:41:45][W][micro_wake_word:157]: Wake word is already running
[14:41:45][C][restart:068]: Restart Switch 'kitchen-alexa Restart'
[14:41:45][C][restart:070]:   Icon: 'mdi:restart'
[14:41:45][C][restart:091]:   Restore Mode: always OFF
[14:41:45][C][adc:097]: ADC Sensor 'button_adc'
[14:41:45][C][adc:097]:   Device Class: 'voltage'
[14:41:45][C][adc:097]:   State Class: 'measurement'
[14:41:45][C][adc:097]:   Unit of Measurement: 'V'
[14:41:45][C][adc:097]:   Accuracy Decimals: 2
[14:41:45][C][adc:107]:   Pin: GPIO39
[14:41:45][C][adc:122]:  Attenuation: 11db
[14:41:45][C][adc:142]:   Update Interval: 0.015s
[14:41:45][C][captive_portal:088]: Captive Portal:
[14:41:45][C][mdns:115]: mDNS:
[14:41:45][C][mdns:116]:   Hostname: kitchen-alexa
[14:41:45][C][ota:096]: Over-The-Air Updates:
[14:41:45][C][ota:097]:   Address: 192.168.1.180:3232
[14:41:45][C][ota:100]:   Using Password.
[14:41:45][C][ota:103]:   OTA version: 2.
[14:41:45][C][api:139]: API Server:
[14:41:45][C][api:140]:   Address: 192.168.1.180:6053
[14:41:45][C][api:142]:   Using noise encryption: YES
[14:41:45][C][micro_wake_word:057]: microWakeWord:
[14:41:45][C][micro_wake_word:058]:   Wake Word: alexa
[14:41:45][C][micro_wake_word:059]:   Probability cutoff: 0.660
[14:41:45][C][micro_wake_word:060]:   Sliding window size: 10
[14:41:45][C][es8311:167]: ES8311 Audio Codec:
[14:41:45][C][es8311:168]:   Use MCLK: YES
[14:41:45][C][es7210:035]: ES7210 Audio Codec:
[14:42:20][D][esp32.preferences:114]: Saving 1 preferences to flash...
[14:42:20][D][esp32.preferences:143]: Saving 1 preferences to flash: 1 cached, 0 written, 0 failed
[14:42:27][D][micro_wake_word:362]: Wake word sliding average probability is 0.700 and most recent probability is 1.000
[14:42:27][D][micro_wake_word:128]: Wake Word Detected
[14:42:27][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[14:42:27][D][micro_wake_word:134]: Stopping Microphone
[14:42:27][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[14:42:27][D][esp-idf:000]: I (50789) I2S: DMA queue destroyed

[14:42:27][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE
[14:42:27][D][voice_assistant:439]: State changed from IDLE to START_PIPELINE
[14:42:27][D][voice_assistant:445]: Desired state set to START_MICROPHONE
[14:42:27][D][voice_assistant:126]: microphone not running
[14:42:27][D][voice_assistant:210]: Requesting start...
[14:42:27][D][voice_assistant:439]: State changed from START_PIPELINE to STARTING_PIPELINE
[14:42:27][D][voice_assistant:476]: Client started, streaming microphone
[14:42:27][D][voice_assistant:439]: State changed from STARTING_PIPELINE to START_MICROPHONE
[14:42:27][D][voice_assistant:445]: Desired state set to STREAMING_MICROPHONE
[14:42:27][D][voice_assistant:163]: Starting Microphone
[14:42:27][D][voice_assistant:439]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[14:42:27][D][esp-idf:000]: I (50871) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[14:42:27][D][esp-idf:000]: I (50881) I2S: I2S1, MCLK output by GPIO0

[14:42:27][D][voice_assistant:563]: Event Type: 1
[14:42:27][D][voice_assistant:566]: Assist Pipeline running
[14:42:27][D][voice_assistant:439]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[14:42:27][D][voice_assistant:563]: Event Type: 3
[14:42:27][D][voice_assistant:577]: STT started
[14:42:27][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:27][D][light:051]:   Brightness: 100%
[14:42:27][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[14:42:27][D][light:109]:   Effect: 'pulse'
[14:42:28][D][voice_assistant:563]: Event Type: 11
[14:42:28][D][voice_assistant:717]: Starting STT by VAD
[14:42:30][D][voice_assistant:563]: Event Type: 12
[14:42:30][D][voice_assistant:721]: STT by VAD end
[14:42:30][D][voice_assistant:439]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[14:42:30][D][voice_assistant:445]: Desired state set to AWAITING_RESPONSE
[14:42:30][D][voice_assistant:439]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[14:42:30][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:30][D][light:051]:   Brightness: 100%
[14:42:30][D][light:059]:   Red: 100%, Green: 100%, Blue: 0%
[14:42:30][D][light:109]:   Effect: 'working'
[14:42:30][D][esp-idf:000]: I (53838) I2S: DMA queue destroyed

[14:42:30][D][voice_assistant:439]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[14:42:31][D][voice_assistant:563]: Event Type: 4
[14:42:31][D][voice_assistant:591]: Speech recognised as: " Turn on the living room lights."
[14:42:31][D][voice_assistant:563]: Event Type: 5
[14:42:31][D][voice_assistant:596]: Intent started
[14:42:31][D][voice_assistant:563]: Event Type: 6
[14:42:31][D][voice_assistant:563]: Event Type: 7
[14:42:31][D][voice_assistant:619]: Response: "Turned on the switch"
[14:42:31][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:31][D][light:051]:   Brightness: 60%
[14:42:31][D][light:059]:   Red: 100%, Green: 100%, Blue: 0%
[14:42:31][D][voice_assistant:563]: Event Type: 8
[14:42:31][D][voice_assistant:639]: Response URL: "http://192.168.1.68:8123/api/tts_proxy/f2d7f52c511073b44cb4de9f96586d9479fd3630_en-gb_bafa2b33d1_tts.piper.wav"
[14:42:31][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[14:42:31][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[14:42:31][D][esp-idf:000]: I (54714) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[14:42:31][D][esp-idf:000]: I (54716) I2S: I2S0, MCLK output by GPIO0

[14:42:31][D][i2s_audio.speaker:164]: Started I2S Audio Speaker
[14:42:31][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:31][D][light:051]:   Brightness: 100%
[14:42:31][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[14:42:31][D][light:109]:   Effect: 'pulse'
[14:42:32][D][voice_assistant:563]: Event Type: 99
[14:42:32][D][voice_assistant:712]: TTS stream end
[14:42:32][D][voice_assistant:310]: End of audio stream received
[14:42:32][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[14:42:32][D][voice_assistant:445]: Desired state set to RESPONSE_FINISHED
[14:42:32][D][esp-idf:000]: I (56103) I2S: DMA queue destroyed

[14:42:32][D][i2s_audio.speaker:178]: Stopped I2S Audio Speaker
[14:42:32][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:32][D][light:051]:   Brightness: 100%
[14:42:32][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[14:42:32][D][light:109]:   Effect: 'connecting'
[14:42:32][D][voice_assistant:342]: Speaker has finished outputting all audio
[14:42:32][D][voice_assistant:439]: State changed from RESPONSE_FINISHED to IDLE
[14:42:32][D][voice_assistant:445]: Desired state set to IDLE
[14:42:32][D][micro_wake_word:177]: State changed from IDLE to START_MICROPHONE
[14:42:32][D][micro_wake_word:115]: Starting Microphone
[14:42:32][D][micro_wake_word:177]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[14:42:32][D][esp-idf:000]: I (56152) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[14:42:32][D][esp-idf:000]: I (56153) I2S: I2S1, MCLK output by GPIO0

[14:42:32][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[14:42:33][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:33][D][light:051]:   Brightness: 20%
[14:42:33][D][light:059]:   Red: 0%, Green: 100%, Blue: 0%
[14:42:50][D][micro_wake_word:362]: Wake word sliding average probability is 0.755 and most recent probability is 1.000
[14:42:50][D][micro_wake_word:128]: Wake Word Detected
[14:42:50][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[14:42:50][D][micro_wake_word:134]: Stopping Microphone
[14:42:50][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[14:42:50][D][esp-idf:000]: I (73377) I2S: DMA queue destroyed

[14:42:50][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE
[14:42:50][D][voice_assistant:439]: State changed from IDLE to START_PIPELINE
[14:42:50][D][voice_assistant:445]: Desired state set to START_MICROPHONE
[14:42:50][D][voice_assistant:126]: microphone not running
[14:42:50][D][voice_assistant:210]: Requesting start...
[14:42:50][D][voice_assistant:439]: State changed from START_PIPELINE to STARTING_PIPELINE
[14:42:50][D][voice_assistant:476]: Client started, streaming microphone
[14:42:50][D][voice_assistant:439]: State changed from STARTING_PIPELINE to START_MICROPHONE
[14:42:50][D][voice_assistant:445]: Desired state set to STREAMING_MICROPHONE
[14:42:50][D][voice_assistant:163]: Starting Microphone
[14:42:50][D][voice_assistant:439]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[14:42:50][D][esp-idf:000]: I (73460) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[14:42:50][D][esp-idf:000]: I (73469) I2S: I2S1, MCLK output by GPIO0

[14:42:50][D][voice_assistant:563]: Event Type: 1
[14:42:50][D][voice_assistant:566]: Assist Pipeline running
[14:42:50][D][voice_assistant:439]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[14:42:50][D][voice_assistant:563]: Event Type: 3
[14:42:50][D][voice_assistant:577]: STT started
[14:42:50][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:50][D][light:051]:   Brightness: 100%
[14:42:50][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[14:42:50][D][light:109]:   Effect: 'pulse'
[14:42:52][D][voice_assistant:563]: Event Type: 11
[14:42:52][D][voice_assistant:717]: Starting STT by VAD
[14:42:54][D][voice_assistant:563]: Event Type: 12
[14:42:54][D][voice_assistant:721]: STT by VAD end
[14:42:54][D][voice_assistant:439]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[14:42:54][D][voice_assistant:445]: Desired state set to AWAITING_RESPONSE
[14:42:54][D][voice_assistant:439]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[14:42:54][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:54][D][light:051]:   Brightness: 100%
[14:42:54][D][light:059]:   Red: 100%, Green: 100%, Blue: 0%
[14:42:54][D][light:109]:   Effect: 'working'
[14:42:54][D][esp-idf:000]: I (77450) I2S: DMA queue destroyed

[14:42:54][D][voice_assistant:439]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[14:42:54][D][voice_assistant:563]: Event Type: 4
[14:42:54][D][voice_assistant:591]: Speech recognised as: " Turn off the living room lights."
[14:42:54][D][voice_assistant:563]: Event Type: 5
[14:42:54][D][voice_assistant:596]: Intent started
[14:42:54][D][voice_assistant:563]: Event Type: 6
[14:42:54][D][voice_assistant:563]: Event Type: 7
[14:42:54][D][voice_assistant:619]: Response: "Turned off the switch"
[14:42:54][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:54][D][light:051]:   Brightness: 60%
[14:42:54][D][light:059]:   Red: 100%, Green: 100%, Blue: 0%
[14:42:54][D][voice_assistant:563]: Event Type: 8
[14:42:54][D][voice_assistant:639]: Response URL: "http://192.168.1.68:8123/api/tts_proxy/27e798e3b325b45707f464b7c25d263324181d1d_en-gb_bafa2b33d1_tts.piper.wav"
[14:42:54][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[14:42:54][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[14:42:54][D][esp-idf:000]: I (78264) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[14:42:54][D][esp-idf:000]: I (78266) I2S: I2S0, MCLK output by GPIO0

[14:42:54][D][i2s_audio.speaker:164]: Started I2S Audio Speaker
[14:42:55][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:55][D][light:051]:   Brightness: 100%
[14:42:55][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[14:42:55][D][light:109]:   Effect: 'pulse'
[14:42:56][D][voice_assistant:563]: Event Type: 99
[14:42:56][D][voice_assistant:712]: TTS stream end
[14:42:56][D][voice_assistant:310]: End of audio stream received
[14:42:56][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[14:42:56][D][voice_assistant:445]: Desired state set to RESPONSE_FINISHED
[14:42:56][D][esp-idf:000]: I (79813) I2S: DMA queue destroyed

[14:42:56][D][i2s_audio.speaker:178]: Stopped I2S Audio Speaker
[14:42:56][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:56][D][light:051]:   Brightness: 100%
[14:42:56][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[14:42:56][D][light:109]:   Effect: 'connecting'
[14:42:56][D][voice_assistant:342]: Speaker has finished outputting all audio
[14:42:56][D][voice_assistant:439]: State changed from RESPONSE_FINISHED to IDLE
[14:42:56][D][voice_assistant:445]: Desired state set to IDLE
[14:42:56][D][micro_wake_word:177]: State changed from IDLE to START_MICROPHONE
[14:42:56][D][micro_wake_word:115]: Starting Microphone
[14:42:56][D][micro_wake_word:177]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[14:42:56][D][esp-idf:000]: I (79861) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[14:42:56][D][esp-idf:000]: I (79863) I2S: I2S1, MCLK output by GPIO0

[14:42:56][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[14:42:56][D][light:036]: 'Kitchen Alexa Light' Setting:
[14:42:56][D][light:051]:   Brightness: 20%
[14:42:56][D][light:059]:   Red: 0%, Green: 100%, Blue: 0%

You mean you followed the instructions…

Correct. It had been working for a while but I changed my network mode for some other reason not associated with voice. Then I read “host mode only” and took a peek.

Here’s the korvo-1 (not V1.1) with continued conversation. This allows it to listen for multiple commands. The slider sets how many seconds pass after the last command is heard before it stops listening. I need to snag the URL of the site I got the YAML from. You have to use a substitution for the wake word model or it doesn’t work, as it’s used in scripts and other stuff.

micro_wake_word:
  model: ${micro_wake_word_model}
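
For anyone who only wants the idea, the continued-conversation toggle and the timeout slider can be sketched as a template switch plus a template number. The id continued_conversation matches the condition used in the config that follows; the number’s id, name and range are my own placeholders, not taken from the original YAML.

```yaml
switch:
  - platform: template
    id: continued_conversation
    name: Continued conversation
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config

number:
  - platform: template
    # Hypothetical id and range: seconds to keep listening after the last command.
    id: conversation_timeout
    name: Conversation timeout
    optimistic: true
    mode: slider
    min_value: 1
    max_value: 30
    step: 1
    initial_value: 5
    restore_value: true
    unit_of_measurement: s
    entity_category: config
```

The number only stores the value; something else (a script with a delay, for example) would still have to read it and stop the assistant when the time runs out.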

substitutions:
  name: "korvo"
  friendly_name: korvo

  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"

  micro_wake_word_model: hey_jarvis

esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  project:
    name: esphome.voice-assistant
    version: "1.0"
  min_version: 2023.11.5
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            red: 0%
            blue: 0%
            green: 100%
            brightness: 100%
            effect: random
        - if:
            condition:
              lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;

esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
      CONFIG_ESP32_S3_KORVO1_BOARD: "y"
    components:
      - name: esp32_s3_korvo1_board
        #source: github://espressif/components/hardware_driver@main
        source: github://abmantis/esphome_custom_audio_boards@main
        refresh: 0s

psram:
  mode: octal
  speed: 80MHz

external_components:
  - source: github://pr#5230
    components: esp_adf

ota:
logger:
api:
  encryption:
    key: YOUR_API_KEY
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_off: mute
          then:
            - delay: 20ms
            - ble.disable:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 0%
          green: 100%
          brightness: 50%
          effect: led12              
  on_client_disconnected:
    then:
      - ble.enable
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting

dashboard_import:
  package_import_url: github://esphome/firmware/voice-assistant/esp32-s3-korvo1.yaml@main

wifi:
  ssid: !secret wifi_ssid 
  password: !secret wifi_password
  use_address: 192.168.0.48
  ap:
  on_connect:
    then:
      - delay: 20ms # Gives time for improv results to be transmitted
      - ble.disable:
      - delay: 5s
  on_disconnect:
    then:
      - ble.enable:

improv_serial:

esp32_improv:
  authorizer: none

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

esp_adf:
  board: esp32s3korvo1

microphone:
  - platform: esp_adf
    id: korvo_mic

speaker:
  - platform: esp_adf
    id: korvo_speaker

micro_wake_word:
  model: ${micro_wake_word_model}
  on_wake_word_detected:
    - if:
        condition:
          switch.is_on: continued_conversation
        then:
          - lambda: id(va).set_use_wake_word(false);
          - voice_assistant.start_continuous: # wake_word: !lambda return wake_word; not yet supported for continuous
        else:
          - lambda: id(va).set_use_wake_word(false);
          - voice_assistant.start:
              wake_word: !lambda return wake_word;
    - light.turn_on:
        id: led_ring      
        red: 30%
        green: 30%
        blue: 70%
        brightness: 100%
        effect: wakeword
  
voice_assistant:
  id: va
  microphone: korvo_mic
  speaker: korvo_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 3.0
  vad_threshold: 3
  on_listening:     
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};    
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: led12
  #   - script.execute: reset_led    
    - if:
        condition:
          - switch.is_on: continued_conversation
        then:
          - script.execute: stt_timeout_to_idle
          - lambda: id(va).set_use_wake_word(false);  
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: reset_led
    - if:
        condition:
          - switch.is_on: continued_conversation
        then:
          - script.execute: stt_timeout_to_idle    
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 50%
        green: 50%
        brightness: 100%        
        effect: working
  #  - script.execute: reset_led    
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 50%
        red: 50%
        green: 0%
        brightness: 100%        
        effect: pulse
    #- script.execute: reset_led    
  on_tts_stream_end:
    - if:
        condition:
          - switch.is_on: continued_conversation
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
          - script.execute: stt_timeout_to_idle
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 20%        
        effect: connecting
   # - script.execute: reset_led    
  on_end:
    - if:
        condition:
          and:
            - switch.is_off: mute
            - lambda: return id(wake_word_engine_location).state == "On device";
            - switch.is_off: continued_conversation
        then:
          - script.execute: return_to_idle
  on_error:
    - if:
        condition:
          and:
            - lambda: return !id(init_in_progress);
            - lambda: return !(code == "stt-no-text-recognized");
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: reset_led
          - delay: 1s
          - script.stop: stt_timeout_to_idle
          - if:
              condition:
                switch.is_off: mute
              then:
                # - logger.log: "(on_client_connected) Returning to idle by script"
                - script.execute: return_to_idle
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: reset_led
  on_client_connected:
    - if:
        condition:
          switch.is_off: mute
        then:
          - wait_until:
              not: ble.enabled
          - script.execute: return_to_idle
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false;
    - script.execute: reset_led
  on_client_disconnected:
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "In Home Assistant";
        then:
          - lambda: id(va).set_use_wake_word(false);
          - voice_assistant.stop:
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - micro_wake_word.stop
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    
    
script:
  - id: stt_timeout_to_idle
    mode: restart     # restart the timer if called before timeout
    then:
      if:
        condition:
          switch.is_on: continued_conversation
        then:
          - delay: !lambda "return id(continued_timeout).state * 1000;"
          - if:
              condition:
                lambda: return (id(voice_assistant_phase) == ${voice_assist_replying_phase_id});
              then:
                - wait_until:
                    condition:
                      lambda: return !(id(voice_assistant_phase) == ${voice_assist_replying_phase_id});
                      # normally this would complete and move to next phase with on_tts_stream_end,
                      # but sometimes this is missed so put a time limit on the wait
                    timeout: 10s
                - delay: 1s  # Give time for the stream to end and the phase to be switched back to listening and this timeout to be reset
          - script.execute: return_to_idle
          
  - id: return_to_idle
    then:
      - if:
          condition:
            lambda: return id(wake_word_engine_location).state == "On device";
          then:
            - script.stop: stt_timeout_to_idle
            - if:
                condition:
                  voice_assistant.is_running
                then:
                  - lambda: id(va).set_use_wake_word(false);
                  - voice_assistant.stop
                  - wait_until:
                      condition:
                        not:
                          voice_assistant.is_running
                      timeout: 5s
            - if:
                condition:
                  not:
                    micro_wake_word.is_running
                then:
                  - micro_wake_word.start:
      - if:
          condition:
            lambda: return id(wake_word_engine_location).state == "In Home Assistant";
          then:
            - if:
                condition:
                  micro_wake_word.is_running
                then:
                  - micro_wake_word.stop:
                  - wait_until:
                      condition:
                        not:
                          micro_wake_word.is_running
                      timeout: 5s
            - wait_until:
                condition:
                  not:
                    voice_assistant.is_running
                timeout: 5s
            - lambda: id(va).set_use_wake_word(true);
            - if:
                condition:
                  - not:
                      voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous:
      - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};      
      
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_off: mute
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: connecting
          else:
            - light.turn_off: led_ring

switch:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO38
    name: "${friendly_name} Speaker Mute"
    restore_mode: ALWAYS_ON

  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - script.execute: return_to_idle
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(va).set_use_wake_word(false);
            - if:
                condition:
                  - voice_assistant.is_running
                then:
                  - voice_assistant.stop
            - if:
                condition:
                  - micro_wake_word.is_running
                then:
                  - micro_wake_word.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: reset_led
            
  - platform: template
    name: Continued conversation
    id: continued_conversation
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - script.execute: return_to_idle
    on_turn_on:
      - script.execute: return_to_idle
      
  - platform: restart
    name: "korvo restart"        

number:
  - platform: template
    entity_category: config
    name: Continued timeout
    id: continued_timeout
    icon: mdi:clock
    optimistic: true
    restore_value: true
    initial_value: 8
    min_value: 1
    step: 1
    max_value: 20
    unit_of_measurement: s
    mode: slider
 

select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - wait_until:
          lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
      - script.execute: return_to_idle

globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    is_rgbw: true
    rgb_order: GRB    
    pin: GPIO19
    num_leds: 12
    rmt_channel: 0
    chipset: WS2812
    name: "${friendly_name} Light"
    default_transition_length: 1s
    effects:
      - addressable_scan:
          name: "led12"
          move_interval: 10ms
          scan_width: 12
      - pulse:
          name: "pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "wakeword"
          colors:
            - red: 0%
              green: 0%
              blue: 100%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "connecting"
          colors:
            - red: 40%
              green: 30%
              blue: 30%
              num_leds: 12
          add_led_interval: 50ms
          reverse: true

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
    on_multi_click:
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn    
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    on_press:
      - voice_assistant.start_continuous:
  - platform: template
    name: "${friendly_name} Record" 
    id: btn_record
    on_press:
      - voice_assistant.start:
      #- lambda: id(va).set_use_wake_word(true);
          
sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
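      # NOTE: 1.01-2.56 overlaps the Set, Play and Mode ranges above, so pressing
      # those buttons also publishes Record ON. Narrow this range if that is unintended.
      - above: 1.01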
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.3
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF

Have you added the media_player itself? According to the docs, it appears the ID is also a required field, but I have not tried to use it as a media player either.

All media_player actions can be used without specifying an id if you have only one media_player in your configuration YAML
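For what it's worth, here is a rough sketch of the two forms that quote is talking about, using media_player.play_media. The id korvo_media and the URL are placeholders I made up. They are not part of the config above, and I haven't tested a media player on the Korvo.

```yaml
# Sketch only: `korvo_media` and the URL are hypothetical placeholders.
# Either action goes under any automation trigger (an `on_press:` for example).

# Short form: no id, fine when the config has exactly one media_player
- media_player.play_media: 'http://example.com/chime.mp3'

# Full form: names the player explicitly, needed once there is more than one
- media_player.play_media:
    id: korvo_media
    media_url: 'http://example.com/chime.mp3'
```

Either way, a media_player entry still has to exist in the YAML before these actions will validate.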

So continued conversation is either awesome or frustrating depending on the situation. It works flawlessly with no audio source like a TV or music near it. If there is one, the Korvo picks up that audio and gets stuck in a loop, saying “sorry, I didn’t understand that” over and over. There was an enhancement request for full support, but the ESPHome team has been busy fixing issues, and bug reports always get fixed before enhancement requests for obvious reasons. On top of that, most of the examples are for the S3 box variants, which have a screen, so you have to remove all that stuff. If they added something like a “stop listening” command, which should be fairly easy to do, it would be way better because it would fix the external audio loop issue. It is nice to be able to give it multiple commands without having to say the wake word every single time.
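Until something like that exists, one possible workaround is a sentence-trigger automation on the Home Assistant side that switches continued conversation off, since the switch's on_turn_off in the config above runs return_to_idle. This is an untested sketch. The entity id switch.korvo_continued_conversation is my guess at what the switch ends up being called, so check yours, and I don't know how reliably it will hear the sentence over a TV.

```yaml
# Untested sketch for automations.yaml in Home Assistant.
# Assumption: the ESPHome "Continued conversation" switch shows up as
# switch.korvo_continued_conversation (hypothetical entity id).
- alias: "Korvo stop listening"
  trigger:
    - platform: conversation
      command:
        - "stop listening"
  action:
    # Turning the switch off fires its on_turn_off, which runs return_to_idle
    - service: switch.turn_off
      target:
        entity_id: switch.korvo_continued_conversation
```

The catch is that continued conversation stays off afterwards, so you have to turn the switch back on when you want it again.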

One small issue is that I’ve had it lock up twice, but only after applying the latest core update yesterday. Before that it didn’t freeze up, so I’m not sure if I just got lucky before the update or if the core update caused the issue.

Enhancement request

ESPHome code that I merged into my working config file.