"ReSpeaker Lite" - new Seeed Studio Voice Assistant Development Kit hardware combine ESP32 with XMOS XU316 DSP chip for advanced audio processing as a ESPHome-based Home Assistant Assist Satellite voice devkit

  1. I doubt it’ll play m4a. Take flac or mp3.
  2. IP is still there, you need IP address there.

P.S on this stage I already don’t know which YAML you use. :frowning:

sorry got yours from .

I did change the IP to my HA IP but just did not want to put in the forum here so just put IP
I used another mp3 I had but with the same result

tried this too

http://192.168.0.110:8123/local/sounds/Message-Notification.mp3
image

Maybe because I am on latest esphome?

Then the problem is somewhere else.
Expecting colon - probably, you got colon deleted from somewhere. There’s plenty of colons in YAML. :slight_smile:
Check lines 340 and 342 for syntax.

Here is my full working config with the API, OTA and HS removed. Timers are not setup as I created some custom timers via HA sentence/response automations. I’m not sure what all audio formats are supported. Mp3 and wav formats are.

To get it up and running just leave as it and change later. The defaults.dont cause issues or warnings. I’m betting there is just no audio output for a timer and an error in the log files when it finishes about not finding the file. Just update the API key and OTA and HS passwords to match what’s in your current file. This isn’t required but API is probably not considered a strong encryption key :wink:

If not updating directly in the web ui I suggest a decent file editor like Notepad++ , Notepad can cause encoding and other weird issues like line indention. I think it’s pretty rare but it happens.

I would cut and paste your current config to notepad (we just need 5 things), paste.the below directly into ESPHome and change name, friendly name, and the API/OTA/HS info

substitutions:
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
  voice_assist_timer_finished_phase_id: "20"
esphome:
  name: respeakerv3
  friendly_name: Respeakerv3
  min_version: 2024.9.0
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    priority: 600
    then:
      - script.execute: draw_display
      - delay: 30s
      - if:
          condition:
            lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
            - script.execute: draw_display
esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  flash_size: 8MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_ESP32S3_INSTRUCTION_CACHE_32KB: "y"
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_SPIRAM_ALLOW_STACK_EXTERNAL_MEMORY: "y"
      
      CONFIG_SPIRAM_TRY_ALLOCATE_WIFI_LWIP: "y"

      # Settings based on https://github.com/espressif/esp-adf/issues/297#issuecomment-783811702
      CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM: "16"
      CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM: "512"
      CONFIG_ESP32_WIFI_STATIC_TX_BUFFER: "y"
      CONFIG_ESP32_WIFI_TX_BUFFER_TYPE: "0"
      CONFIG_ESP32_WIFI_STATIC_TX_BUFFER_NUM: "8"
      CONFIG_ESP32_WIFI_CACHE_TX_BUFFER_NUM: "32"
      CONFIG_ESP32_WIFI_AMPDU_TX_ENABLED: "y"
      CONFIG_ESP32_WIFI_TX_BA_WIN: "16"
      CONFIG_ESP32_WIFI_AMPDU_RX_ENABLED: "y"
      CONFIG_ESP32_WIFI_RX_BA_WIN: "32"
      CONFIG_LWIP_MAX_ACTIVE_TCP: "16"
      CONFIG_LWIP_MAX_LISTENING_TCP: "16"
      CONFIG_TCP_MAXRTX: "12"
      CONFIG_TCP_SYNMAXRTX: "6"
      CONFIG_TCP_MSS: "1436"
      CONFIG_TCP_MSL: "60000"
      CONFIG_TCP_SND_BUF_DEFAULT: "65535"
      CONFIG_TCP_WND_DEFAULT: "65535"  # Adjusted from linked settings to avoid compilation error
      CONFIG_TCP_RECVMBOX_SIZE: "512"
      CONFIG_TCP_QUEUE_OOSEQ: "y"
      CONFIG_TCP_OVERSIZE_MSS: "y"
      CONFIG_LWIP_WND_SCALE: "y"
      CONFIG_TCP_RCV_SCALE: "3"
      CONFIG_LWIP_TCPIP_RECVMBOX_SIZE: "512"

      CONFIG_BT_ALLOCATION_FROM_SPIRAM_FIRST: "y"
      CONFIG_BT_BLE_DYNAMIC_ENV_MEMORY: "y"

psram:
  mode: octal  # quad for N8R2 and octal for N16R8
  speed: 80MHz

external_components:
  - source:
      type: git
      url: https://github.com/esphome/voice-kit
      ref: dev
    components:
      - aic3204
      - audio_dac
      - media_player
      - micro_wake_word
      - microphone
      - nabu
      - nabu_microphone
      - voice_assistant
      - voice_kit
    refresh: 0s

api:
  encryption:
    key: "API"  
  on_client_connected:
    - script.execute: draw_display
  on_client_disconnected:
    - script.execute: draw_display

ota:
  - platform: esphome
    id: ota_esphome
    password: "OTA"

logger:

wifi:
  ssid: !secret wifi_ssid 
  password: !secret wifi_password
  on_connect:
    - script.execute: draw_display
  on_disconnect:
    - script.execute: draw_display
  #ssid: !secret wifi_ssid
  #password: !secret wifi_password
  #manual_ip:
  #  static_ip: !secret respeaker_satellite_1_ip
  #  gateway: !secret gateway
  #  subnet: !secret subnet
  #  dns1: !secret dns1
    
  ap:
    ssid: "Respeakerv3 Fallback Hotspot"
    password: "HS"      
  

captive_portal:

button:
  - platform: safe_mode
    id: button_safe_mode
    name: Safe Mode Boot
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset
  - platform: restart
    name: Restart
    id: but_rest

light:
  - platform: esp32_rmt_led_strip
    id: led_ww
    rgb_order: grb
    pin: GPIO1
    num_leds: 1
    rmt_channel: 0
    chipset: ws2812
    name: none
    disabled_by_default: true
    entity_category: config
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 1
          add_led_interval: 40ms
          reverse: false


 # Audio and Voice Assistant Config  

i2s_audio:
  - id: i2s_output
    i2s_lrclk_pin: 
      number: GPIO7
      allow_other_uses: true
    i2s_bclk_pin:  
      number: GPIO8
      allow_other_uses: true
    i2s_mclk_pin:  
      number: GPIO9
      allow_other_uses: true

  - id: i2s_input
    i2s_lrclk_pin:  
      number: GPIO7
      allow_other_uses: true
    i2s_bclk_pin:  
      number: GPIO8
      allow_other_uses: true
    i2s_mclk_pin:  
      number: GPIO9
      allow_other_uses: true

microphone:
  - platform: nabu_microphone
    i2s_din_pin: GPIO44
    adc_type: external
    pdm: false
    sample_rate: 16000
    bits_per_sample: 32bit
    i2s_mode: secondary
    i2s_audio_id: i2s_input
    channel_0:
      id: nabu_mic_mww
    channel_1:
      id: nabu_mic_va
      

media_player:
  - platform: nabu
    id: nabu_media_player
    name: Media Player
    internal: false
    sample_rate: 16000
    i2s_dout_pin: GPIO43
    bits_per_sample: 32bit
    i2s_mode: secondary
    i2s_audio_id: i2s_output
    volume_increment: 0.05
    volume_min: 0.4
    volume_max: 0.85
    on_announcement:
      - nabu.set_ducking:
          decibel_reduction: 20
          duration: 0.0s
    on_state:
      if:
        condition:
          and:
            - not:
                voice_assistant.is_running:
            - not:
                lambda: return id(nabu_media_player)->state == media_player::MediaPlayerState::MEDIA_PLAYER_STATE_ANNOUNCING;
        then:
          - nabu.set_ducking:
              decibel_reduction: 0
              duration: 1.0s

micro_wake_word:
  models:
    - model: https://github.com/kahrendt/microWakeWord/releases/download/okay_nabu/okay_nabu.json
  vad:
  microphone: nabu_mic_mww
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;

voice_assistant:
  id: va
  microphone: nabu_mic_va
  media_player: nabu_media_player
  noise_suppression_level: 4
  auto_gain: 31dBFS
  volume_multiplier: 1
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - micro_wake_word.start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: draw_display
  on_client_disconnected:
    - voice_assistant.stop:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: draw_display
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: draw_display
          - delay: 1s
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - script.execute: draw_display
  on_start:
    - nabu.set_ducking:
        decibel_reduction: 20   # Number of dB quieter; higher implies more quiet, 0 implies full volume
        duration: 0.0s          # The duration of the transition (default is 0)
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: draw_display
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: draw_display
  on_tts_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: draw_display
  on_end:
    - wait_until:
        not:
          voice_assistant.is_running:
    - nabu.set_ducking:
        decibel_reduction: 0   # 0 dB means no reduction
        duration: 1.0s
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: draw_display
  on_timer_started:
    - script.execute: draw_display
  on_timer_cancelled:
    - script.execute: draw_display
  on_timer_updated:
    - script.execute: draw_display
  on_timer_tick:
    - script.execute: draw_display
  on_timer_finished:
    - lambda: id(voice_assistant_phase) = ${voice_assist_timer_finished_phase_id};
    - script.execute: draw_display
    - wait_until:
        not:
          microphone.is_capturing: nabu_mic_va
    - media_player.play_media: http://ha_ip:8123/local/sounds/timer-ding.mp3
    - delay: 2s
    - media_player.play_media: http://ha_ip:8123/local/sounds/timer-ding.mp3
    - delay: 2s
    - media_player.play_media: http://ha_ip:8123/local/sounds/timer-ding.mp3
    - delay: 2s
    - wait_until:
        not:
          media_player.is_playing:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: draw_display

script:
  - id: draw_display
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  wifi.connected:
                then:
                  - if:
                      condition:
                        api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(led_ww).turn_on()
                                  .set_brightness(1.0)
                                  .set_rgb(1.0, 0.2, 1.0)
                                  .set_effect("Wakeword")
                                  .perform();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(led_ww).turn_on()
                                  .set_brightness(1.0)
                                  .set_rgb(1.0, 0.2, 1.0)
                                  .set_effect("Fast Pulse")
                                  .perform();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(led_ww).turn_on()
                                  .set_brightness(1.0)
                                  .set_rgb(0.2, 1.0, 1.0)
                                  .set_effect("Slow Pulse")
                                  .perform();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(led_ww).turn_on()
                                  .set_brightness(1.0)
                                  .set_rgb(1.0, 1.0, 0.2)
                                  .set_effect("Fast Pulse")
                                  .perform();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(led_ww).turn_on()
                                  .set_brightness(0.3)
                                  .set_rgb(1.0, 0.0, 0.0)
                                  .perform();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(led_ww).turn_on()
                                  .set_brightness(0.3)
                                  .set_rgb(1.0, 1.0, 0.2)
                                  .perform();
                                   break;
                              case ${voice_assist_timer_finished_phase_id}:
                                id(led_ww).turn_on()
                                  .set_brightness(1.0)
                                  .set_rgb(0.0, 1.0, 0.0)
                                  .set_effect("Fast Pulse")
                                  .perform();
                                break;
                              default:
                                id(led_ww).turn_off()
                                  .perform();
                            }
                      else:
                        - light.turn_on:
                            id: led_ww           
                            red: 100%
                            green: 0%
                            blue: 0%
                            brightness: 60%
                            effect: fast pulse 
                else:
                  - light.turn_on:
                      id: led_ww           
                      red: 100%
                      green: 0%
                      blue: 0%
                      brightness: 60%
                      effect: slow pulse
          else:
            - light.turn_on:
                id: led_ww           
                red: 100%
                green: 100%
                blue: 0%
                brightness: 50%
                effect: slow pulse

globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}

Your config seems to be merge of mine and old Box config, right? Timer is still there, it’s just beeping 3 times. It will need file too. Also the LED there is updated each second while timer is running. I had this before, since copied from Box config (there it’s updating screen, so makes sense). Here we can clear that, freeing some CPU time.

BTW with new timers (like in my config) they behave closer to how Alexa does it. Beeping for 15 minutes, unless stopped either by saying wake word, or tapping the button on device.

I’m using the yaml from your first post

I assume you mean this yaml when referring to your YAML

I just compared what I had vs your Github and there were quite a few changes. Updated to your yaml (bellow, with obvious removed), cleaned build files and did a new install, without the few “shadowed declaration here” warnings you always get ran with zero issues.

substitutions:
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
esphome:
  name: respeakerv3
  friendly_name: Respeakerv3
  min_version: 2024.9.0
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    priority: 600
    then:
      - script.execute: adjust_led
      - delay: 30s
      - if:
          condition:
            lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
            - script.execute: adjust_led
esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  flash_size: 8MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_ESP32S3_INSTRUCTION_CACHE_32KB: "y"
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_SPIRAM_ALLOW_STACK_EXTERNAL_MEMORY: "y"
      
      CONFIG_SPIRAM_TRY_ALLOCATE_WIFI_LWIP: "y"

      # Settings based on https://github.com/espressif/esp-adf/issues/297#issuecomment-783811702
      CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM: "16"
      CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM: "512"
      CONFIG_ESP32_WIFI_STATIC_TX_BUFFER: "y"
      CONFIG_ESP32_WIFI_TX_BUFFER_TYPE: "0"
      CONFIG_ESP32_WIFI_STATIC_TX_BUFFER_NUM: "8"
      CONFIG_ESP32_WIFI_CACHE_TX_BUFFER_NUM: "32"
      CONFIG_ESP32_WIFI_AMPDU_TX_ENABLED: "y"
      CONFIG_ESP32_WIFI_TX_BA_WIN: "16"
      CONFIG_ESP32_WIFI_AMPDU_RX_ENABLED: "y"
      CONFIG_ESP32_WIFI_RX_BA_WIN: "32"
      CONFIG_LWIP_MAX_ACTIVE_TCP: "16"
      CONFIG_LWIP_MAX_LISTENING_TCP: "16"
      CONFIG_TCP_MAXRTX: "12"
      CONFIG_TCP_SYNMAXRTX: "6"
      CONFIG_TCP_MSS: "1436"
      CONFIG_TCP_MSL: "60000"
      CONFIG_TCP_SND_BUF_DEFAULT: "65535"
      CONFIG_TCP_WND_DEFAULT: "65535"  # Adjusted from linked settings to avoid compilation error
      CONFIG_TCP_RECVMBOX_SIZE: "512"
      CONFIG_TCP_QUEUE_OOSEQ: "y"
      CONFIG_TCP_OVERSIZE_MSS: "y"
      CONFIG_LWIP_WND_SCALE: "y"
      CONFIG_TCP_RCV_SCALE: "3"
      CONFIG_LWIP_TCPIP_RECVMBOX_SIZE: "512"

      CONFIG_BT_ALLOCATION_FROM_SPIRAM_FIRST: "y"
      CONFIG_BT_BLE_DYNAMIC_ENV_MEMORY: "y"

psram:
  mode: octal  # quad for N8R2 and octal for N16R8
  speed: 80MHz

external_components:
  - source:
      type: git
      url: https://github.com/esphome/voice-kit
      ref: dev
    components:
      - aic3204
      - audio_dac
      - media_player
      - micro_wake_word
      - microphone
      - nabu
      - nabu_microphone
      - voice_assistant
      - voice_kit
    refresh: 0s

api:
  encryption:
    key: "API"  
  on_client_connected:
    - script.execute: adjust_led
  on_client_disconnected:
    - script.execute: adjust_led

ota:
  - platform: esphome
    id: ota_esphome
    password: "OTA"

logger:

wifi:
  ssid: !secret wifi_ssid 
  password: !secret wifi_password
  on_connect:
    - script.execute: adjust_led
  on_disconnect:
    - script.execute: adjust_led
  #ssid: !secret wifi_ssid
  #password: !secret wifi_password
  #manual_ip:
  #  static_ip: !secret respeaker_satellite_1_ip
  #  gateway: !secret gateway
  #  subnet: !secret subnet
  #  dns1: !secret dns1
    
  ap:
    ssid: "Respeakerv3 Fallback Hotspot"
    password: "HS"      
  

captive_portal:

switch:
  - platform: template
    id: timer_ringing
    optimistic: true
    internal: true
    restore_mode: ALWAYS_OFF
    on_turn_on:
      # Duck audio
      - nabu.set_ducking:
          decibel_reduction: 20
          duration: 0.0s
      # Ring timer
      - script.execute: ring_timer
      # Refresh LED
      - script.execute: adjust_led
      # If 15 minutes have passed and the timer is still ringing, stop it.
      - delay: 15min
      - switch.turn_off: timer_ringing
    on_turn_off:
      # Stop any current annoucement (ie: stop the timer ring mid playback)
      - if:
          condition:
            lambda: return id(nabu_media_player)->state == media_player::MediaPlayerState::MEDIA_PLAYER_STATE_ANNOUNCING;
          then:
            lambda: |-
              id(nabu_media_player)
                ->make_call()
                .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_STOP)
                .set_announcement(true)
                .perform();
      # Set back ducking ratio to zero
      - nabu.set_ducking:
          decibel_reduction: 0
          duration: 1.0s
      # Refresh the LED ring
      - script.execute: adjust_led

button:
  - platform: safe_mode
    id: button_safe_mode
    name: Safe Mode Boot
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset
  - platform: restart
    name: Restart
    id: but_rest
  

binary_sensor:
  - platform: gpio
    pin: 
      number: GPIO4 # D3
      inverted: true
    id: mute
    name: "Mute"
  - platform: gpio
    pin: 
      number: GPIO3 # D2
      inverted: true
    id: user_button
    name: "User button"
    on_multi_click:
      - timing:
          - ON for at most 1s
          - OFF for at least 0.25s
        then:
          - if:
              condition:
                lambda: return !id(init_in_progress);
              then:
                - if:
                    condition:
                      switch.is_on: timer_ringing
                    then:
                      - switch.turn_off: timer_ringing
                    else:
                      - if:
                          condition:
                            lambda: return id(nabu_media_player)->state == media_player::MediaPlayerState::MEDIA_PLAYER_STATE_ANNOUNCING;
                          then:
                            - lambda: |
                                id(nabu_media_player)
                                  ->make_call()
                                  .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_STOP)
                                  .set_announcement(true)
                                  .perform();
                          else:
                            - if:
                                condition:
                                  voice_assistant.is_running:
                                then:
                                  - voice_assistant.stop:
                                else:
                                  - if:
                                      condition:
                                        media_player.is_playing:
                                      then:
                                        - media_player.pause:
                                      else:
                                        - if:
                                            condition:
                                              and:
                                                # - switch.is_off: master_mute_switch
                                                - not:
                                                    voice_assistant.is_running
                                            then:
                                              - voice_assistant.start:

light:
  - platform: esp32_rmt_led_strip
    id: led_ww
    rgb_order: GRB
    pin: GPIO1
    num_leds: 1
    rmt_channel: 0
    chipset: ws2812
    name: none
    disabled_by_default: true
    entity_category: config
    default_transition_length: 0s

    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%


 # Audio and Voice Assistant Config  

i2s_audio:
  - id: i2s_output
    i2s_lrclk_pin: 
      number: GPIO7
      allow_other_uses: true
    i2s_bclk_pin:  
      number: GPIO8
      allow_other_uses: true
    i2s_mclk_pin:  
      number: GPIO9
      allow_other_uses: true

  - id: i2s_input
    i2s_lrclk_pin:  
      number: GPIO7
      allow_other_uses: true
    i2s_bclk_pin:  
      number: GPIO8
      allow_other_uses: true
    i2s_mclk_pin:  
      number: GPIO9
      allow_other_uses: true

microphone:
  - platform: nabu_microphone
    i2s_din_pin: GPIO44
    adc_type: external
    pdm: false
    sample_rate: 16000
    bits_per_sample: 32bit
    i2s_mode: secondary
    i2s_audio_id: i2s_input
    channel_0:
      id: nabu_mic_mww
    channel_1:
      id: nabu_mic_va
      

media_player:
  - platform: nabu
    id: nabu_media_player
    name: Media Player
    internal: false
    sample_rate: 16000
    i2s_dout_pin: GPIO43
    bits_per_sample: 32bit
    i2s_mode: secondary
    i2s_audio_id: i2s_output
    volume_increment: 0.05
    volume_min: 0.4
    volume_max: 0.85
    on_announcement:
      - nabu.set_ducking:
          decibel_reduction: 20
          duration: 0.0s
    on_state:
      if:
        condition:
          and:
            - switch.is_off: timer_ringing
            - not:
                voice_assistant.is_running:
            - not:
                lambda: return id(nabu_media_player)->state == media_player::MediaPlayerState::MEDIA_PLAYER_STATE_ANNOUNCING;
        then:
          - nabu.set_ducking:
              decibel_reduction: 0
              duration: 1.0s

micro_wake_word:
  models:
    - model: https://github.com/kahrendt/microWakeWord/releases/download/okay_nabu/okay_nabu.json
  vad:
  microphone: nabu_mic_mww
  on_wake_word_detected:
    # If a timer is ringing: Stop it, do not start the voice assistant (We can stop timer from voice!)
    - if:
        condition:
          switch.is_on: timer_ringing
        then:
          - switch.turn_off: timer_ringing
        # Start voice assistant, stop current announcement.
        else:
          - if:
              condition:
                lambda: return id(nabu_media_player)->state == media_player::MediaPlayerState::MEDIA_PLAYER_STATE_ANNOUNCING;
              then:
                lambda: |-
                  id(nabu_media_player)
                    ->make_call()
                    .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_STOP)
                    .set_announcement(true)
                    .perform();
          - voice_assistant.start:
              wake_word: !lambda return wake_word;

voice_assistant:
  id: va
  microphone: nabu_mic_va
  media_player: nabu_media_player
  noise_suppression_level: 0
  auto_gain: 0dBFS
  volume_multiplier: 1
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - micro_wake_word.start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: adjust_led
  on_client_disconnected:
    - voice_assistant.stop:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: adjust_led
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: adjust_led
          - delay: 1s
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - script.execute: adjust_led
  on_start:
    - nabu.set_ducking:
        decibel_reduction: 20   # Number of dB quieter; higher implies more quiet, 0 implies full volume
        duration: 0.0s          # The duration of the transition (default is 0)
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: adjust_led
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: adjust_led
  on_tts_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: adjust_led
  on_end:
    - wait_until:
        not:
          voice_assistant.is_running:
    - nabu.set_ducking:
        decibel_reduction: 0   # 0 dB means no reduction
        duration: 1.0s
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: adjust_led
  on_timer_finished:
    - switch.turn_on: timer_ringing

script:
  - id: ring_timer
    then:
      - while:
          condition:
            switch.is_on: timer_ringing
          then:
            - media_player.play_media: http://{ha_ip}:8123/local/sounds/timer_finished.flac
            - delay: 1s
            - wait_until:
                not:
                  media_player.is_playing:
  - id: adjust_led
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if: 
                condition:
                  switch.is_on: timer_ringing
                then:
                  - light.turn_on:
                      id: led_ww           
                      red: 0%
                      green: 100%
                      blue: 0%
                      brightness: 60%
                      effect: fast pulse 
                else:
                  - if:
                      condition:
                        wifi.connected:
                      then:
                        - if:
                            condition:
                              api.connected:
                            then:
                              - lambda: |
                                  switch(id(voice_assistant_phase)) {
                                    case ${voice_assist_listening_phase_id}:
                                      id(led_ww).turn_on()
                                        .set_brightness(0.6)
                                        .set_rgb(1.0, 0.2, 1.0)
                                        .set_effect("Slow Pulse")
                                        .perform();
                                      break;
                                    case ${voice_assist_thinking_phase_id}:
                                      id(led_ww).turn_on()
                                        .set_brightness(0.6)
                                        .set_rgb(1.0, 0.2, 1.0)
                                        .set_effect("Fast Pulse")
                                        .perform();
                                      break;
                                    case ${voice_assist_replying_phase_id}:
                                      id(led_ww).turn_on()
                                        .set_brightness(0.6)
                                        .set_rgb(0.2, 1.0, 1.0)
                                        .set_effect("Slow Pulse")
                                        .perform();
                                      break;
                                    case ${voice_assist_error_phase_id}:
                                      id(led_ww).turn_on()
                                        .set_brightness(0.6)
                                        .set_rgb(1.0, 1.0, 0.2)
                                        .set_effect("Fast Pulse")
                                        .perform();
                                      break;
                                    case ${voice_assist_muted_phase_id}:
                                      id(led_ww).turn_on()
                                        .set_brightness(0.3)
                                        .set_rgb(1.0, 0.0, 0.0)
                                        .perform();
                                      break;
                                    case ${voice_assist_not_ready_phase_id}:
                                      id(led_ww).turn_on()
                                        .set_brightness(0.3)
                                        .set_rgb(1.0, 1.0, 0.2)
                                        .perform();
                                        break;
                                    default:
                                      id(led_ww).turn_off()
                                        .perform();
                                  }
                            else:
                              - light.turn_on:
                                  id: led_ww           
                                  red: 100%
                                  green: 0%
                                  blue: 0%
                                  brightness: 40%
                                  effect: fast pulse 
                      else:
                        - light.turn_on:
                            id: led_ww           
                            red: 100%
                            green: 0%
                            blue: 0%
                            brightness: 40%
                            effect: slow pulse
          else:
            - light.turn_on:
                id: led_ww           
                red: 100%
                green: 100%
                blue: 0%
                brightness: 30%
                effect: slow pulse

globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}
1 Like

Oh right, I posted very first version here directly…

GitHub version is the successor, with many improvements. I’d recommend using it with Respeaker, as it has many things related to Respeaker specifically, and closer to Voice-kit (which means it will be easier to update, when PE software releases).
P.S. I guess I’ll remove old one. It’s misleading.

Ah yes think I found the issue. Memory issue. Had to stop all addons to get ginandbacon version up as it gave me errors as well. Will try yours again

Nope yours I just copy pasted and hit compile and got this

INFO ESPHome 2024.9.2
INFO Reading configuration /config/esphome/respeaker.yaml…
ERROR Error while reading config: Invalid YAML syntax:

Secret ‘ota_password’ not defined
in “/config/esphome/respeaker.yaml”, line 102, column 15

oh right gotta add the password. lol

Oh, I feel so sorry for everyone having ESPHome as HA add-on on Raspberry. Esp-idf is so CPU and RAM consuming, that starts to give errors here and there…

I am running the generic x86 image on a roughly 3 year old mini PC with AMD CPU, nvme ssd and it still maxes out the CPU. I have to much RAM honestly but still, it gets high. I would hate to see this compile on a raspberry pi 4 with 4GB of RAM using an sdcard. Heck, 8GB might be required.

1 Like

lol yeah. I have it on a 13th gen Intel machine inn VMware but allocating 2 Cpus and limited ram is the issue. All good for everything except voice assistants it seems

Update: Stupid thing crashed again. Restart repeat and finally it compiled. Yay!
Flashed it.

BTW thanks for helping me with this. Very much appreciated.

Finally may I ask what I need to do to change the wakeword to Jarvis?

1 Like

model: hey_jarvis
But I would advise against it. I want to use Jarvis myself - but okay Nabu is new gen model with croudfunded real voice samples, unlike Jarvis (it’s trained on just TTS-generated samples). So Nabu is very, very much better.

BTW, HA Discord is great tool for support. :slight_smile: I use it more than forum lately.

Ok got it. So wish it was jarvis though. lol

Ah yes I am on the HA discord but didnt even think to ask there.

This explains a lot, I saw originally it was pointing to a URL but to a json file so I just left it as is. I believe others are different format if memory serves correctly. I think I’ve had one or two false positives from the TV, maybe just one but it also triggers easily from my voice with the TV on. Now, stopping listening and listening to the TV is a different yet expected story at this stage of things. I use okay_nabu in S3 box and there are far more false positives from TV/Music on the S3 box

1 Like

So I have just powered it with the board USB and also swapped to ESP for power to use it. There is a lot of flashing of leds and then the led just stops. Tried saying hey nabu like fifty times but it did not return any acknowledgement of any sort.

Am I meant to power it using the usb of the board or the ESP32? Also any thoughts on whats wrong?

Delete the device from ESPHome integration (not the yaml file in the add on) and restart HA, it should just detect it and work after adding it. You may need the API key but probably not.I had to do the same when changing from Seeed’s yaml with the listen light switch and other one that’s greyed out. Worked perfectly after a reboot. I also have a custom voice pipeline when using Microwakeword with no OpenWakeWord defined. Not needed but confirms microwakeword is working as it should be.

Regardling power you can plug it into either port. Didn’t know that. Had it plugged into the XIAO but tried the board and it works with zero issues. I also don’t have to worry about any yanking on the soldiering of the XIAO chip by the cable this way but both work for me. When you plug it in the green light and the red light on the XIAO will come on. The red light goes off but the green light stays on. The RGB in the middle is broke on mine but will see it flash if working when it hears “ok nabu”

hmm removed and added after reboot.

Not having any luck with making it work

Should I flash it Seeed yaml to see if it still works?

When I plug it in, RGB light is light green, then flashing then eventually it turns red. If I say “Ok nabu” I dont hear anything back. I saying turn on light…nothing.