ESP32 S3 and Voice Assistant config

Does anyone know of an up-to-date guide for installing Voice Assistant on an ESP32-S3? Most of the videos I’ve seen use old configurations, and some of the options are now deprecated. Are there any newer guides?

Thanks

I was just working on this over the past week. The config below works for me with an INMP441 microphone and a MAX98357 amplifier. I also use some packages to split out common configuration, but they are not related to the voice assistant part.

substitutions:
  #pins
  i2s_out_lrclk_pin: GPIO6 # LRC on Max98357
  i2s_out_bclk_pin: GPIO7 # BCLK on Max98357
  i2s_in_lrclk_pin: GPIO2 # WS on INMP441
  i2s_in_bclk_pin: GPIO1 # SCK on INMP441
  light_pin: GPIO48 # on-board LED
  speaker_pin: GPIO8 # DIN on Max98357
  mic_pin: GPIO4 # SD on INMP441

  ota_password: <redacted>
  api_key: <redacted>
  ip: <redacted>

  dev_name: s3test
  friendly: S3Test

  # Phases of the Voice Assistant
  # IDLE: The voice assistant is ready to be triggered by a wake-word
  voice_assist_idle_phase_id: '1'
  # LISTENING: The voice assistant is ready to listen to a voice command (after being triggered by the wake word)
  voice_assist_listening_phase_id: '2'
  # THINKING: The voice assistant is currently processing the command
  voice_assist_thinking_phase_id: '3'
  # REPLYING: The voice assistant is replying to the command
  voice_assist_replying_phase_id: '4'
  # NOT_READY: The voice assistant is not ready 
  voice_assist_not_ready_phase_id: '10'
  # ERROR: The voice assistant encountered an error
  voice_assist_error_phase_id: '11'  
  # MUTED: The voice assistant is muted and will not reply to a wake-word
  voice_assist_muted_phase_id: '12'

psram:
  mode: octal
  speed: 80MHz

esp32:
  board: esp32-s3-devkitc-1
  variant: ESP32S3
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      # ESP32-S3 N16R8
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB:      "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B:  "y"
      CONFIG_BT_ALLOCATION_FROM_SPIRAM_FIRST: "y"
      CONFIG_BT_BLE_DYNAMIC_ENV_MEMORY: "y"
      CONFIG_MBEDTLS_EXTERNAL_MEM_ALLOC: "y"
      CONFIG_MBEDTLS_SSL_PROTO_TLS1_3: "y" 

      # ESP32-S3-Zero
      # CONFIG_ESP32_S3_BOX_BOARD: "y"
      # CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM: "16"
      # CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM: "512"
      # CONFIG_TCPIP_RECVMBOX_SIZE: "512"
      # CONFIG_TCP_SND_BUF_DEFAULT: "65535"
      # CONFIG_TCP_WND_DEFAULT: "512000"
      # CONFIG_TCP_RECVMBOX_SIZE: "512"
      # CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      # CONFIG_ESP32S3_DATA_CACHE_64KB:      "y"
      # CONFIG_ESP32S3_DATA_CACHE_LINE_64B:  "y"
      # CONFIG_BT_ALLOCATION_FROM_SPIRAM_FIRST: "y"
      # CONFIG_BT_BLE_DYNAMIC_ENV_MEMORY: "y"
      # CONFIG_MBEDTLS_EXTERNAL_MEM_ALLOC: "y"
      # CONFIG_MBEDTLS_SSL_PROTO_TLS1_3: "y" 

packages:
  common: !include common/common.yaml
  wifi: !include common/static_wifi.yaml
  base: !include common/esp32dev-base.yaml


esphome:
  on_boot:
      priority: 600
      then: 
        # Run the script to refresh the LED status
        - script.execute: control_led
        # - output.turn_off: set_low_speaker
        # If after 30 seconds, the device is still initializing (It did not yet connect to Home Assistant), turn off the init_in_progress variable and run the script to refresh the LED status
        - delay: 30s
        - if:
            condition:
              lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;
              - script.execute: control_led

microphone:
  - platform: i2s_audio
    i2s_din_pin: ${mic_pin}
    adc_type: external
    pdm: false
    i2s_audio_id: i2s_in
    id: comm_mic
    channel: left

speaker:
  - platform: i2s_audio
    id: i2s_audio_speaker
    dac_type: external
    sample_rate: 48000
    i2s_dout_pin: 
      number: ${speaker_pin}
    bits_per_sample: 32bit
    i2s_audio_id: i2s_output
    timeout: never
    buffer_duration: 100ms
    channel: left
    # sample_rate: 16000
    # bits_per_sample: 32bit


  # Virtual speakers to combine the announcement and media streams together into one output
  - platform: mixer
    id: mixing_speaker
    output_speaker: i2s_audio_speaker
    num_channels: 2
    source_speakers:
      - id: announcement_mixing_input
        timeout: never
      - id: media_mixing_input
        timeout: never

  # Virtual speakers to resample each pipeline's audio, if necessary, as the mixer speaker requires the same sample rate
  - platform: resampler
    id: announcement_resampling_speaker
    output_speaker: announcement_mixing_input
    sample_rate: 48000
    bits_per_sample: 16
  - platform: resampler
    id: media_resampling_speaker
    output_speaker: media_mixing_input
    sample_rate: 48000
    bits_per_sample: 16

media_player:
  - platform: speaker
    id: external_media_player
    name: Media Player
    internal: False
    volume_increment: 0.05
    volume_min: 0.4
    volume_max: 0.85
    announcement_pipeline:
      speaker: announcement_resampling_speaker
      format: FLAC     # FLAC is the least processor intensive codec
      num_channels: 1  # Stereo audio is unnecessary for announcements
      sample_rate: 48000
    media_pipeline:
      speaker: media_resampling_speaker
      format: FLAC     # FLAC is the least processor intensive codec
      num_channels: 2
      sample_rate: 48000
    on_announcement:
      - mixer_speaker.apply_ducking:
          id: media_mixing_input
          decibel_reduction: 20
          duration: 0.0s
    on_state:
      if:
        condition:
          and:
            - not:
                voice_assistant.is_running:
            - not:
                media_player.is_announcing:
        then:
          - mixer_speaker.apply_ducking:
              id: media_mixing_input
              decibel_reduction: 0
              duration: 1.0s
    files:
      - id: timer_finished_sound
        file: https://github.com/esphome/home-assistant-voice-pe/raw/dev/sounds/timer_finished.flac
      - id: wake_word_triggered_sound
        file: https://github.com/esphome/home-assistant-voice-pe/raw/dev/sounds/wake_word_triggered.flac
 
i2s_audio:
  - id: i2s_output
    i2s_lrclk_pin: ${i2s_out_lrclk_pin}
    i2s_bclk_pin: ${i2s_out_bclk_pin}
  - id: i2s_in
    i2s_lrclk_pin: ${i2s_in_lrclk_pin}
    i2s_bclk_pin: ${i2s_in_bclk_pin}

voice_assistant:
  id: va
  microphone: comm_mic
  media_player: external_media_player
  noise_suppression_level: 4 
  auto_gain: 31dBFS 
  volume_multiplier: 8


  # When the voice assistant connects to HA:
  # Set init_in_progress to false (Initialization is over).
  # If the switch is on, start the voice assistant
  # In any case: Set the correct phase and run the script to refresh the LED status
  on_client_connected:
    - lambda: id(init_in_progress) = false; 
    - if:
        condition:
          switch.is_on: voice_enabled
        then:
          - micro_wake_word.start
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - script.execute: control_led

  # When the voice assistant disconnects from HA:
  # Stop the voice assistant
  # Set the correct phase and run the script to refresh the LED status
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};  
    - micro_wake_word.stop
    - script.execute: control_led

  # When the voice assistant starts to listen: Set the correct phase and run the script to refresh the LED status
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: control_led

  # When the voice assistant starts to think: Set the correct phase and run the script to refresh the LED status
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: control_led

  # When the voice assistant starts to reply: Set the correct phase and run the script to refresh the LED status
  # on_tts_stream_start:
  on_tts_start: 
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: control_led
  
  on_end:
    - if:
        condition:
          - switch.is_on: voice_enabled
        then:
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start
  # When the voice assistant finishes replying: Set the correct phase and run the script to refresh the LED status
  # on_tts_stream_end:
  # on_stt_end: 
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: control_led

  # When the voice assistant encounters an error: 
  # Set the error phase and run the script to refresh the LED status
  # Wait 1 second and set the correct phase (idle or muted depending on the state of the switch) and run the script to refresh the LED status 
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};  
          - script.execute: control_led
          - delay: 1s
          - if:
              condition:
                switch.is_on: voice_enabled
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: control_led

micro_wake_word:
  models: 
    - okay_nabu
  vad:
  id: mww
  on_wake_word_detected:
    - if:
        condition:
          switch.is_on: voice_enabled
        then:
          - if:
              condition:
                voice_assistant.is_running:
              then:
                voice_assistant.stop:
              # Stop any other media player announcement
              else:
                - if:
                    condition:
                      media_player.is_announcing:
                    then:
                      - media_player.stop:
                          announcement: true
                    # Start the voice assistant and play the wake sound, if enabled
                    else:
                      - script.execute:
                          id: play_sound
                          priority: true
                          sound_file: !lambda return id(wake_word_triggered_sound);
                      - delay: 300ms
                      - voice_assistant.start:
                          wake_word: !lambda return wake_word;

wifi:
  # If the device connects to, or disconnects from, Wi-Fi: Run the script to refresh the LED status
  on_connect:
    - script.execute: control_led
  on_disconnect:
    - script.execute: control_led

globals:
  # Global initialisation variable. Initialized to true and set to false once everything is connected. Only used to have a smooth "plugging" experience
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  # Global variable tracking the phase of the voice assistant (defined above). Initialized to not_ready
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}

light:
  - platform: esp32_rmt_led_strip
    id: led
    pin: ${light_pin}
    chipset: WS2812
    max_refresh_rate: 15ms
    num_leds: 1
    rgb_order: GRB
    rmt_symbols: 192
    default_transition_length: 0ms
    effects:
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

script:
  # Master script controlling the LED, based on different conditions: initialization in progress, wifi and API connected, and the current voice assistant phase.
  # For the sake of simplicity and re-usability, the script calls child scripts defined below.
  # This script is called every time one of these conditions changes.
  - id: control_led
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  wifi.connected:
                then:
                  - if:
                      condition:
                        api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(control_led_voice_assist_listening_phase).execute();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(control_led_voice_assist_thinking_phase).execute();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(control_led_voice_assist_replying_phase).execute();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(control_led_voice_assist_error_phase).execute();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(control_led_voice_assist_muted_phase).execute();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(control_led_voice_assist_not_ready_phase).execute();
                                break;
                              default:
                                id(control_led_voice_assist_idle_phase).execute();
                                break;
                            }
                      else:
                        - script.execute: control_led_no_ha_connection_state
                else:
                  - script.execute: control_led_no_ha_connection_state
          else:
            - script.execute: control_led_init_state


  # Script executed during initialisation: In this example: Turn the LED green with a fast pulse 🟢
  - id: control_led_init_state
    then:
      - light.turn_on:
          id: led
          blue: 0%
          red: 0%
          green: 100%
          effect: "Fast Pulse"
  

  # Script executed when the device has no connection to Home Assistant: In this example: Turn off the LED 
  - id: control_led_no_ha_connection_state
    then:
      - light.turn_off:
          id: led  


  # Script executed when the voice assistant is idle (waiting for a wake word): In this example: Turn the LED white at 20% brightness ⚪
  - id: control_led_voice_assist_idle_phase
    then:
      - light.turn_on:
          id: led
          blue: 100%
          red: 100%
          green: 100%
          brightness: 20%
          effect: "none"


  # Script executed when the voice assistant is listening to a command: In this example: Turn the LED blue with a slow pulse 🔵
  - id: control_led_voice_assist_listening_phase
    then:
      - light.turn_on:
          id: led
          blue: 100%
          red: 0%
          green: 0%
          effect: "Slow Pulse"


  # Script executed when the voice assistant is processing the command: In this example: Turn the LED blue with a fast pulse 🔵
  - id: control_led_voice_assist_thinking_phase
    then:
      - light.turn_on:
          id: led
          blue: 100%
          red: 0%
          green: 0%
          effect: "Fast Pulse"


  # Script executed when the voice assistant is replying to a command: In this example: Turn the LED solid blue (no pulse) 🔵
  - id: control_led_voice_assist_replying_phase
    then:
      - light.turn_on:
          id: led
          blue: 100%
          red: 0%
          green: 0%
          brightness: 100%
          effect: "none"


  # Script executed when the voice assistant encounters an error: In this example: Turn the LED solid red (no pulse) 🔴
  - id: control_led_voice_assist_error_phase
    then:
      - light.turn_on:
          id: led
          blue: 0%
          red: 100%
          green: 0%
          brightness: 100%
          effect: "none"


  # Script executed when the voice assistant is muted: In this example: Turn off the LED 
  - id: control_led_voice_assist_muted_phase
    then:
      - light.turn_off:
          id: led


  # Script executed when the voice assistant is not ready: In this example: Turn off the LED 
  - id: control_led_voice_assist_not_ready_phase
    then:
      - light.turn_off:
          id: led

  # Script executed when we want to play sounds on the device.
  - id: play_sound
    parameters:
      priority: bool
      sound_file: "audio::AudioFile*"
    then:
      - lambda: |-
          if (priority) {
            id(external_media_player)
              ->make_call()
              .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_STOP)
              .set_announcement(true)
              .perform();
          }
          if ( (id(external_media_player).state != media_player::MediaPlayerState::MEDIA_PLAYER_STATE_ANNOUNCING ) || priority) {
            id(external_media_player)
              ->play_file(sound_file, true, false);
          }

switch:
  - platform: template
    name: Enable Voice Assistant
    id: voice_enabled
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    icon: mdi:assistant
    # When the switch is turned on (on Home Assistant):
    # Start the voice assistant component
    # Set the correct phase and run the script to refresh the LED status
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:      
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - micro_wake_word.start
            - script.execute: control_led
    # When the switch is turned off (on Home Assistant):
    # Stop the voice assistant component
    # Set the correct phase and run the script to refresh the LED status
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:      
            - voice_assistant.stop
            - micro_wake_word.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: control_led
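
The three files pulled in under packages are not shown in this post. As a rough sketch only, assuming they consume the dev_name, friendly, api_key, ota_password and ip substitutions defined at the top, two of them might look something like the following. Every value here is a guess for illustration (including the gateway and subnet, which are placeholders), not the poster's actual files:

```yaml
# Hypothetical common/common.yaml (assumed contents, not from the original post)
esphome:
  name: ${dev_name}
  friendly_name: ${friendly}

logger:

api:
  encryption:
    key: ${api_key}

ota:
  - platform: esphome
    password: ${ota_password}

# Hypothetical common/static_wifi.yaml (assumed contents, not from the original post)
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: ${ip}
    gateway: 192.168.1.1      # placeholder, adjust to your network
    subnet: 255.255.255.0     # placeholder, adjust to your network
```

Because packages are merged with the main file, the esphome: and wifi: blocks in the main config (on_boot, on_connect, on_disconnect) would be combined with these rather than replacing them. If you don't use packages, the same blocks can simply go into the main file.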

Thanks steriku.

That code is just what I needed to update my voice setups. That saved me a lot of work.

Did you try it?
I haven’t yet, but I think I’ll do it this week. My previous configuration had many deprecated options.

One of the issues it should fix is with the LED (rmt_symbols should take care of that one), but I don’t know whether the rest of my config is up to date or deprecated.
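
For reference, a minimal sketch of the LED block using rmt_symbols, trimmed from the config posted above. As far as I understand, recent ESPHome releases built on ESP-IDF 5 use rmt_symbols in place of the older rmt_channel option, but check the esp32_rmt_led_strip docs for your version; the pin assumes the on-board LED on GPIO48 as in that config:

```yaml
light:
  - platform: esp32_rmt_led_strip
    id: led
    pin: GPIO48          # on-board LED in the config above; change for your board
    chipset: WS2812
    num_leds: 1
    rgb_order: GRB
    rmt_symbols: 192     # value taken from the config above; assumed to replace rmt_channel
```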

Yep, I updated my voice setup with the code from above. I did tinker with the light section because the LED sometimes didn’t work. I’m not sure whether that was down to the code or not, but it is working for me right now with this:


light:
  - platform: esp32_rmt_led_strip
    id: led
    name: ${node_name} LED
    pin: GPIO14
    rgb_order: RGB
    num_leds: 1
    chipset: ws2812

    effects:
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

I have just tried compiling this code with ESPHome 2025.4.1 and it no longer compiles. I get this error:

Reading CMake configuration...
-- git rev-parse returned 'fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).'
-- Building ESP-IDF components for target esp32s3
Processing 1 dependencies:
[1/1] idf (5.1.5)
-- Project sdkconfig file /config/.esphome/build/office-assist/sdkconfig.office-assist
-- Configuring incomplete, errors occurred!
See also "/config/.esphome/build/office-assist/.pioenvs/office-assist/CMakeFiles/CMakeOutput.log".

fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
/root/.platformio/penv/.espidf-5.1.5/bin/python: No module named kconfgen
CMake Error at /config/.esphome/platformio/packages/framework-espidf@src-6e6cd5244a43bdfe453daac50cfe73b6/tools/cmake/kconfig.cmake:176 (message):
  Failed to run kconfgen
  (/root/.platformio/penv/.espidf-5.1.5/bin/python;-m;kconfgen;--list-separator=semicolon;--kconfig;/config/.esphome/platformio/packages/framework-espidf@src-6e6cd5244a43bdfe453daac50cfe73b6/Kconfig;--sdkconfig-rename;/config/.esphome/platformio/packages/framework-espidf@src-6e6cd5244a43bdfe453daac50cfe73b6/sdkconfig.rename;--config;/config/.esphome/build/office-assist/sdkconfig.office-assist;--env-file;/config/.esphome/build/office-assist/.pioenvs/office-assist/config.env).
  Error 1
Call Stack (most recent call first):
  /config/.esphome/platformio/packages/framework-espidf@src-6e6cd5244a43bdfe453daac50cfe73b6/tools/cmake/build.cmake:619 (__kconfig_generate_config)
  /config/.esphome/platformio/packages/framework-espidf@src-6e6cd5244a43bdfe453daac50cfe73b6/tools/cmake/project.cmake:604 (idf_build_process)
  CMakeLists.txt:3 (project)



========================= [FAILED] Took 16.53 seconds =========================

Does anybody have any clue why this may be happening? It works fine with version 2025.3.

Thank you (I don’t even have a suit) for that config; it made my media_player work without jitter and stuff.