Looking for a working voice assistant config for ESP32 with INMP441

Does anybody have a working config for an ESP32 development board (CP2102) with an INMP441 omnidirectional microphone?

I'm just playing around and don’t have a speaker on it yet, but I'm really struggling to get any of the configs posted here working.

Figured somebody must have one working, right?

Thanks for any help.


See the SpotPear "DeepSeek Voice Chat" config and the RealDeco/xiaozhi-esphome repository on GitHub (alternative code for using xiaozhi AI devices in ESPHome/Home Assistant). Those devices are ESP32-S3 + INMP441 + MAX98357A + display, so just adapt or strip down the example to your needs (don’t expect the quality to be sufficient, though).
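As a rough idea of what "stripped down" could look like for a microphone-only build, here is a minimal sketch. The pin numbers are placeholders (an assumption, not taken from either linked project), and the rest of the config (`esphome:`, `esp32:`, `wifi:`, `api:`) is assumed to exist already.

```yaml
# Hypothetical mic-only sketch: change the pins to match your own wiring.
i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO33   # INMP441 WS (placeholder pin)
    i2s_bclk_pin: GPIO18    # INMP441 SCK (placeholder pin)

microphone:
  - platform: i2s_audio
    id: external_microphone
    i2s_audio_id: i2s_in
    adc_type: external
    i2s_din_pin: GPIO13     # INMP441 SD (placeholder pin)
    channel: left           # assumes the INMP441 L/R pin is tied to GND
    pdm: false

voice_assistant:
  microphone: external_microphone   # no speaker configured
  use_wake_word: true
  on_client_connected:
    - voice_assistant.start_continuous
```

With no `speaker:` entry, the device will not play replies itself; the pipeline result is still visible in Home Assistant.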

Between Gemini and Grok I came up with this. Seems to work so far. :grin:

Blue light turns on when it hears the wake word and blinks when listening for my voice request.


substitutions:
  voice_assist_idle_phase_id: '1'
  voice_assist_listening_phase_id: '2'
  voice_assist_thinking_phase_id: '3'
  voice_assist_replying_phase_id: '4'
  voice_assist_not_ready_phase_id: '10'
  voice_assist_error_phase_id: '11'
  voice_assist_muted_phase_id: '12'

esphome:
  name: voice-box
  friendly_name: voice
  platformio_options:
    board_build.flash_mode: dio

  on_boot:
      priority: 600
      then:
        - delay: 30s
        - if:
            condition:
              lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;
        # Boot LED Test
        - logger.log: "Built-in LED Boot Test: ON"
        - light.turn_on:
            id: va_led
            brightness: 1.0
        - delay: 1s
        - logger.log: "Built-in LED Boot Test: OFF"
        - light.turn_off:
            id: va_led
        - delay: 1s
        - logger.log: "Built-in LED Boot Test Complete."

esp32:
  board: esp32dev
  #variant: esp32s3
  framework:
    type: esp-idf
    version: recommended

# Enable logging
logger:
  level: DEBUG

# Enable Home Assistant API
api:
  encryption:
    key: "" # REPLACE with your own key

# Enable OTA updates
ota:
  - platform: esphome
    password: "" # REPLACE with your own password

wifi:
  ssid: "" # REPLACE with your SSID
  password: "" # REPLACE with your password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Mic-Speaker"
    password: "9vYvAFzzPjuc"

captive_portal:

i2s_audio:
  - id: i2s_in
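    i2s_lrclk_pin: GPIO33  # INMP441 WS
    i2s_bclk_pin: GPIO18   # INMP441 SCK
    # The L/R pin on the INMP441 is connected to ground (left channel)
  - id: i2s_out
    i2s_lrclk_pin: GPIO16  # MAX98357 LRC
    # Note: GPIO20 is not broken out on most ESP32-WROOM-32 boards; pick a free pin before wiring a speaker
    i2s_bclk_pin: GPIO20   # MAX98357 BCLK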

microphone:
  platform: i2s_audio
  id: external_microphone
  adc_type: external
  i2s_audio_id: i2s_in
  i2s_din_pin: GPIO13
  channel: left
  pdm: false

speaker:
  platform: i2s_audio
  id: external_speaker
  dac_type: external
  i2s_audio_id: i2s_out
  i2s_dout_pin: GPIO14  ###Max98357 - DIN
  #mode: mono

output:
  - platform: ledc
    pin: GPIO2
    id: builtin_led_output
    frequency: 1000Hz
    inverted: true

light:
  - platform: monochromatic
    name: "Built-in LED"
    id: va_led
    output: builtin_led_output
    default_transition_length: 0.2s
    restore_mode: ALWAYS_OFF

voice_assistant:
  id: va
  microphone: external_microphone
  speaker: external_speaker
  use_wake_word: true
  noise_suppression_level: 4
  auto_gain: 26dBFS
  volume_multiplier: 1.5

  on_wake_word_detected:
    - lambda: |-
        id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
        ESP_LOGD("custom", "on_wake_word_detected - Phase: %d, turning on LED for wake word", id(voice_assistant_phase));
        id(va_led).turn_on().set_brightness(1.0).perform();

  on_stt_end:
    then:
      - logger.log:
          format: "==> Received voice command: %s"
          args: [ 'x.c_str()' ]

  on_listening:
    - lambda: |-
        ESP_LOGD("custom", "on_listening triggered - Phase: %d, starting blink for listening", id(voice_assistant_phase));
    - script.execute: blink_listening

  on_stt_vad_end:
    # Stop the listening blink so it does not override the dim "thinking" state
    - script.stop: blink_listening
    - lambda: |-
        id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
        ESP_LOGD("custom", "on_stt_vad_end triggered - Phase: %d, setting LED dim (waiting to speak)", id(voice_assistant_phase));
        id(va_led).turn_on().set_brightness(0.5).perform();

  on_tts_stream_start:
    - lambda: |-
        id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
        ESP_LOGD("custom", "on_tts_stream_start triggered - Phase: %d, setting LED bright (replying)", id(voice_assistant_phase));
    - light.turn_on:
        id: va_led
        brightness: 1.0

  on_tts_stream_end:
    - lambda: |-
        id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        ESP_LOGD("custom", "on_tts_stream_end triggered - Phase: %d, turning off LED", id(voice_assistant_phase));
        id(va_led).turn_off().perform();


  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: |-
              id(voice_assistant_phase) = ${voice_assist_error_phase_id};
              ESP_LOGD("custom", "on_error triggered - Phase: %d, starting flash_error script", id(voice_assistant_phase));
          - script.execute: flash_error
          - delay: 1s
          - if:
              condition:
                switch.is_on: use_wake_word
              then:
                - lambda: |-
                    id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
                    id(va_led).turn_off().perform();
              else:
                - lambda: |-
                    id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
                    id(va_led).turn_on().set_brightness(0.3).perform();


  on_client_connected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous
          - lambda: |-
              id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              id(va_led).turn_off().perform();
              ESP_LOGD("custom", "on_client_connected wake word branch - Phase: %d, LED off", id(voice_assistant_phase));
        else:
          - lambda: |-
              id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
              ESP_LOGD("custom", "on_client_connected muted branch - Phase: %d, setting LED dim (muted)", id(voice_assistant_phase));
              id(va_led).turn_on().set_brightness(0.3).perform();


  on_client_disconnected:
    - lambda: |-
        id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
        ESP_LOGD("custom", "on_client_disconnected triggered - Phase: %d, starting flash_disconnected script", id(voice_assistant_phase));
    - script.execute: flash_disconnected


switch:
  - platform: template
    name: Use Wake Word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: |-
                id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
                id(va_led).turn_off().perform();
                ESP_LOGD("custom", "switch on_turn_on - Phase: %d, LED off", id(voice_assistant_phase));
            - if:
                condition:
                    not:
                      - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous

    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - lambda: |-
                id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
                id(va_led).turn_on().set_brightness(0.3).perform();
                ESP_LOGD("custom", "switch on_turn_off - Phase: %d, setting LED dim (muted)", id(voice_assistant_phase));


globals:
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}

# ====================================================================
# SCRIPTS FOR NON-BLOCKING FLASHING AND BLINKING
# ====================================================================
script:
  - id: flash_error
    mode: restart
    then:
      - repeat:
          count: 3
          then:
            - light.turn_on:
                id: va_led
                brightness: 1.0
            - delay: 200ms
            - light.turn_off:
                id: va_led
            - delay: 200ms

  - id: flash_disconnected
    mode: restart
    then:
      - repeat:
          count: 5
          then:
            - light.turn_on:
                id: va_led
                brightness: 1.0
            - delay: 100ms
            - light.turn_off:
                id: va_led
            - delay: 100ms

  - id: blink_listening
    mode: restart
    then:
      - repeat:
          count: 10
          then:
            - light.turn_on:
                id: va_led
                brightness: 1.0
            - delay: 200ms
            - light.turn_off:
                id: va_led
            - delay: 200ms

For voice to work well you will need at least some PSRAM, preferably an N16R8 board (16 MB flash, 8 MB PSRAM). As far as I can tell, this board does not have any.
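For reference, a minimal sketch of how PSRAM is typically enabled in ESPHome on an ESP32-S3 N16R8 module. The board name is an assumption (a generic S3 devkit), so check it against your actual hardware.

```yaml
# Sketch for an ESP32-S3 N16R8 module; the board ID is an assumption.
esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB          # the "N16" part: 16 MB flash
  framework:
    type: esp-idf

psram:
  mode: octal               # the "R8" part: 8 MB octal PSRAM
  speed: 80MHz
```

A plain ESP32-WROOM-32 has no PSRAM, so the `psram:` block does not apply to it.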

And the best place for working code is the examples in this thread.

Does the sound work? I’m still learning. I found your code, but the trouble is that the speaker doesn’t even crackle.

That is probably due to the board you are trying to use. You may be better off basing your code on this.

wake-word-voice-assistants/m5stack-atom-echo at main · esphome/wake-word-voice-assistants · GitHub

My YAML works fine on the ESP32 N16R8, and sound plays perfectly.

Unfortunately, I only have an ESP-WROOM-32 available, plus an INMP441 and a MAX98357A.

In my experience you will have audio issues unless you use an N16R8 board. An ESP32 without PSRAM tends to crash.
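If you want to check whether memory is the problem on a board without PSRAM, one option is ESPHome's `debug` component, which can report free heap. This is only a diagnostic sketch; the sensor names are arbitrary.

```yaml
# Diagnostic sketch: publish heap statistics to Home Assistant.
debug:
  update_interval: 5s

sensor:
  - platform: debug
    free:
      name: "Heap Free"        # total free heap in bytes
    block:
      name: "Heap Max Block"   # largest contiguous free block
```

If the free heap drops sharply right before a reboot while the voice pipeline is running, that points toward the memory limits described above.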


Thank you for the advice.