ESP32 voice assist restarts pipeline every 5 seconds

I have an ESP32 with an INMP441 microphone that I have flashed with ESPHome, using the following config:

esphome:
  name: esp32-voice-assist-2
  friendly_name: "ESP32 Voice Assist 2"
  on_boot:
     - priority: -100
       then:
         - wait_until: api.connected
         - delay: 1s
         - if:
             condition:
               switch.is_on: use_wake_word
             then:
               - voice_assistant.start_continuous:

esp32:
  board: esp32dev
  framework:
    type: esp-idf
    version: recommended

# Enable logging
logger:
  
# Enable Home Assistant API
api:

ota:

wifi:
  ssid: mywifi
  password: password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Voice-Assist-2"
    password: "password"

i2s_audio:
  i2s_lrclk_pin: GPIO25
  i2s_bclk_pin: GPIO14

microphone:
  - platform: i2s_audio
    id: mic
    adc_type: external
    i2s_din_pin: GPIO26
    pdm: false

speaker:
  - platform: i2s_audio
    id: big_speaker
    dac_type: external
    i2s_dout_pin: GPIO27
    mode: mono

voice_assistant:
  microphone: mic
  use_wake_word: false
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  speaker: big_speaker
  id: assist
  on_start:
    - switch.turn_on:
        id: notif_light
  on_end:
    - switch.turn_off:
        id: notif_light

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(assist).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(assist).set_use_wake_word(false);
  - platform: gpio
    id: notif_light
    pin: 
      number: GPIO2
      mode: OUTPUT
    name: "Notification Light"
    restore_mode: ALWAYS_OFF

This is based on the YAML created by Everything Smart Home (https://www.youtube.com/watch?v=zhlIaBG3Ldo).

This works. However, every five seconds the pipeline shuts down and restarts, which significantly interrupts listening for wake words. Log from the ESP device:

[18:07:04][D][voice_assistant:418]: Desired state set to IDLE
[18:07:04][D][voice_assistant:412]: State changed from IDLE to START_PIPELINE
[18:07:04][D][voice_assistant:418]: Desired state set to START_MICROPHONE
[18:07:04][D][voice_assistant:118]: microphone not running
[18:07:05][D][voice_assistant:200]: Requesting start...
[18:07:05][D][voice_assistant:412]: State changed from START_PIPELINE to STARTING_PIPELINE
[18:07:05][D][voice_assistant:118]: microphone not running
[18:07:05][D][voice_assistant:433]: Client started, streaming microphone
[18:07:05][D][voice_assistant:412]: State changed from STARTING_PIPELINE to START_MICROPHONE
[18:07:05][D][voice_assistant:418]: Desired state set to STREAMING_MICROPHONE
[18:07:05][D][voice_assistant:153]: Starting Microphone
[18:07:05][D][voice_assistant:412]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[18:07:05][D][esp-idf:000]: I (711516) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[18:07:05][D][voice_assistant:519]: Event Type: 1
[18:07:05][D][voice_assistant:522]: Assist Pipeline running
[18:07:05][D][voice_assistant:412]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[18:07:05][D][switch:012]: 'Notification Light' Turning ON.
[18:07:05][D][switch:055]: 'Notification Light': Sending state ON
[18:07:05][D][voice_assistant:519]: Event Type: 9
[18:07:10][D][voice_assistant:519]: Event Type: 0
[18:07:10][D][voice_assistant:519]: Event Type: 2
[18:07:10][D][voice_assistant:609]: Assist Pipeline ended
[18:07:10][D][voice_assistant:412]: State changed from STREAMING_MICROPHONE to IDLE
[18:07:10][D][voice_assistant:418]: Desired state set to IDLE
[18:07:10][D][voice_assistant:412]: State changed from IDLE to START_PIPELINE
[18:07:10][D][voice_assistant:418]: Desired state set to START_MICROPHONE
[18:07:10][D][switch:016]: 'Notification Light' Turning OFF.
[18:07:10][D][switch:055]: 'Notification Light': Sending state OFF
[18:07:10][D][voice_assistant:200]: Requesting start...
[18:07:10][D][voice_assistant:412]: State changed from START_PIPELINE to STARTING_PIPELINE
[18:07:10][D][voice_assistant:433]: Client started, streaming microphone
[18:07:10][D][voice_assistant:412]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[18:07:10][D][voice_assistant:418]: Desired state set to STREAMING_MICROPHONE
[18:07:10][D][voice_assistant:519]: Event Type: 1
[18:07:10][D][voice_assistant:522]: Assist Pipeline running
[18:07:10][D][switch:012]: 'Notification Light' Turning ON.
[18:07:10][D][switch:055]: 'Notification Light': Sending state ON
[18:07:10][D][voice_assistant:519]: Event Type: 9
[18:07:15][D][voice_assistant:519]: Event Type: 0
[18:07:15][D][voice_assistant:519]: Event Type: 2
[18:07:15][D][voice_assistant:609]: Assist Pipeline ended
[18:07:15][D][voice_assistant:412]: State changed from STREAMING_MICROPHONE to IDLE
[18:07:15][D][voice_assistant:418]: Desired state set to IDLE
[18:07:15][D][voice_assistant:412]: State changed from IDLE to START_PIPELINE
[18:07:15][D][voice_assistant:418]: Desired state set to START_MICROPHONE
[18:07:15][D][switch:016]: 'Notification Light' Turning OFF.
[18:07:15][D][switch:055]: 'Notification Light': Sending state OFF
[18:07:15][D][voice_assistant:200]: Requesting start...
[18:07:15][D][voice_assistant:412]: State changed from START_PIPELINE to STARTING_PIPELINE
[18:07:15][D][voice_assistant:433]: Client started, streaming microphone
[18:07:15][D][voice_assistant:412]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[18:07:15][D][voice_assistant:418]: Desired state set to STREAMING_MICROPHONE
[18:07:15][D][voice_assistant:519]: Event Type: 1
[18:07:15][D][voice_assistant:522]: Assist Pipeline running
[18:07:15][D][switch:012]: 'Notification Light' Turning ON.
[18:07:15][D][switch:055]: 'Notification Light': Sending state ON
[18:07:15][D][voice_assistant:519]: Event Type: 9

I would expect the pipeline to just stay open, like the Pi Zero 2 W does. Is the above expected behavior? Can I change something in the yaml to prevent it?

I am having the exact same issue. I’ve tried 4 different mics now, with one remaining unsoldered. I’m not sure if I’m destroying them when soldering the pins on, or if I’m doing something else wrong, but I get identical logs.

My only difference is I hooked the L/R pin to a GPIO so I could toggle between left/right channel for testing. This didn’t do anything for me either.

I’ve been doing more reading, and it sounds like the 5-second thing is actually normal behavior. With only one mic connected, you can add the following to your Home Assistant configuration.yaml, and it will dump the five-second WAV recordings into a folder where you can listen to them and tune your mic settings. This was very helpful for troubleshooting my mic.

assist_pipeline:
    debug_recording_dir: /share/assist_pipeline

Don’t leave it running for too long, or it will fill up with thousands of clips. I noticed that when I did the same thing with a Raspi Zero 2 W USB mic (voice satellite), it records continuously in one big file. I wonder why the ESP32 setup can’t do that. It’s clearly possible.

I’m literally just now realizing this on my own. I soldered up the last of my 5 mics as a final test and got it to work with an Arduino test sketch, so I knew 100% the mic was working, and it was either an HA or ESPHome issue.

While pulling my hair out and talking to myself, while having the logs window up, I noticed that it wouldn’t stop the pipeline as long as I was talking, so I said the wake word… It worked!

This is NOT intuitive, and having the pipeline error out (event type 0 is an error IIRC, and type 2 is the run ending) and restart is indicative of an error, not of normal operation!
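One way to check whether the restart is a real fault or just a timeout is to have the device log what the pipeline reports when it errors. This is a minimal sketch, assuming your ESPHome version supports the voice_assistant on_error trigger and passes it the `code` and `message` string variables; the log tag and format string are placeholders, not anything from the original config:

```yaml
voice_assistant:
  id: assist
  microphone: mic
  # ...keep the rest of your existing voice_assistant options here...
  on_error:
    # Assumption: `code` and `message` are the std::string variables
    # ESPHome hands to on_error automations.
    - logger.log:
        level: WARN
        tag: "assist_debug"   # hypothetical tag, pick any name
        format: "Pipeline error %s: %s"
        args: ['code.c_str()', 'message.c_str()']
```

If the code that shows up points to a wake word timeout rather than a hardware or connection problem, that would suggest the 5-second cycle is the pipeline giving up on a silent window and voice_assistant.start_continuous starting it again. I have not verified the exact code string, so treat that as something to check in your own logs.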

Regardless, I now have to go back and test the other 4 mics, as I’m sure some of them will have been fine too.

Almost sounds like a VAD thing. I recall some folks have kept the pipeline alive by setting VAD to 0, but on a non-S3 ESP32 the build errors out, and I haven’t figured out why.

Side note: listening to the debug WAV files, I was pleasantly surprised by how good the $1 INMP441 mic sounds.

Agreed! For the price, the quality of the recording is impressive. I just wish it didn’t cause so many issues, or at least that the documentation explained how things are supposed to work.

Slight aside, I wanted to have a visual indicator of when the wake word is recognized, and my 30-pin dev board has a red power LED and a blue LED connected to GPIO2. I struggled through lots of yaml crap (I’m just starting to use yaml configs and ESPHome, coming from Tasmota) and I finally got it to work. Had to switch dev frameworks and was surprised when it actually compiled. There’s probably some redundant stuff, but it does what it’s supposed to. Dim blue light when the pipeline is running, then turns on bright when wake word is recognized, then back to dim after command is over or timeout.

esphome:
  name: esp32-voice-assist-2
  friendly_name: "ESP32 Voice Assist 2"
  on_boot:
     - priority: -100
       then:
         - wait_until: api.connected
         - delay: 1s
         - if:
            condition:
              switch.is_on: use_wake_word
            then:
                - light.turn_on:
                    id: notif_light
                    brightness: "25%"
                - voice_assistant.start_continuous:


interval:
  - interval: 1s
    then:
      - if:
          condition:
            api.connected:
          then:
            - if:
                condition:
                  and:
                    - switch.is_on: use_wake_word
                    - not:
                      - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous:
                  

#esp32:
#  board: esp32dev
#  framework:
#    type: esp-idf
#    version: recommended

esp32:
  board: esp32dev
  framework:
    type: arduino
    version: 2.0.4
    platform_version: 5.1.1

external_components:
  - source: github://pr#3820
    components: [ ledc ]

# Enable logging
logger:
  
# Enable Home Assistant API
api:

ota:

wifi:
  ssid: Roatan-2
  password: floraissocute
  #on_connect:
  #  - delay: 5s # Gives time for improv results to be transmitted

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Voice-Assist-2"
    password: "floraissocute"

output:
  - platform: ledc
    pin:
      number: GPIO2
      mode: OUTPUT
    id: notif_gpio

light:
  - platform: monochromatic
    name: "On Board LED"
    output: notif_gpio
    id: notif_light
    default_transition_length: 0.3s
    restore_mode: ALWAYS_OFF

i2s_audio:
  i2s_lrclk_pin: GPIO25
  i2s_bclk_pin: GPIO14

microphone:
  - platform: i2s_audio
    id: mic
    adc_type: external
    i2s_din_pin: GPIO26
    pdm: false

speaker:
  - platform: i2s_audio
    id: big_speaker
    dac_type: external
    i2s_dout_pin: GPIO27
    mode: mono

voice_assistant:
  microphone: mic
  use_wake_word: false
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  speaker: big_speaker
  id: assist
  on_wake_word_detected:
    - light.turn_on:
        id: notif_light
        brightness: "100%"
  on_end:
    - light.turn_on:
        id: notif_light
        brightness: "25%"

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(assist).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
            - light.turn_on:
                id: notif_light
                brightness: "25%"
    on_turn_off:
      - voice_assistant.stop
      - light.turn_off:
          id: notif_light
      - lambda: id(assist).set_use_wake_word(false);