Local Voice Assistant Creation Issues with ESPHome

Hello,

I have been trying to create a voice assistant using ESPHome for some time now. Unfortunately, all my attempts have ended in failure.

To do this, I am using an ESP32, an INMP441 microphone, and a MAX98357 amplifier. The device neither recognizes a wake word nor triggers a voice command.

I tested my assistant pipeline with my phone, and the chain openwakeword => whisper => piper works correctly. I also tested playing an audio file on my ESP32 from Home Assistant, and it works very well. To check the microphone, I used an ESP32-S3 and microwakeword, and it worked.

When I turn off the wake word, only a 44-byte MP3 file is created on my assistant. The same happens in the STT stage after microwakeword detects the wake word.
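
A side note on the 44-byte figure: 44 bytes is exactly the size of a canonical PCM WAV header. If the recorded file is actually a WAV container (an assumption; it is described above as MP3), it would hold a header and no audio samples at all. The Python sketch below illustrates this with the standard `wave` module. The helper names `empty_wav_size` and `audio_seconds` are hypothetical, and the 16 kHz, 16-bit, mono format is assumed, not taken from the logs.

```python
import io
import wave


def empty_wav_size(sample_rate=16000, channels=1, sample_width=2):
    """Write a WAV file with zero audio frames in memory and return its size in bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)   # bytes per sample (2 = 16-bit)
        w.setframerate(sample_rate)
        w.writeframes(b"")             # no audio data, header only
    return len(buf.getvalue())


def audio_seconds(file_size, sample_rate=16000, channels=1,
                  sample_width=2, header_size=44):
    """Estimate how many seconds of PCM audio a WAV file of this size holds."""
    payload = max(file_size - header_size, 0)
    return payload / (sample_rate * channels * sample_width)


if __name__ == "__main__":
    print(empty_wav_size())      # size of a header-only WAV file
    print(audio_seconds(44))     # audio duration in a 44-byte file
```

Under these assumptions, a 44-byte recording would mean that no microphone samples reached the pipeline, which points at the audio capture path and not at the wake word or STT engines.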

Home Assistant, ESPHome, OpenWakeWord, Whisper, and Piper are installed via the Helm charts from TrueCharts on my NAS running TrueNAS.

I believe that the hardware configuration of my ESP32 is correct, as I have followed all the tutorials available. However, I am still unable to identify the problem.

I am attaching the YAML of my ESP32, as well as some logs.

ESP32 log:

INFO ESPHome 2024.3.1
INFO Reading configuration /config/test-2.yaml...
INFO Starting log output from 192.168.1.145 using esphome API
INFO Successfully connected to test-2 @ 192.168.1.145 in 0.008s
INFO Successful handshake with test-2 @ 192.168.1.145 in 0.116s
[13:22:38][I][app:102]: ESPHome version 2024.3.1 compiled on Mar 29 2024, 17:14:23
[13:22:38][C][wifi:580]: WiFi:
[13:22:38][C][wifi:408]:   Local MAC: EC:64:C9:81:04:5C
[13:22:38][C][wifi:413]:   SSID: [redacted]
[13:22:38][C][wifi:416]:   IP Address: 192.168.1.145
[13:22:38][C][wifi:420]:   BSSID: [redacted]
[13:22:38][C][wifi:421]:   Hostname: 'test-2'
[13:22:38][C][wifi:423]:   Signal strength: -46 dB ▂▄▆█
[13:22:38][C][wifi:427]:   Channel: 11
[13:22:38][C][wifi:428]:   Subnet: 255.255.255.0
[13:22:38][C][wifi:429]:   Gateway: 192.168.1.1
[13:22:38][C][wifi:430]:   DNS1: 0.0.0.0
[13:22:38][C][wifi:431]:   DNS2: 0.0.0.0
[13:22:38][C][logger:166]: Logger:
[13:22:38][C][logger:167]:   Level: DEBUG
[13:22:38][C][logger:169]:   Log Baud Rate: 115200
[13:22:38][C][logger:170]:   Hardware UART: UART0
[13:22:38][C][light:103]: Light 'Light'
[13:22:38][C][light:105]:   Default Transition Length: 0.5s
[13:22:38][C][light:106]:   Gamma Correct: 2.80
[13:22:38][C][template.switch:068]: Template Switch 'Use wake word'
[13:22:38][C][template.switch:091]:   Restore Mode: restore defaults to ON
[13:22:38][C][template.switch:057]:   Optimistic: YES
[13:22:38][C][status:034]: Status Binary Sensor 'API Connection'
[13:22:38][C][status:034]:   Device Class: 'connectivity'
[13:22:38][C][captive_portal:088]: Captive Portal:
[13:22:38][C][mdns:115]: mDNS:
[13:22:38][C][mdns:116]:   Hostname: test-2
[13:22:38][C][ota:096]: Over-The-Air Updates:
[13:22:38][C][ota:097]:   Address: 192.168.1.145:3232
[13:22:38][C][ota:100]:   Using Password.
[13:22:38][C][ota:103]:   OTA version: 2.
[13:22:38][C][api:139]: API Server:
[13:22:38][C][api:140]:   Address: 192.168.1.145:6053
[13:22:38][C][api:142]:   Using noise encryption: YES
[13:22:38][C][audio:203]: Audio:
[13:22:38][C][audio:225]:   External DAC channels: 1
[13:22:38][C][audio:226]:   I2S DOUT Pin: 27
[13:22:39][D][binary_sensor:036]: 'API Connection': Sending state ON
[13:22:39][E][voice_assistant:462]: No API client connected
[13:22:39][D][voice_assistant:416]: State changed from IDLE to IDLE
[13:22:39][D][voice_assistant:422]: Desired state set to IDLE
[13:22:43][D][api:102]: Accepted 192.168.1.199
[13:22:43][D][api.connection:1159]: Home Assistant 2024.3.3 (192.168.1.199): Connected successfully
[13:22:44][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[13:22:44][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[13:22:44][D][voice_assistant:118]: microphone not running
[13:22:44][D][voice_assistant:202]: Requesting start...
[13:22:44][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[13:22:44][D][voice_assistant:118]: microphone not running
[13:22:44][D][voice_assistant:118]: microphone not running
[13:22:44][D][voice_assistant:118]: microphone not running
[13:22:44][D][voice_assistant:437]: Client started, streaming microphone
[13:22:44][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE
[13:22:44][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[13:22:44][D][voice_assistant:155]: Starting Microphone
[13:22:44][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[13:22:44][D][voice_assistant:523]: Event Type: 1
[13:22:44][D][voice_assistant:526]: Assist Pipeline running
[13:22:44][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[13:22:44][D][voice_assistant:523]: Event Type: 9

OpenWakeWord log

2024-03-30 13:22:38.243418+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17151971570163
2024-03-30 13:22:44.218530+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17157946263179
2024-03-30 13:22:44.221683+01:00DEBUG:wyoming_openwakeword.handler:Receiving audio from client: 17157946263179
2024-03-30 13:22:48.241967+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17161970642708
2024-03-30 13:22:48.242159+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17161970642708
2024-03-30 13:22:48.242295+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17161971045295
2024-03-30 13:22:48.242345+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17161971045295
2024-03-30 13:22:58.242131+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17171970801862
2024-03-30 13:22:58.242203+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17171970871843
2024-03-30 13:22:58.242218+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17171970801862
2024-03-30 13:22:58.242242+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17171970871843
2024-03-30 13:23:08.242657+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17181971269457
2024-03-30 13:23:08.242747+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17181971444115
2024-03-30 13:23:08.242967+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17181971269457
2024-03-30 13:23:08.243014+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17181971444115
2024-03-30 13:23:18.242360+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17191971019149
2024-03-30 13:23:18.242410+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17191971085970
2024-03-30 13:23:18.242578+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17191971085970
2024-03-30 13:23:18.242611+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17191971019149
2024-03-30 13:23:28.242603+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17201971272722
2024-03-30 13:23:28.242651+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17201971272722
2024-03-30 13:23:28.242750+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17201971461949
2024-03-30 13:23:28.242800+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17201971461949
2024-03-30 13:23:38.242925+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17211971613947
2024-03-30 13:23:38.243114+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17211971613947
2024-03-30 13:23:38.243129+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17211971769141
2024-03-30 13:23:38.243137+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17211971769141
2024-03-30 13:23:48.244203+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17221972428548
2024-03-30 13:23:48.245311+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17221972428548
2024-03-30 13:23:48.245970+01:00DEBUG:wyoming_openwakeword.handler:Client connected: 17221974391938
2024-03-30 13:23:48.246365+01:00DEBUG:wyoming_openwakeword.handler:Client disconnected: 17221974391938
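
In the OpenWakeWord log above, most clients connect and disconnect within the same millisecond about every 10 seconds, while one client (17157946263179) starts sending audio at 13:22:44 and has no matching disconnect in this excerpt. A small Python sketch for pairing these events follows. The function name `open_clients` is hypothetical, and the code only assumes the `Client connected: <id>` / `Client disconnected: <id>` wording seen in the log.

```python
import re

# Matches the two event types seen in the wyoming_openwakeword handler log.
EVENT_RE = re.compile(r"Client (connected|disconnected): (\d+)")


def open_clients(log_lines):
    """Return the ids of clients that connected but never disconnected, in order."""
    still_open = {}
    for line in log_lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # e.g. "Receiving audio from client: ..." lines
        event, client_id = match.groups()
        if event == "connected":
            still_open[client_id] = True
        else:
            # Ignore disconnects for clients that connected before the excerpt.
            still_open.pop(client_id, None)
    return list(still_open)


if __name__ == "__main__":
    sample = [
        "DEBUG:wyoming_openwakeword.handler:Client disconnected: 17151971570163",
        "DEBUG:wyoming_openwakeword.handler:Client connected: 17157946263179",
        "DEBUG:wyoming_openwakeword.handler:Receiving audio from client: 17157946263179",
        "DEBUG:wyoming_openwakeword.handler:Client connected: 17161970642708",
        "DEBUG:wyoming_openwakeword.handler:Client disconnected: 17161970642708",
    ]
    print(open_clients(sample))
```

That the remaining client is the ESP32's streaming session is an assumption, based on its timestamp matching the "Client started, streaming microphone" line in the ESP32 log. The excerpt shows audio being received from it but no detection being logged.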

The YAML of my ESP32:

esphome:
  name: test-2
  friendly_name: test-2

esp32:
  board: esp32dev
  framework:
    type: arduino

# Enable logging
logger:
  level: DEBUG

# Enable Home Assistant API
api:
  encryption:
    key: ""

ota:
  password: ""

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  power_save_mode: none
  manual_ip:
      # Set this to the IP of the ESP
      static_ip: 192.168.1.145
      # Set this to the IP address of the router. Often ends with .1
      gateway: 192.168.1.1
      # The subnet of the network. 255.255.255.0 works for most home networks.
      subnet: 255.255.255.0

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Test-2 Fallback Hotspot"
    password: "q9JwgSja5m2G"

captive_portal:

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO26 #WS 
    i2s_bclk_pin: GPIO25 #SCK

microphone:
  - platform: i2s_audio
    adc_type: external
    pdm: false
    id: mic_i2s
    channel: right
    bits_per_sample: 32bit
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO33  #SD Pin from the INMP441 Microphone


media_player:
  - platform: i2s_audio
    name: "esp_speaker"
    id: media_player_speaker
    i2s_audio_id: i2s_in
    dac_type: external
    i2s_dout_pin: GPIO27   #  DIN Pin of the MAX98357A Audio Amplifier
    mode: mono


voice_assistant:
  microphone: mic_i2s
  id: va
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 4.0
  use_wake_word: false
  media_player: media_player_speaker
  
  on_wake_word_detected: 
    - light.turn_on:
        id: led_light
  on_listening: 
    - light.turn_on:
        id: led_light
        effect: "Scan Effect With Custom Values"
        red: 63%
        green: 13%
        blue: 93%
  
  on_stt_end:
    - light.turn_on:
        id: led_light
        effect: "None"
        red: 0%
        green: 100%
        blue: 0%

  on_error: 
    - light.turn_on:
        id: led_light
        effect: "None"
    - if:
        condition:
          switch.is_on: use_wake_word
        then:

          - switch.turn_off: use_wake_word
          - delay: 1sec 
          - switch.turn_on: use_wake_word      
  


  on_client_connected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:

  on_client_disconnected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.stop:
 
  on_end:
    - light.turn_off:
        id: led_light



binary_sensor:
  - platform: status
    name: API Connection
    id: api_connection
    filters:
      - delayed_on: 1s
    on_press:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - voice_assistant.start_continuous:
    on_release:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - voice_assistant.stop:


switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);

light:
  - platform: neopixelbus
    id: led_light
    type: grb
    pin: GPIO32      # DIN pin of the LED Strip
    num_leds: 9      # change the Number of LEDS according to your LED Strip.
    name: "Light"
    variant: sk6812
    default_transition_length: 0.5s
      
    effects:
      - addressable_scan:
          name: Scan Effect With Custom Values
          move_interval: 50ms
          scan_width: 2

Would anyone have an idea to solve my problem, please?

I’m in no way an expert at this, but I would think this line would be a place to start:
use_wake_word: false
I don’t have that line in my config. Try deleting it or setting it to “true”.

Thank you for the response.
I have tested it, but unfortunately, the result is the same.

The only other suggestion I have: every time I make a change to my ESP32, I have to go to the device and turn off “Use wake word”, then turn it back on. Then I go to my Assistant and click “Update”. After a minute or so, it starts working.

Thank you, I will test this suggestion and get back to you afterwards.

Hello @Demusman, I tested your solution and unfortunately it doesn’t change anything :face_exhaling: