So, I’m at my wits’ end. For days I haven’t been able to get this wake word to work.
Getting it working is the last test I want to finish before I experiment with speech-to-phrase.
My first step was to test the microphone. I was able to stream it to my PC via UDP, so the microphone hardware and I2S connection are OK.
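For anyone who wants to repeat this kind of microphone check, a PC-side receiver could look roughly like the sketch below. It is only an illustration under assumptions: it expects raw 16-bit little-endian mono PCM at 16 kHz in plain UDP datagrams, and the port number 12345 and the file name mic_test.wav are hypothetical, not necessarily what was used for the test described above.

```python
import socket
import wave

SAMPLE_RATE = 16000   # assumed to match sample_rate in the microphone config
SAMPLE_WIDTH = 2      # bytes per sample for 16bit audio
CHANNELS = 1          # single (left) channel


def write_wav(path, pcm_bytes):
    """Wrap raw PCM bytes in a WAV header so any audio player can open them."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm_bytes)


def record_udp(port, seconds, idle_timeout=2.0):
    """Collect raw PCM datagrams on `port` until about `seconds` of audio arrived."""
    wanted = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * seconds
    data = bytearray()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        sock.settimeout(idle_timeout)
        try:
            while len(data) < wanted:
                packet, _sender = sock.recvfrom(4096)
                data.extend(packet)
        except socket.timeout:
            pass  # stream stopped early; keep whatever was received
    return bytes(data)


if __name__ == "__main__":
    pcm = record_udp(port=12345, seconds=5)  # hypothetical port number
    write_wav("mic_test.wav", pcm)
    print(f"wrote {len(pcm)} bytes of audio")
```

Listening to the resulting file shows whether the audio is clean and loud enough, which is useful to know before blaming the wake word stage.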
The next step was to test the speaker. Streaming internet radio to the “Marvin Media Player” works.
But the wake word won’t work!! (Although the name “Marvin” appears everywhere, everything was tested with the wake word “hey jarvis”.)
What am I doing wrong?
ESPHome code:
esphome:
  name: marvin
  friendly_name: Marvin
  on_boot:
    - priority: -100
      then:
        - wait_until: api.connected
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:

esp32:
  board: esp32dev
  framework:
    type: esp-idf
    version: recommended

# Enable logging
logger:
  # level: DEBUG

# Enable Home Assistant API
api:
  encryption:
    key: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
    # key: !secret XXXXXX
  # reboot_timeout: 0s

ota:
  - platform: esphome # Your existing OTA method
    password: !secret XXXXXXXXXXXXXXXX

wifi:
  ssid: !secret XXXXXXXXX
  password: !secret XXXXXXXXXXXXX
  fast_connect: true
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: !secret XXXXXXXXXXXXX
    password: !secret XXXXXXXXXXXXXXXXXX

captive_portal:

i2s_audio:
  - id: i2s_out
    i2s_lrclk_pin: GPIO22 #LRC on MAX98357A
    i2s_bclk_pin: GPIO23 #BCL on MAX98357A
  - id: i2s_in
    i2s_lrclk_pin: GPIO27 #WS on microphone INMP441
    i2s_bclk_pin: GPIO26 #SCK on microphone INMP441

microphone:
  - platform: i2s_audio
    i2s_audio_id: i2s_in
    id: marvin_mic
    adc_type: external
    i2s_din_pin: GPIO25
    pdm: false
    channel: left #L/R Pin INMP441 => GND
    sample_rate: 16000
    bits_per_sample: 16bit

speaker:
  - platform: i2s_audio
    id: marvin_speaker
    dac_type: external
    i2s_audio_id: i2s_out
    i2s_dout_pin: GPIO18
    channel: mono
    sample_rate: 16000 #Defaults to 16000
    bits_per_sample: 16bit #One of 8bit, 16bit, 24bit, or 32bit. Defaults to 16bit.

media_player:
  - platform: speaker
    name: "Marvin Media Player"
    id: marvin_media_player
    buffer_size: 10000 #Must be between 4000 and 4000000. Defaults to 100000
    codec_support_enabled: true # set to false and specify format to save resources
    announcement_pipeline:
      speaker: marvin_speaker
      # format: MP3 # One of FLAC, MP3, WAV, or NONE.
      num_channels: 1

voice_assistant:
  id: marvin_va
  microphone: marvin_mic
  use_wake_word: false
  noise_suppression_level: 2
  auto_gain: 31dBFS #Between 0dBFS and 31dBFS inclusive. Defaults to 0 (disabled).
  volume_multiplier: 4.0
  speaker: marvin_speaker
  on_start:
    then:
      - logger.log: "=>> on_start: Voice assist pipeline is started"
  on_listening:
    then:
      - logger.log: "=>> on_listening: Voice assistant is listening..."
  on_wake_word_detected:
    then:
      - logger.log: "=>> on_wake_word_detected: Voice assistant has detected the wakeword!!"
  on_end:
    then:
      - logger.log: "=>> on_end: Voice assistant has finished all tasks."
  on_error:
    then:
      - logger.log: "=>> on_error: Voice assistant error occurred!"

switch:
  - platform: template
    name: "Use Marvin wake word"
    id: use_wake_word
    optimistic: true
    #restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(marvin_va).set_use_wake_word(true);
      - logger.log: "=>> set_use_wake_word(true)"
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
            - logger.log: "=>> Voice assistant has been switched on."
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(marvin_va).set_use_wake_word(false);
      - logger.log: "=>> set_use_wake_word(false)"
      - logger.log: "=>> Voice assistant switched off."

Log file:
[16:08:19.708][I][app:185]: ESPHome version 2025.10.1 compiled on Oct 19 2025, 16:07:13
[16:08:19.717][C][wifi:679]: WiFi:
[16:08:19.717][C][wifi:458]: Local MAC: AA:BB:CC:DD:EE:FF
[16:08:19.717][C][wifi:465]: IP Address: 192.168.178.52
[16:08:19.724][C][wifi:469]: SSID: 'XXXXXXX'[redacted]
[16:08:19.724][C][wifi:469]: BSSID: YY:YY:ZZ:ZZ:XX:XX[redacted]
[16:08:19.724][C][wifi:469]: Hostname: 'marvin'
[16:08:19.724][C][wifi:469]: Signal strength: -62 dB ▂▄▆█
[16:08:19.724][C][wifi:469]: Channel: 6
[16:08:19.724][C][wifi:469]: Subnet: 255.255.255.0
[16:08:19.724][C][wifi:469]: Gateway: 192.168.178.1
[16:08:19.724][C][wifi:469]: DNS1: 192.168.178.1
[16:08:19.724][C][wifi:469]: DNS2: 0.0.0.0
[16:08:19.728][C][logger:261]: Logger:
[16:08:19.728][C][logger:261]: Max Level: DEBUG
[16:08:19.728][C][logger:261]: Initial Level: DEBUG
[16:08:19.738][C][logger:267]: Log Baud Rate: 115200
[16:08:19.738][C][logger:267]: Hardware UART: UART0
[16:08:19.742][C][logger:274]: Task Log Buffer Size: 768
[16:08:19.769][C][template.switch:087]: Template Switch 'Use Marvin wake word'
[16:08:19.769][C][template.switch:087]: Restore Mode: always OFF
[16:08:19.770][C][template.switch:057]: Optimistic: YES
[16:08:19.817][C][i2s_audio.microphone:079]: Microphone:
[16:08:19.817][C][i2s_audio.microphone:079]: Pin: 25
[16:08:19.817][C][i2s_audio.microphone:079]: PDM: NO
[16:08:19.817][C][i2s_audio.microphone:079]: DC offset correction: NO
[16:08:19.818][C][psram:016]: PSRAM:
[16:08:19.819][C][psram:019]: Available: NO
[16:08:19.833][C][i2s_audio.speaker:074]: Speaker:
[16:08:19.833][C][i2s_audio.speaker:074]: Pin: 18
[16:08:19.833][C][i2s_audio.speaker:074]: Buffer duration: 500
[16:08:19.836][C][i2s_audio.speaker:080]: Timeout: 500 ms
[16:08:19.836][C][i2s_audio.speaker:088]: Communication format: std
[16:08:19.855][C][captive_portal:116]: Captive Portal:
[16:08:19.876][C][esphome.ota:093]: Over-The-Air updates:
[16:08:19.876][C][esphome.ota:093]: Address: marvin.local:3232
[16:08:19.876][C][esphome.ota:093]: Version: 2
[16:08:19.884][C][esphome.ota:100]: Password configured
[16:08:19.885][C][safe_mode:018]: Safe Mode:
[16:08:19.885][C][safe_mode:018]: Successful after: 60s
[16:08:19.885][C][safe_mode:018]: Invoke after: 10 attempts
[16:08:19.885][C][safe_mode:018]: Duration: 300s
[16:08:19.904][C][web_server.ota:241]: Web Server OTA
[16:08:19.908][C][api:222]: Server:
[16:08:19.908][C][api:222]: Address: marvin.local:6053
[16:08:19.908][C][api:222]: Listen backlog: 4
[16:08:19.908][C][api:222]: Max connections: 8
[16:08:19.908][C][api:229]: Noise encryption: YES
[16:08:19.918][C][mdns:179]: mDNS:
[16:08:19.918][C][mdns:179]: Hostname: marvin
[16:09:14.241][I][safe_mode:042]: Boot seems successful; resetting boot loop counter
[16:09:14.262][D][esp32.preferences:149]: Writing 2 items: 0 cached, 2 written, 0 failed
[16:10:01.118][D][switch:020]: 'Use Marvin wake word' Turning ON.
[16:10:01.124][D][switch:063]: 'Use Marvin wake word': Sending state ON
[16:10:01.125][D][main:223]: =>> set_use_wake_word(true)
[16:10:01.125][D][voice_assistant:478]: State changed from IDLE to START_MICROPHONE
[16:10:01.125][D][voice_assistant:485]: Desired state set to START_PIPELINE
[16:10:01.125][D][main:233]: =>> Voice assistant has been switched on.
[16:10:01.126][D][voice_assistant:207]: Starting Microphone
[16:10:01.126][D][ring_buffer:034]: Created ring buffer with size 16384
[16:10:01.133][D][voice_assistant:478]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[16:10:01.165][D][voice_assistant:478]: State changed from STARTING_MICROPHONE to START_PIPELINE
[16:10:01.166][D][voice_assistant:228]: Requesting start
[16:10:01.167][D][voice_assistant:478]: State changed from START_PIPELINE to STARTING_PIPELINE
[16:10:01.186][D][voice_assistant:500]: Client started, streaming microphone
[16:10:01.196][D][voice_assistant:478]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[16:10:01.196][D][voice_assistant:485]: Desired state set to STREAMING_MICROPHONE
[16:10:01.241][D][voice_assistant:624]: Event Type: 1
[16:10:01.241][D][voice_assistant:627]: Assist Pipeline running
[16:10:01.242][D][voice_assistant:624]: Event Type: 9
[16:10:01.242][D][main:490]: =>> on_start: Voice assist pipeline is started
