Is it possible to get a voice assistant working on an ESP32-S3-devkitc-1?
I also tried looking at the code for the S3-Box on GitHub, but it doesn't say which pins to use for the speaker/mic/I2S.
I tried to compile but I’m getting:
Compiling .pioenvs/kitchen-va/src/main.o
Compiling .pioenvs/kitchen-va/components/audio_board/lyrat_v4_3/board_pins_config.o
components/audio_board/lyrat_v4_3/board_pins_config.c: In function 'get_i2c_pins':
components/audio_board/lyrat_v4_3/board_pins_config.c:39:34: error: 'GPIO_NUM_23' undeclared (first use in this function); did you mean 'GPIO_NUM_43'?
i2c_config->scl_io_num = GPIO_NUM_23;
^~~~~~~~~~~
GPIO_NUM_43
components/audio_board/lyrat_v4_3/board_pins_config.c:39:34: note: each undeclared identifier is reported only once for each function it appears in
components/audio_board/lyrat_v4_3/board_pins_config.c: In function 'get_i2s_pins':
components/audio_board/lyrat_v4_3/board_pins_config.c:54:33: error: 'GPIO_NUM_25' undeclared (first use in this function); did you mean 'GPIO_NUM_45'?
i2s_config->ws_io_num = GPIO_NUM_25;
^~~~~~~~~~~
GPIO_NUM_45
In file included from /data/cache/platformio/packages/framework-espidf/components/esp_rom/include/esp32s3/rom/ets_sys.h:19,
from /data/cache/platformio/packages/framework-espidf/components/log/include/esp_log.h:19,
from components/audio_board/lyrat_v4_3/board_pins_config.c:25:
components/audio_board/lyrat_v4_3/board_pins_config.c: In function 'i2s_mclk_gpio_select':
components/audio_board/lyrat_v4_3/board_pins_config.c:95:53: error: 'FUNC_GPIO0_CLK_OUT1' undeclared (first use in this function); did you mean 'FUNC_GPIO20_CLK_OUT1'?
PIN_FUNC_SELECT(PERIPHS_IO_MUX_GPIO0_U, FUNC_GPIO0_CLK_OUT1);
^~~~~~~~~~~~~~~~~~~
/data/cache/platformio/packages/framework-espidf/components/soc/esp32s3/include/soc/soc.h:136:45: note: in definition of macro 'REG_WRITE'
(*(volatile uint32_t *)(_r)) = (_v); \
^~
/data/cache/platformio/packages/framework-espidf/components/soc/esp32s3/include/soc/io_mux_reg.h:93:46: note: in expansion of macro 'REG_SET_FIELD'
#define PIN_FUNC_SELECT(PIN_NAME, FUNC) REG_SET_FIELD(PIN_NAME, MCU_SEL, FUNC)
^~~~~~~~~~~~~
components/audio_board/lyrat_v4_3/board_pins_config.c:95:13: note: in expansion of macro 'PIN_FUNC_SELECT'
PIN_FUNC_SELECT(PERIPHS_IO_MUX_GPIO0_U, FUNC_GPIO0_CLK_OUT1);
^~~~~~~~~~~~~~~
components/audio_board/lyrat_v4_3/board_pins_config.c:98:53: error: 'FUNC_U0TXD_CLK_OUT3' undeclared (first use in this function); did you mean 'FUNC_U0TXD_CLK_OUT1'?
PIN_FUNC_SELECT(PERIPHS_IO_MUX_U0TXD_U, FUNC_U0TXD_CLK_OUT3);
^~~~~~~~~~~~~~~~~~~
/data/cache/platformio/packages/framework-espidf/components/soc/esp32s3/include/soc/soc.h:136:45: note: in definition of macro 'REG_WRITE'
(*(volatile uint32_t *)(_r)) = (_v); \
^~
/data/cache/platformio/packages/framework-espidf/components/soc/esp32s3/include/soc/io_mux_reg.h:93:46: note: in expansion of macro 'REG_SET_FIELD'
#define PIN_FUNC_SELECT(PIN_NAME, FUNC) REG_SET_FIELD(PIN_NAME, MCU_SEL, FUNC)
^~~~~~~~~~~~~
components/audio_board/lyrat_v4_3/board_pins_config.c:98:13: note: in expansion of macro 'PIN_FUNC_SELECT'
PIN_FUNC_SELECT(PERIPHS_IO_MUX_U0TXD_U, FUNC_U0TXD_CLK_OUT3);
^~~~~~~~~~~~~~~
In file included from components/audio_board/lyrat_v4_3/board.h:29,
from components/audio_board/lyrat_v4_3/board_pins_config.c:28:
components/audio_board/lyrat_v4_3/board_pins_config.c: In function 'get_green_led_gpio':
components/audio_board/lyrat_v4_3/board_def.h:42:35: error: 'GPIO_NUM_22' undeclared (first use in this function); did you mean 'GPIO_NUM_42'?
#define GREEN_LED_GPIO GPIO_NUM_22
^~~~~~~~~~~
components/audio_board/lyrat_v4_3/board_pins_config.c:186:12: note: in expansion of macro 'GREEN_LED_GPIO'
return GREEN_LED_GPIO;
^~~~~~~~~~~~~~
components/audio_board/lyrat_v4_3/board_pins_config.c:187:1: error: control reaches end of non-void function [-Werror=return-type]
}
^
cc1: some warnings being treated as errors
*** [.pioenvs/kitchen-va/components/audio_board/lyrat_v4_3/board_pins_config.o] Error 1
I was eventually able to get this to work with the following options. I'm currently testing it with micro_wake_word, and so far on-device wake word detection seems to be as good as streaming to Open Wake Word.
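The compile errors above come from the ADF's lyrat_v4_3 board code referencing classic-ESP32 GPIO numbers that don't exist on the S3. The key part for me was setting an S3-compatible board in sdkconfig so the adf-sdk compiles (the board-specific code isn't actually used); that's this fragment of the esp32: block, shown in full in the config below:

esp32:
  board: esp32-s3-devkitc-1
  variant: ESP32S3
  framework:
    type: esp-idf
    sdkconfig_options:
      # need an S3-compatible board for the adf-sdk to compile
      # (the board-specific code is not used)
      CONFIG_ESP32_S3_BOX_BOARD: "y"

Here is the full config: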
substitutions:
# Phases of the Voice Assistant
# IDLE: The voice assistant is ready to be triggered by a wake-word
voice_assist_idle_phase_id: '1'
# LISTENING: The voice assistant is ready to listen to a voice command (after being triggered by the wake word)
voice_assist_listening_phase_id: '2'
# THINKING: The voice assistant is currently processing the command
voice_assist_thinking_phase_id: '3'
# REPLYING: The voice assistant is replying to the command
voice_assist_replying_phase_id: '4'
# NOT_READY: The voice assistant is not ready
voice_assist_not_ready_phase_id: '10'
# ERROR: The voice assistant encountered an error
voice_assist_error_phase_id: '11'
# MUTED: The voice assistant is muted and will not reply to a wake-word
voice_assist_muted_phase_id: '12'
#pins
i2s_out_lrclk_pin: GPIO11 # LRC on Max98357
i2s_out_bclk_pin: GPIO9 # BCLK on Max98357
i2s_in_lrclk_pin: GPIO3 # WS on INMP441
  i2s_in_bclk_pin: GPIO2 # SCK on INMP441
light_pin: GPIO21 # on-board LED
speaker_pin: GPIO8 # DIN on Max98357
mic_pin: GPIO4 # SD on INMP441
ip: <redacted>
dns: <redacted>
esphome:
name: vatest
friendly_name: VATest
platformio_options:
board_build.flash_mode: dio
on_boot:
priority: 600
then:
# Run the script to refresh the LED status
- script.execute: control_led
# - output.turn_off: set_low_speaker
      # If, after 30 seconds, the device is still initializing (it has not yet connected to Home Assistant), turn off the init_in_progress variable and run the script to refresh the LED status
- delay: 30s
- if:
condition:
lambda: return id(init_in_progress);
then:
- lambda: id(init_in_progress) = false;
- script.execute: control_led
esp32:
board: esp32-s3-devkitc-1
variant: ESP32S3
framework:
type: esp-idf
version: recommended
sdkconfig_options:
# need to set a s3 compatible board for the adf-sdk to compile
# board specific code is not used though
CONFIG_ESP32_S3_BOX_BOARD: "y"
CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM: "16"
CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM: "512"
CONFIG_TCPIP_RECVMBOX_SIZE: "512"
CONFIG_TCP_SND_BUF_DEFAULT: "65535"
CONFIG_TCP_WND_DEFAULT: "512000"
CONFIG_TCP_RECVMBOX_SIZE: "512"
CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
psram:
mode: quad
speed: 80MHz
external_components:
- source:
type: git
url: https://github.com/gnumpi/esphome_audio
ref: main
components:
- adf_pipeline
- i2s_audio
refresh: 0s
adf_pipeline:
- platform: i2s_audio
type: audio_out
id: adf_i2s_out
i2s_audio_id: i2s_out
i2s_dout_pin: ${speaker_pin}
- platform: i2s_audio
type: audio_in
id: adf_i2s_in
i2s_audio_id: i2s_in
i2s_din_pin: ${mic_pin}
pdm: false
channel: left
sample_rate: 16000
bits_per_sample: 32bit
microphone:
- platform: adf_pipeline
id: adf_microphone
gain_log2: 3
keep_pipeline_alive: false
pipeline:
- adf_i2s_in
- self
media_player:
- platform: adf_pipeline
id: adf_media_player
name: s3-dev_media_player
keep_pipeline_alive: false
internal: false
pipeline:
- self
- adf_i2s_out
# These are our two I2S buses with the correct pins.
# You can refer to the wiring diagram of our voice assistant for more details
i2s_audio:
- id: i2s_out
i2s_lrclk_pin: ${i2s_out_lrclk_pin}
i2s_bclk_pin: ${i2s_out_bclk_pin}
- id: i2s_in
i2s_lrclk_pin: ${i2s_in_lrclk_pin}
i2s_bclk_pin: ${i2s_in_bclk_pin}
# This is the declaration of our voice assistant
# It references the microphone and media player declared above.
voice_assistant:
id: va
microphone: adf_microphone
media_player: adf_media_player
# use_wake_word: true
  # This is how I personally tune my voice assistant; you may have to test a few values for the 4 parameters below
noise_suppression_level: 4 #4
auto_gain: 31dBFS # 31dBFS
volume_multiplier: 8 # 8.0
# vad_threshold: 3
# When the voice assistant connects to HA:
# Set init_in_progress to false (Initialization is over).
# If the switch is on, start the voice assistant
# In any case: Set the correct phase and run the script to refresh the LED status
on_client_connected:
- lambda: id(init_in_progress) = false;
- if:
condition:
switch.is_on: use_wake_word
then:
- micro_wake_word.start
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
else:
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
- script.execute: control_led
  # When the voice assistant disconnects from HA:
# Stop the voice assistant
# Set the correct phase and run the script to refresh the LED status
on_client_disconnected:
- lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
- micro_wake_word.stop
- script.execute: control_led
# When the voice assistant starts to listen: Set the correct phase and run the script to refresh the LED status
on_listening:
- lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
- script.execute: control_led
# When the voice assistant starts to think: Set the correct phase and run the script to refresh the LED status
on_stt_vad_end:
- lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
- script.execute: control_led
# When the voice assistant starts to reply: Set the correct phase and run the script to refresh the LED status
# on_tts_stream_start:
on_tts_start:
- lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
- script.execute: control_led
on_end:
- if:
condition:
- switch.is_on: use_wake_word
then:
- wait_until:
not:
voice_assistant.is_running:
- micro_wake_word.start
    # When the voice assistant has finished replying: Set the correct phase and run the script to refresh the LED status
# on_tts_stream_end:
# on_stt_end:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
- script.execute: control_led
# When the voice assistant encounters an error:
# Set the error phase and run the script to refresh the LED status
# Wait 1 second and set the correct phase (idle or muted depending on the state of the switch) and run the script to refresh the LED status
on_error:
- if:
condition:
lambda: return !id(init_in_progress);
then:
- lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
- script.execute: control_led
- delay: 1s
- if:
condition:
switch.is_on: use_wake_word
then:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
else:
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
- script.execute: control_led
# Enable logging
logger:
# level: VERBOSE
# logs:
# micro_wake_word: DEBUG
ota:
password: "<redacted>"
# If the device connects to, or disconnects from, Home Assistant: Run the script to refresh the LED status
# Enable Home Assistant API
api:
encryption:
    key: "<redacted>"
on_client_connected:
- script.execute: control_led
on_client_disconnected:
- script.execute: control_led
wifi:
ssid: !secret tp_wifi_ssid
password: !secret tp_wifi_password
power_save_mode: none
manual_ip:
static_ip: ${ip}
gateway: <redacted>
subnet: 255.255.255.0
dns1: ${dns}
  # If the device connects to, or disconnects from, the WiFi: Run the script to refresh the LED status
on_connect:
- script.execute: control_led
on_disconnect:
- script.execute: control_led
globals:
# Global initialisation variable. Initialized to true and set to false once everything is connected. Only used to have a smooth "plugging" experience
- id: init_in_progress
type: bool
restore_value: no
initial_value: 'true'
# Global variable tracking the phase of the voice assistant (defined above). Initialized to not_ready
- id: voice_assistant_phase
type: int
restore_value: no
initial_value: ${voice_assist_not_ready_phase_id}
sensor:
- platform: wifi_signal
name: "WiFi Signal Sensor"
update_interval: 120s
light:
- platform: esp32_rmt_led_strip
rgb_order: GRB
pin: ${light_pin}
num_leds: 1
rmt_channel: 0
chipset: WS2812
name: "Status LED"
id: led
disabled_by_default: True
# entity_category: diagnostic
icon: mdi:led-on
default_transition_length: 0s
effects:
- pulse:
name: "Slow Pulse"
transition_length: 250ms
update_interval: 250ms
min_brightness: 50%
max_brightness: 100%
- pulse:
name: "Fast Pulse"
transition_length: 100ms
update_interval: 100ms
min_brightness: 50%
max_brightness: 100%
script:
# Master script controlling the LED, based on different conditions: initialization in progress, wifi and API connected, and the current voice assistant phase.
# For the sake of simplicity and re-usability, the script calls child scripts defined below.
# This script will be called every time one of these conditions is changing.
- id: control_led
then:
- if:
condition:
lambda: return !id(init_in_progress);
then:
- if:
condition:
wifi.connected:
then:
- if:
condition:
api.connected:
then:
- lambda: |
switch(id(voice_assistant_phase)) {
case ${voice_assist_listening_phase_id}:
id(control_led_voice_assist_listening_phase).execute();
break;
case ${voice_assist_thinking_phase_id}:
id(control_led_voice_assist_thinking_phase).execute();
break;
case ${voice_assist_replying_phase_id}:
id(control_led_voice_assist_replying_phase).execute();
break;
case ${voice_assist_error_phase_id}:
id(control_led_voice_assist_error_phase).execute();
break;
case ${voice_assist_muted_phase_id}:
id(control_led_voice_assist_muted_phase).execute();
break;
case ${voice_assist_not_ready_phase_id}:
id(control_led_voice_assist_not_ready_phase).execute();
break;
default:
id(control_led_voice_assist_idle_phase).execute();
break;
}
else:
- script.execute: control_led_no_ha_connection_state
else:
- script.execute: control_led_no_ha_connection_state
else:
- script.execute: control_led_init_state
  # Script executed during initialisation: In this example: Turn the LED green with a fast pulse 🟢
- id: control_led_init_state
then:
- light.turn_on:
id: led
blue: 0%
red: 0%
green: 100%
effect: "Fast Pulse"
# Script executed when the device has no connection to Home Assistant: In this example: Turn off the LED
- id: control_led_no_ha_connection_state
then:
- light.turn_off:
id: led
  # Script executed when the voice assistant is idle (waiting for a wake word): In this example: Turn the LED white at 20% brightness ⚪
- id: control_led_voice_assist_idle_phase
then:
- light.turn_on:
id: led
blue: 100%
red: 100%
green: 100%
brightness: 20%
effect: "none"
  # Script executed when the voice assistant is listening to a command: In this example: Turn the LED blue with a slow pulse 🔵
- id: control_led_voice_assist_listening_phase
then:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
effect: "Slow Pulse"
  # Script executed when the voice assistant is processing the command: In this example: Turn the LED blue with a fast pulse 🔵
- id: control_led_voice_assist_thinking_phase
then:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
effect: "Fast Pulse"
  # Script executed when the voice assistant is replying to a command: In this example: Turn the LED solid blue (no pulse) 🔵
- id: control_led_voice_assist_replying_phase
then:
- light.turn_on:
id: led
blue: 100%
red: 0%
green: 0%
brightness: 100%
effect: "none"
  # Script executed when the voice assistant encounters an error: In this example: Turn the LED solid red (no pulse) 🔴
- id: control_led_voice_assist_error_phase
then:
- light.turn_on:
id: led
blue: 0%
red: 100%
green: 0%
brightness: 100%
effect: "none"
# Script executed when the voice assistant is muted: In this example: Turn off the LED
- id: control_led_voice_assist_muted_phase
then:
- light.turn_off:
id: led
# Script executed when the voice assistant is not ready: In this example: Turn off the LED
- id: control_led_voice_assist_not_ready_phase
then:
- light.turn_off:
id: led
# Declaration of the switch that will be used to turn on or off (mute) our voice assistant
button:
#system
- platform: restart
name: Restart
id: restart_switch
switch:
- platform: template
name: Enable Voice Assistant
id: use_wake_word
optimistic: true
restore_mode: RESTORE_DEFAULT_ON
icon: mdi:assistant
# When the switch is turned on (on Home Assistant):
# Start the voice assistant component
# Set the correct phase and run the script to refresh the LED status
on_turn_on:
- if:
condition:
lambda: return !id(init_in_progress);
then:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
- if:
condition:
not:
- voice_assistant.is_running
then:
- micro_wake_word.start
- script.execute: control_led
# When the switch is turned off (on Home Assistant):
# Stop the voice assistant component
# Set the correct phase and run the script to refresh the LED status
on_turn_off:
- if:
condition:
lambda: return !id(init_in_progress);
then:
- voice_assistant.stop
- micro_wake_word.stop
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
- script.execute: control_led
micro_wake_word:
model: okay_nabu
on_wake_word_detected:
then:
- media_player.stop:
# - media_player.play_media: https://steriku.duckdns.org:8123/local/va/din-ding.mp3
- voice_assistant.start:
I can only use it once, and it won't wake up again after that. The log shows the following:
[D][esp_adf_pipeline:302]: State changed from STOPPING to UNINITIALIZED
Detailed logs:
[15:04:26][D][media_player:061]: 's3-dev_media_player' - Setting
[15:04:26][D][media_player:068]: Media URL: http://192.168.3.242:8123/api/tts_proxy/2bea3af15933f57a7d0e53bf47780474ebef10f5_zh-cn_6c2e43c6c1_edge_tts.mp3
[15:04:26][D][media_player:074]: Announcement: yes
[15:04:26][D][adf_media_player:030]: Got control call in state 1
[15:04:26][D][esp_adf_pipeline:050]: Starting request, current state UNINITIALIZED
[15:04:26][V][esp-idf:000]: I (25310) MP3_DECODER: MP3 init
[15:04:26][V][esp-idf:000]: I (25322) I2S: DMA Malloc info, datalen=blocksize=2048, dma_buf_count=4
[15:04:26][D][i2s_audio:072]: Installing driver : yes
[15:04:26][D][esp_adf_pipeline:358]: pipeline tag 0, http
[15:04:26][D][esp_adf_pipeline:358]: pipeline tag 1, decoder
[15:04:26][D][esp_adf_pipeline:358]: pipeline tag 2, i2s_out
[15:04:26][V][esp-idf:000]: I (25343) AUDIO_PIPELINE: link el->rb, el:0x3d81c0f8, tag:http, rb:0x3d81c64c
[15:04:26][V][esp-idf:000]: I (25351) AUDIO_PIPELINE: link el->rb, el:0x3d81c2e8, tag:decoder, rb:0x3d81d68c
[15:04:26][D][esp_adf_pipeline:370]: Setting up event listener.
[15:04:26][D][esp_adf_pipeline:302]: State changed from UNINITIALIZED to PREPARING
[15:04:26][I][adf_media_player:135]: got new pipeline state: 1
[15:04:26][D][adf_i2s_out:127]: Set final i2s settings: 16000
[15:04:26][W][component:237]: Component voice_assistant took a long time for an operation (124 ms).
[15:04:26][W][component:238]: Components should block for at most 30 ms.
[15:04:26][D][voice_assistant:625]: Event Type: 2
[15:04:26][D][voice_assistant:715]: Assist Pipeline ended
[15:04:26][V][esp-idf:000]: I (25423) AUDIO_THREAD: The http task allocate stack on external memory
[15:04:26][V][esp-idf:000]: I (25426) AUDIO_ELEMENT: [http-0x3d81c0f8] Element task created
[15:04:26][V][esp-idf:000]: I (25433) AUDIO_THREAD: The decoder task allocate stack on external memory
[15:04:26][V][esp-idf:000]: I (25443) AUDIO_ELEMENT: [decoder-0x3d81c2e8] Element task created
[15:04:26][V][esp-idf:000][http]: I (25453) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1
[15:04:26][V][esp-idf:000][decoder]: I (25464) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1
[15:04:26][D][esp_aud:000][decoStreamer status: 2
[15:04:26][D][esp_audio_sources:098]: decoder status: 2
[15:04:26][W][component:237]: Component adf_pipeline.media_player took a long time for an operation (61 ms).
[15:04:26][W][component:238]: Components should block for at most 30 ms.
[15:04:26][I][HTTPStreamReader:129]: [ * ] Receive music info from mp3 decoder, sample_rates=24000, bits=16, ch=1
[15:04:26][D][adf_i2s_out:114]: update i2s clk settings: rate:24000 bits:16 ch:1
[15:04:26][V][esp-idf:000]: I (25519) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4
[15:04:26][D][adf_i2s_out:127]: Set final i2s settings: 24000
[15:04:26][V][esp-idf:000][decoder]: W (25536) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT
[15:04:26][V][esp-idf:000][decoder]: W (25546) MP3_DECODER: output aborted -3
[15:04:26][V][esp-idf:000][decoder]: I (25557) MP3_DECODER: Closed
[15:04:26][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[15:04:26][I][adf_media_player:135]: got new pipeline state: 2
[15:04:26][D][adf_i2s_out:127]: Set final i2s settings: 24000
[15:04:26][V][esp-idf:000]: I (25588) AUDIO_ELEMENT: [i2s_out-0x3d81c4b4] Element task created
[15:04:26][V][esp-idf:000]: I (25597) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8424887 Bytes, Inter:174452 Bytes, Dram:174452 Bytes
[15:04:26][V][esp-idf:000][http]: I (25607) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1
[15:04:26][V][esp-idf:000][decoder]: I (25618) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1
[15:04:26][V][esp-idf:000][i2s_out]: I (25628) AUDIO_ELEMENT: [i2s_out] AEL_MSG_CMD_RESUME,state:1
[15:04:26][V][esp-idf:000][i2s_out]: I (25639) I2S_STREAM: AUDIO_STREAM_WRITER
[15:04:26][I][esp_adf_pipeline:214]: [ i2s_out ] status: 12
[15:04:26][D][esp_adf_pipeline:131]: Check element [http] status, 3
[15:04:26][D][esp_adf_pipeline:131]: Check element [decoder] status, 2
[15:04:27][I][esp_adf_pipeline:214]: [ decoder ] status: 12
[15:04:27][D][esp_adf_pipeline:131]: Check element [http] status, 3
[15:04:27][D][esp_adf_pipeline:131]: Check element [decoder] status, 3
[15:04:27][D][esp_adf_pipeline:131]: Check element [i2s_out] status, 3
[15:04:27][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[15:04:27][I][adf_media_player:135]: got new pipeline state: 3
[15:04:27][D][adf_i2s_out:127]: Set final i2s settings: 24000
[15:04:27][I][HTTPStreamReader:129]: [ * ] Receive music info from mp3 decoder, sample_rates=24000, bits=16, ch=1
[15:04:27][D][adf_i2s_out:127]: Set final i2s settings: 24000
[15:04:27][V][esp-idf:000][http]: W (26660) HTTP_STREAM: No more data,errno:0, total_bytes:14221, rlen = 0
[15:04:27][V][esp-idf:000][http]: I (26663) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0
[15:04:27][I][esp_adf_pipeline:214]: [ http ] status: 15
[15:04:27][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[15:04:27][I][adf_media_player:135]: got new pipeline state: 4
[15:04:28][V][esp-idf:000][decoder]: I (27363) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2
[15:04:29][V][esp-idf:000][decoder]: I (27723) MP3_DECODER: Closed
[15:04:29][V][esp-idf:000][i2s_out]: I (27829) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2
[15:04:29][D][esp_adf_pipeline:400]: Called deinit_all
[15:04:29][V][esp-idf:000]: I (28003) AUDIO_PIPELINE: audio_pipeline_unlinked
[15:04:29][V][esp-idf:000]: W (28005) AUDIO_ELEMENT: [http] Element has not create when AUDIO_ELEMENT_TERMINATE
[15:04:29][V][esp-idf:000]: W (28008) AUDIO_ELEMENT: [decoder] Element has not create when AUDIO_ELEMENT_TERMINATE
[15:04:29][V][esp-idf:000]: W (28010) AUDIO_ELEMENT: [i2s_out] Element has not create when AUDIO_ELEMENT_TERMINATE
[15:04:29][V][esp-idf:000]: I (28025) I2S: DMA queue destroyed
[15:04:29][D][esp_adf_pipeline:302]: State changed from STOPPING to UNINITIALIZED
[15:04:29][I][adf_media_player:135]: got new pipeline state: 0
Hi, I’ve also been working on trying to get this working for a few days now.
I’ve recently had a similar issue to you. It wakes once, responds, but then never picks up another wake-word.
I've taken the source that ESPHome generates and pored over the voice_assistant.cpp implementation, and noticed that voice_assistant does not move itself into the IDLE state (accepting subsequent commands) unless the media_player has completed its announcement.
Now, the media_player object that this config builds does not seem to fully support announcements, so it never informs voice_assistant that it has completed.
In voice_assistant.cpp, I changed .set_announce(true) to false, and changed the MEDIA_PLAYER_STATE check from ANNOUNCING to PLAYING.
This way, the voice_assistant pipeline can correctly determine when a response has finished playing and then set itself back to IDLE, expecting another command.
I'm still working on this; I'll update here (and probably make a separate post) once I get an adapted solution working well.
Mmh interesting…
I've also encountered this issue, and I noticed that when the condition is changed to not media_player.is_playing, it does go back to idle, which suggests the on_end event fired and the voice assistant is idle.
After this, another wake word can be detected, and it is, but unfortunately the voice assistant does not (re)start listening and the device basically halts after detecting the wake word.
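For reference, the condition change I mentioned looks roughly like this in the on_end block of the config above (using the adf_media_player id with ESPHome's media_player.is_playing condition):

on_end:
  - if:
      condition:
        - switch.is_on: use_wake_word
      then:
        - wait_until:
            not:
              media_player.is_playing: adf_media_player
        - micro_wake_word.start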
I have this exact issue. I'm trying to build a satellite-style device with a media player for general use as well as the voice assistant. It works on the first try: it detects the wake word, runs the pipeline and executes the command. But the second time it unfortunately does not. In the logs I see that the wake word is detected, but the pipeline is not running.
Has anyone found a workaround for this other than using speaker instead of media_player?
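For context, the speaker-based workaround I mean would drop the adf media_player (and its i2s_out pipeline) and point voice_assistant at a plain I2S speaker instead, roughly like this (untested on my side, and the exact keys may differ between ESPHome versions):

speaker:
  - platform: i2s_audio
    id: va_speaker
    i2s_audio_id: i2s_out
    dac_type: external
    i2s_dout_pin: ${speaker_pin}
    mode: mono

voice_assistant:
  microphone: adf_microphone
  speaker: va_speaker

But then the device loses the general-purpose media player, which is the part I'm trying to keep.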