Am I correct that you’re not running the changes I suggested? You should put them back in. I’m pretty sure you won’t hit that specific issue with them in place, as I haven’t seen it at all since making the change. I also made a different modification for my ESP32-S3-BOX. It is essentially the same as the ESP32 mod, but slightly changed because the S3-BOX is doing the initial wake word detection. Both the ESP32 and ESP32-S3 ran without issue for at least 4 days. I had the S3-BOX first and gave up on it because it would be non-responsive most mornings. I did have an issue with the ESP32 that I describe below.
You were correct about network connectivity being part of the problem with the voice assistant. The voice audio between the satellite and HA is sent over UDP, so if a packet is lost or arrives out of order, the audio gets messed up. I did some ping tests and found that when ping response times were reasonable (less than 10 ms), the audio quality out of the speaker was great. However, if something was delaying packets on the network, I got the stutter issue.
I did have an issue last night where the ping responses from the ESP32 were greater than 1000 ms, with about a 50% packet drop rate on the ping requests. I’m not sure what caused this, but it happened after I made my network modifications, so network activity would have been getting interrupted. I’m going to watch this to see whether the ESP32 gets slower at responding to pings over time. If that happens, I’m going to set up an automation in HA to reboot the satellites in the middle of the night.
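For anyone wanting to try the nightly reboot idea, here is a minimal sketch of such a Home Assistant automation. It assumes each satellite exposes an ESPHome restart switch (a `platform: restart` switch in its YAML). The entity IDs and the 03:30 time are hypothetical placeholders, not values taken from this thread.

```yaml
# Home Assistant automation sketch (configuration.yaml style).
# Entity IDs are hypothetical: replace them with the restart
# switches your own satellites expose.
automation:
  - alias: "Nightly voice satellite reboot"
    trigger:
      - platform: time
        at: "03:30:00"   # arbitrary quiet hour, adjust to taste
    action:
      - service: switch.turn_on
        target:
          entity_id:
            - switch.satellite_kitchen_restart
            - switch.satellite_office_restart
```

A scheduled reboot only hides the symptom. Logging ping times for a few days first would show whether the slowdown really builds up over time.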
Check post #72 further up-thread from @Rich37804, specifically regarding using GPIO20 for your amplifier bit clock. Also be certain the ground between the MAX98357A amplifier and the ESP32-S3 is connected.
My device became very unstable with those changes. Lots of freezing.
I’m going on 3 days with only one issue, as things are now, with 4 assistants running.
Had two false positives tonight, but that’s not surprising, considering the VA sits beneath the television. Just very happy to have my first VA fully assembled and operational this afternoon and evening. Plan is to assemble another two or three this weekend.
Hmm that’s interesting. The changes I posted work for me on the ESP32. I had lots of issues without those changes. These same changes were problematic on the ESP32S3, so I made slightly different changes for better performance on the ESP32S3. Can you post your most recent configuration file for the ESP32, so I can run a test on my system?
For ChatGPT, did you set up the localai implementation or did you just use the online approach? I set up the localai stuff. From the command line I get pretty good responses to general questions. Here’s an example command line query and response:
$ curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "luna",
"messages": [{"role": "user", "content": "Who is batman?"}],
"temperature": 0.5
}'
{"created":1709344133,"object":"chat.completion","id":"0ebf26e2-869e-4dcf-b6ac-413c710fa692","model":"lunademo","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Batman is a superhero that appears in American comic books published by DC Comics. He is the alias of Bruce Wayne, a billionaire playboy who uses his wealth to fight crime in Gotham City."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
When I hook it up to the HA voice assistant pipeline using extended_openai_conversation, I see the question come through in the localai logs. I’m running localai on an AMD 3900X, giving it 22 cores for processing. Responses take about 3 minutes. I can’t seem to get things right so that it will attempt to turn the lights out. I’m not sure if I have a configuration issue, a model issue, or a Docker container issue. The thing I really don’t understand is that the model, which returns good responses to general questions through the command line, simply returns the question I asked as the response when queried through the extended_openai_conversation integration. This is seen in the GUI and in the localai logs.
This link was the best reference I found for setting up localai with an appropriate model. I don’t currently have a CUDA-based graphics card in the server I’m running it on, so I had to slightly modify the docker compose lines. I’m running the full pipeline on one server. This is my full docker-compose file for the pipeline:
version: '3'
services:
  wyoming-whisper:
    image: rhasspy/wyoming-whisper
    ports:
      - "10300:10300"
    volumes:
      - ./whisper-data:/data
    #command: [ "--model", "medium-int8", "--language", "en" ]
    command: [ "--model", "small-int8", "--language", "en" ]
    restart: unless-stopped
  openwakeword:
    container_name: openWakeWord
    image: rhasspy/wyoming-openwakeword
    volumes:
      - ./openwakeword-data:/data
      - ./openwakeword-data:/custom
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    environment:
      - TZ=America/New_York
    #command: [ "--preload-model", "ok_nabu", "--custom-model-dir", "/custom" ]
    #command: --preload-model 'ok_nabu' --custom-model-dir /custom
    #command: --preload-model 'alexa' --custom-model-dir /custom
    command: --preload-model 'hey_jarvis' --custom-model-dir /custom
    restart: unless-stopped
    ports:
      - 10400:10400
  wyoming-piper:
    image: rhasspy/wyoming-piper
    ports:
      - "10200:10200"
    volumes:
      - "./piper-data:/data"
    #command: [ "--voice", "en-gb-southern_english_female-low" ]
    command: [ "--voice", "en_GB-northern_english_male-medium" ]
    #command: [ "--voice", "en_GB-semaine-medium" ]
    restart: unless-stopped
  localai:
    image: quay.io/go-skynet/local-ai:v2.9.0
    ports:
      - 8080:8080
    environment:
      - DEBUG=true
      - MODELS_PATH=/models
    volumes:
      - "./models:/models"
Anyway, that’s a lot of words. I’m really hoping you have localai working and, if so, that you could share which model you’re using and any configuration you’ve used to make it work.
Only problem is, too many of the AITRIP ESP32-S3-WROOM-1-N16R8 boards I received simply do not work reliably. I cannot recommend this board, based on my experience with two different orders of three boards each.
Mine have been purring like kittens for a few days now. I only have 2 running at the moment because I have replaced the other 2 with Wyoming satellites.
No errors/issues from the 2 esp32 satellites in a few days now.
Are you running micro wake word on yours?
No, I am not. I see no need for that in my configuration.
I was just wondering, as the S3 I am trying to get micro wake word running on is causing crackling on the speaker, as I said earlier. I have had 2 ESP32 WROOM boards running without micro wake word for a while.
It just seems to be my S3 boards which are causing issues.
The way I troubleshoot that is to disconnect one wire at a time until the crackling goes away. Don’t unplug any ground or positive lines. Unplug one wire; if the crackling stays, plug it back in. If you unplug one and the crackling stops, reassign that output/input to a different GPIO. On one of my S3 boards it was a line running to the microphone causing it.
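As a rough illustration of the "reassign to a different GPIO" step, assuming the pins are kept in an ESPHome `substitutions:` block: the key name `mic_pin` mirrors the configuration posted further down the thread, and the pin numbers are examples only, not a recommendation for any particular board.

```yaml
substitutions:
  # Before: microphone data line on the pin that was picking up noise
  # mic_pin: GPIO4
  # After: the same wire moved to another free GPIO (GPIO5 is only an example;
  # check your board's pinout for pins reserved for flash, PSRAM or strapping)
  mic_pin: GPIO5
```

Changing one pin per rebuild keeps it clear which move actually stopped the crackling.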
The microphone is responding fine; it’s only when the speaker plays the response that it crackles. Sometimes one word is intelligible, but not often. I have tried various pins for the speaker with no luck.
I had the same problem, I managed to solve it by changing the speaker ground pin.
Will try that, as I have not tried that yet. Thanks for the tip.
This configuration is working well for me so far with micro_wake_word enabled on the esp32-s3-devkitc-1. I’m also using the on-board ws2812 LED for visual feedback. I’ve modified the yaml from here to suit the s3-board.
I’m using this board (ESP32-S3 N16R8). If you go for this one, be aware that you’ll have to solder the LED pads for pin 48 to make the LED work, and that the 5V pin is not connected, so it cannot power the Max98357. I have not had an issue so far powering the amplifier from 3.3V, as that is supported.
The one issue I am getting, and I was getting this with my esp32-devkit boards, is the voice/speaker jitter/stutter. Not sure if this is common across all implementations of VA on esp32s. This is an intermittent issue and I’m guessing it is CPU load related.
esphome:
  name: s3test
  friendly_name: S3Test
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    priority: 600
    then:
      # Run the script to refresh the LED status
      - script.execute: control_led
      # - output.turn_off: set_low_speaker
      # If after 30 seconds, the device is still initializing (It did not yet connect to Home Assistant), turn off the init_in_progress variable and run the script to refresh the LED status
      - delay: 30s
      - if:
          condition:
            lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
            - script.execute: control_led
esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  framework:
    type: esp-idf
    components:
      - name: esphome_board
        source: github://jesserockz/esphome-esp-adf-board@main
        refresh: 0s
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
psram:
  mode: octal
  speed: 80MHz
# Enable logging
logger:
ota:
  password: "<redacted>"
# Enable Home Assistant API
api:
  encryption:
    key: "<redacted>"
  # If the device connects, or disconnects, to Home Assistant: Run the script to refresh the LED status
  on_client_connected:
    - script.execute: control_led
  on_client_disconnected:
    - script.execute: control_led
wifi:
  ssid: !secret tp_wifi_ssid
  password: !secret tp_wifi_password
  # If the device connects, or disconnects, to the Wifi: Run the script to refresh the LED status
  on_connect:
    - script.execute: control_led
  on_disconnect:
    - script.execute: control_led
substitutions:
  # Phases of the Voice Assistant
  # IDLE: The voice assistant is ready to be triggered by a wake-word
  voice_assist_idle_phase_id: '1'
  # LISTENING: The voice assistant is ready to listen to a voice command (after being triggered by the wake word)
  voice_assist_listening_phase_id: '2'
  # THINKING: The voice assistant is currently processing the command
  voice_assist_thinking_phase_id: '3'
  # REPLYING: The voice assistant is replying to the command
  voice_assist_replying_phase_id: '4'
  # NOT_READY: The voice assistant is not ready
  voice_assist_not_ready_phase_id: '10'
  # ERROR: The voice assistant encountered an error
  voice_assist_error_phase_id: '11'
  # MUTED: The voice assistant is muted and will not reply to a wake-word
  voice_assist_muted_phase_id: '12'
  #pins
  i2s_out_lrclk_pin: GPIO6 # LRC on Max98357
  i2s_out_bclk_pin: GPIO7 # BCLK on Max98357
  i2s_in_lrclk_pin: GPIO3 # WS on INMP441
  i2s_in_bclk_pin: GPIO2 # SCK on INMP441
  light_pin: GPIO48 # on-board LED
  speaker_pin: GPIO8 # DIN on Max98357
  mic_pin: GPIO4 # SD on INMP441
globals:
  # Global initialisation variable. Initialized to true and set to false once everything is connected. Only used to have a smooth "plugging" experience
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  # Global variable tracking the phase of the voice assistant (defined above). Initialized to not_ready
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}
light:
  - platform: esp32_rmt_led_strip
    rgb_order: GRB
    pin: ${light_pin}
    num_leds: 1
    rmt_channel: 0
    chipset: WS2812
    name: "Status LED"
    id: led
    disabled_by_default: True
    # entity_category: diagnostic
    icon: mdi:led-on
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%
script:
  # Master script controlling the LED, based on different conditions: initialization in progress, wifi and API connected, and the current voice assistant phase.
  # For the sake of simplicity and re-usability, the script calls child scripts defined below.
  # This script will be called every time one of these conditions is changing.
  - id: control_led
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  wifi.connected:
                then:
                  - if:
                      condition:
                        api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(control_led_voice_assist_listening_phase).execute();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(control_led_voice_assist_thinking_phase).execute();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(control_led_voice_assist_replying_phase).execute();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(control_led_voice_assist_error_phase).execute();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(control_led_voice_assist_muted_phase).execute();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(control_led_voice_assist_not_ready_phase).execute();
                                break;
                              default:
                                id(control_led_voice_assist_idle_phase).execute();
                                break;
                            }
                      else:
                        - script.execute: control_led_no_ha_connection_state
                else:
                  - script.execute: control_led_no_ha_connection_state
          else:
            - script.execute: control_led_init_state
  # Script executed during initialisation: In this example: Turn the LED in green with a fast pulse 🟢
  - id: control_led_init_state
    then:
      - light.turn_on:
          id: led
          blue: 0%
          red: 0%
          green: 100%
          effect: "Fast Pulse"
  # Script executed when the device has no connection to Home Assistant: In this example: Turn off the LED
  - id: control_led_no_ha_connection_state
    then:
      - light.turn_off:
          id: led
  # Script executed when the voice assistant is idle (waiting for a wake word): In this example: Turn the LED in white with 20% of brightness ⚪
  - id: control_led_voice_assist_idle_phase
    then:
      - light.turn_on:
          id: led
          blue: 100%
          red: 100%
          green: 100%
          brightness: 20%
          effect: "none"
  # Script executed when the voice assistant is listening to a command: In this example: Turn the LED in blue with a slow pulse 🔵
  - id: control_led_voice_assist_listening_phase
    then:
      - light.turn_on:
          id: led
          blue: 100%
          red: 0%
          green: 0%
          effect: "Slow Pulse"
  # Script executed when the voice assistant is processing the command: In this example: Turn the LED in blue with a fast pulse 🔵
  - id: control_led_voice_assist_thinking_phase
    then:
      - light.turn_on:
          id: led
          blue: 100%
          red: 0%
          green: 0%
          effect: "Fast Pulse"
  # Script executed when the voice assistant is replying to a command: In this example: Turn the LED in blue, solid (no pulse) 🔵
  - id: control_led_voice_assist_replying_phase
    then:
      - light.turn_on:
          id: led
          blue: 100%
          red: 0%
          green: 0%
          brightness: 100%
          effect: "none"
  # Script executed when the voice assistant encounters an error: In this example: Turn the LED in red, solid (no pulse) 🔴
  - id: control_led_voice_assist_error_phase
    then:
      - light.turn_on:
          id: led
          blue: 0%
          red: 100%
          green: 0%
          brightness: 100%
          effect: "none"
  # Script executed when the voice assistant is muted: In this example: Turn off the LED
  - id: control_led_voice_assist_muted_phase
    then:
      - light.turn_off:
          id: led
  # Script executed when the voice assistant is not ready: In this example: Turn off the LED
  - id: control_led_voice_assist_not_ready_phase
    then:
      - light.turn_off:
          id: led
# This is how to include the Espressif Audio Development Framework.
# This is needed to be able to use VAD (Voice Activity Detection) and prevent the voice assistant from constantly streaming audio to Home Assistant
# For now, this component is not documented, nor in the code base of ESPHome, hence the reference to the external component.
esp_adf:
external_components:
  - source: github://pr#5230
    components:
      - esp_adf
    refresh: 0s
# Declaration of the switch that will be used to turn on or off (mute) our voice assistant
switch:
  #system
  - platform: restart
    name: Restart
    id: restart_switch
  - platform: template
    name: Enable Voice Assistant
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    icon: mdi:assistant
    # When the switch is turned on (on Home Assistant):
    # Start the voice assistant component
    # Set the correct phase and run the script to refresh the LED status
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - micro_wake_word.start
            - script.execute: control_led
    # When the switch is turned off (on Home Assistant):
    # Stop the voice assistant component
    # Set the correct phase and run the script to refresh the LED status
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - micro_wake_word.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: control_led
# These are our two i2s buses with the correct pins.
# You can refer to the wiring diagram of our voice assistant for more details
i2s_audio:
  - id: i2s_out
    i2s_lrclk_pin: ${i2s_out_lrclk_pin}
    i2s_bclk_pin: ${i2s_out_bclk_pin}
  - id: i2s_in
    i2s_lrclk_pin: ${i2s_in_lrclk_pin}
    i2s_bclk_pin: ${i2s_in_bclk_pin}
# This is the declaration of our microphone.
# It includes the data pin (You can refer to the wiring diagram of our voice assistant for more details)
# It references the correct i2s bus declared above.
microphone:
  platform: i2s_audio
  id: external_microphone
  adc_type: external
  i2s_audio_id: i2s_in
  i2s_din_pin: ${mic_pin}
  channel: left
  pdm: false
# This is the declaration of our speaker.
# It includes the data pin (You can refer to the wiring diagram of our voice assistant for more details)
# It references the correct i2s bus declared above.
# output:
#   - platform: gpio
#     pin:
#       number: ${speaker_pin}
#       allow_other_uses: true
#     id: set_low_speaker
speaker:
  platform: i2s_audio
  id: external_speaker
  dac_type: external
  i2s_audio_id: i2s_out
  i2s_dout_pin:
    number: ${speaker_pin}
    # allow_other_uses: true
micro_wake_word:
  model: okay_nabu
  on_wake_word_detected:
    then:
      - voice_assistant.start:
# This is the declaration of our voice assistant
# It references the microphone and speaker declared above.
voice_assistant:
  id: va
  microphone: external_microphone
  speaker: external_speaker
  # use_wake_word: true
  # This is how I personally tune my voice assistant, you may have to test a few values for the 4 parameters below
  noise_suppression_level: 4
  auto_gain: 31dBFS
  volume_multiplier: 8.0
  vad_threshold: 3
  # When the voice assistant connects to HA:
  # Set init_in_progress to false (Initialization is over).
  # If the switch is on, start the voice assistant
  # In any case: Set the correct phase and run the script to refresh the LED status
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - micro_wake_word.start
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - script.execute: control_led
  # When the voice assistant disconnects from HA:
  # Stop the voice assistant
  # Set the correct phase and run the script to refresh the LED status
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - micro_wake_word.stop
    - script.execute: control_led
  # When the voice assistant starts to listen: Set the correct phase and run the script to refresh the LED status
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: control_led
  # When the voice assistant starts to think: Set the correct phase and run the script to refresh the LED status
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: control_led
  # When the voice assistant starts to reply: Set the correct phase and run the script to refresh the LED status
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: control_led
  on_end:
    - if:
        condition:
          - switch.is_on: use_wake_word
        then:
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start
  # When the voice assistant has finished replying: Set the correct phase and run the script to refresh the LED status
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: control_led
  # When the voice assistant encounters an error:
  # Set the error phase and run the script to refresh the LED status
  # Wait 1 second and set the correct phase (idle or muted depending on the state of the switch) and run the script to refresh the LED status
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: control_led
          - delay: 1s
          - if:
              condition:
                switch.is_on: use_wake_word
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: control_led
VAD will always say that, so there is no benefit in having it in the config. esp-adf only works with the S3 Box 3. You can safely remove esp_adf:, the external component, and vad_threshold:
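A rough sketch of what the trimmed section might look like after following that advice, assuming the rest of the posted configuration (IDs, triggers, pins) stays unchanged. This is an illustration of the suggested removal, not a tested configuration.

```yaml
# Removed entirely, per the suggestion above:
#   esp_adf:
#   external_components:   (the github://pr#5230 entry providing esp_adf)

voice_assistant:
  id: va
  microphone: external_microphone
  speaker: external_speaker
  noise_suppression_level: 4
  auto_gain: 31dBFS
  volume_multiplier: 8.0
  # vad_threshold: 3   # dropped along with esp_adf
  # ... the on_client_connected, on_listening, on_error, etc. triggers stay as posted
```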
Stutter shouldn’t be an issue on an S3. Make sure that your Wi-Fi connection is good, and also that your HA instance isn’t under load (check the hardware monitor whilst issuing a voice command). If you have other components installed on the same board, such as web_server:, try disabling them.
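One way to keep an eye on the Wi-Fi side is ESPHome's `wifi_signal` sensor, which reports the satellite's RSSI to Home Assistant. The sketch below assumes it is added to the same device YAML; the sensor name and update interval are arbitrary choices.

```yaml
sensor:
  # Reports Wi-Fi RSSI in dBm so weak or fluctuating links show up in HA history
  - platform: wifi_signal
    name: "WiFi Signal"
    update_interval: 60s

# Keep optional extras such as the web server disabled while chasing stutter:
# web_server:
#   port: 80
```

Comparing the signal history against the times the audio stuttered helps separate network problems from load on the HA host.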
Thank you, I’ll try that with the second VA I assembled today. (Apparently, I found one of the few good S3 boards in my recent shipments.) NOPE! esp_adf is needed.
-- Building ESP-IDF components for target esp32s3
-- Configuring incomplete, errors occurred!
See also "/data/build/vasst-sunroom/.pioenvs/vasst-sunroom/CMakeFiles/CMakeOutput.log".
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
CMake Error at /data/cache/platformio/packages/framework-espidf/tools/cmake/build.cmake:201 (message):
Failed to resolve component 'audio_sal'.
Call Stack (most recent call first):
/data/cache/platformio/packages/framework-espidf/tools/cmake/build.cmake:241 (__build_resolve_and_add_req)
/data/cache/platformio/packages/framework-espidf/tools/cmake/build.cmake:518 (__build_expand_requirements)
/data/cache/platformio/packages/framework-espidf/tools/cmake/project.cmake:476 (idf_build_process)
CMakeLists.txt:3 (project)
========================== [FAILED] Took 4.08 seconds ==========================
FYI, I’ve found the Adafruit SPH0645 MEMS microphone to be a MUCH better device to connect, configure, and physically install. It has pins in a straight line, instead of all around a circle. Also, if you leave the SEL pin unconnected, it defaults to Left, so there is no need to ground it.
Do a Clean Build Files, then try the install again. You have to clean the build files after adding or removing a component when using the esp-idf framework.