@Alextrical did you ever print a case for your korvo 1.1? If so, mind sharing updates and/or print files?
I’m still using that older model posted. Printed in a bluish PETG-CF combo. Took apart an old Alexa. It just fits, but there’s no way to route the 3.5mm jack anywhere and I didn’t feel like dealing with that. That ribbon cable freaks me out sometimes. Literally the worst place for it on the Korvo-1 (I don’t think there’s a huge hardware layout difference between the two, other than the -1 having 2 micro USB ports, one for power and one for UART). That Alexa had a surprisingly well-built speaker that I’ll be snagging. I should know how to cut a hole for the 3.5mm audio jack in a slicer but haven’t figured it out yet.
I thought you had to allow the ESPHome device to make HA service calls. If you go to devices, then ESPHome where the list shows up and each device has a “configure” option, you get the below
ESPHome devices can make service calls to any Home Assistant service. This functionality is not enabled by default for newly configured device, but can be turned on the options flow on a per device basis.
Yup, has to be done but then you could use one of the 2 below services to output audio, or should be able to.
e.io/components/api.html#homeassistant-service-action
Yep, that will allow the communication between ESPHome and HA, but you also need to configure on_tts_end to pipe the WAV to a media player, which is what I was talking about.
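For anyone following along, here is a minimal sketch of what that on_tts_end wiring can look like. It mirrors the trigger used in the full config posted later in this thread. The entity `media_player.sound_bar` is just the player from that config, so swap in your own, and the device still has to be allowed to make HA service calls as described above.

```yaml
voice_assistant:
  # microphone, speaker and the other options are omitted from this sketch
  on_tts_end:
    # x holds the URL of the TTS audio generated by Home Assistant
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.sound_bar  # assumption: replace with your own media player
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
```

With this in place the reply is played by the external media player rather than the board's own speaker.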
Okay, that makes total sense now. I was just a bit confused. I don’t think you used to have to specifically allow a device to make HA service calls from ESPHome. I think that might have been added at some point but if/when it was I’m not sure.
Did anyone else have any issues with the latest core update? All my voice assistants, with or without microWakeWord, just stopped working. This included the Korvo-1, a USB speakerphone hooked directly to the HA server using Assist Microphone and openWakeWord, and an Android phone where I have to long press the power button, so no wake word is involved. They all reply but nothing happens. For example, when turning a light on it says “okay” but it doesn’t trigger the action.
I always take a full backup before a core upgrade, so restoring is the solution for now. Lastly, has anyone tried the below board? It seems like a Korvo V1.1 clone but doesn’t use a ribbon cable. There is a male pin connector on the bottom board and a female connector on the bottom of the mic array board. While it only has 4MB ROM, that’s still more than enough for me. It does have 8MB of external PSRAM. Seems like a bargain for the price.
ESP32-LyraTD-MSC Development Board ESP32 WiFi Bluetooth-compatible Audio Module ESP32-WROVER-E 4MB Flash 8MB P-SRAM
https://a.aliexpress.com/_mK98axG
My devices are all working currently, although I’m just using 2 box-s3’s. I haven’t plugged my Korvo 1.1’s in recently as I was hoping for things to harden and for someone smarter than me to figure out the secret sauce. I’ve heard a few folks talking about the Lyra boards who have had some success with them. The pin connector seems like such a better design than the ribbon cables for sure.
Edit: OK, so I tested my boxes after updating HA and ESPHome, and what I’m seeing is the wake word not stopping after executing an action. It’s just stuck listening until I ask it again to cancel. Then it goes back to idle.
Hmm. The original issue was some old deprecated recorder stuff in my configuration.yaml; once that was sorted everything started working, although some things have been flaky. Might roll back. Just an FYI, I2S/TTS doesn’t work with esp_adf, which to my knowledge is required for microWakeWord. I tried the below (per the docs and the HA forums) to actually send audio to an external speaker. Once I got the config correct it instantly went red. Hovering over it, it said “i2s_audio doesn’t work with esp_adf” or something similar, and per the docs you now have to install the i2s_audio component to play audio on an external speaker.
Today’s release of Music Assistant 2.0 has me even more stoked for this project! It would be awesome if the device ends up being supported as a player by Music Assistant.
Yep you’re right. I meant openWakeWord vs microWakeWord. My bad.
Hi, I’ve recently completed this:
It needs some minor tweaks but will be uploaded here shortly: GitHub - joey-90/ESP32-S3-Korvo-1---Voice-Assistant: Yaml configuration for ESP Home using an Espressif ESP32-S3-KORVO-1 board.
Looks very nice, looking forward to seeing construction details.
Where did you get that case? Did you design it?
I am trying to understand this older YAML and how it functions, but as far as I can see you have to flip the switch to start continuous conversation and flip it back to off again when you stop. Am I right about this?
Hello. Are you also experiencing issues with the LED light on the Korvo? I’ve tried different versions of the codes, and I’ve varied the GPIO on 18, 19, and 33, but I can’t control the color; the lighting effects work, but I haven’t managed to get them all in the same color. Am I doing something wrong, or is it an issue with the board?
Yes, it makes it pretty much useless. I am using updated code. Below is what I am currently using; it also needs to be updated but works for now.
The GPIO for all lights is 19. It should work per the docs, but the lights are very pesky about being told to turn on/off.
substitutions:
  name: korvo
  friendly_name: korvo
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
  voice_assist_timer_finished_phase_id: "20"
  micro_wake_word_model: okay_nabu
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  #name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  # project:
  #   name: esphome.voice-assistant
  #   version: "1.0"
  #min_version: 2023.11.5
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            red: 0%
            blue: 0%
            green: 100%
            brightness: 100%
            effect: random
        - delay: 30s
        - if:
            condition:
              - lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;
esp32:
  board: esp32s3box
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
      CONFIG_ESP32_S3_KORVO1_BOARD: "y"
    components:
      - name: esp32_korvo1_board #esp32_s3_korvo1_board for the s3 variant and really you should be able to name this anything
        source: github://abmantis/esphome_custom_audio_boards@main #s3_korvo_1 for the s3 variant
        refresh: 0s
psram:
  mode: octal
  speed: 80MHz
external_components:
  - source: github://pr#5230
    components: esp_adf
    refresh: 0s
  - source: github://jesserockz/esphome-components
    components: [file]
    refresh: 0s
# Enable logging
ota:
  - platform: esphome
    id: my_ota
    password: "OTA"
logger:
api:
  encryption:
    key: API
text_sensor:
  - platform: wifi_info
    ip_address:
      name: "${friendly_name} IP Address"
time:
  platform: homeassistant
  id: homeassistant_time
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  use_address: 192.168.0.48
  on_connect:
    then:
      - delay: 20ms # Gives time for improv results to be transmitted
      - ble.disable:
      - delay: 5s
  on_disconnect:
    then:
      - ble.enable:
  ap:
    ssid: "Korvo Fallback Hotspot"
    password: "HS"
improv_serial:
esp32_improv:
  authorizer: none
#captive_portal:
esp_adf:
  board: esp32s3box3
speaker:
  - platform: esp_adf
    id: box_speaker
microphone:
  - platform: esp_adf
    id: box_mic
micro_wake_word:
  models: ${micro_wake_word_model}
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
voice_assistant:
  id: va
  microphone: box_mic
  speaker: box_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  vad_threshold: 3
  use_wake_word: true
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
    - script.execute: reset_led
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
    - script.execute: reset_led
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
    - script.execute: reset_led
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.sound_bar
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
    - script.execute: reset_led
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: reset_led
  on_end:
    - if:
        condition:
          and:
            - switch.is_off: mute
            - lambda: return id(wake_word_engine_location).state == "On device";
            - lambda: return id(voice_assistant_phase) != ${voice_assist_timer_finished_phase_id};
        then:
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: reset_led
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: reset_led
  on_client_connected:
    - wait_until:
        not: ble.enabled
    - lambda: id(init_in_progress) = false;
    - script.execute: start_voice_assistant
    - script.execute: reset_led
  on_client_disconnected:
    - script.execute: stop_voice_assistant
    - script.execute: reset_led
  on_timer_started:
    - script.execute: reset_led
  on_timer_cancelled:
    - script.execute: reset_led
  on_timer_updated:
    - script.execute: reset_led
  on_timer_tick:
    - script.execute: reset_led
  on_timer_finished:
    - script.execute: stop_voice_assistant
    - lambda: id(voice_assistant_phase) = ${voice_assist_timer_finished_phase_id};
    - switch.turn_on: timer_ringing
    - script.execute: reset_led
    - wait_until:
        not:
          microphone.is_capturing:
    - while:
        condition:
          switch.is_on: timer_ringing
        then:
          - lambda: id(box_speaker).play(id(timer_finished_wave_file), sizeof(id(timer_finished_wave_file)));
          #- homeassistant.service:
          #    service: media_player.play_media
          #    data:
          #      entity_id: media_player.sound_bar
          #      media_content_id: timer_finished_wave_file
          #      media_content_type: music
          #      announce: "true"
          - delay: 1s
    - wait_until:
        not:
          speaker.is_playing:
    - switch.turn_off: timer_ringing
    - script.execute: start_voice_assistant
    - script.execute: reset_led
script:
  - id: reset_led
    then:
      - if:
          condition:
            # - switch.is_on: use_wake_word
            - switch.is_on: use_listen_light
            - switch.is_off: night_mode
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 80%
                effect: none
          else:
            - if:
                condition:
                  # - switch.is_on: use_wake_word
                  - switch.is_on: use_listen_light
                  - switch.is_on: night_mode
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 0%
                      green: 0%
                      brightness: 30%
                      effect: none
                else:
                  - light.turn_off: led_ring
  - id: fetch_first_active_timer
    then:
      - lambda: |
          const auto timers = id(va).get_timers();
          auto output_timer = timers.begin()->second;
          for (auto &iterable_timer : timers) {
            if (iterable_timer.second.is_active && iterable_timer.second.seconds_left <= output_timer.seconds_left) {
              output_timer = iterable_timer.second;
            }
          }
          id(global_first_active_timer) = output_timer;
  - id: check_if_timers_active
    then:
      - lambda: |
          const auto timers = id(va).get_timers();
          bool output = false;
          if (timers.size() > 0) {
            for (auto &iterable_timer : timers) {
              if(iterable_timer.second.is_active) {
                output = true;
              }
            }
          }
          id(global_is_timer_active) = output;
  - id: fetch_first_timer
    then:
      - lambda: |
          const auto timers = id(va).get_timers();
          auto output_timer = timers.begin()->second;
          for (auto &iterable_timer : timers) {
            if (iterable_timer.second.seconds_left <= output_timer.seconds_left) {
              output_timer = iterable_timer.second;
            }
          }
          id(global_first_timer) = output_timer;
  - id: check_if_timers
    then:
      - lambda: |
          const auto timers = id(va).get_timers();
          bool output = false;
          if (timers.size() > 0) {
            output = true;
          }
          id(global_is_timer) = output;
  - id: start_voice_assistant
    then:
      - if:
          condition:
            switch.is_off: mute
          then:
            - if:
                condition:
                  lambda: return id(wake_word_engine_location).state == "In Home Assistant";
                then:
                  - lambda: id(va).set_use_wake_word(true);
                  - voice_assistant.start_continuous:
            - if:
                condition:
                  lambda: return id(wake_word_engine_location).state == "On device";
                then:
                  - lambda: id(va).set_use_wake_word(false);
                  - micro_wake_word.start
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          else:
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  - id: stop_voice_assistant
    then:
      - if:
          condition:
            lambda: return id(wake_word_engine_location).state == "In Home Assistant";
          then:
            - lambda: id(va).set_use_wake_word(false);
            - voice_assistant.stop:
      - if:
          condition:
            lambda: return id(wake_word_engine_location).state == "On device";
          then:
            - voice_assistant.stop:
            - micro_wake_word.stop:
      - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - wait_until:
                lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
            - if:
                condition:
                  lambda: return x == "In Home Assistant";
                then:
                  - micro_wake_word.stop
                  - delay: 500ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - lambda: id(va).set_use_wake_word(true);
                        - voice_assistant.start_continuous:
            - if:
                condition:
                  lambda: return x == "On device";
                then:
                  - lambda: id(va).set_use_wake_word(false);
                  - voice_assistant.stop
                  - delay: 500ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - micro_wake_word.start
globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}
  - id: global_first_active_timer
    type: voice_assistant::Timer
    restore_value: false
  - id: global_is_timer_active
    type: bool
    restore_value: false
  - id: global_first_timer
    type: voice_assistant::Timer
    restore_value: false
  - id: global_is_timer
    type: bool
    restore_value: false
switch:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO38
    name: "${friendly_name} Speaker Mute"
    restore_mode: ALWAYS_ON
  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            - lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - if:
                      condition:
                        lambda: return id(wake_word_engine_location).state == "In Home Assistant";
                      then:
                        - lambda: id(va).set_use_wake_word(true);
                        - voice_assistant.start_continuous
                  - if:
                      condition:
                        lambda: return id(wake_word_engine_location).state == "On device";
                      then:
                        - lambda: id(va).set_use_wake_word(false);
                        - micro_wake_word.start
            - script.execute: reset_led
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(va).set_use_wake_word(false);
            - voice_assistant.stop
            - micro_wake_word.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: reset_led
  - platform: template
    name: Use Listen Light
    id: use_listen_light
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - script.execute: reset_led
    on_turn_off:
      - script.execute: reset_led
  - platform: template
    id: timer_ringing
    optimistic: true
    internal: true
    restore_mode: ALWAYS_OFF
    on_turn_on:
      - delay: 15min
      - switch.turn_off: timer_ringing
  #night mode switch
  - platform: template
    name: Night Mode Switch
    id: night_mode
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - script.execute: reset_led
    on_turn_off:
      - script.execute: reset_led
light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 40ms
          reverse: false
      - addressable_color_wipe:
          name: "Power"
          colors:
            - red: 100%
              green: 0%
              blue: 0%
              num_leds: 12
          add_led_interval: 50ms
          reverse: false
binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    publish_initial_state: true
file:
  - id: timer_finished_wave_file
    file: https://github.com/esphome/firmware/raw/main/voice-assistant/sounds/timer_finished.wav
sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 8
    attenuation: 12db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF
I want to implement the below on one of the buttons, like on the S3 box touchscreen. It has an icon you can hold and then let go of when you want it to stop listening. That is effective when background noise is present.
- platform: touchscreen
  page_id: idle_page
  id: control_6
  internal: true
  x_min: 215
  x_max: 315
  y_min: 175
  y_max: 240
  on_press:
    then:
      - micro_wake_word.stop
      - delay: 50ms
      - lambda: id(va).set_use_wake_word(false);
      - delay: 50ms
      - voice_assistant.stop:
      - delay: 50ms
      - voice_assistant.start
      - display.page.show: listening_page
      - component.update: s3_box_lcd
  on_release:
    - delay: 100ms
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "In Home Assistant";
        then:
          - voice_assistant.stop:
          - lambda: id(va).set_use_wake_word(true);
          - delay: 10ms
          - voice_assistant.start_continuous:
        else:
          - voice_assistant.stop:
          - delay: 100ms
          - micro_wake_word.start
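One way this might be carried over to the Korvo is to hang the same on_press / on_release actions on one of the template button sensors from the config above. The following is an untested sketch, not a known-working config: it assumes the `btn_record` sensor, the `va` voice assistant and the `wake_word_engine_location` select all exist as posted, and it drops the two display actions because the Korvo has no screen. Picking the Record button is my own choice; any of the six buttons should behave the same way.

```yaml
binary_sensor:
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    publish_initial_state: true
    # Hold the button to listen (push-to-talk), bypassing the wake word
    on_press:
      - micro_wake_word.stop:
      - delay: 50ms
      - lambda: id(va).set_use_wake_word(false);
      - delay: 50ms
      - voice_assistant.stop:
      - delay: 50ms
      - voice_assistant.start:
    # Let go to stop listening and return to wake word detection
    on_release:
      - delay: 100ms
      - if:
          condition:
            lambda: return id(wake_word_engine_location).state == "In Home Assistant";
          then:
            - voice_assistant.stop:
            - lambda: id(va).set_use_wake_word(true);
            - delay: 10ms
            - voice_assistant.start_continuous:
          else:
            - voice_assistant.stop:
            - delay: 100ms
            - micro_wake_word.start:
```

Since the press and release states come from the ADC ladder in the `sensor:` block, how responsive this feels will depend on the `update_interval` and the median filter settings there.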
Well, your code got me thinking and tinkering, and I got it working in conversation mode without using a button. Just start with the wake word and have a conversation. At the end it automatically reverts back to listening for the wake word. Of course single commands are still possible. At the moment I am testing the code for a few days. While I am happy with it, I still have an issue with long responses. For me the time between receiving the response as seen in the log and the voice actually speaking it out loud takes too long. I have a few ideas to fix it but it will take some time to implement and test them. After that I will post the YAML.
I wouldn’t tinker too much. Apparently going forward ESP-ADF is no longer needed now that these new boards are coming out with XMOS chips for voice/echo cancellation. The firmware/software for ESP32 via I2S is apparently very new and changing. I have a Seeed ReSpeaker kit; don’t get it, wait until Nabu comes out with something. The listen light didn’t work. Seeed actually released a firmware update that fixed it, but I got access denied when downloading from their wiki, so someone put it on GitHub. The YAML code is, well, different in some areas and similar in others. All Nabu has announced is that it will have a supposedly better XMOS chip and 3.5mm input and output for another mic, but don’t quote me on that. You know Nabu Casa will use it as their main voice testing device and do more updates than, say, Seeed and others. I imagine trying to perfect the YAML now is pretty pointless considering a lot might change in the next six months or more.
I still use Nabu Cloud because it’s better than local and I’m running HA full OS on a 3 year old NUC. They did work with Nvidia to port everything to be GPU based, but I’m not buying a Jetson module just for local voice commands. I get why they did it. It’s kinda like a Raspberry Pi, just 10x the price, and more for what’s recommended, as the 8GB model has RAM issues in that it doesn’t have enough… An LLM is neat but not needed by me personally. I’m not a developer, but there are obvious changes in the YAML I use for the ReSpeaker Lite vs the Korvo-1 or S3 Box, which also uses a newer wake word that does actually work better and can be used on anything using microWakeWord.
Dedicated thread
Thanks for the update. I am aware of the upcoming HA voice box and I am looking forward to it. However, I do not mind tinkering that much. For me it is about getting experience with the software and hardware. Until the XMOS hardware from HA is available I am using my own ESP32 voice pucks, which are already working quite well for me but are still far behind the Google Home hardware. I am also running HA OS on an older NUC, but I am experimenting with a local LLM on a separate PC. It will make giving commands easier, and HA understands them much more easily too. My wife doesn’t have to remember the exact words anymore to get things done. For me that is a big plus. Besides that, it is a much more natural way of controlling the house by voice.
Honestly, my Wyoming satellite is probably the most accurate, with the Seeed and S3 box being a close 2nd, the S3 probably beating out the Seeed due to it being the main “device” the HA team focused on for ESP32. Amazon and Google lost billions on voice. In the race to win they realized too late that you couldn’t put in ads after it had been free. I do know that newer hardware is more powerful and does more on device, which you can see in the price of Alexa, and even now we have no idea how much either leverages cloud resources to be as accurate as they are. Honestly, background noise, especially TV, is the main issue, and it has to be hard to determine what’s your voice and what’s not. Music with no lyrics does pretty well, but trying to use it while watching TV is a no go. Still, it’s a huge accomplishment by Nabu/HA to get this far with something as low powered as an ESP32 being the end device listening for the wake word. They will get there. It’s just going to be gradual. Some people want overnight results and that’s not going to happen. They originally tried to just do the LLM on the Jetson with HA separate, but Nabu got fed up and just ported HA Core to the Jetson. It is technically just running a specialized version of Ubuntu for ARM, with all the voice stuff being Docker containers. It could honestly be just as bad with background voices/noises anyway, so wait and see. Really just looking forward to whatever Nabu announces at this point. Will continue to tinker as always in the meantime.