I can get the speaker mount to match, though out of curiosity do you have a link to the speaker you are using? A make/model, or at the least the size and wattage of the speaker?
I’ve yet to connect and test it (and I see others have had problems getting audio to work), but I found that this model fits physically:
https://www.aliexpress.com/item/1005003717211529.html
@aherbjornsen I’ve got a speaker on order from AliExpress that got some good reviews
(No technical data sheets, but I don’t feel like paying for a reputable supplier while testing)
I expect I will eventually go for Visaton FRS 5 X units, once the project has its kinks ironed out and can play music.
Fabric for covering the unit has been ordered too. This stuff in grey should look nice and hopefully be photogenic (black would look nicer IRL, but likely wouldn’t photograph well).
I got access to the ESP32-Korvo V1.1 and ESP32-LyraTD-MSC dev boards today. Now to look at getting the audio working (if the ESP32 Wrover is a fast enough chip for ESPHome).
Work will begin on getting the aesthetic enclosure closer to being released
This is so exciting! Like I commented over in Egg Voice Assistant, I’ve been dreaming of a smart speaker system with pretty much exactly the same criteria you outlined in your post—far field, LED ring display, can play music and sound good. I hadn’t considered a ceiling mount, though—what a neat bonus!
I’m definitely interested in this project and will be keeping my eyes on it.
Thank you for your interest, I’m collaborating with a few other smart speaker creators over on the ESPHome and Willow Discord server, to get the schematic put together to meet those goals.
I’m aiming to get the first draft of the KiCad schematics posted to GitHub later today, under the LibreTalk GitHub organization.
My apologies, as this seems to be focusing a lot on enclosures and the like, without a ton of talk on the setup. Am I correct in that this is mostly just a thought experiment?
I think my issue was that I have tried to attack this project from multiple directions at the same time.
I tested both the ESP32-Korvo-V1.1 and the ESP32-LyraTD-MSC.
The audio output on the Korvo V1.1 doesn’t seem to work regardless of all the settings I’ve tried (and a lot of others look to have the same issue with it).
The LyraTD-MSC worked well for audio reproduction (TTS playback), and the microphones worked to pick up the wake word and STT. It’s not perfectly tuned, but it’s better than a wall-mounted tablet (looking at you, Amazon Fire 7"). However, the light array is a bit of a PITA, as it isn’t a NeoPixel.
Then the icing on the top is that neither board uses an ESP32-S3.
So I have been focusing my time on collaborating with other smart speaker designers to design something that I believe will suit the needs of most end users. This has resulted in me spending less time on the enclosure or the YAML configs for the development boards (which were mostly this thread’s intent).
The schematic for the PCB we have been working on can be found at
GitHub - Libre-Talk/Wave: Smart Speaker satellite. It’s an early draft without all the nets fully connected, and the PCB is still being laid out.
So the setup with me appropriately using Sonos speakers as output media players “was” working for me — until I started hitting buffer errors.
At this point, I’m migrating to Raspberry Pis with ReSpeakers. IMO, ESP32-S3 devices don’t have enough memory to properly handle quality voice assistant work and were a terrible recommendation over satellite Pis.
Hmm, sorry to hear that. Oddly, I did get an ESP32 Wrover on the LyraTD-MSC working nicely on my first try with Assist (audio pickup and playback), and that’s a lower-spec chip than the S3.
I wonder if that’s an issue related to Bluetooth audio streaming?
If you’re looking to sell the Korvo-1s, send me a DM.
Actually, yes. There’s a warning all over the ESPHome documentation saying that Bluetooth, audio, and voice assistant functions each “consume a significant amount of resources”, which causes issues when they are used together. I would be willing to bet most boards available for sale are quite underpowered for the purpose. ESP32-S3s may have the issue compounded if you’re using microWakeWord as well, since that requires at least 2Mb of PSRAM.
This guy details out some of the root issues I’m seeing. If everything works with the Pi I’ll definitely reach out.
But for clarity, I’m using a network call to my Sonos speakers through media player integrations in HA (see my linked config). No Bluetooth use at all.
No worries, I didn’t mean to seem dismissive. This looks amazing, I just have been hurt before by good design and not good software/hardware
I get where you are coming from. For a project like this, all three need to be worked on for a good device: hardware, software and the enclosure.
I’m putting my hope into the work that has been done by the community on the software side, as that has a lot of activity, with the Espressif ADF at its core.
Part of me is still worried that an ESP32-S3 isn’t powerful enough to stream music while also being able to use a wake word. I’m keeping an eye on how that is developing.
Some additional discussion over on the Rhasspy forums: Question: would anyone be interested in a open source DSP mic array? - #2 by synesthesiam - Hardware - Rhasspy Voice Assistant
Any updates on the 3D printed enclosure for the Korvo V1.1? I found this one on Thingiverse: ESP32 Korvo (Jarvis Voice Assistant) Puck Enclosure by CleverCasaChronicles - Thingiverse, which is not bad looking and pretty basic to just get going with. Credit to CleverCasaChronicles for this: https://www.youtube.com/shorts/ehUY2GHL8XE
Also, if anyone has been able to get the Korvo to continually listen as well as stay running after boot, that’d be great to hear about. I followed this install and it works, but it’s kinda unstable for some reason: Voice Assistant-Add support for the espressif esp32-korvo-v1.1 · Issue #2430 · esphome/feature-requests · GitHub
That’s the thread for the original Korvo that has a vanilla ESP32 in it. It’s the roughly 20 dollar one on AliExpress. I actually just posted the below because it shouldn’t work at all without an S3 with PSRAM. The GPIO pins are different between the two models, so I’m surprised it worked at all, but that explains the odd behavior… Also, there is a mix of Korvo V1.1 and Korvo-1 posts, which just makes it a bit more confusing to navigate through.
The voice pipeline used should have no value in Home Assistant because it’s defined in the ESPHome configuration file. While I have been working on the Korvo-1, and I know this isn’t the thread for that one, I’m just curious if PSRAM is no longer required for microWakeWord, because everything I’ve read says it’s a requirement. Additionally, the ESPHome documentation says it’s still required, although it may not have been updated.
The micro_wake_word component requires an ESP32-S3 with PSRAM to function
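For reference, these are roughly the two blocks that the quoted requirement touches, trimmed down from the config posted further down in this thread. It is only a sketch: the `octal` mode and `80MHz` speed are what that Korvo-1 config uses and are assumptions for any other module, and newer ESPHome releases may expect a different `micro_wake_word` syntax.

```yaml
# PSRAM has to be enabled for micro_wake_word on the ESP32-S3
psram:
  mode: octal    # assumption: module with octal PSRAM, as in the Korvo-1 config
  speed: 80MHz

micro_wake_word:
  model: okay_nabu
  on_wake_word_detected:
    # hand the detected wake word over to the voice assistant
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
```

A vanilla ESP32 without PSRAM, like the one on the Korvo V1.1, would not be expected to get past this part of the config.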
I do have the korvo-1 working with microwakeword but I’m having an “odd” issue that I want to get fixed because the firmware doesn’t show and ESPHome thinks the korvo-1 is offline and prompts me to install the “newly” found one. It works with zero issues which is just really odd. In ESPHome it shows up as red (offline) but if I click on logs they show up and I can see everything working, yet ESPHome still thinks it’s offline. Very odd. I will post the confirmation once I figure out what’s going on because you will more than likely end up with the same issue. I also used the below so if the GPIO pins don’t match what’s in that other thread then it’s not going to work. Look at the file with pins in the name.
Here is the YAML; everything works without any issues. I tried looking into the firmware thing, and it looks like it has to be added on the ESPHome side; a feature request has been put in. Whether it will ever happen is another story, though. I did get some warnings about the microphone and during the TensorFlow part.
Also, it was actually still working with OpenWakeWord until I looked into it. After looking into the repository I am pointing to, I realized it has an option to work with both. When I looked more closely at the logs, it wasn’t using microWakeWord; the logs are totally different (as far as CLI output goes) when a wake word is triggered and processed. Remember to create a new pipeline with no wake word defined. If you use your default, I believe it will use OpenWakeWord.
Another fun thing: since I set the board to esp32-s3-devkitc-1, it formatted the flash as either 4MB or 8MB. It was saying 19344 bytes was too large for my flash of 18345 bytes, which I interpreted as 19MB and 18MB, which is obviously WAY off. I probably spent a good 2 hours trying to figure this out before I noticed that, when I got that message, the PSRAM was more than double the size!!! So all I had to do was add one simple line to specify the flash size as 16MB. DOH. Oh well, it’s working, and that is all that matters.
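The one-line fix described above would look something like this. It is a minimal sketch lifted from the full config in this post; the 16MB value is specific to this Korvo-1's module, so treat it as an assumption for any other board.

```yaml
esp32:
  board: esp32-s3-devkitc-1
  # Without this line the build used a smaller default flash layout
  # and the firmware no longer fit.
  flash_size: 16MB
  framework:
    type: esp-idf
```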
Remember to add the API key below, and change the wifi section if you’re not using the same secret format I am. For some reason, a few releases ago all the LEDs stopped working. I don’t know if it’s just a weird LED thing or if only one mic is working. I was pretty much relying on that repo for a lot of stuff because he had the actual Espressif files with the includes to the driver files and specific hardware firmware. Stuff like that. So, you could just specify the mic and one of the mic GPIOs, but I doubt it will work out as well as this because it won’t be utilizing all the hardware. I honestly can’t say if all of it is working with this either. The 3.5mm output works fine; I haven’t tried the speaker output, as all my speakers are passive and I don’t have a JST adapter that will fit it either.
Just remember, you may forever have a new entity wanting to be adopted after this. Just leave it alone: even though it shows up as red, it just works, and short of someone adding the firmware to ESPHome, it will always be that way. What is really odd is that there is a line in there that points to a GitHub URL that doesn’t exist for the firmware. If I comment it out, flashing fails, so it has to be in there, even though the file doesn’t exist in the main ESPHome repository it’s pointed to.
Lastly, it took over 600 seconds to compile, and I am running HA on a 7-year-old Intel NUC with an i5, so if you are using a Raspberry Pi 4, it could take a while longer. Make sure to just wait until it flashes or times out. Actually, if it’s your first time flashing it, you will have to plug it into a PC or the HA server, so you will see the CLI output. The TensorFlow part takes quite some time. You have to hold the boot button when plugging the power cord in to put it into boot mode. The USB cable for communication should already be plugged into your PC when doing this.
substitutions:
  name: "korvo"
  friendly_name: korvo
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  project:
    name: esphome.voice-assistant
    version: "1.0"
  min_version: 2023.11.5
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            brightness: 70%
            effect: connecting
esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
      CONFIG_ESP32_S3_KORVO1_BOARD: "y"
    components:
      - name: esp32_s3_korvo1_board
        source: github://abmantis/esphome_custom_audio_boards@main
        refresh: 0s
psram:
  mode: octal
  speed: 80MHz
external_components:
  - source: github://pr#5230
    components: esp_adf
    refresh: 0s
ota:
logger:
api:
  encryption:
    key: you_api_key
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - delay: 1s
            - voice_assistant.start_continuous:
            - delay: 1s
            - voice_assistant.stop:
            - delay: 1500ms
            - voice_assistant.start_continuous:
            - script.execute: reset_led
  on_client_disconnected:
    then:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting
dashboard_import:
  package_import_url: github://esphome/firmware/voice-assistant/esp32-s3-korvo1.yaml@main
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  use_address: 192.168.0.xx
  ap:
  on_connect:
    then:
      - delay: 5s # Gives time for improv results to be transmitted
      - ble.disable:
  on_disconnect:
    then:
      - ble.enable:
improv_serial:
esp32_improv:
  authorizer: none
button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset
esp_adf:
  board: esp32s3korvo1
microphone:
  - platform: esp_adf
    id: korvo_mic
speaker:
  - platform: esp_adf
    id: korvo_speaker
micro_wake_word:
  on_wake_word_detected:
    # then:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
    - light.turn_on:
        id: led_ring
        red: 30%
        green: 30%
        blue: 70%
        brightness: 60%
        # note: no effect named "fast pulse" is defined in the light section of this config
        effect: fast pulse
  model: okay_nabu
voice_assistant:
  id: voice_asst
  microphone: korvo_mic
  speaker: korvo_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2
  use_wake_word: true
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
  on_end:
    - delay: 500ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }
script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 30%
                effect: none
          else:
            - light.turn_off: led_ring
switch:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO38
    name: "${friendly_name} Speaker Mute"
    restore_mode: ALWAYS_ON
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led
light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "Connecting"
          colors:
            - red: 60%
              green: 60%
              blue: 60%
              num_leds: 12
            - red: 60%
              green: 60%
              blue: 0%
              num_leds: 12
          add_led_interval: 100ms
          reverse: true
binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    on_multi_click:
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    on_press:
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          brightness: 100%
          effect: "Wakeword"
    on_release:
      - voice_assistant.stop:
      - light.turn_off:
          id: led_ring
sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF
Sorry for the quietness on this thread, I’ve been spending my time trying to develop a cheap smart speaker solution that hopefully ticks the boxes for the most users possible.
I’m guessing the code above is for the Korvo1 V2.4, not the Korvo V1.1 (that naming convention caught me out too; now I have 3 ESP32-based units that don’t work with ESPHome). I did have some luck with the ESP32-LyraTD-MSC, though: despite having a lower-end CPU, it happily picked up the wake word and even worked as an input and output for Assist, until I tried to transplant an ESP32-S3 N16R8 onto it and it got locked into bootloader mode.
Yes, it’s for the Korvo-1. Real great naming convention. It’s the one with an S3 with 16MB of flash and 8MB of PSRAM, so not the Korvo V1.1 or whatever. Technically mine is a Korvo-1 v5, and there is also a Korvo-2, which has a screen and camera interface along with 2 mics and a speaker, if I am not mistaken. If you use the code above that I originally posted, it will work with OpenWakeWord; just take out the microWakeWord part. From “on_wake_word_detected:” to “model”, just delete all that and it will work on the Korvo V1.1 using OpenWakeWord. microWakeWord requires PSRAM, and the vanilla ESP32 in that model doesn’t have any. I originally thought it was using microWakeWord; it wasn’t, it was using Open. It works better with micro, just noticeably quicker to allow a voice command, but it still works well with Open. If you do use micro, create a voice pipeline with no wake word. I have heard conflicting information about whether a pipeline with a wake word defined makes it use Open.
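As a rough sketch of what that removal leaves behind: with the `micro_wake_word:` block deleted, wake word detection is left to Home Assistant through the `voice_assistant` block from the config above. This is untested on a V1.1, and the board definition, pins and audio components would still have to match that hardware, so the ids below are simply carried over from the Korvo-1 config as placeholders.

```yaml
# micro_wake_word: block removed entirely (no PSRAM on the vanilla ESP32)

voice_assistant:
  id: voice_asst
  microphone: korvo_mic     # placeholder id, taken from the Korvo-1 config
  speaker: korvo_speaker    # placeholder id, taken from the Korvo-1 config
  use_wake_word: true       # wake word is handled on the Home Assistant side
```

In this variant the pipeline selected in Home Assistant needs a wake word engine configured, which is the opposite of the microWakeWord case.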
Another thing: the above won’t work for the Korvo V1.1. It’s getting the GPIO and other information from that repo. What you would need is something similar to the below, which I’m unsure how I missed during my searches. The issue is that it’s for the same model I have, so the GPIO pins and the drivers/code defined may be different. Someone should be able to redo the GPIO pins, and there has to be some V1.1 YAML out there. I’ll do some searches, but something like the ESPHome code in the below link, which doesn’t depend on any repositories, is more straightforward. I actually might see how the below compares to the above, but I’ve been impressed by how much more easily micro picks up the trigger word.
EDIT: Specific model just to make sure there is no confusion