The Espressif ESP32-S3-Korvo-1 development module comes with all the hardware features expected of a smart speaker, but in a DIY format: mic array, speaker output, addressable LEDs, push buttons, SD card, battery charger. Espressif provides example software that implements the necessary audio processing, wakeword detection, and voice command identification. I merged the voice command example with the source files generated from an ESPHome configuration for the dev module to create a usable voice command interface for HA. The example software supports up to 200 preprogrammed voice command phrases. The automation for each phrase must be configured in HA.
Wakeword detection seems to be on par with existing commercial smart speakers. The voice command recognition is usable but much more sensitive to ambient noise and to the distance from the speaker. Steps to set up the build environment and configure the software are posted in this GitHub repository:
I’m not really familiar with any of the other methods, so I can’t give a good comparison. I had started out to add support for this dev board to the ESP32 Rhasspy Satellite project but got sidetracked a bit. My current implementation is not the most user friendly, but it should not be much trouble for users with Linux / programming experience to get working. It has the benefit that all the audio processing is done right on the device, so it is a good fit for users running HA on a system with limited resources.
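As a rough illustration of the HA side, an automation for one phrase could look something like the sketch below. This is only an assumption about how the device might report a recognized command: the entity `sensor.korvo_command_id`, the command ID `"1"`, and `light.living_room` are all hypothetical placeholders, not names taken from the project.

```yaml
# Sketch only: assumes the device exposes the ID of the last recognized
# phrase as a sensor. All entity names and the ID value are hypothetical.
automation:
  - alias: "Korvo voice command: turn on the light"
    trigger:
      - platform: state
        entity_id: sensor.korvo_command_id   # hypothetical entity
        to: "1"                              # hypothetical command ID
    action:
      - service: light.turn_on
        target:
          entity_id: light.living_room       # hypothetical light
```

One automation per phrase keeps things easy to follow, at the cost of a longer automations list if many of the phrases are in use.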
Now that we have voice assist in HA core 2023.5.0, there is a much simpler way: use an Atom Echo for speech-to-text and let your HA hardware do the recognition work.
But there are no more Atom Echos available right now.
I’m looking for a way to use a standard ESP32 with an INMP441 to do the job. Has anyone managed to get this working, please?
ESPHome 2023.4 added microphone support. I was looking into using onboard wakeword detection that then feeds the processed audio to HA once activated.
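For the standard ESP32 plus INMP441 case, a minimal ESPHome fragment might look roughly like this. It is a sketch, not a tested config: the GPIO numbers and the `inmp441_mic` ID are placeholders I picked for illustration and must match your own wiring, and the fragment assumes it is added to an existing node config that already has `api:` enabled.

```yaml
# Sketch only: pin numbers are placeholders, adjust them to your wiring.
i2s_audio:
  i2s_lrclk_pin: GPIO25   # INMP441 WS (assumed pin)
  i2s_bclk_pin: GPIO26    # INMP441 SCK (assumed pin)

microphone:
  - platform: i2s_audio
    id: inmp441_mic        # hypothetical ID
    i2s_din_pin: GPIO33    # INMP441 SD (assumed pin)
    adc_type: external
    pdm: false             # the INMP441 is a standard I2S mic, not PDM

# Streams the microphone audio to HA Assist (needs HA 2023.5 or later).
voice_assistant:
  microphone: inmp441_mic
```

Without a wake word, something still has to start and stop the assistant, for example a push button; the recognition itself then runs on the HA side.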
I am very interested in your project.
“Mixing” ESPHome with sample code couldn’t have been easy! Great work.
I’d like to make something like a wearable Star Trek badge, with a push button or touch sensor, as I don’t really trust wake words, plus always-on wake word detection must use a lot of battery.
One mic pointing “up” towards the mouth might be enough as I think the array would point “away”.
If I were to do that I would also have to “mess” with the mic array code and I’m afraid
I appreciate the build instructions on your GitHub page, but for now I don’t trust myself to make the mods.
Anyway, I’ll follow your project with much interest. Thank you!
Have you had any luck getting yours to work with HA / ESPHome? I’ve been able to get the default Chinese language firmware working, but everything audio related I try to do with ESPHome has not worked. I’ve got the buttons and the LEDs working, but no audio in or out.
Could you share your pin configuration? I’m not sure I have mine correct.
Does this take advantage of all the microphones? I’ve tried many setups, but none of them come close to Alexa or Google Nest at detecting the wake word (with or without microWakeWord). TTS apparently works well, but wake word detection only works if I am right next to the Korvo or speak very loudly. Can we use the audio drivers and wake word engine from ESP-Skainet? I’m not good at Python programming, but I can work with YAML.