Hi folks, I’m really tired of the dumb Alexa not understanding half of my smart home commands, and I would really love to migrate to Assist and add a nice LLM to make it “smart”. However, as far as I can see, there is still no nice-looking, off-the-shelf voice satellite device out there. There are some PCBs in a perpetual “coming soon” state (e.g. Satellite1 PCB Dev Kit), or the ugly-as-hell ESP32-S3-BOX with questionable mic quality and an unusable speaker, and that’s everything. Is this really the state right now, or am I missing something?
What hardware do you use for voice assistants using Assist?
The way I see it right now, you are really going to have to DIY something to have anything that is halfway useful.
Some things to consider:
Voice assistant is still in development and will not work as well as Alexa right now.
To DIY something presentable and with adequate WAF, you will need good soldering skills, reasonable CAD skills, a 3D printer, and a lot of patience.
As mentioned, there is hardware in development, but chances are that it will still be a beta version when it launches. Something that competes with the off-the-shelf devices is probably still some time away, but I am sure we will get it soon.
Assist is never going to be as good as Alexa unless you have a server farm in your back garden.
My setup is built around Assist.
ESP32-S3-BOX hardware (which is, I agree, very ugly, particularly the ghastly blue base), flashed with Willow firmware
Willow add-on for speech recognition (at least as good as Alexa in my experience)
Sonos speakers for TTS responses - fantastic sound quality, obviously
Amazon Polly for the TTS voice - “Brian” is the only grown-up voice available anywhere as far as I can see.
The main drawback with all this is that if you want voice responses you have to write your own sentences, and at the moment this doesn’t seem to be compatible with LLM integrations.
I’m also a bit concerned about the future of Willow, which has gone quiet on Discord. I would love to build my own server, which they encourage you to do, but that’s a big investment and probably beyond me anyway…
“The main drawback with all this is that if you want voice responses you have to write your own sentences, and at the moment this doesn’t seem to be compatible with LLM integrations.”
Really? Why not? I have ChatGPT working with my M5Stack Atom Echo and the ESPHome firmware.
EDIT: Never mind, I didn’t clock that you were using another firmware. I thought Willow was just for the speech-to-text portion.
Thanks! In the past I’ve built a lot of stuff with ESPs, Arduinos and Raspis (and other SBCs), but I have 2 kids now and a company to run. I’m ready to pay 2x or even 3x the price of an Echo Dot for an off-the-shelf product that I can install in 5 minutes and that just works. And I think I’m not alone here. I hope this sparks some serious entrepreneurial enthusiasm in the community!