I should start by saying that I have been using Rhasspy with Home Assistant for a few years now, on 3 Raspberry Pis with reSpeaker HATs. I am looking forward to getting the last few wrinkles out of wyoming-satellite so I can swap all my satellites over.
User expectations
I have not used Alexa or Google devices, so I don’t have a particular standard I am comparing Rhasspy/Voice Assist to. Similarly, we have not got used to AI. I doubt that any open source project can ever come close to the quality or cost that the big boys can subsidise - and unfortunately Voice Assist will always be measured against that impossible standard.
Granular Permissions
There is a clear use case for recognising which person is giving the command. This has already been requested, and the response was that other projects are working on it, so it may be possible to integrate … but it is not a priority for the near future.
Wyoming-satellite
I actually found it pretty straightforward … but maybe that’s because of my previous experience with Mike’s Rhasspy system. There certainly were challenges getting my head around how Rhasspy worked, and since I had something working well I didn’t start into Voice Assist until chapter 4, when the functionality came close.
Playing Audio / TTS Via Wyoming Satellites
I hear you, brother!! RasPis already make good media players, so this seems obvious. At least expose a media_player interface so we can send TTS audio messages. Two devices both doing audio output in the same room seems overkill … so possibly make the mic a tiny separate ESP32 device (or multiple “ears” around the room) and send audio output to a separate media_player device? But if we have audio out and audio in on the same device, we could subtract the audio out from what the mic picks up and hear what else is happening in the room.
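To show what I mean by subtracting the audio out, here is a rough sketch of the textbook approach (a normalised LMS adaptive filter). It is my own illustration, not how wyoming-satellite or any Home Assistant component does it; the function names and the simulated "room" are made up for the example, and real echo cancellers have to cope with far messier acoustics than this.

```python
import random


def simulate_echo(ref, gains=(0.6, 0.0, 0.3)):
    """Hypothetical room: the mic hears delayed, attenuated copies of the speaker."""
    return [
        sum(g * ref[n - k] for k, g in enumerate(gains) if n - k >= 0)
        for n in range(len(ref))
    ]


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-6):
    """Return the mic signal minus an adaptive estimate of the speaker echo.

    mic: samples captured by the microphone
    ref: samples sent to the speaker at the same instants
    """
    weights = [0.0] * taps   # learned estimate of the speaker-to-mic path
    history = [0.0] * taps   # most recent speaker samples, newest first
    cleaned = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        err = m - echo_estimate          # what is left after removing the echo
        norm = sum(x * x for x in history) + eps
        # Nudge the weights towards whatever would have reduced the error.
        weights = [w + mu * err * x / norm for w, x in zip(weights, history)]
        cleaned.append(err)
    return cleaned


def energy(samples):
    """Sum of squared samples, a crude loudness measure."""
    return sum(s * s for s in samples)


if __name__ == "__main__":
    rng = random.Random(0)
    speaker = [rng.uniform(-1, 1) for _ in range(3000)]  # stand-in for music/TTS
    mic = simulate_echo(speaker)
    cleaned = nlms_echo_cancel(mic, speaker)
    print("mic energy:", energy(mic[-500:]))
    print("residual energy:", energy(cleaned[-500:]))
```

The catch is that the device needs access to exactly what it is playing, sample for sample, which is much easier when the audio in and audio out are on the same box.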
After experimenting to get back to minimal components, I have concluded that Voice Assist and media_player are currently mutually exclusive on the same device. Squeezelite blocks wyoming-satellite; mpd coexists, playing music or processing voice commands … until wyoming-satellite tries to make a sound while music is playing.
@synesthesiam seems to keep dodging this issue, so I guess there isn’t a straightforward way to do it.
Voice Hardware is a concern
RasPi with a reSpeaker HAT used to be the recommended hardware for Rhasspy … but while the Seeed reSpeaker page talks about multiple microphones, Voice Activity Detection, Direction of Arrival and Key Word Spotting (and even has a video demonstrating them), they forgot to incorporate them in the device driver or release any source code before they stopped supporting the reSpeaker in 2018. A simple USB mic does just as well. Add to that the supply and cost issues with RasPi during/since COVID, and that combination no longer looks attractive.
I am awaiting a Nabu Casa voice assistant product (like ESP32-S3-BOX, but optimised for voice assistant use rather than being a technology demo), and this appears to be in development … but apparently it is waiting for some DSP (Digital Signal Processing) algorithms to make it into the public domain.
User-level documentation
This is a real “soapbox” issue for me.
- I understand that FOSS projects start out as one developer writing something for his own use, then releasing it in case other developers find it useful.
- I understand they want to get on to adding the next great feature instead of wasting all their time trying to explain the bleeding obvious to stupid “users”!
- I understand that Home Assistant is a moving target, growing exponentially (even without considering all the extra components in HACS), and that it is virtually impossible to regulate.
- I understand them hanging onto the belief that the “users” are like them - tech savvy (if not professional programmers) who enjoy the challenge of figuring things out for themselves from minimal oblique hints.
- I do not understand how they think that 1 million users are all experienced programmers. The user base has gradually changed. Face it. I commend Nabu Casa management for focussing on improving the User Interface … but they have so far ignored that documentation is also an important part of the whole “User eXperience”.
I do acknowledge that some parts of the official documentation are better than others … but generally non-technical users have to turn to Google and YouTube videos to try to understand what the official documentation is saying. At least the Community Guides are within the home-assistant.io umbrella, even if the content is unregulated.
Sorry to rant on
</ soapbox >