I got very excited about the recently introduced Ada and Almond.
I consider privacy to be a fundamental human right and tell this to all of my friends. Having an Echo Dot Alexa doesn’t strengthen my arguments, haha.
Therefore, I am considering completely getting rid of the Echos in my house and replacing them with a device that I can use to do simple things, such as turning devices on or off and activating scripts/scenes.
As it stands now, it seems like Almond is already capable of doing this; however, this still requires me to type it, which is more work than just flipping a switch.
I want to use a Pi Zero (or similar) with a good microphone to accomplish this.
Rhasspy seems to do everything I want, but I prefer a simpler solution that doesn’t require a lot of configuration. From Rhasspy’s documentation, I found a list of recommended hardware, so that is probably a good place to start.
My idea for this topic is that we discuss how we can use Ada to listen for a wake word and then send the STT output over to HA’s Almond integration.
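For anyone who wants to experiment with the Almond side already, a minimal sketch of the configuration.yaml entry for a locally running Almond server could look like this (the host and port are assumptions based on a default add-on setup; when Almond is installed as a Hass.io add-on it may be discovered automatically):

```yaml
# configuration.yaml - minimal sketch, assuming a locally running Almond server
almond:
  type: local                   # talk to a local Almond server rather than the cloud
  host: http://localhost:3000   # assumed address of the Almond add-on/container
```

Ada would then sit in front of this, handling the wake word and microphone audio, and hand the transcribed text over for intent handling.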
So who else is interested in this and/or has already started tinkering?
Rhasspy author here. Just a heads up: I’m working on an HA integration that will allow Rhasspy to be an STT platform like Ada. You can still use Almond downstream, though what you can say will be more limited.
More limited than what you can get out of a cloud speech service. This is the key trade-off between Rhasspy and Ada, which currently uses Microsoft’s cloud for speech recognition.
To be a little more precise, Rhasspy has 3 modes of operation for speech recognition (all completely offline):
Closed
The default mode, where only the voice commands you specify can be recognized (see the sketch after this list). This is what Rhasspy was designed for, and where it shines.
Open
Recently added, this mode uses a general language model and ignores any custom voice commands. You can say anything, and Rhasspy will do its best to transcribe it. But you will probably find the performance to be poor compared to a cloud service.
Mixed
An interesting combination of Open and Closed. Your custom voice commands are mixed into the general language model. You can say anything (like Open), but Rhasspy will be more likely to recognize your custom voice commands (like Closed). This mode is much slower than Closed, so a NUC or server should be used instead of a Pi.
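To make the Closed mode concrete, here is a rough sketch of what custom voice commands look like in Rhasspy’s sentences.ini (the intent name, slot names, and rooms are purely illustrative):

```ini
[ChangeLightState]
light_name = (living room | bedroom | kitchen) {name}
light_state = (on | off) {state}
turn <light_state> [the] <light_name> light
```

Only sentences that can be generated from templates like these are recognized in Closed mode, which is why it is fast and accurate even on a Pi.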
It will be possible soon to use Rhasspy just for speech recognition, and have it forward sentences to HA’s conversation integration for intent recognition (using Almond, etc.).
For now, until the Rhasspy integration works with Almond, it seems like Ada in the cloud is the best option.
Thanks a lot, @synesthesiam for your clear explanation and your continued work on Rhasspy!
@balloob, have you tried to get Ada + Almond to work on Hass.io and a Pi? I have seen Pascal’s video where he uses a Pi, but is that the released version?
Are there some instructions somewhere on how to connect speakers and a microphone to a Pi running Hass.io? Or is it just plug-and-play and the OS detects them by itself?
For me at least, with no speakers or microphone connected right now, I get this error when I start the Ada add-on, and I don’t know if that is expected. (Edit: this is expected.)
Very interesting. I’ve wanted to set up something like this for a while now.
I have to say I’m a bit conflicted on which platform to start with though.
Rhasspy sounds (from the small amount of reading I’ve done so far) like it will do what I want, but with the official backing of Home Assistant, will Ada be better supported?
Hmm so many options. Either way I look forward to seeing this area develop.
I’m absolutely on board with using Almond and Ada. It seems there isn’t much in the way of documentation yet. I’ve posted about using a PS Eye microphone as input without so much as a response.
Good day everyone. I am also very interested in building an Alexa-like device for my home.
At some point I tried the built-in Voice Commands but never got them to work.
Therefore I am very excited to see the idea of Ada and Almond.
As Home Assistant seems to have a strategy of simplifying things, I am a little surprised about the “server-based” approach for Ada.
Leaving the technical boundaries aside, I feel the best approach would be to include the interface in the Home Assistant apps for iPhone and Android. In that case the built-in microphone and speaker could be used. In my case I have a dedicated HA tablet in my living room and also some spare Android phones, which I would like to convert into Alexa-like devices.
By installing Almond, I seem to have lost my existing ‘Conversation’ commands. I have a lot of conversation intents that I used for simple but effective voice-based commands to control various lights and bulbs. After installing, none of my conversation commands are recognized by Almond.
How different will Rhasspy Closed be from the existing ‘Conversation’ module in HA?
Rhasspy Open: will this be restrictive like Almond? I.e., will the existing conversation or Rhasspy Closed custom intents not work if this mode is active? If yes, then we are in the same loop as ‘Conversation’. I have lots of custom intents which my family has got used to, and for me those are not to be replaced. I wanted a Snips-like experience where, regardless of the Snips NLP, my existing conversation intents co-existed in HA.
Rhasspy Mixed: I think this is what I would be looking for. Hopefully this would retain the intents written for Rhasspy Closed (will it support the existing Conversation as well?) along with the open recognition of Rhasspy Open.
So far, not a good experience with Almond. Even reverting back to the original conversation is not working, even after the removal of Almond; somehow conversation still references the previously installed Almond.
I think you will find that HA switched to Almond and it is the way forward.
If you installed the Almond add-on for Hass.io, you just installed a local copy of the server.
Hi @manju-rn, I’ll do my best to answer your questions.
The HA conversation module takes in text and recognizes/handles intents. If you write your Rhasspy voice commands to match what conversation expects, then you can use your existing HA configuration. Just configure Rhasspy to use HA conversation for intent recognition.
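As an illustration, “matching what conversation expects” could look something like this on the HA side (the intent name and sentences here are made up for the example):

```yaml
# configuration.yaml - illustrative custom sentences for the conversation integration
conversation:
  intents:
    LivingRoomLightOn:
      - "turn on the living room light"
      - "switch on the living room light"
```

If the Rhasspy voice command produces exactly one of those sentences, conversation will match it to the LivingRoomLightOn intent just as if it had been typed.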
Once the HA intent integration goes live, Rhasspy will be able to trigger intents directly in HA, without needing conversation. This means you could port your conversation templates over to Rhasspy, but keep your intent_script configuration.
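The intent_script part you would keep could be as simple as this (entity and intent names are assumptions matching the example above):

```yaml
# configuration.yaml - minimal intent_script sketch for the example intent
intent_script:
  LivingRoomLightOn:
    action:
      service: light.turn_on
      data:
        entity_id: light.living_room
    speech:
      text: "Turned on the living room light"
```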
The existing Rhasspy custom intents will not work in Open mode, but your conversation intents should work just fine as long as Rhasspy can understand what you’re saying. I doubt the Open mode will work very well, but it’s worth a try since you don’t need to write any Rhasspy intents up front.
Since you have existing conversation templates, Mixed mode might allow you to gradually port intents over to Rhasspy. But I think using the Closed mode with Rhasspy configured for conversation would work better in the end.
@danbutter Yes, I understand that Almond is the way forward. What I am concerned about is that it is breaking an existing, perfectly working solution (at least for me). Moreover, I wouldn’t have been too bothered if I could go back seamlessly after uninstalling Almond, as I would expect from any other add-on. Almond in this case is acting like a ghost even though I have uninstalled it, and now I am racking my brain trying to figure out how to go back without having to roll back to a previous version or reinstall HA.