Rhasspy offline voice assistant toolkit

synesthesiam · November 20, 2019, 2:46am

Just to check, is your file named fullchain.pem?

Make sure to check the log for any additional messages. Rhasspy is really just passing the file name you give directly into requests.post as the verify argument. The requests docs are very unhelpful, stating that it should point to a “CA bundle”.
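For anyone following along, here is a minimal sketch of how a certificate file name reaches `requests.post`. The helpers `build_post_kwargs` and `post_event` are hypothetical and are not Rhasspy's actual code. Only the `verify` parameter comes from the requests library, which accepts either a boolean or a path to a PEM file of trusted certificates.

```python
import os


def build_post_kwargs(pem_path=None, timeout=10):
    """Build keyword arguments for requests.post (hypothetical helper).

    verify=True uses the default CA bundle; a string is treated by
    requests as the path to a PEM file of trusted certificates.
    """
    kwargs = {"timeout": timeout, "verify": True}
    if pem_path:
        if not os.path.isfile(pem_path):
            # Fail early with a clear message instead of a vague SSL error.
            raise FileNotFoundError(f"Certificate file not found: {pem_path}")
        kwargs["verify"] = pem_path
    return kwargs


def post_event(url, payload, pem_path=None):
    """POST JSON to a server, verifying TLS against pem_path if given."""
    import requests  # third-party; imported lazily so the helper above has no dependency

    response = requests.post(url, json=payload, **build_post_kwargs(pem_path))
    response.raise_for_status()
    return response
```

Keep in mind that verification also checks the host name, so a certificate issued for a domain name will generally not validate when the server is addressed by a bare IP address.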

Are you using this from Hass.io, or externally?

vageesh79 · November 20, 2019, 4:09am

Yes, the file name is fullchain.pem. I am using Hass.io in Docker. There is nothing in the logs, just “[DEBUG:78841915] main: Loading phoneme examples from /usr/share/rhasspy/profiles/en/phoneme_examples.txt”

and under problems: “HomeAssistantIntentHandler Can’t contact server Unable to reach your Home Assistant server at https://10.0.0.211:8123. Is it running?”
MQTT is connected to the server.

koan · November 21, 2019, 12:02pm

Hi Michael, I know you already mentioned Home Assistant’s plans with Almond. But now that I read the newest blog article Almond & Ada: privacy-focused voice assistant, I wonder: how does Rhasspy fit into this? Is the plan to make Rhasspy plug into the various parts of the Ada architecture? Will it become part of the Ada project?

synesthesiam · November 21, 2019, 8:15pm

There are a few ways I see Rhasspy fitting. They’ve extended HA with a speech-to-text integration that streams audio from Ada to some platform. conversation has also been extended to support arbitrary “agents” that do intent recognition from text. Finally, I recently got a pull request accepted for having conversation also take JSON intents in via HTTP and handle them.

So, here are the ways Rhasspy fits with all that:

1. Ada => Rhasspy => HA
   - Ada does hotword/silence detection, Rhasspy does speech-to-text/intent recognition, HA handles intents
2. Rhasspy => HA
   - Rhasspy does everything but handling intents
   - Basically what happens now, except you use intent_script rather than events
3. Rhasspy => HA => Almond
   - Rhasspy does hotword/silence/speech-to-text and then gives text to Almond (through HA)
4. Ada => Rhasspy => HA => Almond
   - Rhasspy just does speech-to-text

For scenarios 1 and 2, the new rhasspy integration (in progress) will examine your devices in configuration.yaml and auto-generate voice commands to turn them on, ask about their states, etc. You will also be able to add custom commands in configuration.yaml and determine which devices commands are generated for.
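As a rough illustration of what auto-generating voice commands from a device list could look like, the sketch below builds a Rhasspy-style sentence template section from a set of entities. The function name `generate_sentences`, the intent name `ChangeDeviceState`, the entity ids, and the template wording are all assumptions made for illustration; the in-progress integration may work quite differently.

```python
def generate_sentences(entities, intent_name="ChangeDeviceState"):
    """Return an ini-style section with one template covering all devices.

    entities maps a Home Assistant entity id to a spoken, friendly name.
    (Hypothetical helper, not part of Rhasspy or Home Assistant.)
    """
    if not entities:
        raise ValueError("No entities to generate commands for")
    # Sort names so the generated output is stable between runs.
    names = " | ".join(sorted(name.lower() for name in entities.values()))
    lines = [
        f"[{intent_name}]",
        # {state} and {name} tag the matched words as slots.
        f"turn (on | off){{state}} [the] ({names}){{name}}",
    ]
    return "\n".join(lines)


devices = {
    "light.living_room_lamp": "Living Room Lamp",
    "switch.garage_door": "Garage Door",
}
print(generate_sentences(devices))
```

Restricting which devices get commands would then just be a matter of filtering the `devices` mapping before generating the template.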

I’m not so sure about scenarios 3 and 4. Rhasspy needs to know the possible voice commands up front, and Almond supports a lot of things (of course, only in English). There may be a way to query Almond and auto-generate commands, but I’m hesitant to do this, especially because it’s yet another cloud service.

By the way, if anyone is willing to help translate voice command templates and response templates, that would be great.

FunkyBoT · November 22, 2019, 12:29am

I don’t know how it can be integrated with Rhasspy. It is just something new that you should be aware of. Maybe Rhasspy can take some advantage of it.

nickrout · November 22, 2019, 12:48am

Did you read the post above yours?

FunkyBoT · November 22, 2019, 1:06am

Not really, since I didn’t notice any post in this thread (I usually get notified), but it seems that I was wrong. I was just trying to help.
Sorry.

synesthesiam · November 22, 2019, 2:00am

No worries, I appreciate it

I just checked the source code for HA 0.102.1, and it looks like my addition to conversation didn’t make it in yet.

koan · November 22, 2019, 7:47am

Thanks for the explanation, Michael! So basically scenario 2 looks like the most interesting one at this moment, because Rhasspy already does everything that Almond/Ada promises, and more (including other languages than English, which I consider a big pro). This scenario actually takes Almond and Ada completely out of the equation. So I don’t see the added value yet of Almond and Ada, but let’s see how this pans out.

The Rhasspy integration sounds awesome!

Romkabouter · November 22, 2019, 10:10am

As long as Almond and Ada do not support Dutch, I am sticking with Rhasspy.

I like the way you can create skills and such, but language support is a must for me.

koan · November 22, 2019, 11:02am

This is actually the only thing I really miss in Rhasspy. Snips has a very nice skills system (although not without its issues), and I liked the way I could create skills that others were able to easily install using the Snips Store or by a few commands.

Romkabouter · November 22, 2019, 12:40pm

Well, I personally don’t miss it because I use it for controlling devices and not for searching things on the net or asking what time/date it is.

But I see the added value for skills in general

geoffrey · November 22, 2019, 1:04pm

Snips announced the day before yesterday that it is being acquired by Sonos.

So I’m not sure what hardware platform to look forward to in the future if you want to work entirely offline. I really liked their vision, architecture and approach to things.

Once I’m done reconstructing my entire HA setup into smaller building blocks, I’m going to continue with the PS2 microphones and a RPi at the start, but any tips for hardware like products from ReSpeaker are welcome.

Also, the recent announcements of the new capabilities in HA for voice assistants and the mention of Rhasspy in the State of the Union are very promising.

Keep things up everybody, you’re doing a wonderful job!

synesthesiam · November 22, 2019, 2:49pm

I haven’t actually used Snips, so I’d like to know more about their concept of skills. How would you envision them in Rhasspy?

I can imagine doing something with Rhasspy where you pull down community contributed intents/sentences/slots via a GitHub repo, but it would need access to Home Assistant to actually do something with them. Once they integrate my conversation change, it will be a matter of adding to the intent_script configuration rather than creating automations, at least.

From looking into Almond, I was disappointed to see that their sentences are either hand-coded for each skill or sourced from Mechanical Turk. They do have a system that generates training samples based on those sentences, at least, but all that work is for U.S. English only.

koan · November 22, 2019, 3:28pm

What I like about the Snips ‘skills’ (or actually they are called ‘apps’) is that each app is a separate component with:

- A set of intents and associated training examples, including custom slot types if you have them defined.
- Actions (e.g. in Python code or whatever) that react to a recognized intent.

So as a user you just have to download/install one thing and you get the specific functionality the app offers: no need to configure intent scripts, add sentences, retrain the system or anything else yourself. Under the hood the intents and training examples are added to the list of things Snips recognizes, and the Snips NLU and ASR are retrained with those examples added. The actions of the installed app are run by a ‘skills server’ to make something happen when one of these newly added intents is recognized. When removing an app, of course the intents, training examples and actions should be removed too.

Something like this would be my gold standard for apps in Rhasspy. Rhasspy is already extremely modular, letting you change various components, but a modular app system would make it even more awesome.
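A minimal sketch of the app lifecycle described above, assuming a hypothetical in-memory registry. The names `App`, `AppRegistry`, `install`, `uninstall` and `handle` are invented for illustration and are not part of Snips or Rhasspy; retraining is only represented by a counter.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class App:
    """A self-contained bundle: intents with training sentences plus actions."""
    name: str
    sentences: Dict[str, List[str]]            # intent name -> training examples
    actions: Dict[str, Callable[[dict], str]]  # intent name -> handler


@dataclass
class AppRegistry:
    """Hypothetical registry that tracks installed apps."""
    apps: Dict[str, App] = field(default_factory=dict)
    retrain_count: int = 0

    def _retrain(self):
        # Placeholder: a real system would retrain the ASR/NLU models here.
        self.retrain_count += 1

    def install(self, app: App):
        self.apps[app.name] = app
        self._retrain()

    def uninstall(self, name: str):
        # Removing the app removes its intents, examples and actions together.
        del self.apps[name]
        self._retrain()

    def training_data(self) -> Dict[str, List[str]]:
        # Merge the training sentences of every installed app.
        merged = {}
        for app in self.apps.values():
            merged.update(app.sentences)
        return merged

    def handle(self, intent: str, slots: dict):
        # Dispatch a recognized intent to the first app that provides an action.
        for app in self.apps.values():
            if intent in app.actions:
                return app.actions[intent](slots)
        return None


registry = AppRegistry()
registry.install(App(
    name="clock",
    sentences={"GetTime": ["what time is it", "tell me the time"]},
    actions={"GetTime": lambda slots: "It is noon"},
))
```

The point of bundling sentences and actions in one object is that installing or removing an app is a single operation, with nothing left for the user to configure by hand.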

Maybe an interesting idea to explore: I have been thinking about creating actions for Rhasspy in AppDaemon.

Romkabouter · November 22, 2019, 4:00pm

That is most probably some bad privacy news sadly.
While Snips can work offline, I highly doubt that Sonos will not enforce some sort of cloud use.

koan · November 22, 2019, 4:24pm

If only Snips was completely open source… I had a feeling that something like this would happen, and their continuing postponement of open sourcing more critical components (such as the ASR) was the reason for me to stop developing Snips apps and tools. I was quite enthusiastic about Snips and heavily involved in the community, but I didn’t like the prospect of locking myself into a platform that could die or be enforced by an acquisition. If it was completely open source now, the acquisition by Sonos wouldn’t have to be bad news.

nickrout · November 22, 2019, 9:43pm

I came to this thread when I wanted voice control. I looked at Snips, then found this thread. I have been following (but not always understanding). Life has got in the way, but with summer holidays approaching I will be progressing. Rhasspy still seems to come out on top for me. Keep up the good work @synesthesiam, @koan and @Romkabouter. Hoping I didn’t miss anyone.

geoffrey · November 22, 2019, 11:09pm

What about doing something with HACS, which has a store for Python scripts, AppDaemon apps, integrations, themes, etc.?

koan · November 22, 2019, 11:22pm

Good thinking!