Rhasspy offline voice assistant toolkit

Well, I personally don’t miss it because I use it for controlling devices and not for searching things on the net or asking what time/date it is.

But I see the added value for skills in general :slight_smile:

Snips announced the day before yesterday that it is being acquired by Sonos:

So I’m not sure which hardware platform to look forward to in the future if you want everything to work entirely offline. I really liked their vision, architecture and approach to things.

Once I’m done reconstructing my entire HA setup into smaller building blocks, I’m going to continue with the PS2 microphones and an RPi to start with, but any tips for hardware such as the ReSpeaker products are welcome.

Also, the recent announcements of new voice assistant capabilities in HA and the mention of Rhasspy in the State of the Union are very promising.

Keep things up everybody, you’re doing a wonderful job!


I haven’t actually used Snips, so I’d like to know more about their concept of skills. How would you envision them in Rhasspy?

I can imagine doing something with Rhasspy where you pull down community contributed intents/sentences/slots via a GitHub repo, but it would need access to Home Assistant to actually do something with them. Once they integrate my conversation change, it will be a matter of adding to the intent_script configuration rather than creating automations, at least.

From looking into Almond, I was disappointed to see that their sentences are either hand-coded for each skill or sourced from Mechanical Turk. They do have a system that generates training samples based on those sentences, at least, but all that work is for U.S. English only.

What I like about the Snips ‘skills’ (or actually they are called ‘apps’) is that each app is a separate component with:

  • A set of intents and associated training examples, including custom slot types if you have them defined.
  • Actions (e.g. in Python code or whatever) that react to a recognized intent.

So as a user you just have to download/install one thing and you get the specific functionality the app offers: no need to configure intent scripts, add sentences, retrain the system or anything else yourself. Under the hood the intents and training examples are added to the list of things Snips recognizes, and the Snips NLU and ASR are retrained with those examples added. The actions of the installed app are run by a ‘skills server’ to make something happen when one of these newly added intents is recognized. When removing an app, of course the intents, training examples and actions should be removed too.
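
To make that concrete, here is a minimal sketch of what such an action could look like, assuming the hermes-style MQTT topics that Snips uses (the intent name GetTemperature, the reply text and the broker address are made up for the example):

    import json

    import paho.mqtt.client as mqtt

    # Hypothetical intent; a real app would subscribe to its own intents.
    INTENT_TOPIC = "hermes/intent/GetTemperature"

    def on_connect(client, userdata, flags, rc):
        client.subscribe(INTENT_TOPIC)

    def on_message(client, userdata, msg):
        payload = json.loads(msg.payload)
        # End the dialogue session with a spoken reply.
        client.publish("hermes/dialogueManager/endSession", json.dumps({
            "sessionId": payload["sessionId"],
            "text": "It is twenty-one degrees.",
        }))

    client = mqtt.Client()
    client.on_connect = on_connect
    client.on_message = on_message
    client.connect("localhost", 1883)
    client.loop_forever()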

Something like this would be my gold standard for apps in Rhasspy :slight_smile: Rhasspy is already extremely modular, letting you change various components, but a modular app system would make it even more awesome.

Maybe an interesting idea to explore: I have been thinking about creating actions for Rhasspy in AppDaemon.
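
As a rough illustration of that idea, here is a minimal AppDaemon app reacting to the events Rhasspy forwards to Home Assistant (the event name rhasspy_ChangeLightState, the state slot and the entity id are assumptions for the example):

    import appdaemon.plugins.hass.hassapi as hass

    class ChangeLightState(hass.Hass):
        """React to a Rhasspy intent forwarded to Home Assistant as an event."""

        def initialize(self):
            # Rhasspy sends events named "rhasspy_<IntentName>".
            self.listen_event(self.on_intent, "rhasspy_ChangeLightState")

        def on_intent(self, event_name, data, kwargs):
            # Slot values arrive in the event data.
            if data.get("state") == "on":
                self.turn_on("light.living_room")
            else:
                self.turn_off("light.living_room")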


That is most probably bad news for privacy, sadly.
While Snips can work offline, I highly doubt that Sonos will not enforce some sort of cloud use.

If only Snips was completely open source… I had a feeling that something like this would happen, and their continued postponement of open sourcing more critical components (such as the ASR) was the reason for me to stop developing Snips apps and tools. I was quite enthusiastic about Snips and heavily involved in the community, but I didn’t like the prospect of locking myself into a platform that could die or be taken in a different direction by an acquisition. If it were completely open source now, the acquisition by Sonos wouldn’t have to be bad news.

I came to this thread when I wanted voice control. I looked at Snips, then found this thread. I have been following (but not always understanding). Life has got in the way, but with summer holidays approaching I will be progressing. Rhasspy still seems to come out on top for me. Keep up the good work @synesthesiam, @koan and @Romkabouter. Hoping I didn’t miss anyone.


What about doing something with HACS, which has a store for Python scripts, AppDaemon apps, integrations, themes, etc.?

Good thinking!

Does anyone have any pics of an actual setup? One problem I have seen (which I know is only a problem for some people) is that DIY smart speakers don’t look that great. I’ve looked for something that would house both a ReSpeaker and a Pi, but haven’t come up with anything.

Well, that might be the price you pay for not being dependent on Google/Apple/Amazon or some other cloud devices.
DIY builds are almost never consumer-grade products.

I am currently trying to build a case for my hardware, but 3D design is not my expertise, I can safely say :smiley:

I was unaware of HACS and AppDaemon. These look like a great way of implementing “skills”!

So, a Rhasspy skill might contain:

  • Intents
  • Sentences
  • Slots
  • Custom words
  • AppDaemon apps
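
Purely as an illustration, a skill repository along those lines could be laid out like this (all names hypothetical):

    my_skill/
    ├── sentences/          # training sentences per language, e.g. en.ini, nl.ini
    ├── slots/              # custom slot values per language
    ├── custom_words/       # pronunciations for out-of-vocabulary words
    └── apps/
        └── my_skill.py     # the AppDaemon app that handles the intents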

How would this work with Hass.io? And what could be done to allow people to provide localization/translations for skills?

In my Snips apps I did something like the following to localize utterances:

import importlib

i18n = importlib.import_module('translations.' + SnipsAppMixin().assistant['language'])

The app would get the user’s language from the Snips configuration and then import the utterances from the right language.

And then the app would have code like:

self.publish(*end_session(payload["sessionId"], i18n.RESULT_INTENT_SORRY))

People could provide a localization as a file with their translated utterances such as this RESULT_INTENT_SORRY in a GitHub pull request. I know, it’s just a hacky way to implement i18n, but I found tools like gettext a bit overkill for this purpose.
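
For example, a translation module is nothing more than a file of constants; the contents below are made up for illustration:

    # translations/en.py -- English utterances for the app.
    RESULT_INTENT_SORRY = "Sorry, I don't know the answer to that."

    # A contributor could add translations/nl.py with the Dutch version:
    # RESULT_INTENT_SORRY = "Sorry, daar weet ik het antwoord niet op."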

I wanted to handle intents, example sentences, slots and custom words in a similar way in my Snips apps, but the way Snips works is that you create these in the Snips Console, a web-based interface. For Rhasspy skills, this could be implemented as plain files that can be translated. On installation, the skill just has to know which language profile you are using in Rhasspy and install the intents and so on for the right language.

This way, in most cases people can translate a skill to their own language with just a pull request containing a couple of translated files.
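
A skill installer could then be as simple as a script that looks up the active profile language and copies the matching files into place (the paths and file names below are hypothetical; the actual Rhasspy layout may differ):

    import shutil
    from pathlib import Path

    # Hypothetical locations of the active Rhasspy profile and the
    # directory where the downloaded skill was unpacked.
    profile_dir = Path("~/.config/rhasspy/profiles/en").expanduser()
    skill_dir = Path("my_skill")

    # In this sketch the profile directory name doubles as the language code.
    language = profile_dir.name

    # Copy the training sentences for the active language into the profile.
    src = skill_dir / "sentences" / f"{language}.ini"
    shutil.copy(src, profile_dir / "my_skill_sentences.ini")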


It would be very cool if Rhasspy continued to run as a standalone solution in the future, i.e. without depending on anything specific like HA’s AppDaemon, for those people who don’t run HA but other solutions and who would now like to move over to it as a result of the recent Snips news.
(In fact, until yesterday I was not aware of Rhasspy.)

There is also something very similar to AppDaemon: https://habapp.readthedocs.io/en/latest/
It can be used with MQTT and/or OpenHAB.

Would it be possible to create something like the Snips skill server so that existing skills only need minor adjustments?

Do you mean something compatible with Snips skills, or just a similar service?

I agree. I’d prefer to keep Rhasspy out of the business of actually performing the actions. But I can see where it would be useful for people to want to share skills, which may contain actions.

Maybe, like @koan described for localization, other users could add actions to skills for various providers, like HABApp. But then there would again be the problem of how to get those actions into the appropriate server…

Hi,
I have been trying for days to get brightness control working with Rhasspy, but I couldn’t make it work.
I saw your message and I wondered how you got it done.
Maybe you can post the code you used in configuration.yaml, which would be very helpful.
Thanks in advance.

You are welcome.
Take a look at this. It should solve your issue:

Automations:

    action:
      service: light.turn_on
      entity_id: light.w1
      data_template:
        brightness: >
          {{{ "ten":254, "nine":230, "eight":200, "seven":170, "six":140, "five":110, "four":80, "three":50, "two":30, "one":10, "zero":1 }[trigger.event.data["brightness"]]}}

I also use “zero” in order to get a minimum value for the brightness.

Hi, everyone. In preparation for the upcoming update (version 2.4), I’ve tagged version 2.3 on DockerHub. Version 2.4 hopefully won’t break anything, but just in case…

To save space in the Docker image, I’m not including the flair intent recognizer and Mycroft Precise. Leaving those two out reduces the image size by 3GB. If anyone is using them, I can prepare a larger Docker image; let me know!


Much appreciated,
it worked like a charm. Thanks!

I do not use them; I think smaller images are great :slight_smile: