Rhasspy offline voice assistant toolkit

frankiej911 · December 27, 2018, 7:29pm

Thanks for your fast reply! I understand your solution and it looks good. I will wait a moment for now. Hopefully this new tool is not too heavy for my Pi Zero haha.

ozzi91 · January 4, 2019, 9:10pm

Heh, maybe a lame question, but I have a PS3 Eye and I don’t know how to trigger it. What do I have to say to launch something?
Is it something like “Hi Snips”?

infiniteloop · January 6, 2019, 11:48am

You have to access the web interface via browser using the IP address where Rhasspy is running + port 12101 (e.g. http://192.168.1.132:12101).
In the speech tab you can hold down the button “HOLD TO RECORD” while speaking.
When you release the button, Rhasspy processes your speech, gives you a response/intent, and optionally sends it to Home Assistant.
Otherwise, in the Settings tab you can specify a wake word so you can trigger Rhasspy with your voice instead of pressing the button. This should be the default setting for Rhasspy.
Hope this helps

synesthesiam · January 6, 2019, 8:02pm

Not a lame question. @infiniteloop is correct that the web interface should probably be your first stop for testing out your voice commands.

Once you have things going, go to the Settings tab, check the “Listen for wake word on start-up” box in the Rhasspy section, and then restart Rhasspy. The wake word down in the Wake section of the Settings tab should be what triggers Rhasspy. By default, this is “okay rhasspy”.

Beware that the current wake word system (pocketsphinx) is not very good. In the next version of Rhasspy (coming soon!), I have support for Snowboy and Mycroft Precise, which require some training but are much better.

ozzi91 · January 6, 2019, 8:16pm

I have a very big problem. I already cleared the config; can you post a default one? Does this system only use the commands saved in the config?

My English is not very good, but when I say something like “turn light on” it can be translated to “change light to green”, which is very strange to me, even though I cleared the config.

synesthesiam · January 7, 2019, 3:34am

OK, I’ll try to help. Which language are you using with Rhasspy? All of the defaults are available here.

ozzi91 · January 7, 2019, 5:39pm

Thank you very much for your commitment.
I am using English. I already have the Snips addon and it recognises things well, for example “turn light on”.
BTW, I already have the default settings. Maybe I should create a sentence “turn light on” so that it will be recognised? Right now, when I say “turn light on” it is translated to “set the bedroom light to red”. I will do whatever you want. Thank you in advance.

Romkabouter · January 11, 2019, 2:09pm

I have installed the addon and will try some things with the Dutch language. Great work so far!

synesthesiam · January 11, 2019, 10:03pm

This is correct. Rhasspy will only recognize sentences that you put in the Sentences tab. You can do things like “turn [the] light on” to get it to recognize both “turn the light on” and “turn light on”, so you don’t have to type everything out manually.
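
To illustrate what the square brackets do, here is a minimal Python sketch that expands optional words into every sentence they stand for. It is only an illustration of the idea; the function name `expand_optionals` is made up for this example and this is not Rhasspy’s actual sentence parser.

```python
import itertools
import re
from typing import List


def expand_optionals(template: str) -> List[str]:
    """Expand every [optional] group into variants with and without it."""
    # re.split with a capture group returns literal text at even indexes
    # and the bracketed words at odd indexes.
    parts = re.split(r"\[([^\]]*)\]", template)
    choices = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            choices.append([part, ""])  # optional: keep it or drop it
        else:
            choices.append([part])      # literal text: always kept
    sentences = []
    for combo in itertools.product(*choices):
        # Join the pieces and collapse the double space left by a dropped word.
        sentences.append(" ".join("".join(combo).split()))
    return sentences


variants = expand_optionals("turn [the] light on")
print(variants)  # ['turn the light on', 'turn light on']
```

Each additional optional group doubles the number of sentences, which is why writing the template is much shorter than typing every variant.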

synesthesiam · January 11, 2019, 10:04pm

Thanks for giving it a try! I’m hoping to push out a new version this weekend, which will really expand on the wake word functionality.

chairstacker · January 11, 2019, 10:53pm

Looking forward to that!

Romkabouter · January 12, 2019, 1:08pm

So far it is working well, only my Hass.io is not in the living room.
I am thinking about forking it and implementing (part of) the Snips protocol.
That works with an MQTT broker for audio, so that way I do not need the hardware on the Pi.

synesthesiam · January 12, 2019, 3:31pm

This is a fantastic idea. I went ahead and added some preliminary support for the Hermes (snips) protocol via MQTT. So far, I just have it listening for the hotword detected event.

Does the Hermes protocol specify anything for audio data over MQTT? I’m currently streaming it via nanomsg to my external wake word services (snowboy, mycroft precise), but MQTT would do fine as well.

Romkabouter · January 12, 2019, 4:55pm

Yes, the Snips Audio Server publishes a lot of small WAV messages in 16-bit, 16000 Hz format.

There is more info here: https://docs.snips.ai/ressources/hermes-protocol
A nice graphical view can be seen here:
https://snips.gitbook.io/tutorials/t/technical-guides/listening-to-intents-over-mqtt-using-javascript

The main idea for Rhasspy would be listening to an audio MQTT stream; you could implement that any way you want. It does not really have to follow the Hermes protocol, although that is a pretty decent design.
The ping! sounds and any TTS are sent to a playBytes topic in WAV format; I think that is also important to handle. That way, the external audio source could also play the response.

My main drive here is: I like Snips, but it does not yet have Dutch language support.
I like Rhasspy because it handles Dutch, so combining things might be cool.

The reason I need the MQTT functionality is that I currently use my Matrix Voice as a standalone Audio Server:

synesthesiam · January 12, 2019, 8:12pm

OK, I think this should be pretty straightforward. Rhasspy’s recording architecture is centered around an AudioRecorder class that distributes raw 16-bit 16 kHz mono frames to the various sub-components of the system. I should be able to add a “recorder” that simply unpacks the audio data from the WAV frames and relays them while “recording”. The only trick will be to ensure that the chunks are 10 ms, 20 ms, or 30 ms long in order to be compatible with webrtcvad (what I use to detect silence).
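
As a rough sketch of that re-chunking step (not Rhasspy’s actual recorder code), the Python below unpacks the PCM data from small WAV messages and regroups it into 30 ms chunks. The names `Rechunker`, `make_wav`, and `read_pcm` are made up for this example, and the input is assumed to be 16-bit 16 kHz mono in 256-sample messages.

```python
import io
import wave

SAMPLE_RATE = 16000  # Hz, assumed to match the incoming stream
SAMPLE_WIDTH = 2     # bytes per sample (16-bit)
FRAME_MS = 30        # webrtcvad accepts 10, 20, or 30 ms frames


def make_wav(pcm: bytes) -> bytes:
    """Wrap raw mono PCM in a WAV container (stands in for an MQTT payload)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(SAMPLE_WIDTH)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def read_pcm(wav_bytes: bytes) -> bytes:
    """Strip the WAV header and return only the raw audio data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.readframes(wav_file.getnframes())


class Rechunker:
    """Collects PCM from WAV messages and emits fixed-length chunks."""

    def __init__(self, frame_ms: int = FRAME_MS) -> None:
        # 30 ms at 16 kHz = 480 samples = 960 bytes
        self.chunk_bytes = SAMPLE_RATE * frame_ms // 1000 * SAMPLE_WIDTH
        self.pending = b""

    def feed(self, wav_bytes: bytes) -> list:
        self.pending += read_pcm(wav_bytes)
        chunks = []
        while len(self.pending) >= self.chunk_bytes:
            chunks.append(self.pending[:self.chunk_bytes])
            self.pending = self.pending[self.chunk_bytes:]
        return chunks


rechunker = Rechunker()
message = make_wav(bytes(512))  # 256 silent 16-bit samples
chunks = []
for _ in range(4):
    chunks.extend(rechunker.feed(message))
print(len(chunks), [len(c) for c in chunks], len(rechunker.pending))
```

Four 512-byte messages carry 2048 bytes of audio, so this yields two 960-byte chunks and leaves 128 bytes waiting for the next message. Each emitted chunk is the size a 30 ms VAD frame needs at this sample rate.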

Do you want Rhasspy to actually play the WAV sounds, or just send out the playBytes messages?

Romkabouter · January 12, 2019, 10:00pm

The audio from Snips is stereo as well, I believe, sorry about that. But mono should be fine too if that is easier; Rhasspy does not have to be a Snips replacement, obviously.
I would prefer sending the WAV sound to the playBytes topic, but maybe both are possible.
That way, any device which subscribes to that topic is able to play sounds coming from Rhasspy.

From a technical point Snips works like this:

The AudioServer records from the microphone and sends the audio as 256-frame (configurable) WAV files to an MQTT broker. The format is 16-bit, 16000 Hz stereo, so in each WAV message there is the WAV header + 512 bytes. It is published to hermes/audioServer//audioFrame.
The Hotword service is subscribed to that topic and most probably uses a ring buffer to listen to that stream.
When a hotword is detected, a message is published to hermes/hotword/<WAKEWORD_ID>/detected
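
To make the sizes above concrete, here is a small Python sketch that builds one such WAV message and keeps recent messages in a ring buffer. It is an assumption-laden illustration, not Snips code: `wav_message` and `detected_topic` are made-up helper names, and the wake word ID `default` is only a placeholder. Note that 256 frames of 16-bit audio come to 512 bytes only for a single channel; with two channels the data would be 1024 bytes, so the 512-byte figure suggests the stream may actually be mono.

```python
import collections
import io
import wave


def wav_message(frames: int = 256, channels: int = 1) -> bytes:
    """Build a silent 16-bit 16 kHz WAV message with the given frame count."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)
        wav_file.setframerate(16000)
        wav_file.writeframes(bytes(frames * channels * 2))
    return buffer.getvalue()


def detected_topic(wakeword_id: str) -> str:
    """Topic the hotword service publishes to, as described above."""
    return f"hermes/hotword/{wakeword_id}/detected"


mono = wav_message(channels=1)
stereo = wav_message(channels=2)
print(len(mono), len(stereo))  # header plus 512 or 1024 bytes of audio

# Ring buffer holding roughly the last second of audio:
# 16000 samples per second / 256 samples per message = 62 messages.
ring = collections.deque(maxlen=16000 // 256)
for _ in range(100):
    ring.append(mono)  # older messages fall off the front automatically

print(len(ring), detected_topic("default"))  # "default" is a hypothetical ID
```

A `deque` with `maxlen` is one simple way to get ring-buffer behaviour in Python; a real hotword service might use a different structure.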

Maybe this is a good resource for you: https://github.com/syntithenai/opensnips/tree/master/opensnips/old/docker-images/rasa2/snips_services

In the past I have created a Hassio Addon called Snowboy Hotword Detection
https://github.com/Romkabouter/hassio-addons/tree/master/snowboy-hotword/snowboy, that implements the above hotword detector using Snowboy.
Maybe it will help you

Like I said, rhasspy does not have to be a Snips replacement, but a lot can be learned here

ozzi91 · January 13, 2019, 11:10am

How can I completely delete this voice assistant? I do not see any profiles to choose from like before; a rebuild does not fix it, and neither does uninstalling and reinstalling. I just want to delete every single file to do a clean install.

synesthesiam · January 13, 2019, 1:31pm

How did you install Rhasspy? Hass.io, Docker, or in a virtual environment?

ozzi91 · January 13, 2019, 1:41pm

Hass.io as an addon, but now it is working nicely. I have some more questions, though. First, when I go to “Words” and then click the “Pronounce” button, I get the error "‘APlayAudioPlayer’ object has no attribute ‘play_wav’". How can I fix it?

Second: I have a light called “Lampka nad ubraniami” and I have a lot of automations for it. Because it is not an English name, I would like to call it by saying, for example, “turn room light on”. How do I do that? Do I make a completely different pronunciation for my Polish word? I mean, make a custom word “ubraniami” and pronounce it as “room light”?

And the last question: I configured a simple phrase to turn the light on. The “friendly name” of this light is “Lampka nad ubraniami”. When I press “hold to record” on the config page, it gives me this result:

"intent":

  "entities":
    0:
      "entity": "state"
      "value": "on"
    1:
      "entity": "name"
      "value": "lampka nad ubraniami"
  "hass_event":
    "event_data":
      "name": "lampka nad ubraniami"
      "state": "on"
    "event_type": "rhasspy_ChangeLightState"
  "intent":
    "confidence": 0.95
    "name": "ChangeLightState"
  "text": "turn on the lampka nad ubraniami"
  "time_sec": 5.359760284423828

I have ticked the “Send to home assistant” checkbox. Do I have to do something else to make it work? Put it somewhere? Or should it be working now, after saying a command?

synesthesiam · January 14, 2019, 2:41am

You need to have something in your Home Assistant automations that will catch the rhasspy_ChangeLightState event that Rhasspy sends. See the How it Works section of the README for an example. I’ll try to help with what I can
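
For reference, the event name and data that an automation has to match can be read straight off the output posted above. The Python sketch below reproduces that mapping (prefix `rhasspy_` plus the intent name, with the entities as event data). It mirrors the posted output only; `build_hass_event` is a made-up name and this is not Rhasspy’s actual code.

```python
import json


def build_hass_event(intent_result: dict) -> dict:
    """Derive the Home Assistant event shown in the output above."""
    # Each recognized entity becomes one key in the event data.
    event_data = {
        item["entity"]: item["value"] for item in intent_result["entities"]
    }
    return {
        "event_type": "rhasspy_" + intent_result["intent"]["name"],
        "event_data": event_data,
    }


# The recognition result from the post above, trimmed to the relevant fields.
result = {
    "entities": [
        {"entity": "state", "value": "on"},
        {"entity": "name", "value": "lampka nad ubraniami"},
    ],
    "intent": {"confidence": 0.95, "name": "ChangeLightState"},
    "text": "turn on the lampka nad ubraniami",
}

event = build_hass_event(result)
print(json.dumps(event, indent=2))
```

So an automation would need to trigger on the event type `rhasspy_ChangeLightState` and then use the `name` and `state` values from the event data to pick the light and the action.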