@synesthesiam I also found this link about an attempt to add Dutch to MaryTTS.
Successfully used Google Wavenet in Rhasspy! Now some cleanup has to be done to make it a bit more configurable.
This project has really evolved over the last few months and I really want to try it.
As far as I currently understand, the latest version is no longer a custom component for Home Assistant and only communicates through the APIs?
I want to run this in a Docker container on my home server, but then I have to use a remote microphone with this setup. What is the cheapest and easiest way to get remote recording and playback working? It doesn't have to be pretty; I just want something that works for testing.
I'm thinking of something like a Pi Zero W with a speaker and a microphone (PlayStation Eye?). Does anyone have a tutorial to set this up? I read about MQTT/Hermes for audio streaming, but if my remote mic doesn't do the hotword detection, audio is recorded all the time and sent to the server? Could this be a problem?
My main goal would be to eventually have 2 or 3 remote mics/speakers that communicate with the same Rhasspy server. Is it possible or planned to have several remote microphones?
Thanks for any help!
This is exactly the setup I'm running now: Rhasspy, an MQTT server, Home Assistant and AppDaemon are running in Docker containers on my NAS, with snips-audio-server running on a Raspberry Pi in my living room (more satellites are planned). You could do this easily with a Raspberry Pi Zero W and a ReSpeaker 2-Mics Pi HAT.
Install Raspbian on your Pi, install the ReSpeaker driver and then install snips-audio-server:
sudo apt update
sudo apt install -y dirmngr
sudo bash -c 'echo "deb https://raspbian.snips.ai/$(lsb_release -cs) stable main" > /etc/apt/sources.list.d/snips.list'
sudo apt-key adv --keyserver pgp.mit.edu --recv-keys D4F50CDCA10A2849
sudo apt update
sudo apt install snips-audio-server
Then put the following in /etc/snips.toml to point at your MQTT server and give your audio server a site ID:
[snips-common]
mqtt = "example.com:1883"
[snips-audio-server]
bind = "livingroom@mqtt"
And restart snips-audio-server with:
sudo systemctl restart snips-audio-server
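For context, Hermes is plain MQTT underneath: the audio server publishes small WAV chunks on `hermes/audioServer/<siteId>/audioFrame`. As a rough, hypothetical sketch of what one such frame looks like on the wire (the topic name comes from the Hermes docs; the helper name and sample rate are my assumptions):

```python
import io
import wave

def make_audio_frame(samples, site_id="livingroom", sample_rate=16000):
    """Pack raw 16-bit mono samples into the WAV payload that a Hermes
    audio server publishes on hermes/audioServer/<siteId>/audioFrame."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 16-bit samples
        wav.setframerate(sample_rate)  # 16 kHz is what Snips/Rhasspy expect
        wav.writeframes(samples)
    topic = f"hermes/audioServer/{site_id}/audioFrame"
    return topic, buf.getvalue()
```

With the config above (site ID `livingroom`), you could watch these frames arrive with any MQTT client subscribed to that topic to confirm the satellite is streaming.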
Note that snips-audio-server is not open source. I'm currently working on an open source audio server that speaks the Hermes protocol (and hence can talk to the Rhasspy server) and should be installable on a Raspberry Pi. That way you could eliminate Snips and run this setup on a completely open source stack.
If the satellites don't have to be Raspberry Pis, take a look at @Romkabouter's Matrix-Voice-ESP32-MQTT-Audio-Streamer, which runs on the Matrix Voice with ESP32, an impressive piece of hardware.
By the way, you're right that this setup continuously sends audio to the Rhasspy server. This hasn't been a problem on my network, but if you don't want this, you could run a hotword service on your satellite device and only start recording after the hotword is detected, until there's silence again.
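That gating logic could be sketched roughly like this (a minimal, hypothetical example: the hotword detector is a stub passed in as a callback, and a simple frame-energy check stands in for a real VAD; an actual satellite would use something like Porcupine or Mycroft Precise plus proper voice activity detection):

```python
def frame_energy(frame):
    """Mean squared amplitude of one audio frame (list of samples)."""
    return sum(s * s for s in frame) / len(frame)

def gate_frames(frames, is_hotword, silence_threshold=100, silence_frames=3):
    """Drop audio until a hotword fires, then yield frames until
    silence_frames consecutive quiet frames have been seen."""
    recording = False
    quiet = 0
    for frame in frames:
        if not recording:
            if is_hotword(frame):   # stub: a real detector scores audio
                recording = True
                quiet = 0
        else:
            yield frame
            if frame_energy(frame) < silence_threshold:
                quiet += 1
                if quiet >= silence_frames:
                    return          # trailing silence: stop streaming
            else:
                quiet = 0
```

Only the frames between the hotword and the trailing silence would ever be published to the MQTT broker.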
Thanks for the small tutorial!
This is almost the same setup I'm using or planning to use.
Ah ok, so the snips audio server (Snips satellite) is the only streaming solution possible at the moment (if you don't want to forward a whole device through the network). This is good enough for testing and I'm looking forward to your open source alternative.
I have a Pi Zero here, so I will start with that, but the Matrix ESP32 looks like a great solution for the future.
Do you have experience with the ReSpeaker 2-Mics HAT? Does it work well at distances up to 5 meters?
Most of the time I have it two meters from me and that works perfectly. Occasionally I have used it from a larger distance too, around five meters, but then you have to talk really loud and clear. Of course it also depends on your enclosure: mine is housed in a case that partly covers the microphone and speaker (with holes in the case), so maybe that decreases the quality of the audio input in my setup.
Thanks again for the quick reply.
Well, I shall see; I just ordered one for my Pi. 5 meters is the most extreme situation in our living room; mostly it will be around 2-3 meters.
Great! I'll be interested to check out the code. I'm planning to do a new release of Rhasspy by the end of next week, so I might be able to include this.
Thanks! I've got this on my TODO list. If I can't get it working, I'll contact the author and see if they have any ideas.
Cool, I'll try to submit a PR by then.
@ntuseracc So after a bit of hacking I have a working equivalent of snips-audio-server. I'll clean up the code and publish it soon, so you can install it on your Pi Zero W as a remote recording and playback device for Rhasspy.
Wow, that was fast!
I got my ReSpeaker HAT today and wanted to play around with the Snips tool this evening to test my first automations.
If you have it published, I will try it and report back.
Thanks for your work!
I have now published a rough version (no daemon, no systemd script, no logging except stdout output) on GitHub:
It probably has some bugs in corner cases, because I only finished it today, but those will surely be ironed out after some more testing. I also have a TODO list to make it more robust, so you can then run it as a service and have it log to syslog.
Feedback is welcome
By the way, @synesthesiam, Snips has just open sourced an inference engine for their hotword models:
Maybe also something interesting for Rhasspy?
That's great. I have a Pi Zero W lying around, so I would be glad to test it.
I'll check it out this weekend and let you know.
I need to sit down at some point and try out the snips wake word system, as well as porcupine. Rhasspy could definitely use more options for wake word systems.
I've finally gotten Mycroft Precise's new version (0.3.0) integrated, so that will be available in the next version of Rhasspy with a universal model for "hey mycroft" included. Unfortunately, it seems to burn about 50% of the CPU while running…
Damn you @synesthesiam, I was just setting up snips.ai and came across Rhasspy. This looks so much better. Back to square one. Good thing I like learning and experimenting.
Thank you for your excellent work and documentation, which I am reading and can understand. Better than snips already.
I aim to please
Have to say this as well: great documentation, and it already looks very polished (web UI).
What could be added to the documentation in the future are some more complex examples (sentences and automations) and instructions for adding TTS support to Home Assistant. I did find the TTS instructions in the Home Assistant configuration examples on GitHub, though.
A place to collect community examples would be cool as well.
By the way, has anyone created a working timer with this? Like "ok rhasspy, set a timer for 4 minutes and 30 seconds", and it reports back when the time is up?
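Not that I know of, but the handler side is simple enough to sketch. A minimal, hypothetical example (the `set_timer` name and keyword arguments are made up; a real version would read `minutes`/`seconds` from the recognized intent's slots and announce completion through TTS or Home Assistant instead of printing):

```python
import threading

def set_timer(minutes=0, seconds=0, on_done=None):
    """Start a one-shot timer and invoke on_done when it fires.

    In a real Rhasspy setup, minutes/seconds would come from the
    intent's slots and on_done would trigger a TTS announcement.
    """
    total = minutes * 60 + seconds
    timer = threading.Timer(total, on_done or (lambda: print("Timer is up!")))
    timer.start()
    return timer

# "set a timer for 4 minutes and 30 seconds" would map to:
# set_timer(minutes=4, seconds=30)
```

The harder part is the sentences/slot side: getting the duration words recognized as numbers so the handler receives `minutes=4, seconds=30`.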