Rhasspy offline voice assistant toolkit

Thanks for the tips and links! I’ve made a first pass at implementing the Hermes protocol in Rhasspy. Here’s how to use it:

  • Enable and configure MQTT on the Settings tab (or in your default profile)
  • Set the wake system to remote MQTT

If all goes well, this should stream the audio data out via MQTT in the format you described above. When the wake word is detected and a hermes/hotword/<WAKEWORD_ID>/detected message is sent back, it’ll trigger Rhasspy to listen for a command.

I haven’t added it to the Settings page yet, but you can set sounds.system to hermes in your profile to have it send out a playBytes message instead of using aplay.
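For anyone unsure what "set sounds.system to hermes in your profile" means in practice, here is a minimal sketch. It assumes the profile is a nested JSON document in which a dotted name such as sounds.system maps to nested keys; the helper `set_dotted` is hypothetical and not part of Rhasspy.

```python
import json


def set_dotted(profile: dict, dotted_key: str, value) -> dict:
    """Set a nested value such as "sounds.system" inside a profile dict.

    Hypothetical helper for illustration only; it is not a Rhasspy function.
    """
    node = profile
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        # Create intermediate objects on demand ("sounds" in this example).
        node = node.setdefault(name, {})
    node[leaf] = value
    return profile


# Start from an empty profile; a real profile would contain other settings.
profile = {}
set_dotted(profile, "sounds.system", "hermes")
print(json.dumps(profile, indent=2))
```

The printed JSON is the fragment you would merge into your own profile file by hand.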

Hope it works!

Everyone, I’ve updated Rhasspy on Docker and my Hass.IO add-on repository. I did a major refactoring of the code base, so I hope I fixed more bugs than I introduced…

Anyway, here are some of the new features:

  • Support for external wake word detection systems. Based on feedback from @Romkabouter, I’ve based this on the Snips.AI Hermes protocol. I’m working on two Hass.IO add-ons for wake word detection, one based on snowboy, the other on Mycroft Precise.
  • A command-line interface for building voice-enabled Linux applications on top of Rhasspy (see the rhasspy shell script).
  • Support for tuning your acoustic model with voice samples (tutorial coming soon)

For most people, I assume the external wake word stuff is the most interesting. It’s more complicated to use than I would like, but I’m doing my best to make it so Rhasspy can interact with other systems. To get it working, you need:

Then in your Settings, you need:

  1. MQTT enabled and pointing at your MQTT server
  2. Your wake word system set to remote MQTT

I plan on making another video to demonstrate this soon. At the very least, I think a docker-compose example could help. It shouldn’t be too hard to put Rhasspy, an MQTT server, and one of the wake word systems all together…


If you could just say which docker image to use for the wake word section, I can add it to my existing docker-compose file which already has everything else necessary.

Hello, I have another problem :stuck_out_tongue: and one question

After updating Rhasspy today, I get this error whenever I say something:

"intent": {
    "entities": [
        {"entity": "state", "value": "on"},
        {"entity": "name", "value": "lampka nad ubraniami"}
    ],
    "error": "Invalid URL 'api/events/rhasspy_ChangeLightState': No schema supplied. Perhaps you meant http://api/events/rhasspy_ChangeLightState?",
    "hass_event": {
        "event_data": {
            "name": "lampka nad ubraniami",
            "state": "on"
        },
        "event_type": "rhasspy_ChangeLightState"
    },
    "intent": {
        "confidence": 0,
        "name": "ChangeLightState"
    },
    "text": "turn on the lampka nad ubraniami",
    "time_sec": 0.009526252746582031
}

What is causing this error?

And the question: is this the right way to make the voice command work?

- alias: Rhasspy lampka nad ubraniami
  trigger:
    platform: event
    event_type: rhasspy_ChangeLightState
    event_data:
      name: lampka nad ubraniami
      state: on
  action:
    entity_id: light.lampka_nad_ubraniami
    service: light.turn_on

Do you have any plans for, or is there currently a way to, process network streams as the audio device? Say, a bunch of ESP32s outputting their mics as a constant audio stream.

This should work now, actually. There’s a new MQTT option under the microphone settings, which will listen for WAV data inside an MQTT payload as @Romkabouter described above.

Make sure not to also set the wake word system to MQTT, though, as this will try to send/receive network audio at the same time.

From the error, it looks like your Home Assistant URL got erased somehow in the settings. The default value is http://hassio/homeassistant/ if you’re inside Hass.io.
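To illustrate why an empty Home Assistant URL produces that message, here is a small sketch using only the Python standard library. It assumes the event URL is formed by joining the configured base URL with the relative path from the error; how Rhasspy actually builds the URL internally may differ, and `event_url` is a hypothetical helper.

```python
from urllib.parse import urljoin, urlparse

# Relative path taken from the error message above.
EVENT_PATH = "api/events/rhasspy_ChangeLightState"


def event_url(base_url: str, path: str = EVENT_PATH) -> str:
    """Join a configured base URL with the event path (illustrative only)."""
    return urljoin(base_url, path)


broken = event_url("")  # the setting was erased
fixed = event_url("http://hassio/homeassistant/")  # default inside Hass.io

# With an empty base, nothing is left to supply the "http://" scheme.
print(repr(broken), "scheme:", repr(urlparse(broken).scheme))
print(fixed)
```

With the base URL restored, the joined URL carries a scheme again, which is what the "No schema supplied" complaint is about.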

Your Home Assistant automation looks right to me :slight_smile:

I’ve got a start on a docker compose example with Rhasspy, snowboy, and an MQTT server. To make it work, you need to go into the Rhasspy settings (once it’s all up) and:

  1. Set the wake system to remote MQTT
  2. Enable MQTT and set the host to mosquitto (the docker hostname of the MQTT server)

I have it set to use my personal snowboy model with “okay rhasspy” as the wake word. I’d suggest creating your own personal model and giving it a try!

My configuration:

rhasspy:
  image: synesthesiam/rhasspy-server:latest
  environment:
    - RHASSPY_PROFILES=/profiles
  volumes:
    - $HOME/.rhasspy/profiles:/profiles
  devices:
    - /dev/snd:/dev/snd
  ports:
    - "12101:12101"

snowboy:
  image: synesthesiam/snowboy:1.3.0
  volumes:
    - /run/dbus:/run/dbus
  command: --host mosquitto --model /models/okay_rhasspy.pmdl

It stopped working when I restarted.

Hey awesome, thanks for the response. I haven’t had a chance to play around yet, so sorry if it’s a dumb question, but when you say don’t set the wake word system to MQTT, do you mean I need to move the wake word portion to the audio device and only send the relevant voice commands to Rhasspy via MQTT to be interpreted?

I was hoping essentially to have a bunch of hot mics streaming to rhasspy and have it take care of everything.

I may need to think about this more, but here are my initial thoughts:

By default, Rhasspy records from a microphone and, when you set the wake word system to MQTT, it streams this recorded audio data out on the hermes/audioServer/<SITE_ID>/audioFrame topic. When some external system detects the wake word (snowboy, etc.), it publishes to hermes/hotword/<WAKEWORD_ID>/detected. Rhasspy picks that up and switches to listening for a command (from the microphone).
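As a rough sketch of the two topics involved, the helpers below build the audio frame topic and pull the wake word ID out of a detected topic. The function names and the example IDs ("default", "okay_rhasspy") are placeholders of mine, not values taken from Rhasspy; only the topic patterns come from the description above.

```python
from typing import Optional


def audio_frame_topic(site_id: str) -> str:
    """Topic on which recorded audio is streamed out."""
    return f"hermes/audioServer/{site_id}/audioFrame"


def detected_topic(wakeword_id: str) -> str:
    """Topic an external wake word system publishes to on detection."""
    return f"hermes/hotword/{wakeword_id}/detected"


def wakeword_id_from_topic(topic: str) -> Optional[str]:
    """Return the wake word ID if the topic is a 'detected' topic, else None."""
    parts = topic.split("/")
    is_detected = (
        len(parts) == 4
        and parts[:2] == ["hermes", "hotword"]
        and parts[3] == "detected"
    )
    return parts[2] if is_detected else None


print(audio_frame_topic("default"))
print(wakeword_id_from_topic(detected_topic("okay_rhasspy")))
```

An MQTT client would subscribe to the detected topic and publish to the audio frame topic; the client itself is left out here to keep the sketch free of third-party dependencies.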

In your case, though, you just want to send audio data to Rhasspy. By setting the audio recording system to MQTT, Rhasspy will not record from a microphone, and will instead take in your audio data (on hermes/audioServer/<SITE_ID>/audioFrame). If you have snowboy listening on the same MQTT server, then it should notify Rhasspy when the wake word is detected on hermes/hotword/<WAKEWORD_ID>/detected. Then, Rhasspy will switch to listening for a command from the incoming MQTT audio stream.

So really, the point is to avoid having Rhasspy both publish out audio data and listen for it at the same time. I may need to make this clearer in the settings.
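If you are producing the audio yourself (say, from an ESP32 bridge), each MQTT payload needs to be WAV data rather than bare samples. Here is a minimal sketch that wraps raw PCM in a WAV container with the standard library. The 16 kHz, 16-bit, mono format and the chunk size are assumptions on my part, so check what your setup expects; `pcm_to_wav_bytes` is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, rate: int = 16000, width: int = 2,
                     channels: int = 1) -> bytes:
    """Wrap raw PCM samples in a WAV header and return the complete bytes.

    The default format (16 kHz, 16-bit, mono) is an assumption.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(width)
        wav_file.setframerate(rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# 256 samples of 16-bit silence stand in for one chunk of microphone audio.
chunk = bytes(256 * 2)
payload = pcm_to_wav_bytes(chunk)
print(len(payload), payload[:4])
```

The resulting bytes would be published as the message body on hermes/audioServer/<SITE_ID>/audioFrame with whatever MQTT client you use.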

Did you see any errors in the Docker log? Thanks.

I restarted docker-compose
rhasspy skin in the future


OK, this makes sense.

Like I said, I haven’t had a chance to play around with Rhasspy, so I might be imagining a bit how it works, but it looks fun.

Hmm, I got an error: OSError: [Errno -9996] Invalid input device (no default output device)
This is just a Pi 3 with Hass.io and a USB microphone. I uninstalled and reinstalled it, but got the same issue.
Anyone else?

OK, I am learning fast. A simple automation is working, and now I am fighting with wake words.
I checked “Listen for wake word on start-up” and “Use pocketsphinx locally”, and set the wake word to “okay rhasspy” and other simple words, but I cannot seem to make it work.
Should I hear a notification sound when Rhasspy starts listening, like in Snips?

After many tries I read about the other add-ons that provide an MQTT wake word system, but I cannot install them as add-ons in Hass.io. I just hit install, the progress circle keeps spinning, and nothing happens. It doesn’t matter whether it is Mycroft or Snowboy.

After a restart I don’t see any profiles in the right corner (fr, en, etc.). To make it work, I do a few restarts and they show up, but should it be like this?

@Romkabouter It happened to me too. I just physically reconnect the devices, restart the Arduino, or wait a bit longer after boot.

I’ve seen this happen in Hass.IO when it didn’t select an input audio device for me. I had to stop the add-on, select the microphone, save, then start it back up. Kind of irritating.

The Mycroft and Snowboy add-ons can take quite a while to build. Try the Snowboy add-on first, since it’s the one I’ve tested the most. I usually go to the Hass.io System tab, refresh the log, and wait for it to report "Build complete for " for the add-on (sometimes 10-15 minutes on a Pi).

Pocketsphinx should work sometimes with an appropriate microphone and wake word (they recommend 3-4 syllables). I have almost no success with my laptop’s built-in microphone, but OK performance through the PS3 Eye. Make sure you have the proper microphone selected in the Hass.io add-on tab for audio input too. Rhasspy will pick the default audio device unless told otherwise.

Also try to switch from PyAudio to arecord for audio recording in the settings. Some folks here have had better luck with that for some reason.

I keep forgetting to mention too: Rhasspy has an OpenAPI (Swagger) compatible page available where you can test lots of things through the HTTP API.

Just visit http://localhost:12101/api/ to see the testing page (don’t forget that final slash! It seems really picky about it).

You can do cool things, like run a voice command from a WAV file:

curl -X POST -H 'Content-Type: audio/wav' --data-binary @/path/to/my-command.wav localhost:12101/api/speech-to-intent

Or bypass the wake word and have Rhasspy listen for a command right now:

curl -X POST localhost:12101/api/listen-for-command

Many of the API calls accept a ?no_hass=true query parameter that tells Rhasspy not to pass anything on to Home Assistant.
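If you would rather script these calls than use curl, here is a rough Python equivalent using only the standard library. It only builds the requests. The helper name `rhasspy_request` is hypothetical, and whether a particular endpoint honors no_hass is something to verify on the API page.

```python
from urllib.parse import urlencode
from urllib.request import Request


def rhasspy_request(endpoint: str, body: bytes = b"", content_type: str = "",
                    no_hass: bool = False,
                    base: str = "http://localhost:12101") -> Request:
    """Build (but do not send) a POST request for a Rhasspy API endpoint."""
    url = f"{base}/api/{endpoint}"
    if no_hass:
        # Ask Rhasspy not to pass the result on to Home Assistant.
        url += "?" + urlencode({"no_hass": "true"})
    headers = {"Content-Type": content_type} if content_type else {}
    return Request(url, data=body, headers=headers, method="POST")


# Same as: curl -X POST localhost:12101/api/listen-for-command?no_hass=true
listen = rhasspy_request("listen-for-command", no_hass=True)
print(listen.get_method(), listen.full_url)

# Same as the speech-to-intent curl call; real WAV bytes would go in body.
stt = rhasspy_request("speech-to-intent", body=b"RIFF",
                      content_type="audio/wav")
print(stt.get_method(), stt.full_url)
```

To actually send one, pass it to urllib.request.urlopen while a Rhasspy server is running on that host and port.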