Rhasspy offline voice assistant toolkit

Awesome! I will do my best not to break things for you :slight_smile: I may have to grab an ESP32 to try out your streamer; I have a bunch of ESP8266s lying around, but they may not be powerful enough.

As you’ve seen, I think the wake word functionality is Rhasspy’s weakest point. I have yet to find a system that works well and is usable across the various architectures Rhasspy is intended to run on. Looking at Snips’ tract compatibility layer for TensorFlow, it might just be possible to run Mycroft Precise models on all the supported platforms!

BTW, I updated the snowboy/precise add-ons to use /share in Hass.io, so your models should stay put between reboots now.

You will need a Matrix Voice, not just any ESP32 :wink:

Indeed, I could not get the wake word working (none of the methods except MQTT), but using a separate add-on is no problem for me.
I will try /share, nice!


This is my new error after reinstalling my machine.

Please add the Vietnamese language: https://github.com/trungtv/vivi_spacy

Try deleting the command line under rhasspy and see if that helps. That might be causing some problems during startup.

A spaCy model is great, but not enough to add a new language to Rhasspy. At a minimum, I need:

  1. An acoustic model (sounds -> phonemes)
  2. A pronunciation dictionary (phonemes -> words)

It’s also nice to have a grapheme-to-phoneme (g2p) model for guessing how new words are pronounced, but not completely necessary.
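For reference, the pronunciation dictionary is just a plain text file in CMU format: one word per line, followed by its phoneme sequence. These example entries are from the English CMU dictionary; a Vietnamese dictionary would use whatever phoneme set its acoustic model defines:

HELLO HH AH0 L OW1
WORLD W ER1 L D

The g2p model is what lets Rhasspy guess entries like these for words that aren’t in the dictionary yet.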

However, I do think it may be possible to add Vietnamese in another way. I see that a group from Canada has a Vietnamese acoustic model trained for Kaldi (a different speech recognizer). I haven’t used Kaldi, but Sphinx4 (pocketsphinx’s big brother) might be able to load it!

I will try to get Sphinx4 working with Rhasspy so we can load the Vietnamese model and test it out. It would be very helpful for me if you could provide some WAV files of example voice commands in Vietnamese and the words that go with them. Can you do that?


OK. Thank you very much. I’m working on my NUC. Can you help me configure this? I added Rhasspy, but it’s not working.

I want to add janet so it can run and receive information while offline.

Thanks for your enthusiasm. I will send what you requested as soon as possible.

Here are some WAV files of example voice commands in Vietnamese, along with the words that go with them. Sorry for my English.

Hi synesthesiam, first of all I think your work is awesome. I’ve been studying the Snips platform for a couple of months, but only yesterday I found your repo, and I’m really impressed by your work. I’m working on a personal assistant that can drive some BLE devices (relays, sensors, etc.). At the moment I’m using Snips on an RPi 3B+ with a Node app that uses MQTT to subscribe and receive intents from the Snips platform; the Node app then drives the BLE devices. So far I’ve only read your documentation in a preliminary way. I don’t need Home Assistant for my project, so I kindly ask if you can point me in the right direction to achieve my setup. All I need is:

RPi 3B+ with Debian Stretch Lite.
Rhasspy installed and configured.
My Node app that uses MQTT to subscribe and receive intents.

Hoping you can help me. Thanks again for your work, and regards.

This is really great stuff, @synesthesiam! Thank you for making this available! I’m excited to get started playing with it. I have a few questions/comments/suggestions.

First, because this thread/post contains so much configuration/troubleshooting information that’s user- and/or platform-specific, it’s pretty tough to slog through and try to find answers to questions I have, so I apologize if any of my questions below have already been answered. I would suggest renaming this thread to “Rhasspy Configuration/Support” (or a reasonable facsimile) and opening a new thread that gives a Rhasspy overview, includes a FAQ and the most current procedure for installing and getting started with Rhasspy on both Home Assistant and Hassio, and consolidates key/still-relevant bits of info that are buried in this thread. Big, bold text that says, “DO NOT ASK FOR SUPPORT IN THIS THREAD — USE [link to this thread] FOR SUPPORT” would also be a really good idea.

I posted some of the following questions in the comments section of your YouTube video but this is probably a better place to ask, so please feel free to ignore the questions on YouTube.

  1. Is this still equally viable/supported on Home Assistant AND Hassio? (All of the later posts on this thread seem to focus on Hassio.)
  2. Where can I find the latest recommended installation/configuration/setup procedure?
  3. Could a different text-to-speech engine be used to make the speech output sound more natural?
  4. Is it possible to have a randomized response for confirmations? (i.e., not just “OK” but sometimes “You got it” or “Consider it done” or “Happy to help”)
  5. Could Rhasspy trigger Node-RED (instead of yaml) automations?
  6. Would it be possible to program a task consisting of a sequence of back-and-forth responses? (“Rhasspy, change the light color.” “OK, which light?” “The corner lamp.” “Sure, what color?” “Blue.” “Got it. Here you go.”) A better example might be (for my extremely forgetful wife): “Rhasspy, I’m leaving.” “OK, let’s go over your checklist. Do you have your inhaler?” “Yes.” “Good. Do you have your phone?” “Hang on, let me check… No. I don’t know where it is. Help me find it.” (Rhasspy ignores the rest but is triggered by “help me find it” and calls her phone, then listens for confirmation that she has it and continues with the checklist.) “OK, Rhasspy, I have my phone!” “Great! Do you have your sunglasses?” (Etc., until the whole list has been processed.)
  7. In one of your posts, you announced that you were allowing users to swap out internal components - including triggering actions using something other than Home Assistant. Unfortunately, that link now goes to a 404 on GitHub. I’m very interested in using Rhasspy for a voice-activated password manager — no Home Assistant required, but it would be nice (maybe even essential) to trigger output from a more natural-sounding TTS program.

Great, thank you very much! I’ll get back to you when I’ve had a chance to test things out.

Issue with MQTT. Can someone help me?

Here’s the log:
parallels@debian-gnu-linux-vm:~$ cd /
parallels@debian-gnu-linux-vm:/$ cd rhasspy/
parallels@debian-gnu-linux-vm:/rhasspy$ sudo ./run-venv.sh
[sudo] password di parallels:
Using pre-compiled binaries.

 * Serving Flask app "app.py"
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: off
DEBUG:app:Namespace(profile=None, set=[])
DEBUG:RhasspyCore:Loaded profile from profiles/en/profile.json
INFO:root:++++ Actor System gen (3, 9) started, admin @ ActorAddr-(T|:1900)
DEBUG:root:Thespian source: /rhasspy/.venv/lib/python3.7/site-packages/thespian/__init__.py
DEBUG:DialogueManager: -> started
DEBUG:DialogueManager:started -> loading_mqtt
DEBUG:DialogueManager:Loading MQTT first
DEBUG:DialogueManager:Loading will time out after 30 second(s)
DEBUG:HermesMqtt: -> started
DEBUG:HermesMqtt:started -> connecting
DEBUG:HermesMqtt:Connecting to MQTT broker localhost:1883
DEBUG:DialogueManager:loading_mqtt -> loading
ERROR:HermesMqtt:connecting
Traceback (most recent call last):
  File "/rhasspy/rhasspy/mqtt.py", line 98, in do_connect
    ret = self.client.connect(self.host, self.port)
  File "/rhasspy/.venv/lib/python3.7/site-packages/paho/mqtt/client.py", line 839, in connect
    return self.reconnect()
  File "/rhasspy/.venv/lib/python3.7/site-packages/paho/mqtt/client.py", line 962, in reconnect
    sock = socket.create_connection((self._host, self._port), source_address=(self._bind_address, 0))
  File "/usr/lib/python3.7/socket.py", line 727, in create_connection
    raise err
  File "/usr/lib/python3.7/socket.py", line 716, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
DEBUG:HermesMqtt:Reconnecting in 5 second(s)
DEBUG:DialogueManager:Loading actors
DEBUG:DialogueManager:Actors created. Waiting for ['recorder', 'player', 'wake', 'command', 'decoder', 'recognizer', 'handler', 'hass_handler', 'sentence_generator', 'speech_trainer', 'intent_trainer', 'word_pronouncer'] to start.
DEBUG:PhonetisaurusPronounce: -> started
DEBUG:JsgfSentenceGenerator: -> started
DEBUG:HomeAssistantIntentHandler: -> started
DEBUG:FuzzyWuzzyRecognizer: -> started
DEBUG:HermesCommandListener: -> started
DEBUG:PocketsphinxSpeechTrainer: -> started
DEBUG:FuzzyWuzzyRecognizer:Loaded examples from profiles/en/intent_examples.json
DEBUG:FuzzyWuzzyRecognizer:started -> loaded
DEBUG:HomeAssistantIntentHandler:started -> started
DEBUG:APlayAudioPlayer: -> started
DEBUG:ARecordAudioRecorder: -> started
DEBUG:DialogueManager:word_pronouncer started
DEBUG:DialogueManager:speech_trainer started
DEBUG:DialogueManager:sentence_generator started
DEBUG:DialogueManager:command started
DEBUG:FuzzyWuzzyIntentTrainer: -> started
DEBUG:DialogueManager:handler started
DEBUG:DialogueManager:player started
DEBUG:DialogueManager:recorder started
DEBUG:DialogueManager:recognizer started
DEBUG:DialogueManager:hass_handler started
DEBUG:DialogueManager:intent_trainer started
DEBUG:PocketsphinxWakeListener: -> started
DEBUG:PocketsphinxWakeListener:Loading wake decoder with hmm=profiles/en/acoustic_model, dict=profiles/en/dictionary.txt
DEBUG:PocketsphinxWakeListener:started -> loaded
DEBUG:DialogueManager:wake started
DEBUG:PocketsphinxDecoder: -> started
INFO:PocketsphinxDecoder:Loading decoder with hmm=profiles/en/acoustic_model, dict=profiles/en/dictionary.txt, lm=profiles/en/language_model.txt
DEBUG:PocketsphinxDecoder:started -> loaded
DEBUG:DialogueManager:decoder started
DEBUG:PocketsphinxWakeListener:loaded -> listening
DEBUG:ARecordAudioRecorder:started -> recording
INFO:DialogueManager:Actors loaded
DEBUG:DialogueManager:loading -> ready
DEBUG:ARecordAudioRecorder:['arecord', '-q', '-r', '16000', '-f', 'S16_LE', '-c', '1', '-t', 'raw', '-D', 'default']
DEBUG:ARecordAudioRecorder:Recording from microphone (arecord)
INFO:DialogueManager:Automatically listening for wake word
DEBUG:DialogueManager:ready -> asleep
INFO:werkzeug: * Running on http://0.0.0.0:12101/ (Press CTRL+C to quit)
ERROR:HermesMqtt:connecting
[... the same ConnectionRefusedError traceback and "Reconnecting in 5 second(s)" message repeat every 5 seconds ...]

Do you have this running?

A MQTT broker on localhost (the machine Rhasspy is running on)
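If not, this will get a broker going on a Debian-based system (assuming the stock mosquitto packages):

sudo apt-get install mosquitto mosquitto-clients
sudo systemctl enable --now mosquitto

# sanity check: watch all topics in one terminal...
mosquitto_sub -h localhost -t '#' -v
# ...and publish a test message from another
mosquitto_pub -h localhost -t test -m hello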

SOLVED - Yes, thanks, the MQTT broker was not running.
It works now :slight_smile:

Hi @hacaro, thanks for trying out Rhasspy. I’m a bit late getting back to you, but it looks like you’re figuring things out :slight_smile:

Since you’re not using Home Assistant, you’ll want to make sure to disable it in Rhasspy. This is currently a bit harder than it should be: in the Advanced tab of the web interface, you need to edit the profile JSON to look something like this:

{
    "language": "en",
    "handle": { "system": "dummy" },
    ...
}

Setting the handle system to dummy will make sure Rhasspy doesn’t waste time trying to reach a non-existent Home Assistant server every time you speak a command.

Let me know if you have any other questions!

Thanks for the feedback, @DonnyBahama. I agree that splitting the thread up is a good idea, especially since the older posts refer to a now defunct version of Rhasspy that was based on custom components. I’ll create a new overview thread when I release the next version.

Rhasspy still supports Home Assistant, but as an external service now instead of as a set of custom components. I just call out to Home Assistant’s HTTP API to push events now.

Here are the latest installation instructions. Using Docker is probably the easiest way to get started, though Docker and microphones are mortal enemies.
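For the Docker case, the main microphone gotcha is remembering to pass the sound devices through to the container. Something like this (the image name is from memory, so double-check it against the instructions above):

docker run -p 12101:12101 --device /dev/snd:/dev/snd synesthesiam/rhasspy-server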

My use of eSpeak in Rhasspy is only intended for word pronunciations. I used it in my demo video just because it was handy, but you can use any of the supported TTS systems in Home Assistant. I’ve also played around with deepvoice3 if anyone is interested in hearing about that (there’s no way it would run on a pi, though).

This is something that would be done after Rhasspy interprets the voice command and passes an event along to Home Assistant.

I believe so, yes. I was unaware of Node-RED, and it seems to be exactly what I need to complement Rhasspy outside of the Home Assistant use case. Thanks! I’ll get back to you on this.

This isn’t something Rhasspy can currently do by itself, but it could be achieved by having an external program control Rhasspy (probably via its HTTP API). You need some kind of state machine to track the dialogue, and transition that state machine according to the intents coming back from Rhasspy.

What’s missing, however, is the ability to restrict which intents are active at different points in the dialogue. This is actually something Rhasspy could be quite good at, but will require some work.
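To make that concrete, here is a minimal sketch of such an external controller, assuming Rhasspy’s HTTP API exposes a text-to-intent endpoint on the default port from the log above, and assuming a hypothetical "ConfirmYes" intent you would define yourself; a real setup would feed it transcriptions instead of typed text:

import requests  # pip install requests

RHASSPY = "http://localhost:12101"

def text_to_intent(text):
    # Ask Rhasspy to interpret a sentence; the intent comes back as JSON
    return requests.post(RHASSPY + "/api/text-to-intent", data=text).json()

# A trivial dialogue state machine: the current checklist index is the state
CHECKLIST = ["inhaler", "phone", "sunglasses"]

def run_checklist():
    for item in CHECKLIST:
        print("Do you have your {}?".format(item))
        intent = text_to_intent(input("> "))
        if intent.get("intent", {}).get("name") != "ConfirmYes":
            print("Please go find your {} before you leave!".format(item))

if __name__ == "__main__":
    run_checklist()

Note that in this sketch every intent is always active; restricting which intents are valid in each state is exactly the missing piece described above.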

For your specific example with the checklist, etc., Rhasspy would probably have trouble with too much variation in the responses from your wife. One way around this I’ve found is mixing Rhasspy’s language model with a general English model. This opens things up quite a bit, but leads to longer training/interpretation times. If you’re running Rhasspy on a PC or server, this isn’t a problem.

Whoops, sorry about the 404. I moved all of the documentation over to readthedocs. You could definitely do the password manager thing with Rhasspy. I’ve made a small example where I launch programs using voice, which I can share. The “trick” is to use the command system for intent handling, which just calls an external program and passes it the intent as JSON on standard input. Let me get my example in order and I’ll get back to you.
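As a rough sketch of the idea (the exact field names in the intent JSON are from memory, so check them against what your profile actually produces):

#!/usr/bin/env python3
# Hypothetical handler for Rhasspy's "command" intent handling system.
# Rhasspy runs this program and writes the recognized intent to its
# standard input as JSON.
import json
import subprocess
import sys

intent = json.load(sys.stdin)
intent_name = intent.get("intent", {}).get("name", "")

if intent_name == "LaunchBrowser":  # hypothetical intent from your sentences
    subprocess.Popen(["firefox"])

# Hand the (possibly modified) intent back to Rhasspy on standard output
json.dump(intent, sys.stdout)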

Thanks!

Here is my error, but I don’t know how to fix it. Can you help me fix this? Thank you very much.

Thank you! Yes, at the moment I’m studying the platform :slight_smile: and it is great. Now I’ll put the “dummy” into the setup.
At the moment I’m using Rhasspy with MQTT (like Snips) and a Node app that sends messages to my devices.

Everything works really well in the test environment. Now it’s time to better understand all the parts of training the intents.

Thank you for the support.