Rhasspy offline voice assistant toolkit

You are on a roll! The compilation documentation worked perfectly.

The examples.lm has been updated with the contents of examples.md. The next showstopper:

subprocess.CalledProcessError: Command '['/data/rhasspy/rhasspy-tools/phonetisaurus/phonetisaurus-armv7l', '--model=/data/rhasspy/rhasspy-tools/phonetisaurus/etc/g014b2b.fst', '--i...ero exit status 1.

Do you know how we can view more information about what goes wrong, since homeassistant.log does not contain the above statement? I’m not sure which domain to use in e.g. the logger component to set the log level to debug.

The issue seems to be the same as with SRILM: phonetisaurus also gives an error indicating version 'GLIBCXX_3.4.21' not found. Can this also be easily recompiled?

I’ll attempt to do this using the info in your repo: https://github.com/synesthesiam/rhasspy-tools/tree/master/phonetisaurus

I was able to get DEBUG level messages in my log by adding the logger component to configuration.yaml without any additional settings.

I’m afraid you will have to recompile phonetisaurus-g2p as well. Thankfully, you won’t need to retrain its finite state transducer. I originally built it from the instructions in Jasper’s documentation, but the repo instructions actually look easier. I’ll try building it again tonight and write up my experience.

Just wanted to let you know that after rebuilding phonetisaurus-g2p and moving some files into rhasspy-tools, the training seems to be completing successfully, and I now have a new mixed language model and dictionary.

With regards to the logger component, it would be useful to find out how we can put only the Rhasspy components in debug mode, because enabling debug for everything, as I do now, is quite verbose in a busy Home Assistant installation.
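For reference, Home Assistant’s logger component lets you set a quiet default and raise individual domains to debug. A sketch of what that might look like; the exact domain names for the Rhasspy custom components are a guess on my part, so check the module names in your custom_components folder:

```yaml
# configuration.yaml — keep everything else at warning, but turn on
# debug logging for individual components. The component names below
# (hotword_snowboy, stt_pocketsphinx, rhasspy_train) are assumptions.
logger:
  default: warning
  logs:
    custom_components.hotword_snowboy: debug
    custom_components.stt_pocketsphinx: debug
    custom_components.rhasspy_train: debug
```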

The next 2 questions I now have are:

  1. Rhasspy at the moment seems to be always listening, not requiring the wake word before the STT starts working. Any idea what might cause this?
  2. How can we improve the words that are recognized by STT? Is this done by editing the acoustic dictionary?

For others needing to rebuild phonetisaurus-g2p, I’ve added instructions here.

I haven’t looked any more into the logger, but I will try some more as I have time. My first pass at getting it to just put out rhasspy-related stuff didn’t work :frowning:

Regarding your other questions:

  1. This could happen if automations.yaml isn’t set up correctly (which may be a bug on my end). Does it start your hotword service listening and then only trigger the STT when the hotword_detected event is seen? Which hotword service are you using?
  2. When you do training, if there are words that aren’t in any of rhasspy_train's dictionary_files, it uses phonetisaurus-g2p to guess how they’re pronounced and places those guesses in guess.dict. You should review these and ultimately move them to one of your dictionary_files. If you manage to get the rhasspy-tools web interface running, I actually have a page that will pronounce words for you and help with this using espeak.

Besides pronouncing words, the other thing that will generally help STT is adding example sentences to examples.md. The more (relevant) examples you add, the better the language model will capture your custom command language.
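The flow from point 1 — hotword service always listening, STT only started after a hotword_detected event — could be sketched as a pair of automations. The service names here are assumptions for illustration; substitute whatever services your components actually register:

```yaml
# automations.yaml — sketch of the intended flow. The service names
# (hotword_snowboy.listen, stt_pocketsphinx.listen) are assumed.
- alias: Start hotword listening on startup
  trigger:
    platform: homeassistant
    event: start
  action:
    service: hotword_snowboy.listen

- alias: Start speech-to-text after hotword
  trigger:
    platform: event
    event_type: hotword_detected
  action:
    service: stt_pocketsphinx.listen
```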

The issue with Rhasspy always listening went hand in hand with a lot of other issues I had with regards to PulseAudio.

Home Assistant often crashed with errors like the following:

ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
...
ALSA lib pulse.c:243:(pulse_connect) PulseAudio: Unable to connect: Connection refused
connect(2) call to /tmp/jack-0/default/jack_0 failed (err=No such file or directory)
attempt to connect to server failed
...
home-assistant.service: main process exited, code=killed, status=11/SEGV

After some digging around online, I ended up uninstalling PulseAudio, and for now the issues seem resolved.

Is there a way to improve the speech recognition when the wrong words are detected? I assume these words are in the dictionary; they are just not the correct ones.

What steps did you do to remove pulseaudio? I’m working on docker images for Rhasspy, and it really screws things up. The pocketsphinx Python library won’t build without it installed, so I’ll have to remove it as a later step in my Dockerfile.

There are 3 levels for improving the speech recognition:

  1. The language model - adding more training sentences or changing how Rhasspy mixes your examples.lm with the base language model could improve things. The default is a 5% mixture; setting this to 0% or using examples.lm instead of mixed.lm would mean you could only say things from your training set (which maybe is what you want).

  2. The dictionary - the main dictionary has many different ways of pronouncing individual words to accommodate a variety of accents and individual differences. Deleting the alternatives that you don’t use would help reduce the space of possibilities. Also, make sure the words in guess.dict are exactly how you would say them (then move them into user.dict).

  3. The acoustic model - this controls how sound is mapped to phonemes. There’s a tutorial for tuning it to your voice, but it’s fairly complicated. I have code for this that’s not incorporated into Rhasspy yet, but will be eventually.
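To make point 2 concrete, here’s a small standalone sketch (not part of Rhasspy) of how a CMU-style pronunciation dictionary encodes alternate pronunciations — alternates are marked with (2), (3), and so on after the word, and pruning the ones you don’t use shrinks the decoder’s search space. The sample entries are illustrative:

```python
import re

def parse_dict(lines):
    """Parse CMU-style dictionary lines into {word: [phoneme lists]}."""
    prons = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        head, phones = line.split(None, 1)
        # "garage(2)" and "garage" are alternates of the same word
        word = re.sub(r"\(\d+\)$", "", head)
        prons.setdefault(word, []).append(phones.split())
    return prons

sample = [
    "garage G ER AA ZH",      # illustrative entries, not from the real dict
    "garage(2) G AE R IH JH",
    "light L AY T",
]
print(parse_dict(sample)["garage"])  # both alternates for "garage"
```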

I just uninstalled it using apt-get remove pulseaudio, but I was too soon in concluding that everything was resolved. I notice that Home Assistant still goes down from time to time and that Rhasspy is listening and decoding text without using the snowboy wake word.

Will try to keep you informed when I make progress towards resolving these issues.

After following your installation guide again, I notice now that Rhasspy keeps listening from the moment Home Assistant is started, and the state of the snowboy hotword detection is always listening and never idle.

This causes Rhasspy to always be translating everything it picks up into text, including everything that is said on the TV. Do you have any idea what might cause this?

The snowboy hotword component should be listening right from the start. The first automation in automations.yaml does this so that it is “always on” listening for the hotword. What’s the state of the pocketsphinx component? It should be idle until the hotword is spoken.

If snowboy is picking up things it shouldn’t, there are a few things to try. First, if you haven’t already, make sure to train a personal model using the microphone that you have on your Pi. Next, the hotword_snowboy component has a configuration parameter called sensitivity that you can adjust from 0 to 1 (lower reduces false positives, default is 0.5). Finally, it may be necessary to try a different hotword component. Mycroft Precise isn’t as good out of the box, but you have the opportunity to train your own network on as much data as you can collect (snowboy only takes 3 WAV files).
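As a sketch, lowering the sensitivity might look like this in configuration.yaml. The model key and path are placeholders I’ve made up for illustration; check the component’s actual options:

```yaml
# configuration.yaml — reduce false positives by lowering sensitivity
# (0 to 1, default 0.5). The model path is a placeholder for your
# personal snowboy model.
hotword_snowboy:
  model: /home/pi/snowboy/my_hotword.pmdl
  sensitivity: 0.4
```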

If anyone has experience with other hotword detectors, I’d love to hear about them. Ultimately, Rhasspy is just expecting a hotword_detected event to be fired within Home Assistant, so anything could technically work in its place, e.g. a button or an HTTP POST sent from a phone.
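Since any hotword_detected event works, you could even fire one by hand through Home Assistant’s REST API, e.g. from a phone. A sketch, assuming the legacy api_password auth; newer Home Assistant versions use an `Authorization: Bearer <token>` header instead:

```shell
# Fire a hotword_detected event via the Home Assistant REST API.
# Replace the host and YOUR_API_PASSWORD with your own values.
curl -X POST \
  -H "x-ha-access: YOUR_API_PASSWORD" \
  -H "Content-Type: application/json" \
  http://localhost:8123/api/events/hotword_detected
```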

Sorry for the confusion, I meant that pocketsphinx is always transcribing and not waiting for a hotword to be triggered to do so.

With the tips you gave towards training a model for hotword detection, I’ll have a look into creating a custom one that uses the name of our personal (voice) assistant.

I tried to train both preexisting hotwords like Alexa as well as a few custom ones in Dutch using the Snowboy website, and the initial tests using Alexa seem to work. It’s interesting to see that the models you train are language specific, and Dutch is one of the supported languages.

Over the next few days I’ll monitor Rhasspy to see if it only activates when the hotword is triggered. At the moment I have a push notification sent when it detects the hotword or when a recording of a command has finished. This allows me to monitor it remotely.

I also made the tweak you added on GitHub to start listening for the hotword again once the text from pocketsphinx is passed to the intent recognizer.

Since it seems like a good niche for Rhasspy might be in the multilingual area, I’d like to start including examples for all of the supported languages. Would you be willing to translate the example phrases into Dutch?

Hopefully it is not just a niche. I think existing platforms like Snips will take a long time before supporting languages like Dutch.

Below you can find the translated examples.md:

## intent:HassTurnOff
- zet de [woonkamerverlichting](name) uit
- zet de [garageverlichting](name) uit

## intent:HassTurnOn
- zet de [woonkamerverlichting](name) aan
- zet de [garageverlichting](name) aan

## intent:GetTime
- hoe laat is het
- vertel me hoe laat het nu is
- hoe laat is het nu

## intent:GetTemperature
- wat is de temperatuur
- hoe warm is het
- hoe koud is het
- wat is de huidige temperatuur

## intent:GarageOpenClosed
- is de garagepoort open
- is de garagepoort gesloten
- is de garagepoort open of gesloten

The changes I made to my installation with a custom hotword made Rhasspy react less often, but still way too often. I’ll investigate changing the sensitivity.

Thanks for the translations! What would go in the place of “living room lamp” and “garage light”? I plan to add separate HA configs with translated entity names.

I’ve found snowboy to be especially sensitive to white noise from nearby fans and to microphone volume (higher is worse). Mycroft Precise might be a better option long term since it lets you provide negative examples of audio without the hotword. I just recorded an hour’s worth of the TV playing and kids running around. I’ll have a docker soon to simplify the training process.

I updated the previous post with the translations.

I’m trying to get Mycroft Precise to work, but have an issue with installing Tensorflow 1.8.0. Once I’m past that, that might do the trick.

We also did a test of watching TV for an hour, playing both Dutch and English content, and Snowboy went off almost every 3 minutes, sometimes even sooner.

The issue I’m facing with Tensorflow 1.8.0 is that I’m running Raspbian Jessie with Python 3.6 and do not see an option to make these work together. There are wheels available for Python 3.4, but I’m not able to get those to work. I’m not an expert in this area, so it’s very likely I’m missing something.

When I install Tensorflow through piwheels, I get version 0.1.1. Do you have any idea if Tensorflow 1.8.0 can work together with Python 3.6?

What’s the exact Python version you’re using? Sometimes, it’s as ridiculous as having 3.6.4 when 3.6.5 is required. I’ve ended up using pyenv for those situations.

I’m running 3.6.4.

Would I be able to use pyenv within a virtualenv of homeassistant and instruct pip to use e.g. tensorflow-1.8.0-cp35-none-linux_armv7l.whl, which I assume requires Python 3.5?

There seems to be a pyenv-virtualenv module, but before going down a rabbit hole, any guidance is welcome :blush:

I typically do it like this:

  1. Install pyenv and then use it to install the version of Python you want (warning: it takes forever)
  2. Set that version of Python locally in your shell with pyenv local X.Y.Z
  3. Create a virtualenv with the local X.Y.Z Python version with python3 -m venv <DIRECTORY>
  4. Activate the virtual environment (source <DIRECTORY>/bin/activate) and do python3 -m pip install ...
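The steps above, as shell commands. The version number and directory are placeholders, and the pyenv-installer one-liner is just one way to get pyenv:

```shell
# Sketch of the pyenv + venv workflow described above.
# 3.5.6 and ~/venv are placeholders; pick your own version and path.
curl https://pyenv.run | bash     # install pyenv (one option)
pyenv install 3.5.6               # builds Python from source: slow!
pyenv local 3.5.6                 # use that version in this directory
python3 -m venv ~/venv            # create a venv with that Python
source ~/venv/bin/activate
python3 -m pip install tensorflow-1.8.0-cp35-none-linux_armv7l.whl
```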

At this point, it might just be easier to use Docker if that’s supported on Jessie. But that’s a whole other rabbit hole…

Hi
I have followed your installation directions on my Pi 3
in the python virtual environment
install thanhaf public
but when I restart my Pi
Rhasspy is gone, and in my configuration there are still Rhasspy and tools
But when I call the service in Home Assistant, there is no Rhasspy
please help