Rhasspy offline voice assistant toolkit

    File "/usr/share/rhasspy/app.py", line 83, in <module>
      load_default_profile()
    File "/usr/share/rhasspy/app.py", line 70, in load_default_profile
      default_profile = Profile(default_profile_name, profiles_dirs)
    File "/usr/share/rhasspy/profiles.py", line 11, in __init__
      self.load_profile()
    File "/usr/share/rhasspy/profiles.py", line 30, in load_profile
      recursive_update(self.json, json.load(profile_file))
    File "/usr/lib/python3.6/json/__init__.py", line 299, in load
      parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
    File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
      return _default_decoder.decode(s)
    File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
      obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    File "/usr/lib/python3.6/json/decoder.py", line 355, in raw_decode
      obj, end = self.scan_once(s, idx)
    json.decoder.JSONDecodeError: Expecting ',' delimiter: line 5 column 9 (char 113)

This is the error output I get.

But here’s a very positive point: this project seems to be becoming very popular and beloved :slight_smile: Let’s make this piece of software a common part of Home Assistant :slight_smile: If there’s anything I can do, let me know.

It looks like you got caught in a state where there’s some malformed JSON in your profile.json. Depending on your docker-compose configuration, your profiles could be either inside the Docker container or on your file system. Wherever they are, try deleting them and starting from a fresh Docker pull, or copying from the GitHub profiles repo.
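
If you’d rather find the typo than start over, the standard library’s json.tool module (assuming Python 3 is available wherever the file lives) will print the exact position of the problem:

    python3 -m json.tool /path/to/your/profile.json
    # e.g. "Expecting ',' delimiter: line 5 column 9 (char 113)",
    # which usually means a missing comma or an unquoted value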

Thank you! I’m kind of building this plane as I’m flying it, so to speak. So I understand the confusion. I’ll try to answer your questions as best I can.

  1. Correct, no custom components are needed with the new Rhasspy. Everything is self-contained, and communication with Home Assistant is done via the HTTP REST API.
  2. This will be my next video tutorial. The next step is to create one or more intents and some example sentences, like in the English profile. Then, after re-training, you should be able to speak those sentences in the Speech tab and have Rhasspy send an event to Hass. You’d then add an automation to automations.yml to do something with the event, like turning a device on/off (see the sketch right after this list).
  3. I agree. The only problem I’ve found with the manual Docker install is that sometimes microphones/speakers don’t work right (this is why /dev/snd is mapped in and why the privileged flag is set, per the question by @gpbenton). So the venv option is a last resort that lets Rhasspy access the hardware natively instead of through Docker.
  4. Yes, I will be cleaning up the Docker images (the names make no sense right now) and writing a single document describing both the manual Docker and Hass.IO methods.
  5. German translations would be very welcome! I don’t have anything yet. The phoneme_examples.txt file in the German profile was a best guess at common German words and their pronunciations (from base_dictionary.txt).
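
To make point 2 concrete, here’s a minimal sketch. The intent name and sentences are made-up examples, and the exact sentences file format may differ between Rhasspy versions, so treat this as illustrative:

    [GetTime]
    what time is it
    tell me the time

After re-training, speaking either sentence in the Speech tab should produce a GetTime intent, which Rhasspy forwards to Home Assistant as an event.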

Access tokens are on my TODO list. It was easy in Hass.IO, where the token is provided via an environment variable. For external HA access, I have to add in the whole websocket back-and-forth to create a token, I guess? Is there any way to create a long-lived token from within the HA interface that I could just have the user paste into a Rhasspy profile?

As best I can tell, this is necessary to give the Docker container read/write privileges to /dev/snd so it can access the microphone and speakers of the host system. You might try doing docker run without it and see if it works. If so, that would be great, since I don’t want more access to the host system than is strictly necessary.
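
For anyone who wants to try that experiment, here’s a minimal docker run sketch using the image name from this thread, dropping the privileged flag in favor of an explicit device mapping (untested, so treat it as a starting point):

    # Expose only the ALSA sound devices instead of running privileged
    docker run -d \
        --device /dev/snd:/dev/snd \
        -v "$(pwd)/rhasspy_config:/profiles" \
        -e RHASSPY_PROFILES=/profiles \
        -p 12101:12101 \
        synesthesiam/rhasspy-server:amd64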

Yes, creating the long-lived token is normally done in HA, from the user profile page. It’s normally just a matter of pasting it somewhere.
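
For reference, once the user has pasted a long-lived token into their profile, talking to the HA REST API is just a matter of sending it as a header. A quick way to verify a token works (replace the host and token with your own):

    curl -H "Authorization: Bearer YOUR_LONG_LIVED_TOKEN" \
         http://localhost:8123/api/
    # A valid token returns: {"message": "API running."}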

I have, and it seems to work fine without the privileged flag. I was just wondering if there was something I was missing.

FYI, my docker-compose entry is:

    rhasspy:
        image: "synesthesiam/rhasspy-server:amd64"
        restart: unless-stopped
        container_name: rhasspy-server
        environment:
            RHASSPY_PROFILES: "/profiles"
        volumes:
            - "./rhasspy_config:/profiles"
        ports:
            - "12101:12101"
        devices:
            - "/dev/snd:/dev/snd"

I have got further now, and almost everything seems to be working, except that when I speak into the web interface, it correctly recognizes that I said “What time is it”, but then converts that into a GetTemperature intent:

"intent":
"entities":
"intent":
"confidence": 0.5575017645851849
"name": "GetTemperature"
"intent_ranking":
0:
"confidence": 0.5575017645851849
"name": "GetTemperature"
1:
"confidence": 0.31922910283110323
"name": "GetTime"
2:
"confidence": 0.0552980109184647
"name": "GetGarageState"
3:
"confidence": 0.0403309328804052
"name": "ChangeLightState"
4:
"confidence": 0.027640188784842316
"name": "ChangeLightColor"
"model": "model_20181211-173604"
"project": "default"
"text": "what time is it"
"time_sec": 0.03763532638549805

So something is wrong in translating the sentence to the intent, but I have re-trained and reloaded without effect.

Ok, up and running.
How can I help you with translation now?

Thanks a lot for your support and explanations; this makes it easier to understand and support this project. So tomorrow I’ll be watching your video on how to train my voice and add commands/sentences. I still don’t know how to define the hotword or how to configure it; as this is handled directly in Rhasspy, I need to know where it lives and how to change it. It’s also not really clear which software I’ll be using for TTS, STT, etc., and how to configure those. As soon as I’ve got everything set up and understood, I’ll help you out with the German translation. Thanks so far, and we’ll see each other tomorrow :wink:

But one other suggestion / recommendation:

It’s not really practical to have to hold down the button to record a sentence. It might be better to have one click to start recording and a second click to stop. Is that possible? I’m using a touchscreen, and press-and-hold won’t work.

The next part I didn’t understand is how it really works: I create intents with example sentences, but will the friendly name of an entity be used? (Fortunately, I cleaned up all my entities :wink: )

Last but not least: which examples folder applies to the Docker container install?

You can use Watchtower to keep your Docker images up to date :slight_smile:
https://github.com/v2tec/watchtower

Awesome, thank you. That should be a quick fix tonight. I’ll just add a place for it on the Settings tab.

You must be using the “server” image, which does intent recognition through rasaNLU. It uses machine learning, and it apparently doesn’t have enough examples to make the right choice! I’d recommend one of two things:

  1. Add some more sentences to GetTemperature or GetTime and re-train.
  2. Use my cheesy intent recognition system instead, by either switching to the rhasspy-client image (which doesn’t include rasaNLU) or editing your profile.json file and changing the system property under intent from rasa to fuzzywuzzy (see the snippet right after this list).
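
For option 2, the edit is small. A sketch of the relevant part of profile.json, assuming the rest of your intent section stays as it is:

    {
        "intent": {
            "system": "fuzzywuzzy"
        }
    }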

It’s a shame that rasaNLU needs more data for such a simple task, but I guess it only has 5 total examples in this case. My fuzzywuzzy system just compares the transcription from the speech (“what time is it”) to all of the sentences you provide and picks the closest one. Cheesy, but effective for few examples!
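
To illustrate the idea (a standalone sketch of the approach using the fuzzywuzzy library directly, not Rhasspy’s actual code):

    from fuzzywuzzy import process

    # Example training sentences mapped to their intents (made up for illustration)
    sentences = {
        "what time is it": "GetTime",
        "what is the temperature": "GetTemperature",
        "is the garage door open": "GetGarageState",
    }

    # Pick the known sentence closest to the speech transcription
    best_match, score = process.extractOne("what time is it", list(sentences.keys()))
    print(sentences[best_match], score)  # -> GetTime 100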

Hello,
I updated to add-on version 1.1 on an x64 Linux Docker host and made the config like this:

    {
      "run_dir": "/data/rhasspy",
      "default_profile": "de"
    }

In the web UI, on the Speech tab, I get an error:

    [Errno 2] No such file or directory: 'profiles/de/phoneme_examples.txt'

With Portainer I managed to look into the profiles/de/ directory, which is empty. I don’t know how to upload files to this directory within a Docker container :-(.

    Expression 'paInvalidSampleRate' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2048
    Expression 'PaAlsaStreamComponent_InitialConfigure( &self->capture, inParams, self->primeBuffers, hwParamsCapture, &realSr )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2719
    Expression 'PaAlsaStream_Configure( stream, inputParameters, outputParameters, sampleRate, framesPerBuffer, &inputLatency, &outputLatency, &hostBufferSizeMode )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2843
    ERROR:root:[Errno -9997] Invalid sample rate
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1813, in full_dispatch_request
        rv = self.dispatch_request()
      File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1799, in dispatch_request
        return self.view_functions[rule.endpoint](**req.view_args)
      File "/usr/share/rhasspy/app.py", line 578, in api_start_recording
        stream_callback=stream_callback)
      File "/usr/local/lib/python3.6/dist-packages/pyaudio.py", line 750, in open
        stream = Stream(self, *args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/pyaudio.py", line 441, in __init__
        self._stream = pa.open(**arguments)
    OSError: [Errno -9997] Invalid sample rate
    INFO:werkzeug:172.17.0.1 - - [11/Dec/2018 21:08:04] "POST /api/start-recording?profile=en&device=-1 HTTP/1.1" 500 -

After we get past the alpha version here of Rhasspy, my plan is to enable the wake system so you can say something like “okay Rhasspy” and then issue a command (like an Alexa). This actually works right now, but some work needs to be done so Rhasspy doesn’t wake up too much.

I’d also be happy to add the two-touch option to the web interface. That would be very easy.

Rhasspy sends over an event when you give a speech command with properties matching exactly what it matched in the text (from your sentences). When you write your automations in HA, you catch these events and do whatever with them. If you make your sentences right, you could use templating in HA to save having to write a rule for every entity name. One way to do this is with tag synonyms.
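
As an illustration, an automation catching such an event might look like the sketch below. The event type and data fields here are hypothetical; check what Rhasspy actually sends (e.g., in the HA developer tools) and adjust accordingly:

    - alias: "Rhasspy turns on a light"
      trigger:
        platform: event
        event_type: rhasspy_ChangeLightState   # hypothetical event name
        event_data:
          name: bedroom                        # slot matched from your sentence
          state: "on"
      action:
        service: light.turn_on
        entity_id: light.bedroom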

I just have one sample Home Assistant configuration for the time being.

OK! That’s really clear now… thanks a lot for your response. Now, as shown above, I have a sample rate error. I’m on Ubuntu 18, with the alsa-dev package also installed. I’m absolutely not familiar with this sound stuff and don’t know where or how to check which driver the system is using. Let me try to record something on the system directly.

I think I’ve finally solved this issue. Please refresh the add-on repository and re-build Rhasspy (version 1.11). This should install all of the supported languages.

Make sure you started the Docker container after plugging in the microphone and selecting it as the default one in the Ubuntu sound settings.

Also, try picking a different recording device in the Rhasspy Speech tab and see if PyAudio is just giving you the wrong one as a default. I think I still need to add a setting to make the selected device the default one that Rhasspy uses.

Thanks @synesthesiam - that worked great. :+1:

Now, let the fun begin :smile:

Unfortunately that won’t work… I stopped the container, plugged in the mic, started the container… the USB mic is detected, and then: [Errno -9997] Invalid sample rate

But even in Ubuntu itself I cannot record. The USB device is shown, but no sound gets recorded. Should I try arecord or something like that to test?
OK, I think I’ll have to go for another mic :slight_smile: this one doesn’t record anything.

If you do test with arecord, use these settings:

    arecord -t wav -f S16_LE -r 16000 -c 1 > test.wav

Rhasspy will always try to record 16-bit mono audio at 16 kHz (this is what the pocketsphinx speech recognition models all need).
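
If that command fails, it’s worth listing the capture devices first and playing the recording back to verify, e.g.:

    arecord -l                 # list available capture devices
    arecord -D plughw:1,0 -t wav -f S16_LE -r 16000 -c 1 test.wav   # card/device numbers from arecord -l; yours may differ
    aplay test.wav             # play it back to check the recording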

I’m seeing the same error:

    [Errno -9997] Invalid sample rate

And trying to run arecord -t wav -f S16_LE -r 16000 -c 1 > test.wav results in:
    arecord: main:788: audio open error: No such file or directory

And I can’t get any output on any device (USB or speaker plugged into the 3.5mm socket of the RPi) from the ‘Pronounce’ button on the ‘Words’ tab either.

How would I set the default recording and playback devices in a headless Raspbian install?

Update:
Installed pactl (sudo apt-get install pulseaudio) and tried to set the defaults by using this guide - no success: