Rhasspy offline voice assistant toolkit

Thanks a lot for your support and explanations; this makes it easier to understand and support you in this project. So tomorrow I’ll be watching your video on how to train my voice and add commands/sentences. I still don’t know how to define the hotword or how to configure it. As this is handled directly in Rhasspy, I need to know the source and how to change it. It’s also not really clear which software I’ll use for TTS, STT, etc., and how to configure those. As soon as I have everything set up and understood, I’ll help you out with the German translation etc. Thanks so far, and we’ll see each other tomorrow :wink:

But one other suggestion / recommendation:

It’s not really practical to have to hold down the button to record a sentence… it might be better to have one click to record and a second click to stop. Is that possible? I’m using a touchscreen, and holding won’t work.

The next passage I didn’t understand is how it really works… I make intents with sentences, but will the friendly name of an entity be used? (Fortunately, I cleaned up all my entities :wink:)

Last but not least: which example folders are applicable for a Docker container install?

You can use watchtower for docker to keep your docker images up to date :slight_smile:
https://github.com/v2tec/watchtower

Awesome, thank you. That should be a quick fix tonight. I’ll just add a place for it on the Settings tab.

You must be using the “server” image, which does intent recognition through rasaNLU. It uses machine learning, and it apparently doesn’t have enough examples to make the right choice! I’d recommend one of two things:

  1. Add some more sentences to GetTemperature or GetTime and re-train.
  2. Use my cheesy intent recognition system instead, by either switching to the rhasspy-client image (which doesn’t include rasaNLU) or editing your profile.json file and changing the system property under intent from rasa to fuzzywuzzy.
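For reference, the second option boils down to a profile.json fragment like this (only the relevant section is shown; the rest of the file stays as-is):

{
  "intent": {
    "system": "fuzzywuzzy"
  }
}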

It’s a shame that rasaNLU needs more data for such a simple task, but I guess it only has 5 total examples in this case. My fuzzywuzzy system just compares the transcription from the speech (“what time is it”) to all of the sentences you provide and picks the closest one. Cheesy, but effective for few examples!
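The matching idea can be sketched in a few lines of Python. This sketch uses difflib from the standard library rather than the actual fuzzywuzzy package, and the sentences are made up for illustration:

```python
from difflib import SequenceMatcher

def closest_sentence(transcription, sentences):
    """Return the training sentence most similar to the transcription."""
    return max(sentences,
               key=lambda s: SequenceMatcher(None, transcription, s).ratio())

sentences = ["what time is it", "what is the temperature"]
print(closest_sentence("what time is it now", sentences))  # -> what time is it
```

fuzzywuzzy computes its similarity ratio differently (Levenshtein-based), but the principle is the same: score every known sentence against the transcription and take the best one.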

Hello,
I updated to add-on version 1.1 on an x64 Linux Docker host and made the config like this:

{
  "run_dir": "/data/rhasspy",
  "default_profile": "de"
}

In the web UI, on the Speech tab, I get an error:

[Errno 2] No such file or directory: 'profiles/de/phoneme_examples.txt'

With Portainer I managed to look in the profiles/de/ directory, which is empty. I don’t know how to upload files into this directory inside a Docker container :-(.

Expression 'paInvalidSampleRate' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2048
Expression 'PaAlsaStreamComponent_InitialConfigure( &self->capture, inParams, self->primeBuffers, hwParamsCapture, &realSr )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2719
Expression 'PaAlsaStream_Configure( stream, inputParameters, outputParameters, sampleRate, framesPerBuffer, &inputLatency, &outputLatency, &hostBufferSizeMode )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2843
ERROR:root:[Errno -9997] Invalid sample rate
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/share/rhasspy/app.py", line 578, in api_start_recording
    stream_callback=stream_callback)
  File "/usr/local/lib/python3.6/dist-packages/pyaudio.py", line 750, in open
    stream = Stream(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pyaudio.py", line 441, in __init__
    self._stream = pa.open(**arguments)
OSError: [Errno -9997] Invalid sample rate
INFO:werkzeug:172.17.0.1 - - [11/Dec/2018 21:08:04] "POST /api/start-recording?profile=en&device=-1 HTTP/1.1" 500 -

After we get past the alpha version here of Rhasspy, my plan is to enable the wake system so you can say something like “okay Rhasspy” and then issue a command (like an Alexa). This actually works right now, but some work needs to be done so Rhasspy doesn’t wake up too much.

I’d also be happy to add a two-touch option to the web interface. That would be very easy.

Rhasspy sends over an event when you give a speech command with properties matching exactly what it matched in the text (from your sentences). When you write your automations in HA, you catch these events and do whatever with them. If you make your sentences right, you could use templating in HA to save having to write a rule for every entity name. One way to do this is with tag synonyms.
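As a rough illustration, catching such an event in HA's configuration.yaml could look like the automation below. The event type and slot names here are hypothetical; they depend entirely on the intent and tag names in your own sentences:

```yaml
automation:
  - alias: "Rhasspy light command"
    trigger:
      platform: event
      # hypothetical event name; one event fires per recognized intent
      event_type: rhasspy_ChangeLightState
    action:
      service_template: "light.turn_{{ trigger.event.data.state }}"
      data_template:
        # "name" is a made-up slot filled from the matched sentence
        entity_id: "light.{{ trigger.event.data.name }}"
```

With tag synonyms in your sentences, the slot value can be made to match your HA entity IDs directly, so one templated automation covers every entity.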

I just have one sample Home Assistant configuration for the time being.

OK! That’s really clear now… thanks a lot for your response. Now, as shown above, I have a sample rate error. I have Ubuntu 18; the alsa-dev package is also installed. I’m absolutely not familiar with this sound stuff and don’t know where and how to check which driver is used by the system. Let me try to record something on the system directly.

I think I’ve finally solved this issue. Please refresh the add-on repository and re-build Rhasspy (version 1.11). This should install all of the supported languages.

Make sure you started the Docker container after plugging in the microphone and selecting it as the default one in the Ubuntu sound settings.

Also, try picking a different recording device in the Rhasspy Speech tab and see if PyAudio is just giving you the wrong one as a default. I think I still need to add a setting to make the selected device the default one that Rhasspy uses.

Thanks @synesthesiam - that worked great. :+1:

Now, let the fun begin :smile:

Unfortunately that won’t work… stopped the container… plugged in the mic… started the container… USB mic detected, and then: [Errno -9997] Invalid sample rate

But even in Ubuntu itself I cannot record. The USB device is shown, but no sound gets recorded. Should I try arecord or something like that to test?
OK, I think I will have to go for another mic :slight_smile: this one doesn’t record anything

If you do test with arecord, use these settings:

arecord -t wav -f S16_LE -r 16000 -c 1 > test.wav

Rhasspy will always try to record 16-bit mono audio at 16 kHz (this is what the pocketsphinx speech recognition models all need).
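To double-check that a recording actually has that format, the WAV header can be inspected with the wave module from Python's standard library. The snippet below writes a short silent clip in the expected format first so it runs standalone; point check_format at your own test.wav instead:

```python
import wave

def check_format(path, rate=16000, channels=1, sample_width=2):
    """Return True if the WAV file is 16-bit mono at 16 kHz."""
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == rate
                and wav.getnchannels() == channels
                and wav.getsampwidth() == sample_width)

# Write one second of silence in the expected format, then verify it
with wave.open("test.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)       # 2 bytes per sample = 16-bit
    wav.setframerate(16000)
    wav.writeframes(b"\x00\x00" * 16000)

print(check_format("test.wav"))  # -> True
```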

I’m seeing the same error:
[Errno -9997] Invalid sample rate

And trying to run arecord -t wav -f S16_LE -r 16000 -c 1 > test.wav results in:
arecord: main:788: audio open error: No such file or directory

And I can’t get any output on any device (USB or speaker plugged into the 3.5mm socket of the RPi) from the ‘Pronounce’ button on the ‘Words’ tab either.

How would I set the default recording and playback device in a headless Raspbian install?

Update:
Installed pactl (sudo apt-get install pulseaudio) and tried to set the defaults by following this guide, but with no success.

I am using this form:

watchtower:
  image: v2tec/watchtower
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - /root/.docker/config.json:/config.json
  command: --interval 30

This worked fine. I am now getting the correct events going into HA. I use AppDaemon for automations, so I have set up an AppDaemon app to receive the events, and everything seems to work as expected.

I am not sure about your instructions for running the docker container. Mapping /dev/snd as a volume resulted in the sound locking up on my host. Mapping it as a device seems to work fine. I pasted my docker-compose file above, but I think on the command line it would be something like

docker run -d -p 12101:12101 \
      --restart unless-stopped \
      -e RHASSPY_PROFILES=/profiles \
      -v "$HOME/.rhasspy:/profiles" \
      --device /dev/snd:/dev/snd \
      synesthesiam/rhasspy-server:amd64

Excellent, thank you! I updated the README on GitHub to reflect this (and posted your docker compose config too).

What do you and @thundergreen see when you run arecord -l or aplay -l (that’s a lowercase L, as in “list”)?

I wonder if using @gpbenton’s fix with --device instead of -v for /dev/snd might also help. As a last resort, we can try and get your Rhasspys running in virtual environments.

Here is the output for arecord:

**** List of CAPTURE Hardware Devices ****
card 1: U0x46d0x8d7 [USB Device 0x46d:0x8d7], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

And this is the result for aplay:

**** List of PLAYBACK Hardware Devices ****
card 0: ALSA [bcm2835 ALSA], device 0: bcm2835 ALSA [bcm2835 ALSA]
  Subdevices: 7/7
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
card 0: ALSA [bcm2835 ALSA], device 1: bcm2835 ALSA [bcm2835 IEC958/HDMI]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

Stopped the container, re-started it with --device instead of -v.
Same error message when I hit the ‘Hold to Record’ button:
[Errno -9997] Invalid sample rate

Based on this Stackoverflow answer, you might try creating a file in your home directory named .asoundrc (be sure to keep the “.” in front) with this inside:

pcm.!default {
    type asym
    playback.pcm "plughw:0"
    capture.pcm  "plughw:1"
}

ctl.!default {
    type hw
    card 1
}

Reboot and try again. This should set the USB microphone as the “default” capture device while still outputting sound to the 3.5mm jack.

I have a hard time believing that any microphone these days would be incapable of recording mono audio at 16 kHz, so this must be a case of Raspbian just being dumb.

Sorry - it’s not making any difference.

I still get the same error message for recording and no sound on the speakers.