Rhasspy offline voice assistant toolkit

Is this a Raspberry Pi 3 with the latest Raspbian (based on Debian Stretch)?

Yes, it's an RPi3.

The command cat /etc/*{release,version} results in:

PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
9.6

Hi all,
I want to share my progress so far; maybe it will be useful for someone.

I'm using an RPi 3 with a fresh install of Raspbian Stretch Lite, and I've installed Rhasspy as described by @synesthesiam in a previous post:

It works flawlessly and even recognizes the PS3 Eye microphone out of the box.
Then I started adding sentences in my language (Italian), following the examples provided in English.
When you hit Re-Train, it complains about missing words, so you have to add those words (in the specific section) with the corresponding pronunciation. It's a somewhat lengthy process, but it's easy to do. Using the desktop version with NLU, you can probably skip this step. (Hint: what about snipsNLU? It's open source and should run on the RPi.)

The interesting part is how you can send the intent to Home Assistant.
I've made some modifications to the English example so that I can create fewer, more efficient automations in Home Assistant. For example:

[ChangeLightState]
room_name = (soggiorno | cucina | camera | camerina | bagno) {room}
light_name = (lampadario | faretti) {name}
light_cmd = (accendi | spengi) {command}

<light_cmd> [il] [la] <light_name> [in] [ <room_name> ]

This way I can say "Accendi il lampadario in soggiorno" (translation: "Turn on the chandelier in the living room") and I get the following intent:

"intent":
  "entities":
    0:
      "entity": "command"
      "value": "accendi"
    1:
      "entity": "name"
      "value": "lampadario"
    2:
      "entity": "room"
      "value": "soggiorno"
  "hass_event":
    "event_data":
      "command": "accendi"
      "name": "lampadario"
      "room": "soggiorno"
    "event_type": "rhasspy_ChangeLightState"
  "intent":
    "name": "ChangeLightState"
  "text": "accendi lampadario soggiorno"
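
To make the relationship between the "entities" list and the "hass_event" block explicit, here is a minimal Python sketch. It only mirrors what is visible in the output above; the function name build_hass_event is made up for illustration and is not Rhasspy's actual code.

```python
from typing import Any, Dict, List


def build_hass_event(intent_name: str, entities: List[Dict[str, str]]) -> Dict[str, Any]:
    """Flatten entity/value pairs into the event shape shown above.

    ASSUMPTION: the event type is "rhasspy_" plus the intent name, as seen
    in the example output. This is not taken from Rhasspy's source.
    """
    event_data = {item["entity"]: item["value"] for item in entities}
    return {"event_type": f"rhasspy_{intent_name}", "event_data": event_data}


if __name__ == "__main__":
    entities = [
        {"entity": "command", "value": "accendi"},
        {"entity": "name", "value": "lampadario"},
        {"entity": "room", "value": "soggiorno"},
    ]
    print(build_hass_event("ChangeLightState", entities))
```

If two entities shared the same name, the later one would overwrite the earlier one in this sketch.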

Now if your entities in Home Assistant follow a recurring naming scheme, for example

light.<lamp_name>_<room_name>

in your automation config file you can write something like this:

- alias: Light voice command (rhasspy)
  trigger:
    platform: event
    event_type: rhasspy_ChangeLightState
  action:
    service_template: >
      {% if trigger.event.data["command"] == "accendi" %}
        light.turn_on
      {% else %}
        light.turn_off
      {% endif %}
    data_template:
      entity_id: "light.{{ trigger.event.data['name'] }}_{{ trigger.event.data['room'] }}"

So with a single automation you can turn every light that Home Assistant controls on and off.

Hope this helps.
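
If it helps to see the template logic outside of Jinja, here is the same decision as a small Python sketch. The function name resolve_light_call is hypothetical; it just reproduces what the service_template and data_template above compute from the event data.

```python
from typing import Dict, Tuple


def resolve_light_call(event_data: Dict[str, str]) -> Tuple[str, str]:
    """Return (service, entity_id) the way the automation templates do.

    Mirrors the template: "accendi" means turn on, anything else turns off.
    """
    if event_data["command"] == "accendi":
        service = "light.turn_on"
    else:
        service = "light.turn_off"
    entity_id = f"light.{event_data['name']}_{event_data['room']}"
    return service, entity_id


if __name__ == "__main__":
    data = {"command": "accendi", "name": "lampadario", "room": "soggiorno"}
    print(resolve_light_call(data))
```

Note that the else branch means any command word other than "accendi" ends up as light.turn_off, which is fine as long as the grammar only allows the two commands.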

Excellent write-up, @infiniteloop. Thank you. With your permission, I'd like to include your translations in the Italian profile and incorporate your automation template into the example HA config :)

Thanks for the tip about snips-nlu. Looks like it supports most of the languages that Rhasspy does out of the box (Dutch and Russian are missing, unfortunately). They have a pre-built pip package for amd64, but it seems like I'd have to build from source for the Raspberry Pi.

Another NLU system I've been looking at is Mycroft Adapt, which seems pretty lightweight. One affordance of Adapt that some of the others don't have is that you can specify required and optional phrases explicitly for your intents. It's not as fancy with the machine learning, but it seems like it would be a good step up from my fuzzywuzzy-based system.

You are welcome, I'm only making use of your awesome work.
Feel free to use whatever you need. One thing to note, however: the service_template part of the automation is probably not so useful for an English speaker, since it checks for the Italian command word "accendi".

Regarding the profile: is it updated locally whenever I create new sentences?

Encouraged by @infiniteloop, I just tried it on yet another RPi3:

command: cat /etc/*{release,version}

Result:

PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
9.6

Same result: I get the same error message for recording and no sound on the speakers.

For reference, here's what my Raspberry Pi 3 reports (installed fresh Raspbian this week, PS3 Eye camera is connected via USB). Microphone works great, haven't tested speakers yet.

$> arecord -l

**** List of CAPTURE Hardware Devices ****
card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
$> aplay -l

**** List of PLAYBACK Hardware Devices ****
card 0: ALSA [bcm2835 ALSA], device 0: bcm2835 ALSA [bcm2835 ALSA]
  Subdevices: 7/7
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
card 0: ALSA [bcm2835 ALSA], device 1: bcm2835 ALSA [bcm2835 IEC958/HDMI]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
$> cat /etc/*{release,version}

PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
9.4

The only difference I can see - apart from the model of the USB Camera, of course - is that your result from arecord -l is:

Subdevices: 1/1

while mine says:

Subdevices: 0/1

As proposed earlier, I have created a file called .asoundrc in my /home/Pi directory with the following content:

pcm.!default {
    type asym
    playback.pcm "plughw:0"
    capture.pcm  "plughw:1"
}

ctl.!default {
    type hw
    card 1
}

Permissions are '0644' and the owner is root.
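
The card numbers used in .asoundrc (plughw:0, plughw:1) come from the "card N: ..., device M:" lines in the arecord -l and aplay -l listings. As a rough sketch, this Python snippet pulls those numbers out of a pasted listing. The names parse_alsa_devices and plughw_name are made up for illustration, and the pattern assumes the English output format shown in this thread.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(r"^card (\d+): .*?, device (\d+): ")


def parse_alsa_devices(listing: str) -> List[Tuple[int, int]]:
    """Return (card, device) pairs found in `arecord -l` / `aplay -l` text."""
    devices = []
    for line in listing.splitlines():
        match = CARD_LINE.match(line.strip())
        if match:
            devices.append((int(match.group(1)), int(match.group(2))))
    return devices


def plughw_name(card: int, device: int) -> str:
    """Build the ALSA device string used with `arecord -D`."""
    return f"plughw:{card},{device}"


if __name__ == "__main__":
    sample = (
        "**** List of CAPTURE Hardware Devices ****\n"
        "card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]\n"
        "  Subdevices: 1/1\n"
    )
    for card, device in parse_alsa_devices(sample):
        print(plughw_name(card, device))
```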

I found that mapping /dev/snd to the docker container with -v completely locked up the sound system on my host (Kubuntu 18.04). I had to reboot the host to free it up and play some sounds again, before attempting to run the container using the --device flag.


my new fault

If you're not running this inside Hass.IO, make sure to change the URL for Home Assistant in the Settings. In your case, maybe it's running at http://192.168.1.168:8123?

Thank you very much.
I love your project.
You are doing very well.

The short answer is yes, it will be written locally to $HOME/.rhasspy/profiles. I've added a longer answer to the documentation, which describes how the RHASSPY_PROFILES environment variable is used like PATH to decide where to read/write profile-related files.
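
To illustrate what "used like PATH" can mean in practice, here is a minimal Python sketch of a PATH-style lookup. Treat it as a guess at the general idea, not as Rhasspy's implementation: the function names, the example file name, and the rule that earlier directories win are all assumptions.

```python
import os
from pathlib import Path
from typing import List, Optional


def split_profile_dirs(value: str) -> List[Path]:
    """Split a PATH-style string into directories, skipping empty entries."""
    return [Path(part).expanduser() for part in value.split(os.pathsep) if part]


def find_profile_file(relative_path: str, dirs: List[Path]) -> Optional[Path]:
    """Return the first existing match, checking directories in list order.

    ASSUMPTION: earlier entries take precedence, as with PATH. The real
    read/write order is described in the Rhasspy documentation.
    """
    for directory in dirs:
        candidate = directory / relative_path
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    default = str(Path.home() / ".rhasspy" / "profiles")
    dirs = split_profile_dirs(os.environ.get("RHASSPY_PROFILES", default))
    # "it/profile.json" is a hypothetical file name used only for this demo.
    print(find_profile_file("it/profile.json", dirs))
```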

Docker Image Changes

Everyone: I've consolidated the Rhasspy Docker images into the single name synesthesiam/rhasspy-server:latest in order to avoid confusion (and because it was the most heavily used).

Please use this Docker image when updating Rhasspy. NOTE: it does not include rasaNLU anymore; my plan is to move that out into a separate Docker container and just use its HTTP interface.

Thanks for the hint, @gpbenton.

I started rebooting my Pi after most config/system changes a long time ago, so I did that here as well.

Might just get rid of all containers and start over.

Curious to see, though, if @synesthesiam could get their speakers to work.
As long as my speakers don't work I think it's a settings issue, and therefore I don't see a reason just yet to buy a different (3rd?!) USB microphone.

Started over again, i.e.

  1. stopped and removed the container.
  2. made sure that the USB microphone and analog speakers worked (arecord -D plughw:1,0 --duration=3 test.wav && aplay test.wav)
  3. set up the new container (synesthesiam/rhasspy-server:latest) including --device /dev/snd:/dev/snd instead of -v /dev/snd:/dev/snd
  4. container seems to start up fine; the output when pushing the 'Pronounce' button is now sent to the speakers
  5. pushing the 'Hold to record' button still results in [Errno -9997] Invalid sample rate

Note - Point 2. shows the following info on the screen now:

Recording WAVE 'test.wav' : Unsigned 8 bit, Rate 8000 Hz, Mono
Playing WAVE 'test.wav' : Unsigned 8 bit, Rate 8000 Hz, Mono

Update:
I have since added a file named .bash_aliases to my home directory; its only content is the line
alias arecord='arecord -D plughw:1,0 --format=S16_LE --rate=48000'

It changes the message that is displayed when using this command arecord -D plughw:1,0 --duration=3 test.wav && aplay test.wav to

Recording WAVE 'test.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Mono
Playing WAVE 'test.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Mono

Pushing the 'Hold to record' button still results in [Errno -9997] Invalid sample rate

OK, at least you're making some progress. What do you see in the list of devices on Rhasspy's Speech tab? Is there anything besides "default"?

Here is the dropdown list:
[image: device dropdown list]

If I select Default Device or 2, I get the invalid sample rate error; for all the others I get
[Errno -9998] Invalid number of channels

Here are some further results of my testing & trying:

Opening a bash shell inside the container with
docker exec -it condescending_sanderson /bin/bash

and running

arecord -D plughw:1,0 --duration=3 test.wav && aplay test.wav

works both ways, i.e. it records and plays back properly, but it displays:

Recording WAVE 'test.wav' : Unsigned 8 bit, Rate 8000 Hz, Mono
Playing WAVE 'test.wav' : Unsigned 8 bit, Rate 8000 Hz, Mono

To me, this indicates that the default sample format is unsigned 8 bit (at 8000 Hz), while Rhasspy seems to require 16 bit.
I'm not sure if or where I could add the file .bash_aliases containing the line
alias arecord='arecord -D plughw:1,0 --format=S16_LE --rate=48000'
as shown above.
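
As a side note, the format of a recording can also be checked from the file itself rather than from the arecord banner. Here is a minimal Python sketch using the standard-library wave module; the function name describe_wav is made up for illustration. Keep in mind that a shell alias only changes what the interactive arecord command does, so it would not by itself change what another program requests from the sound device.

```python
import wave
from typing import Tuple


def describe_wav(path: str) -> Tuple[int, int, int]:
    """Return (channels, bits_per_sample, sample_rate_hz) for a WAV file."""
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        bits = wav_file.getsampwidth() * 8  # getsampwidth() is in bytes
        rate = wav_file.getframerate()
    return channels, bits, rate


if __name__ == "__main__":
    import sys

    # Usage: python describe_wav.py test.wav
    channels, bits, rate = describe_wav(sys.argv[1])
    print(f"{channels} channel(s), {bits} bit, {rate} Hz")
```

A file recorded with the arecord defaults shown above should report 8 bit and 8000 Hz, and one recorded with --format=S16_LE --rate=48000 should report 16 bit and 48000 Hz.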

Can that be added to the container at all? Any hints?
nano doesn't seem to work inside the container and I'm not familiar with vi.

Another update:
Within the container, I changed the following entries in the file /usr/share/alsa/alsa.conf

from

defaults.ctl.card 0
defaults.pcm.card 0

to

defaults.ctl.card 1
defaults.pcm.card 1

This now means that I can record through the USB microphone, but playback through the speakers does not work. :(

Leaving the pcm entry at 0 results in yet another error message, though.

OK, I've got a couple of things in place and hopefully one will help :)
Make sure you pull the latest Docker image before trying. For some reason, it seems like you have to pull the architecture-specific image to get an update (I'm still learning how to operate Docker):

docker pull synesthesiam/rhasspy-server:armhf

In the Profiles tab, I've added an Audio System section where you can pick either PyAudio (the default) or have Rhasspy use arecord directly. Try picking the latter, saving, and then refreshing. If this works, I'd be very interested in what PyAudio and arecord are doing differently!

In case that doesn't work, I've added some instructions on running Rhasspy in a virtual environment. I'm really hoping that the first option works for you, though.

Good luck, and thanks for sticking with it!