Rhasspy offline voice assistant toolkit

Hey, all my internal services don't use any encryption (MQTT, Rhasspy, Node-RED, etc.). Only HA is accessible through an nginx reverse proxy with TLS from the outside.

Ok, if you still want to try hermes-audio-server, then try removing the tls key from the configuration file:

        "tls": {
            "ca_certificates": "",
            "client_certificate": "",
            "client_key": ""
        }
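For context, a minimal configuration with the tls section dropped entirely might look roughly like this (the site name, host, and port are placeholders; check the hermes-audio-server README for the authoritative list of keys):

```json
{
    "site": "default",
    "mqtt": {
        "host": "localhost",
        "port": 1883
    }
}
```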

I have just confirmed that this leads to the error message you described when using this configuration with a broker that's not using TLS. I'll update the documentation (and the error handling).

Ah yes, removing the tls config completely works :smiley:

Cool! By the way, a new release is right around the corner, with logging, a daemon mode, and (still working on it) a way to choose your audio devices.


Cool, will try the new release when it is ready.

btw, this is a bit off topic but maybe you can answer me this.
I'm not a huge Linux/audio expert. I was thinking about using my Pi not just for voice recognition but turning it into a kind of "smart speaker" using mpd or Volumio. But for this to work, it would need to lower the volume of the media player when there is audio playback through Hermes. Can this be done with "pure" ALSA, or is something like a sound server needed for this?

I'm not a huge audio expert myself, but I think this is possible. I know that a few people have talked about this on the Snips forum, but I can't find the relevant discussions now. You'll have to try :slight_smile:
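One possible starting point, assuming a PulseAudio setup rather than pure ALSA: pactl can change the volume of a single stream ("sink input") without touching the others, so you could duck just the music player's stream. A minimal sketch of building that command (the stream ID is a placeholder you'd look up first):

```python
def duck_volume_cmd(sink_input_id: int, volume_pct: int) -> list[str]:
    # pactl (part of PulseAudio) can set the volume of a single stream
    # ("sink input") while leaving other streams untouched. Find the
    # stream's numeric ID first with: pactl list sink-inputs short
    return ["pactl", "set-sink-input-volume", str(sink_input_id), f"{volume_pct}%"]

# Run it with, for example (42 is a placeholder stream ID):
#   import subprocess
#   subprocess.run(duck_volume_cmd(42, 40), check=True)
```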

Ok figured out the play-wav problem.

WAV URLs are now supported in addition to raw audio. (Thanks, by the way :slight_smile: )

Because of this, you need the correct Content-Type header (audio/wav) if you are sending raw audio :slight_smile:
If no header is set, it will try to read the content as a URL.
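To illustrate the header requirement, here is a minimal sketch of building such a request with the standard library (the host and port are assumptions based on Rhasspy's default HTTP port; adjust to your setup):

```python
from urllib import request

def build_play_wav_request(wav_bytes: bytes,
                           url: str = "http://localhost:12101/api/play-wav") -> request.Request:
    # The audio/wav Content-Type tells the server the body is raw WAV
    # audio; without it, the body would be interpreted as a URL to fetch.
    return request.Request(url,
                           data=wav_bytes,
                           headers={"Content-Type": "audio/wav"},
                           method="POST")

# Send it with: request.urlopen(build_play_wav_request(wav_bytes))
```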


Just to give something back to the community, I have created a small GitHub repository where I will add custom Rhasspy examples. At the moment, only my improved timer for Node-RED is available.


After some more testing I noticed that Rhasspy's HermesAudioRecorder only records from the first site ID:

    if len(self.site_ids) > 0:
        # Only the first configured site ID is ever used; any others are ignored.
        self.site_id = self.site_ids[0]

Is this by design?

I have just released Hermes Audio Server 0.2.0. You can upgrade your current version with:

 sudo pip3 install hermes-audio-server --upgrade

This release has a new option -d which starts the program in the background as a daemon. It then logs its output to syslog; you can follow it with tail -f /var/log/syslog. Instructions to start the programs as systemd services are in the README on GitHub (not yet on the PyPI project page).
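The README has the official instructions; as a rough sketch, such a systemd unit might look like this (the binary path, user name, and flags are assumptions for illustration):

```ini
[Unit]
Description=Hermes Audio Player
After=network.target

[Service]
Type=simple
User=hermes
ExecStart=/usr/local/bin/hermes-audio-player -v
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Note that under systemd you would normally run the program in the foreground (no -d) and let systemd handle the daemonizing.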

With the option -v you now get a lot more debug information, which should help with troubleshooting. If you encounter some problems, please open an issue on GitHub.

I have deferred the option to configure the audio devices to a future version.

Hi,
just updated to 0.2.0, but I have problems running this as a daemon (I have not tried the systemd service). Running it in normal mode works without any problems, but with the daemon flag there is no audio input or output. Nothing of interest in syslog either.

Jun 4 20:53:56 raspberrypi hermes-audio-player[4474]: hermes-audio-player 0.2.0

This is the only message for the player as long as it is running.
Don't have any more time today for testing, but I will try the systemd setup with a dedicated user tomorrow.

Btw, any plans to add support for the LEDs and buttons on the ReSpeaker HAT?
I thought about a switch to disable voice recording completely, or manual wake word support? (Is there a long press?) If it was an MQTT switch that you could toggle through HA, it would be even better :slight_smile:
LEDs would be nice for a silent mode without sound to indicate listening.

Maybe try to run the daemon with the -v flag temporarily to see if this gives more information in the syslog.

For the LEDs and buttons on the ReSpeaker HAT I refer you to the excellent Snips Led Control. I consider this out of scope for Hermes Audio Server.

One thing I'm definitely considering is remote control via MQTT, with messages carrying the status of the player and recorder, to disable and enable the player and recorder on specific sites, volume control, and so on. Then you could for instance create an automation in Home Assistant to automatically disable audio input and output for the satellite in your living room when you're watching something in Kodi (and automatically re-enable it when you pause Kodi). I had something like that for the hotword when I was using Snips; it's perfect to prevent unintended activations.
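The hotword toggle mentioned here already exists in the Snips Hermes protocol as the hermes/hotword/toggleOn and toggleOff topics; an equivalent for the audio server would be new. A minimal sketch of building such a toggle message (the site ID is a placeholder):

```python
import json

def hotword_toggle(site_id: str, on: bool) -> tuple[str, str]:
    """Build a Hermes hotword toggle message for one site.

    The topic names follow the Snips Hermes protocol; publish the
    result with any MQTT client (e.g. mosquitto_pub or paho-mqtt).
    """
    topic = "hermes/hotword/toggleOn" if on else "hermes/hotword/toggleOff"
    return topic, json.dumps({"siteId": site_id})

# For example, on the command line:
#   mosquitto_pub -t hermes/hotword/toggleOff -m '{"siteId": "livingroom"}'
```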

Thanks! There's not currently a way to test the wake word through the browser, but I started working on a websocket-based audio recorder, so you could theoretically stream audio to Rhasspy entirely through a browser!

I'm sure this could be done with PulseAudio and its family of command-line tools. I actually have a version of the Rhasspy Docker image that works with PulseAudio, if you're interested.

It is, but only because I'm not sure what will happen if the two audio streams get mixed (e.g., if speech is detected by two sites simultaneously). If you don't think that will be a problem, I can remove the check :slight_smile:

That's a very interesting and unusual error. The relevant line is:

OSError: /profiles/en/porcupine/libpv_porcupine.so: failed to map segment from shared object

Which kind of Pi are you using? There are a ton of libpv_porcupine.so files, and it's possible I picked the wrong one for your architecture. It's worth trying a different one and seeing if it loads.

I'm not sure either what will happen :slight_smile: But I do think it could be a problem: wouldn't both streams getting mixed give the wake word detector and command listener an audio stream where it's difficult to process your voice because there are 'gaps' (you are talking in one stream, silent in another one from another room) or 'duplications' (you are talking in both streams, because there are two microphones in the same room)?

Wouldn't it be better to keep the audio streams from each recorder separate and also handle them separately further down the pipeline, so the wake word detector and command listener get to process a single audio stream for each site?
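To make the idea concrete, here is a minimal sketch (all names are assumptions, not the project's actual classes) of keeping one recorder per site in a dict instead of picking site_ids[0], so each site's audio stays in its own buffer:

```python
class SiteRecorder:
    """Stand-in for a per-site audio recorder; frames models the audio buffer."""

    def __init__(self, site_id: str):
        self.site_id = site_id
        self.frames: list[bytes] = []

    def record(self, chunk: bytes) -> None:
        self.frames.append(chunk)

def make_recorders(site_ids: list[str]) -> dict[str, SiteRecorder]:
    # Keying by site ID keeps the streams separate, so the wake word
    # detector and command listener each see a single site's audio.
    return {site_id: SiteRecorder(site_id) for site_id in site_ids}
```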

Rhasspy (Porcupine) is not running on a Pi at all; like I said, x64 Linux/Debian in Docker.
I am using the Pi only for audio streaming using snips (Pi Zero W).

Just tried to download the .so myself and replaced the one in my profile (Linux x64), but still the same error.

Thank you for the offer, but I'm currently running a setup using snips/hermes as receiver and player, and Rhasspy itself is running on a server in the basement. So I would have to set up something on my satellite, which I want to keep as light as possible.

Actually we could use something like 'rhasspy-satellite': a stripped Rhasspy system that only runs something like Hermes Audio Server and the wake word component, so you could install this on a Pi. It could then stream audio to the full Rhasspy server running on a remote system when the wake word is detected on the Pi, and play the audio that the Rhasspy server streams back. This would also make it possible to only stream audio after the wake word is detected, whereas now Hermes Audio Server streams audio continuously (at least when Voice Activity Detection isn't enabled, and that isn't working so well).

This would actually be amazing. I'd like to split the audio from an RTSP stream from my camera and feed it into Rhasspy and see how that goes, so this will help with that.