Rhasspy offline voice assistant toolkit

Hey, all my internal services don't use any encryption (MQTT, Rhasspy, Node-RED, etc.). Only HA is accessible through an nginx reverse proxy with TLS from the outside.

Ok, if you still want to try hermes-audio-server, then try removing the tls key from the configuration file:

        "tls": {
            "ca_certificates": "",
            "client_certificate": "",
            "client_key": ""
        }
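For context, a minimal configuration with the tls section dropped entirely might look roughly like this (the site name, host, and port are placeholders; check the hermes-audio-server README for the authoritative list of keys):

```json
{
    "site": "default",
    "mqtt": {
        "host": "localhost",
        "port": 1883
    }
}
```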

I have just confirmed that this leads to the error message you described when using this configuration with a broker that's not using TLS. I'll update the documentation (and the error handling).

Ah yes, removing the tls config completely works :smiley:

Cool! By the way, a new release is right around the corner, with logging, a daemon mode, and (still working on it) a way to choose your audio devices.


Cool, will try the new release when it is ready.

btw, this is a bit off topic but maybe you can answer me this.
I'm not a huge Linux/audio expert. I was thinking about using my Pi not just for voice recognition but turning it into a kind of "smart speaker" using mpd or Volumio. But for this to work, it would need to lower the volume of the media player when there is audio playback through Hermes. Can this be done with "pure" ALSA, or is something like a sound server needed for this?

I'm not a huge audio expert myself, but I think this is possible. I know that a few people have talked about this on the Snips forum, but I can't find the relevant discussions now. You'll have to try :slight_smile:
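One possible starting point, assuming a PulseAudio setup rather than pure ALSA: pactl can change the volume of a single stream ("sink input") without touching the others, so you could duck just the music player's stream. A minimal sketch of building that command (the stream ID is a placeholder you'd look up first):

```python
def duck_volume_cmd(sink_input_id: int, volume_pct: int) -> list[str]:
    # pactl (part of PulseAudio) can set the volume of a single stream
    # ("sink input") while leaving other streams untouched. Find the
    # stream's numeric ID first with: pactl list sink-inputs short
    return ["pactl", "set-sink-input-volume", str(sink_input_id), f"{volume_pct}%"]

# Run it with, for example (42 is a placeholder stream ID):
#   import subprocess
#   subprocess.run(duck_volume_cmd(42, 40), check=True)
```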

Ok figured out the play-wav problem.

WAV URLs are now supported in addition to raw audio. (Thanks, by the way :slight_smile: )

Because of this, you need the correct Content-Type header (audio/wav) if you are sending raw audio :slight_smile:
If no header is set, it will try to read the content as a URL.
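To illustrate the header requirement, here is a minimal sketch of building such a request with the standard library (the host and port are assumptions based on Rhasspy's default HTTP port; adjust to your setup):

```python
from urllib import request

def build_play_wav_request(wav_bytes: bytes,
                           url: str = "http://localhost:12101/api/play-wav") -> request.Request:
    # The audio/wav Content-Type tells the server the body is raw WAV
    # audio; without it, the body would be interpreted as a URL to fetch.
    return request.Request(url,
                           data=wav_bytes,
                           headers={"Content-Type": "audio/wav"},
                           method="POST")

# Send it with: request.urlopen(build_play_wav_request(wav_bytes))
```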


Just to give something back to the community, I have created a small GitHub repository where I will add custom Rhasspy examples. At the moment, only my improved timer for Node-RED is available.


After some more testing I noticed that Rhasspy's HermesAudioRecorder only records from the first site ID:

    if len(self.site_ids) > 0:
        # Only the first configured site ID is ever used; any others are ignored.
        self.site_id = self.site_ids[0]

Is this by design?

I have just released Hermes Audio Server 0.2.0. You can upgrade your current version with:

 sudo pip3 install hermes-audio-server --upgrade

This release has a new option -d which starts the program in the background as a daemon. It then logs its output to syslog; you can follow it with tail -f /var/log/syslog. Instructions to start the programs as systemd services are in the README on GitHub (not yet on the PyPI project page).
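The README has the official instructions; as a rough sketch, such a systemd unit might look like this (the binary path, user name, and flags are assumptions for illustration):

```ini
[Unit]
Description=Hermes Audio Player
After=network.target

[Service]
Type=simple
User=hermes
ExecStart=/usr/local/bin/hermes-audio-player -v
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Note that under systemd you would normally run the program in the foreground (no -d) and let systemd handle the daemonizing.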

With the option -v you now get a lot more debug information, which should help with troubleshooting. If you encounter some problems, please open an issue on GitHub.

I have deferred the option to configure the audio devices to a future version.

Hi,
just updated to 0.2.0, but I have problems running this as a daemon (I have not tried the systemd service). Running it in normal mode works without any problems, but with the daemon flag there is no audio input or output. Nothing of interest in syslog either.

Jun 4 20:53:56 raspberrypi hermes-audio-player[4474]: hermes-audio-player 0.2.0

This is the only message for the player as long as it is running.
Don't have any more time today for testing, but I will try the systemd setup with a dedicated user tomorrow.

Btw, any plans to add support for the LEDs and buttons on the ReSpeaker HAT?
I thought about a switch to disable voice recording completely, or manual wake word support? (Is there a long press?) If it was an MQTT switch that you could toggle through HA, it would be even better :slight_smile:
LEDs would be nice for a silent mode without sound to indicate listening.

Maybe try to run the daemon with the -v flag temporarily to see if this gives more information in the syslog.

For the LEDs and buttons on the ReSpeaker HAT I refer you to the excellent Snips Led Control. I consider this out of scope for Hermes Audio Server.

One thing I'm definitely considering is remote control via MQTT, with messages carrying the status of the player and recorder, to disable and enable the player and recorder on specific sites, volume control, and so on. Then you could for instance create an automation in Home Assistant to automatically disable audio input and output for the satellite in your living room when you're watching something in Kodi (and automatically re-enable it when you pause Kodi). I had something like that for the hotword when I was using Snips; it's perfect to prevent unintended activations.
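The hotword toggle mentioned here already exists in the Snips Hermes protocol as the hermes/hotword/toggleOn and toggleOff topics; an equivalent for the audio server would be new. A minimal sketch of building such a toggle message (the site ID is a placeholder):

```python
import json

def hotword_toggle(site_id: str, on: bool) -> tuple[str, str]:
    """Build a Hermes hotword toggle message for one site.

    The topic names follow the Snips Hermes protocol; publish the
    result with any MQTT client (e.g. mosquitto_pub or paho-mqtt).
    """
    topic = "hermes/hotword/toggleOn" if on else "hermes/hotword/toggleOff"
    return topic, json.dumps({"siteId": site_id})

# For example, on the command line:
#   mosquitto_pub -t hermes/hotword/toggleOff -m '{"siteId": "livingroom"}'
```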

Thanks! There's not currently a way to test the wake word through the browser, but I started working on a websocket-based audio recorder, so you could theoretically stream audio to Rhasspy entirely through a browser!

I'm sure this could be done with PulseAudio and its family of command-line tools. I actually have a version of the Rhasspy Docker image that works with PulseAudio, if you're interested.

It is, but only because I'm not sure what will happen if the two audio streams get mixed (e.g., if speech is detected by two sites simultaneously). If you don't think that will be a problem, I can remove the check :slight_smile:

That's a very interesting and unusual error. The relevant line is:

OSError: /profiles/en/porcupine/libpv_porcupine.so: failed to map segment from shared object

Which kind of Pi are you using? There are a ton of libpv_porcupine.so files, and it's possible I picked the wrong one for your architecture. It's worth trying a different one and seeing if it loads.

I'm not sure either what will happen :slight_smile: But I do think it could be a problem: wouldn't both streams getting mixed give the wake word detector and command listener an audio stream where it's difficult to process your voice because there are 'gaps' (you are talking in one stream, silent in another one from another room) or 'duplications' (you are talking in both streams, because there are two microphones in the same room)?

Wouldn't it be better to keep the audio streams from each recorder separate and also handle them separately further down the pipeline, so the wake word detector and command listener get to process a single audio stream for each site?
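To make the idea concrete, here is a minimal sketch (all names are assumptions, not the project's actual classes) of keeping one recorder per site in a dict instead of picking site_ids[0], so each site's audio stays in its own buffer:

```python
class SiteRecorder:
    """Stand-in for a per-site audio recorder; frames models the audio buffer."""

    def __init__(self, site_id: str):
        self.site_id = site_id
        self.frames: list[bytes] = []

    def record(self, chunk: bytes) -> None:
        self.frames.append(chunk)

def make_recorders(site_ids: list[str]) -> dict[str, SiteRecorder]:
    # Keying by site ID keeps the streams separate, so the wake word
    # detector and command listener each see a single site's audio.
    return {site_id: SiteRecorder(site_id) for site_id in site_ids}
```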

Rhasspy (Porcupine) is not running on a Pi at all; like I said, x64 Linux/Debian in Docker.
I am using the Pi only for audio streaming using snips (Pi Zero W).

Just tried to download the .so myself and replaced the one in my profile (Linux x64), but still the same error.

Thank you for the offer, but I'm currently running a setup using snips/hermes as receiver and player, and Rhasspy itself is running on a server in the basement. So I would have to set up something on my satellite, which I want to keep as light as possible.

Actually we could use something like 'rhasspy-satellite': a stripped Rhasspy system that only runs something like Hermes Audio Server and the wake word component, so you could install this on a Pi. It could then stream audio to the full Rhasspy server running on a remote system when the wake word is detected on the Pi, and play the audio that the Rhasspy server streams back. This would also make it possible to only stream audio after the wake word is detected, whereas now Hermes Audio Server streams audio continuously (at least when Voice Activity Detection isn't enabled, and that isn't working so well).

This would actually be amazing. I'd like to split the audio from an RTSP stream from my camera and feed it into Rhasspy and see how that goes, so this will help with that.