Raspberry Pi as a HA Voice Assist satellite (Chapter 4)

UPDATE: Home Assistant’s “Year of the Voice” chapter 5 seems to use a newer package on the RasPi satellite. Please use the link to Year of the Voice - Chapter 5

Please consider this guide to be out of date.


Home Assistant’s “Year of the Voice” suddenly got a lot more interesting and useful with Chapter 4: wake words. I have been using Raspberry Pi Zero, 3A and 3B models with Rhasspy as voice assistant satellites, and decided now is the time to swap to HA Voice Assist.

You will need:

  • a Raspberry Pi (any model). I do NOT recommend buying a Raspberry Pi for use as a satellite, but if you already have one not being used …
  • a microphone and a speaker. These can be integrated in one unit (such as a conferencing speaker, or a reSpeaker 2-mic HAT) or separate USB microphone and speaker. Note that this guide does NOT cover plugging a USB conference mic into your HA machine.
  • Some common sense, the ability to do searches, and the ability to think for yourself. I often find I need to take a break and come back to a problem the next day.

The procedure is pretty straightforward…

The easiest variation has the Home Assistant add-ons doing all the ‘heavy lifting’ … openWakeWord detecting the wake word, Whisper doing speech-to-text and Piper doing text-to-speech - so that is what we will set up and test first.

After getting the basic configuration working we can go back and activate openWakeWord locally on the RasPi satellite. Note that we still want to keep openWakeWord on the HA machine as well for RasPi Zero, v1 or v2 satellites, as well as any other satellite devices which are short on CPU power. But that’s for another day.

Install Home Assistant voice assist add-ons

On Home Assistant, install the add-ons: Whisper (speech-to-text), Piper (text-to-speech) and openWakeWord, plus the Wyoming integration (to link them all together).

Under “Settings > Voice Assistants”, then under “Assist”, configure your desired voice assistant.

After updating, you will want to restart Home Assistant to make sure these add-ons are started correctly.
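Optionally, once the add-ons are running you can check from any Linux box on your network that their Wyoming services are listening. This is only a sketch: it assumes HA is at 192.168.1.98 (as in my later examples) and that the add-ons are on their default ports (Piper 10200, Whisper 10300, openWakeWord 10400).

# check that each Wyoming port answers; "succeeded"/"open" means the add-on is reachable
for port in 10200 10300 10400; do
    nc -zv 192.168.1.98 "$port"
done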

Voices

If there are several options for Text-to-Speech Voice (depending on language), the ones listed as “(medium)” are higher quality and so sound better - but the “(low)” quality ones will process faster on a less powerful computer.

Wake Word

Remember which Wake Word you chose, as that is what you will need to say to get Voice Assist’s attention before every command.

The default list is pretty small, and it is possible to add others or even make your own – however I won’t discuss that here, except to say that three syllables is considered the minimum to avoid it activating too often, and of course choose something you are not likely to say in regular conversation. “Computer” sounds great on Star Trek, but would activate too many times when I talk about my favourite hobby ;-).

Setup Raspberry Pi

On your PC, take a fresh microSD card and install Raspberry Pi OS Lite on it. This can be 64-bit (if you have a RasPi 3, 4 or 5) or 32-bit. Use the “Lite” version, since you won’t use the GUI and it would just slow the RasPi down. If using the Raspberry Pi Imager, I also recommend setting up SSH and Wi-Fi.
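If you flash the card some other way (without the Imager’s advanced options), SSH can still be enabled headlessly. A quick sketch - the mount point and partition name below are assumptions and vary between OS versions, so adjust to wherever the card’s boot partition appears on your PC:

# an empty file named "ssh" in the boot partition enables SSH on first boot
touch /media/$USER/bootfs/ssh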

On your Raspberry Pi, attach your mic and speakers, insert your microSD card and turn on.
When RasPi OS is set up, install any necessary drivers and test the audio hardware. This is probably the hardest part, because it depends entirely on which mic and speakers you have, and thus it’s impossible to give detailed instructions here for every possible device.

Run arecord -L to list available input devices. Pick devices that start with plughw: because they will perform any software audio conversions needed. In my case ABTWPDQ0222M is the USB mic on my workbench:

pi@HA-voice-sat1:~/homeassistant-satellite $ arecord -L
null
    Discard all samples (playback) or generate zero samples (capture)
hw:CARD=ABTWPDQ0222M,DEV=0
    ABTWPDQ-0222-M, USB Audio
    Direct hardware device without any conversions
plughw:CARD=ABTWPDQ0222M,DEV=0
    ABTWPDQ-0222-M, USB Audio
    Hardware device with all software conversions
default:CARD=ABTWPDQ0222M
    ABTWPDQ-0222-M, USB Audio
    Default Audio Device
sysdefault:CARD=ABTWPDQ0222M
    ABTWPDQ-0222-M, USB Audio
    Default Audio Device
front:CARD=ABTWPDQ0222M,DEV=0
    ABTWPDQ-0222-M, USB Audio
    Front output / input
dsnoop:CARD=ABTWPDQ0222M,DEV=0
    ABTWPDQ-0222-M, USB Audio
    Direct sample snooping device
pi@HA-voice-sat1:~/homeassistant-satellite $ 

So I will be using “--mic-device plughw:CARD=ABTWPDQ0222M,DEV=0” to tell the homeassistant-satellite script to use that specific input device.

Run aplay -L to list available output devices. Pick devices that start with plughw: because they will perform software audio conversions. In my case aplay -L gives me a lot of options, including sound through my HDMI monitor or a speaker port on my microphone … but I am using headphones plugged into my RasPi’s 3.5mm speaker socket, which is “plughw:CARD=Headphones,DEV=0”.
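If the list is long, it can help to filter it down to just the plughw entries, for example:

aplay -L | grep -A1 '^plughw:'     # show each plughw device with its first description line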

Test mic and speaker

You can use speaker-test -F S16 -r 16000 -D plughw:CARD=Headphones,DEV=0 to test that sound is coming from your speaker – remembering of course to use your device names.

Try recording 5 seconds to file “out.raw” and then listening to it with:

arecord -f S16_LE -r 16000 -D plughw:CARD=ABTWPDQ0222M,DEV=0 -d 5 -t raw out.raw
aplay -f S16_LE -r 16000 -D plughw:CARD=Headphones,DEV=0 out.raw

I started with one of the tiny USB microphones, but its quality was so poor that Voice Assist was unable to determine what command I was giving. Swapping to a different USB mic made a world of difference; and that is why you should test the audio quality before we add anything else.

Install homeassistant-satellite

On your Raspberry Pi satellite:

  1. Install homeassistant-satellite from the instructions at GitHub - synesthesiam/homeassistant-satellite: Streaming audio satellite for Home Assistant. If you are using an SSH terminal program you can simply copy from the Installation section and paste into the terminal window.
    I also installed the Voice Activity Detection and Audio Enhancement options (copy and paste the lines starting with “.venv/” …).

  2. To run homeassistant-satellite, you then enter the command “script/run” followed by all the options you want, and press [Enter]. I ended up using many of the options … namely:

script/run --host 192.168.1.98 --token HNuI1UfEKX...wUSXzn8xkVKQwgsCDQ \
     --mic-device plughw:CARD=ABTWPDQ0222M,DEV=0 \
     --snd-device plughw:CARD=Headphones,DEV=0 \
     --awake-sound sounds/awake.wav --done-sound sounds/done.wav \
     --auto-gain 5 --vad webrtcvad
  3. You should see a couple of lines displayed as the program initiates, and when it hears a noise it will display “WARNING: root: Speech detected”. This is normal.
    Speak the wake word into the microphone. At this point we are using the openWakeWord add-on on the Home Assistant machine (which is at the IP address or machine name given in the “--host” option), and you will remember that we went to Settings > Voice Assistants and looked under our selected Assist profile to find that my wake word is “Hey Jarvis”.

  4. After a couple of seconds (longer the first time the wake word is used) you should hear the awake.wav sound to indicate that Assist is listening.
    Then you speak your command (e.g. “Turn on the Study light”), and approx 15 seconds later you should hear the done.wav sound to indicate that Assist is now processing your command. This 15-second delay is to make sure that you have actually finished speaking, and are not just pausing mid-sentence.

  5. If the command was recognised, Assist will execute it and play a confirmation through the speaker; otherwise Assist will tell you that it did not understand your command.

  6. And that’s it!

If it works OK, you can try some of the other options: run as a service (automatically starts whenever the RasPi starts, and runs in the background) or run in Docker. You can also see the commands which Voice Assist thought you said in the Home Assistant > Settings > Add-ons > Whisper “Log” tab.

If it doesn’t work … well you can:

  • add the “--debug” option to the script/run command, but be warned it can be hard to find anything useful in all the generated output.
  • add the “--debug-recording-dir” option to hear what homeassistant-satellite heard, as sketched below.
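A minimal sketch of such a debugging run, reusing my options from above - the directory name is my own choice, and I haven’t checked what format the recordings are saved in, so inspect the files before trying to play them:

mkdir -p /home/pi/sat-debug
script/run --host 192.168.1.98 --token HNuI1UfEKX...wUSXzn8xkVKQwgsCDQ \
     --mic-device plughw:CARD=ABTWPDQ0222M,DEV=0 \
     --snd-device plughw:CARD=Headphones,DEV=0 \
     --debug --debug-recording-dir /home/pi/sat-debug

Afterwards, list the directory (ls -l /home/pi/sat-debug) and play back the captured audio to hear what the satellite actually heard.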

Part 2 – Add local Wake Word on Raspberry Pi Voice Assist satellite

Now that we have the basic RasPi Voice Assist satellite configuration working, it’s time to activate openWakeWord locally on the RasPi satellite.

Note that I am not going to turn off openWakeWord on the HA machine, since it may still be useful for any RasPi Zero, v1 or v2 satellites, as well as any other satellite devices which are short on CPU power.

Now the tricky bit … we want both openWakeWord and homeassistant-satellite running at the same time on the RasPi satellite. I am sure there are other/better ways to do this, but I’m no Linux guru … so for now let’s just open a second SSH command-line session. Once they are configured and tested we can put one or both into services or Docker containers.

I am running my test RasPi headless and using the Remmina program on my main Ubuntu Linux PC to open a remote terminal session - but you could be using PuTTY or any other terminal program which uses the SSH (Secure SHell) protocol - mainly because it’s easy to copy and paste into a terminal session :wink:

Leaving the current terminal session open (in case we want to come back to it for debugging), let’s open a second one. Using two terminal sessions means that we can swap between them to watch any output, error messages, etc.
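If you would rather stay in a single SSH connection, something like tmux can give you the two sessions in one window - a quick sketch, assuming it isn’t already installed:

sudo apt-get install tmux
tmux new -s voice      # run homeassistant-satellite in this first window
# Ctrl-B then C opens a second window for wyoming-openwakeword;
# Ctrl-B then 0 or 1 switches between them; Ctrl-B then D detaches.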

Install openWakeWord on the RasPi

To install openWakeWord on the RasPi satellite, we follow instructions at GitHub - rhasspy/wyoming-openwakeword: Wyoming protocol server for openWakeWord wake word detection system. Once again we can copy and paste the blocks of instructions in the “Local Install” section. Namely:

git clone https://github.com/rhasspy/wyoming-openwakeword.git 
cd wyoming-openwakeword 
script/setup

There are only a few options for openWakeWord, but one you probably want to set is which wake word to use. Look in the models subdirectory to see what is already available and the file sizes:

pi@HA-voice-sat1:~/wyoming-openwakeword $ ls -l wyoming_openwakeword/models
total 5912
-rw-r--r-- 1 pi pi  855312 Oct 29 08:13 alexa_v0.1.tflite
-rw-r--r-- 1 pi pi 1330312 Oct 29 08:13 embedding_model.tflite
-rw-r--r-- 1 pi pi 1278912 Oct 29 08:13 hey_jarvis_v0.1.tflite
-rw-r--r-- 1 pi pi  860300 Oct 29 08:13 hey_mycroft_v0.1.tflite
-rw-r--r-- 1 pi pi  416140 Oct 29 08:13 hey_rhasspy_v0.1.tflite
-rw-r--r-- 1 pi pi 1092516 Oct 29 08:13 melspectrogram.tflite
-rw-r--r-- 1 pi pi  206380 Oct 29 08:13 ok_nabu_v0.1.tflite
pi@HA-voice-sat1:~/wyoming-openwakeword $ 

These filenames start with the wake word (except for a couple of other common files which we will ignore for now). Synesthesiam’s instructions use “ok_nabu”, which I guess is because it is the smallest so will run on lower-powered computers, or maybe because Nabu Casa is currently paying his wages ;-). I will also go with “OK Nabu”, to prove that it is the local wake word being detected and not “Hey Jarvis” on the HA machine.

script/run --uri 'tcp://0.0.0.0:10400' --preload-model 'ok_nabu'

Tell homeassistant-satellite to use it

OK; we have a local wake word running, but now we have to tell homeassistant-satellite to use it.

Go back to the homeassistant-satellite session, where homeassistant-satellite should still be running, and press [Ctrl-C] to interrupt it.

Going back to the homeassistant-satellite repository on GitHub, we see under “Wake word detection via a direct wyoming connection” that we need to install the wyoming extra, then add a few additional parameters to the script/run command. In my case this gives:

.venv/bin/pip3 install .[wyoming] 
script/run --host 192.168.1.98 --token HNuI1UfEKX...wUSXzn8xkVKQwgsCDQ \
     --mic-device plughw:CARD=ABTWPDQ0222M,DEV=0 \
     --snd-device plughw:CARD=Headphones,DEV=0 \
     --awake-sound sounds/awake.wav --done-sound sounds/done.wav \
     --auto-gain 5 --vad webrtcvad --wake-word wyoming

With both programs running, I tried giving the command “OK Nabu, turn on the study light” … which worked, and gave feedback on the openWakeWord session:

pi@HA-voice-sat1:~/wyoming-openwakeword $ script/run --uri 'tcp://0.0.0.0:10400' --preload-model 'ok_nabu' --debug 
DEBUG:root:Namespace(uri='tcp://0.0.0.0:10400', models_dir=PosixPath('/home/pi/wyoming-openwakeword/wyoming_openwakeword/models'), custom_model_dir=[], preload_model=['ok_nabu'], threshold=0.5, trigger_level=1, output_dir=None, debug=True, debug_probability=False, model=[])
DEBUG:root:Loading ok_nabu_v0.1 from /home/pi/wyoming-openwakeword/wyoming_openwakeword/models/ok_nabu_v0.1.tflite
DEBUG:wyoming_openwakeword.handler:Started thread for ok_nabu_v0.1
DEBUG:root:Loading /home/pi/wyoming-openwakeword/wyoming_openwakeword/models/melspectrogram.tflite
DEBUG:root:Loading /home/pi/wyoming-openwakeword/wyoming_openwakeword/models/embedding_model.tflite
INFO:root:Ready
DEBUG:wyoming_openwakeword.handler:Client connected: 63323454678340
DEBUG:wyoming_openwakeword.handler:Receiving audio from client: 63323454678340
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
DEBUG:root:Triggered ok_nabu_v0.1 (client=63323454678340)
DEBUG:wyoming_openwakeword.handler:Client disconnected: 63323454678340
DEBUG:wyoming_openwakeword.handler:Client connected: 63354799057683
DEBUG:wyoming_openwakeword.handler:Receiving audio from client: 63354799057683

Now to start tidying up. Firstly I changed my local wake word to “Hey Jarvis” so that all satellites will activate with the same wake word. I noted that when pressing [Ctrl-C] to interrupt openWakeWord, homeassistant-satellite also stopped.
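Presumably (I haven’t shown my exact command, and I’m assuming the model name follows the same pattern as ok_nabu) that just means preloading the hey_jarvis model instead:

script/run --uri 'tcp://0.0.0.0:10400' --preload-model 'hey_jarvis'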

Setup as services

Running two terminal sessions is not feasible for ongoing operation. Also I have noticed that homeassistant-satellite crashes if Home Assistant or openWakeWord is interrupted.

I expect most people will say to simply place both in Docker containers per the instructions given in the github repositories … but I personally don’t feel confident with Docker – so I will start Linux services for them both.

Instructions for setting up as a service are at GitHub - synesthesiam/homeassistant-satellite: Streaming audio satellite for Home Assistant, and I would only add the suggestion that you copy and paste your own script/run command (with all its options) into the service definition.
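For what it’s worth, here is a rough sketch of the sort of unit file that ends up in /etc/systemd/system/homeassistant-satellite.service - the paths, user and options are from my own setup above and are assumptions for yours, so treat the repository’s instructions as authoritative:

[Unit]
Description=Home Assistant satellite
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=pi
WorkingDirectory=/home/pi/homeassistant-satellite
ExecStart=/home/pi/homeassistant-satellite/script/run --host 192.168.1.98 --token HNuI1UfEKX...wUSXzn8xkVKQwgsCDQ --mic-device plughw:CARD=ABTWPDQ0222M,DEV=0 --snd-device plughw:CARD=Headphones,DEV=0 --awake-sound sounds/awake.wav --done-sound sounds/done.wav --auto-gain 5 --vad webrtcvad --wake-word wyoming
Restart=always
RestartSec=1

[Install]
WantedBy=default.target

Then reload systemd, enable the service and watch its log:

sudo systemctl daemon-reload
sudo systemctl enable --now homeassistant-satellite.service
journalctl -u homeassistant-satellite.service -f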

Follow the same procedure for setting up a wyoming-openwakeword service.
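The wyoming-openwakeword unit follows the same pattern; only the working directory and ExecStart change. Adding ordering directives to the satellite unit is my own tweak (not from either repository) to reduce the chance of the satellite starting before the wake word service is ready:

# in /etc/systemd/system/wyoming-openwakeword.service ([Unit] and [Install] sections as above)
WorkingDirectory=/home/pi/wyoming-openwakeword
ExecStart=/home/pi/wyoming-openwakeword/script/run --uri tcp://0.0.0.0:10400 --preload-model hey_jarvis

# and in homeassistant-satellite.service:
Wants=wyoming-openwakeword.service
After=wyoming-openwakeword.service

Then enable it with: sudo systemctl enable --now wyoming-openwakeword.service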


Thank you for these great instructions. I wish I had had them before I installed my satellite. I just wonder: what is the benefit of running OWW on the RasPi? Is recognition faster?

Basically, because that’s the way it was working under Rhasspy :wink:

Three reasons (though the benefits are arguable):

  1. Under Rhasspy, the Wakeword was able to run locally on the satellite machines, and it seems a real waste of a $60 RasPi 3 or 4 to only be passing data packets between interfaces like a $15 microprocessor.
  2. I have had issues with devices dropping off my wi-fi, and wi-fi congestion; so I prefer to minimise the audio data streams going through wi-fi 24/7.
  3. Each wake word running on the HA machine uses memory and processing resources. Not an issue for me personally, but for someone running HA on a RasPi 3 or 4 it could be noticeable.

As for being faster … I have found the difference between the two methods to be negligible compared to the 15-second delay before Whisper decides that you have finished speaking the command.

Note that I do NOT recommend anyone going out and buying a RasPi to run as a satellite - they are expensive overkill for the job - but plenty of people have one lying about currently unused, and they are easy to repurpose when a better satellite device comes along.


I have some Raspberry Pi 3B and 4B boards left over. I was planning to use them for Ropieee or Volumio, but now I think I can use them as satellites; I’m just not sure which microphone to get.

  1. Can I use this satellite code as a Docker image? It’s easier to manage, since I have Portainer installed on HA and already manage multiple Pi satellites.
  2. Do you have any recommendations for a microphone? I’m thinking about getting a USB microphone.
  3. Or, can I install this script on an Android device that I have root access to, and use its speaker and microphone?

Q1 is answered clearly in the homeassistant-satellite instructions (which I gave a link to). The other 2 have been discussed in other threads, and I would rather keep this discussion on topic.

Thank you for this guide, it was super helpful and I was able to get my voice assistant working!


First off, thanks for the guide … I am getting this error when I try to run script/run for the satellite:

AssertionError: Pipeline failed to run

Any idea?

No idea. I am not one of the developers who wrote these programs; and even if I were, the error message you quote doesn’t contain any detail to help work out where it is coming from.

Are you even using a Raspberry Pi with Raspberry Pi OS ?
Did you restart Home Assistant after installing the add-ons there ?

Please give us more clues to make it easier to help you.

Another user reported a problem in a different thread that turned out to be caused by him using the same terminal session for both step 1 and step 2.

It doesn’t hurt to repeat that we want both wyoming-openwakeword and homeassistant-satellite to be running at the same time.

I am running my test RasPi headless and using the Remmina program on my main Ubuntu Linux PC to open a remote terminal session - and so my simple way around this was to open two terminal sessions at the same time. I am sure there are other/better ways to do this, but I’m no Linux god. It also means that I can swap between the two terminal sessions to watch any output, error messages, etc., and of course it’s easy to copy and paste into a terminal session :wink:

This is an absolutely fantastic resource, thanks so much!!

Solved this by using alsamixer to select the specific interface and lower the volume.
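For anyone else hitting this, a rough command-line equivalent - the card number and control name below are hypothetical, so list yours first:

aplay -l                          # find the card number of your speaker
amixer -c 1 scontrols             # list that card's mixer controls
amixer -c 1 sset PCM 70% unmute   # set the playback control you found (name varies by device)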

I’ve managed to get everything working, except that I can’t adjust the playback volume of the voice assistant when she responds … The buttons on the S330 conference device don’t seem to function, and I’ve tried various amixer commands but nothing seems to alter the volume. Does anything need to be set within the satellite configuration for this?


I’m having the same volume issue. Did you find a solution?

Good day to you Don, thank you so much for taking the time to create these tutorials. I was actually working through the Rhasspy tutorial you made in January when I was googling some issues I’m having with my reSpeaker 4-mic HAT, and stumbled across this new tutorial.

I’m curious as to your motivation to move from Rhasspy to HA Voice now - better response, easier integration, or other factors?

I’m a hobby coder/tinkerer and the end game is to have my own AI language model running the voice assistant, but that’s a fair way away still.

So I’m curious (as I’m just starting to create the satellite with the Pi Zero I bought for your Rhasspy tutorial) about the reasons you chose to swap over, before I carry on with the Rhasspy guide. I have a dedicated desktop with Debian running HA after I got tired of fighting my Hyper-V VM for USB passthrough, so I have lots of CPU power to crunch the data. I also already have a Pi 3B.

Would it be better to use the 3B and do a bit more processing on the satellite side to reduce the recognition delays, or stick to pumping the signals to the PC?
Any thoughts on the direction I should take for best efficiency are most appreciated.


Thanks for this thread. I am going to try this on Saturday. I have an RPi 3B and a Jabra USB speaker/microphone lying around. I managed to get Whisper/Piper/openWakeWord running via Docker on Unraid, where I run Home Assistant. I ordered an M5Stack ATOM ESP32 but it won’t get here from China until late December,
so I will use the hardware I have lying around to set up and test my voice assistant.

I am still using Rhasspy 2.11 for my voice assistant. It does what I want, and my partner and I are used to Porcupine’s foibles.

I see HA Voice Assist as the next step, both because it is effectively the next version of Rhasspy by Mike, and because it is designed to be closely integrated with Home Assistant (which is all I am using Rhasspy for).

However … currently HA Voice Assist is only installed on the test RasPi in my study, and there are a couple of aspects I am not happy with - most notably I have a 15 second delay between me stopping talking and Whisper processing the audio.

I have seen comments that the 15-second delay may be caused by the virtual machine that HA is running in - but I have not found the settings which were mentioned.
To be honest it has been some time since I last looked at it (I always seem to distract myself with other projects), and hopefully this wrinkle has already been worked out … or the December HA update may offer improvements.

Consequently I am not in a good position to recommend either way. At least with RasPi it is so easy to swap SD cards to change OS/applications.

That’s great news, thank you for the response. Since I have two Pis, I think I’ll try both and see which works better for my setup. Have a great day.

I am going to use a Pi 3; what speaker and microphone do you recommend?

Great plan since you already have the units. I would love to hear your results.

Recommend? I recommend that you wait until there is a better option on the market. :frowning_face:

I am not aware of anything with comparable sound quality to google or amazon devices … which is not chained to a proprietary cloud platform. Conference speakers are reported to be good, but expensive, and assume there will be no background noise.

I personally am hoping that Nabu Casa will leverage its ESPHome expertise to produce an ESP32-S3-based unit which can use some AI to do local wake word detection, with a 3D-printable case that channels the audio in and out to best effect. :crossed_fingers:

For my latest voice satellite unit I followed advice to just use a USB microphone and a basic speaker (I am not trying to play music, so quality isn’t a big issue) - and it is working fine. Most importantly they are cheap and easily reusable (unlike some of the mic cards I have bought previously).

Hey Don,

Sorry to hijack this response. I was on another Rhasspy thread where you were chiming in on good USB mics for Rhasspy, and I noticed you mentioned you have both the reSpeaker 2-mic and 4-mic HATs in use. I am struggling to get this 4-mic HAT to work. I’ve been following the Rhasspy tutorial you wrote and I’m always stuck at the test-mic-and-speakers part before I move on with the guide. I cannot get it to take a recording.

I’ve been diving into the Seeed forums and HinTak’s GitHub “issues”, and a few other places I’ve stumbled across in Google searches, without success. Hence why I was starting to search for recommendations for another, but it would be a waste of money not to get this HAT working. I’m sure I could repurpose it for something else, since I did have it working on my Windows machine when I first bought it, but I did specifically buy it (before I saw your guide) for AI assistant usage.

Any tips you could give, or do you recall any struggles getting that thing to work in the beginning? Maybe an older RasPi OS version? I’ve only tried Bullseye and am now trying Bookworm.

Thanks ahead of time for anything you can toss my way :slight_smile:

I do like the reSpeaker 4-mic HAT on a RasPi 3A, particularly because of the patterns you can make with the LEDs to give feedback. The down sides are that all of the reSpeaker-type boards + RasPi are too expensive to deploy everywhere; and the driver does not make use of all the mics (and probably never will).

The reason I put emphasis on testing the audio in and out is because I also had problems with that :frowning: Now, can I remember what my problem was and how I resolved it? …


Looking back on my initial notes … the 4-mic HAT uses a different sound chip, so follow the instructions for your card (ReSpeaker 4-Mic Array for Raspberry Pi | Seeed Studio Wiki).

HinTak has been updating the driver for different RasPi OS versions, and I haven’t touched anything on mine since I got it working (uname -r returns 5.10.103-v7+) … so I am wondering if the reSpeaker driver is a bit behind the latest RasPi OS kernel?
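For reference, (re)installing the driver from HinTak’s repository usually looks something like this - a sketch only, since the exact branch to use depends on your kernel version (check the repository’s README):

git clone https://github.com/HinTak/seeed-voicecard.git
cd seeed-voicecard
# uname -r shows your kernel; switch to the matching branch if the README says to
sudo ./install.sh
sudo reboot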

Which RasPi OS are you using? The 32-bit Lite version?

I tested using headphones, for which the aplay -L entry was

plughw:CARD=Headphones,DEV=0
    bcm2835 Headphones, bcm2835 Headphones
    Hardware device with all software conversions

To test with 2 channels (left and right):

$ speaker-test -c2 -Dplughw:CARD=Headphones,DEV=0

To test the microphone, arecord -L included:

$ arecord -L
plughw:CARD=seeed4micvoicec,DEV=0
    seeed-4mic-voicecard, bcm2835-i2s-ac10x-codec0 ac10x-codec0-0
    Hardware device with all software conversions

I did note:
The reSpeaker 4-mic HAT appears to only allow recording from all 4 microphones (-c 4) and a -f format of “S32_LE”, with other options giving errors.

$ arecord -Dac108 -fS32_LE -r16000 -c4 hello.wav
Recording WAVE 'hello.wav' : Signed 32 bit Little Endian, Rate 16000 Hz, Channels 4 
^C

Whatever the microphones hear after entering the command, and until you press [Ctrl-C], will be saved to file “hello.wav”.

Playing the hello.wav file to check the recording requires an extra step for the 4-mic HAT, because the recording is in S32_LE format and aplay works with S16_LE but not S32_LE :frowning: I installed and ran sox to convert formats:

sudo apt-get install sox
sox -v8 hello.wav -c2 -b16 stereo.wav
aplay -Dhw:1 stereo.wav

Well, that does seem sufficiently different from the 2-mic reSpeaker :frowning_face: I hope that these are the hints you were missing.


Current Rhasspy configuration for this satellite is:

{
    "home_assistant": {
        "access_token": "eyJ0eXAi ... ma_XmCZf2SenOU",
        "url": "http://192.168.1.98:8123/"
    },
    "intent": {
        "system": "hermes"
    },
    "microphone": {
        "arecord": {
            "device": "plughw:CARD=seeed4micvoicec,DEV=0",
            "udp_audio_port": "12203"
        },
        "system": "arecord"
    },
    "mqtt": {
        "enabled": "true",
        "host": "192.168.1.98",
        "password": "password",
        "port": "1883",
        "site_id": "sat-1",
        "username": "rhasspy"
    },
    "sounds": {
        "aplay": {
            "device": "plughw:CARD=Headphones,DEV=0",
            "volume": "0.8"
        },
        "system": "aplay"
    },
    "speech_to_text": {
        "system": "hermes"
    },
    "text_to_speech": {
        "system": "hermes"
    },
    "wake": {
        "porcupine": {
            "keyword_path": "porcupine_raspberry-pi.ppn",
            "sensitivity": "0.3",
            "udp_audio": "12203"
        },
        "system": "porcupine"
    }
}