UPDATE: Home Assistant’s “Year of the Voice” chapter 5 seems to use a newer package on the RasPi satellite. Please follow Year of the Voice - Chapter 5 instead.
Please consider this guide to be out of date.
Home Assistant’s “Year of the Voice” suddenly got a lot more interesting and useful with Chapter 4: wake words. I have been using Raspberry Pi Zero, 3A and 3B models with Rhasspy as voice assistant satellites, and decided now is the time to swap to HA Voice Assist.
You will need:
- a Raspberry Pi (any model). I do NOT recommend buying a Raspberry Pi for use as a satellite, but if you already have one not being used …
- a microphone and a speaker. These can be integrated in one unit (such as a conferencing speaker, or a reSpeaker 2-mic HAT) or separate USB microphone and speaker. Note that this guide does NOT cover plugging a USB conference mic into your HA machine.
- Some common sense, the ability to do searches, and the ability to think for yourself. I often find I need to take a break and come back to a problem the next day.
The procedure is pretty straight-forward…
The easiest variation is with Home Assistant add-ons doing all the ‘heavy lifting’ … openWakeWord detecting the wake word as well as Whisper for speech-to-text and Piper for text-to-speech - and so it’s what we will do and test.
After getting the basic configuration working we can go back and activate openWakeWord locally on the RasPi satellite. Note that we still want to keep openWakeWord on the HA machine as well for RasPi Zero, v1 or v2 satellites, as well as any other satellite devices which are short on CPU power. But that’s for another day.
Install Home Assistant voice assist add-ons
On Home Assistant, install the HA add-ons Whisper (speech-to-text), Piper (text-to-speech) and openWakeWord, plus the Wyoming integration (to link them all together).
Under “Settings > Voice Assistants”, then under “Assist”, configure your desired Voice Assistant.
After installing, you will want to restart Home Assistant to make sure these add-ons are started correctly.
Voices
If there are several options for Text-to-Speech Voice (depending on language), the ones listed as “(medium)” are higher quality and so sound better - but the “(low)” quality ones will process faster on a less powerful computer.
Wake Word
Remember which Wake Word you chose, as that is what you will need to say to get Voice Assist’s attention to give it every command.
The default list is pretty small, and it is possible to add others or even make your own – however I won’t discuss that here, except to say that three syllables is considered the minimum to avoid it activating too often, and of course choose something you are not likely to say in regular conversation. “Computer” sounds great on Star Trek, but would activate too many times when I talk about my favourite hobby ;-).
Setup Raspberry Pi
On your PC, take a fresh microSD card and install Raspberry Pi OS Lite on it. It can be 64-bit (if you have a RasPi 3, 4 or 5) or 32-bit. Use the “Lite” version since you won’t use the GUI and it will just slow the RasPi down. If using the Raspberry Pi Imager, I also recommend setting up SSH and Wi-Fi there.
On your Raspberry Pi, attach your mic and speakers, insert your microSD card and turn on.
When RasPi OS is set up, install any necessary drivers, and test the audio hardware. This is probably the hardest part because it depends totally on what mic and speakers you have, and thus it’s impossible to give detailed instructions here for each possible device.
Run arecord -L to list available input devices. Pick devices that start with plughw: because they will perform software audio conversions. In my case ABTWPDQ0222M is the USB mic on my workbench:
pi@HA-voice-sat1:~/homeassistant-satellite $ arecord -L
null
Discard all samples (playback) or generate zero samples (capture)
hw:CARD=ABTWPDQ0222M,DEV=0
ABTWPDQ-0222-M, USB Audio
Direct hardware device without any conversions
plughw:CARD=ABTWPDQ0222M,DEV=0
ABTWPDQ-0222-M, USB Audio
Hardware device with all software conversions
default:CARD=ABTWPDQ0222M
ABTWPDQ-0222-M, USB Audio
Default Audio Device
sysdefault:CARD=ABTWPDQ0222M
ABTWPDQ-0222-M, USB Audio
Default Audio Device
front:CARD=ABTWPDQ0222M,DEV=0
ABTWPDQ-0222-M, USB Audio
Front output / input
dsnoop:CARD=ABTWPDQ0222M,DEV=0
ABTWPDQ-0222-M, USB Audio
Direct sample snooping device
pi@HA-voice-sat1:~/homeassistant-satellite $
So I will be using “--mic-device plughw:CARD=ABTWPDQ0222M,DEV=0” to tell the homeassistant-satellite script to use that specific input device.
Run aplay -L to list available output devices. Pick devices that start with plughw: because they will perform software audio conversions. In my case aplay -L gives me a lot of options, including sound through my HDMI monitor or a speaker port on my microphone … but I am using headphones plugged into my RasPi’s 3.5mm speaker socket, which is “plughw:CARD=Headphones,DEV=0”.
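Since both arecord -L and aplay -L print a lot of entries, a small filter can help you spot the plughw: names. This is only a sketch: list_plughw is a helper name I made up, not part of ALSA, and it simply keeps the lines that begin with plughw:.

```shell
# List only the "plughw:" device names from `arecord -L` / `aplay -L` output.
# list_plughw is a hypothetical helper name, not an ALSA command.
list_plughw() {
    # Device names begin at the start of a line; description lines
    # never begin with "plughw:", so they are dropped.
    grep '^plughw:'
}

# Usage on the satellite (commented out so that pasting this block
# does not fail on a machine without the ALSA tools):
#   arecord -L | list_plughw    # capture (microphone) devices
#   aplay -L | list_plughw      # playback (speaker) devices
```

If nothing is printed, no plughw: device was found, which usually means the driver for your mic or speaker is not loaded yet.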
Test mic and speaker
You can use speaker-test -F S16 -r 16000 -D plughw:CARD=Headphones,DEV=0 to test that sound is coming from your speaker – remembering of course to use your device names.
Try recording 5 seconds to file “out.raw” and then listening to it with:
arecord -f S16_LE -r 16000 -D plughw:CARD=ABTWPDQ0222M,DEV=0 -d 5 -t raw out.raw
aplay -f S16_LE -r 16000 -D plughw:CARD=Headphones,DEV=0 out.raw
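If you expect to repeat this test while trying different microphones, the two commands can be wrapped in a small function. This is a sketch only: audio_roundtrip, run_or_show and the DRY_RUN variable are names I made up for illustration. The arecord and aplay options are exactly the ones shown above.

```shell
# Either run a command, or just print it when DRY_RUN=1.
# run_or_show and DRY_RUN are hypothetical names for this sketch.
run_or_show() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Record from the mic for a few seconds, then play the clip back.
# Arguments: mic device, speaker device, seconds (default 5).
audio_roundtrip() {
    mic=$1
    spk=$2
    secs=${3:-5}
    run_or_show arecord -f S16_LE -r 16000 -D "$mic" -d "$secs" -t raw out.raw &&
        run_or_show aplay -f S16_LE -r 16000 -D "$spk" out.raw
}

# Usage (with your own device names):
#   audio_roundtrip plughw:CARD=ABTWPDQ0222M,DEV=0 plughw:CARD=Headphones,DEV=0
```

Setting DRY_RUN=1 first prints the two commands without touching the audio hardware, which is a quick way to check the device names for typos.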
I started with one of the tiny USB microphones, but its quality was so poor that Voice Assist was unable to determine what command I was giving. Swapping to a different USB mic made a world of difference; and that is why you should test the audio quality before we add anything else.
Install homeassistant-satellite
On your Raspberry Pi satellite:
- Install homeassistant-satellite from the instructions at GitHub - synesthesiam/homeassistant-satellite: Streaming audio satellite for Home Assistant. If you are using an SSH terminal program you can simply copy from the Installation section and paste into the terminal window. I also installed the Voice Activity Detection and Audio Enhancement options (copy and paste the lines starting “.venv/”…).
- To run homeassistant-satellite, enter the command “script/run” followed by all the options you want, and press [enter]. I ended up using many of the options, namely:
script/run --host 192.168.1.98 --token HNuI1UfEKX...wUSXzn8xkVKQwgsCDQ \
--mic-device plughw:CARD=ABTWPDQ0222M,DEV=0 \
--snd-device plughw:CARD=Headphones,DEV=0 \
--awake-sound sounds/awake.wav --done-sound sounds/done.wav \
--auto-gain 5 --vad webrtcvad
- You should see a couple of lines displayed as the program initiates, and when it hears a noise it will display “WARNING: root: Speech detected”. This is normal.
- Speak the wake word into the microphone. At this point we are using the openWakeWord add-on on the Home Assistant machine (which is at the IP address or machine name given in the “--host” option). Remember that you can go to Settings > Voice Assistants and look under your selected Assist profile to find the wake word; mine is “Hey Jarvis”.
- After a couple of seconds (longer the first time the wake word is used) you should hear the awake.wav sound to indicate that Assist is listening.
- Then you speak your command (e.g. “Turn on the Study light”), and approx 15 seconds later you should hear the done.wav sound to indicate that Assist is now processing your command. This 15 second delay is to make sure that you have actually finished speaking, and are not just pausing mid-sentence.
- If the command was recognised, Assist will execute the command and play a confirmation through the speaker; otherwise Assist will tell you that it did not understand your command.
And that’s it !
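The long script/run command above is tedious to retype, so here is one way it could be kept in a small launcher. Treat it as a sketch under assumptions: start_satellite, RUNNER and the token file location are names I made up, and the host and device names are just my values from earlier. The options passed to script/run are only the ones already used in this guide.

```shell
# Settings from this guide; replace them with your own values.
HA_HOST="192.168.1.98"
MIC_DEVICE="plughw:CARD=ABTWPDQ0222M,DEV=0"
SND_DEVICE="plughw:CARD=Headphones,DEV=0"
# Assumed location: a file holding only the token, so it is not typed
# on the command line every time.
TOKEN_FILE="$HOME/.ha-satellite-token"

# start_satellite is a hypothetical helper. Run it from inside the
# homeassistant-satellite directory, because script/run is a relative path.
start_satellite() {
    token=$(cat "$TOKEN_FILE") || return 1
    # RUNNER is a hypothetical override: RUNNER=echo prints the options
    # instead of starting the satellite.
    ${RUNNER:-script/run} --host "$HA_HOST" --token "$token" \
        --mic-device "$MIC_DEVICE" \
        --snd-device "$SND_DEVICE" \
        --awake-sound sounds/awake.wav --done-sound sounds/done.wav \
        --auto-gain 5 --vad webrtcvad
}

# Usage:
#   cd ~/homeassistant-satellite && start_satellite
```

Keeping the token in its own file (readable only by you, e.g. chmod 600) also keeps it out of your shell history.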
If it works OK you can try some of the other options: run as a service (starts automatically whenever the RasPi starts and runs in the background), or run in Docker. You can also see the commands which Voice Assist thought you said in the “Log” tab under Home Assistant > Settings > Add-ons > Whisper.
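For the “run as a service” option, the project’s GitHub page is the place to look for its own instructions. As a rough idea of what is involved, the sketch below prints a generic systemd unit file. Everything specific in it is an assumption on my part: the print_unit helper name, the install directory /home/pi/homeassistant-satellite, the user pi, and the service file name in the usage comments.

```shell
# Print a generic systemd unit for the satellite.
# print_unit is a hypothetical helper.
# Arguments: install directory, user to run as, options for script/run.
print_unit() {
    dir=$1
    user=$2
    opts=$3
    cat <<EOF
[Unit]
Description=Home Assistant satellite
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=$user
WorkingDirectory=$dir
ExecStart=$dir/script/run $opts
Restart=always
RestartSec=1

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (assumed paths and names, adjust to your setup):
#   print_unit /home/pi/homeassistant-satellite pi "--host 192.168.1.98 --token YOUR_TOKEN" \
#       | sudo tee /etc/systemd/system/homeassistant-satellite.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now homeassistant-satellite.service
```

Note that the token ends up in the unit file in plain text, so anyone who can read that file can read the token.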
If it doesn’t work … well you can:
- add the "--debug" option to the script/run command, but be warned it can be hard to find anything useful in all the generated output.
- add the "--debug-recording-dir" option to hear what homeassistant-satellite heard.
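One way to cope with the volume of debug output is to save it to a file and then search that file. The sketch below is only a starting point: scan_log is a helper name I made up, the log file name is arbitrary, and the keyword list is a guess at what is worth looking for, not something taken from homeassistant-satellite itself.

```shell
# Show lines that look like problems, with their line numbers.
# scan_log is a hypothetical helper; the keyword list is a guess.
scan_log() {
    grep -n -i -E 'error|warning|exception|traceback' "$1"
}

# Usage:
#   script/run ...your options... --debug 2>&1 | tee satellite-debug.log
#   scan_log satellite-debug.log
```

The line numbers make it easier to open the saved log and read what happened just before each hit.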