It’s pretty straightforward to do, just a little legwork:
- Have a fully operational Airfoil system working and connected to your HomePods or other preferred output devices
- Set the Airfoil source to be Text-To-Speech
- Configure the system running Airfoil to accept inbound SSH commands (I use a Mac; you can follow this tutorial to set it up)
- Set up Home Assistant shell commands that SSH into the Mac and run its say command (which is why you set the Airfoil source to Text-To-Speech). If you’re using Windows, you’ll have to research the best way to do this.
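As a rough sketch of the SSH step above, something like the following could create a key for Home Assistant and confirm that the Mac will speak over SSH. The function name `setup_tts_key` and the `RUN` dry-run hook are made up for illustration, the key path and target are placeholders, and it assumes `ssh-keygen` and `ssh-copy-id` are available wherever you run it.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original setup.
# Set RUN=echo to print the commands instead of executing them.
setup_tts_key() {
    local key="$1"      # e.g. /config/ssh/id_rsa
    local target="$2"   # e.g. user@mac-address (placeholder)

    # Create a passphrase-less key so Home Assistant can log in unattended.
    ${RUN:-} ssh-keygen -t rsa -b 4096 -N "" -f "$key"
    # Install the public key on the Mac (asks for the Mac password once).
    ${RUN:-} ssh-copy-id -i "$key.pub" "$target"
    # Confirm that the Mac speaks when told to over SSH.
    ${RUN:-} ssh -i "$key" "$target" 'say "SSH test from Home Assistant"'
}

# Dry run that only prints the three commands:
#   RUN=echo setup_tts_key /config/ssh/id_rsa user@mac-address
```

Remote Login also has to be enabled on the Mac before any of this will connect.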
I enhanced mine quite a bit. I got custom voices for the Mac running Airfoil (there’s a company that sells really nice, authentic-sounding voices; mine is a beautiful Scottish female voice), and I wrote a custom script that Home Assistant runs when it SSHes into that Mac. The script gives me more control over the volume, which voice to use, which speakers to use, and so forth.
I can also use AppleScript commands to run Apple Music for whole-house audio (although with HomePods it’s also easy to just say “Play greatest hits everywhere”; the choice is yours).
Whenever we use Siri for anything (playing music, giving the weather, etc.), the speakers disconnect from Airfoil. The bash script I wrote therefore re-connects to the speaker groups I defined in Airfoil every time it runs, and all my TTS commands in HASS have a built-in delay of about 2 seconds to give Airfoil time to reconnect to the speakers I want to broadcast on.
Here is the bash script I wrote (you may want to tweak it for your own purposes):
#!/bin/bash
Speaker=""
Volume_Level=0
Text=""
Voice=""
Source=""
Delay=.75
IsGroup=0
Help=""
while getopts ":t:s:l:v:r:" arg; do
    case $arg in
        t) Text=$OPTARG;;
        s) Speaker=$OPTARG;;
        l) Volume_Level=$OPTARG;;
        v) Voice=$OPTARG;;
        r) Source=$OPTARG;;
    esac
done
# Don't proceed if no text is passed
if [ "$Text" = "" ]
then
    printf "\nCommand line options: \n"
    printf " -t: Text to speak (required)\n"
    printf " -s: Speaker or speaker group (defaults to Default group)\n"
    printf " -l: Optional volume level (only for non grouped speakers)\n"
    printf " -v: Optional voice (defaults to CereVoice Kirsty)\n"
    printf " -r: Optional source (i.e., Text to Speech) - NOT IMPLEMENTED\n"
    printf "\n"
    exit
fi
# Set default speaker
if [ "$Speaker" = "" ]
then
    Speaker="Default"
fi
if [ "$Speaker" = "Default" ]; then IsGroup=1; fi
if [ "$Speaker" = "Upstairs" ]; then IsGroup=1; fi
if [ "$Speaker" = "Downstairs" ]; then IsGroup=1; fi
# Set default voice
if [ "$Voice" = "" ]
then
    Voice="CereVoice Kirsty"
fi
# Set default source
if [ "$Source" = "" ]
then
    Source="Text to Speech"
fi
# If using group speakers, give a bit longer delay so they can all connect
if [ $IsGroup -gt 0 ]
then
    Delay=1.5
fi
osascript <<EOD
tell application "Airfoil"
    disconnect from (every speaker whose name starts with "Default")
    disconnect from (every speaker whose name starts with "Upstairs")
    disconnect from (every speaker whose name starts with "Downstairs")
    disconnect from (every speaker whose name starts with "Basement")
    disconnect from (every speaker whose name starts with "Bedroom")
    disconnect from (every speaker whose name starts with "Office")
    disconnect from (every speaker whose name starts with "Family Room")
    disconnect from (every speaker whose name starts with "Theater")
    disconnect from (every speaker whose name starts with "Gym")
end tell
delay $Delay
tell application "Airfoil"
    connect to (every speaker whose name starts with "$Speaker")
end tell
delay $Delay
EOD
# Volume is only applied to individual speakers, not to groups
if [ $IsGroup -lt 1 ]
then
    if (( $(echo "$Volume_Level > 0" |bc -l) ))
    then
        osascript <<EOD
tell application "Airfoil"
    set (volume of every speaker whose name starts with "$Speaker") to $Volume_Level
end tell
EOD
    fi
fi
#Text="$Text. TTS version 1."
say -v "$Voice" "$Text"
# After speaking on a specific speaker or group, switch back to the Default group
if [ "$Speaker" != "Default" ]
then
    osascript <<EOD
delay 1.5
tell application "Airfoil"
    disconnect from (every speaker whose name starts with "$Speaker")
    delay 1
    connect to (every speaker whose name starts with "Default")
end tell
EOD
fi
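One thing to watch in the script above: the `bc` comparison expects `-l` to be a number, so an empty or non-numeric value (for example when the volume is not supplied by the caller) will make it complain. A minimal sketch of a guard is shown here; the function name `is_positive_number` is hypothetical and not part of my script.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when the argument is a plain decimal
# number greater than zero (for example 5, 0.5, .75 or 1.).
is_positive_number() {
    local value="$1"
    local re='^([0-9]+\.?[0-9]*|\.[0-9]+)$'
    # Reject empty strings, signs, option flags and any other text.
    [[ "$value" =~ $re ]] || return 1
    # A value made only of zeros (and a dot) is not greater than zero.
    [[ "$value" =~ [1-9] ]]
}

# Possible use in place of the bc test:
#   if is_positive_number "$Volume_Level"; then ...set the volume...; fi
```

Because this is pure bash, it also avoids the dependency on `bc`.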
And here is how I execute it using HA shell commands:
speak: ssh -i /config/ssh/id_rsa -o 'StrictHostKeyChecking=no' [username]@[IP address] '~/Scripts/tts_v2.sh -t "{{text}}" -s "{{speaker}}" -l {{volume}} -v "{{voice}}" -r "{{source}}"'
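If you want to test the same call by hand from a bash prompt, keep in mind that the remote shell parses the command string a second time, so text containing quotes or spaces can get split up. The sketch below uses bash's `printf %q` to escape the values first. The function name `build_remote_command` is made up for illustration, and it assumes the login shell on the Mac understands bash-style escaping.

```shell
#!/bin/bash
# Hypothetical helper: prints the remote command line with the text and
# speaker escaped so the remote shell treats each one as a single argument.
build_remote_command() {
    local text="$1"
    local speaker="$2"
    # %q is a bash printf format that shell-escapes its argument.
    printf '~/Scripts/tts_v2.sh -t %q -s %q' "$text" "$speaker"
}

# Usage (user@host is a placeholder):
#   ssh -i /config/ssh/id_rsa user@host "$(build_remote_command 'The dryer says "done"' 'Family Room')"
```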
And then my automation:
service: shell_command.speak
data:
  text: The garage door is opening.