Text to speech and music at the same time

For some background on my setup: I have Home Assistant running on a Raspberry Pi in my basement and a second Raspberry Pi running Mopidy to play music over speakers in the main part of my house. I wanted Home Assistant announcements to play over the house speakers in a way that plays nicely with the music server. At first this seemed pretty straightforward: call the text to speech service and specify the Mopidy MPD media_player entity. The problem is that this stops whatever music is currently playing. Luckily, pulseaudio and GStreamer make it possible to play music and announcements simultaneously.

First, on the music server side, make sure pulseaudio is installed; sudo apt-get install pulseaudio should get you what you need. Next, configure pulseaudio to accept TCP connections from localhost (for Mopidy) and from the local network (for Home Assistant). I followed the configuration described here https://docs.mopidy.com/en/latest/running/service/#system-service-and-pulseaudio with the addition of the local network (replace the x with the subnet you are using). Add/uncomment the following line in /etc/pulse/default.pa:

load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1;192.168.x.0/24

Next, pulseaudio has a feature called “auto ducking”, which reduces the volume of certain streams while other, higher-priority streams are playing. More information can be found here https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/Modules/#module-role-ducking. For this I added the following line to /etc/pulse/default.pa:

load-module module-role-ducking trigger_roles=announce,phone ducking_roles=music volume=75%

Once this is done, make sure pulseaudio is running by executing pulseaudio --start. If you want pulseaudio to start on boot, you can follow the instructions here under the section “Starting pulseaudio on boot” https://www.raspberrypi.org/forums/viewtopic.php?t=235519.
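If you want to check from the Home Assistant machine that the music server is reachable before going further, a quick connection test can help. This is only a minimal sketch: it assumes module-native-protocol-tcp is listening on its default port 4713, and the hostname in the usage comment is a hypothetical placeholder.

```python
import socket

# Default port used by module-native-protocol-tcp (assumed unchanged here).
PULSE_TCP_PORT = 4713


def pulse_port_open(host, port=PULSE_TCP_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example usage ("musicpi.local" is a hypothetical hostname):
#   print(pulse_port_open("musicpi.local"))
```

A True result only shows that something is listening on the port. It does not prove that the auth-ip-acl setting allows your address, so the real test is still playing audio through it.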

Now that pulseaudio is set up, we need to tell Mopidy to use it for playing audio. Again following the instructions at https://docs.mopidy.com/en/latest/running/service/#system-service-and-pulseaudio, add this line under the audio section of /etc/mopidy/mopidy.conf:

output = pulsesink server=127.0.0.1 stream-properties="props,media.role=music"

This tells Mopidy to use pulseaudio over TCP, connecting to localhost, and sets the media role to music for the auto-ducking.

Next, Home Assistant needs a media player that can connect to the music server’s pulseaudio server to play announcements. For this I used the GStreamer component https://www.home-assistant.io/integrations/gstreamer/. In addition to the gstreamer packages mentioned in the documentation, make sure you also install the gstreamer pulseaudio plugin with sudo apt-get install gstreamer1.0-pulseaudio. If you are using a virtual environment like me, be sure to symlink “gi” as described in the doc as well. Then configure the gstreamer media player with the following:

  - platform: gstreamer
    name: gst_main
    pipeline: pulsesink server=<host/ip of music server> stream-properties="props,media.role=announce"

Now we have a media player that can stream audio from Home Assistant to the music server without interfering with music being played by Mopidy. You can test this using the TTS service of your choice (I used tts.picotts_say) in the Services tab of the developer tools. Just set the “entity_id” to media_player.gst_main (or whatever you named your media player) and the “message” to whatever you want announced. If you have music playing, you should hear the music volume drop while the announcement plays and then return to its normal volume once the announcement is done.
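The same test can also be triggered from a script through Home Assistant's REST API instead of the developer tools. The following is a rough sketch rather than a finished tool: the helper name build_tts_request, the base URL and the token are all placeholders I made up, and it assumes you have created a long-lived access token and have the tts.picotts_say service available.

```python
import json
import urllib.request


def build_tts_request(base_url, token, entity_id, message, service="picotts_say"):
    """Build (but do not send) a POST request for a TTS service call."""
    url = f"{base_url.rstrip('/')}/api/services/tts/{service}"
    payload = json.dumps({"entity_id": entity_id, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Example usage (URL and token are hypothetical placeholders):
#   req = build_tts_request("http://homeassistant.local:8123", "YOUR_TOKEN",
#                           "media_player.gst_main", "Dinner is ready")
#   urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before anything is actually announced over the speakers.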


Thanks for this writeup! I was surprised that there wasn’t a built-in solution in HA, but this works just fine.

My setup is all local and I’m using mpd instead of Mopidy, but it works the same way. Since the communication is local, I’m using pulseaudio over unix sockets instead of TCP, which in theory saves some CPU cycles…
For reference, this required adding the HA user (root) and the mpd user (mpd) to the group pulse-access and running pulseaudio in system mode; otherwise only the TCP socket communication works.

It also seems that pulseaudio gives you a choice between module-role-ducking and module-role-cork, which pauses/unpauses your music instead of ducking the volume. I wish both modules were a little more configurable.