HA Announcement: play sound, run TTS - cuts early

Hello everybody.

I've read a lot of topics and troubleshooting threads about voice and sound announcements, and I still cannot figure out why HA cuts off the last 1-2 seconds every time, whether it is TTS or just a short sound effect.

I’ve added a pretty dumb workaround for TTS: a short word or letter after the main message, so that only irrelevant audio gets cut. But I cannot stop the sound effect from being cut too early.

To double-check everything, I now just run the action directly, without playing anything else in HA or Music HA, but it still cuts early:

action: media_player.play_media
data:
  media:
    media_content_id: media-source://media_source/local/HA_Media/Email-02.wav
    media_content_type: audio/x-wav
  announce: true
target:
  entity_id: media_player.raspberrypi_5

Here is my setup for Snapcast. It’s a pretty basic setup.
I’m using three USB sound cards and three instances with systemd on my RPi5.

Snapcast Multiroom setup

I’ve tried waiting for a few seconds in automations/scripts, and I’ve also tried this: {{ is_state('media_player.raspberrypi_5', 'idle') }}
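
In script form, those two attempts look roughly like the sketch below. The script name and the 3-second and 10-second values are illustrative assumptions, not tuned settings; the entity and media path are the ones from the action above.

```yaml
# Sketch only: script name and timing values are illustrative assumptions.
script:
  announce_email_sound:
    sequence:
      - action: media_player.play_media
        target:
          entity_id: media_player.raspberrypi_5
        data:
          media:
            media_content_id: media-source://media_source/local/HA_Media/Email-02.wav
            media_content_type: audio/x-wav
          announce: true
      # Attempt 1: a fixed pause before anything else runs.
      - delay:
          seconds: 3
      # Attempt 2: block until the player reports idle, giving up after 10 s.
      - wait_template: "{{ is_state('media_player.raspberrypi_5', 'idle') }}"
        timeout:
          seconds: 10
        continue_on_timeout: true
```

Neither step changes how the player itself handles the end of the audio; both only hold back whatever HA would send next.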

I can see the states and statuses of all players, and everything looks fine; there are no concurrent plays. I also restarted everything and recreated HA Music and Snapcast integrations in HA - no luck yet.

This premature cut is always there. I remember it started a few years ago when I first used HA announcements, and it’s still here. I’ve changed hardware, setups and everything else during this time, but no luck - it’s a long-running issue and I’m out of ideas.

Hello Ole D,

Either you are sending a command, or something else is being sent to the speaker, before it has finished playing. Add a short delay after you send the message.

Nothing watches for the end of the message, so you need to stop HA from sending anything else too soon. Longer messages will need more delay.
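
A minimal sketch of a delay that grows with the message, assuming roughly 15 spoken characters per second plus one second of margin. Both numbers are guesses to tune by ear, and `tts.piper` is a hypothetical TTS entity name; use whichever TTS entity you have.

```yaml
# Sketch only: tts.piper and the speaking-rate estimate are assumptions.
sequence:
  - variables:
      message: "You have new mail."
      chars_per_second: 15
  - action: tts.speak
    target:
      entity_id: tts.piper  # hypothetical TTS entity
    data:
      media_player_entity_id: media_player.raspberrypi_5
      message: "{{ message }}"
  # Hold back any following action for the estimated speaking time
  # (message length / rate, rounded up) plus a one-second margin.
  - delay:
      seconds: "{{ (message | length / chars_per_second) | round(0, 'ceil') + 1 }}"
```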

To completely isolate this case, I now only troubleshoot at the player that is not busy with any tasks, with an empty queue and so on.

Moreover, I’m calling the action directly in Developer Tools:

action: media_player.play_media
data:
  media:
    media_content_id: media-source://media_source/local/HA_Media/Email-02.wav
    media_content_type: audio/x-wav
  announce: true
target:
  device_id: b3edda5e47dc9d0be44755d8ad67439b

Probable solution:

I think I have figured it out by playing with the buffer: the smaller the buffer, the better short files play, but a buffer that is too short (below about 150 ms) sounds pretty bad.

Snapserver config: /etc/snapserver.conf

# Buffer [ms]
# The end-to-end latency, from capturing a sample on the server until the sample is played-out on the client
buffer = 250

Confirmed with TTS and a buffer of 350. It now reads everything and does not cut anything at the end of the text.

Saved my setup here: simple_setup.md

Cutting down on the buffer is not the best idea.
The buffer is there to absorb delays in the audio stream. It might work when you test it, but later you might be moving files over the network, or your TV might be buffering a show that has just started, and then the buffer will be too small.

Also remember that your devices might do other stuff or some of them might be slower, so a larger buffer might be necessary on those than the ones you test on.