Wait for TTS audio to finish being played before continuing

I use the Text-to-speech component in some of my scripts and automations to send audio messages to my Android TV. It’s working fine for the most part, but in some of my scripts, I’d like the script to wait for the message to finish playing before proceeding with the next action in my sequence - but it seems to consider the action complete as soon as it has successfully sent the audio to the media player. For now, I am using hard-coded delays to work around the issue - but the message is variable-length (weather forecast is part of it) so sometimes I’m waiting unnecessarily and other times, I’m not waiting long enough.

Is there an easy way to know (from within my script) when the audio has finished playing?

Presuming that the media player goes to idle after playing the TTS message, a wait template would work. You'll probably need a small delay before the wait template, though, in case the media player is still idle when the template is first evaluated, before the TTS message has started playing.

Yes. I agree with putting a delay in your script.

Thanks…. the wait_template worked out. I wasn’t aware that there was such a statement.

I put one wait_template immediately after the tts call that waits until the media_player has a status of ‘playing’, then the script continues, does a few things and then another wait_template that waits for it to switch to ‘idle’ before proceeding with the final few tasks.
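
For anyone who finds this later, here is a rough sketch of that pattern. It is simplified (the real script does a few things between the two waits), and the entity IDs, the TTS service name and the timeouts are placeholders I have assumed, not values taken from the actual script:

```yaml
# Sketch only: media_player.android_tv, light.living_room and
# tts.google_translate_say are assumed placeholders; adjust to your setup.
script:
  announce_and_wait:
    sequence:
      - service: tts.google_translate_say
        data:
          entity_id: media_player.android_tv
          message: "Good morning. Here is the forecast."
      # Wait until playback has actually started, so the later
      # 'idle' check does not pass before the audio begins.
      - wait_template: "{{ is_state('media_player.android_tv', 'playing') }}"
        timeout: "00:00:10"
        continue_on_timeout: true
      # Wait until the player has returned to idle (message finished).
      - wait_template: "{{ is_state('media_player.android_tv', 'idle') }}"
        timeout: "00:02:00"
        continue_on_timeout: true
      # Final tasks run only after the message has ended (or timed out).
      - service: light.turn_on
        target:
          entity_id: light.living_room
```

The timeouts are there so the script cannot hang forever if the player never reports the expected state, which, as later replies show, does happen with some devices.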

You may refer to my post here: Only last service is called?

Not sure if something has changed, or if I’m doing something wrong, but this isn’t working for me. I have Alexa announcing the weather, which can vary in length, and my script waits for the device to become idle before continuing, but it never does. When I watch the following template in Developer Tools while Alexa announces the weather, the value never changes from standby:

{{ states('media_player.master_bedroom') }}

In other words, {{ is_state('media_player.master_bedroom','idle') }} is never true.

Any ideas would be appreciated.

So I figured out a workaround to this problem, at least for me. Depending on your situation, it may not work. I have an automation that runs in the morning whenever my phone’s alarm is turned off. It calls several scripts in order, some of them turning on lights, television, etc., and some of them having Alexa tell me things. Alexa tells me good morning and what time it is, then she reads the weather, lets me know if the fireplace has been turned on in the hearth room to warm it up, and reminds me if it’s trash and/or recycling pickup day. The point is, Alexa can be speaking for a long time or a very short time, depending on conditions.

Calling each Alexa service in separate scripts (or automations, I tried lots of things) wasn’t working. What eventually worked for me was putting all of the Alexa service calls in a single script and using conditions to determine what she says. You don’t have to put any delays in between, just one service call after another. Doing it that way always results in her saying everything I’ve asked her to say, without one announcement speaking over another. This works for me because I have Alexa saying everything all at once, not waiting for tasks to complete in between. I suppose if you wanted her to say something, then turn on a light, then say something else, it still wouldn’t solve the problem for you.

I have some Google Home devices (instead of Alexa, but it’s probably the same), and the state in HA stays idle or off even while I’m asking Google something.
On the other hand, when I use HA’s TTS, the Google Home’s state is ‘playing’ (not ‘idle’ or ‘off’).

Same for me, my Good morning routine is telling me the weather, appointments and activates a HA script which plays a tts file via the media player. That worked fine until approx. two months ago, since then Google assistant activates the script before it finishes talking to me. No idea why they changed that, it now pretty much runs the items in the routines all at once. Annoying as hell.
But: a quick and dirty trick is to add ‘Adjust media volume’ just before calling the script in the routine and set it to any arbitrary value (e.g. 0). In your HA script, you then just turn on the media player for the assistant (so it gets the volume_level attribute) and wait until the volume level changes to your value. Only then do you stream the TTS to the speaker. This works because the routine actually waits until Google Assistant stops talking before it adjusts the media volume. You can think of this as a ‘blocking action’. Pretty much every action that makes the assistant say something is non-blocking, which is why the routine just fires all of its items at once until the first blocking action occurs.


The tts script needs these two actions before streaming to the speaker:

service: media_player.turn_on
target:
  entity_id: media_player.speaker_name

and

wait_template: "{{ state_attr('media_player.speaker_name', 'volume_level') == 0 }}"

After that, remember to set a volume > 0 if you are using my approach, then stream the tts file to the speaker.
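
Put together, the whole script could look roughly like the sketch below. The speaker entity comes from the snippets above; the script name, the TTS service, the message, the timeout and the restored volume of 0.5 are assumptions of mine, so adjust them to your setup:

```yaml
# Sketch only: tts.google_translate_say, the message, the timeout
# and the 0.5 volume are assumed values, not part of the original trick.
script:
  tts_after_assistant:
    sequence:
      # Turn the player on so it exposes the volume_level attribute.
      - service: media_player.turn_on
        target:
          entity_id: media_player.speaker_name
      # Block until the routine's 'Adjust media volume' step sets volume to 0,
      # which only happens once the assistant has stopped talking.
      - wait_template: "{{ state_attr('media_player.speaker_name', 'volume_level') == 0 }}"
        timeout: "00:02:00"
        continue_on_timeout: false
      # Restore an audible volume before speaking.
      - service: media_player.volume_set
        target:
          entity_id: media_player.speaker_name
        data:
          volume_level: 0.5
      - service: tts.google_translate_say
        data:
          entity_id: media_player.speaker_name
          message: "Here is your message from Home Assistant."
```

With continue_on_timeout set to false, the script stops instead of talking over the assistant if the volume change never arrives.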

It’s very annoying that Alexa never goes to the idle state. It always stays in standby unless it is playing a song: while a song is playing the state is playing, and when the song stops it changes to paused. I don’t want to start the next action until Alexa has finished speaking, but I could not find any way to do that.

Well, I’m not sure what changed, but my scripts no longer work as they did before. Alexa starts saying the first item in the list, and then jumps to the very end (I put a “that’s all I have for you today” there so I would know when she’s done speaking). I tried the trick of setting the volume in between each message, but even that didn’t work (they were all sent at once). This is incredibly annoying because it’s become a big part of our day: getting the weather forecast, any reminders, announcing if it’s a family member’s birthday or anniversary, etc. A daily briefing, so to speak. The only thing I can think to try next is to create a variable with the text from all of the announcements, and then run the script to speak whatever is contained in the variable. I need to figure out a way to get the weather into the text though, because that’s the only part of it that comes from an external source. Any suggestions would be appreciated. I’ll come back and post after I’ve had a chance to try it out.
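
A rough, untested sketch of what that single-variable idea might look like is below. It assumes the announcements go through the notify.alexa_media service from the custom Alexa Media Player integration, and that the forecast is available from a weather entity; weather.home and input_boolean.trash_day are hypothetical names, so substitute whatever you actually have:

```yaml
# Sketch only: weather.home and input_boolean.trash_day are hypothetical
# entities, and notify.alexa_media assumes the custom Alexa Media Player
# integration is installed.
script:
  daily_briefing:
    sequence:
      # Build the whole briefing as one string so it is sent as a single message.
      - variables:
          briefing: >-
            Good morning. It is {{ now().strftime('%H:%M') }}.
            The weather is currently {{ states('weather.home') }}
            with a temperature of
            {{ state_attr('weather.home', 'temperature') }} degrees.
            {% if is_state('input_boolean.trash_day', 'on') %}
            Today is trash pickup day.
            {% endif %}
            That's all I have for you today.
      - service: notify.alexa_media
        data:
          target: media_player.master_bedroom
          message: "{{ briefing }}"
          data:
            type: tts
```

Because everything is sent in one service call, there is nothing for one announcement to talk over, and the weather text is pulled in through the template rather than a separate announcement.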