VOICE: How to estimate the length of a sentence to be spoken in seconds

I have a script that speaks various reminders on various media players:
“It’s time to let the cats out”
“It’s time to feed projectile vomiting dogs white bread”
“Hot tub has reached 102”

Before the TTS plays, I raise the media player’s volume to level 7 for 10 seconds so the TTS can be heard, then reset it to 4, where we usually keep it for background music.
The problem is that short sentences are followed by loud background music, and long sentences get cut off.
How do I calculate the number of seconds it will take to spit out the TTS?

Automation:

alias: "Alarm: Cats: Reminder  Let Cats out "
description: ""
trigger:
  - platform: time
    at: "09:45:00"
condition: []
action:
  - service: script.alexa_kitchen_play_tts
    data:
      message: please let the adorable kitty cats out to hunt
mode: single

Common-use script:

alias: script.alexa_kitchen_play_tts
sequence:
  - service: media_player.volume_set
    data:
      volume_level: 0.7
    target:
      entity_id: media_player.kitchen_original
  - service: notify.alexa_media_kitchen_original
    data:
      message: "{{ message }}"
  - delay:
      seconds: 10
  - service: media_player.volume_set
    data:
      volume_level: 0.4
    target:
      entity_id: media_player.kitchen_original

This is what I do.

This automatically adjusts the volume for the announcement and then returns it to where it was. This is on a Sonos. I don’t know what you have, but give it a try.

    - service: media_player.play_media
      data:
        entity_id: media_player.living_room
        media_content_id: >
           media-source://tts/google_translate?message="{{ trigger.to_state.state }}"
        media_content_type: music
        announce: true
        extra:
          volume: 40          

Bummer! That would have been too easy!

And I don’t think I have this on Alexa

media-source://tts/google_translate?message="{{ trigger.to_state.state }}"

I stumbled on this post

I think this is a better way to go! I can extract the current volume and save it in a variable. Then set new vol, play msg, restore old volume.

alias: Play msg on Kitchen Alexa with volume set and reset
sequence:
  - variables:
      initial-kitchen: "{{ state_attr('media_player.kitchen_original','volume_level') }}"
  - service: media_player.volume_set
    data:
      volume_level: 0.65
    target:
      entity_id: media_player.kitchen_original
  - service: notify.alexa_media_kitchen_original
    data:
      message: "{{ message }}"
  - service: media_player.volume_set
    data:
      volume_level: "{{initial-kitchen}}"
    target:
      entity_id: media_player.kitchen_original

But the value isn’t being restored for some reason. The template for the volume attribute evaluates correctly in dev tools, so I know that part is right.

That’s how I am doing it: first a little chime on my Sonos, then a short wait, then the TTS, then a delay calculated from the text length. I figure my TTS engine speaks at around 150 words per minute.

      - service: media_player.play_media
        data:
          media_content_id: /local/jarvis-chime.wav
          media_content_type: music
          announce: true
          extra:
            volume: >-
              {{ states('input_number.jarvis_' + myplayer |
              replace('media_player.','') + '_volume')  }}
        target:
          entity_id: "{{ myplayer }}"
      - delay:
          seconds: 1
      - service: media_player.play_media
        data:
          media_content_type: music
          announce: true
          media_content_id: >-
            media-source://tts/tts.piper?message={{ mymessage  | replace('&',
            'and') }}
          extra:
            volume: >-
              {{ states('input_number.jarvis_' + myplayer |
              replace('media_player.','') + '_volume')  }}
        target:
          entity_id: "{{ myplayer }}"
      - delay:
          seconds: >
            {% set text = mymessage | replace('&', 'and') %} {{ (text.split(' ')
            | length * 60 / 150) | round(0, 'ceil') }} 
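
To make the arithmetic in that last delay concrete: at roughly 150 words per minute, each word takes about 60 / 150 = 0.4 seconds, so the nine-word message "please let the adorable kitty cats out to hunt" from the first post works out to 9 × 0.4 = 3.6, rounded up to 4 seconds. Below is a minimal sketch of the same delay with the rate and some padding pulled out as named values. `words_per_minute` and `padding_seconds` are illustrative names, not anything the TTS engine exposes, and 150 is only the rough figure mentioned above, so it will likely need tuning per voice.

```yaml
      - delay:
          seconds: >
            {# Assumed speaking rate; tune for your TTS voice. #}
            {% set words_per_minute = 150 %}
            {# Extra seconds to cover TTS start-up latency (assumption). #}
            {% set padding_seconds = 1 %}
            {# split() with no argument ignores repeated whitespace. #}
            {% set word_count = (mymessage | replace('&', 'and')).split() | length %}
            {{ (word_count * 60 / words_per_minute + padding_seconds) | round(0, 'ceil') }}
```

Keeping the padding inside the parentheses matters: the `round` filter then applies to the whole sum rather than to the padding value alone.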

This can also be done with the custom integration Chime TTS.
When calling the chime_tts.say service, you can specify the volume you want for the TTS announcement. It will set the volume, play the announcement, and then reset it once the playback ends.

Ok, that’s cool. Looks like you’ve solved this one and put a lot of work into it.

I’ll give it a try as soon as I get my butt outta bed :slight_smile:

Well, Chime TTS doesn’t look like it works on Alexa.
So I got my script working. I had to add 3 seconds to the time calculation just to pad it. Variable names don’t seem to like hyphens…

alias: Play msg on Kitchen Alexa with volume set and reset
sequence:
  - variables:
      initialkitchen: "{{ state_attr('media_player.kitchen_original','volume_level') }}"
  - service: media_player.volume_set
    data:
      volume_level: 0.60
    target:
      entity_id: media_player.kitchen_original
  - service: notify.alexa_media_kitchen_original
    data:
      message: "{{ message }}"
  - delay:
      hours: 0
      minutes: 0
      seconds: |
        {% set text = message | replace('&', 'and') %}
        {{ ((text.split(' ') | length * 60 / 150) + 3) | round(0, 'ceil') }}
      milliseconds: 0
  - service: media_player.volume_set
    data:
      volume_level: "{{initialkitchen}}"
    target:
      entity_id: media_player.kitchen_original
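
On the hyphen problem: Jinja reads `initial-kitchen` as the expression `initial` minus `kitchen`, not as a single name, so the earlier `{{initial-kitchen}}` never refers to the saved volume. Variable names need to stick to letters, digits, and underscores. Here is a minimal sketch of just the save-and-restore part with an underscore name (`initial_kitchen` is only an illustrative name; the entity is the one used above):

```yaml
sequence:
  - variables:
      # Underscore instead of hyphen, so Jinja treats this as one identifier.
      initial_kitchen: "{{ state_attr('media_player.kitchen_original', 'volume_level') }}"
  # ... set the announcement volume, send the TTS, and wait here ...
  - service: media_player.volume_set
    data:
      volume_level: "{{ initial_kitchen }}"
    target:
      entity_id: media_player.kitchen_original
```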

Perhaps this could work for others.

What I did is use the media_duration attribute of the speaker entity to set a delay after the message was sent. This prevents messages from being cut off when a subsequent message command is issued.

Here is my custom message handler script.

alias: CustomMessageSender
mode: queued
sequence:
  - wait_template: "{{ is_state(inMediaDevice, 'idle') }}"
    timeout: "00:00:02"
    continue_on_timeout: true
  - service: tts.cloud_say
    metadata: {}
    data:
      cache: false
      entity_id: "{{ inMediaDevice }}"
      message: "{{ inMessage }}"
  - delay:
      hours: 0
      minutes: 0
      seconds: "{{ (state_attr(inMediaDevice, 'media_duration') | float) }}"
icon: mdi:speaker-message

Then, in my automations or other scripts, I just call the script above with inMediaDevice and inMessage. Here’s an example.

alias: Viking
sequence:
  - service: script.custommessagesender
    data:
      inMediaDevice: media_player.office_mini
      inMessage: >-
        This is message number 1, and it is a long message.  We should wait for
        it to finish.  In fact, it's a very long message now, and I wonder how
        much this would work.
  - service: script.custommessagesender
    data:
      inMediaDevice: media_player.office_mini
      inMessage: This is message number 2.
mode: single
icon: mdi:test-tube

I need to try your solution!

Alexa does have this state, and the template evaluates to true:

{{ is_state('media_player.kitchen_original', 'standby') }}

But unfortunately Alexa doesn’t have this attribute:

{{ (state_attr('media_player.kitchen_original', 'media_duration') | float) }}
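
Since that attribute is missing on some players, one option is to use it when it exists and fall back to the word-count estimate otherwise. This is only a sketch under assumptions: it reuses the 150 words-per-minute guess and the 3-second pad from earlier in the thread, it assumes a `message` variable is passed to the script, and it assumes `media_duration`, where present, already reflects the TTS clip by the time the delay is evaluated, which may not hold on every integration.

```yaml
  - delay:
      seconds: >
        {# 0 when the attribute is missing or not numeric #}
        {% set measured = state_attr('media_player.kitchen_original', 'media_duration') | float(0) %}
        {# Word-count estimate: ~150 words per minute plus a 3 second pad #}
        {% set estimated = (message.split() | length * 60 / 150) + 3 %}
        {{ (measured if measured > 0 else estimated) | round(0, 'ceil') }}
```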

Instead of actually measuring the sound file’s length, I’ve settled on doing this:

      - wait_template: >
          {{states(states('sensor.tune_player')) == 'playing'}}
      - wait_template: >
          {{states(states('sensor.tune_player')) != 'playing'}}

and then proceed with whatever you want to do next ;=)
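
One thing to watch with a pair of waits like this: if the player never reports `playing`, or has already finished by the time the first wait starts, the script just sits there. A hedged variant adds a `timeout` to each wait, the same option used in the CustomMessageSender script above. The 5 and 60 second values are arbitrary placeholders, not recommendations.

```yaml
      # Wait for playback to start, but give up after 5 seconds.
      - wait_template: >
          {{ states(states('sensor.tune_player')) == 'playing' }}
        timeout: "00:00:05"
        continue_on_timeout: true
      # Wait for playback to finish, with an upper bound of 60 seconds.
      - wait_template: >
          {{ states(states('sensor.tune_player')) != 'playing' }}
        timeout: "00:01:00"
        continue_on_timeout: true
```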

Maybe this can help you out?

Unfortunately, Alexa doesn’t change state when doing TTS.