Centralized handling for text to speech (TTS)

I recently decided to experiment with alternate TTS components. This turned out to be a chore because I had service calls all over the place that needed to be updated.

So I created a script that I now run all TTS calls through. This lets me easily make global changes to how TTS is processed. Want to change the TTS engine? Want to choose a different target for TTS playback? This'll make those changes easy.

In my case, this also simplified my HA code. I use input_booleans to keep track of who’s awake so that I only play TTS when nobody’s sleeping. This method centralized all of the condition code that I had peppering my configuration.
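
For reference, the awake-tracking helpers are just plain input_booleans. A minimal sketch (the entity names match the conditions in the script below; how you toggle them, e.g. from bedtime automations, is up to you):

input_boolean:
  person_one_awake:
    name: Person One awake
  person_two_awake:
    name: Person Two awake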

Start by creating a script:

speak:
  alias: TTS Handler
  sequence:
    # Only speak when everyone is marked awake
    - condition: and
      conditions:
        - condition: state
          entity_id: input_boolean.person_one_awake
          state: 'on'
        - condition: state
          entity_id: input_boolean.person_two_awake
          state: 'on'
    # Swap this service and entity_id to change the TTS engine or playback target
    - service: tts.google_say
      entity_id: media_player.squeezelite
      data_template:
        message: "{{ message }}"

In my case, I’m now using Google’s TTS and routing it through media_player.squeezelite. You can aim it at a Google Home or whatever player you use. Drop the condition block if you don’t need it.
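
For example, a stripped-down version without the conditions, aimed at a hypothetical media_player.living_room_home, would look something like this:

speak:
  alias: TTS Handler
  sequence:
    - service: tts.google_say
      entity_id: media_player.living_room_home
      data_template:
        message: "{{ message }}"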

Then, in your automations/scripts/etc, you use your newly-created script like this:

- service: script.speak
  data_template:
    message: "The garage door has closed."

You can also use variables in the message. Here’s a script that speaks the current “feels like” temp from Dark Sky:

speaktemperature:
  alias: 'Speak current temperature'
  sequence:
    - service: script.speak
      data_template:
        message: "The temperature feels like {{ states.sensor.dark_sky_apparent_temperature.state | int }} degrees."

I also have some code that lives outside HA (Tasker on my phone) that routes through this script by POSTing to the REST API. This would also let you easily create TTS for IFTTT applets.

https://ih8gates.dyndns.org/api/services/script/speak?api_password=apipassword

{"message":"This is only a test."}

Right after I posted this, I realized that I could improve it a bit. Say there are times when you want to override the awake-check conditions. I added a bit so that I can pass an additional argument, unmute, as an override.

speak:
  alias: TTS Handler
  sequence:
    - condition: or
      conditions:
        # Passing unmute: "true" bypasses the awake check
        - condition: template
          value_template: "{{ unmute == 'true' }}"
        - condition: and
          conditions:
            - condition: state
              entity_id: input_boolean.person_one_awake
              state: 'on'
            - condition: state
              entity_id: input_boolean.person_two_awake
              state: 'on'
    - service: tts.google_say
      entity_id: media_player.squeezelite
      data_template:
        message: "{{ message }}"

Then you can do this:

- service: script.speak
  data_template:
    message: "The garage door has closed."
    unmute: "true"

You could even extend this idea with service_templates to pass an argument to choose which TTS engine to use.
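
For example, something like this might work (a sketch only; the conditions are omitted for brevity, and the engine argument and the tts.amazon_polly_say service are illustrative assumptions, not part of the original script):

speak:
  alias: TTS Handler
  sequence:
    - service_template: "{% if engine == 'polly' %}tts.amazon_polly_say{% else %}tts.google_say{% endif %}"
      entity_id: media_player.squeezelite
      data_template:
        message: "{{ message }}"

Called with an extra argument:

- service: script.speak
  data_template:
    message: "The garage door has closed."
    engine: "polly"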

I like it!

I haven’t implemented TTS yet but it’s on my to-do list. This will likely be very useful.

Cool; I have something similar, but merged with my iOS notifications so I can tell the notification_engine script whether it should speak, send a text, or both.

The only challenge is that when there’s a lot going on you can get missed notifications, since scripts are single-instance and any call to a script that is already running will be skipped.


John, Would you mind sharing how you used text to speech with iOS notifications? Thanks!

EDIT: I just realised @jwelter’s comment is 4 months old. I guess this is probably old news…

This is not my own; at least one other person will recognise this code, but I have to credit @anon43302295 as that is where I got it.

All three notification options (text, HA persistent notification and Sonos TTS) are called in turn, with each only being actioned when relevant (meaning a message can have more than one destination)…

I have made my own changes to each notification-type script, as well as to the ‘mechanics’ of how the notify script itself is called, but specifically in answer to @jwelter: I added the wait_template for exactly that reason.

script:
  notify:
    sequence:
      - service: script.text
        data_template:
          tell: "{{ tell }}"
          message: "{{ message }}"

      - service: script.ha_notify
        data_template:
          no_show: "{{ no_show }}"
          message: "{{ message }}"

      - wait_template: >
          {{ is_state('script.announce' , 'off') }}      

      - service: script.announce
        data_template:
          no_say: "{{ no_say }}"
          room: "{{ room }}"
          volume: "{{volume}}"
          message: "{{ message }}"
          voice: "{{ voice }}"
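
A call to it might look something like this (the argument values below are purely illustrative guesses based on the variable names, not from the actual setup):

- service: script.notify
  data_template:
    message: "The washing machine has finished."
    tell: "john"        # hypothetical: who to text
    room: "kitchen"     # hypothetical: which speaker to announce on
    volume: "0.4"
    voice: "default"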

The issue here is that the notify script will block in the wait_template, so another call will fail during that time.

You’re better off moving the wait_template to before you call the notify script, so you wait for any prior script call to complete. It makes the YAML a bit messy, since you need the wait_template before every script.notify call, but it solves the problem nicely. I use a one-minute timeout on the wait_template, as I don’t like these things to be unbounded.
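
A rough sketch of what that looks like at each call site (not the poster’s exact code):

# Wait for any prior notify run to finish, but give up after a minute
- wait_template: "{{ is_state('script.notify', 'off') }}"
  timeout: '00:01:00'
- service: script.notify
  data_template:
    message: "The garage door has closed."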


Good point…
Thanks!

Interestingly enough, this doesn’t work either. The situation I just had was that four people arrived home together, so there were a lot of automations and notifications underway. The automations and scripts blocked in the wait_template as expected.

But when the script finished, I had three callers waiting that all tried to execute at once: one made it through and two were missed, with two “script already running” messages in the logs… So there’s a race condition at the end of the wait_template.

To solve this I have a wait_template, then a random delay, then another wait_template, then continue… This is messy and a hack-and-a-half.
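
Roughly like this (a sketch of the described workaround, not the poster’s exact code; the 1–9 second stagger is arbitrary):

- wait_template: "{{ is_state('script.notify', 'off') }}"
  timeout: '00:01:00'
# Random stagger so simultaneous waiters don't all retry at the same instant
- delay: "00:00:0{{ range(1, 10) | random }}"
- wait_template: "{{ is_state('script.notify', 'off') }}"
  timeout: '00:01:00'
- service: script.notify
  data_template:
    message: "The garage door has closed."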

We need to get some sort of mechanism to allow proper handling here. Like a message queue or something.

This is exactly what I have been thinking about for the last few days. I can’t see how it can be done with YAML, but I am still looking…

I’d also like to have a module that can build messages from several parts, something others have done, but their approaches all rely on the parts always being in the same order. A message queue would probably allow the order to be varied.

A message queue would also possibly allow media clips to be interspersed with the TTS. Why? I can’t think of a good reason but a frivolous one would be to play a happy birthday clip on appropriate dates.

Hang on, isn’t a lot of what we do in HA frivolous? :)
