Script to send TTS to the right Google Home (based on voice commands)

In August I shared a script to send TTS to a Google Home and afterwards resume the stream (TuneIn / Spotify) that was playing. It also restores the volume, and it works with Google Home speaker groups.

In my home I have several Google Nest/Home speakers, and I had some scripts that send a TTS when triggered by a Google Home routine. However, I could only send the TTS to a predefined Google Home speaker, and not to the specific speaker to which I gave the voice command.

This got me thinking, and I found a solution: using the ambient sounds which you can start in your routines.

So I made a new script, which works together with the other script, to send a TTS to the specific Google Home you asked a question. Please note that I had to change the other script a bit to make it work properly with this new one, so if you are already using that one, you need to update it.

The script itself relies on the other script, so make sure that script is running correctly first, including the prerequisites.

Additional prerequisites

  1. You need to expose scripts to Google Assistant, using either Nabu Casa or the manual setup (see the Google Assistant integration page in the Home Assistant documentation).
  2. You need a separate script per TTS message.
  3. You need to create a routine in the Google Home app which starts the script. You can find your scripts under Adjust Home Devices → Adjust scenes. The last action in the routine should start the ambient sound (Play and control media → Sleep sounds → any sound you only use for this script).
  4. Define the right variables for your home at the top of the second script below (script.google_home_say_voice). For check_for_title, play the ambient sound you want and look up its media_title in Developer Tools → States.
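
For the manual setup route in step 1, exposing just the routine script could look roughly like the sketch below. This assumes the manual Google Assistant integration is already configured. YOUR_PROJECT_ID and SERVICE_ACCOUNT.json are placeholders, not values from this post. With Nabu Casa you would expose the script from the Home Assistant UI instead.

```yaml
# configuration.yaml (manual Google Assistant setup only; a sketch, adjust to your own setup)
google_assistant:
  project_id: YOUR_PROJECT_ID                     # placeholder: your own Google Cloud project ID
  service_account: !include SERVICE_ACCOUNT.json  # placeholder: your own service account file
  report_state: true
  expose_by_default: false                        # only expose what is listed below
  entity_config:
    script.eta_thefes:                            # the routine script from the example in this post
      expose: true
      name: "ETA TheFes"
```

After changing this, sync your devices so the script shows up in the Google Home app.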

The script to be used in the Google Home routine
Let’s say you have the Waze and proximity integrations set up, and want to send out a TTS message with the ETA.
The script will then be something like this (you can use the same data fields as for the other script):

eta_thefes:
  alias: "ETA TheFes"
  icon: mdi:car
  sequence:
    - variables:
        message: >
          {% set eta = (as_timestamp(now()) + 60 * states('sensor.thefes_home') | float(0)) | timestamp_custom('%H:%M') %}
          {% if is_state('person.thefes', 'home') %}
            TheFes is already home.
          {% elif is_state_attr('proximity.thefes', 'dir_of_travel', 'towards') %}
            TheFes will be home at {{ eta }}.
          {% elif is_state_attr('proximity.thefes', 'dir_of_travel', 'away_from') %}
            TheFes is going in the wrong direction, but if he turns around now he will be home at {{ eta }}.
          {% else %}
            TheFes is not on his way yet, but if he leaves now, he will be home at {{ eta }}.
          {% endif %}
          {% endif %}
    - alias: "TTS data for script"
      service: script.google_home_say_voice
      data:
        tts_message: "{{ message }}"
        tts_volume: 35
        restore_volume_all: true
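
Because every TTS message needs its own script, a second routine script with a fixed message might look like the sketch below. The script name laundry_reminder and its message are made up for illustration. It only uses data fields that already appear in the ETA example.

```yaml
# Hypothetical second routine script with a static message
laundry_reminder:
  alias: "Laundry reminder"
  icon: mdi:washing-machine
  sequence:
    - alias: "TTS data for script"
      service: script.google_home_say_voice
      data:
        tts_message: "Do not forget to empty the washing machine."
        tts_volume: 35
```

Expose it to Google Assistant and add it to its own routine, again with the ambient sound as the last action.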

The routine script above then calls this script (which in turn calls the original TTS script):

google_home_say_voice:
  alias: "TTS for Google Home by voice command"
  icon: mdi:cast-audio
  mode: single
  max_exceeded: silent
  variables:
    speaker_groups:
      - media_player.home_group
      - media_player.upstairs_group
      - media_player.ground_floor_groep
    check_for_title: "Witte ruis"
    primary_spotcast: thefes
  sequence:
    - variables:
        entity_list: "{{ states.media_player | map(attribute='entity_id') | list }}"
        friendly_name_list: "{{ states.media_player | map(attribute='attributes.friendly_name') | list }}"
    - variables:
        media_content_list: >
          {% set ns = namespace(media_content = []) %}
          {% for entity in entity_list %}
            {% set ns.media_content = ns.media_content +
              [
                states.media_player 
                  | selectattr('entity_id', 'eq', entity) 
                  | map(attribute='attributes.media_content_id') 
                  | join
                  | default('idle', true)
              ]
            %}
          {% endfor %} 
          {{ ns.media_content }}
        media_artist_list: >
          {% set ns = namespace(media_artist = []) %}
          {% for entity in entity_list %}
            {% set ns.media_artist = ns.media_artist +
              [
                states.media_player 
                  | selectattr('entity_id', 'eq', entity) 
                  | map(attribute='attributes.media_artist') 
                  | join
                  | default('idle', true)
              ]
            %}
          {% endfor %} 
          {{ ns.media_artist }}
        media_title_list: >
          {% set ns = namespace(media_title = []) %}
          {% for entity in entity_list %}
            {% set ns.media_title = ns.media_title +
              [
                states.media_player 
                  | selectattr('entity_id', 'eq', entity) 
                  | map(attribute='attributes.media_title') 
                  | join
                  | default('idle', true)
              ]
            %}
          {% endfor %} 
          {{ ns.media_title }}
        app_name_list: >
          {% set ns = namespace(app_name = []) %}
          {% for entity in entity_list %}
            {% set ns.app_name = ns.app_name +
              [
                states.media_player 
                  | selectattr('entity_id', 'eq', entity) 
                  | map(attribute='attributes.app_name') 
                  | join
                  | default('idle', true)
              ]
            %}
          {% endfor %} 
          {{ ns.app_name }}
        media_picture_list: >
          {% set ns = namespace(media_picture = []) %}
          {% for entity in entity_list %}
            {% set ns.media_picture = ns.media_picture +
              [
                states.media_player 
                  | selectattr('entity_id', 'eq', entity) 
                  | map(attribute='attributes.entity_picture') 
                  | join
                  | default('idle', true)
              ]
            %}
          {% endfor %} 
          {{ ns.media_picture }}
        groups_playing: >
          {% set speaker_group_list = speaker_groups.replace(' ' , '').split(',') if speaker_groups is string else speaker_groups %}
          {# determine which media_players are playing #}
          {{
            states.media_player 
              | selectattr('entity_id', 'in', speaker_group_list)
              | selectattr('state', 'eq', 'playing') 
              | map(attribute='entity_id') 
              | list  
          }}
    - variables:
        spotcast_list: >
          {% set ns = namespace(spotcast = []) %}
          {% for name in friendly_name_list %}
            {% set ns.spotcast = ns.spotcast +
              [  
                states.media_player 
                  | selectattr('entity_id', 'search', 'spotify') 
                  | selectattr('attributes.source', 'eq', name)
                  | map(attribute='entity_id') 
                  | join
                  | default(primary_spotcast, true)
                  | replace('media_player.spotify_', '') 
              ]
            %}
          {% endfor %}
          {{ ns.spotcast }}
    - alias: "Wait until white noise started"
      wait_template: >
        {{ 
          expand(states.media_player)
            | selectattr('attributes.media_title', 'eq', check_for_title)
            | map(attribute='entity_id') 
            | list 
            | count > 0 
        }}
    - variables:
        tts_target: >
          {{ 
            expand(states.media_player)
              | selectattr('attributes.media_title', 'eq', check_for_title)
              | map(attribute='entity_id') 
              | join
          }}
    - variables:
        list_index: "{{ ((entity_list | join(',')).split(tts_target)[0]).split(',') | count - 1 }}"
    - alias: "TTS"
      service: script.google_home_say
      data:
        voice_tts_target: "{{ tts_target }}"
        tts_message: "{{ tts_message }}"
        tts_volume: "{{ tts_volume if tts_volume is defined else undefined }}"
        restore_volume_all: "{{ restore_volume_all if restore_volume_all is defined else undefined }}"
        speaker_group_split: "{{ speaker_group_split if speaker_group_split is defined else undefined }}"
        volume_non_playing: "{{ volume_non_playing if volume_non_playing is defined else undefined }}"
        voice_media_content: "{{ media_content_list[list_index] }}"
        voice_media_artist: "{{ media_artist_list[list_index] }}"
        voice_media_picture: "{{ media_picture_list[list_index] }}"
        voice_media_title: "{{ media_title_list[list_index] }}"
        voice_app_name: "{{ app_name_list[list_index] }}"
        voice_groups: "{{ groups_playing if groups_playing | count > 0 else undefined }}"
        voice_spotcast: "{{ spotcast_list[list_index] }}"
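
One thing to be aware of: the "Wait until white noise started" step has no timeout. If the ambient sound never starts, the script keeps waiting, and because of mode: single with max_exceeded: silent any later call is dropped without a log entry. A possible variant of that step with a timeout is sketched below. The 30 seconds is an arbitrary assumption, not a tested value.

```yaml
    # Variant of the wait step; replaces the existing one inside the sequence
    - alias: "Wait until white noise started"
      wait_template: >
        {{
          expand(states.media_player)
            | selectattr('attributes.media_title', 'eq', check_for_title)
            | list
            | count > 0
        }}
      timeout: "00:00:30"          # assumption: long enough for the routine to start the sound
      continue_on_timeout: false   # stop the script instead of sending the TTS to an unknown target
```

With continue_on_timeout set to false the script ends after the timeout, so the next voice command can start it again.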

Excellent stuff, this is something that’s bugged me forever. So, am I right in saying that these scripts use the ‘trailing’ playing of ambient sounds to uniquely identify the speaker in that state, and therefore direct the TTS back to it?

Wow, using the ambient sound to check which speaker was talked to is a great idea! Cumbersome to integrate, but very interesting!