Music stops on Sonos speakers while using Cloud TTS

Total newb here especially when it comes to programming (but willing to learn), please bear with me!

I want to be able to listen to music using my Sonos speakers and Plex integration, while also receiving TTS announcements. As it is now, the TTS announcement replaces the music being played, and I will then have to pick a new song from Plex manually in order to keep listening.

I realize this has been discussed here before, but the solutions appear mostly quite old and possibly no longer functional. I also didn’t see anything specific to using Cloud TTS instead of Google TTS. I found an integration solution made by kevinvincent in one of the threads, but while it works great for Google TTS, the problem is that I much, much prefer the more natural and nicer sounding Cloud TTS that comes with the Nabu Casa subscription.

kevinvincent’s solution relies on a separate server running alongside HA that processes the TTS. As I understand it, it works more or less by playing both streams at once: my music from Plex on Sonos, and the TTS clip processed on that side server. I can’t find any way to change the setting to Cloud TTS, though.

I’ve been trying to come up with a solution… To start with, there are a few issues that come to mind:

  • I have seen no way of editing an integration within HA in order to try and change the TTS service used. It’s currently looking like it’s not worth attempting, considering the below point.
  • Ability to use Cloud TTS is tied to my Nabu Casa subscription, which is single user. I would guess that means that this server wouldn’t be able to access it as it isn’t freely available.
  • Use “pre-printed” TTS clip sound files. This would sadly remove the ability for the TTS to announce any real time things such as outside temperatures - not ideal.

Two possible solutions come to mind:

  1. Process the TTS within my HA instance, then send the file over to the server, from where it will then be played over the Plex music playing on Sonos speakers. This seems like a fiddly solution, possibly causing tons of latency on the TTS announcements, but I imagine it’s at least possible?
  2. Abandon the idea of using this solution and try another script. I have little experience working with scripts so I would have to largely rely on the good community here.

Speaking of trying a script, I found the following from here, posted by @ianadd :

test_tts:
  alias: Test for TTS
  sequence:
    - service: script.turn_on
      entity_id: script.say
      data:
        variables:
          where: 'office'
          what: 'Test.'


say:
  alias: Sonos Text To Speech
  sequence:
  - service: sonos.snapshot
    data_template:
      entity_id: "{{ 'media_player.' ~ where }}"
  - service: tts.google_translate_say
    data_template:
      entity_id: "{{ 'media_player.' ~ where }}"
      message: "{{ what }}"
  - delay:
      seconds: 1
  - delay: >-
      {% set duration = states.media_player[where].attributes.media_duration %}
      {% if duration > 0 %}
        {% set duration = duration - 1 %}
      {% endif %}
      {% set seconds = duration % 60 %}
      {% set minutes = (duration / 60)|int % 60 %}
      {% set hours = (duration / 3600)|int %}
      {{ [hours, minutes, seconds]|join(':') }}
  - service: sonos.restore
    data_template:
      entity_id: "{{ 'media_player.' ~ where }}"

It looks like it could work according to my limited knowledge, but I keep getting an error saying “extra keys not allowed”. It could be any mistake, but I’m also wondering if a script all the way back from '19 would still work today as HA is constantly being updated.
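For comparison, here is a rough sketch of the same `say` script written in the newer syntax, with `tts.cloud_say` swapped in for `tts.google_translate_say`. It assumes a Home Assistant release that accepts templates directly under `data:` (so `data_template:` is no longer needed) and that the speaker reports a usable `media_duration` while the clip plays. Whether this alone clears the "extra keys not allowed" error is not confirmed.

```yaml
say:
  alias: Sonos Text To Speech
  sequence:
    # Remember what the speaker is currently playing
    - service: sonos.snapshot
      data:
        entity_id: "media_player.{{ where }}"
    # Speak the message with Nabu Casa Cloud TTS
    - service: tts.cloud_say
      data:
        entity_id: "media_player.{{ where }}"
        message: "{{ what }}"
    # Give the speaker a moment to start the clip
    - delay:
        seconds: 1
    # Wait for roughly the length of the clip; falls back to 0 if no duration is reported
    - delay:
        seconds: "{{ state_attr('media_player.' ~ where, 'media_duration') | default(0, true) | int }}"
    # Put the previous music back
    - service: sonos.restore
      data:
        entity_id: "media_player.{{ where }}"
```

The variables `where` and `what` are the same ones the test script above passes in.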

I would love some help on this! At this point this feature is pretty much the only major thing I’m missing at the moment. Thanks in advance!

Don’t know Plex, but the Sonos integration has two commands, sonos.snapshot and sonos.restore, which pause whatever is playing and then resume from the same point. You have to include a delay command between them in your script/automation to allow time for the TTS announcement to complete. I use this with tts.cloud_say quite a lot - it’s simple and it works well.

  - service: sonos.snapshot
    data: {}
    entity_id: media_player.kitchen
  - service: tts.cloud_say
    data:
      entity_id: media_player.kitchen
      message: Too slow. A text message has been sent to the owner of the house.
      language: en-GB
      options:
        gender: male
  - delay:
      hours: 0
      minutes: 0
      seconds: 5
      milliseconds: 0
  - service: sonos.restore
    data: {}
    entity_id: media_player.kitchen

Hi @Stiltjack, I appreciate your answer. This makes sense, I’m trying to implement it - and it looks like it could be improved with a piece of the code in my opening post so it would calculate the delay time on the fly as well, to match the TTS clip. So I assume this is something I can put into an automation? I’m asking this because I’m getting an error, saying

Message malformed: expected dictionary @ data ['action'][0]

I swapped in my own media_player entity, and as far as I can see it should pretty much work after that. Something like this is what basically always happens when I’m trying to implement something I found on the forums, and the error is telling me very little :sweat_drops: From what I’ve seen using Google, this could be a spacing issue or something like that, but I wonder why it works for your HA setup and not mine?
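One common cause of "expected dictionary" (an assumption here, since the full automation isn't shown) is pasting a list of actions, dashes and all, into a spot that expects a single action or a whole automation. A minimal sketch of a complete automation that wraps the same action list might look like the following. The trigger entity `input_boolean.tts_test` is a hypothetical helper used only for testing.

```yaml
alias: Sonos TTS test
trigger:
  # Hypothetical toggle helper, just something to fire the automation
  - platform: state
    entity_id: input_boolean.tts_test
    to: "on"
action:
  # The action list goes under "action:", one dash per step
  - service: sonos.snapshot
    data:
      entity_id: media_player.living_room
  - service: tts.cloud_say
    data:
      entity_id: media_player.living_room
      message: "This is a test"
  - delay:
      seconds: 5
  - service: sonos.restore
    data:
      entity_id: media_player.living_room
```

In the frontend this would go into the YAML view of the whole automation rather than the YAML view of one individual action.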

Post the full automation code you have currently. It’s very difficult to troubleshoot and tell you what’s wrong just from a snippet of an error message.

Also, instead of trying to calculate a delay using a template… just use a wait template:

- wait_template: "{{ is_state('media_player.' ~ where, 'paused') }}"
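As a sketch of where the wait template could sit in the snapshot/restore sequence: the version below waits until the speaker is no longer `playing`, on the assumption that the state reported after a TTS clip ends may differ between setups. The one-minute timeout is an arbitrary safety net, and `media_player.kitchen` is just an example entity.

```yaml
- service: sonos.snapshot
  data:
    entity_id: media_player.kitchen
- service: tts.cloud_say
  data:
    entity_id: media_player.kitchen
    message: "This is a test"
# Short pause so the speaker has time to switch to the TTS clip
- delay:
    seconds: 1
# Continue once the clip has finished, or after one minute at the latest
- wait_template: "{{ not is_state('media_player.kitchen', 'playing') }}"
  timeout: "00:01:00"
  continue_on_timeout: true
- service: sonos.restore
  data:
    entity_id: media_player.kitchen
```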

I don’t use Cloud TTS or Nabu Casa, but I do have Sonos and use Google TTS as well as playing other sound bites with the Sonos snapshot and restore services. I know one common reason TTS won’t work on Sonos is because of self signed SSL certificates. Again, I don’t know how Nabu Casa handles it, so SSL may not even be an issue in this case.

@wunderdog I use CloudTTS with Sonos, I call a script like this:

sonos_say:
    alias: "Sonos TTS script"
    sequence:
      - service: media_player.volume_set
        data_template:
          entity_id: media_player.kitchen_sonos
          volume_level: "{{ volume|default(0.5) }}"
      - service: tts.cloud_say
        data_template:
          entity_id: media_player.kitchen_sonos
          message: "{{ message }}"
          cache: false

with an action inside an automation like this:

  - service: script.turn_on
    entity_id: script.sonos_say
    data:
      variables:
        message: 'This is the message I want to say.'

or without the script, like this:

action:
  - service: tts.cloud_say
    entity_id: media_player.kitchen_sonos
    data_template:
      message: "The temperature in the house is {{ states('sensor.temperature') | round | int }} degrees."

If you want to play a static file, put it in your config/www folder (the one I have here is in config/www/sounds):

  - service: media_player.play_media
    data:
      entity_id:
        - media_player.kitchen_sonos
      media_content_id: http://192.168.1.15:8123/local/sounds/message.wav
      media_content_type: music

It’s important to note that if you are using Spotify, the music will not resume after TTS, it’s a limitation on the Spotify side. I used to try using the delay, but never got it to work very well. Since I mostly use Spotify, I don’t bother anymore. I don’t have any experience with Plex.
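As a side note on calling the script: a script can also be called directly by its own service name, in which case the values under `data:` are passed in as variables and the automation waits for the script to finish before moving on. A small sketch using the `sonos_say` script from this post follows; the `volume` value is only an example.

```yaml
action:
  # Calls the script directly instead of going through script.turn_on
  - service: script.sonos_say
    data:
      message: "This is the message I want to say."
      volume: 0.4
```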


Sorry! I just tried to use what Stiltjack posted, I only changed the message and the entity ID. Here it is:

  - service: sonos.snapshot
    data: {}
    entity_id: media_player.living_room
  - service: tts.cloud_say
    data:
      entity_id: media_player.living_room
      message: This is a test
      language: en-GB
      options:
        gender: male
  - delay:
      hours: 0
      minutes: 0
      seconds: 5
      milliseconds: 0
  - service: sonos.restore
    data: {}
    entity_id: media_player.living_room

I feel like I’m missing a key factor about putting the code I find into use. For instance, should I remove spaces or dashes? I tried doing that, but I’m still getting the same error. I also tried using this in an automation, as well as inserting it as a script, but I’m still getting the error. This is what it looks like:

Thanks for the tip about the wait template, I will definitely try to implement it once I get the snapshot thing working :slight_smile:

@anwen, this doesn’t resume the Sonos music after the TTS though, right? Unless… I’m missing something here? Maybe I was being unclear in my post, I did get TTS through Sonos to work, but I’m looking to have the music quiet down or pause as the TTS says its piece, and then the music would continue playing after the TTS message :slight_smile: sorry if that’s the case!

But I did want to find out how to set the volume for the TTS as well as it plays, and your code solves that mystery for me, thank you!

No, sorry, I don’t restore because it doesn’t work with Spotify, but the sonos.snapshot and sonos.restore functions should work as long as you aren’t using a cloud queue. Did you try using those without any delay code to see if they work for you?

I don’t use Spotify, but I use the Sonos queue for Amazon Music and my own library. This is my TTS automation/script engine, and it works even with speakers in a group. I have other scripts/actions that call the TTS engine as well. The automation I show allows someone to enter text into a field on the frontend and choose a speaker to play it on.

I self host and self-sign an SSL cert, so I had to use a little DNS loopback/redirect trick to allow my Sonos to resolve and accept my external URL, otherwise I would see an error in the Sonos S2 Controller app about the content for TTS not being available.

automation:

  - id: text_to_speech
    alias: 'Text To Speech'
    trigger:
      - platform: state
        entity_id: input_text.text_to_speech
        from: ''
    action:
      - service: script.tts_engine
        data:
          speaker: "media_player.{{ states('input_select.speaker') | replace(' ','_') }}"
          volume: "{{ states('input_number.text_to_speech_volume') }}"
          message: "{{ states('input_text.text_to_speech') }}"

script:

  tts_engine:
    alias: TTS Engine
    sequence:
      - service: sonos.snapshot
        data:
          entity_id: "{{ speaker }}"
          with_group: true
      - service: sonos.unjoin
        data:
          entity_id: "{{ speaker }}"
      - service: media_player.volume_set
        data:
          entity_id: "{{ speaker }}"
          volume_level: "{{ volume }}"
      - service: tts.google_translate_say
        data:
          entity_id: "{{ speaker }}"
          message: "{{ message }}"
      - delay:
          seconds: 1
      - wait_template: "{{ is_state(speaker, 'paused') }}"
      - delay:
          seconds: 1
      - service: sonos.restore
        data:
          entity_id: "{{ speaker }}"
          with_group: true
      - service: input_text.set_value
        entity_id: input_text.text_to_speech
        data:
          value: ''
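The automation above reads from three helpers that are not shown in the post. A minimal sketch of how they might be defined in YAML follows; the names match the entity IDs used above, but the option list and the volume range are assumptions for illustration.

```yaml
input_text:
  text_to_speech:
    name: Text to speech
    initial: ''

input_select:
  speaker:
    name: Speaker
    # Option names are hypothetical; spaces become underscores in the template,
    # so "living room" maps to media_player.living_room
    options:
      - kitchen
      - living room

input_number:
  text_to_speech_volume:
    name: TTS volume
    min: 0
    max: 1
    step: 0.05
    initial: 0.4
```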

Yes, I tried taking it out… I also tried everything else I can think of: removing each component one at a time and then testing, trying to fix the indentation as best I can, and using the HA resources I could find, but I’m still getting the same error :frowning: I’m starting to lose my mind a bit with this to be honest - I can’t wrap my head around why a piece of code works for someone else but not for me once I’ve changed the entities and things to match my setup.

Maybe there’s something wrong with my HA setup, is that even possible? I’ve noticed that HA sometimes refuses to save the YAML I put in for the automation action - it lets me press the save button, but once I go back and return, it’s back to the version it was before I made changes to it.

Thank you - this definitely looks pretty good to me. I will try it down the line, at the very least once I get my YAML to work :slight_smile: I would think a similar thing would work just as well with Plex (which I mainly use for streaming my own music library as well).

In your screenshot in this post, you do not have any quotes around your message text. Should be

message: "This is a test"

(With regular quotes, not curly quotes, can’t get my iPad to make the right kind.)

edited to add - fixed the quotes now that I am back at my PC. Also, I don’t use automation editor. I started with HA 3 years ago when there was no editing in the front end, so I learned to write my automations in YAML.

I think I forgot the quotes but I believe it works anyway. Nevertheless, I did try putting in quotes other times as well, both ’ and " kind.

That’s a very interesting point about not using the automation editor, I wonder if it’s the same for a lot of people here… And the reason why my stuff isn’t working. That must mean you put your automations into the config file, as apparently it’s not advisable to edit the automations.yaml? Or do you like to live dangerously and are using the automations file directly?

I actually have my automations split up so that each automation is in an individual file. For my config, I have an automation.yaml file that contains

automation: !include ../automations.yaml
automation split: !include_dir_list ../automations

and then I have a folder called automations with a subfolder for each category, and in each subfolder one file per automation.

The first line in the automation.yaml file includes all the automations in the automations.yaml file that the UI editor uses, and the next line includes all the automations in my automations folder. That way if I decided to use the automations editor for anything, those automations would work too. Hopefully that isn’t too confusing. It uses a concept in HA called packages. These might be of interest:
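To illustrate the one-file-per-automation layout described above: with `!include_dir_list`, each file becomes one entry in the list, so a file holds a single automation written as a plain mapping with no leading dash. The file path and the trigger entity below are hypothetical.

```yaml
# automations/notifications/tts_test.yaml (hypothetical path)
id: tts_test
alias: "TTS test"
trigger:
  # Hypothetical toggle helper used to fire the automation
  - platform: state
    entity_id: input_boolean.tts_test
    to: "on"
action:
  - service: tts.cloud_say
    data:
      entity_id: media_player.kitchen_sonos
      message: "This is a test"
```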

I started HA nearly 4 years ago (circa version 0.4X). I’ve always built my automations and scripts in YAML, never touched the editor. I have all of my automations in the automations.yaml file with IDs and aliases so the editor can see them, but I only use that for brief troubleshooting, never for editing. I also have a handful of automations in a separate folder which I include as noted in the splitting the configuration link posted above. These are more security based automations so they can’t be viewed/edited from the frontend. You can edit automations in both places (YAML and the frontend editor). However, the frontend editor puts them into a format that is not always easy to follow, and as soon as you save an automation from the frontend editor, it will delete any comments (lines preceded with a #) you had put in the YAML file and re-organize your automation code there.

Thanks Anwen, I will read up on these! Hoping that it’s some kind of issue with using the editor and creating my own automation files would fix it, I think I had some kind of odd errors that could actually be chalked up to Home Assistant’s infancy - I don’t tend to want to be the carpenter blaming the tools as it’s still likely to be my own fault somehow, but seeing you guys talk about not using the editor is ringing a bell for me and is absolutely worth a try.

Yeah, I’m slightly concerned about what the editor is doing to the YAML behind my back. For example, I tried putting some of the stuff I saw on here into the editor, saved it since it for once didn’t give me an error (it seems a little less likely to give an error if I typed the thing out myself, but anything that’d meaningfully move me towards my goal still wouldn’t work without errors), and once I re-opened the automation I’d saved, my changes to it were gone?

Nevertheless, it’s a relief to know that the automations I already created with the editor can still be used even if I created new automations in the different way as I already created quite a few simple ones.

This is a much more elegant solution than what I’d been using. I’m glad I finally stumbled across this suggestion. Thank you.

Thank you, seems to be exactly what I’m looking for! Quick question: How do I call this up from an automation? Sorry for this trivial question, but I’ve only been with Home Assistant for a few weeks.

If you’re asking how to do this from the automation editor on the frontend, I can’t help you with that. I don’t use the automation editor and write all of my automations in YAML.

If you want to know how to call the TTS script from an automation, I provided that in the post you replied to. It’s a two part piece of code that goes in your YAML files. 1 script that actually does the TTS and pausing/resume and 1 automation that calls the script and passes data to it such as where to play, what to play, and how loud to play it.
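As a rough sketch of the calling side, an automation with fixed values could look like the one below. It assumes the `tts_engine` script from earlier in the thread is already defined; the door sensor, speaker entity, volume, and message are hypothetical placeholders.

```yaml
automation:
  - id: announce_front_door
    alias: 'Announce front door'
    trigger:
      # Hypothetical sensor; any trigger works here
      - platform: state
        entity_id: binary_sensor.front_door
        to: 'on'
    action:
      # Pass where to play, how loud, and what to say
      - service: script.tts_engine
        data:
          speaker: media_player.kitchen
          volume: 0.4
          message: "The front door was opened."
```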
