Google Translate TTS leaving out parts of sentences?

While experimenting with TTS today, I found that Google TTS leaves out parts of sentences, even though the ID3 tag of the generated file shows the complete sentence.

Here are some examples:


What I typed | What Google speaks
Reminder: It’s time to put the rubbish bin out. | It’s time to put the rubbish bin out
Reminder. It’s time to put the rubbish bin out. | Reminder. It’s time to put the rubbish bin out.
Reminder? It’s time to put the rubbish bin out. | Reminder?
Hello! Who’s there? | Hello! Who’s there?
Hello? Who’s there? | Hello?
Hello: Who’s there? | Who’s there?

Now what is going on here?
Can anyone reproduce that? I’m using HA Core 0.112.3.

Note: If I use PicoTTS instead of Google Translate TTS, everything works just fine.

If I go to Google Translate’s web page and use the “Listen” button, it also says “Reminder: It’s time to put the rubbish bin out.”

Kind of makes sense. It stops after the first question, since a question requests input from the user, and it appears the text before a colon is dropped, perhaps in the belief that it is some sort of tag which would not have a verbal presence. Interesting.

I consider this a severe bug. We couldn’t rely on anything anymore if some kind of “AI” started modifying the text we want spoken in home automation, possibly changing its entire meaning!

Any more testers, please?

I hope it’s a bug in HA (which could be repaired more easily than Google …)

I doubt this is HA. If I put “78°F” in a text string google tts will say “seventy eight degrees Fahrenheit” not “seven eight smallcirclesymbol F”. Google or whoever has likely made some design decisions regarding how to verbalize certain written cues. You could choose to use an offline replacement that you feel confident interprets your written intent correctly.

Well, I don’t wish to start a discussion about what bugs should be sold as a feature, I just want to nail it down to either Home Assistant or Google. Personally, I use PicoTTS anyway, but I do have to support lots of “Hassio” installs that use Google TTS.

I assume HA uses gTTS internally (?), so I tried using gTTS with the same sentences directly in Python3, and they turn out ok.
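For anyone who wants to repeat that check, here is a rough sketch of the kind of test I mean. It assumes the third-party gTTS package is installed (pip install gTTS) and that the machine has internet access. The function name and the output file names are made up for illustration, and whether HA really uses gTTS internally is still just my assumption.

```python
import os

# The six test sentences from the table in the first post.
SENTENCES = [
    "Reminder: It's time to put the rubbish bin out.",
    "Reminder. It's time to put the rubbish bin out.",
    "Reminder? It's time to put the rubbish bin out.",
    "Hello! Who's there?",
    "Hello? Who's there?",
    "Hello: Who's there?",
]


def synthesize_all(sentences, out_dir="."):
    """Save one MP3 per sentence via gTTS and return the file paths.

    Hypothetical helper; it needs network access when called.
    """
    # Imported here so the module loads even where gTTS is not installed.
    from gtts import gTTS

    paths = []
    for index, text in enumerate(sentences, start=1):
        path = os.path.join(out_dir, f"sentence_{index}.mp3")
        gTTS(text=text, lang="en").save(path)
        paths.append(path)
    return paths

# Usage: synthesize_all(SENTENCES), then listen to each file and compare
# what is spoken with what was typed.
```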

This, and the fact that the ID3 tag of the HA-generated Google TTS file contains the full message, leads me to believe it might be a bug in HA.

Any devs around who could test/answer this, maybe?

Your image indicates you’re using Google Translate not plain TTS, so that may help explain why more interpretation is being invoked.

Have you tried double-quoting? eg: message: '"Reminder: It’s time to put the rubbish bin out."'

Now that’s interesting: Using double-quoting, it speaks the whole sentence!
How did you ever come up with that?

It does take away the option of picking one quote style so that the other kind can appear inside the message. The “real” (Unicode) apostrophe was actually a typo. Normally it would have been:

message: "Reminder: It's time …"

This works:

message: '"Reminder: It''s time to put the rubbish bin out."'

but is of course unusable since the same message goes out to email, Telegram and TTS (often using notify).

Wonder how that could be done without introducing manual escaping … yuck.
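One possible way around the manual escaping, sketched in Python purely to show the idea: keep a single plain message and add the extra double quotes only on the path that goes to TTS. The function name and channel names below are hypothetical. In a real HA setup this would have to be done in whatever builds the service call, which I have not tried.

```python
def format_for_channel(channel: str, message: str) -> str:
    """Return the message as it should be handed to the given channel.

    Hypothetical helper: only the "tts" channel gets the extra quotes.
    """
    if channel == "tts":
        # Workaround from this thread: wrap the text in literal double quotes.
        return f'"{message}"'
    # Email, Telegram etc. receive the message unchanged.
    return message


message = "Reminder: It's time to put the rubbish bin out."
for channel in ("email", "telegram", "tts"):
    print(channel, "->", format_for_channel(channel, message))
```

That way the message itself is written once, with ordinary quoting, and nothing TTS-specific leaks into the other notifications.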

One set of quotes is a requirement for the YAML (Python?) syntax but they are stripped away before the output is sent to TTS, so the second set of quotes is purely for the TTS interpreter. You could try inverting the quote types to keep the syntax legible for reading – I know YAML/Python will accept either, I’m just not sure about Google TTS:
message: "'Reminder: It's time to put the rubbish bin out.'"

If that throws an error you might try adding a fourth ' to make an even set:
message: "''Reminder: It's time to put the rubbish bin out.'"
Edit: Nope, that last one wouldn’t work.
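To make it easier to see what text is left once the outer set of quotes is stripped, here is a small sketch. It is a deliberately simplified stand-in for the YAML quoting rules, not a real YAML parser: it only handles the doubled '' escape inside single quotes and the \" and \\ escapes inside double quotes. The function name is made up for illustration.

```python
def unquote_yaml_scalar(raw: str) -> str:
    """Strip one outer set of quotes the way YAML would (simplified)."""
    raw = raw.strip()
    if len(raw) >= 2 and raw[0] == raw[-1] == "'":
        # Single-quoted scalar: '' inside stands for one literal apostrophe.
        return raw[1:-1].replace("''", "'")
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        # Double-quoted scalar: only two backslash escapes handled here.
        return raw[1:-1].replace('\\"', '"').replace("\\\\", "\\")
    # Unquoted scalar: returned as-is.
    return raw


variants = [
    """'"Reminder: It''s time to put the rubbish bin out."'""",
    '''"'Reminder: It's time to put the rubbish bin out.'"''',
    '''"''Reminder: It's time to put the rubbish bin out.'"''',
]
for raw in variants:
    print(unquote_yaml_scalar(raw))
```

Under these assumptions the first variant leaves a double-quoted sentence, while the other two leave single quotes around a text that itself contains an apostrophe.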

The first one wouldn’t work either; it would break at the It's.

Anyway, this differs from the PicoTTS implementation, and I’m almost positive it worked with a single set of quotes somewhere in the range of HA 0.103–0.105 (but I can’t prove it since we all upgraded to 0.112.x).

From a user’s standpoint, I think it should only require one set of quotes, like almost everything else in HA. Who knows, maybe some extra stripping/unescaping was introduced in the code in the meantime?

Problem has been resolved in HA 0.113.3, together with the “%20” Google TTS/yarl double encoding problem.
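For readers who have not run into the “%20” issue: the snippet below only illustrates what double percent-encoding means in general, using the Python standard library. It does not reproduce the actual HA or yarl code path, which I have not looked at.

```python
from urllib.parse import quote, unquote

message = "put the rubbish bin out"

# Encoding once turns each space into %20, as expected in a URL.
encoded_once = quote(message)

# Encoding the already-encoded text again also escapes the "%" itself.
encoded_twice = quote(encoded_once)

# A receiver that decodes only once is left with literal "%20" in the text.
decoded_once = unquote(encoded_twice)

print(encoded_once)
print(encoded_twice)
print(decoded_once)
```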