TTS component enhancement

I’ve been working on integrating TTS notifications into my system. I have a USB sound device piped into my Monoprice 10761.

It appears that both the Polly and Google TTS components are limited to a single voice, defined in configuration, with no ability to override that setting on the fly.

Google has a number of standard and WaveNet voices listed at https://cloud.google.com/text-to-speech/docs/voices, but we are only able to specify the language code in configuration.

Amazon Polly has a number of voices listed here (all of which appear to be supported in static configuration): https://docs.aws.amazon.com/polly/latest/dg/voicelist.html

Proposal: allow ‘voice’ to be specified in the JSON service call to override which voice is sent to the respective TTS service.
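For example, a call to the existing tts.google_say service might carry the proposed field like this (only a sketch; the ‘voice’ key does not exist today, and the entity and message are placeholders). The same key would go in the JSON body when the service is called via the REST API.

service: tts.google_say
data:
  entity_id: media_player.living_room
  message: "The garage door has been left open."
  voice: "en-US-Wavenet-B"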

Alternatively, allow the creation of multiple TTS profiles, each exposed as its own service named tts.arbitrary_name (e.g., tts.amazon_polly_salli, tts.amazon_polly_matthew, tts.google_say_whatever).

examples…

tts:
  - platform: amazon_polly
    aws_access_key_id: my_key
    aws_secret_access_key: my_secret_access_key
    name: polly_salli
    voice: "Salli"
  - platform: amazon_polly
    aws_access_key_id: my_key
    aws_secret_access_key: my_secret_access_key
    name: polly_matthew
    voice: "Matthew"
  - platform: google
    language: 'en'
    voice: 'en-US-Wavenet-A'
    name: google_weather_alerts
  - platform: google
    language: 'en'
    voice: 'en-US-Wavenet-B'
    name: google_door_alerts

These would create the following services:
tts.polly_salli_say, tts.polly_matthew_say, tts.google_weather_alerts_say, tts.google_door_alerts_say

(or just let me specify the voice code in tts.amazon_polly_say and tts.google_say!)

This way I could have notifications in a voice specifically for my kids to listen and respond to, different from the voice telling the house that the garage has been left open, or whatever other use case comes up.
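For example, those two use cases might end up as automation actions like these (hypothetical: the named services follow the YAML above, and the media players and messages are just illustrations):

- service: tts.polly_salli_say
  data:
    entity_id: media_player.kids_room
    message: "Time to get ready for bed."
- service: tts.polly_matthew_say
  data:
    entity_id: media_player.whole_house
    message: "The garage door has been left open."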

I think this makes sense, and I like this idea! Looking at the YAML you wrote up, my guess is that doing the service routing the way you suggest is going to be a bit of a pain.

On the code side for Google, just passing in a different WaveNet ‘language’ is sufficient to get a different voice, and this could be supported fairly easily with extra data, I think, although I don’t know much about how Home Assistant passes data around.
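As a rough illustration of how little is involved on the Google Cloud side (not the gist itself), here is a minimal sketch using a recent version of the google-cloud-texttospeech Python client. It assumes credentials are already set up via GOOGLE_APPLICATION_CREDENTIALS, and the voice name is just one entry from the list linked above.

from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

# The voice is chosen purely by these parameters; swapping the name
# (e.g. en-US-Wavenet-A vs en-US-Wavenet-B) is all that changes between voices.
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US",
    name="en-US-Wavenet-A",
)
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

response = client.synthesize_speech(
    input=texttospeech.SynthesisInput(text="The garage door has been left open."),
    voice=voice,
    audio_config=audio_config,
)

# Write the returned MP3 bytes to disk (or hand them to a media player).
with open("tts_output.mp3", "wb") as out:
    out.write(response.audio_content)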

FWIW, I just published a GitHub gist that lets you get WaveNet voices now if you replace your google.py file and prefer better voice generation – the post is here: Wavenet support in tts.


This is what I’m after too. Was any progress ever made on implementing this?
