Piper TTS and Google Home products that have a screen

SentryBot2281 · October 10, 2024, 2:47pm

I’ve finally started to implement some announcements into my system. I’ve so far been using Piper and it works beautifully so long as I’m not targeting a (Google) Nest Hub or Hub Max. The old first generation Google Home speakers, a Nest Audio speaker, some 2nd gen Sonos Play:1 speakers, and even my Sony HT-A7000 soundbar all play back the Piper messages without incident and in their entirety. The devices with screens however have a delay that cuts off the start. I’ve looked around but haven’t noticed others complain about this. So is this a known issue? Or an issue with me doing something wrong?

My inelegant workaround for the time being is to prepend “Nest Hub delay” to any Piper TTS message directed at a screen device. That phrase seems to take exactly the amount of time that gets cut off at the beginning. Well, the timing is right if you’re using the voice “hfc female (medium)”; I haven’t actually tested it with other voices.

putch · October 10, 2024, 7:35pm

Can you provide a specific example of the code you are using to send the audio that is getting its start cut off?

SentryBot2281 · October 10, 2024, 8:43pm

Yeah it might have been helpful if I’d done that from the beginning…

action: tts.speak
data:
  message: Nest Hub delay.  The washing machine is finished
  media_player_entity_id: media_player.cast_kitchen_hubmax
target:
  entity_id: tts.piper

I just don’t include “Nest Hub delay” when sending to devices without a screen. But without that delay, all you hear is “finished”.
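
As a rough sketch, the "prepend only for screen devices" workaround described above could be wrapped in a script so the prefix is added conditionally. The script name `announce_with_hub_prefix`, the field names `player` and `message`, and the `screen_players` list are all made up for illustration and are not from the original setup; the prefix length was only reported to match the “hfc female (medium)” voice.

```yaml
# Hypothetical script: names and the screen_players list are assumptions.
script:
  announce_with_hub_prefix:
    fields:
      player:
        description: Media player entity that should play the announcement
      message:
        description: Text to speak
    variables:
      # Devices with a screen that were reported to cut off the start
      screen_players:
        - media_player.cast_kitchen_hubmax
    sequence:
      - action: tts.speak
        target:
          entity_id: tts.piper
        data:
          media_player_entity_id: "{{ player }}"
          # Prepend the throwaway phrase only when targeting a screen device
          message: "{{ ('Nest Hub delay. ' if player in screen_players else '') ~ message }}"
```

Calling `script.announce_with_hub_prefix` with `player: media_player.cast_kitchen_hubmax` and `message: The washing machine is finished` should then send the same text as the example above, while speakers not in the list would get the message without the prefix.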

putch · October 10, 2024, 9:06pm

When I try it your way, I get your result. However, try this instead:

action: media_player.play_media
data:
  media_content_id: media-source://tts/tts.piper?message=The washing machine is finished&cache=true&voice=en_US-amy-low
  media_content_type: audio/mp3
target:
  entity_id: media_player.cast_kitchen_hubmax

Obviously you can change the voice to your liking.

putch · October 10, 2024, 9:15pm

And if you want to make it fancy, with extra text and a graphic…

action: media_player.play_media
data:
  media_content_id: media-source://tts/tts.piper?message=The washing machine is finished&cache=true&voice=en_US-amy-low
  media_content_type: audio/mp3
  extra:
    stream_type: "BUFFERED"
    title: Laundry Status
    thumb: https://img.freepik.com/free-psd/washing-machine-isolated-transparent-background_191095-32349.jpg
    metadata:
      subtitle: Washing Machine
target:
  entity_id: media_player.cast_kitchen_hubmax

SentryBot2281 · October 10, 2024, 9:22pm

See, I knew the problem was between the keyboard and the chair. I really can’t wait to give this a shot when I get home. Thank you!

jschollenberger · October 12, 2024, 12:52am

Thanks for sharing this! It’s working great for me and has the added benefit of being able to display an image and text. I’ve just been dealing with the cut-off bug for years by putting throwaway text at the start, but now that I’ve moved to Piper and am dynamically generating the TTS input, it has come back to plague me.

SentryBot2281 · October 12, 2024, 4:54pm

There must be some inherent issue with the Hub and Hub Max. If I change the voice to any of the “medium” ones, the problem persists: amy-low works fine, but amy-medium brings me back to just hearing “finished”. Also, stream_type: "BUFFERED" seems to make no difference with either low or medium voices.

Well, for the time being I’m just going to use amy-low, which honestly sounds just fine to me.
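
If the voice quality rather than the action is what matters here, the original `tts.speak` call might behave the same way once it is pointed at the low-quality voice. The sketch below assumes the Piper entity accepts a `voice` entry under `options`, mirroring the `voice=` parameter used in the media-source URL earlier in the thread; nobody in the thread has tested whether this avoids the cut-off.

```yaml
action: tts.speak
data:
  message: The washing machine is finished
  media_player_entity_id: media_player.cast_kitchen_hubmax
  # Assumption: same voice name that worked via media_player.play_media
  options:
    voice: en_US-amy-low
target:
  entity_id: tts.piper
```

Swapping `en_US-amy-low` for `en_US-amy-medium` in the same call would be one way to check whether the cut-off follows the voice or the action.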