Quite often I have a TTS line that is too long for my Pi4 to process using Piper. Since I am not aware of an option to set a timeout, I wanted to split the text and have the TTS run multiple times. One challenge is detecting when the TTS has finished. As far as I know, there is no option for that, so I used the delay solution I found here: I finally solved my issue of multiple Alexa notifications interrupting one another
So for the sake of sharing, here’s what I’ve done. I’m always open to suggestions, of course.
alias: Multi-line TTS
sequence:
  - variables:
      text: "{{ states('input_text.multi_line_tts') }}"
  - repeat:
      count: "{{ text.strip('.').split('.') | count }}"
      sequence:
        - variables:
            text_current: "{{ text.split('.')[repeat.index - 1] | trim }}."
            text_word_length: >-
              {{ ((text_current | count - text_current.split(' ') | count) /
              text_current.split(' ') | count) | round(1) }}
        - service: tts.speak
          data:
            cache: true
            media_player_entity_id: "{{ speaker }}"
            message: "{{ text_current }}"
          target:
            entity_id: tts.piper
        - service: logbook.log
          continue_on_error: true
          data:
            name: >-
              Debug Multi-line TTS to speaker {{
              state_attr(speaker,"friendly_name") }}
            entity_id: script.multi_line_tts
            message: >-
              The current text '{{ text_current }}' has {{ text_current.split('
              ') | count }} words with an average of {{ text_word_length }}
              characters a word. The delay is {{ (text_current.split(' ') |
              count * 0.5 ) | round(0, 'floor', default) }} sec.
          enabled: false
        - delay:
            seconds: >-
              {{ (text_current.split(' ') | count * 0.5 ) | round(0, 'floor',
              default) }}
          alias: Delay 0,5s per word
mode: queued
icon: mdi:microphone-message
In short, it splits the text into sentences, using the period as separator, and speaks them one at a time with a delay of 0.5 seconds per word in between. That made sense to me; you just have to keep the sentences short. Since it’s a script, you can of course pass variables to it from an automation. For example:
service: script.multi_tts_lines
data:
  speaker: media_player.<your speaker>
or
service: script.turn_on
target:
  entity_id: script.multi_tts_lines
data:
  variables:
    speaker: media_player.<your speaker>
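Since the script reads its text from the input_text.multi_line_tts helper, the automation has to fill that helper before calling the script. A minimal sketch of what those two actions could look like; the example sentence is made up, and keep in mind that an input_text state is capped at 255 characters:

```yaml
# Sketch only: the message text is a made-up example.
# Step 1: put the text to be spoken into the helper the script reads from.
- service: input_text.set_value
  target:
    entity_id: input_text.multi_line_tts
  data:
    value: "The washing machine is done. Please hang up the laundry."
# Step 2: start the script and tell it which speaker to use.
- service: script.multi_tts_lines
  data:
    speaker: media_player.<your speaker>
```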
Known issues:
- If the text is not yet in the cache, playback gets messed up. The next time, once the audio is cached, it’s fine.
- A very short line with long words could result in an overlap, but I haven’t run into it outside of testing.
Edit 1: A small code improvement, plus a (disabled) debug line that could be useful in solving issue 2. One could, for example, make the delay factor dynamic based on the average word length. I’ve also put the speaker into a variable. If you always want to use the same speaker, you can set the variable speaker as the first action in the script.
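For what it's worth, here is a rough sketch of both ideas from Edit 1. The factor of 0.1 seconds per character is purely my assumption (it works out to the same 0.5 seconds for a word of 5 characters) and I have not tuned it, so treat it as a starting point:

```yaml
# Fragment 1: first action of the script, to always use the same speaker.
- variables:
    speaker: media_player.<your speaker>

# Fragment 2: possible replacement for the delay step inside the repeat.
# Assumption: 0.1 s per character instead of a fixed 0.5 s per word.
# text_word_length is the average word length calculated earlier in the loop;
# float(5) falls back to 5 characters if it cannot be read as a number.
- delay:
    seconds: >-
      {{ ((text_current.split(' ') | count) * (text_word_length | float(5))
      * 0.1) | round(0, 'ceil') }}
  alias: Delay 0.1s per character
```

Rounding up instead of down errs on the side of a slightly longer pause, which seems safer for the overlap issue.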
Edit 2: I ended up using an input_text helper since I ran into issues with longer lines not being passed to the script. If you use the repeat in the same automation where you define the variable text, then there is no need for a helper.
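A sketch of that helper-free variant, assuming the actions live directly in the automation. The example text is made up; the rest is the same loop as in the script, minus the debug line:

```yaml
# Actions of the automation; text and speaker are defined right here,
# so no input_text helper is needed.
- variables:
    text: "First sentence. Second sentence. Third sentence."
    speaker: media_player.<your speaker>
- repeat:
    count: "{{ text.strip('.').split('.') | count }}"
    sequence:
      - variables:
          # Pick the sentence for this iteration and put the period back.
          text_current: "{{ text.split('.')[repeat.index - 1] | trim }}."
      - service: tts.speak
        target:
          entity_id: tts.piper
        data:
          cache: true
          media_player_entity_id: "{{ speaker }}"
          message: "{{ text_current }}"
      # Wait 0.5 s per word before speaking the next sentence.
      - delay:
          seconds: >-
            {{ (text_current.split(' ') | count * 0.5) | round(0, 'floor') }}
```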