Pause between sentences in Piper TTS

kibuan · January 11, 2024, 5:42pm

I have noticed that Piper does not provide any means to insert pauses between sentences or paragraphs using a designated command in the input text. Additionally, the add-on lacks support for paragraph marking in the input text and does not offer a paragraph pause setting. However, I have observed that Piper does allow pauses between each individual sentence.

Therefore, I would like to propose adding a setting called “sentence-silence” to the Piper add-on configuration. This setting would let users set the “--sentence-silence” parameter to their preference when generating TTS with Piper.

I believe that by incorporating the “sentence-silence” setting into the Piper add-on, users would be able to create more natural and intelligible voice output, particularly when it comes to narrating longer texts or documents.

The sentence-silence parameter is mentioned here:

Nimrod_Dolev · January 11, 2024, 8:20pm

Hi @kibuan. You might want to consider using the custom Chime TTS integration to achieve pauses between sentences or paragraphs. Among its other features, the integration can combine as many TTS audio segments, “chime” sounds and delays as you wish.

For example, you could call the chime_tts.say service with the following TTS and delay segments:

service: chime_tts.say
data:
  tts_platform: tts.piper
  message:
    - type: tts
      message: |
        That's interesting, let me think about that for a second.
    - type: delay
      length: 1000
    - type: tts
      message: Ok, I thought about it.
target:
  entity_id: media_player.kitchen

kibuan · January 11, 2024, 9:30pm

Thanks for the suggestion @Nimrod_Dolev. It seems like the chime_tts integration can help me get cleaner TTS audio with natural pause lengths between paragraphs. I will definitely look into it.

I have a lot of fun using ChatGPT to generate messages. For example, one of the messages is a morning greeting: a combined message built from a randomized prompt, which can include quotes, a weather update, fun facts, jokes, etc. It’s just a shame that I have to cut it up into separate parts (a separate prompt for each paragraph) and then stitch them together to get proper TTS audio. I get the best result when I let GPT generate the full message from one prompt. If the TTS service had a pause command I could include in the message, I would just instruct ChatGPT to use that command between sentences.

Hm… maybe I could cobble a solution together using existing components. I guess it should be possible to write a script that I can call with a variable holding the TTS message, including a custom delay command (maybe three dots (…) for a pause/delay). The script could then call chime_tts with the separated paragraphs of the message.
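
The splitting step of such a script could look roughly like the Python sketch below. It is only an illustration of the logic, not a Home Assistant script: the function name `to_segments`, the default delay of 1000 ms, and the choice to accept both “…” and three plain dots as the marker are assumptions. The output mirrors the tts/delay segment structure shown earlier in the thread.

```python
import re


def to_segments(text: str, delay_ms: int = 1000) -> list[dict]:
    """Split text on a pause marker and put a delay segment between the parts.

    The marker is hypothetical: either the ellipsis character or three dots.
    """
    # Split on the marker, trim whitespace, and drop empty parts.
    parts = [p.strip() for p in re.split(r"…|\.{3}", text)]
    parts = [p for p in parts if p]

    segments = []
    for index, part in enumerate(parts):
        if index > 0:
            # A delay goes between two spoken parts, never before the first.
            segments.append({"type": "delay", "length": delay_ms})
        segments.append({"type": "tts", "message": part})
    return segments


if __name__ == "__main__":
    for segment in to_segments("Good morning. … Here is a joke."):
        print(segment)
```

One catch with three dots as the marker: a language model may also use an ellipsis as ordinary punctuation, so a less common marker might be safer.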

Nimrod_Dolev · January 11, 2024, 9:47pm

I think you could combine ChatGPT with Chime TTS to achieve what you want to do.
In your prompt, you could instruct ChatGPT to format its output to match the YAML used by the message parameter.

andyfrei · March 12, 2024, 9:30am

Is there a way to use part of the text message sent to Chime TTS as a pause marker?

Something like
‘message part 1 message part 2’, which would result in a pause in the TTS announcement between the two message parts?

Nimrod_Dolev · March 12, 2024, 9:34am

You could use a template to split the text into separate tts and delay segments (see above for the structure).
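
To illustrate the splitting idea without a marker, here is a rough Python sketch (not a Home Assistant template) that turns plain text into tts and delay segments, with a short pause between sentences and a longer one between paragraphs. The function name `text_to_segments` and the pause lengths are assumptions, and the sentence splitting is deliberately naive (it will break on abbreviations such as “e.g.”).

```python
import re


def text_to_segments(text: str, sentence_ms: int = 300, paragraph_ms: int = 1000) -> list[dict]:
    """Build tts/delay segments: short pauses between sentences, longer between paragraphs."""
    segments = []
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for p_index, paragraph in enumerate(paragraphs):
        if p_index > 0:
            segments.append({"type": "delay", "length": paragraph_ms})
        # Naive sentence split: whitespace that follows ".", "!" or "?".
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        for s_index, sentence in enumerate(sentences):
            if s_index > 0:
                segments.append({"type": "delay", "length": sentence_ms})
            segments.append({"type": "tts", "message": sentence})
    return segments


if __name__ == "__main__":
    demo = "Good morning. It is sunny.\n\nHere is a joke."
    for segment in text_to_segments(demo):
        print(segment)
```

The resulting list has the same shape as the message parameter in the chime_tts.say example earlier in the thread.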