Problem with Cloud/STT/TTS

Hello everybody,

I have a weird issue and I hope someone can suggest how to get to the root cause of it.

As the title suggests, the issue is with STT/TTS, or with the cloud or my internet connection. I'm not 100% sure which.
It shows up in one of two ways:
a) a tts.cloud_say command is sent to my network speaker; the speaker goes into Playing mode but no sound comes out
b) I give a voice command to my HA Voice device; it goes into Listening mode and stays there for up to several minutes, then recovers and does what it was told

In any case, it happens several times per day (that I’ve noticed, possibly more often), and while it lasts nothing related to STT/TTS works, including the Assist app on my Android phone. There is no general internet problem at those times: someone is usually watching YouTube or browsing while STT/TTS is down. The connection is also not under heavy load then.

I’m running Core 2025.4.4 and Supervisor 2025.05.3 on a Raspberry Pi, but the problem is not version dependent; it’s been going on for quite some time now. I have a FRITZ!Box modem/router with the official AVM FRITZ!Box Tools integration. I also have a Nabu Casa account, which reports as Connected all the time.

So, my question is: does anybody have any idea where to look? I haven’t seen anything relevant in any log (the Fritz log is the most complicated one, so it’s possible I missed something there, if it’s even related in any way).

My head is currently all over the place, so apologies if it’s a bit incoherent, or if I’ve forgotten to add some important detail.

I’ll appreciate any help. Thanks in advance!

Below is part of the HA Voice log, with timestamps for reference:

> stage: done
> run:
>   pipeline: 01jmyms427bk6fxw6db0yze60s
>   language: en
>   conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
>   tts_output:
>     token: H1ZUBuMTTIrJoC7yVYni8w.flac
>     url: /api/tts_proxy/H1ZUBuMTTIrJoC7yVYni8w.flac
>     mime_type: audio/flac
> events:
>   - type: run-start
>     data:
>       pipeline: 01jmyms427bk6fxw6db0yze60s
>       language: en
>       conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
>       tts_output:
>         token: H1ZUBuMTTIrJoC7yVYni8w.flac
>         url: /api/tts_proxy/H1ZUBuMTTIrJoC7yVYni8w.flac
>         mime_type: audio/flac
>     timestamp: "2025-05-11T08:58:29.799873+00:00"
>   - type: stt-start
>     data:
>       engine: stt.home_assistant_cloud
>       metadata:
>         language: en-US
>         format: wav
>         codec: pcm
>         bit_rate: 16
>         sample_rate: 16000
>         channel: 1
>     timestamp: "2025-05-11T08:58:29.800678+00:00"
>   - type: stt-vad-start
>     data:
>       timestamp: 1740
>     timestamp: "2025-05-11T09:04:57.973846+00:00"
>   - type: stt-vad-end
>     data:
>       timestamp: 3620
>     timestamp: "2025-05-11T09:04:58.071016+00:00"
>   - type: stt-end
>     data:
>       stt_output:
>         text: Switch to PC.
>     timestamp: "2025-05-11T09:04:58.566546+00:00"
>   - type: intent-start
>     data:
>       engine: conversation.home_assistant
>       language: en
>       intent_input: Switch to PC.
>       conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
>       device_id: d4b98d174b2001bca1f52c0aa49ffb7b
>       prefer_local_intents: true
>     timestamp: "2025-05-11T09:04:58.567326+00:00"
>   - type: intent-end
>     data:
>       processed_locally: true
>       intent_output:
>         response:
>           speech:
>             plain:
>               speech: Done
>               extra_data: null
>           card: {}
>           language: en
>           response_type: action_done
>           data:
>             targets: []
>             success: []
>             failed: []
>         conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
>         continue_conversation: false
>     timestamp: "2025-05-11T09:04:58.601916+00:00"
>   - type: tts-start
>     data:
>       engine: tts.home_assistant_cloud
>       language: en-GB
>       voice: OliviaNeural
>       tts_input: Done
>     timestamp: "2025-05-11T09:04:58.603146+00:00"
>   - type: tts-end
>     data:
>       tts_output:
>         media_id: >-
>           media-source://tts/tts.home_assistant_cloud?message=Done&language=en-GB&tts_options=%7B%22audio_output%22:%22mp3%22,%22voice%22:%22OliviaNeural%22,%22preferred_format%22:%22flac%22,%22preferred_sample_rate%22:48000,%22preferred_sample_channels%22:1,%22preferred_sample_bytes%22:2%7D
>         token: H1ZUBuMTTIrJoC7yVYni8w.flac
>         url: /api/tts_proxy/H1ZUBuMTTIrJoC7yVYni8w.flac
>         mime_type: audio/flac
>     timestamp: "2025-05-11T09:04:58.606172+00:00"
>   - type: run-end
>     data: null
>     timestamp: "2025-05-11T09:04:58.607281+00:00"
> stt:
>   engine: stt.home_assistant_cloud
>   metadata:
>     language: en-US
>     format: wav
>     codec: pcm
>     bit_rate: 16
>     sample_rate: 16000
>     channel: 1
>   done: true
>   stt_output:
>     text: Switch to PC.
> intent:
>   engine: conversation.home_assistant
>   language: en
>   intent_input: Switch to PC.
>   conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
>   device_id: d4b98d174b2001bca1f52c0aa49ffb7b
>   prefer_local_intents: true
>   done: true
>   processed_locally: true
>   intent_output:
>     response:
>       speech:
>         plain:
>           speech: Done
>           extra_data: null
>       card: {}
>       language: en
>       response_type: action_done
>       data:
>         targets: []
>         success: []
>         failed: []
>     conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
>     continue_conversation: false
> tts:
>   engine: tts.home_assistant_cloud
>   language: en-GB
>   voice: OliviaNeural
>   tts_input: Done
>   done: true
>   tts_output:
>     media_id: >-
>       media-source://tts/tts.home_assistant_cloud?message=Done&language=en-GB&tts_options=%7B%22audio_output%22:%22mp3%22,%22voice%22:%22OliviaNeural%22,%22preferred_format%22:%22flac%22,%22preferred_sample_rate%22:48000,%22preferred_sample_channels%22:1,%22preferred_sample_bytes%22:2%7D
>     token: H1ZUBuMTTIrJoC7yVYni8w.flac
>     url: /api/tts_proxy/H1ZUBuMTTIrJoC7yVYni8w.flac
>     mime_type: audio/flac
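
To see where the time goes in the log above, the event timestamps can be diffed. This is only a minimal Python sketch: the `EVENTS` list is copied by hand from the log, and `stage_gaps` is a helper name made up for illustration, not anything Home Assistant provides. It shows that almost the whole run (about 388 seconds) is spent between `stt-start` and `stt-vad-start`, while everything from `stt-vad-start` to `run-end` finishes in under a second.

```python
from datetime import datetime

# (event type, timestamp) pairs copied by hand from the pipeline log above.
EVENTS = [
    ("run-start", "2025-05-11T08:58:29.799873+00:00"),
    ("stt-start", "2025-05-11T08:58:29.800678+00:00"),
    ("stt-vad-start", "2025-05-11T09:04:57.973846+00:00"),
    ("stt-vad-end", "2025-05-11T09:04:58.071016+00:00"),
    ("stt-end", "2025-05-11T09:04:58.566546+00:00"),
    ("intent-start", "2025-05-11T09:04:58.567326+00:00"),
    ("intent-end", "2025-05-11T09:04:58.601916+00:00"),
    ("tts-start", "2025-05-11T09:04:58.603146+00:00"),
    ("tts-end", "2025-05-11T09:04:58.606172+00:00"),
    ("run-end", "2025-05-11T09:04:58.607281+00:00"),
]


def stage_gaps(events):
    """Return (previous_event, next_event, seconds_between) for consecutive events.

    Hypothetical helper for illustration only.
    """
    parsed = [(name, datetime.fromisoformat(ts)) for name, ts in events]
    return [
        (prev_name, cur_name, (cur_ts - prev_ts).total_seconds())
        for (prev_name, prev_ts), (cur_name, cur_ts) in zip(parsed, parsed[1:])
    ]


if __name__ == "__main__":
    # Print every gap so the slow stage stands out.
    for prev_name, cur_name, seconds in stage_gaps(EVENTS):
        print(f"{prev_name:>14} -> {cur_name:<14} {seconds:10.3f} s")
```

Running the same diff on a log captured during normal operation would show whether that first gap is the only thing that changes when the problem occurs.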

Small update: I’ve had the Fritz! integration disabled for the past couple of days, but the issue is still exactly the same, which leads me to believe it has to do with my Cloud or internet connectivity. The only question is why HA has a connectivity problem when everything else works fine. :thinking:

Update 2: I’ve set up an automation that makes HA Voice speak every 5 minutes, and it seems that the connection is lost roughly every 90-120 minutes, for up to 10 minutes each time.
I’ve also tried enabling and disabling various integrations, but the behaviour stays exactly the same. I’ve tried the Ping integration as well (although I’m not completely sure what it does), polling every minute, but according to it my internet connection is stable and the connection to Nabu Casa’s site is never lost.
I’m running out of ideas.
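
One thing that might be worth checking: the Ping integration sends ICMP echo requests, so a green result there only says the host answers pings. It does not say whether DNS lookups or TCP connections on port 443 were working at that moment. A minimal Python sketch of a probe that times those two steps separately is below. The function name `probe` and the result fields are made up for illustration, and the host to test is an assumption that has to be filled in (the post does not say which endpoint the cloud STT/TTS actually uses).

```python
import socket
import time


def probe(host, port=443, timeout=5.0):
    """Time DNS resolution and a TCP connect separately; errors are returned, not raised.

    Hypothetical helper for illustration. Note that `timeout` only applies to the
    TCP connect; socket.getaddrinfo has no timeout parameter of its own.
    """
    result = {"host": host, "port": port, "dns_s": None, "connect_s": None, "error": None}
    try:
        # Step 1: name resolution.
        start = time.monotonic()
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["dns_s"] = time.monotonic() - start

        # Step 2: TCP connect to the first resolved address.
        family, socktype, proto, _canonname, sockaddr = infos[0]
        start = time.monotonic()
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        result["connect_s"] = time.monotonic() - start
    except OSError as exc:
        # socket.gaierror, timeouts and refused connections are all OSError subclasses.
        result["error"] = f"{type(exc).__name__}: {exc}"
    return result
```

Called once a minute from the machine running HA, for example as `probe("example.com")` with the placeholder replaced by the real host, and logged with a timestamp, this would show whether DNS or the connect step is what stalls during the 10-minute windows.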