Hello everybody,
I have a weird issue, and I hope someone can suggest how to get to the root cause of it.
The issue, as the title suggests, is with STT/TTS, or possibly with the cloud or my internet connection. I'm not 100% sure.
It shows up in one of two ways:
a) a tts.cloud_say command is sent to my network speaker; the speaker goes into Playing mode, but no sound comes out
b) I give a voice command to my HA Voice device; it goes into Listening mode and stays there for up to several minutes, then recovers and does what it was told
Either way, when it happens (several times per day that I’ve noticed, possibly more), nothing related to STT/TTS works, including the Assist app on my Android phone. The internet connection itself is fine at those moments: someone is watching YouTube or browsing while STT/TTS is unresponsive, and the connection is not under heavy load.
I’m running Core 2025.4.4 and Supervisor 2025.05.3 on a Raspberry Pi, but it doesn’t seem to be version dependent; it has been going on for quite some time now. I have a Fritz! network modem with the official AVM FRITZ!Box Tools integration, and a Nabu Casa account that reports Connected all the time.
So my question is: does anybody have any idea where to look? I haven’t seen anything relevant in any log (the Fritz log is the most complicated one, so I may have missed something there, if it’s even related).
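
One thing that might help narrow this down is a small probe left running on another machine in the same network, logging whether an outbound TCP connection succeeds and how long it takes, so that any stalls can be lined up against the times STT/TTS hangs. Below is a minimal Python sketch of that idea. The function names, the 30-second interval, and any target host are placeholders I made up, not anything from Home Assistant or Nabu Casa, and a successful TCP connect only shows basic reachability, not that the cloud STT/TTS service itself is healthy.

```python
import socket
import time
from datetime import datetime, timezone
from typing import Optional, Tuple


def probe(host: str, port: int, timeout: float = 5.0) -> Tuple[bool, float]:
    """Attempt one TCP connection; return (succeeded, seconds elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        ok = False
    return ok, time.monotonic() - start


def log_line(host: str, port: int, ok: bool, elapsed: float) -> str:
    """Format one result with a UTC timestamp, comparable to the HA log times."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    status = "ok" if ok else "FAILED"
    return f"{stamp} {host}:{port} {status} {elapsed:.3f}s"


def watch(host: str, port: int, interval: float = 30.0,
          rounds: Optional[int] = None) -> None:
    """Probe repeatedly and print one line per attempt (forever if rounds is None)."""
    done = 0
    while rounds is None or done < rounds:
        ok, elapsed = probe(host, port)
        print(log_line(host, port, ok, elapsed), flush=True)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling something like watch("example.com", 443) would print one line every 30 seconds until interrupted; example.com is only a stand-in for whichever host is worth testing.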
My head is currently all over the place, so apologies if it’s a bit incoherent, or if I’ve forgotten to add some important detail.
I’ll appreciate any help. Thanks in advance!
Below is part of the HA Voice log, with timestamps for reference (note the gap of roughly six and a half minutes between stt-start at 08:58:29 and stt-vad-start at 09:04:57):
> stage: done
> run:
> pipeline: 01jmyms427bk6fxw6db0yze60s
> language: en
> conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
> tts_output:
> token: H1ZUBuMTTIrJoC7yVYni8w.flac
> url: /api/tts_proxy/H1ZUBuMTTIrJoC7yVYni8w.flac
> mime_type: audio/flac
> events:
> - type: run-start
> data:
> pipeline: 01jmyms427bk6fxw6db0yze60s
> language: en
> conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
> tts_output:
> token: H1ZUBuMTTIrJoC7yVYni8w.flac
> url: /api/tts_proxy/H1ZUBuMTTIrJoC7yVYni8w.flac
> mime_type: audio/flac
> timestamp: "2025-05-11T08:58:29.799873+00:00"
> - type: stt-start
> data:
> engine: stt.home_assistant_cloud
> metadata:
> language: en-US
> format: wav
> codec: pcm
> bit_rate: 16
> sample_rate: 16000
> channel: 1
> timestamp: "2025-05-11T08:58:29.800678+00:00"
> - type: stt-vad-start
> data:
> timestamp: 1740
> timestamp: "2025-05-11T09:04:57.973846+00:00"
> - type: stt-vad-end
> data:
> timestamp: 3620
> timestamp: "2025-05-11T09:04:58.071016+00:00"
> - type: stt-end
> data:
> stt_output:
> text: Switch to PC.
> timestamp: "2025-05-11T09:04:58.566546+00:00"
> - type: intent-start
> data:
> engine: conversation.home_assistant
> language: en
> intent_input: Switch to PC.
> conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
> device_id: d4b98d174b2001bca1f52c0aa49ffb7b
> prefer_local_intents: true
> timestamp: "2025-05-11T09:04:58.567326+00:00"
> - type: intent-end
> data:
> processed_locally: true
> intent_output:
> response:
> speech:
> plain:
> speech: Done
> extra_data: null
> card: {}
> language: en
> response_type: action_done
> data:
> targets: []
> success: []
> failed: []
> conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
> continue_conversation: false
> timestamp: "2025-05-11T09:04:58.601916+00:00"
> - type: tts-start
> data:
> engine: tts.home_assistant_cloud
> language: en-GB
> voice: OliviaNeural
> tts_input: Done
> timestamp: "2025-05-11T09:04:58.603146+00:00"
> - type: tts-end
> data:
> tts_output:
> media_id: >-
> media-source://tts/tts.home_assistant_cloud?message=Done&language=en-GB&tts_options=%7B%22audio_output%22:%22mp3%22,%22voice%22:%22OliviaNeural%22,%22preferred_format%22:%22flac%22,%22preferred_sample_rate%22:48000,%22preferred_sample_channels%22:1,%22preferred_sample_bytes%22:2%7D
> token: H1ZUBuMTTIrJoC7yVYni8w.flac
> url: /api/tts_proxy/H1ZUBuMTTIrJoC7yVYni8w.flac
> mime_type: audio/flac
> timestamp: "2025-05-11T09:04:58.606172+00:00"
> - type: run-end
> data: null
> timestamp: "2025-05-11T09:04:58.607281+00:00"
> stt:
> engine: stt.home_assistant_cloud
> metadata:
> language: en-US
> format: wav
> codec: pcm
> bit_rate: 16
> sample_rate: 16000
> channel: 1
> done: true
> stt_output:
> text: Switch to PC.
> intent:
> engine: conversation.home_assistant
> language: en
> intent_input: Switch to PC.
> conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
> device_id: d4b98d174b2001bca1f52c0aa49ffb7b
> prefer_local_intents: true
> done: true
> processed_locally: true
> intent_output:
> response:
> speech:
> plain:
> speech: Done
> extra_data: null
> card: {}
> language: en
> response_type: action_done
> data:
> targets: []
> success: []
> failed: []
> conversation_id: 01JTZ92DH4SXQE31YZ2W36XZTC
> continue_conversation: false
> tts:
> engine: tts.home_assistant_cloud
> language: en-GB
> voice: OliviaNeural
> tts_input: Done
> done: true
> tts_output:
> media_id: >-
> media-source://tts/tts.home_assistant_cloud?message=Done&language=en-GB&tts_options=%7B%22audio_output%22:%22mp3%22,%22voice%22:%22OliviaNeural%22,%22preferred_format%22:%22flac%22,%22preferred_sample_rate%22:48000,%22preferred_sample_channels%22:1,%22preferred_sample_bytes%22:2%7D
> token: H1ZUBuMTTIrJoC7yVYni8w.flac
> url: /api/tts_proxy/H1ZUBuMTTIrJoC7yVYni8w.flac
> mime_type: audio/flac
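
To make the stall easier to see, the time between consecutive pipeline events can be computed from the timestamps in the log above. This is just a quick Python sketch: the event list is copied by hand from this one run, and the helper names gaps and slowest are my own, not part of Home Assistant.

```python
from datetime import datetime

# (event type, wall-clock timestamp) pairs copied by hand from the log above.
EVENTS = [
    ("run-start", "2025-05-11T08:58:29.799873+00:00"),
    ("stt-start", "2025-05-11T08:58:29.800678+00:00"),
    ("stt-vad-start", "2025-05-11T09:04:57.973846+00:00"),
    ("stt-vad-end", "2025-05-11T09:04:58.071016+00:00"),
    ("stt-end", "2025-05-11T09:04:58.566546+00:00"),
    ("intent-start", "2025-05-11T09:04:58.567326+00:00"),
    ("intent-end", "2025-05-11T09:04:58.601916+00:00"),
    ("tts-start", "2025-05-11T09:04:58.603146+00:00"),
    ("tts-end", "2025-05-11T09:04:58.606172+00:00"),
    ("run-end", "2025-05-11T09:04:58.607281+00:00"),
]


def gaps(events):
    """Return (from_event, to_event, seconds) for each consecutive pair."""
    parsed = [(name, datetime.fromisoformat(ts)) for name, ts in events]
    return [
        (a, b, (tb - ta).total_seconds())
        for (a, ta), (b, tb) in zip(parsed, parsed[1:])
    ]


def slowest(events):
    """Return the consecutive pair with the largest gap."""
    return max(gaps(events), key=lambda g: g[2])


for a, b, seconds in gaps(EVENTS):
    print(f"{a:>14} -> {b:<14} {seconds:10.3f} s")

a, b, seconds = slowest(EVENTS)
print(f"largest gap: {a} -> {b} ({seconds:.3f} s)")
# largest gap: stt-start -> stt-vad-start (388.173 s)
```

For this run, almost all of the delay sits between stt-start and stt-vad-start; every later step finishes in well under a second.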