Boosting Speed: Instant Audio Feedback in Home Assistant

I am currently using a HomePod to issue commands to my home automation setup, specifically to control a fan via a SwitchBot. The main problem is not the actuation delay (the fan takes approximately 13 seconds to respond) but the latency of the audio feedback from my speaker, which is about 8 seconds. This latency nearly nullifies the usefulness of the feedback, since its purpose is to give immediate confirmation that the command is being executed.

In my attempts to resolve this issue, I have explored various methods to expedite the auditory response. Ideally, leveraging a local audio file for instant playback would address my requirement. However, I have encountered difficulties in implementing this solution effectively and have thus resorted to using text-to-speech (TTS) services. It appears that the TTS processing time significantly contributes to the observed delay.

For a comprehensive understanding, I have documented the automation sequence (alias "Fläkten", Swedish for "the fan", using a `platform: state` trigger) and made it accessible via Pastebin: Automation Pastebin

Additionally, a visual demonstration of the entire process is available on YouTube: https://www.youtube.com/shorts/WQNRbYLJm2c

I am seeking assistance to refine my approach, ensuring that the auditory feedback is delivered with minimal delay, thereby enhancing the responsiveness and user experience of my home automation setup.

Your expertise and suggestions on how to achieve this would be greatly appreciated.

I managed to get it to run from a local file. The code I changed, which now uses `media_player.play_media` with a `media-source:` content ID, can be found on Pastebin: Pastebin updated local
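For anyone following along, here is a minimal sketch of what playing a local file through `media_player.play_media` can look like. The entity ID `media_player.homepod` and the file name `fan_off.mp3` are assumptions for illustration and are not taken from the Pastebin; the file is assumed to sit in Home Assistant's local media folder.

```yaml
# Hypothetical entity and file names; adjust to your own setup.
service: media_player.play_media
target:
  entity_id: media_player.homepod   # assumed entity ID of the speaker
data:
  # media-source URI pointing at a file in the local media directory
  media_content_id: media-source://media_source/local/fan_off.mp3
  media_content_type: music
```

Because the file is already on the Home Assistant host, no speech has to be generated before playback starts, which is where the TTS approach loses time.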

That cut the delay by 1-2 seconds, but it is still not the instant response I was hoping for.

Rarely are you going to get instant results when using cloud-based services. The chain here is: Siri (via Apple’s cloud servers) → [however you are hosting HA] → SwitchBot (Bluetooth or cloud, depending on your setup) → back to Siri again. With TTS and/or media files, you also pay for the upload to the media player (the HomePod) through the cloud, and for TTS there is the additional time to generate the audio before it can be uploaded. Depending on several conditions (latency of your own internet connection, latency of Apple’s cloud, latency of SwitchBot’s cloud, etc.), I doubt you’re going to get below 6-8 seconds round trip time.

You could go with a local speaker (something like a Sonos or other type of local speaker). That’s about the only way I can think of that will reduce that audio latency. I’ve also found that Alexa devices are slightly better in terms of cloud latency compared to Apple’s cloud. My commands to Alexa are always faster than my commands to Siri.

If I’m understanding it correctly, isn’t everything local? When I say the command, Siri toggles a HomeKit switch that is integrated in HA, so that part happens right away. The speaker is also integrated in HA as a media player. So it should be local?

The strange part, which I only noticed now, is that if I open the automation and run just the voice step on its own, it takes 3-4 seconds for the speaker to say it. That would have been perfect.

I don’t actually care how long it takes until the fan turns off, as I know Bluetooth SwitchBots are slow. I just want to know whether it is working or having trouble.

Solved it.

If I put the sound in a script and make calling that script the first thing the automation does, it takes 5-6 seconds to play the audio, and by then I already know that my slow fan will be turned off :slight_smile: It gives me less anxiety about whether my automations are working or not :yum:
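A rough sketch of the arrangement described above, for anyone who wants to try it. All entity IDs, the script name, and the file name are hypothetical; only the alias and the state trigger come from the original automation. It uses `script.turn_on`, which starts the script and lets the automation continue without waiting for the sound to finish. Whether the original setup calls the script this way is an assumption.

```yaml
# configuration.yaml style sketch; all names below are illustrative.
script:
  fan_feedback_sound:
    alias: Fan feedback sound
    sequence:
      # Play the pre-generated confirmation sound from local media
      - service: media_player.play_media
        target:
          entity_id: media_player.homepod        # assumed speaker entity
        data:
          media_content_id: media-source://media_source/local/fan_off.mp3
          media_content_type: music

automation:
  - alias: Fläkten
    trigger:
      - platform: state
        entity_id: input_boolean.fan_toggle      # assumed HomeKit-exposed toggle
    action:
      # First action: start the sound script without blocking the automation
      - service: script.turn_on
        target:
          entity_id: script.fan_feedback_sound
      # Then send the (slow) Bluetooth command to the SwitchBot
      - service: switch.turn_off
        target:
          entity_id: switch.fan_switchbot        # assumed SwitchBot entity
```

Starting the sound before the SwitchBot command means the audio is not queued behind the slow Bluetooth call.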

Putting the sound file in a script is what sped it up.

If anyone wants to try this themselves, I would recommend this site. It’s free and offers many languages, volume control, etc. I would also recommend using MP3 instead of WAV, as it is a little easier on the system performance-wise, so it might play sooner:

Narakeet - Easily Create Voiceovers and Narrated Videos Using Realistic Text to Speech! (the website I described, where I make my audio files)