My HAOS is running as a VM with 8 GB of RAM. The host is an i5 with 32 GB running Debian 13. Voice hardware is Voice PE (newly installed). Not much else is running on this server; it’s mainly for NAS storage. For a simple “what’s the time” query, HA is taking around 7 or 8 seconds to respond. Is this typical performance?
Look at the debug view under Voice Assistant and check how long each step of the query takes to complete.
You need to have Prefer Handling Locally checked if you don’t want the query sent to an LLM.
7-8 seconds would be pretty normal if you are doing a call to an LLM.
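If it helps to compare the steps side by side, here is a minimal Python sketch that turns a list of start/end events into per-stage durations. It assumes the debug output can be copied out as events with a `type` such as `stt-start` / `stt-end` and an ISO `timestamp`; the sample events and the helper name `stage_durations` are made up for illustration and are not measurements from this setup.

```python
from datetime import datetime

# Hypothetical sample events (made-up timestamps, for illustration only).
events = [
    {"type": "run-start", "timestamp": "2025-01-01T12:00:00.000+00:00"},
    {"type": "stt-start", "timestamp": "2025-01-01T12:00:00.050+00:00"},
    {"type": "stt-end", "timestamp": "2025-01-01T12:00:05.900+00:00"},
    {"type": "intent-start", "timestamp": "2025-01-01T12:00:05.910+00:00"},
    {"type": "intent-end", "timestamp": "2025-01-01T12:00:06.100+00:00"},
    {"type": "tts-start", "timestamp": "2025-01-01T12:00:06.110+00:00"},
    {"type": "tts-end", "timestamp": "2025-01-01T12:00:07.200+00:00"},
    {"type": "run-end", "timestamp": "2025-01-01T12:00:07.210+00:00"},
]


def stage_durations(events):
    """Return seconds spent in each stage, keyed by stage name."""
    starts = {}
    durations = {}
    for event in events:
        # "stt-start" -> stage "stt", phase "start"
        stage, _, phase = event["type"].rpartition("-")
        ts = datetime.fromisoformat(event["timestamp"])
        if phase == "start":
            starts[stage] = ts
        elif phase == "end" and stage in starts:
            durations[stage] = (ts - starts[stage]).total_seconds()
    return durations


durations = stage_durations(events)
# Ignore the overall "run" entry when looking for the slowest single stage.
slowest = max((s for s in durations if s != "run"), key=durations.get)

for stage, seconds in durations.items():
    print(f"{stage}: {seconds:.2f} s")
print(f"slowest stage: {slowest}")
```

Whichever stage dominates tells you where to look: speech-to-text, intent handling, or text-to-speech.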
There are so many possible ways to use voice assistant.
Are you using local STT and/or TTS?
Or Nabu Casa Cloud or another cloud service?
Is your Assist LLM-based (and if so, which LLM) or not? And if you use one, do you have ‘prefer local’ activated?
Response times can vary a lot depending on these settings, the services used, and the hardware.
An example from my end:
I use Nabu Casa Cloud for STT and TTS, and an LLM-only setup (no prefer locally) with a fast online host for LLM models instead of the large providers like OpenAI and Google, which are often not optimized for low-latency answers (time to first token).
Asking ‘What’s the time’ (from the end of my speech until the voice response begins) takes about 2 seconds.
But that might not help you if you’re after a completely different solution.
Sorry for leaving out detail.
Voice Assistants / Speech-to-text has two selections:
- Home Assistant
- Focused local assistant
Both are set to use speech-to-phrase. Whichever one I set as preferred, my query (what’s the time) takes around 7 seconds to be answered. I don’t have an LLM connection.