Voice assistant upload buffering latency?

Hey everyone. This may be expected, but it has seemed odd to me and I think I may have done something wrong. I recently set up a whisper/piper audio pipeline using the Wyoming protocol integration and found that the latency was very noticeable when talking to the assistant. At first I thought it was just underpowered hardware, until I looked at the voice assistant debug panel and saw it was only reporting about 0.1 seconds on average for the voice transcription. After more testing, what I eventually found was that the latency correlated with the length of the audio clip. If I send a quick clip of silence, it's almost instant, but a 2-second clip has much more noticeable latency (5-8 seconds). My thought is that this has something to do with buffering the audio before sending? Is there any way to improve this piece of it?
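One crude way to see where the time goes (a rough sketch; assumes the whisper container is named `wyoming-whisper`, that you run it under podman, and that you have GNU `date` — adjust for your setup) is to compare your wall clock when you finish speaking against the timestamps on whisper's own log lines:

```shell
# terminal 1: follow whisper's logs with per-line timestamps
podman logs -f --timestamps wyoming-whisper

# terminal 2: print the time right after you stop speaking
date +%H:%M:%S.%N
```

If whisper's "processing" line only appears several seconds after your timestamp, the delay is happening upstream of the transcription itself.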

The overview of my setup: I have two NixOS hosts as Proxmox VMs. One runs HA in Podman, with nginx as the proxy for SSL. The second NixOS host runs my 'AI' stuff with a GTX 1080 passed through; that's where whisper runs, also in Podman with the NVIDIA Container Toolkit. All of this is connected over Tailscale. I've tested bypassing nginx and Tailscale by hitting the Tailscale DNS name directly and also my local IP address directly, and the latency is there either way. Another point that makes me think the transcription itself is not the issue: when I tail the whisper logs, the 'processing' and 'done' lines are almost imperceptibly close to each other.
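For reference, the whisper container is roughly along these lines (a sketch from memory, not my exact unit file: the image name, port, and model flags are the `rhasspy/wyoming-whisper` defaults as I understand them, and `--device nvidia.com/gpu=all` assumes the NVIDIA Container Toolkit's CDI spec has been generated on the host; use whatever GPU-capable image you actually run):

```shell
podman run -d --name wyoming-whisper \
  --device nvidia.com/gpu=all \
  -p 10300:10300 \
  -v whisper-data:/data \
  docker.io/rhasspy/wyoming-whisper \
  --model small --language en
```

HA's Wyoming integration then just points at the host on port 10300.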

Happy to answer any follow-up questions or share config examples. I really don't modify HA much beyond its defaults, except that I use a Postgres instance for the recorder integration. Pretty much everything else in configuration.yaml is untouched.
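For completeness, the one non-default bit is the recorder pointing at Postgres, which looks like this in configuration.yaml (`db_url` is the documented recorder option; the user, password, host, and database name here are placeholders, not my real values):

```yaml
recorder:
  db_url: postgresql://USER:PASSWORD@DB_HOST/homeassistant
```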

Have you updated to the latest version?

Finally figured this out. It was kind of stupid, but also not immediately intuitive: I just had the specs on my Home Assistant VM set too low to encode the audio quickly enough. Once I bumped it from 1 CPU core to 2, and from 2 GB of memory to 4, it got much better. I was so focused on the performance of the VM running the AI workloads, which is where most of the host machine's resources were allocated, that I didn't even think to check the performance graphs of the Home Assistant VM.
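In hindsight the symptom makes sense: on a starved CPU, any per-sample audio work (resampling, encoding, chunking) grows linearly with clip length, so a 2-second clip costs roughly 20x what a 0.1-second one does, which is exactly the "latency scales with clip length" pattern I saw. A toy illustration of that scaling (this is not HA's actual code, just the shape of the argument):

```python
import time

SAMPLE_RATE = 16_000  # whisper-style STT pipelines typically use 16 kHz mono PCM

def fake_encode(seconds: float) -> int:
    """Stand-in for per-sample audio work (resampling/encoding).

    Does a trivial operation per sample and returns the sample count.
    """
    n = int(seconds * SAMPLE_RATE)
    acc = 0
    for i in range(n):
        acc = (acc + i) & 0xFFFF  # trivial per-sample work
    return n

for secs in (0.1, 2.0):
    start = time.perf_counter()
    samples = fake_encode(secs)
    elapsed = time.perf_counter() - start
    print(f"{secs:>4}s clip -> {samples} samples, {elapsed:.4f}s of CPU time")
```

Double the CPU budget and that linear cost halves, which lines up with how much better it got after the resize.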