Whisper speech to text very slow despite allocated cores

GForce2010 · December 15, 2023, 4:36pm

I’ve been trying my hand at the local voice control in HA.

I’ve followed Everything Smart Home’s guide and got to the point where I am just testing it through the Lovelace dashboard on my PC and phone.

The problem I have is that even when the assistant correctly works out what I am saying, the speech-to-text takes in the region of 30 seconds.

My HA instance is running in an LXC container on my Proxmox server. In case it was a processing power issue, I upped the allocated cores to 8 and gave it 8GB of RAM.

When processing a command I can see the CPU usage of the LXC hovering around 50%, which suggests there is plenty of processing power available.

Are there any settings I should be looking at to improve the speed of this? I’ve tried changing the model and have checked that it is set to English.

Thanks.

WallyR · December 15, 2023, 5:10pm

It is probably a single-core process, so the speed of the core is more important than the number of cores.
How directly the process can access the core also has an influence, but I do not know how LXC containers work.
With normal VMs there are a lot of options in the BIOS that should be activated to gain more direct access to the hardware resources.
STT is heavily dependent on floating-point operations, so a graphics card is often WAY better than a CPU.
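
To illustrate why an overall CPU figure can hide one saturated core, here is a minimal Python sketch. The function name `summarize_cores`, the 90% threshold, and the sample readings are all made up for illustration; they are not measurements from this setup.

```python
def summarize_cores(per_core_percent, busy_threshold=90.0):
    """Return (average, busiest, saturated_count) for per-core CPU readings.

    per_core_percent: list of utilisation percentages, one per core.
    busy_threshold:   hypothetical cut-off above which a core counts as saturated.
    """
    average = sum(per_core_percent) / len(per_core_percent)
    busiest = max(per_core_percent)
    saturated = sum(1 for p in per_core_percent if p >= busy_threshold)
    return average, busiest, saturated


# Hypothetical readings for 8 cores: one core pegged, the rest nearly idle.
sample = [100.0, 5.0, 3.0, 4.0, 2.0, 6.0, 3.0, 5.0]
average, busiest, saturated = summarize_cores(sample)

print(f"average across cores: {average:.1f}%")
print(f"busiest core:         {busiest:.1f}%")
print(f"saturated cores:      {saturated}")
```

With these invented numbers the average looks low even though one core has no headroom left. On a real system the per-core readings could come from something like `psutil.cpu_percent(interval=1, percpu=True)`, assuming the `psutil` package is installed.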

GForce2010 · December 28, 2023, 12:33pm

Well, that’s a bit disappointing. I really wasn’t expecting an enterprise server to give worse performance than a Pi 4, seeing as the HA website stated that a Pi 4 normally responds in 8 seconds.

Guess I’m going to be sticking with cloud-based voice control.

Thanks.