I followed the guide “$13 voice assistant for Home Assistant” and first set up local wyoming-whisper and wyoming-piper containers. I’m now trying Whisper in French, and the word recognition is terrible. I tried multiple models, and even with the best model available (regardless of processing time), Whisper keeps making up words that don’t exist. I couldn’t trigger a single action in HA with it.
I get that French is probably much less accurate than English. But is there a way to improve Whisper’s recognition? In its current state, it is not usable on my server. Are there things I can do?
I could switch to an online service like Google STT, but that defeats the whole idea of having everything self-hosted. Are there services that are better on the privacy side?
Thanks in advance for any answer and have a great day.
If it’s a model problem, this won’t help much, but if it’s a poor audio recording issue, resulting in poor recognition, you can try this in HA’s configuration.yaml:
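For reference, the option I believe is meant here is `debug_recording_dir` under `assist_pipeline`. This is a sketch from memory, so check it against the current HA docs. The folder path is only an example and is my assumption:

```yaml
# configuration.yaml
# Assumption: /share/assist_pipeline is just an example folder, use any path HA can write to.
assist_pipeline:
  debug_recording_dir: /share/assist_pipeline
```

Restart HA after changing it.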
You will get the wav files of what Assist records. You can set whatever folder you want, of course. Don’t forget to remove the option afterwards, since it creates a lot of files…
Thanks for the tip. I just tried it and the audio sounds fine. Not perfect, but fine.
EDIT: I don’t understand why, but the audio debug stopped working. I left the option in my config, but it no longer creates an audio file when I ask Assist something. I tried disabling and re-enabling the debug option, but no luck.
Well, if you know enough Python, you could set up a simple test with faster-whisper directly: your recorded audio (not sure what you used?) vs. a better recording (if you have a good mic you can record with manually). That would give you a definitive answer to the question of recognition vs. recording quality.
I assume you tried increasing beam_size? I’ve found 5 with the small-int8 model to be the best performance/quality combo on my machine. That’s with English though; it’s far from perfect, but decent. I’ve learnt that a few words are almost never recognized properly (whether that’s down to my accent or poor recognition, who knows…); I use aliases to avoid them.
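Here is a rough sketch of what such a faster-whisper test could look like. It assumes `pip install faster-whisper`, and the helper names (`transcribe`, `word_error_rate`), the default model size and the file names are all made up for illustration. The idea is to transcribe both recordings of the same sentence and compare each result to what you actually said:

```python
def transcribe(path, model_size="small", language="fr", beam_size=5):
    """Transcribe one audio file with faster-whisper (imported lazily)."""
    from faster_whisper import WhisperModel  # requires: pip install faster-whisper

    # int8 on CPU roughly mirrors the "small-int8" setup mentioned above.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path, language=language, beam_size=beam_size)
    # `segments` is a generator; joining it is what actually runs the decoding.
    return " ".join(segment.text.strip() for segment in segments)


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,                             # word missing from hypothesis
                current[j - 1] + 1,                          # extra word in hypothesis
                previous[j - 1] + (ref_word != hyp_word),    # substitution (or match)
            ))
        previous = current
    return previous[-1] / max(len(ref), 1)
```

Usage would be something like `word_error_rate("allume la lumière du salon", transcribe("assist_recording.wav"))`, then the same with your good-mic recording (both file names are placeholders). If the two scores are similar, it’s the model; if the good recording scores much better, it’s the audio. Note that punctuation is not stripped, so keep the reference sentence simple.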
The only thing that should stop the debug recording thread is the end of the pipeline. Otherwise, the recording could be skipped because of an exception… did you look at the HA logs?