The latest voice blog post mentioned a new speech-to-text option: Rhasspy-Speech.
I have huge delays with Whisper due to limited hardware. Since Rhasspy standalone worked great for me, I would really like to use the new Wyoming-Rhasspy-Speech.
First tests are great. I did not do any professional comparison and can only report the "felt" improvement. Accuracy is very good and it's much faster than Whisper (tiny-int8) for me. You need to manually provide the sentences and train tho…
I have not yet tested that implementation, but I tested Rhasspy in the past: basically 100% success and very fast on garbage hardware (RPi4 iirc) + 1€ USB mic.
I get a lot of false detections (~50%) with Whisper (could be language related), even with the medium model, because the recognised words are slightly off.
So it detects “of” or “offer” instead of “off”, and nothing works.
I have had a pretty bad experience with faster-whisper; it often recognizes words wrong. So I am waiting for a working Rhasspy-Speech Docker image to try.
I have the same error in my container when triggering voice in the Companion app (I have extracted a model into the directory listed below):
~/docker/wyoming-rasspy $ ls config/models/nl_NL-cgn/
config.json frequent_words.txt g2p.fst lexicon.db LICENSE model phoneme_examples.txt README.md SOURCE
rhasspy-speech | INFO:root:Ready
rhasspy-speech | * Serving Flask app 'rhasspy_speech'
rhasspy-speech | * Debug mode: off
rhasspy-speech | INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
rhasspy-speech | * Running on all addresses (0.0.0.0)
rhasspy-speech | * Running on http://127.0.0.1:8099
rhasspy-speech | * Running on http://172.18.0.2:8099
rhasspy-speech | INFO:werkzeug:Press CTRL+C to quit
rhasspy-speech | WARNING:root:Skipping model nl_NL-cgn (not trained)
rhasspy-speech | WARNING:root:No trained models found.
rhasspy-speech | ERROR:root:No model selected
In HA I have configured the ‘faster-whisper’ integration.
I would like to improve my tests with the Dutch language.
Hi,
I tried “Whisper” and also “Rhasspy-Speech”.
I am extremely impressed with the accuracy and incredible speed of Rhasspy-Speech.
When I use “Whisper”, it understands me in general, but the text result is a little bit different each time. For example, these are the text results of three tries at starting the script for WebRadio in German:
“Bayern-1-Webradio starten”
“Bayern eins Webradio starten”
“Bayern 1 Web Radio starten”
→ Home Assistant never understood me, because the script is called “Bayern 1 Webradio starten”
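To illustrate why the three results above fail to match, here is a minimal sketch of how such transcripts could be collapsed to one canonical string before comparing them with the script name. This is only an illustration, not something taken from Home Assistant or Whisper: the `normalize` function and the tiny `NUMBER_WORDS` table are made up for this example.

```python
import re

# Hypothetical, deliberately tiny table of German number words (assumption).
NUMBER_WORDS = {"eins": "1", "zwei": "2", "drei": "3"}


def normalize(text: str) -> str:
    """Collapse a transcript to a canonical key for comparison."""
    # Lowercase and split on whitespace or hyphens, so that
    # "Bayern-1-Webradio" becomes separate words.
    words = re.split(r"[\s\-]+", text.lower().strip())
    # Map spelled-out numbers to digits ("eins" -> "1").
    words = [NUMBER_WORDS.get(w, w) for w in words if w]
    # Join without spaces so "Web Radio" and "Webradio" compare equal.
    return "".join(words)


script_name = "Bayern 1 Webradio starten"
transcripts = [
    "Bayern-1-Webradio starten",
    "Bayern eins Webradio starten",
    "Bayern 1 Web Radio starten",
]

for t in transcripts:
    # Each of the three variants yields the same key as the script name.
    print(t, "->", normalize(t) == normalize(script_name))
```

Dropping all spaces is a blunt choice and could make unrelated phrases collide, so a real setup would probably want something more careful.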
But Rhasspy-Speech does not understand “new” words.
When I want to “name” a timer, or add an item to the to-do / shopping list, it does not understand me.
Would this improvement be possible?
First try speech recognition with Rhasspy-Speech; when it does not recognize the sentence, the audio is automatically handed over to Whisper, which then recognizes the sentence and gives it back to Home Assistant.
This would be an awesome feature!
It would combine the best of both worlds and would be fast but also flexible.
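Roughly, the idea could be sketched like this. It is only a sketch under assumptions: the two engines are represented by stand-in functions (`fake_rhasspy`, `fake_whisper`), not the real Rhasspy-Speech or faster-whisper interfaces, and I am assuming the fast engine signals "not recognized" by returning `None` or an empty string.

```python
from typing import Callable, Optional

# A transcriber takes raw audio bytes and returns text, or None if it
# could not recognize anything (assumed convention for this sketch).
Transcriber = Callable[[bytes], Optional[str]]


def transcribe_with_fallback(
    audio: bytes, fast: Transcriber, slow: Transcriber
) -> Optional[str]:
    """Try the fast, fixed-sentence engine first; fall back to the slow one."""
    text = fast(audio)
    if text:  # non-empty result: the fast engine knew the sentence
        return text
    return slow(audio)  # open-vocabulary engine handles everything else


# Stand-ins for the real engines (hypothetical, for illustration only).
def fake_rhasspy(audio: bytes) -> Optional[str]:
    known = {b"radio": "Bayern 1 Webradio starten"}
    return known.get(audio)


def fake_whisper(audio: bytes) -> Optional[str]:
    return "free-form text for " + audio.decode()


print(transcribe_with_fallback(b"radio", fake_rhasspy, fake_whisper))
print(transcribe_with_fallback(b"milk", fake_rhasspy, fake_whisper))
```

Known sentences would get the fast answer, and only the unknown ones (timer names, shopping-list items) would pay the slower Whisper delay.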
Yes, by Whisper I mean “faster-whisper” running as a Docker container on my Intel N100 Unraid “server”.
I tried HomeLLM – but the N100 is much too weak for that.
Faster-whisper is running “ok”, with a response in 5 to 10 seconds.