Assist: Rhasspy-Speech for Speech-to-Text

pipip · December 23, 2024, 10:03am

The latest voice blog post mentioned a new speech-to-text option, Rhasspy-Speech.

I have huge delays with Whisper due to limited hardware. Since standalone Rhasspy worked great for me, I would really like to use the new Wyoming-Rhasspy-Speech.

So far I have not managed to run it via Docker:

docker-compose.yml
services:
  rhasspy-speech:
    container_name: rhasspy-speech
    image: rhasspy/wyoming-rhasspy-speech:1.0.0
    volumes:
      - $PWD/models:/models
      - $PWD/train:/train
    ports:
      - 10300:10300
    restart: unless-stopped
    logging:
      driver: journald
I put an extracted model from here into models (rhasspy/rhasspy-speech at main).

Startup fails with:

WARNING:Skipping model en_US-rhasspy (not trained)
rhasspy-speech | WARNING:No trained models found.

Any ideas what I missed?
I believe I also need to provide a sentences.yml somewhere?

Cadster · December 23, 2024, 10:17am

I have not tried it myself, but I picked up this docker-compose example:

services:
  rhasspy-speech:
    container_name: rhasspy-speech
    image: "rhasspy/wyoming-rhasspy-speech:1.0.0"
    restart: unless-stopped
    volumes:
      - "./config/models:/models"
      - "./config/training:/training"
    ports:
      - "10300:10300"
      - "8099:8099"
    command:
      - "--hass-token=your_token_here"
      - "--hass-websocket-uri=ws://home_assistant_url:8123/api/websocket"

Here's a link to the source (Discord).

8-1-2025: updated the link

pipip · December 23, 2024, 10:27am

Thanks. I manually built the Docker image to get v1.4.3 and used the YAML you provided. It works now, thanks.

Pkkrusty · December 24, 2024, 7:43am

How are your results? Speed compared to whisper? Accuracy compared to whisper?

pipip · December 24, 2024, 9:18am

The first tests are great. I did not do any rigorous comparison and can only report the “felt” improvement. Accuracy is very good and it's much faster than Whisper (tiny-int8) for me. You need to manually provide the sentences and train, though…

CvH · December 25, 2024, 4:15pm

I have not yet tested that implementation, but I tested Rhasspy in the past: basically 100% success and very fast on garbage hardware (RPi4, iirc) with a 1€ USB mic.

With Whisper I get a lot of false detections, around 50% (could be language related), even with the medium model, because the recognised words are slightly off.
For example, it detects “of” or “offer” instead of “off”, so nothing works.

dumbdevice · December 25, 2024, 5:28pm

I have had a pretty bad experience with faster-whisper; it often recognizes words incorrectly. So I am waiting for a working Rhasspy-Speech Docker image to try.

pipip · December 25, 2024, 9:09pm

I just built the Docker image myself.
You can just use the provided Dockerfile and update the version: wyoming-addons/rhasspy-speech/Dockerfile at eb1985688e429d217e0de788567e53f0fea898bc · rhasspy/wyoming-addons · GitHub

dumbdevice · December 26, 2024, 1:19pm

Thanks, I'm going to wait for the official image update.

Pkkrusty · December 29, 2024, 10:00pm

For me too, Rhasspy has seemed more accurate and faster, going by first impressions.

ronnie_j · January 4, 2025, 6:48pm

Hi,

I’m unable to open the Discord link.

I have the same error in my container when triggering voice in the Companion app. I have extracted a model into config/models:

~/docker/wyoming-rasspy $ ls config/models/nl_NL-cgn/
config.json  frequent_words.txt  g2p.fst  lexicon.db  LICENSE  model  phoneme_examples.txt  README.md  SOURCE

rhasspy-speech  | INFO:root:Ready
rhasspy-speech  |  * Serving Flask app 'rhasspy_speech'
rhasspy-speech  |  * Debug mode: off
rhasspy-speech  | INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
rhasspy-speech  |  * Running on all addresses (0.0.0.0)
rhasspy-speech  |  * Running on http://127.0.0.1:8099
rhasspy-speech  |  * Running on http://172.18.0.2:8099
rhasspy-speech  | INFO:werkzeug:Press CTRL+C to quit
rhasspy-speech  | WARNING:root:Skipping model nl_NL-cgn (not trained)
rhasspy-speech  | WARNING:root:No trained models found.
rhasspy-speech  | ERROR:root:No model selected

In HA I have configured the ‘faster-whisper’ integration.

I would like to improve my test results with the Dutch language.

pipip · January 4, 2025, 8:07pm

@ronnie_j

Open the web UI on port 8099 to download the model and train your custom commands.
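
If the web UI does not load, a quick reachability check can rule out a port-mapping problem first. This is only a minimal sketch: it assumes the container's ports 8099 (web UI) and 10300 (Wyoming) are published on the Docker host as in the compose example earlier in the thread, and `localhost` is a placeholder for that host. The helper name `port_is_open` is made up for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    host = "localhost"  # assumption: replace with the address of your Docker host
    for name, port in (("web UI", 8099), ("Wyoming", 10300)):
        state = "reachable" if port_is_open(host, port) else "not reachable"
        print(f"{name} on {host}:{port} is {state}")
```

An open port only shows that the mapping works; the model still has to be downloaded and trained in the web UI before the "not trained" warning goes away.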

Ladidaa · January 17, 2025, 12:57pm

Hi,
I tried “Whisper” and also “Rhasspy-Speech”.
I am extremely impressed with the accuracy and incredible speed of Rhasspy-Speech.

When I use “Whisper”, it generally understands me, but the resulting text is slightly different each time. For example, these are the results of three attempts at starting my WebRadio script in German:
“Bayern-1-Webradio starten”
“Bayern eins Webradio starten”
“Bayern 1 Web Radio starten”
→ Home Assistant never understood me, because the script is called “Bayern 1 Webradio starten”.
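
To illustrate why these three transcripts fail an exact match even though they differ only in hyphens, spacing, and a spelled-out number, here is a small sketch. It is not how Home Assistant matches sentences; `normalize` and the `NUMBER_WORDS` table are hypothetical helpers made up for this example.

```python
import re

# Assumption: a tiny hand-written table of German number words, for illustration only.
NUMBER_WORDS = {"eins": "1", "zwei": "2", "drei": "3"}


def normalize(text: str) -> str:
    """Build a comparison key: lowercase, number words as digits, separators removed."""
    words = re.split(r"[\s\-]+", text.lower().strip())
    words = [NUMBER_WORDS.get(word, word) for word in words if word]
    return "".join(words)


target = "Bayern 1 Webradio starten"
transcripts = [
    "Bayern-1-Webradio starten",
    "Bayern eins Webradio starten",
    "Bayern 1 Web Radio starten",
]

for transcript in transcripts:
    exact = transcript == target
    loose = normalize(transcript) == normalize(target)
    print(f"{transcript!r}: exact={exact}, normalized={loose}")
```

None of the three strings equals the script name exactly, but all three collapse to the same key once separators and number words are normalized.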

Rhasspy-Speech, however, does not understand “new” words.
When I want to name a timer or add an item to the to-do / shopping list, it does not understand me.

Would this improvement be possible?

First try speech recognition with Rhasspy-Speech. When it does not recognize the sentence, the audio is automatically handed over to Whisper, which then recognizes the sentence and returns it to Home Assistant.

This would be an awesome feature!
It would combine the best of both worlds and would be fast but also flexible.

Thank you for your help in advance!
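
The fallback proposed in this post can be sketched roughly as follows. This is not an existing Home Assistant or Wyoming feature: `Transcriber` and `transcribe_with_fallback` are hypothetical names, and the sketch assumes the fast engine signals "no match" by returning `None` or an empty string.

```python
from typing import Callable, Optional

# Hypothetical interface: takes raw audio bytes, returns text or None if nothing matched.
Transcriber = Callable[[bytes], Optional[str]]


def transcribe_with_fallback(
    audio: bytes, primary: Transcriber, fallback: Transcriber
) -> Optional[str]:
    """Try the fast, closed-vocabulary engine first; use the open one only on a miss."""
    text = primary(audio)
    if text:  # non-empty result: the trained sentence was recognized
        return text
    return fallback(audio)


if __name__ == "__main__":
    # Stand-in stubs for the two engines; real code would call the actual services.
    def fast_engine(audio: bytes) -> Optional[str]:
        return "Bayern 1 Webradio starten" if audio == b"known" else None

    def open_engine(audio: bytes) -> Optional[str]:
        return "free-form transcript"

    print(transcribe_with_fallback(b"known", fast_engine, open_engine))
    print(transcribe_with_fallback(b"other", fast_engine, open_engine))
```

The slow engine only runs on a miss, so trained commands keep the fast path while free-form input such as timer names still gets a transcript, at the cost of the slower engine's latency in that case.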

dumbdevice · January 17, 2025, 4:02pm

The same idea as the LLM fallback that we have now?

How did you install Rhasspy-Speech? Via the add-on?
By “whisper”, do you mean faster-whisper?

Ladidaa · January 21, 2025, 4:35pm

Yes, I installed Rhasspy-Speech via the add-on.

Yes, by whisper I mean “faster-whisper”, running as a Docker container on my Intel N100 Unraid “server”.
I tried HomeLLM, but the N100 is much too weak for that.

Faster-whisper runs “ok”, with a response in 5 to 10 seconds.