Assist: Rhasspy-Speech for Speech-to-Text

The latest voice blog post mentioned a new speech-to-text option: Rhasspy-Speech.

I have huge delays with Whisper due to limited hardware. Since standalone Rhasspy worked great for me, I would really like to use the new Wyoming-Rhasspy-Speech.

So far I have not managed to run it via Docker:

  • docker-compose.yml
    services:
      rhasspy-speech:
        container_name: rhasspy-speech
        image: rhasspy/wyoming-rhasspy-speech:1.0.0
        volumes:
          - $PWD/models:/models
          - $PWD/train:/train
        ports:
          - 10300:10300
        restart: unless-stopped
        logging:
          driver: journald
  • Put an extracted model from here into the models directory (rhasspy/rhasspy-speech at main)

Startup fails with “WARNING:Skipping model en_US-rhasspy (not trained)
rhasspy-speech | WARNING:No trained models found.”

Any ideas what I missed?
I believe I also need to provide a sentences.yml somewhere?

I have not tried it myself, but I picked up this docker-compose example:

services:
  rhasspy-speech:
    container_name: rhasspy-speech
    image: "rhasspy/wyoming-rhasspy-speech:1.0.0"
    restart: unless-stopped
    volumes:
      - "./config/models:/models"
      - "./config/training:/training"
    ports:
      - "10300:10300"
      - "8099:8099"     
    command: 
      - "--hass-token=your_token_here"
      - "--hass-websocket-uri=ws://home_assistant_url:8123/api/websocket"

Here's a link to the source. (discord)
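
Once the container is up, a quick way to confirm that the published ports are actually reachable is a plain TCP connection test. The following is only a minimal sketch: the helper name port_is_open is made up for illustration, and it assumes the container runs on the same machine (replace "localhost" otherwise). It only checks that something is listening on the ports from the compose file above; it does not speak the Wyoming protocol or verify that a model has been trained.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # Assumption: the container runs locally; the ports are taken from the
    # compose example (10300 and 8099).
    for port in (10300, 8099):
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"port {port}: {state}")
```

If 10300 is reported as not reachable, the container has probably exited or the port mapping is missing, so the container logs are the next place to look.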

Thanks. I manually built the Docker image to get v1.4.3 and used the YAML you provided. It works now :slight_smile:


How are your results? How do speed and accuracy compare to Whisper?

First tests are great. I did not do any professional comparison and can only report the "felt" improvement. Accuracy is very good and it's much faster than Whisper (tiny-int8) for me. You need to manually provide the sentences and train it, though…