Best practices for Whisper

Hi folks,

I will get a new mini PC this week with an NVIDIA 2060 and I would like to hear your opinions / best practices for using Whisper.
Currently I am running it on an N100 mini PC with tiny-int8 and beam size 2, and the results are not what I expected. Other models result in processing times of more than 5 seconds.
So should I install HA on the new mini PC (which is used for other stuff) or could I just ‘outsource’ whisper (and use the GPU) to the new PC and keep my current installation?

Lemme know if anything is unclear or missing.

Hey, this is strange, I use medium-int8 on a similar GPU and it's nearly instant. So GPU is the answer :)

I use this docker compose if it helps:

services:
  faster-whisper:
    image: lscr.io/linuxserver/faster-whisper:gpu
    container_name: faster-whisper
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Prague
      - WHISPER_MODEL=medium-int8
      - WHISPER_LANG=en
      - WHISPER_BEAM=5
      - LOG_LEVEL=DEBUG
    volumes:
      - ./faster-whisper/data:/config
    restart: unless-stopped
    network_mode: host
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
                - utility
                - compute

  wyoming-piper-gpu:
    image: slackr31337/wyoming-piper-gpu:latest
    container_name: wyoming-piper-gpu
    environment:
      - PIPER_VOICE=glados
      - PIPER_PROCS=4
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Prague
    volumes:
      - ./wyoming-piper-gpu:/data
    restart: unless-stopped
    network_mode: host
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
                - utility
                - compute

Hey,

thanks for the configs. How do I integrate these external services in HA?

Use the Wyoming Protocol integration. Both Piper and faster-whisper run on localhost, each on its own port. Add both under Wyoming and they will be detected as TTS and STT respectively. They will then appear under the voice assistant options for those categories.
Similarly, if you add openWakeWord, it will appear there too.

But I do not have them locally, they run on another server in the same network

Then just change the IP from localhost to that server's address and make sure your firewall does not block the ports
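
In case it helps, here is a minimal sketch of publishing only the Wyoming port on the GPU machine instead of using host networking. It assumes the faster-whisper image listens on port 10300 (the usual Wyoming default for Whisper, but check your image's docs); the GPU settings from the config above would stay as they are and are left out here for brevity.

```yaml
services:
  faster-whisper:
    image: lscr.io/linuxserver/faster-whisper:gpu
    environment:
      - WHISPER_MODEL=medium-int8
      - WHISPER_LANG=en
    ports:
      # Assumption: 10300 is the port the container serves Wyoming on.
      # Only this port then needs to be allowed through the firewall.
      - "10300:10300"
    restart: unless-stopped
```

In HA you would then add the Wyoming Protocol integration with the GPU machine's IP as host and 10300 as port.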

Works like a charm now. Thanks for your help!

Keep an eye on the linuxserver Piper image too. I believe they will re-add GPU support.

??? wyoming-piper-gpu is running just fine. What are the benefits of switching to the linuxserver version?

  • Increase beam size
  • Use the initial prompt option

@tannisroot what beam size do you recommend? And what is the initial prompt option? I don't see it in the docs. Thanks!

@tannisroot I found online what the initial prompt is, but I could not find anywhere how to enable it with my docker container. Could you please help?

If you are using the wyoming-faster-whisper container, the option is --initial-prompt "" (the prompt text goes inside the quotes).
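
As a rough sketch of how that could look in docker compose, assuming the rhasspy/wyoming-whisper image (which passes extra arguments through to wyoming-faster-whisper) and its default port 10300. The service name, the volume path and the prompt text are made up for illustration, and this image runs on CPU unless you use a GPU-enabled build.

```yaml
services:
  wyoming-whisper:
    image: rhasspy/wyoming-whisper
    # List form avoids shell quoting problems with the prompt text.
    command:
      - --model
      - medium-int8
      - --language
      - en
      - --beam-size
      - "5"
      - --initial-prompt
      # Hypothetical prompt: put words and phrases you expect to say here.
      - "Turn on the living room lights. Set the thermostat to 21 degrees."
    volumes:
      - ./whisper-data:/data
    ports:
      - "10300:10300"
    restart: unless-stopped
```

Whether the linuxserver image exposes the same option through an environment variable is something to check in its docs.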