edit: Sorry, can't post all the proper links as I'm restricted as a new forum signup.
This was one of the few threads I found online whilst searching high and low for the same info so I’m reposting here in case it helps anyone else. Perhaps someone more eloquent than me could write a guide.
The key info for me was in a thread on these forums where someone mentioned passing
`--streaming` to the Piper config. They were talking about Docker, but I just worked through learning how to edit files on native HAOS. Maybe search for "documentation-to-test-streaming-tts-feature".
Here’s what I did to get streaming working.
In Piper 1.6.1 the release notes showed they had enabled much faster streaming of LLM responses. Essentially, speech generation begins before the LLM has finished fully formulating its response. The result is that longer responses and conversational interactions go from quite slow to usably fast. My experience prior to this was:
- Turning on a light might take up to 10 seconds or slightly longer
- Anything more than a simple command would take up to 90 seconds to start responding.
This was with "prefer local processing" enabled, and when waking up my main PC to run the LLMs, the first query of the day required loading the LLM into memory, which also slowed things down. That aspect is still true, but you can tell Ollama to keep the LLM loaded with a keep-alive timeout of -1 for infinite.
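As a sketch of one way to handle the cold-start problem: besides setting keep-alive in the Ollama integration's options, you could preload the model from Home Assistant via Ollama's HTTP API using a `rest_command` in configuration.yaml. The host address, command name, and model tag below are assumptions; adjust them to your setup.

```yaml
# Sketch, untested: preload the model on the Ollama host and keep it
# resident indefinitely (keep_alive: -1). Host IP and model tag are
# placeholders; replace with your own.
rest_command:
  preload_llm:
    url: "http://192.168.1.50:11434/api/generate"
    method: POST
    content_type: "application/json"
    payload: '{"model": "ministral-3:3b", "keep_alive": -1}'
```

You could then call `rest_command.preload_llm` from a startup automation so the model is warm before your first query of the day.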
Here’s my home setup:
- Home Assistant Voice PE
- Satellite1 from FutureProofHomes
- Raspberry Pi 5 (16 GB)
- Mid-tier local PC with LLM(s) running on an RX 7900 XTX (24 GB)
All using today's very latest versions of software and firmware, except for HAOS which is on 2025.11.3.
I've not done any extensive benchmarks across different LLMs, but I did notice an improvement when using smaller models; of course, YMMV depending on your needs and hardware. For managing a home and dealing with simple longer queries I'm using ministral-3:3b (ollama.com/library/ministral-3), which is only a 3 GB download. I'll probably go to an AMD 395+ mini PC when budget allows.
Step 1 - Make sure your HAOS is up to date and backed up. Back it up. I really mean it.
Step 2 - Read this article about editing configuration.yaml: home-assistant.io/docs/configuration. Then install the File editor add-on:
home-assistant.io/common-tasks/os/#installing-and-using-the-file-editor-add-on. That page describes using Samba or Studio Code Server; if you already have either of those you could edit the same config with them instead. YMMV.
Step 3 - Open File editor and, from the Home Assistant root, find configuration.yaml; click on it and an editor pane will appear on the right. See picture below.
Step 4 - Add the content from line 13 onwards; the rest was already there. Don't edit anything else. Don't worry if your configuration.yaml looks short, that's normal.
```yaml
# Enable streaming for Piper TTS
tts:
  - platform: piper
    model: en_US-piper
    streaming: true
```
If you get it right there's a green tick in the top right as an indicator. Click the adjacent save button and restart the whole HAOS so all the scripts/add-ons get a fresh boot. If you then test a longer query, e.g. "How do I stroke a cat" or "Tell me a short story", you should get a response generated much faster than before. If your LLM isn't loaded, the first attempt will have to wait for it to load, so you might want to hold it in memory by setting Ollama's keep-alive option to -1.
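If you want to test streaming without going through a voice satellite, you can also fire a long TTS request directly from Developer Tools → Actions with `tts.speak`. The entity IDs below are assumptions; swap in your own TTS entity and media player.

```yaml
# Assumed entity IDs (tts.piper, media_player.living_room); replace with yours.
action: tts.speak
target:
  entity_id: tts.piper
data:
  media_player_entity_id: media_player.living_room
  message: "Tell me a short story about a cat."
```

With streaming enabled, audio should start playing on the media player well before the full message would have finished synthesizing.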