Leaving this here for anyone who is curious… Ollama 0.8.0 (released May 28th 2025) now supports streaming text responses for models with tool calling!
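For anyone wondering what this looks like in practice, here's a minimal sketch of a request body for Ollama's `/api/chat` endpoint with both `stream` and `tools` set. The endpoint and field names come from Ollama's REST API; the `get_current_weather` tool and the model name are just made-up examples for illustration:

```python
import json

# Build a /api/chat request body that enables both streaming and tools.
# Before 0.8.0, Ollama would fall back to a non-streamed response when
# "tools" was present; as of 0.8.0 it can stream both text and tool calls.
def build_chat_request(model, user_message, tools):
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
        "stream": True,  # now compatible with tools as of 0.8.0
    }

# Hypothetical example tool in the OpenAI-style function schema Ollama accepts.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}

payload = build_chat_request("qwen2.5", "What's the weather in Paris?", [weather_tool])
print(json.dumps(payload, indent=2))
```

POSTing that payload to `http://localhost:11434/api/chat` should give you a stream of newline-delimited JSON chunks instead of one blocking response.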
(I’m on Intel IPEX for my local box, so I’ll have to wait for this to make it into the IPEX build.) I’ll definitely start watching for this one to flow through their Git.
Confirmed, it’s finally working in text chat. I’m really looking forward to our great HA team implementing response streaming for voice as well. That would improve the voice experience by A LOT!
I upgraded and it works fine. Now I wish Whisper did streaming. It would also be cool if Piper did it too. Shaving off those extra two or three seconds between responses is a great user-experience enhancer.
Glad it worked for you. I have two different rigs, both running Ollama 0.9, and both give me the response I screenshotted above… I’ll try to troubleshoot more, but the Ollama upgrade is the only new thing. Which model do you run? Gemma w/ tools or Qwen? Something else? Thanks!
Edit: Now voice is giving me proper responses, but prompting with text gives me the tool-call nonsense pictured above… I’ll keep working on it.