Ollama Now Supports Streaming Text Responses with Tools

Leaving this here for anyone who is curious… Ollama 0.8.0 (released May 28th, 2025) now supports streaming text responses for models with tool calling!
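For anyone wondering what this looks like on the client side: with streaming enabled, the chat endpoint sends a sequence of chunks, and tool calls can now arrive interleaved with the streamed text. Here is a minimal sketch of merging those chunks into one final message; the chunk shapes mirror Ollama's `/api/chat` streaming responses, but the sample data and the `get_weather` tool are made up for illustration:

```python
# Sketch: merge streamed chat chunks from Ollama into one final
# assistant message. The sample stream below is fabricated to show
# how text deltas and tool calls can both appear mid-stream.

def merge_chunks(chunks):
    """Accumulate streamed message deltas into a single message dict."""
    merged = {"role": "assistant", "content": "", "tool_calls": []}
    for chunk in chunks:
        msg = chunk.get("message", {})
        merged["content"] += msg.get("content", "")
        merged["tool_calls"].extend(msg.get("tool_calls", []))
    return merged

# Simulated stream: text tokens first, then a tool call,
# then the final chunk with done=True.
sample_stream = [
    {"message": {"role": "assistant", "content": "Checking the "}, "done": False},
    {"message": {"role": "assistant", "content": "weather now."}, "done": False},
    {"message": {"role": "assistant", "content": "",
                 "tool_calls": [{"function": {"name": "get_weather",
                                              "arguments": {"city": "Berlin"}}}]},
     "done": False},
    {"message": {"role": "assistant", "content": ""}, "done": True},
]

final = merge_chunks(sample_stream)
print(final["content"])
print(final["tool_calls"][0]["function"]["name"])
```

In a real client you would iterate the chunks as they arrive and render `content` incrementally, dispatching any `tool_calls` as they show up.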

1 Like

Rock on - if you can upgrade your builds :slight_smile:

(I’m on Intel-IPEX for my local box - have to wait for it to port to the IPEX build.) Will definitely start watching for this one to flow through their git.

oh and @Rudd-O - that was your bug originally…

1 Like

Confirmed, it's finally working in text chat. I'm really looking forward to our great HA team implementing response streaming for voice as well; that would improve the voice experience by A LOT! :smiley:

1 Like

Pretty sure this Ollama release broke my assistant pipeline…

Gonna upgrade Ollama N O W. Woohoo!

1 Like

Ok, then I'll hold off on upgrading.

Crap. :expressionless:

1 Like

I upgraded and it works fine. Now I wish Whisper did streaming, and it would be cool if Piper did too. Shaving those two or three extra seconds between responses is a great user-experience enhancer.

3 Likes

Glad it worked for you. I have two different rigs, both running Ollama 0.9, and both give me the response I screenshotted above… I'll try to troubleshoot more, but the Ollama upgrade is the only new thing. Which model do you run? Gemma with tools, or Qwen? Something else? Thanks!

Edit: Voice is now giving me proper responses, but prompting with text gives me the tool-call nonsense pictured above… I'll keep working on it.

1 Like

Qwen2.5 7B 4bit. Hugh Mungus context window.

1 Like

Did you have to make your own Modelfile for that? I think I'm landing on Qwen for the monastery.
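For reference, a large context window can be baked in with a custom Modelfile; this is just a sketch, and the base model tag and `num_ctx` value are assumptions, not the poster's actual settings:

```
FROM qwen2.5:7b-instruct-q4_K_M
PARAMETER num_ctx 32768
```

Build it with `ollama create my-qwen -f Modelfile`. Alternatively, `num_ctx` can be set per request via the API's `options` field without a custom Modelfile at all.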