With the recent Ollama integration into Home Assistant, I’ve been exploring its capabilities and finding it quite good. However, I believe there’s even more potential if we could run Ollama directly as an addon on the same hardware. Currently, I’m using an Asus Chromebox 3 with an Intel® Core™ i7-8550U Processor and 16GB of RAM, and running Ollama locally as an addon has been a positive experience. By leveraging small language models like tinyllama, tinydolphin, phi, etc., I’m achieving quick response times of 2-3 seconds from my Assist device, which is an esp32-s3-box integrated with the new Ollama integration.
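For anyone wanting to sanity-check latency before wiring a model into Assist, here is a minimal sketch against Ollama’s standard REST API. The base URL is an assumption (adjust it to however your Ollama instance exposes port 11434 on your network), and tinyllama is just an example model name.

```python
# Minimal latency check against a local Ollama server (a sketch, not official addon docs).
# Assumes Ollama's default REST API is reachable on port 11434; adjust OLLAMA_URL to your setup.
import time
import requests

OLLAMA_URL = "http://homeassistant.local:11434"  # assumption: host/port where Ollama listens
MODEL = "tinyllama"                              # any small model you have already pulled

payload = {
    "model": MODEL,
    "prompt": "Turn on the kitchen light.",
    "stream": False,  # wait for the full response so we can time it end to end
}

start = time.time()
resp = requests.post(f"{OLLAMA_URL}/api/generate", json=payload, timeout=120)
resp.raise_for_status()
elapsed = time.time() - start

print(f"Model: {MODEL}")
print(f"Response time: {elapsed:.1f}s")
print(resp.json()["response"])
```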
Perhaps voting for your own request would be a good idea. I did.
I’ve just put together such an addon: GitHub - SirUli/homeassistant-ollama-addon: Provides a Home Assistant addon configuration for Ollama.
Great addon, works for me. But I’m unsure how to enable GPU support. With HAOS, I’m not even sure the GPU has a driver. Is there a way to get this working? The readme points to the Ollama website (and maybe I’m just not piecing it together), but that seems to be a guide for a different environment.
@SirUli once your add-on is running, can you change the model? The readme doesn’t give much detail on that.
To me it looks like you just delete the integration and re-add it to use a different model.
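If the model you want isn’t on the box yet, you can pull it through the same API before re-adding the integration. A sketch below, again assuming Ollama listens on the default port 11434; the endpoints (/api/pull, /api/tags) are standard Ollama, not anything addon-specific.

```python
# Pull a new model into the local Ollama instance, then list what's available
# (sketch; assumes the standard Ollama API is reachable on port 11434).
import requests

OLLAMA_URL = "http://homeassistant.local:11434"  # assumption: adjust to your host/port

# Pull the model; with streaming disabled this blocks until the download finishes
requests.post(
    f"{OLLAMA_URL}/api/pull",
    json={"name": "tinydolphin", "stream": False},
    timeout=None,
).raise_for_status()

# Confirm it now shows up locally; the integration should offer it after re-adding
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=30).json()
print([m["name"] for m in tags.get("models", [])])
```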
Has anyone tried this Ollama addon with the new Hailo 10H M.2 module (Hailo 10 series) AI accelerator hardware?
PS: Apparently the less expensive modules in the Hailo family, like the Hailo 8 series, will not work unless the LLM model has been specifically compiled with their compiler on very powerful hardware.
Are you using these models for assist or only as a fallback?
In general, I’d be interested in knowing what people have tried with local gen AI. I’m not sure if there’s somewhere this is already being discussed.