With the recent Ollama integration into Home Assistant, I’ve been exploring its capabilities and finding it quite good. However, I believe there’s even more potential if we could run Ollama directly as an add-on on the same hardware. Currently, I’m using an Asus Chromebox 3 with an Intel® Core™ i7-8550U processor and 16 GB of RAM, and running Ollama locally as an add-on has been a positive experience. By leveraging small language models like tinyllama, tinydolphin, phi, etc., I’m achieving quick response times of 2-3 seconds from my Assist device, an esp32-s3-box connected to the new Ollama integration.
Perhaps voting for your own request would be a good idea. I did.
I’ve just put together such an add-on: GitHub - SirUli/homeassistant-ollama-addon (provides a Home Assistant add-on configuration for Ollama).
Great add-on, works for me. But I’m unsure how to enable GPU support. With HAOS, I’m not even sure the GPU has a driver. Is there a way to get this working? The readme points to the Ollama website (and maybe I’m just not piecing it together), but that seems to be a guide for a different environment.
@SirUli, once your add-on is running, can you change the model? The readme doesn’t give much detail on that.
To me it looks like you just delete the integration and re-add it to use a different model.
Has anyone tried this Ollama add-on with the new Hailo-10H M.2 module (Hailo-10 series) AI accelerator hardware?
PS: Apparently the less expensive modules in the Hailo family, like the Hailo-8 series, will not work unless the LLM model has been specifically compiled with their compiler on very powerful hardware.
Are you using these models for Assist or only as a fallback?
In general I’d be interested in knowing what people have tried with local gen AI; I’m not sure if there’s somewhere it is already being discussed.
I would also be interested in how to select a model.
Run the integration as described in the readme; the model is selected when you set the integration up. If you need more than one model at the same time, you need to add the integration twice and make sure you have sufficient resources. If you want to change the model, delete the integration and set it up again.
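If you want to see which models the add-on already has, or pull another one before re-adding the integration, you can also talk to the add-on’s Ollama API directly. A minimal sketch with the requests library, assuming the add-on is reachable at homeassistant.local:11434 (adjust host/port to your setup); /api/tags and /api/pull are the same endpoints that show up in the add-on logs:

import requests

OLLAMA_URL = "http://homeassistant.local:11434"  # assumption: adjust to your add-on's host/port

# List the models the add-on currently knows about (GET /api/tags)
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
print([m["name"] for m in tags.get("models", [])])

# Pull another model so it is available when you re-add the integration (POST /api/pull)
resp = requests.post(
    f"{OLLAMA_URL}/api/pull",
    json={"name": "tinydolphin", "stream": False},
    timeout=600,  # pulls can take a while on slow links
)
print(resp.status_code, resp.json())

The integration itself still only uses one model at a time, so after pulling you re-add it and pick the new model during setup.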
Issue:
I’ve been trying to use the native Ollama integration with the HACS Ollama add-on running tinyllama, but I cannot get things working. No matter what I do, intent recognition fails if I have Assist or Home-LLM (v1-v3) selected.
Any help is appreciated!
Background:
I’ve kept everything standardized but receive an Unknown Error whenever I attempt to use the conversation.process service:
action: conversation.process
data:
  text: What lights are on?
  agent_id: conversation.tinyllama
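The same call can also be reproduced outside Developer Tools against Home Assistant’s conversation HTTP API, which returns the full JSON instead of just “Unknown Error”. A minimal sketch with the requests library, assuming a long-lived access token in HA_TOKEN and that POST /api/conversation/process accepts the same text/language/agent_id fields as the action:

import os
import requests

HA_URL = "http://homeassistant.local:8123"  # assumption: adjust to your Home Assistant URL
TOKEN = os.environ["HA_TOKEN"]              # assumption: a long-lived access token

resp = requests.post(
    f"{HA_URL}/api/conversation/process",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "text": "What lights are on?",
        "language": "en",
        "agent_id": "conversation.tinyllama",
    },
    timeout=60,
)
print(resp.status_code)
print(resp.json())  # raw conversation response, with more detail than the UI error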
I can see in the Ollama add-on logs that attempts were made to use the chat endpoint:
time=2025-03-12T16:37:29.563Z level=INFO source=images.go:432 msg="total blobs: 6"
time=2025-03-12T16:37:29.563Z level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-12T16:37:29.563Z level=INFO source=routes.go:1292 msg="Listening on [::]:11434 (version 0.6.0)"
time=2025-03-12T16:37:29.563Z level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-12T16:37:29.566Z level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-12T16:37:29.566Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="15.4 GiB" available="12.8 GiB"
[GIN] 2025/03/12 - 16:37:35 | 200 | 8.517514ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:39:37 | 200 | 1.240583ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:39:38 | 500 | 503.917579ms | 192.168.40.216 | POST "/api/pull"
[GIN] 2025/03/12 - 16:39:41 | 200 | 1.058605ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:39:58 | 200 | 1.155844ms | 192.168.40.216 | GET "/api/tags"
time=2025-03-12T16:39:58.760Z level=INFO source=download.go:176 msg="downloading 2af3b81862c6 in 7 100 MB part(s)"
time=2025-03-12T16:40:07.040Z level=INFO source=download.go:176 msg="downloading af0ddbdaaa26 in 1 70 B part(s)"
time=2025-03-12T16:40:08.318Z level=INFO source=download.go:176 msg="downloading c8472cd9daed in 1 31 B part(s)"
time=2025-03-12T16:40:09.614Z level=INFO source=download.go:176 msg="downloading fa956ab37b8c in 1 98 B part(s)"
time=2025-03-12T16:40:10.909Z level=INFO source=download.go:176 msg="downloading 6331358be52a in 1 483 B part(s)"
[GIN] 2025/03/12 - 16:40:12 | 200 | 14.185524102s | 192.168.40.216 | POST "/api/pull"
[GIN] 2025/03/12 - 16:40:12 | 200 | 1.45732ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:40:44 | 200 | 1.586762ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:40:58 | 200 | 1.376906ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:41:27 | 400 | 20.711739ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:41:53 | 400 | 9.56381ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:42:01 | 200 | 94.635µs | 192.168.40.129 | GET "/"
[GIN] 2025/03/12 - 16:43:05 | 200 | 1.269183ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:43:10 | 200 | 1.364042ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:43:10 | 200 | 1.291184ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:43:28 | 200 | 1.252031ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 16:43:42 | 400 | 10.410082ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:47:12 | 400 | 7.757886ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:48:08 | 400 | 22.350764ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:48:11 | 400 | 11.196124ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:52:12 | 400 | 22.39579ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:52:18 | 400 | 8.089691ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:54:59 | 400 | 22.844492ms | 192.168.40.216 | POST "/api/chat"
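Those GIN lines only log the status code, not why the request was rejected. Posting a request of the same shape straight at the add-on surfaces Ollama’s error body; one common cause of a 400 on /api/chat is tool definitions being sent (which Home Assistant does when the agent is allowed to control the home) to a model that doesn’t support tool calling. A minimal sketch with the requests library, assuming the add-on at homeassistant.local:11434 and using a hypothetical probe tool just to test tool support:

import requests

OLLAMA_URL = "http://homeassistant.local:11434"  # assumption: adjust to your add-on's host/port

# Hypothetical minimal tool definition, only used to probe whether the model accepts tools at all
probe_tool = {
    "type": "function",
    "function": {
        "name": "get_lights",
        "description": "Dummy tool to test tool-calling support.",
        "parameters": {"type": "object", "properties": {}},
    },
}

resp = requests.post(
    f"{OLLAMA_URL}/api/chat",
    json={
        "model": "tinyllama",
        "messages": [{"role": "user", "content": "What lights are on?"}],
        "tools": [probe_tool],
        "stream": False,
    },
    timeout=120,
)
print(resp.status_code)
print(resp.text)  # the error body explains the 400, e.g. if the model does not support tools

If the body reports that the model does not support tools, that would line up with the “No control” test further down succeeding, since no tools are passed to the model in that mode.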
Debug from the Voice Assistant provided a bit more clarity:
init_options:
  start_stage: intent
  end_stage: intent
  input:
    text: What lights are on?
  pipeline: 01jp5mkv51zkejm6kxvy1x7nnp
  conversation_id: null
stage: done
run:
  pipeline: 01jp5mkv51zkejm6kxvy1x7nnp
  language: en
  conversation_id: 01JP5MMDRS1N6C5S5DZ8S9DK06
  runner_data:
    stt_binary_handler_id: null
    timeout: 300
events:
  - type: run-start
    data:
      pipeline: 01jp5mkv51zkejm6kxvy1x7nnp
      language: en
      conversation_id: 01JP5MMDRS1N6C5S5DZ8S9DK06
      runner_data:
        stt_binary_handler_id: null
        timeout: 300
    timestamp: "2025-03-12T16:57:31.674291+00:00"
  - type: intent-start
    data:
      engine: conversation.tinyllama
      language: en
      intent_input: What lights are on?
      conversation_id: 01JP5MMDRS1N6C5S5DZ8S9DK06
      device_id: null
      prefer_local_intents: true
    timestamp: "2025-03-12T16:57:31.674383+00:00"
  - type: error
    data:
      code: intent-failed
      message: Unexpected error during intent recognition
    timestamp: "2025-03-12T16:57:31.788200+00:00"
  - type: run-end
    data: null
    timestamp: "2025-03-12T16:57:31.788282+00:00"
intent:
  engine: conversation.tinyllama
  language: en
  intent_input: What lights are on?
  conversation_id: 01JP5MMDRS1N6C5S5DZ8S9DK06
  device_id: null
  prefer_local_intents: true
  done: false
error:
  code: intent-failed
  message: Unexpected error during intent recognition
I can confirm that tinyllama is working on the Ollama add-on when I set the integration to “No control”.
Action:
action: conversation.process
data:
  text: What lights are on?
  agent_id: conversation.tinyllama
Response:
response:
  speech:
    plain:
      speech: >-
        The lights mentioned in the given text are not specified. It is unclear
        which lights are on.
      extra_data: null
  card: {}
  language: en
  response_type: action_done
  data:
    targets: []
    success: []
    failed: []
conversation_id: 01JP5MSN3WYM9A1HXGJS28ADB8
Ollama Add-On Logs:
load_tensors: CPU model buffer size = 606.53 MiB
time=2025-03-12T16:59:47.982Z level=INFO source=server.go:619 msg="waiting for server to become available" status="llm server loading model"
llama_init_from_model: n_seq_max = 1
llama_init_from_model: n_ctx = 8192
llama_init_from_model: n_ctx_per_seq = 8192
llama_init_from_model: n_batch = 512
llama_init_from_model: n_ubatch = 512
llama_init_from_model: flash_attn = 0
llama_init_from_model: freq_base = 10000.0
llama_init_from_model: freq_scale = 1
llama_init_from_model: n_ctx_per_seq (8192) > n_ctx_train (2048) -- possible training context overflow
llama_kv_cache_init: kv_size = 8192, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 22, can_shift = 1
llama_kv_cache_init: CPU KV buffer size = 176.00 MiB
llama_init_from_model: KV self size = 176.00 MiB, K (f16): 88.00 MiB, V (f16): 88.00 MiB
llama_init_from_model: CPU output buffer size = 0.13 MiB
llama_init_from_model: CPU compute buffer size = 544.01 MiB
llama_init_from_model: graph nodes = 710
llama_init_from_model: graph splits = 1
time=2025-03-12T16:59:48.233Z level=INFO source=server.go:624 msg="llama runner started in 0.50 seconds"
[GIN] 2025/03/12 - 16:59:51 | 200 | 3.847565655s | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 16:59:56 | 200 | 7.355661487s | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 17:00:03 | 200 | 1.077693ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 17:00:05 | 400 | 10.554958ms | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 17:00:14 | 200 | 1.240289ms | 192.168.40.216 | GET "/api/tags"
[GIN] 2025/03/12 - 17:00:22 | 200 | 5.820094733s | 192.168.40.216 | POST "/api/chat"
[GIN] 2025/03/12 - 17:00:25 | 200 | 2.512705131s | 192.168.40.216 | POST "/api/chat"