With the recent Ollama integration into Home Assistant, I’ve been exploring its capabilities and finding it quite good. However, I believe there’s even more potential if we could run Ollama directly as an addon on the same hardware. Currently, I’m using an Asus Chromebox 3 with an Intel® Core™ i7-8550U Processor and 16GB of RAM, and running Ollama locally as an addon has been a positive experience. By leveraging small language models like tinyllama, tinydolphin, phi, etc., I’m achieving quick response times of 2-3 seconds from my Assist device, which is an esp32-s3-box integrated with the new Ollama integration.
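For anyone wanting to sanity-check latency before wiring a model into Assist, here is a minimal sketch against Ollama’s standard REST API. The base URL is an assumption (adjust it to however your Ollama instance exposes port 11434 on your network), and tinyllama is just an example model name.

```python
# Minimal latency check against a local Ollama server (a sketch, not official addon docs).
# Assumes Ollama's default REST API is reachable on port 11434; adjust OLLAMA_URL to your setup.
import time
import requests

OLLAMA_URL = "http://homeassistant.local:11434"  # assumption: host/port where Ollama listens
MODEL = "tinyllama"                              # any small model you have already pulled

payload = {
    "model": MODEL,
    "prompt": "Turn on the kitchen light.",
    "stream": False,  # wait for the full response so we can time it end to end
}

start = time.time()
resp = requests.post(f"{OLLAMA_URL}/api/generate", json=payload, timeout=120)
resp.raise_for_status()
elapsed = time.time() - start

print(f"Model: {MODEL}")
print(f"Response time: {elapsed:.1f}s")
print(resp.json()["response"])
```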
Perhaps voting for your own request would be a good idea. I did.
I’ve just put together such an addon: GitHub - SirUli/homeassistant-ollama-addon: Provides a Home Assistant addon configuration for Ollama.
Great addon, works for me. But I’m unsure how to enable GPU support. With HAOS, I’m not even sure the GPU has a driver. Is there a way to get this working? The readme points to the Ollama website (and maybe I’m just not piecing it together), but that seems to be a guide for a different environment.
@SirUli once your add-on is running, can you change the model? The readme doesn’t give much detail on that.
To me it looks like you just delete the integration and re-add it to use a different model.
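If the model you want isn’t on the box yet, you can pull it through the same API before re-adding the integration. A sketch below, again assuming Ollama listens on the default port 11434; the endpoints (/api/pull, /api/tags) are standard Ollama, not anything addon-specific.

```python
# Pull a new model into the local Ollama instance, then list what's available
# (sketch; assumes the standard Ollama API is reachable on port 11434).
import requests

OLLAMA_URL = "http://homeassistant.local:11434"  # assumption: adjust to your host/port

# Pull the model; with streaming disabled this blocks until the download finishes
requests.post(
    f"{OLLAMA_URL}/api/pull",
    json={"name": "tinydolphin", "stream": False},
    timeout=None,
).raise_for_status()

# Confirm it now shows up locally; the integration should offer it after re-adding
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=30).json()
print([m["name"] for m in tags.get("models", [])])
```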
Has anyone tried this Ollama addon with the new Hailo 10H M.2 module (Hailo 10 series) AI accelerator hardware?
PS: Apparently the less expensive modules in the Hailo family, like the Hailo 8 series, will not work unless the LLM model has been specifically compiled with their compiler on very powerful hardware.
Are you using these models for assist or only as a fallback?
In general, I’d be interested in knowing what people have tried with local gen AI. I’m not sure if there’s somewhere this is already being discussed.