LLaMA Conversation Integration

I wanted to share the custom component I’ve been working on for running local AI models as conversation agents. Everything runs locally via the Llama.cpp library, which allows models to run on your CPU and even works on a Raspberry Pi (if you can find a small enough model, that is).

The goal was a solution that lets me experiment with different local AI models, quantized so they can run on actual Home Assistant hardware in the near future.


Features:

  • Run models locally using Llama.cpp as part of Home Assistant or as an addon via oobabooga/text-generation-webui
  • Connect to OpenAI API compatible model backends
  • Output parsing to execute Home Assistant services using JSON function calling
  • A provided example model that is fine-tuned to work with the integration
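To illustrate the JSON function-calling idea, here is a minimal sketch of how model output might be parsed into a Home Assistant service call. The function name, the `"service"`/`"target"` keys, and the overall schema are my own illustrative assumptions, not the component's actual implementation:

```python
import json

def parse_service_call(model_output: str):
    """Hypothetical sketch: extract a service call dict from raw model
    output, or return None if no valid JSON call is found."""
    # The model is prompted to emit a JSON object somewhere in its reply;
    # grab the outermost {...} span and try to decode it.
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end == -1:
        return None
    try:
        call = json.loads(model_output[start:end + 1])
    except json.JSONDecodeError:
        return None
    # A valid call must name a "domain.service" pair, e.g. "light.turn_on".
    if not isinstance(call, dict) or "." not in str(call.get("service", "")):
        return None
    domain, service = call["service"].split(".", 1)
    return {
        "domain": domain,
        "service": service,
        "target": call.get("target", {}),
    }

reply = 'Turning on the light. {"service": "light.turn_on", "target": {"entity_id": "light.kitchen"}}'
print(parse_service_call(reply))
```

In the real component, the resulting domain/service/target would be handed to Home Assistant's service registry; the sketch only covers the parsing step.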

The dataset is also provided if others want to fine-tune their own models to work with this integration.
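For a sense of what such a fine-tuning dataset could contain, a single training example might pair a house-state prompt with a reply that ends in a JSON service call. This is a purely hypothetical sample, not the provided dataset's actual schema:

```json
{
  "instruction": "Turn off the kitchen light.",
  "context": "Devices: light.kitchen (on), switch.fan (off)",
  "response": "Turning off the kitchen light. {\"service\": \"light.turn_off\", \"target\": {\"entity_id\": \"light.kitchen\"}}"
}
```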

Installation instructions are on the GitHub page, and the component should also be installable via HACS.