I’ve been experimenting with a different approach to natural language control in Home Assistant and wanted to share a quick demo.
Instead of relying on sentence templates or intent definitions, the approach uses an entity-grounded natural language router.
The basic idea is fairly simple:
The system feeds a small language model a structured list of entities in a domain along with their current state and the services that domain supports. When a command comes in, the model selects the correct entity and service directly from that list and returns a deterministic JSON action.
Example prompt input looks something like this:
light.back_mudroom = off
light.back_porch = off
light.dog_house = on
light.kitchen = off
If the user says:
“turn off the dog house light”
the model returns something like:
{"type":"ACTION","domain":"light","service":"turn_off","entity_id":"light.dog_house"}
That JSON is then routed back into Home Assistant to execute the service call.
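To make that concrete, here's a rough sketch of what the routing step could look like in Python. This isn't my actual router code: it assumes an Ollama-style local endpoint on localhost:11434, and the model name and prompt wording are just placeholders.

```python
import json
import requests  # assumes an Ollama-style HTTP endpoint serving the model locally

SYSTEM_PROMPT = (
    "You control Home Assistant. Given the entity list and a user command, "
    "reply with ONLY a JSON object of the form "
    '{"type":"ACTION","domain":"...","service":"...","entity_id":"..."}'
)

def build_prompt(entity_states: dict, command: str) -> str:
    # Ground the model in the real entities and their current states
    entity_block = "\n".join(f"{eid} = {state}" for eid, state in entity_states.items())
    return f"{SYSTEM_PROMPT}\n\nEntities:\n{entity_block}\n\nUser: {command}"

def route(entity_states: dict, command: str) -> dict:
    resp = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's generate endpoint
        json={
            "model": "qwen2.5:0.5b",            # one of the small models tested below
            "prompt": build_prompt(entity_states, command),
            "stream": False,
        },
        timeout=30,
    )
    resp.raise_for_status()
    # The model is instructed to emit bare JSON; parse it into the action dict
    return json.loads(resp.json()["response"])

if __name__ == "__main__":
    states = {
        "light.back_mudroom": "off",
        "light.back_porch": "off",
        "light.dog_house": "on",
        "light.kitchen": "off",
    }
    print(route(states, "turn off the dog house light"))
```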
Natural language flexibility
Because the router is reasoning over entity names and states instead of sentence templates, it can handle a variety of natural phrasings without needing predefined intent patterns.
For example, the following phrases can all resolve to the same entity and service:
“turn off the dog house light”
“please turn off the dog house”
“hey can you shut that dog house light off”
“can you turn off that light by the dog house”
“I don’t want that light on anymore”
The router simply maps the language to the closest matching entity and chooses an appropriate service.
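Since whatever the model returns ultimately drives a real service call, it's worth validating the JSON before acting on it, so a hallucinated entity or service gets dropped instead of executed. A minimal sketch of that kind of guardrail (the allowed-service set here is an illustrative subset, not the full light domain):

```python
ALLOWED_SERVICES = {"light": {"turn_on", "turn_off", "toggle"}}  # illustrative subset

def validate_action(action: dict, known_entities: set) -> dict | None:
    """Return the action only if it references a known entity and service."""
    if action.get("type") != "ACTION":
        return None
    domain = action.get("domain")
    if action.get("service") not in ALLOWED_SERVICES.get(domain, set()):
        return None
    if action.get("entity_id") not in known_entities:
        return None
    return action

known = {"light.back_mudroom", "light.back_porch", "light.dog_house", "light.kitchen"}
good = {"type": "ACTION", "domain": "light", "service": "turn_off",
        "entity_id": "light.dog_house"}
bad = {"type": "ACTION", "domain": "light", "service": "turn_off",
       "entity_id": "light.bedroom"}  # hallucinated entity gets rejected
assert validate_action(good, known) == good
assert validate_action(bad, known) is None
```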
Architecture
Right now the prototype pipeline looks like this:
Assist satellite
↓
Whisper speech recognition
↓
MQTT topic
↓
Python AI router
↓
LLM selects entity + service
↓
MQTT action topic
↓
Home Assistant executes service
MQTT is simply used as the message bus between the voice pipeline and the router.
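To make the bus part concrete, here's roughly what that glue could look like with paho-mqtt (assuming the 2.x callback API). The topic names are placeholders rather than the ones from my setup, and route()/validate_action() are the sketches from earlier in the post:

```python
import json
import paho.mqtt.client as mqtt  # assumes paho-mqtt >= 2.0

TRANSCRIPT_TOPIC = "voice/transcript"  # placeholder: Whisper transcriptions land here
ACTION_TOPIC = "voice/action"          # placeholder: Home Assistant listens here

def current_entity_states() -> dict:
    # Stand-in: in practice these would come from Home Assistant (e.g. its REST API)
    return {"light.dog_house": "on", "light.kitchen": "off"}

def on_message(client, userdata, msg):
    command = msg.payload.decode("utf-8")
    states = current_entity_states()
    # route() and validate_action() are the sketches from earlier in the post
    action = validate_action(route(states, command), set(states))
    if action is not None:
        client.publish(ACTION_TOPIC, json.dumps(action))

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe(TRANSCRIPT_TOPIC)
client.loop_forever()  # shuttle transcripts in, actions out
```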
Models tested
I’ve been testing very small local models for the routing step, including:
- Qwen3 0.6B
- Qwen2.5 0.5B
Surprisingly, these tiny models perform very well when the prompt is structured with entity/state information.
Current round-trip voice latency in testing is roughly 2.5–3 seconds end-to-end, fully local.
Important notes
This is very early experimentation and nowhere near production ready.
The goal isn’t to replace Home Assistant’s existing intent system, but to explore whether a lightweight LLM router could provide a more flexible natural language layer without maintaining sentence templates.
Right now I mostly want to see how far this approach can be pushed with very small local models.
Demo video:
