Programmatically manage exposed entities for Assist/Ollama (WebSocket API)

The problem

When using an LLM conversation agent (Ollama, OpenAI, etc.) with Home Assistant, you need to control carefully which entities are exposed, because exposed entities are injected into the model's prompt and eat into the LLM context window. With 300+ entities exposed by default, the model dumps the entity list back instead of answering.

The HA UI only lets you toggle entities one by one in Settings > Voice Assistants > Expose. There’s no bulk management, no REST API, and no documented programmatic way to do this.

The solution: WebSocket API

Home Assistant has undocumented WebSocket commands for managing exposed entities. I found them in the HA core source code:

List exposed entities

{
  "id": 1,
  "type": "homeassistant/expose_entity/list"
}

Returns all entities with their exposure status per assistant (conversation, cloud.alexa, cloud.google_assistant):

{
  "id": 1,
  "type": "result",
  "success": true,
  "result": {
    "exposed_entities": {
      "light.living_room": {
        "conversation": true,
        "cloud.alexa": false,
        "cloud.google_assistant": false
      }
    }
  }
}
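Once you have the `list` response, filtering it down to the entities exposed to a particular assistant is a short dict comprehension. A minimal sketch over the `result` payload shown above (the entity IDs here are made up for illustration):

```python
import json

# A response shaped like the homeassistant/expose_entity/list result above,
# already received as text from the WebSocket.
resp = json.loads("""{
  "id": 1, "type": "result", "success": true,
  "result": {"exposed_entities": {
    "light.living_room": {"conversation": true, "cloud.alexa": false},
    "sensor.attic_humidity": {"conversation": false, "cloud.alexa": false}
  }}
}""")

# Keep only the entities exposed to the Assist conversation agent.
exposed = [
    entity_id
    for entity_id, assistants in resp["result"]["exposed_entities"].items()
    if assistants.get("conversation") is True
]
print(exposed)  # ['light.living_room']
```

The same filter works for `cloud.alexa` or `cloud.google_assistant` by swapping the key.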

Expose or unexpose entities

{
  "id": 2,
  "type": "homeassistant/expose_entity",
  "assistants": ["conversation"],
  "entity_ids": ["light.living_room", "sensor.temperature"],
  "should_expose": true
}

Key details:

  • Immediate effect - no HA restart needed
  • Bulk operations - pass multiple entity IDs in one call
  • Per-assistant - control exposure separately for conversation, cloud.alexa, cloud.google_assistant

Python script

I wrote a script that manages this automatically. It reads target entities from a YAML file and syncs the exposure state:

"""
Manage entities exposed to Home Assistant voice assistants via WebSocket API.

Uses the undocumented HA WebSocket commands:
  - homeassistant/expose_entity/list  - list exposed entities per assistant
  - homeassistant/expose_entity       - expose/unexpose entities

Usage:
  python ha_expose.py --list-only                  # list currently exposed
  python ha_expose.py --dry-run                    # show what would change
  python ha_expose.py                              # apply entities from YAML

Requires: pip install websockets pyyaml
Environment: HA_TOKEN (long-lived access token)
"""
import argparse
import asyncio
import json
import os
import sys
from pathlib import Path

import websockets
import yaml

DEFAULT_URL = "ws://homeassistant.local:8123/api/websocket"

msg_id = 0

def next_id():
    global msg_id
    msg_id += 1
    return msg_id

def load_entities(path):
    """Load target entity IDs from a YAML file."""
    with open(path) as f:
        data = yaml.safe_load(f)
    entities = []
    for category in data.values():
        if isinstance(category, list):
            entities.extend(category)
    return entities

async def send_and_receive(ws, payload):
    """Send a WebSocket message and wait for the matching response."""
    await ws.send(json.dumps(payload))
    while True:
        resp = json.loads(await ws.recv())
        if resp.get("id") == payload.get("id"):
            return resp

async def main():
    parser = argparse.ArgumentParser(description="Manage HA exposed entities via WebSocket API")
    parser.add_argument("--url", default=os.environ.get("HA_WS_URL", DEFAULT_URL))
    parser.add_argument("--token", default=os.environ.get("HA_TOKEN"))
    parser.add_argument("--entities", type=Path, default=Path("entities.yaml"),
                        help="YAML file with target entities")
    parser.add_argument("--assistant", default="conversation")
    parser.add_argument("--list-only", action="store_true", help="Only list currently exposed")
    parser.add_argument("--dry-run", action="store_true", help="Show what would change")
    args = parser.parse_args()

    if not args.token:
        print("ERROR: Set HA_TOKEN env var or use --token")
        sys.exit(1)

    async with websockets.connect(args.url) as ws:
        # Authenticate
        msg = json.loads(await ws.recv())
        assert msg["type"] == "auth_required"
        await ws.send(json.dumps({"type": "auth", "access_token": args.token}))
        msg = json.loads(await ws.recv())
        assert msg["type"] == "auth_ok", f"Auth failed: {msg}"

        # List current state
        resp = await send_and_receive(ws, {"id": next_id(), "type": "homeassistant/expose_entity/list"})
        currently_exposed = [
            eid for eid, assistants in resp["result"]["exposed_entities"].items()
            if assistants.get(args.assistant) is True
        ]
        print(f"Currently exposed to {args.assistant}: {len(currently_exposed)}")

        if args.list_only:
            for eid in sorted(currently_exposed):
                print(f"  {eid}")
            return

        # Load targets and compute diff
        target = load_entities(args.entities)
        to_unexpose = set(currently_exposed) - set(target)
        to_expose = set(target) - set(currently_exposed)
        print(f"Target: {len(target)} | To expose: {len(to_expose)} | To unexpose: {len(to_unexpose)}")

        if args.dry_run:
            for eid in sorted(to_expose): print(f"  + {eid}")
            for eid in sorted(to_unexpose): print(f"  - {eid}")
            return

        # Apply
        if to_unexpose:
            resp = await send_and_receive(ws, {
                "id": next_id(), "type": "homeassistant/expose_entity",
                "assistants": [args.assistant], "entity_ids": list(to_unexpose), "should_expose": False,
            })
            print(f"Unexposed {len(to_unexpose)} OK" if resp.get("success") else f"ERROR: {resp}")

        if to_expose:
            resp = await send_and_receive(ws, {
                "id": next_id(), "type": "homeassistant/expose_entity",
                "assistants": [args.assistant], "entity_ids": list(to_expose), "should_expose": True,
            })
            print(f"Exposed {len(to_expose)} OK" if resp.get("success") else f"ERROR: {resp}")

        # Verify
        resp = await send_and_receive(ws, {"id": next_id(), "type": "homeassistant/expose_entity/list"})
        final = [eid for eid, a in resp["result"]["exposed_entities"].items() if a.get(args.assistant) is True]
        print(f"Final count: {len(final)} {'SUCCESS' if len(final) == len(target) else 'MISMATCH'}")

if __name__ == "__main__":
    asyncio.run(main())

YAML entities file

# entities.yaml - edit this to match your setup
temperature:
  - sensor.living_room_temperature
  - sensor.bedroom_temperature
  - sensor.outdoor_temperature

climate:
  - climate.living_room
  - climate.bedroom

lights:
  - light.all_lights
  - light.living_room
  - light.bedroom
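The category names (`temperature`, `climate`, `lights`) are free-form organization: the script's `load_entities` simply flattens every list-valued category into one list of entity IDs. A quick sketch of that behavior on the parsed form of a trimmed file (a plain dict, as `yaml.safe_load` would return; the helper name is mine, not from the script):

```python
# Parsed form of a trimmed entities.yaml (what yaml.safe_load returns).
data = {
    "temperature": ["sensor.living_room_temperature", "sensor.bedroom_temperature"],
    "lights": ["light.all_lights", "light.living_room"],
}

def load_entities_from_dict(data):
    """Flatten every list-valued category into one list of entity IDs
    (same logic as load_entities in the script above)."""
    entities = []
    for category in data.values():
        if isinstance(category, list):
            entities.extend(category)
    return entities

print(load_entities_from_dict(data))
# ['sensor.living_room_temperature', 'sensor.bedroom_temperature',
#  'light.all_lights', 'light.living_room']
```

Because only lists are collected, you can also keep non-list keys (notes, metadata) in the file without affecting the sync.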

My use case

I’m running Ollama (qwen2.5:7b on RTX 5070 Ti) as a conversation agent for Home Assistant Voice PE. With 397 entities exposed, the 8K context window was completely saturated. After reducing to 94 carefully selected entities using this script, everything works perfectly:

  • Temperature, climate, lights, covers, security sensors, weather, pool, energy
  • Groups for bulk control (“turn off all lights”)
  • Individual entities for specific control (“turn on the bedroom light”)

What I learned

  • Do NOT edit .storage/homeassistant.exposed_entities directly - HA keeps the state in memory and overwrites the file on shutdown
  • The WebSocket API is the only reliable programmatic way to manage exposed entities
  • Changes take effect immediately - no restart needed
  • With “Prefer handling commands locally” ON in the pipeline, simple intents (get state, toggle) are handled by native Assist, while complex queries go through Ollama

What’s missing from HA core

  • No REST API endpoint for expose/unexpose (only WebSocket)
  • No bulk management UI (only toggle per entity)
  • No documentation of these WebSocket commands
  • No way to define exposed entities in YAML configuration

Would love to hear if others have run into this and how you solved it!

I agree with this part wholeheartedly… I’m actually thinking of a feature request for Spook.boo for exactly this reason.

How do I handle it? Hmm, well, this part is at the later end of a loooooong chain…
see Friday's Party: Creating a Private, Agentic AI using Voice Assistant tools

if you’re interested. ’Cause entity state is only about 1/5th of your problem :wink: (I’m working on smarter context, not more of it). The index lets you reduce exposed entities: anything that’s callable-info only can be read from the index instead, so exposure becomes live state vs. indexed state, not live state vs. gone… And it’s structured, to boot. Exposing becomes more of a decision: do I need real-time active control, or do I just need to read this for context?


Wow @NathanCu , thanks for the detailed pointer — I just went through your Friday’s Party thread and it’s incredibly well thought out.

The Grand Index approach using labels as semantic discovery is brilliant. Your finding that exposing labels improved response quality by “an order of magnitude” is exactly the kind of insight I was hoping to find. I’m working at the opposite end of the spectrum right now — 94 carefully selected entities with a local qwen2.5:7b on an RTX 5070 Ti, $0/month — but I can see how labels would make even my small set much more discoverable.

The Kung Fu Components pattern is fascinating too — dynamic context loading via input_boolean switches is elegant. I could see implementing “modes” (pool, security, energy) that load different prompt sections depending on the conversation context.

A few things that resonated:

  • “Entity state is only 1/5 of the problem” — totally agree now that I’ve seen your architecture. I solved the exposure part, but context richness is the next frontier.
  • NINJA Summarizer — smart way to compress 200K tokens into structured JSON. Even with a local model and smaller context window (32K), a periodic state summary template sensor could free up a lot of context for actual reasoning.
  • The ~800 entity ceiling — good to know. My 94 is very conservative, gives me room to grow.
  • Spook.boo for expose/unexpose — would love to see that happen. A REST API would be much cleaner than the undocumented WebSocket commands.

Different approaches, same problem space. You went deep on smart context with cloud models, I went lean on local inference with smart filtering. Both valid paths — and I think there’s a lot to learn from combining ideas.

Really impressive work. Following your thread closely now!
