The problem
When using an LLM conversation agent (Ollama, OpenAI, etc.) with Home Assistant, you need to carefully control which entities are exposed to avoid saturating the LLM context window. In my setup, with 300+ entities exposed by default, the model would dump the entity list instead of answering the question.
The HA UI only lets you toggle entities one by one in Settings > Voice Assistants > Expose. There’s no bulk management, no REST API, and no documented programmatic way to do this.
The solution: WebSocket API
Home Assistant has undocumented WebSocket commands for managing exposed entities. I found them in the HA core source code:
List exposed entities
{
  "id": 1,
  "type": "homeassistant/expose_entity/list"
}
Returns all entities with their exposure status per assistant (conversation, cloud.alexa, cloud.google_assistant):
{
  "id": 1,
  "type": "result",
  "success": true,
  "result": {
    "exposed_entities": {
      "light.living_room": {
        "conversation": true,
        "cloud.alexa": false,
        "cloud.google_assistant": false
      }
    }
  }
}
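To get a quick overview before changing anything, the list response can be summarized per assistant. This is a minimal sketch that assumes the response shape shown above; `count_exposed` and the `sample` data are hypothetical names for illustration, not part of HA.

```python
from collections import Counter


def count_exposed(list_result: dict) -> Counter:
    """Count how many entities are exposed to each assistant.

    `list_result` is the "result" object of the list response
    (assumed to have the shape shown in the example above).
    """
    counts = Counter()
    for flags in list_result["exposed_entities"].values():
        for assistant, exposed in flags.items():
            if exposed is True:
                counts[assistant] += 1
    return counts


# Hypothetical sample data mirroring the response format above
sample = {
    "exposed_entities": {
        "light.living_room": {"conversation": True, "cloud.alexa": False},
        "sensor.temperature": {"conversation": True},
    }
}

for assistant, n in count_exposed(sample).items():
    print(f"{assistant}: {n} exposed")
```

Assistants with nothing exposed simply do not appear in the counter, so a missing key reads as zero.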
Expose or unexpose entities
{
  "id": 2,
  "type": "homeassistant/expose_entity",
  "assistants": ["conversation"],
  "entity_ids": ["light.living_room", "sensor.temperature"],
  "should_expose": true
}
Key details:
- Immediate effect - no HA restart needed
- Bulk operations - pass multiple entity IDs in one call
- Per-assistant - control exposure separately for conversation, cloud.alexa, and cloud.google_assistant
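If you want to build the expose command from your own code, a small helper keeps the payload consistent. This is only a sketch based on the message format above; `build_expose_command` is a hypothetical name, and the deduplication and sorting are my own choices, not something HA requires.

```python
import json


def build_expose_command(msg_id, entity_ids, assistants=("conversation",),
                         should_expose=True):
    """Build a homeassistant/expose_entity message as a plain dict.

    Duplicates are dropped and IDs sorted so the payload is deterministic.
    """
    entity_ids = sorted(set(entity_ids))
    if not entity_ids:
        raise ValueError("entity_ids must not be empty")
    return {
        "id": msg_id,
        "type": "homeassistant/expose_entity",
        "assistants": list(assistants),
        "entity_ids": entity_ids,
        "should_expose": should_expose,
    }


cmd = build_expose_command(2, ["sensor.temperature", "light.living_room"])
print(json.dumps(cmd, indent=2))  # ready to send over the WebSocket
```

To unexpose, pass `should_expose=False` with the same arguments.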
Python script
I wrote a script that manages this automatically. It reads target entities from a YAML file and syncs the exposure state:
"""
Manage entities exposed to Home Assistant voice assistants via WebSocket API.

Uses the undocumented HA WebSocket commands:
  - homeassistant/expose_entity/list - list exposed entities per assistant
  - homeassistant/expose_entity      - expose/unexpose entities

Usage:
  python ha_expose.py --list-only   # list currently exposed
  python ha_expose.py --dry-run     # show what would change
  python ha_expose.py               # apply entities from YAML

Requires: pip install websockets pyyaml
Environment: HA_TOKEN (long-lived access token), HA_WS_URL (optional)
"""
import argparse
import asyncio
import json
import os
import sys
from pathlib import Path

import websockets
import yaml

DEFAULT_URL = "ws://homeassistant.local:8123/api/websocket"
LIST_CMD = "homeassistant/expose_entity/list"
EXPOSE_CMD = "homeassistant/expose_entity"

msg_id = 0


def next_id():
    """Return the next message ID for this connection."""
    global msg_id
    msg_id += 1
    return msg_id


def load_entities(path):
    """Load target entity IDs from a YAML file (one list per category)."""
    with open(path) as f:
        data = yaml.safe_load(f) or {}  # an empty file loads as None
    entities = []
    for category in data.values():
        if isinstance(category, list):
            entities.extend(category)
    return entities


async def send_and_receive(ws, payload):
    """Send a WebSocket message and wait for the matching response."""
    await ws.send(json.dumps(payload))
    while True:
        resp = json.loads(await ws.recv())
        if resp.get("id") == payload.get("id"):
            return resp


async def list_exposed(ws, assistant):
    """Return the set of entity IDs currently exposed to the assistant."""
    resp = await send_and_receive(ws, {"id": next_id(), "type": LIST_CMD})
    if not resp.get("success"):
        sys.exit(f"ERROR: list failed: {resp}")
    return {
        eid for eid, assistants in resp["result"]["exposed_entities"].items()
        if assistants.get(assistant) is True
    }


async def set_exposure(ws, assistant, entity_ids, should_expose):
    """Expose or unexpose a batch of entities in a single call."""
    resp = await send_and_receive(ws, {
        "id": next_id(), "type": EXPOSE_CMD,
        "assistants": [assistant],
        "entity_ids": sorted(entity_ids),
        "should_expose": should_expose,
    })
    verb = "Exposed" if should_expose else "Unexposed"
    print(f"{verb} {len(entity_ids)} OK" if resp.get("success") else f"ERROR: {resp}")


async def main():
    parser = argparse.ArgumentParser(description="Manage HA exposed entities via WebSocket API")
    parser.add_argument("--url", default=os.environ.get("HA_WS_URL", DEFAULT_URL))
    parser.add_argument("--token", default=os.environ.get("HA_TOKEN"))
    # Not required, so that --list-only works without a YAML file
    parser.add_argument("--entities", type=Path, default=Path("entities.yaml"),
                        help="YAML file with target entities")
    parser.add_argument("--assistant", default="conversation")
    parser.add_argument("--list-only", action="store_true", help="Only list currently exposed")
    parser.add_argument("--dry-run", action="store_true", help="Show what would change")
    args = parser.parse_args()

    if not args.token:
        sys.exit("ERROR: Set HA_TOKEN env var or use --token")

    async with websockets.connect(args.url) as ws:
        # Authenticate (explicit checks, since asserts are skipped under -O)
        msg = json.loads(await ws.recv())
        if msg.get("type") != "auth_required":
            sys.exit(f"ERROR: unexpected greeting: {msg}")
        await ws.send(json.dumps({"type": "auth", "access_token": args.token}))
        msg = json.loads(await ws.recv())
        if msg.get("type") != "auth_ok":
            sys.exit(f"ERROR: auth failed: {msg}")

        # List current state
        currently_exposed = await list_exposed(ws, args.assistant)
        print(f"Currently exposed to {args.assistant}: {len(currently_exposed)}")
        if args.list_only:
            for eid in sorted(currently_exposed):
                print(f"  {eid}")
            return

        # Load targets and compute diff (a set ignores duplicate YAML entries)
        target = set(load_entities(args.entities))
        to_unexpose = currently_exposed - target
        to_expose = target - currently_exposed
        print(f"Target: {len(target)} | To expose: {len(to_expose)} | To unexpose: {len(to_unexpose)}")
        if args.dry_run:
            for eid in sorted(to_expose):
                print(f"  + {eid}")
            for eid in sorted(to_unexpose):
                print(f"  - {eid}")
            return

        # Apply
        if to_unexpose:
            await set_exposure(ws, args.assistant, to_unexpose, False)
        if to_expose:
            await set_exposure(ws, args.assistant, to_expose, True)

        # Verify by comparing the actual sets, not just their sizes
        final = await list_exposed(ws, args.assistant)
        print(f"Final count: {len(final)} {'SUCCESS' if final == target else 'MISMATCH'}")


if __name__ == "__main__":
    asyncio.run(main())
YAML entities file
# entities.yaml - edit this to match your setup
temperature:
- sensor.living_room_temperature
- sensor.bedroom_temperature
- sensor.outdoor_temperature
climate:
- climate.living_room
- climate.bedroom
lights:
- light.all_lights
- light.living_room
- light.bedroom
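If you don't want to write the file from scratch, one option is to start from the IDs printed by `--list-only` and group them by domain. This is a hypothetical helper, not part of the script above; it relies on the fact that the category names in the YAML are arbitrary (the loader only reads the lists), so domain names work fine as keys.

```python
from collections import defaultdict


def group_by_domain(entity_ids):
    """Group entity IDs by domain (the part before the first dot)."""
    groups = defaultdict(list)
    for eid in sorted(set(entity_ids)):
        domain, _, _ = eid.partition(".")
        groups[domain].append(eid)
    return dict(groups)


# Hypothetical input, e.g. pasted from the --list-only output
exposed = ["light.bedroom", "sensor.outdoor_temperature", "light.living_room"]

# Print in the same layout as the entities file above
for domain, ids in group_by_domain(exposed).items():
    print(f"{domain}:")
    for eid in ids:
        print(f"- {eid}")
```

Prune the printed output down to the entities you actually want, then save it as your entities file.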
My use case
I’m running Ollama (qwen2.5:7b on RTX 5070 Ti) as a conversation agent for Home Assistant Voice PE. With 397 entities exposed, the 8K context window was completely saturated. After reducing to 94 carefully selected entities using this script, everything works perfectly:
- Temperature, climate, lights, covers, security sensors, weather, pool, energy
- Groups for bulk control (“turn off all lights”)
- Individual entities for specific control (“turn on the bedroom light”)
What I learned
- Do NOT edit .storage/homeassistant.exposed_entities directly - HA keeps the state in memory and overwrites the file on shutdown
- The WebSocket API is the only reliable programmatic way to manage exposed entities
- Changes take effect immediately - no restart needed
- With “Prefer handling commands locally” ON in the pipeline, simple intents (get state, toggle) are handled by native Assist, while complex queries go through Ollama
What’s missing from HA core
- No REST API endpoint for expose/unexpose (only WebSocket)
- No bulk management UI (only toggle per entity)
- No documentation of these WebSocket commands
- No way to define exposed entities in YAML configuration
Would love to hear if others have run into this and how you solved it!