Trio Edge - Open-source camera intelligence for your network

Not my work, but the author reached out to me and I'd like to raise awareness of it: machinefi/trio-core on GitHub, a real-time vision intelligence engine for Apple Silicon with YOLO counting, VLM scene understanding, auto-calibration, and a REST API in one command.

The pain point: HA camera automations can trigger on “person detected,” but not on what that person is actually doing. The current workaround is Frigate plus GenAI through Ollama, but that takes 5 to 10 seconds per frame, and it runs on a snapshot rather than the live stream. That’s too slow for automations like “if someone is at the door with a package, unlock the locker” or “if the garage door is open after 10pm, send me an alert.”

Trio Core runs a VLM on live camera feeds at 279 ms per frame, fully local on Apple Silicon.

  • Plain English conditions. Ask anything about the scene, no model retraining required
  • KV cache reuse across video frames, giving a 1.71x speedup on sequential frames
  • 73% visual token compression with under 1% accuracy loss. This is why it’s fast where Ollama isn’t
  • REST API built in. Just run pip install trio-core[mlx] && trio serve and you get a local API at localhost:8100, ready for HA automations
  • No Docker, no MQTT, no API keys, no YAML. One pip install and you’re done
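
To sketch how the local API could be wired into an HA automation: the snippet below polls a hypothetical query endpoint and exposes the answer as a binary sensor. The `/v1/query` path, the request payload, and the `answer` response field are my assumptions for illustration, not taken from the Trio Core docs; check the repo for the actual API shape.

```yaml
# Hypothetical example using HA's RESTful integration.
# The endpoint path and JSON fields below are assumptions,
# not the documented Trio Core API.
rest:
  - resource: http://localhost:8100/v1/query
    method: POST
    headers:
      Content-Type: application/json
    payload: '{"camera": "front_door", "question": "Is someone holding a package?"}'
    scan_interval: 5
    binary_sensor:
      - name: package_at_door
        value_template: "{{ value_json.answer | default(false) }}"
```

From there, `binary_sensor.package_at_door` can trigger a standard automation like any other entity, which is the “plain English condition” workflow the bullets describe.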

The goal is HA automations that understand scenes, not just object labels. Trio Core makes that possible without cloud dependency or multi-second latency.