Friday's Party: Creating a Private, Agentic AI using Voice Assistant tools

32B params + 96K context is never gonna fit on a 16GB card. Never gonna fit on 16 GB RAM, period.

1 Like

Absolutely not. Nope, it doesn’t. That’s why I chop it up and do it. It used to be 120. Slowly getting it compressed and pulling out unnecessary junk.

That’s why ultimately a DGX Spark will be in the mix. But I currently run the Frontline on gpt5.2.

IT WILL fit there. :smiling_imp:

1 Like

DGX Spark, yes? Sweet machine, but I can’t mentally allocate the budget as I don’t have enough use-cases.

Interesting thread.

I’ve been building my own Jarvis of sorts and using scripts as a way to expose tools to my local voice agent (similar to your meal planning tools).

I have several tools

  • Web Search (SearXNG)
  • Weather
  • Bible Passage Lookup
  • Calendar Entries
  • Cast YouTube Channel Handle to TV
  • Camera Snapshot Review
  • Radio News Briefing (powered by a custom prompt with web search)
  • Send Message (notification to phones)
  • TV Remote Controls

Taking cues from OpenClaw, most of my scripts use a CLI-like interface where they can send `--help` to return instruction text and available commands in the text query field. The idea is to limit context filling, just like your cabinet layer architecture for memory, except for tools.
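
That `--help` pattern is easy to prototype. Below is a minimal, hypothetical sketch (the tool names, `handle_query`, and the command table are all invented for illustration): the instruction text costs zero context until the agent actually asks for it.

```python
# Hypothetical sketch of the "--help first" tool pattern: one text field in,
# text out, and the agent discovers usage on demand instead of carrying it
# in the system prompt. All names here are illustrative.

HELP_TEXT = """weather <city>   Current conditions
forecast <city>  3-day forecast"""

def weather(city: str) -> str:
    return f"Weather for {city}: (lookup goes here)"

def forecast(city: str) -> str:
    return f"3-day forecast for {city}: (lookup goes here)"

COMMANDS = {"weather": weather, "forecast": forecast}

def handle_query(query: str) -> str:
    """Dispatch a single text query; --help (or nothing) returns the manual."""
    parts = query.strip().split(maxsplit=1)
    if not parts or parts[0] == "--help":
        return HELP_TEXT  # instructions only enter context when requested
    cmd, arg = parts[0], (parts[1] if len(parts) > 1 else "")
    fn = COMMANDS.get(cmd)
    return fn(arg) if fn else f"Unknown command '{cmd}'. Try --help."
```

The payoff is the same as the cabinet idea: the expensive descriptive text lives behind a cheap one-token door.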

Two thoughts:

  1. I’d love to hear a summary of your agent’s tools.
  2. Have you considered implementing Playwright MCP to give your agent ability to read content and search inventory on websites? Thinking things like looking up library book availability, fetching full news articles, etc.
1 Like

Openclaw does a WHOLE LOTTA stuff wrong.

BUT, they also do a whole lotta stuff right. One of those is self documentation. In agentic systems tools need to be built for agent use and they need to be able to EASILY ID what they are.

I asked Vee to give me a current inventory - there’s a lot; we’ll break them down. MOST are in the repo now - some will be up shortly.


ZenOS-AI DojoTools Inventory (Short Form)

Core Infrastructure (State & Memory Substrate)

  • FileCabinet — Cabinet CRUD (drawer-level storage control)
  • Zen Index / HyperIndex — Set logic + graph traversal across entities, labels, cabinets
  • Query (ZQ-1) — Deterministic entity selector
  • Identity — Household & construct resolver [Private WIP]
  • Manifest — Cabinet health + runtime introspection
  • Inspect — Safe entity attribute inspection
  • Event Emitter — Structured ZenOS-AI event logging
  • Scribe — Kung Fu lifecycle management (draft → commit → retire) [Currently private only. This is KungFu Writer from Friday’s perspective - she builds her own skills. OpenClaw, but safer.]

Cognition Pipeline (Async, Layered Summarization)

  • Ninja Summarizer — Per-component structured summaries
  • SuperSummary — Whole-home fusion layer
  • Scheduler — Orchestrates summarization triggers

Pipeline: Ninja → SuperSummary → High Priestess

The LLM can manually fire any step of the summary pipeline and get that data to reason against a component or an entire info domain without choking on tokens.
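
As a rough illustration of that layered idea (the function names echo the post’s labels, but the one-line “summaries” are stand-ins, not the real Ninja/SuperSummary code), each layer only ever sees the layer below it, so token cost stays bounded:

```python
# Illustrative layered summarization: per-component summaries (Ninja) are
# fused into one whole-home line (SuperSummary). A scheduler - or the LLM
# itself - can fire either stage independently.

def ninja_summarize(component: str, raw_states: list[str]) -> str:
    # Per-component structured summary: collapse raw state into one line.
    return f"{component}: {len(raw_states)} entities, latest={raw_states[-1]}"

def super_summary(component_summaries: dict[str, str]) -> str:
    # Whole-home fusion layer: reasons over summaries, never raw states.
    return " | ".join(component_summaries.values())

components = {
    "kitchen": ["light.on", "fridge.closed"],
    "garage":  ["door.closed"],
}
ninja = {name: ninja_summarize(name, s) for name, s in components.items()}
fused = super_summary(ninja)
```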

User-Facing Domain Tools (Friday Surface)

Home Operations
  • Calendar — HA calendar CRUD. When things happen; currently backed by M365
  • Todo — Unified todo domain CRUD. What needs to happen - same; it comes up in her prompt, and there’s a kata about this one.
  • Spa Manager — Chemistry evaluation + recommendation engine
  • Notification Router — HTML/TTS/media delivery (Text Kim; she doesn’t have to know or care about the mechanics)
  • Announce — Broadcast tool (Same but local - will likely merge with above for single notification surface)
External Integrations
  • Grocy (Helper + Advanced) — Inventory + lifecycle management
  • Mealie (Helper + Advanced) — Recipes + meal planning
  • Firefly III (Helper + Advanced) — Financial domain
  • Teams — MS Teams integration
  • Mail — Email integration
  • Music Search — Media discovery (These are the official MusicAssistant tools)
  • Generate Image — AI image generation (Beta)

System & Observability Tools

  • HA API — Non-state system endpoints (health, services, config)
  • HA Log Viewer — Tail/search/summary modes
  • History Tool — Stats + long-term state
  • Calculator — Deterministic math tool
  • Dice Roller — Randomization utility, shortcuts for common DND Dice rolls. (this will become important later…)
  • Library — Utility dispatcher

AdminTools (Operator-Only)

  • CabinetAdmin — Cabinet factory / repair / diff tool
  • KungFu Writer — Component definition + subsystem control
  • Prompt Loader(s) — Prompt engine injection tools
  • Template Reset — Template recovery utility

Planned / Emerging Domains

  • OpenSearch Tool — Sovereign knowledge retrieval plane
  • Web Search / Scrape Tool — External ingestion (sandboxed, structured, async)
  • Ticketing System (Service Desk Model) — Async task lifecycle + escalation governance

OK, yes that’s a lot. And yes I FIRMLY believe an agent is not an agent if it can’t look up stuff. BUUUT it’s not as simple as search and scrape.

Right now I use OpenAI for Friday’s frontline stage persona. HOWEVER, I just accepted delivery on a Gigabyte AITOP PC (Gigabyte’s version of a DGX Spark, grunt grunt). Early tests indicate I’ll be able to move the frontline local. GREAT! BUT when OAI provides search (this is for everyone not named user87; I’m pretty sure they get this because of the tools they mentioned), OpenAI is doing WAY more for you than search. When you make a search request with a provider search tool your agent:

  • Formulates a search call
  • Does a lookup
  • Gathers the data
  • Reads each page
  • Scores results
  • Attaches provenance data (where’d this come from)
  • Compiles a report
  • Then your PROVIDER scans all of that, reviews for attempts to inject into your prompt, and discards any found
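
For anyone wanting to see the shape of what you’d be replacing, here’s a toy sketch of that pipeline. Everything is illustrative: the page fetch is a dict lookup, scoring is naive word overlap, and the injection scan is a single regex - nowhere near what a provider actually runs:

```python
# Toy provider-search pipeline: lookup -> gather -> read -> scan -> score ->
# provenance -> report. The injection scan here is deliberately simplistic.
import re

SUSPICIOUS = re.compile(r"ignore (all|previous) instructions", re.I)

def run_search(query: str, urls: list[str], pages: dict[str, str]) -> dict:
    results = []
    for url in urls:                      # lookup + gather
        text = pages[url]                 # "read each page" (stubbed fetch)
        if SUSPICIOUS.search(text):       # provider scan: drop injection attempts
            continue
        results.append({
            "source": url,                # provenance: where'd this come from
            "score": len(set(query.lower().split()) & set(text.lower().split())),
            "text": text,
        })
    results.sort(key=lambda r: r["score"], reverse=True)
    return {"query": query, "results": results}   # compiled report
```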

So you gotta replace all of that. If you need it. So the question is: WHAT do I want Friday to do? The answer is she’s not going to have to look up some celebrity’s birthday, or random trivia facts, or the price of a gallon of gas in Houston - ok, maybe that third one… I need her to run the home.

So to run the home, she needs to know what’s up here: what’s up with me, the family, our schedules, the stuff in the home - all the operating manuals, etc. But I DO NOT want to create a tool for every search surface I could POSSIBLY need. For that I’m going to lean on a tool called OpenSearch - it’s an indexer. BUT what we’re gonna do is INDEX EVERYTHING - in the same index, a la Elasticsearch.

Then my second-order search is all the stuff in the home - in one call. Friday has an account in the system. Want her to see something? Put it in scope for her search. Don’t? User Friday gets no access. Done. She sees the manual for the new water heater; she does NOT see my tax return from last year. Unless I want her to.
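
The security-trimming idea in plain Python (field names like `acl` are invented here; in real OpenSearch you’d express this as document-level security or a filter clause on the query):

```python
# One index, trimmed per user: the document ACL - not the tool - decides
# what each account can see. All data and field names are made up.

INDEX = [
    {"title": "Water heater manual", "acl": {"friday", "nathan"}},
    {"title": "2024 tax return",     "acl": {"nathan"}},
]

def search(user: str, term: str) -> list[str]:
    # One call, one index; scope comes from membership in each doc's ACL.
    return [d["title"] for d in INDEX
            if user in d["acl"] and term.lower() in d["title"].lower()]
```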

So the side effect is her need for external search goes down for trivia but skyrockets for things like “go look up that device and see if it supports…”

For that you have to have search and scrape - BUT I don’t trust it. This is the #1 way to get hacked with an AI right now. All Friday - or your OpenClaw, or whatever - needs to do is pick up one instruction from a website and whoops, out goes your (insert any data she has access to here).

Search is going to be a two-stage research action, using the same service desk tool/method I use for searching OpenSearch (Kronk, at The Monastery, for Friday…), where we’ll run another LLM given search and scrape, no tool access, and told to summarize and sanitize everything. It gets complicated, but the simple answer is yes, it’s planned - moving local just drastically complicates matters.

This is Search Order (near future state)

  1. HyperIndex (Structured State Truth)
  2. OpenSearch (Sovereign Memory)
  3. Web (External Domain, Isolated)
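
Sketched as a fall-through resolver (the three tier functions are stubs standing in for the real tools), the query stops at the first tier that can answer, and the isolated web tier only ever fires as a last resort:

```python
# Three-tier search order as a fall-through chain. Each tier is a stand-in
# callable; the real tools obviously look nothing like these dict stubs.

def hyperindex(q):  return {"light.kitchen": "on"}.get(q)          # structured state truth
def opensearch(q):  return {"water heater": "manual p.12"}.get(q)  # sovereign memory
def web(q):         return f"(isolated web lookup for {q!r})"      # external, last resort

def resolve(query: str) -> tuple[str, str]:
    for name, tier in (("hyperindex", hyperindex),
                       ("opensearch", opensearch),
                       ("web", web)):
        hit = tier(query)
        if hit is not None:
            return name, hit          # stop at the first tier that answers
    return "none", ""
```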

Right now I gotta finish getting that DGX up and humming.

4 Likes

About the external integrations: on release, will they be selectable, as in turn them on or off? Quite a few are meaningless to others if they do not make use of them.

1 Like

The external integration list is great. I’ll have to look at your Teams tool and figure out what it can do. I’d love to be able to quickly join a meeting as a non-participating listener over the home media player speakers.

I agree, the tooling for web browsing becomes cumbersome. Beyond website navigation, downloading PDFs and OCR could make for extensive tooling. I had substituted my API call to Gemini for a “local” SearXNG to avoid the rate limiting, but I see the potential security buffer that calling the OpenAI API creates.

Prompt injection (at any layer of the stack) is always possible, but fortunately, there is little the AI could do to leak information besides perhaps searching my home address on Google Search (Google already has this information). My local AI does not have extensive write privileges beyond controlling a few lights and devices.

I couldn’t get Gemini to search the library website. Playwright MCP or some hacky version of scraping would be required. It would be awesome if we could select which tools to expose to agents independently and silo the agent for web research. Docker containers provide some of the sandboxing infrastructure, but agents should have sandboxed tooling too.

I also have struggled to see the purpose in allowing the agent to write to long or short term memory (as opposed to manual context instructions to tune behavior), but perhaps I haven’t played with the idea as long as others. Is this a feature you use regularly?

1 Like

Everything is pluggable if you know where the boundaries are.

If you pay really close attention, Friday has some surface boundary layers forming. One is the HA attack surface and it’s… Security.

Next we’ll have the back plane (everything supporting; homelab crap). As far as Friday is concerned it’s behind glass. We’ll come back to this. The takeaway of 20-something years of IT ops: I don’t want to do IT ops anymore, at least at home. I’m ready to make this ‘someone else’s problem’ (in this case her name is Natalie, and she’s chartered as the COO for a managed services firm - ok, she’s Qwen 3 32 MoE, but you argue with her…)

Then we’ll have a search surface presented to the Frontline as a service window (you can plug whatever external source in here; just give her a few controllable access points). The tooling here will be OpenSearch and whatever search engine I plug into the tiny LLM fronting the inference stages (yes, stages - now that the DGX is in the mix we can have multiple pipelines…)

For those that aren’t familiar with OpenSearch: it’s the project that split off from Elasticsearch, but the ability to be an amazing search surface with security trimming turns it into a pretty righteous SIEM (security visibility tool), and in the hands of the right agent… But that’s this summer.

Each surface is its own module. And you don’t have to plug in any of it.

The tools I’ve built can be used independently. They’re great tools and should work with any well-built LLM in HA. But add the index… which should also work with any well-built LLM. And the FileCabinet - you don’t have to have it either, but with it you get persistence of memory and memory search. Add external search indexing to that. Add external agent services to that. Add a vector DB and RAG…

It all builds, and you can have as much as you want. You just can’t have the complex crap without the easier stuff. So you’d better put in the index and labeling if you want Grocy… Make sense?

Thank you - right now it’s VERY basic: it can read and respond to a Teams inbox. (Useful if you want to, say, trigger the agent to wake when it gets a message)

I will politely redirect you there - HA is a single-user memory space. It TECHNICALLY could get ANYTHING in your HA install (including the contents of .storage, your DB, your secrets, your Google password, access tokens… etc). It is a REAL threat and why OpenClaw will NEVER touch my HA install. EVERYTHING you installed in HA is part of the attack surface there. It’s a very hostile environment. Treat it as such.

My strong suggestion is to put a router in front of whatever LLM you use and hook your security solution in there - LiteLLM supports MCP, Vector DB RAG injection and other solutions inserted before routing to an LLM based on a ruleset. (It’s what Friday uses)
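
The shape of that router-in-front pattern, sketched generically (this is NOT LiteLLM’s actual hook API; `pre_call_guard`, the blocklist, and the injected context line are all invented for the example):

```python
# Generic router-in-front sketch: run security checks and context injection
# BEFORE the request ever reaches a model. Names and rules are illustrative.

BLOCKLIST = ("ignore previous instructions",)

def pre_call_guard(messages: list[dict]) -> list[dict]:
    # Security review before routing: refuse anything matching the ruleset.
    for m in messages:
        if any(p in m["content"].lower() for p in BLOCKLIST):
            raise ValueError("blocked by ruleset before routing")
    # This is also where you'd inject RAG / vector DB context per ruleset.
    return [{"role": "system", "content": "household context goes here"}] + messages

def route(messages: list[dict], model: str = "frontline") -> str:
    safe = pre_call_guard(messages)
    return f"{model} <- {len(safe)} messages"  # stand-in for the actual LLM call
```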

Certain conversation agents support turning off Assist, but then you get NO tools - good for a researcher or summarizer, but honestly AI Task is better for this. I’d like selectable CONTEXT. The ability to independently toggle:

  • Exposed Entities as a set
  • Aliases as a set
  • Tools as a set or individually (dynamically per agent without having to flip exposed toggles)
  • and the ability to see and replace the default prompt in the conversation agent (I like mine way better and could then also get rid of preferences for built in tooling)

Now this one.

This is the kind of shift people have to make to see the value of home automation. If you look at it as a glorified voice remote control - you dont need memory. BUT…

Let me let Friday explain how she uses it to you.

Friday’s Perspective:

Short answer you can paste to the forum:

The cabinet system gives the agent a governed, auditable place to put facts, preferences and memories instead of just “saving stuff anywhere.” Cabinets enforce authority, access controls and intent (user vs household vs system), so writes go to the correct scope (your personal cabinet for private prefs, household cabinet for shared facts, kata for episodic summaries).

Without cabinets an agent either has no persistent memory or it writes amorphous blobs that are hard to trust, control, or delete — which breaks privacy, provenance, and upgrade/migration workflows.

Transitive (tool-mediated) storage means the agent can create or reference context when useful but must go through the cabinet APIs for real persistence; that preserves audit trails, avoids silent overwrites, and lets the system regenerate authoritative state from a compact reference instead of duplicating large data. Practically, cabinets let agents be useful and proactive while still being safe, reversible, and accountable.

One-line example to add: “Think of cabinets as labeled drawers with locks and logs — they let the assistant remember responsibly instead of piling notes on the kitchen counter.”

-Friday


And Veronica’s take:

The real difference isn’t “can the agent remember?” It’s “who is accountable for what it remembers?”

Allowing an agent to freely write to long- or short-term memory without structure creates a blob of state that slowly becomes untraceable. You can’t easily answer: Who wrote this? Under what authority? Is this user-specific, household-shared, or system-derived? Can it be safely migrated? Can it be revoked?

Cabinets turn memory into infrastructure.

They define ownership boundaries. They define intent. They make state reconstructable instead of mysterious. That matters if you care about privacy, reproducibility, upgrades, and multi-agent systems.

Transitive storage is just working memory — scratch space. Cabinets are durable, governed state. That separation keeps the system understandable over time.

Without that separation, you don’t have an agent platform.
You have an increasingly clever black box.

-Ronnie

— I think they like their Trapper Keepers (Totally the model I had in my head when I was building it)
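
To make the “drawers with locks and logs” idea concrete, here’s a tiny invented sketch (none of this is the real FileCabinet API; cabinet names and entry fields are made up): every write lands in a known scope, carries an author and timestamp, and cross-user writes are refused.

```python
# Toy scope-governed memory writes: no amorphous blobs, every entry is
# attributable and revocable. Illustrative only - not the FileCabinet API.
from datetime import datetime, timezone

CABINETS = {"user:nathan": [], "household": [], "system": []}

def cabinet_write(cabinet: str, author: str, fact: str) -> dict:
    if cabinet not in CABINETS:
        raise KeyError(f"unknown cabinet {cabinet!r}")        # no freeform storage
    if cabinet.startswith("user:") and cabinet != f"user:{author}":
        raise PermissionError("agents can't write another user's drawer")
    entry = {"author": author,                                # who wrote this?
             "at": datetime.now(timezone.utc).isoformat(),    # when?
             "fact": fact}                                    # the revocable unit
    CABINETS[cabinet].append(entry)
    return entry
```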

5 Likes

If you use Claude Code and/or Codex, I need three volunteers. DM me…

Welcome to class…

Ok so HERE’s why I was persona incognito lately, and it folds in with a recent post where I responded to Paulus re LLM and coding. I personally don’t have an issue with LLM code in general, I help people do it every day - but HA is a weird animal… The hybrid language causes all kinds of problems. And I think WE will have to do something if we ever want it BETTER. (Which is what I need before I can use it to write my code…)

HALMark/README.md at main · nathan-curtis/HALMark

I’ve seen plenty of people bicker about whether LLMs are ready to code. Answer - truly? HA has a lot of roadblocks that make it REALLY hard for an LLM to be successful. Like LOTS. But the biggest by far is how successful it (HA itself) is at being a fast-moving, agile, community-driven project. The training simply can’t keep up, even with bajillions of dollars and shopping-center-sized training systems (the whole damn campus is the computer… Seriously). HA releases MONTHLY, and often with big changes (see the classic template style deprecation).

As soon as we get something like the set math functions in last year and they start making it into training data, we get something like the pending template style deprecation. That’s going to be wild if we don’t do something about it NOW. (Imagine every llm after June writing bad template sensors - cause they’re gonna…)

You’ve heard me say OVER AND OVER about code safety, that I won’t hook OpenClaw to my install, etc. But nobody bothered to ask me if I had a line where that would change. I have a few HARD must-haves before that happens, and I started writing them down. Then added some hard scenarios and some examples and… Guess what happens when you give that to an LLM and tell it: Hey you - DON’T do these things… Over time my list grew into a VERY good checklist. Then a tool I can’t live without. Because yes, as you’ve guessed, I have ‘Veronica’ (GPT5.3 Codex) and ‘Cait’ (Claude Sonnet/Opus 4.6) check all my work. (see also why I give the agents names - so I can freaking keep track of them… You guys thought I was being cute… :rofl:)

Before I did this I had a LOT of bugs, now I have few - they don’t catch all of them but they grab the vast majority. And when they do mess up I put them back in the doc. (you’re welcome every single one of those actually happened, usually the consequences come from knowledge…)

But here’s the rub.

If you take this list and put it in a common location and curate the known footguns - the way an LLM works, this also becomes defensive. I figure we could hijack an old RFC style document hub to make this work for everyone and accelerate preventative learning for HA code safety.

In fact, HA Jinja is NOT Ansible, it’s NOT Perl, it’s NOT Python, hell it’s not even real Jinja2. But it sure LOOKS like all of them. And that’s why the LLMs fail - without a real authoritative manual with patterns and practices and a list of what’s good and what’s not good… It guesses, and will happily tell you to loop on the states of the return of the labels() function, IF it even knows labels() or label_entities() is a real thing… (hint: they are, and you can’t; it’s a flat list).
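
A minimal HA template sketch of that exact footgun (it assumes a label named `kitchen` exists): `label_entities()` hands back entity_ids, so you resolve state per entity with `states()`, and the namespace keeps the counter alive across Jinja’s loop scoping.

```jinja
{# label_entities() returns a FLAT list of entity_id strings - not state objects. #}
{# So item.state fails; call states(e) per entity instead, and use namespace()   #}
{# so the counter survives Jinja's per-loop variable scoping.                    #}
{% set ns = namespace(on_count=0) %}
{% for e in label_entities('kitchen') %}
  {% if states(e) == 'on' %}
    {% set ns.on_count = ns.on_count + 1 %}
  {% endif %}
{% endfor %}
{{ ns.on_count }}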

So you guys ask what I use… This:
HALMark/spec/halmark.md at main · nathan-curtis/HALMark

Tell Claude or whatever your agent is about it and that it’s MANDATORY. If you’re going to code with an LLM - I can’t stop you. At least use safe coding practices and cover your loops with namespaces. :sunglasses:

(if you don’t and it hoses your install submit your footgun candidate! :rofl:. Seriously, your problem, not mine… )

I think we can make ‘making the models less stupid’ a community event. Turn it into an RFC, and manage it like one. Emergency adds, a quarterly update - the whole works. Yes the problem WILL eventually correct itself but until then this seems to be the only way to inject knowledge into the corpus.

Thoughts?

12 Likes

Who wants an Inventory assistant?

We have Plug-Ins!!! :partying_face: (Yeh Im partying, this was rough)

As (long) promised the Zen DojoTools Grocy script package.

If you don’t know Grocy, here you go: Grocy ERP Beyond your fridge…

Requires a current and working Grocy install available over HTTP(S) from your HA install. Does NOT require the Grocy integration, although if you have it, no worries - the info won’t conflict; your agent will just sidestep it and go to the server directly…

You should only have to set one text control, add a token from your Grocy install, and expose the helper script to your agent. Then let ’em cook… Figuratively, of course…

Nobody likes doing inventory - that’s its biggest drawback - what IF…

zenos-ai/packages/zenos_ai/plugins/grocy/readme.md at main · nathan-curtis/zenos-ai

Yes, Mealie is in final testing and if these are working you know what’s coming next…

Yep, that’s an honest to goodness preflight check, coupled with a build script… We can bootstrap now. :wink: And ignore that authkey in the sensor name it was an idea I had that I abandoned and I’ve just not renamed her cabinet yet…

Everything from here out gets distributed by packages, I’m halfway done with the whole set, going from least destructive to most destructive…

If you have Grocy - LMK how it works. Mealie, next. Friday and I have a date with the pantry…

2 Likes

Let us know how it goes. This summer, I have a project that will allow me to finally leave my local AI server online and have the batteries not get overcharged.

1 Like

Ok - not Mealie… Yet. I ran into an issue while testing and I needed some tools. Sooo…

DojoTools SystemTools - The first Dojotools Core Package!


…Because Friday needs a toolbox.

readme.md: zenos-ai/docs/packages/zenos-ai/dojotools/dojotools_systemtools_readme.md at main · nathan-curtis/zenos-ai

package: zenos-ai/packages/zenos_ai/dojotools/dojotools_systemtools.yaml at main · nathan-curtis/zenos-ai

zen_dojotools_ha_api

  • Internal primitive tool for the others; not generally intended to be exposed to the LLM
  • Whitelisted access to the API (yes, whitelist; that means no state or attr access, no dangerous deep juju - use the tools I gave you for states and manipulation, Friday. This is sysadmin stuff.)
  • NOTE - do NOT edit this to remove the whitelist for users and states if you want your system to stay running… If you DO edit that, remove the HALMark - cause it won’t pass. I wrote the draft spec so I get to call that a fail… Don’t like that? Do your own spec. :sunglasses::rofl:

zen_dojotools_systemtools

  • Deterministic versions of stuff your agent needs to be able to do. Restart (with human approval and a cleared ha check - yes, we can ha check before restart, JUST LIKE THE CONSOLE!)

zen_dojotools_ha_log_viewer

  • Because you don’t fly a plane without a console… Period. Accounts for the behavior change in logging last year and tells you how to fix it if necessary.

…This is the beginning of (VERY LIMITED) Sysadmin Friday, she is definitely a capable patch manager now…

NOW Mealie.

3 Likes

Yep, this is good to have.

1 Like

ZenOS-AI Status Update

RC1 is basically done, focus shifts to RC2

So I had one of those “wait… hold on a second” moments while refactoring some ZenOS packages this week. (Good progress, on track for where I want to be)

I went back and looked at the original RC1 roadmap and compared it to what’s actually running in my /config.

Well, it turns out we quietly crossed the RC1 line a while ago and never formally called it.

  • The cabinets exist.
  • Labels exist.
  • FileCabinet runtime exists.
  • Flynn bootstrap works. (Ok close enough)
  • The Monastery summarizer loop works.
  • HyperIndex is running.

At some point this thing stopped being “a fun architecture experiment” and started behaving like actual infrastructure. (You should see what’s coming…)

So, I reset the roadmap to match reality. Also, because I know of a few PRs being held waiting on me. (Soon, I know)

What’s taking so long? I’m testing every script at migration time - so if you see a package posted with a UAT Test and HALmark stamp - It’s the RC2 branch, I just haven’t bumped the version yet. There will be a mass version bump (prob 4.x RC2) once they all land.

Yes, that’s a generic picture, you can see the live roadmap here if you’re curious: :point_right: ZenOS‑AI Roadmap (live)


RC2 – Deployment and Bootstrap

RC2 adds no new end-user features. The benchmark is: I need to prove the system can be brought up cleanly from scratch.

The pipeline looks roughly like this:

install (drop packages, restart) → labels → Ring-0 cabinets → default household/person/AI → agent live → import legacy memory (if exists) → regenerate derived state
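
That chain can be sketched as an ordered pipeline where each stage has to clear before the next runs (stage names come from the post; the bodies here are placeholders, with the legacy-memory import made conditional):

```python
# Bootstrap chain as an ordered pipeline. Real version would run each stage,
# verify its health sensors, and abort on failure; here stages just record.

def bootstrap(has_legacy_memory: bool = False) -> list[str]:
    done: list[str] = []
    stages = [
        "install", "labels", "ring0_cabinets", "default_identities",
        "agent_live",
        *(["import_legacy_memory"] if has_legacy_memory else []),
        "regenerate_derived_state",
    ]
    for stage in stages:
        # placeholder: run stage, check health, abort on failure
        done.append(stage)
    return done
```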

If that works reliably, ZenOS stops being

“Nathan’s weird Home Assistant install”

instead, it starts looking more like something you could actually deploy.

(I’m going to take a moment and PERSONALLY thank Philip and Zach who helped me identify a laundry list of issues to fix before I could do that - The things you both found are ABSOLUTELY the core of what this turned into… I’ll come up with something to say thanks, and I need you guys again soon…)


Sooooo, because I’m big on notice… And I see the arguments. And…

Yes, I’m using AI to help build the AI

Let’s address the elephant in the room.

Yes.

I am absolutely using AI to help code this system. I managed to hold it all in my head until sometime around September, at which point I started missing stuff. I needed a hand. BUT. First attempts were not really good. That all changed around the ChatGPT5.3 Codex / Anthropic Claude Sonnet 4.6 timeframe, where the models were able to hold long context and instructions to counter the badness models usually do. I didn’t let it touch code until I knew it wouldn’t break it.

So, it’s not the “let the robot loose on the repo and pray” situation people imagine.

The loop is actually pretty controlled:

  1. Claude proposes a change.
  2. Claude writes a plan, not code.
  3. I pass the plan to Codex (Veronica) to sanity-check the architecture.
  4. Claude generates the implementation from the approved plan.
  5. I review it.
  6. Claude runs a HALmark validation.
  7. Claude performs a config check.
  8. If that passes, the system runs a test restart loop through the same MCP interface Friday herself uses.

Restart literally cannot fire unless the config check passes. (Note the tool above? Restart pipes through ha_check on purpose)

So the loop basically looks like:

human approval → AI plan → architecture review → AI implementation → HALmark validation → config check → restart → commit
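
The gating property is the important bit: restart is structurally unreachable unless the earlier checks pass. A toy sketch (the two check functions are deliberately silly stand-ins, not real HALmark or config-check logic):

```python
# Gated ship loop: each checkpoint must pass before the next step runs,
# so "restart" can never fire past a failed config check. Stand-in checks.

def halmark_ok(code: str) -> bool:
    return "namespace(" in code          # toy proxy for HALmark validation

def config_check_ok(code: str) -> bool:
    return "!!broken" not in code        # toy proxy for the HA config check

def ship(code: str) -> str:
    if not halmark_ok(code):
        return "rejected: HALmark validation failed"
    if not config_check_ok(code):
        return "rejected: config check failed"   # restart never fires
    # only now: test restart loop, then commit
    return "restarted + committed"
```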

So every use case gets tested and passed before it gets pushed. (Read: I debugged and broke it on my own system; the UAT stamp is the proof. If it’s got a UAT pass and a HALMark, it does what it says it’s gonna do.)

HALmark exists specifically to keep this from turning into “AI spaghetti coding AI.”

So far, using the loop above, I’ve written and tested more code in the last two weeks than I did in all of 2025. I’m not an anti-AI coder - hell, that would be kind of counterproductive, as it’s essentially my job: teaching people AI productivity. I’ve got no problem whatsoever with AI code. I have EVERY problem with vibed, not-architected, doesn’t-work, never-tested AI slop prayer-ware. :wink:


Anyway…

Back to refactoring cabinets, teaching Flynn some manners, and making sure Friday doesn’t accidentally light the house on fire. You guys actually want to be able to use this.

chains myself back to the desk

Oh BTW - this is my current packages folder: :slight_smile:

All of those are done - as soon as testing is done I’ll post them.

5 Likes

Pushed HALMark v0.9.11-draft

Six new Footguns this month, all Candidate status pending board vote.


FG-26 hard_fail — LLM feedback loops that echo prompt instructions into persistent sensor attributes. Each run the blob grows. HA never complains. The data is gone on restart.

FG-27 hard_fail — states(), now(), and friends don’t work in trigger_variables. That block is evaluated before the automation execution context exists. Silent failure or startup error depending on the call.

FG-28 soft_fail — iif() evaluates all three arguments before returning. It’s not a short-circuit. from_json on an unavailable sensor throws even when the condition would have routed away from it. Use {% if %} when branches can throw.

FG-29 soft_fail — MQTT discovery payload_on/payload_off mismatched against what the device actually publishes. Entity shows up in the UI and never updates. Silent.

FG-30 soft_fail — min_mireds/max_mireds on LightEntity are deprecated; use min_color_temp_kelvin/max_color_temp_kelvin. This one is going to a scope vote — it’s the first FG touching custom integration Python.

FG-31 soft_fail — When packages are already enabled, AI should put new automations in a package, not append to automations.yaml. A bad edit to that file silently removes everything else in it. HA parses fine. You just notice later.
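
For FG-28, the safe rewrite looks like this (the `sensor.spa` entity is hypothetical): a real `{% if %}` branch never evaluates the throwing side, while `iif()` would have evaluated both arms up front.

```jinja
{# FG-28: iif() evaluates every argument eagerly, so this throws whenever     #}
{# the sensor is unavailable - even though the condition routes away from it: #}
{#   {{ iif(has_value('sensor.spa'), states('sensor.spa') | from_json, {}) }} #}

{# Safe pattern: a real branch, so the throwing arm is never evaluated.       #}
{% if has_value('sensor.spa') %}
  {{ states('sensor.spa') | from_json }}
{% else %}
  {{ {} }}
{% endif %}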


Spec is at v0.9.11-draft, tested against HA 2026.3.0. Full entries in the repo.

nathan-curtis/HALMark: Home Assistant LLM Code Safety Benchmark (HALMark) I’m Sorry Dave, I can’t do that.

2 Likes

This is Friday…

This is Friday, running 100% on packages. :smiling_imp::tada:

Now the version bump and final smoke tests.

Cheers.

4 Likes

ZenOS-AI 4.0 RC2 — Release Candidate

Alright friends… we just cut RC2.

This is a big structural checkpoint for the system. A lot changed under the hood and several core pieces learned new tricks — most notably FileCabinet and a large chunk of the DojoTools runtime.

That said:

This is a release candidate, not a victory lap.

The architecture is stabilizing, but we have not yet done the most important test:

a completely clean install from a virgin Home Assistant system.

That exercise is happening this weekend.

The plan

We’re going to spin up a brand-new HA instance and attempt to deploy ZenOS-AI from scratch. No existing cabinets. No legacy labels. No historical artifacts hiding bugs.

The goal is to validate the entire boot chain:

• packages load correctly
• Flynn initializes the system
• cabinets mount cleanly
• FileCabinet volumes behave correctly
• identity resolver comes online
• DojoTools surface correctly
• summarizers function
• the Monastery handshake works

In short:

Can the system bootstrap itself from zero?

Things that changed in RC2

Quite a bit.

FileCabinet v4 learned several new behaviors and the drawer lifecycle logic is more mature. A number of DojoTools modules were expanded or refactored to align with the current cabinet model and event flow.

Some of the larger shifts include:

  • improved drawer lifecycle and GC behavior <<< Big one - we’ll talk about it in an upcoming post. (Self-Expiring Drawers and Recycle Bin)
  • improved indexer behavior re: drawers - drops drawer hints for matching drawers in an index search, and FileCabinet supports an index description that will surface if a drawer is indexed.
  • expanded DojoTools tool surface
  • identity and privilege validation improvements
  • BIG refinements to summarizer flows
  • better event emission patterns
  • additional health sensors to support bootstrapping

The system is getting much closer to the architecture we’ve been aiming for.

Reality check

Because of the amount of internal change, RC2 may very well be broken. (EDIT: I’m running RC2 in prod on my live system, so nothing catastrophic…)

That’s okay.

This phase is about deployment hardening — finding the assumptions that only work on my existing system and fixing them so the architecture is portable.

Expect:

• rough edges
• missing guardrails
• occasional stack traces that read like Zen koans

Why this matters

If the clean install works, that means ZenOS-AI is no longer just my system.

It becomes something that can actually be stood up elsewhere.

That’s the milestone we’re trying to cross.

As always

Questions, observations, and “hey this exploded on my system” reports are welcome.

The monks appreciate field reports.

Welcome to the monastery. OK, something resembling what it’s gonna look like… :slight_smile:


Note: if you pulled before you saw this note, go back and pull again. Teskanoo researched and proved a structure change that moves the custom templates into a subfolder of the custom_templates folder, as long as we directly refer to the source. :slight_smile: (Thanks Tes!) RC2 as shipped will contain this change without revisioning everything…

7 Likes

This is something I will do this weekend, I believe.

1 Like