ha-state-archive: Infrastructure-side Audit & Archival for Long-Lived Setups

ha-state-archive: Structural Audit & Archival Tooling for Home Assistant

Hi everyone,

I would like to introduce an open-source project extracted from my own long-running Home Assistant production infrastructure:

Repository:


Why this project exists

As Home Assistant setups grow over the years, they increasingly behave like long-lived software systems.

With large YAML include trees, registry-managed entities, generated runtime structures, retained historical logic, partial migrations, and dynamic templates, the question eventually stops being:

“Does the configuration load?”

and becomes:

“Do I still structurally understand this system?”

Most existing tools focus on:

  • backups
  • YAML validation
  • formatting
  • linting

Those tools are essential, but they do not answer questions such as:

  • “Where is this entity actually declared?”
  • “Should this entity exist in YAML or in the registry?”
  • “Is this reference statically resolvable?”
  • “Is this an integrity problem or expected runtime behavior?”
  • “What structurally changed between two releases?”
  • “Can this archived version be safely purged?”

What is ha-state-archive?

This is not a traditional backup tool.

ha-state-archive is an infrastructure-side archival and audit pipeline designed for long-lived Home Assistant systems.

The repository currently includes:

  • include graph resolution
  • declaration extraction
  • structural integrity auditing
  • registry authority analysis
  • runtime YAML authority classification
  • release-oriented diff generation
  • deterministic retention workflows
  • quarantine-first purge workflows
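
To make "include graph resolution" concrete, here is a minimal sketch, not the project's actual implementation. It assumes the configuration files are already loaded into a dict mapping relative paths to text, and it only follows plain `!include` directives (directory includes such as `!include_dir_list` are ignored, and a commented-out directive would still match). The function name `include_graph` and the regex are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Matches "!include <path>" but not "!include_dir_list" and friends,
# because whitespace is required right after "!include".
INCLUDE = re.compile(r"!include\s+(\S+)")

def include_graph(files: dict[str, str], root: str = "configuration.yaml") -> dict[str, list[str]]:
    """Map each reachable file to the files it includes directly."""
    graph: dict[str, list[str]] = {}
    pending = [root]
    while pending:
        path = pending.pop()
        if path in graph or path not in files:
            continue  # already visited, or target missing from the snapshot
        base = PurePosixPath(path).parent
        # Include paths are resolved relative to the including file.
        targets = [str(base / target) for target in INCLUDE.findall(files[path])]
        graph[path] = targets
        pending.extend(targets)
    return graph
```

Files that appear as a target but never as a key in the result are includes that could not be found in the snapshot, which is itself a useful audit signal.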

The central component is the audit engine.

Instead of only validating YAML syntax, it attempts to reason about Home Assistant structural integrity through concepts such as:

  • authority modeling
  • static vs dynamic references
  • actionable anomalies
  • architectural observations
  • bounded outputs

One important design goal is distinguishing between:

  • actual integrity problems;
  • intentionally dynamic Home Assistant behavior;
  • expected runtime-only mechanisms;
  • infrastructure-side observations.

For example, some Home Assistant platforms intentionally operate outside the entity registry model and should not automatically be treated as integrity failures.
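
As a rough illustration of the static versus dynamic distinction, the sketch below classifies a single YAML scalar. It is an assumption-laden simplification: the `ENTITY_ID` pattern is a loose approximation of Home Assistant's entity ID format, and `classify_reference` is a hypothetical name, not part of the project.

```python
import re

# Simplified assumption: an entity ID looks like "<domain>.<object_id>".
ENTITY_ID = re.compile(r"^[a-z_]+\.[a-z0-9_]+$")

def classify_reference(value: str) -> str:
    """Return 'dynamic', 'static', or 'not_a_reference' for a YAML scalar."""
    if "{{" in value or "{%" in value:
        return "dynamic"  # Jinja template: only resolvable at runtime
    if ENTITY_ID.match(value.strip()):
        return "static"  # literal entity ID: can be checked offline
    return "not_a_reference"
```

Only values classified as static can reasonably be reported as broken references; dynamic ones are better treated as expected runtime behavior.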


Example audit output

Example anomaly types currently implemented:

Type                       Meaning
declared_not_in_registry   YAML declaration missing from registry
registry_not_declared      Registry entity without matching declaration
broken_reference           Static reference to an unknown entity
runtime_yaml_observation   Runtime YAML platform intentionally absent from registry
The audit engine also distinguishes between:

  • actionable anomalies;
  • architectural observations.

Observations are reported separately and do not increment the anomaly count.
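
The four types above, and the split between anomalies and observations, can be sketched with plain set operations. This is a simplified illustration under stated assumptions, not the audit engine itself: `RUNTIME_ONLY_PLATFORMS` holds a placeholder name, the `audit` signature is hypothetical, and the sketch ignores registry authority (for instance, UI-managed entities that are never expected in YAML).

```python
from dataclasses import dataclass, field

# Assumption: placeholder for platforms that intentionally bypass the registry.
RUNTIME_ONLY_PLATFORMS = {"example_runtime_platform"}

@dataclass
class AuditResult:
    anomalies: list = field(default_factory=list)
    observations: list = field(default_factory=list)

def audit(declared: dict[str, str], registry: set[str], references: set[str]) -> AuditResult:
    """declared maps entity_id -> platform; the other two are sets of entity IDs."""
    result = AuditResult()
    for entity_id, platform in sorted(declared.items()):
        if entity_id in registry:
            continue
        if platform in RUNTIME_ONLY_PLATFORMS:
            # Expected behavior: reported, but not counted as an anomaly.
            result.observations.append(("runtime_yaml_observation", entity_id))
        else:
            result.anomalies.append(("declared_not_in_registry", entity_id))
    for entity_id in sorted(registry - set(declared)):
        result.anomalies.append(("registry_not_declared", entity_id))
    known = registry | set(declared)
    for entity_id in sorted(references - known):
        result.anomalies.append(("broken_reference", entity_id))
    return result
```

With this shape, `len(result.anomalies)` is the anomaly count and the observations never affect it.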


Structural release diffing

The repository already includes a structural diff engine capable of generating bounded release-to-release Markdown reports focused on meaningful configuration evolution rather than raw file comparison.

Current implemented concepts include:

  • declaration-level changes;
  • structural additions/removals;
  • bounded diff outputs;
  • exclusion-aware diffs;
  • release-oriented Markdown reporting.

The long-term direction is a more semantic, structure-aware analysis of how a configuration evolves between releases.
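
A declaration-level diff with bounded output could look roughly like the sketch below. It assumes each release has already been reduced to a dict mapping a declaration key (for example an entity ID) to its parsed content; the function name `diff_declarations` and the `limit` parameter are hypothetical and not taken from the repository.

```python
def diff_declarations(old: dict[str, dict], new: dict[str, dict], limit: int = 50) -> str:
    """Render added, removed, and changed declarations as a bounded Markdown report."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(key for key in old.keys() & new.keys() if old[key] != new[key])
    lines = []
    for title, items in (("Added", added), ("Removed", removed), ("Changed", changed)):
        lines.append(f"## {title} ({len(items)})")
        lines.extend(f"- `{item}`" for item in items[:limit])
        if len(items) > limit:
            # Bounded output: summarize the overflow instead of listing it.
            lines.append(f"- ... {len(items) - limit} more not shown")
    return "\n".join(lines)
```

Comparing parsed declarations rather than raw files means that moving a declaration to another include file, or reformatting it, produces no diff entry.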


Architectural direction

The project intentionally follows an infrastructure-oriented approach.

Most processing occurs outside Home Assistant itself:

                 [ Home Assistant ]
                           │
                           ▼
             Immutable extracted versions
                           │
                           ▼
                    Archival pipeline
                 ┌─────────┴─────────┐
                 ▼                   ▼
          Structural audit     Release diffs
                 │                   │
                 └─────────┬─────────┘
                           ▼
                  MQTT supervision
                           ▼
               Retention classification
                           ▼
                      Quarantine
                           ▼
               Delayed irreversible purge

The repository was extracted from a real production environment and progressively generalized for open-source publication.

The core pipeline is already running continuously in production, although APIs and some internal structures are still evolving.
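
The last three stages of the diagram (retention classification, quarantine, delayed purge) amount to a small state machine. The sketch below is illustrative only: the state names, the default durations, and the `protected` flag (standing in for something like a release anchor) are assumptions, not the project's actual retention policy.

```python
from datetime import datetime, timedelta

def next_action(state: str, since: datetime, now: datetime,
                keep_for: timedelta = timedelta(days=90),        # assumed default
                quarantine_for: timedelta = timedelta(days=30),  # assumed default
                protected: bool = False) -> str:
    """Decide what to do with one archived version.

    state is "active" or "quarantined"; since is when it entered that state.
    """
    if protected:
        return "keep"  # protected versions are never moved or deleted
    age = now - since
    if state == "active":
        return "quarantine" if age >= keep_for else "keep"
    if state == "quarantined":
        # Irreversible deletion only happens after a second waiting period.
        return "purge" if age >= quarantine_for else "hold"
    raise ValueError(f"unknown state: {state!r}")
```

Because the decision depends only on its arguments, the same inputs always give the same verdict, and nothing is deleted without first spending a full waiting period in quarantine.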


Current status

At this stage, the project is probably closer to infrastructure engineering tooling than a turnkey Home Assistant add-on.

The current target audience is mainly advanced users operating:

  • large YAML-based setups;
  • Git-driven workflows;
  • multi-site environments;
  • long-lived Home Assistant infrastructures.

Feedback welcome

I would especially appreciate feedback regarding:

  • audit semantics;
  • authority modeling;
  • structural diff philosophy;
  • retention strategy;
  • operational ergonomics;
  • edge cases around dynamic Home Assistant behavior.

I am also very open to criticism if some architectural assumptions appear too tied to my own infrastructure.

Thanks for reading.

Quick update since the initial post.

The project has progressed significantly over the past two days:

  • v0.3.0: 22 contractual invariants covered by a pytest suite (retention classification, quarantine purge safety, release anchor detection, MQTT verdict validation)
  • v0.4.0: install_check.py, a pre-installation environment verifier (stdlib only, no dependency on the package it verifies)
  • v0.5.0: proper Python packaging with pip install . and six CLI entry points
  • v0.6.0: GitHub Actions CI running the full test suite on every push
  • v0.7.0: ha-state-init, a conservative project initializer that creates the expected directory structure and a minimal retention policy, with dry-run by default

The project is now installable, testable, and verifiable from a clean environment. The workflow is:

pip install .
ha-state-init --root /path/to/ha_backup_timeline --apply
python3 scripts/install_check.py --root /path/to/ha_backup_timeline

The project is now structured enough to be reproducible outside my own environment.

Still early, but the operational baseline is now stable.

Feedback still very welcome, especially around audit semantics and retention edge cases.