Hi everyone,
I’m running into a bit of a strange issue and was hoping someone here might be able to point me in the right direction.
My Home Assistant has been running for about 2–3 years now (on Hyper-V).
For a long time, I had recorder: purge_keep_days: 400 set in my YAML, because I wanted to compare states and interactions across the same time periods over multiple years.
As a result, my database grew quite large — around 120 GB. The automatic backup file was about 32 GB.
System ~160 GB / DB ~120 GB / Backup ~32 GB
I didn’t really have any issues, and the system ran smoothly without any manual intervention. However, I was aware that such a large database could eventually cause problems.
Last year, due to a mistake on my side, I accidentally deleted a whole set of important entities, including their history.
→ No problem, I thought — I had a backup. Better to lose 2–3 hours of data than the entire history.
Unfortunately (and in hindsight, not surprisingly), I ran into massive problems trying to restore the database.
The backup could neither be restored nor installed into a fresh VM.
Multiple attempts failed, whether the backup was stored locally or on my NAS.
In the end, I manually extracted the .db file from a VM backup and injected it into a new VM.
That worked, but obviously this is not a sustainable long-term solution.
So I decided to finally address the root of the problem.
I discovered long-term statistics and how they are stored, and realized that the coarser intervals are actually sufficient for my use cases.
First steps:
Changed to 90 days: recorder: purge_keep_days: 90
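For reference, the recorder section now looks roughly like this. It is a minimal sketch with all other recorder options left out. As far as I understand, this setting only limits the detailed state history, and the hourly long-term statistics are kept regardless of it.

```yaml
# configuration.yaml (sketch, other recorder options omitted)
recorder:
  # Keep 90 days of detailed state/event history (previously 400)
  purge_keep_days: 90
```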
Due to lack of time, I let this run for a while.
A few days ago, I ran the recorder.purge action with the “repack” option enabled.
The database size dropped to 57 GB.
System ~98 GB / DB ~57 GB / Backup ~29 GB
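The purge call mentioned above was roughly the following, run from Developer Tools. This is a sketch: the keep_days value is assumed to match the 90 days from my configuration.

```yaml
# Purge recorder data older than keep_days and rewrite the
# database file so the freed space is actually released.
action: recorder.purge
data:
  keep_days: 90   # assumed to match purge_keep_days
  repack: true
```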
Unfortunately, the backup file barely changed in size (from ~32 GB to ~29 GB).
Even after running a database “reorg,” things stayed the same.
I still have the problem that I cannot restore this backup.
The Home Assistant server appears to be busy during the restore process, but even after letting it run for 2 days straight, it never comes back. I have to hard-restart it, after which I get a message that the restore failed.
Restoring in parts (only apps, only history, etc.) doesn’t work either.
I could probably reduce the retention even further (90 days is still a lot), but I’m afraid that won’t really solve the underlying problem.
To be fair, I do generate a lot of data noise — statistics and values I will most likely never need again.
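If I go down the route of cutting that noise, excluding entities from the recorder would look something like the sketch below. The domain, glob, and entity names are placeholders for illustration, not my actual entities.

```yaml
# configuration.yaml (sketch; all names below are hypothetical)
recorder:
  purge_keep_days: 90
  exclude:
    domains:
      - update                        # placeholder: a whole domain to skip
    entity_globs:
      - sensor.*_signal_strength      # placeholder pattern
    entities:
      - sensor.example_noisy_sensor   # placeholder entity
```

As far as I know, an exclude filter only stops new data from being recorded, so rows already in the database would still have to be purged separately.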
What really confuses me is the mismatch in size between the database and the backup.
Does anyone have an idea how I can get the backup back to a reasonable size?