Backup size excessively large, even after DB optimization

Hi everyone,
I’m running into a bit of a strange issue and was hoping someone here might be able to point me in the right direction.

My Home Assistant has been running for about 2–3 years now (on Hyper-V).
For a long time, I had recorder: purge_keep_days: 400 set in my YAML, because I wanted to compare states and interactions across the same time periods over multiple years.
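In case it helps, the relevant part of my configuration.yaml looked roughly like this (a sketch; `auto_purge` is shown with its default just for context):

```yaml
# configuration.yaml (sketch)
recorder:
  purge_keep_days: 400   # keep 400 days of state/event history
  auto_purge: true       # nightly purge of data older than purge_keep_days (default)
```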

As a result, my database grew quite large — around 120 GB. The automatic backup file was about 32 GB.

System ~160 GB / DB ~120 GB / Backup ~32 GB

I didn’t really have any issues, and the system ran smoothly without any manual intervention. However, I was aware that such a large database could eventually cause problems.

Last year, due to a mistake on my side, I accidentally deleted a whole set of important entities, including their history.

→ No problem, I thought — I had a backup. Better to lose 2–3 hours of data than the entire history.

Unfortunately (and in hindsight, not surprisingly), I ran into massive problems trying to restore the database.
The backup could neither be restored nor installed into a fresh VM.
Multiple attempts failed, whether the backup was stored locally or on my NAS.

In the end, I manually extracted the .db file from a VM backup and injected it into a new VM.
That worked, but obviously this is not a sustainable long-term solution.

So I decided to finally address the root of the problem.
I discovered long-term statistics and how they are stored, and realized that the coarser intervals are actually sufficient for my use cases.

First steps:
Changed to 90 days: recorder: purge_keep_days: 90

Due to lack of time, I let this run for a while.
A few days ago, I ran the recorder.purge action with the “repack” option enabled.
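For anyone who wants to do the same, this is roughly the call I ran from Developer Tools (keep_days and repack are the actual parameters of recorder.purge; the exact values are just my setup):

```yaml
# Developer Tools → Actions (roughly what I ran)
action: recorder.purge
data:
  keep_days: 90    # drop states/events older than 90 days
  repack: true     # rewrite the database file afterwards to reclaim disk space
```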
The database size dropped to 57 GB.

System ~98 GB / DB ~57 GB / Backup ~29 GB

Unfortunately, the backup file barely changed in size.
Even after running a database “reorg,” things stayed the same.

I still have the problem that I cannot restore this backup.
The Home Assistant server appears to be busy during the restore process, but even after letting it run for 2 days straight, it never comes back. I have to hard-restart it, after which I get a message that the restore failed.

Restoring in parts (only apps, only history, etc.) doesn’t work either.

I could probably reduce the retention even further (90 days is still a lot), but I’m afraid that won’t really solve the underlying problem.
To be fair, I do generate a lot of data noise — statistics and values I will most likely never need again.

What really confuses me is the mismatch in size between the database and the backup.
Does anyone have an idea how I can get the backup back to a reasonable size?

How much space do you have free on your HA drive?

I figured the locally stored backup plus the entire database would need roughly 80 GB in total, so I allocated 150 GB to the virtual machine. That means Home Assistant has plenty of available space: 150 GB minus whatever Home Assistant itself needs, excluding the database.

Every time HA makes a backup, it gathers all its data, including stored local backups, and writes it to the tmp folder. This doubles the space currently used.
You have a 60 GB database, HAOS probably takes another 25 GB, so storing more than 2 backups will not turn out well with that much disk space. If you want to keep recorder data for 90 days, consider 500 GB or more for your drive…
Or be like me and delete your database every year or so. Face it: you probably never use that 90-day-old data anyway, and the long-term statistics (LTS) kept after the default 10 days are more than enough.