Database already huge

I’ll try to explain a few things I think I’ve learned; maybe it’ll help. TL;DR is at the end.

There are two “purge” options for the Recorder service. (1) Recorder: Purge allows you to delete all entries older than a given number of days, which you can filter by entity_id and entity_type in addition to the time-based purge. You can also repack the database (recover unused space) after the purge, if you remember to check the box AND turn on the slider. You won’t free up any storage unless you do this.

(2) Recorder: Purge Entities allows you to delete all of the records for an area, device, entity, domain or glob, or a bunch of those. But I don’t see any repack option, so you’ll want to go back and do that after.

Key points here are that you can be very selective about what and when you delete, and that you won’t recover any space in your database unless you correctly select the repack option.
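For anyone who would rather script this than click through the UI, here is a minimal sketch of the same two service calls made through the Home Assistant REST API. The base URL, the token and the entity name are placeholders I made up, and `build_service_request` is just a hypothetical helper; the `keep_days` and `repack` fields are the ones the recorder.purge service documents. The sketch only builds the requests and does not send anything.

```python
import json
import urllib.request


def build_service_request(base_url, token, service, data):
    """Build (but do not send) a POST request for a recorder service call."""
    url = f"{base_url.rstrip('/')}/api/services/recorder/{service}"
    body = json.dumps(data).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values (assumptions): replace with your own instance and token.
BASE_URL = "http://homeassistant.local:8123"
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"

# (1) Time-based purge, keeping 7 days, with repack so space is actually freed.
purge = build_service_request(
    BASE_URL, TOKEN, "purge", {"keep_days": 7, "repack": True}
)

# (2) Purge all records for one (hypothetical) entity; repack separately afterwards.
purge_entities = build_service_request(
    BASE_URL, TOKEN, "purge_entities", {"entity_id": ["sensor.example_power"]}
)

# To actually send a request: urllib.request.urlopen(purge)
```

Keeping the request-building separate from the sending makes it easy to check what will be posted before touching a live database.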

That all applies to the “old” tables which have been part of the Recorder database for a long time; basically, the tables named events and states.

More recently, it was decided that the same database should be used to store permanent statistics data, such as is used by the energy management components. Among others, the statistics table was added. But as far as I can tell, there’s no purge option for this table. Apparently it’s designed to grow forever, so that you’ll always have your full energy use (or whatever) history.

I don’t happen to use HA to analyze my energy use, just to collect the data. So every month I end up deleting over 10,000 rows from this table. It should be noted that I never “opted in” to store this data in the first place; sometime in the past few releases HA created and started populating these new statistics tables without asking.

As far as I can tell, doing a purge and/or repack won’t impact the energy use data stored in these newer tables. The only way to purge them is with an SQL DELETE statement.
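A rough sketch of what that manual DELETE might look like, using Python's built-in sqlite3 module against a throwaway in-memory database. The table layout here is a simplified assumption (the real statistics table has more columns, and column names differ between HA versions), and the cutoff date is made up. Stop HA and back up the database file before trying anything like this on the real thing.

```python
import sqlite3

# Throwaway in-memory database with a simplified, ASSUMED statistics layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statistics ("
    "id INTEGER PRIMARY KEY, metadata_id INTEGER, start TEXT, sum REAL)"
)
rows = [
    (1, "2022-01-15 00:00:00", 10.0),
    (1, "2022-02-15 00:00:00", 20.0),
    (2, "2022-03-15 00:00:00", 5.0),
    (1, "2022-03-20 00:00:00", 30.0),
]
conn.executemany(
    "INSERT INTO statistics (metadata_id, start, sum) VALUES (?, ?, ?)", rows
)
conn.commit()

# Delete everything older than a (made-up) cutoff date.
cutoff = "2022-03-01 00:00:00"
deleted = conn.execute(
    "DELETE FROM statistics WHERE start < ?", (cutoff,)
).rowcount
conn.commit()  # VACUUM cannot run inside an open transaction

# Equivalent of "repack" for SQLite: rebuild the file to release freed pages.
conn.execute("VACUUM")

remaining = conn.execute("SELECT COUNT(*) FROM statistics").fetchone()[0]
print(f"deleted {deleted} rows, {remaining} remaining")
```

As with the purge services, the DELETE alone does not shrink the file; on SQLite the space only comes back after the VACUUM.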

Statistics were referred to in the blog when that release occurred. I believe you can choose what to keep long term.

I recall reading about the statistics update, but I haven’t seen anything about how to choose what to keep. Admittedly I didn’t research them in detail since I wasn’t using them (or so I thought!)

Please see this post before upgrading to 2022.4 - Database Optimizations in 2022.4

Well, I have the beta running, and there’s no difference.

On my test system, with minimal devices recording, it’s still 15 MB a day…
So this would be around 500 MB on my normal system with no extra recorder options set up.

For Frigate, I changed the directory to a CIFS mount so that all data is stored on disk from Docker/Portainer.
As for MariaDB, there is no way to store the data directly on a NAS drive from Docker, because of the stupid permission problem.

So I solved that with an older laptop running InfluxDB and MariaDB to store everything there, and I have no disk space problems anymore.

Please post in the beta thread here