Backup is huge (12 GB) - how to minimize?

Hello Folks,

I run the latest OS and Supervisor of Hass on a Raspi4 and am currently migrating to a Pi5.
I have MariaDB installed and had the recorder set to keep 320 days. These settings gave me huge backups (12 GB).
A few days ago I tried to purge the recorder data (recorder.purge), as suggested in some topics here, but my backups are just getting bigger.
How can I solve this?
I don't care anymore about the long-term stats of the recorder. The only entries I would like to keep are the ones from my energy dashboard. I have solar panels and a battery installed and would like to keep these energy entries, ideally forever.

Hope someone can help me out!
Thanks
djm193

  • Core 2024.2.5
  • Supervisor 2024.02.1
  • Operating System 12.0
  • Frontend 20240207.1

The backup is probably large due to a default configured recorder.
Look into how to manage the recorder, so you only store the data you need and want.
A quick fix right now is to make a service call for recorder.purge that limits the days kept and repacks the database.
You can do this from Developer Tools → Service tab → Search for the service call “Recorder: Purge”
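
For reference, the same call can be entered in YAML mode on that Service tab. This is only a sketch: the keep_days value of 10 is an assumption (pick whatever fits your setup), and repack: true asks the recorder to rewrite the database so the freed space is actually given back.

```yaml
service: recorder.purge
data:
  keep_days: 10   # assumed value, adjust to taste
  repack: true    # without a repack the database usually stays about the same size
```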

Hi WallyR: Thanks for your fast reply! This did not help. I have tried this for 3 days in a row now.

Try to open up the backup to see what is inside and find the culprit.
It is just a tar archive, which can be opened with WinZip/WinRAR/7-Zip or similar.

Inside the backup you will find the different add-ons backed up and can see their size.
These add-on backups can also be opened with the same programs to look further into the stored files.

good hint! It is the mariaDB with almost 12GB - the rest is only a few MB from Add-ons

Ok, I can not remember how you minimize that.

Here we go again. I feel like I’m screaming into the wind sometimes.

The default settings for Recorder are inappropriate. The new user documentation gives few hints about how important it is to configure Recorder from the start. The UI gives no option to include or exclude each entity as it is added, or when it is displayed. There’s only one keep_days parameter for all entities, so no fine-tuning is possible without writing automations to selectively purge, which only increases writes instead of reducing them.

There are FRs about this. There are threads about this, explaining how to configure the Recorder settings for the best compromise, given the lack of fine-tuning options. Yet nothing changes and users keep encountering problems.

I’m open to suggestions on how best to elevate this issue so future users don’t have to keep bumping into Recorder database size problems.

Ok Captain, thx for your word - however I did not hear you screaming in the first place :wink:

I reconfigured my recorder settings to something like the following (in configuration.yaml).

Do you have any suggestions on how to tweak this? Are there special configurations for the energy dashboard so that I do not lose e.g. my solar production?

I have also read that the purge only starts after an uptime of more than 24h.

recorder:
  db_url: mysql://homeassistant:XXXXX@core-mariadb/homeassistant?charset=utf8
  commit_interval: 60
  purge_interval: 1
  # purge_keep_days: 320 -> this was the old setting - commented it out
  exclude:
    domains:
      - weblink
      - updater
      - input_boolean
      - input_number
      - input_select
      - input_text
      - light
      - media_player
      - sun
      - timer
      - weather
      - camera

In Services you can “Purge per entity”: choose Purge All for the entities you want (you can choose multiple entities in one purge).
You can even purge per domain and by entity globs.
Choose “keep 0 / zero” days.
Then click Call Service.

Then choose the service “Purge”, un-tick “Apply Filter” and tick “Repack” on the left. Make sure the “Repack” toggle on the right is switched on as well.
Then click Call Service.

If you don’t choose “Repack”, your DB remains roughly the same size.

PS: If it turns out that your energy data is the “culprit”, taking up too much space, you are doomed and need an external DB that you can “shuffle” data to, and from there move it to another table/DB on a monthly/yearly basis.
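
The two service calls described above might look roughly like this in YAML mode of the Service tab. Treat it as a sketch under assumptions: the entity ID, the glob and the domain are hypothetical placeholders, and the two calls have to be entered one at a time (the `---` only separates them here).

```yaml
# Call 1: purge all history of selected entities, domains and globs
service: recorder.purge_entities
target:
  entity_id:
    - sensor.example_noisy_power   # hypothetical entity
data:
  domains:
    - media_player                 # example domain, pick your own
  entity_globs:
    - sensor.example_*             # hypothetical glob
  keep_days: 0                     # keep nothing for the selection above
---
# Call 2: general purge with repack, without applying the recorder filter
service: recorder.purge
data:
  apply_filter: false
  repack: true                     # this is what actually shrinks the database
```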

@boheme61 covered it.

I went a different way, excluding entities rather than domains. I really just excluded the “heavy hitters” which were (1) spamming the database, and (2) I didn’t really care about. Not perfect, but good enough. It keeps any data I might care about or which doesn’t take up much space anyway. But really, there’s no right way to do this. The important thing is you pick a method which works for you and stick with it.
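
As an illustration of excluding entities rather than domains, a recorder section could look something like the sketch below. The entity names and the glob are made up for the example, the db_url line is left out for brevity, and 10 days is simply the default retention mentioned elsewhere in this thread.

```yaml
recorder:
  purge_keep_days: 10                  # assumed retention for short-term history
  exclude:
    entities:
      - sensor.example_wifi_signal     # hypothetical "heavy hitter"
      - sensor.example_cpu_load        # hypothetical "heavy hitter"
    entity_globs:
      - sensor.*_uptime                # hypothetical pattern for noisy sensors
```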

I don’t use the energy dashboard, but I suspect it uses the long-term statistics tables in the database. IMHO these tables should be in a different database, since they have different retention and performance requirements. I just send this type of data to .csv text files and manage/analyze them outside of HA.

Right, I did the opposite: no exclude at all, only include!
Which, by the way, I think should be the default setting, to remind people that if they want or need to track anything, they have to choose to.
It could be an easy interface or a tick box, i.e. “Want to track this entity?”
Many people have no idea what is being tracked and how much data it collects, even in the default 10 days.
And they quickly start to implement various template sensors and tweak the update interval, if possible, to get an update every second (in the worst case) :slight_smile:

Basically, move Recorder into the UI.
EDIT: Same with Purge: tick entity, glob, domain, click Purge & Repack. All the functions are there; what is “missing” is a GUI guru with knowledge of these parts, and I’m not that (or only in my head :smile:)
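
An include-only setup along those lines could be sketched as follows. The entity names are hypothetical placeholders and the db_url line is omitted. One assumption worth stating: as far as I understand, the energy dashboard is fed from statistics that are compiled from recorded states, so the energy sensors themselves should stay on the include list.

```yaml
recorder:
  purge_keep_days: 10                      # assumed retention for short-term history
  include:
    entities:
      - sensor.example_solar_production    # hypothetical energy sensor
      - sensor.example_battery_charge      # hypothetical energy sensor
      - sensor.example_grid_consumption    # hypothetical energy sensor
```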

See also How to keep your recorder database size under control.

EDIT: Fixed link. My phone has this strange bug where it sometimes cuts off the last character when pasting.

The URL doesn’t work, but yes, it’s a good guide, which every new user should read.

Hi, I did it like you suggested. I could not see any effect, but I have not tried to create a backup yet. I will do this tonight after HA has been up for more than 24h.
Will keep you posted.

Good morning, automatic backup after uptime of 30h was created successfully.
Backup size after tweaking my configuration.yaml and calling the purge service as suggested is now 750MB.
This is perfect.

The statistics are also not fully gone. However, the data shown is now divided into two parts: long-term statistics and history. The history shows data every minute; the long-term stats are displayed hourly or daily, depending on how far you go back.

