Issues with backup

I am running home assistant 64 bit on raspberry pi 4 with 8gb of ram and 1tb of NVMe storage via usb. I wish to keep my data long term. Neither my Pi 4 CPU or RAM every come close to maxing out.

Once the database gets to a certain size it, and you run a backup the sensors values all glitch out, and a reboot is necessary to fix.

I suspect that the internal buffer over flows while waiting for the backup to complete. I believe there is a 30,000 element limit. However, my pi never uses more than 3gb of RAM so this limit seems arbitrarily low. The problem with this is, I will have to periodically increase as my DB gets bigger and bigger.

So one possible solution is to dramatically increase that or include a config setting so I can increase it.

Another possibility, since the primary database is probably locked during backup, create a second identical in internal structure database to hold the new values until the primary database is unlocked. Then insert these values into the existing database, and delete the temporary database.

This is probably the best solution as it doesn’t increase the RAM usage.

You could create a temporary in RAM database.

Also backup seems to take an exceptionally long time considering my hardware.

Also why can I do a mariadb-dump, and it completes in less than 8 minutes and it does not disrupt the data collection.

I just changed the number of purge days to something like 40000.
The biggest issue is when you try and run a backup, that is where I am having issues.

Mariadb aka mysql can handle billions of records no problem.
Recording records isn’t the issue isn’t when you goto retrieve those records is when it slows down.
Retrieving a day or 2 of points is relatively quick, beyond that it slows down.

However, given the existence of 8gb raspberry pi, the default configuration of the database seems unnecessarily restrictive. I wish I could easily change the my.cnf file.

I got a NVMe drive and a NVMe to USB adapter and plugged it into my Pi 4. Then I migrated the data over from the SD card. So although its over kill I now have 1tb worth of fast storage ready to go.

The sensible solution is to use InfluxDB for long term storage and the HA recorder for no more than a month.

Probably because Mariadb optimized that for their own tool whereas backup has absolutely no clue what it is backing up. It makes a copy of a bunch of folders marked as docker volumes and stores it with the addon config and optionally a docker image. Mariadb is normally not treated any different from any other addon. Of course since that form of backup (mass file copy) causes issues with a db the author of that addon put in a pre-backup freeze and a post-backup thaw.

Additionally Mariadb does not compress the extract, HA does. Compression of a massive amount of data takes time. Do you password protect your backups? If so then they are encrypted as well which also takes time.

I’m going to piggyback on Tom’s suggestion and say you should really lower your purge keep days and use influx or something else for that. HA added support for long-term statistics but long-term history isn’t really a use case it was designed for.

If you insist on keeping that much data then perhaps consider an alternate backup strategy. Like perhaps a shell_command that calls Mariadb export and makes an export in a folder that HA is already backing up. Then omit the MariaDB addon from the backup since you already have the export file. Or have the shell command also copy your export somewhere else and leave all that data out of your backups altogether. Could even make an automation to run that on a schedule

Alternatively you could host your own MariaDB container somewhere else on your LAN and have HA talk to it. You don’t have to use the addon. Then you could configure the my.cnf file to whatever you want and set up your own backup/replication strategy.

The really issue is that I only have roughly 30 days worth of data, and its already happening.

I ended up moving it to another Pi 4 and it works fine. Tweaking the my.cnf to allow more memory consumption. Even with my database growing significantly since then to more like 60gb performance is still good.