Allow backup compression parameters

I own a Yellow system with a 512GB SSD. Sometimes I like to review data from way back in history, so I have set it to retain the last 3 months in the database. The database has grown to 10GB, and it takes quite a while for a backup to finish - graphs freeze during backups. I opened Glances and noticed that only one CPU core out of 4 is used, which seems very odd nowadays. Unfortunately, it also seems a very high compression level is set, which further delays backup completion and widens the window for backup failure. I use the Samba Backup add-on to move completed backups to another system:

[23-03-05 21:48:51] INFO: Backup running ...
[23-03-05 21:48:52] INFO: Creating backup "Samba Backup 2023-03-05 21:48"
[23-03-05 23:26:55] INFO: Copying backup a52e65ef (Samba_Backup_2023_03_05_21_48.tar) to share
[23-03-05 23:27:41] INFO: Deleting 29f42767 (Samba Backup 2023-03-02 21:48) local
[23-03-05 23:27:42] INFO: Deleting Samba_Backup_2023_01_22_21_48.tar on share
[23-03-05 23:27:54] INFO: Backup finished

Please give users the option to choose compression parameters for backups.

I’ll add some information here, as I also want to tweak the compression and feel it’s pointless to start another thread for it :wink:

I’m currently backing up stuff like InfluxDB data, and from what I can tell the built-in gz compression is pretty bad compared to zstd or LZMA. My InfluxDB data is around 9 GB uncompressed and around 4.6 GB with the standard gz compression. I tried compressing it with 7-Zip to test LZMA and it came down to 2.3 GB, and to 2.6 GB with zstd, so clearly there is a lot of compression left on the table with HA’s defaults.
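For anyone who wants to reproduce that comparison, here’s a rough Python sketch. It streams one file through gzip, LZMA, and zstd and reports the compressed sizes; the `zstandard` package is third-party, and the input path is just an example, not an actual HA backup path:

```python
import lzma
import os
import zlib

import zstandard  # third-party: pip install zstandard

PATH = "influxdb.tar"  # hypothetical input file

def compressed_size(make_compressor) -> int:
    """Stream PATH through a compressor object and count the output bytes."""
    comp = make_compressor()
    total = 0
    with open(PATH, "rb") as f:
        while chunk := f.read(1 << 20):
            total += len(comp.compress(chunk))
    total += len(comp.flush())
    return total

candidates = {
    "gzip level 6": lambda: zlib.compressobj(6, wbits=31),  # wbits=31 -> gzip container
    "lzma (default preset)": lzma.LZMACompressor,
    "zstd level 3": lambda: zstandard.ZstdCompressor(level=3).compressobj(),
}

original = os.path.getsize(PATH)
print(f"original: {original / 1e9:.2f} GB")
for name, factory in candidates.items():
    size = compressed_size(factory)
    print(f"{name}: {size / 1e9:.2f} GB ({size / original:.0%} of original)")
```

Exact ratios will of course depend on the data; time-series databases tend to compress very differently from camera snapshots.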

It would be really good if I could set the compression algorithm and a fast/slow trade-off that fits my hardware (a VM, so I can assign as much CPU as I want to it).

The Supervisor using gzip max compression takes way too long, especially on hardware like a Raspberry Pi 4 or CM4. For example, my home-assistant_v2.db file is about 3GB, and the resulting backup file is only 1.2GB, but it takes 30+ minutes to build on my Home Assistant Yellow with an 8GB CM4.

Taking this file, extracting it, and using tar on another Raspberry Pi 4 to repack the backup files with max compression takes 38 minutes to complete. Using standard tar+gzip compression takes only about 8 minutes and results in a file that’s just slightly larger, at 1.3GB.

If the Supervisor simply didn’t use max gzip compression, we could probably cut our backup times down to 25% of what they currently are.
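You can see that level trade-off for yourself with a small Python sketch along these lines (the path is just an example; the numbers will vary with hardware and data):

```python
import time
import zlib

PATH = "home-assistant_v2.db"  # hypothetical input file

for level in (9, 6, 1):
    comp = zlib.compressobj(level, wbits=31)  # wbits=31 -> gzip container
    size = 0
    start = time.perf_counter()
    with open(PATH, "rb") as f:
        while chunk := f.read(1 << 20):
            size += len(comp.compress(chunk))
    size += len(comp.flush())
    print(f"gzip -{level}: {size / 1e9:.2f} GB in {time.perf_counter() - start:.1f} s")
```

The jump from level 9 to 6 usually costs very little size for a large speedup, which matches the tar repacking numbers above.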

Also, allowing faster compression algorithms like LZO would be an even larger speed improvement on limited hardware, at the expense of slightly larger backup files. For example, repacking the same 3GB database with LZO takes just under 2 minutes and produces a 1.7GB file.

So options for which compression algorithm or compression level to use would be extremely beneficial, both for users who want higher compression ratios and for users who need backups to be faster.

On my system (Raspberry Pi 4) it’s even worse: backups take about 3h30m to complete. Since I’m backing up to a NAS, I’d rather sacrifice file size for faster backups.

What I do at the moment is run a nightly backup via an automation, where you can set it to not compress. The only problem I have is when I install an update and forget to uncheck the backup option.

Which automation?

In December 2023, HA switched from max compression to the default gzip/zlib compression level - see Use non-streaming mode and default compression level 6 · pvizeli/securetar@049f917 · GitHub

Besides the requests above, I have one that I feel is important: I would like the ability to disable compression altogether, as this can save a boatload of disk space for users who use something like restic for their backups.

I wrote a blog post detailing the savings.


Just create an automation that does a backup; there you can select compressed or not.
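You can also trigger the same thing from a script. Here’s a rough Python sketch calling the backup service over the REST API - the URL and token are placeholders, and I believe the hassio.backup_full service accepts a compressed field, but double-check it under Developer Tools → Services on your install:

```python
import requests  # third-party: pip install requests

HA_URL = "http://homeassistant.local:8123"  # placeholder
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"      # placeholder

resp = requests.post(
    f"{HA_URL}/api/services/hassio/backup_full",
    headers={"Authorization": f"Bearer {TOKEN}"},
    # "compressed": False is the same option the automation UI exposes
    json={"name": "Nightly uncompressed", "compressed": False},
    timeout=600,  # the call can block until the backup finishes
)
resp.raise_for_status()
print("backup triggered:", resp.status_code)
```

A time-triggered automation calling the same service with `compressed: false` does the equivalent without any external script.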