This is more of a rant than a request for help, except perhaps under WTH.
My 240 GB SSD card died, so I found another card, powered up, and reloaded my latest backup.
The fresh load from backup never finished its ‘restoring’ dialog box, but the system appeared to be up, without the recorder and other database-related elements working.
So I started to puzzle over what was going on. I noticed that MariaDB was not in the list of add-ons available to restore. I ran tar piped to null on the backup and found it was corrupt in the MariaDB section. I assumed this was related in some way to the disk failure…
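For anyone who wants to repeat the "tar piped to null" check without a shell, here is a minimal Python sketch of the same idea: read every file in the archive and throw the data away, collecting any read errors. The function name `find_tar_errors` is made up for illustration, and this only inspects the outer archive. It catches truncation and decompression errors, not every possible kind of damage.

```python
import tarfile
import zlib

# Exceptions that reading a damaged archive can plausibly raise.
READ_ERRORS = (tarfile.TarError, OSError, EOFError, zlib.error)


def find_tar_errors(path):
    """Read every regular file in the archive, discard the data, return error strings."""
    errors = []
    try:
        with tarfile.open(path, mode="r:*") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                try:
                    stream = archive.extractfile(member)
                    # Read in 1 MiB chunks so large database files do not fill memory.
                    while stream.read(1024 * 1024):
                        pass
                except READ_ERRORS as exc:
                    errors.append(f"{member.name}: {exc!r}")
    except READ_ERRORS as exc:
        # Raised when the archive cannot be opened or a header cannot be read.
        errors.append(f"{path}: {exc!r}")
    return errors
```

An empty list means every file could be read to the end; anything else names the member (or the archive itself) where reading failed.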
So I got the previous backup and did the same tar to null, and it was also corrupt.
More backups showed the same problem. Finally I found a backup which included MariaDB and was able to get things working, albeit 4 weeks out of date. However, some part of MariaDB was still corrupt. The key bonus was that this backup included all the other files, like the YAML files.
So a reinstall of MariaDB, empty of course, was the only way forward, and things can move on again.
I want to highlight that:
Using the Google backup add-on does not give you any clue that it has backed up an incomplete set of files.
Using HA Restore does not do an audit of a backup tar to check it is not corrupt before restoring from it.
I think that an integrity check, both of a backup just made and of one about to be restored, should be built into HA core processes.
I also wonder whether the database files should be the last files written to a backup, allowing a user to recover the system even if not the database. This would have stopped me losing a month of YAML updates.
I’d open a bug report with those backups; that should not happen.
That said, building backup/restore tools is a complex endeavor and is mostly a solved problem. If you are using Docker, it’s easy: just back up the directories. If you are running a VM, just snapshot it using the VM’s tools.
I regularly test my backups (monthly - do a full restore from the backup) to ensure that my backup and restore process works as expected. The worst time to discover your backup/restore process doesn’t work is when you need it.
@MaxK In what way do you confirm the restored version is sound?
One of my old backups (in this drama) restored and appeared to work but examination of the tar by extraction of every file revealed there was one mariadb file at least which was corrupt. It did not appear to affect the access to recent data nor prevent writing new data but …
I do not know the specifics of how HA backup integrity is checked, but I do agree that some sort of check (checksum) should be performed/verified if it is not already. But that would not give me confidence that I could restore my environment, because of other factors that could affect the restore process (including me just screwing something up). So I immediately use the backup just taken and perform a series of functional tests to prove to myself that it is good (I hope to automate this functional check some day).
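As an illustration of the checksum idea, here is a rough Python sketch that records a SHA-256 digest next to a backup when it is made and checks it again before a restore. This is not how HA does it (as said, I don't know what HA checks); the `.sha256` sidecar file and the function names are assumptions made up for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(backup_path):
    """Store the digest in a hypothetical '<backup>.sha256' sidecar file."""
    sidecar = Path(str(backup_path) + ".sha256")
    sidecar.write_text(sha256_of(backup_path) + "\n")
    return sidecar


def verify_checksum(backup_path):
    """Return True if the backup still matches the digest recorded earlier."""
    sidecar = Path(str(backup_path) + ".sha256")
    expected = sidecar.read_text().strip()
    return sha256_of(backup_path) == expected
```

Note the limit of this approach: a checksum only shows that the file has not changed since the digest was written (for example during upload or storage). It cannot tell you that the backup was complete when it was created, which is why a test restore is still worth doing.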
Yep, I had a backup that was corrupt, and HA wouldn’t restore it.
I unzipped it with 7-Zip, which shows you the add-on archives.
Then there should be a homeassistant.tar.gz. Unzip that, which creates a homeassistant.tar; unzip that one and you see the data folder or config folder, and there is all your data.
Got those scripts and automations back. Happy camper.
I know a trick that MIGHT work: dump that data folder into the config folder and restart HA.
BUT do it the wrong way: pull the power out so HA doesn’t do a clean shutdown, then restart it,
and pray to the gods. PULLING POWER IS NOT A GOOD IDEA. Do it at your own risk.
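The manual unpacking steps above (open the backup, find homeassistant.tar.gz inside it, then pull the files out of that) can be sketched in Python as follows. Treat it as a sketch under assumptions: the function names are invented for the example, the inner archive name is taken from the description above, and it assumes the backup is not password-protected. Only regular files are copied, and anything that would land outside the destination folder is skipped.

```python
import shutil
import tarfile
from pathlib import Path


def extract_files(archive, destination):
    """Copy the regular files of an open TarFile into destination; return the count."""
    destination = Path(destination).resolve()
    count = 0
    for member in archive:
        if not member.isfile():
            continue
        target = (destination / member.name).resolve()
        if destination not in target.parents:
            continue  # skip entries that would be written outside destination
        target.parent.mkdir(parents=True, exist_ok=True)
        with archive.extractfile(member) as source, open(target, "wb") as sink:
            shutil.copyfileobj(source, sink)
        count += 1
    return count


def recover_config(backup_path, destination, inner_name="homeassistant.tar.gz"):
    """Find the inner archive inside the backup and extract its files."""
    with tarfile.open(backup_path, mode="r:*") as outer:
        for member in outer:
            if member.isfile() and Path(member.name).name == inner_name:
                inner_stream = outer.extractfile(member)
                with tarfile.open(fileobj=inner_stream, mode="r:*") as inner:
                    return extract_files(inner, destination)
    raise FileNotFoundError(f"{inner_name} not found in {backup_path}")
```

Something like `recover_config("backup.tar", "recovered")` would then leave the data or config folder under `recovered`, where the scripts and automations can be copied out by hand.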
I’d try to find a way to back up the 240 GB SSD card onto another card or a PC, as that would be a complete image backup. I’d do this before every upgrade. That way you have two working copies. You could even rotate cards.
Hi there. Is there anything new on this topic? I noticed that my automatic backups are systematically corrupt every day (tar: invalid tar magic). Looking into the backup file using the terminal shows that /core_configurator.tar.gz is quite small and /core_mariadb.tar.gz is empty. Here is the observation:
I recall that this was an issue years ago, and that the problem was that the MariaDB daemon was not stopped during the backup process. This had been resolved and, honestly, I have not looked at it for many months. Is there a regression on this?
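To spot empty or undersized add-on archives like the ones described above without opening each backup by hand, a small Python sketch along these lines could list the size of every file in the outer archive and flag the ones below a threshold. The function names and the `min_bytes` default are assumptions for the example. If the outer archive itself is unreadable, `tarfile.open` raises an error instead of returning a list.

```python
import tarfile


def list_member_sizes(backup_path):
    """Return (name, size_in_bytes) for every regular file in the outer archive."""
    with tarfile.open(backup_path, mode="r:*") as archive:
        return [(member.name, member.size) for member in archive if member.isfile()]


def suspicious_members(backup_path, min_bytes=1):
    """Return the names of members smaller than min_bytes (empty files by default)."""
    return [
        name
        for name, size in list_member_sizes(backup_path)
        if size < min_bytes
    ]
```

Running this against each nightly backup and alerting when the list is not empty would at least flag the "database archive is empty" case on the day it starts happening, rather than on the day a restore is needed.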