How to set up HA to recover if there is ever an issue or corruption?

It's been a while, but I was thinking about bringing up my HA again, and before I spend all that time setting it up from scratch I wanted to see what to consider or do during setup so that if the install is corrupted at any point after it's up and running, it's a short downtime before getting it back up and running again…

I had spent a lot of time setting it up while I was out of work last year, and we lost power for 4 days during the hard freeze. I just unplugged it all since it stopped working and was not accessible.

So before I even build a new image, is there hardware or software I can get in order to ensure that if the image gets corrupted again, it's just a restore of sorts to get back up and running without having to reinstall and configure everything again…

Is that even possible? Can HA be run on something other than the RPi?

Just looking for options and suggestions…

Assuming you are using Home Assistant OS, pick one of these:

https://www.home-assistant.io/common-tasks/os#copying-your-snapshots-to-another-location

Automatic daily backups copied off your system to another location are your best bet for a quick recovery.

Which image got corrupted?
A Pi booted from SSD should be reliable. Not sure if HAOS has support for a UPS; if so, it should help to shut down correctly in case of power loss.

Thanks, I'll take a look. I found this one after I posted the question:
https://rsnapshot.org/

But to give some background: back in Feb there was a hard freeze here, and we had a few moments overnight where the power kept going off and on, then finally off for about 2 days…

I would have to look and check what I had actually loaded on the SD card, but I know this much: it wasn't a desktop version; I had to access HA via my PC web browser once I had initially set it up.

So this time around I would like to take my time and set it up in a fashion that is a little easier to manage on a day-to-day basis, as well as have it automatically backed up (a complete bootable image), so that if it went down right now and I had a backup from last night or even earlier today, I could rebuild the SD card image from the backup, boot back up, and be back in business in the amount of time it takes to do that.
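For the complete-bootable-image part, a common approach is to shut HA down, pull the card, and image it from another Linux machine with `dd`. A minimal sketch, where the device path and NAS mount are assumptions (double-check the device with `lsblk` first, since `dd` against the wrong path is destructive):

```shell
# Hypothetical helper: copy a block device (or any file) to a gzipped
# image, then verify the copy byte-for-byte. Run as root on the machine
# the card is plugged into, with HA shut down so the filesystem is quiescent.
image_sd_card() {
  src="$1"   # e.g. /dev/mmcblk0 (assumed device name; check with lsblk)
  dest="$2"  # e.g. /mnt/nas/ha-backup-2021-05-01.img.gz
  dd if="$src" bs=4M status=none | gzip > "$dest" || return 1
  gunzip -c "$dest" | cmp -s - "$src"   # non-zero exit if the copy differs
}

# Usage: image_sd_card /dev/mmcblk0 /mnt/nas/ha-$(date +%F).img.gz
```

Restoring is just the reverse onto a fresh card: `gunzip -c image.img.gz | dd of=/dev/mmcblk0 bs=4M`.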

I didn't realize that the image could get corrupted so easily, and it caught me off guard when nothing was working or accessible again.

I would prefer a local backup, but I guess Google Drive is also an option as I use it for other backups…

What makes it a sucky situation right now is that I had spent so much time setting up Node-RED flows that I finally had working the way we wanted, and this happens…

Then use one of the local options in that list. SAMBA backup is working faultlessly for me.

A word of caution: don’t password protect your snapshots. Makes it very difficult to extract one file if you need it, without doing a full restore.

Agree with Tom; you should use Samba to back up your files, and be sure you also back up your Lovelace config if you edit it via the browser. Not sure about Node-RED though, but my guess is you can also save all the settings.

A must-have add-on is GitHub - sabeechen/hassio-google-drive-backup: Automatically create and sync Hass.io snapshots into Google Drive, if you use the Supervised version.

Furthermore, if you are running a Pi, be sure to boot from an SSD and not from a MicroSD card, because it will be much faster and safer.

You have misunderstood me. I’m talking about this: hassio-addons/samba-backup at master · thomasmauerer/hassio-addons · GitHub for automated backups to a local drive as was requested. He does not want the cloud like your Google solution, which has already been suggested btw.

Do a staggered backup

Assuming you are running HassOS on a Pi with an SD card (inferred from your posts), perform scheduled configuration and image backups to both a USB flash drive as well as a local network location.

Backup the local network location in an encrypted form to a cloud storage provider daily (when it detects changes obviously)

Perform a full image backup prior to any system upgrade in case it goes wrong

Document the backup strategy AND recovery process in great detail so if something does go wrong, you do not need to start from scratch figuring out how to get your system working again.

Monitor the backup locations on a regular basis, make sure you can get the files if you need them, and test cloud storage downloads for transfer speed and file integrity.

Using another sd card, TEST YOUR RECOVERY STRATEGY SEVERAL TIMES A YEAR
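To make the "staggered" part concrete, here is one way a retention pass could look. The file naming scheme (`ha-YYYY-MM-DD.tar`) and the retention numbers are just assumptions to illustrate the idea: keep the last 7 daily backups, keep anything dated the 1st of a month long-term, and delete the rest.

```shell
# Sketch of a staggered retention pass over a directory of backups.
prune_backups() {
  dir="$1"
  # List oldest-first; everything except the newest 7 is a pruning candidate.
  ls "$dir"/ha-*.tar 2>/dev/null | sort | head -n -7 | while read -r f; do
    day=$(basename "$f" | sed 's/ha-....-..-\(..\)\.tar/\1/')
    [ "$day" = "01" ] || rm -f "$f"   # keep month-start backups as long-term copies
  done
}

# Usage: prune_backups /mnt/nas/ha-snapshots   (e.g. from a daily cron job)
```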

I would STRONGLY consider moving your bootable storage to an SSD, only using the SD card (a high-endurance card) for local backups, and using a flash drive for recovery testing.

The reliability of a system like this falls squarely into both the “you get what you pay for” and “you reap what you sow” categories. Inexpensive off-the-shelf SD cards and hobby-grade boards like the Pi are not built with reliability and redundancy in mind, no matter how reliable the software running on them happens to be.
Use “enterprise” or high-reliability storage devices, get a BIG battery backup AND a high-end surge protector, use RAID on networked storage, and back up everything everywhere.

It can be run on many devices. I am running it in Docker on a basement server that performs many functions. It is fast, so that also makes backup and recovery fast; the primary backup location is on the same server, on a different SSD. That location uses Duplicati to back up, encrypted, to Backblaze B2.

I can stop HA, back up all my config and a 3GB database, upgrade to a new version, and have it running again in about 4 minutes.
Recovery from a backup file is even faster: stop HA, rename the corrupt config folder, unzip the backup to the correct location, and startup of HA takes well under a minute.
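That restore flow could be sketched like this. The container name, paths, and the assumption that the backup is a plain tar of the config directory are all illustrative, not a prescription:

```shell
# Rename the broken config aside, unpack the backup into a fresh folder.
restore_config() {
  config="$1"   # e.g. /opt/ha/config (assumed bind-mount path)
  backup="$2"   # e.g. /mnt/backups/ha-latest.tar
  mv "$config" "$config.corrupt.$(date +%s)"   # keep the bad copy for inspection
  mkdir -p "$config"
  tar -xf "$backup" -C "$config"
}

# Usage, wrapped in the container stop/start:
#   docker stop homeassistant
#   restore_config /opt/ha/config /mnt/backups/ha-latest.tar
#   docker start homeassistant
```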

That is the kind of recovery system it sounds like you might want, but it requires maintenance outside of HA and HassOS, since you are now in charge of maintaining the host operating system on which Docker is running. It is also FAR more expensive; the server, with 20TB storage, 32GB of memory, and 4 solid state drives (3 are in RAID5), cost about $2500.

There are of course far less expensive intermediate solutions. Many use a NUC or other small x64 mini-PC; they run from $300 to $1000 configured, with varying levels of performance and reliability, though they do not have the kind of hardware redundancy options that an actual server has: you cannot have a RAID6 storage array, for example, or a redundant power supply.

The question to ask is “how much does it feel like I lost?” if your system goes down and is unrecoverable: the hardware cost, of course, but also the work you put into it. If you value your own time at, say, $20 an hour, and you spent 100 hours setting it all up, that's 2 grand right there! Spending a fraction of that to make the chance of another loss far less likely, and to make recovery from a loss far easier and more rapid, is a no-brainer… if you have the money to spend.

Not what they want

Which is why I keep suggesting the SAMBA backup addon.

That would be the local network location I was referring to. I was saying to then back THAT up offsite, so that if your local backup storage goes down you can still perform a recovery.

OK, but as I said, they don't want cloud storage. Put the offsite backup on a flash drive at someone else's house or your workplace.

I’m in a very similar situation to the OP.

You could shut down HA, pull the SD card and use a utility to duplicate it. But you’d have to do that frequently, and it doesn’t seem worth the effort. It’s hard enough dealing with all the updates, and the breaking changes.

My (admittedly imperfect) solution is two-fold. I do snapshots before I change anything, but not daily (too much wear and tear on the SD card). I immediately copy these snapshots to my NAS, deleting older ones. I also use a synchronize utility to copy the contents of the HA “Config” share via SAMBA to my laptop on a more regular basis.
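A sync like that can be as simple as mirroring the mounted share to a local folder. A sketch, assuming the Samba config share is already mounted somewhere like `/mnt/ha-config` (the mount point and destination are assumptions, and the copy-then-swap trick is one way to avoid ending up with a half-copied mirror):

```shell
# Copy to a temp dir first, then swap it in, so a copy that fails
# halfway never clobbers the previous good mirror.
mirror_config() {
  src="$1"; dest="$2"
  rm -rf "$dest.tmp"
  cp -a "$src" "$dest.tmp"
  rm -rf "$dest"
  mv "$dest.tmp" "$dest"
}

# Usage: mirror_config /mnt/ha-config ~/ha-config-mirror
```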

My theory is that, when HA fails unexpectedly, the SD card will probably need to be formatted or replaced. Having snapshots on the SD card is pointless. They need to be readily available during the restore process. Having a handy copy of the Config share adds to the convenience, and allows me to “top off” things which may have changed between the last snapshot and the last file sync.

The restore process, which I’ve been through once, would be to install the latest (or previous) version of HA, install SAMBA, then run the snapshot restore. It’s actually not as bad as it sounds.

I agree it would be nice to be able to create a “ready to go” SD card on another computer, which would contain the base HA system and have the snapshots pre-applied. Apparently that’s not high on the development priority list.

10x that ^

Restoring from snapshot shouldn’t take longer than 20 minutes if you have the snapshot off the card already.

Download latest image. Burn to card/SSD. Upload snapshot during on-boarding.

As @tom_l said a few times: you should use the Samba Backup add-on. It automatically backs up your whole HA configuration as a snapshot file (.tar), puts it in the backup folder of HA,
AND puts a duplicate on a local network storage (NAS).

If your Pi now crashes, you can easily get the snapshot from the NAS, put the standard HA image onto a new SD card (better: use an SSD), upload the snapshot from the onboarding screen, and you are good to go.
Of course, you can always improve the reliability like @richieframe described, but I think this would be one of the most basic setups.

I’m currently using a NUC with an M.2 SSD in combination with a NAS running in RAID and a small UPS that shuts both of them down if there is an outage.
I would strongly recommend a UPS, as a controlled shutdown prevents most of the irrecoverable crashes.

Let me start by saying thank you all… That is more than plenty to consider and read about and try… The setup that went down on me was very simple, in the sense of the latest RPi and a good SD card… and since we will be moving soon, I'll be taking all the above information into consideration as I start to set everything back up in the new place… I'm not looking to spend a fortune on this, but I'm OK with spending as I build out the system.

So, running a NUC with an SSD: is that a route that would be running Windows and then have Docker running on it?

Since what I'll end up setting up will be dedicated just to this purpose, I'm OK with running whatever needs to be run in order to make it work…

When I posted the initial question, my main question was geared toward a backup solution for the SD card on the Raspberry Pi: basically a way to image the SD card and have that image saved off somewhere, so that if the card got corrupted, I could format the card or get a new card, reload the image, and be back in business.

But it seems that it's a little more complicated than that… So I have a lot of reading to do and consider.

Yes, a lot to consider. Biggest decision is how much money and/or effort you want to throw at this problem.

The reason I stopped using the SAMBA backup add-on is because I simply didn’t need it. As Syntox explained above, it automatically creates a snapshot on the local SD card (or SSD) and THEN copies it to your NAS.

I'd rather do that manually, before making a change, than have a bunch of unnecessary, huge files regularly created on my SD card. I've heard that lots of writes can shorten the life of an SD card. If HA could create the snapshots directly on another storage device, leaving the SD card alone, that would be worth automating to me. YMMV.
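Until something like that exists, a small cron job could at least sweep snapshots off the card as soon as they appear. A sketch, where the `/backup` folder and the NAS mount point are assumptions:

```shell
# Move any finished snapshot tars from the SD card to a NAS mount,
# so they don't sit on the card filling it up.
sweep_snapshots() {
  src="$1"; dest="$2"
  for f in "$src"/*.tar; do
    [ -e "$f" ] || continue   # glob matched nothing
    mv "$f" "$dest"/
  done
}

# Example crontab entry (every 15 minutes):
# */15 * * * * /usr/local/bin/sweep-snapshots.sh /backup /mnt/nas/ha-snapshots
```

This reduces, but does not eliminate, writes to the card: the snapshot is still created there first before being moved off.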

I'm using Proxmox. In simple words, it's an operating system made to create and manage VMs. I've created a HassOS VM to have all the benefits of the Supervisor.
Using virtualization, you're pretty free in what you want to run in the future, as you can easily create another virtual “server” of your choice as long as you have enough memory.

Don't use Windows as a host for running Home Assistant. You have no control over reboots. Use a NUC running straight HA OS, or run Supervised on Debian, or use a VM solution such as Proxmox. Anything but Windows.

+100
And Windows alone eats so much memory.

Not sure of the specific considerations for a NUC, but if you plan to use Docker then I would say Unraid is worth considering. It has good Docker support, is pretty easy to set up and maintain, and has an active community.

I have HA running in Docker, and am using Duplicati to back up, the same as a previous poster. Duplicati supports local backups, and I've been impressed with the speed of the backups and the ease of restoring (both full restores and individual files).