How long should a restore from backup take?

It was meant to be a reference to a classic joke:

A physicist and a mathematician are tasked with making tea. They’re each provided with a kettle, a water tap, a stove and tea leaves.

They both fill up their kettle, put it on the stove to boil then add the tea.

The next day they are tasked with the same but provided with an already filled kettle.

The physicist goes about boiling the water, but the mathematician simply dumps out the water from his kettle and exclaims “Now we’re back to the hypothesis of the first problem, which was previously solved!”

Unexpected surprise:

  • HA system on RPi4 suddenly did not start any more.
  • Re-imaged the same SSD with a fresh HA installation.
  • Vanilla HA system started, asking to be set up or restored from a backup.
  • Chose a backup (2023.10 or .11, only 170 MB) and waited for the restore to finish. In the meantime I read here that it could take very long.
  • Did not have the time, so I powered off the RPi after approx. 30 min with no visible progress.
  • Checked to see in what (expected) broken state the system would (re)start.

But all was OK! No missing integrations, devices or Lovelace setup, all there. It seems the restore had finished (OK?) but “forgot” to update the busy-restoring notification in the browser.

It’s still fresh, so maybe something will pop up later. And/or maybe this N=1 is a pure coincidence…

Well, I’m a bit late to the party, but your [vdrHorst] reply helped a great deal, thanks!

tl;dr: Refreshing the page after a long restore wait immediately showed completion.

Had installed HA image (via RPi Imager) and played a bit (newbie) on one RPi. Decided to use another, and a different SD card, for a more permanent installation.

Opened the :8123 port using my laptop browser.
Made a backup from the first RPi, 6.5MB. (Yes, “MB”)
Imaged the new sd, booted the RPi, waited till initialization finished. Selected “restore from backup”, gave it the 6.5MB .TAR file. Waited. Wondered. Found my way here, and read through the entire thread, to the prior post.

After reading about powering off his Pi, I wondered… and hit [F5] on the tab for homeassistant.local:8123, which was still showing the chasing circle after ~45min.

And immediately got the expected “overview” page with all devices (apparently).

So it appears that, in this case, restoring a 6.5MB .TAR file to a new Home Assistant SD image on an RPi4, the “restoring” screen chasing circle never updated (~45mins) until I refreshed the webpage. And now all looks right.

Thanks again to all!

That seems like a bad idea; a cheap SSD is still better than an expensive SD card.

Depends on the hardware, I guess. The USB3 SSD that I used didn’t work well with my RPi4. When the RPi4 is physically rebooted I usually have to manually unplug the USB and plug it back in for it to boot into HA. I failed to find out why and couldn’t be bothered to use it anymore.

Maybe USB SSD isn’t a good idea. I don’t know.

I’ve had exactly the same thing happen to me: “hung” on the restore screen until I refreshed my browser. I wonder if it’s a “browser thing”.

It’s definitely a browser timeout of some kind. The backup/restore routines should push a refresh to the browser. Apparently they don’t. I often get the same, on backup or restore. I use iotop to monitor when it’s done. My backups are over 400MB and it can be painfully long. 120 minutes plus for a restore! Even using an SSD on an RPi4/8GB.
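If iotop isn’t available, the same idea can be approximated by sampling the kernel’s write counters and waiting for them to settle. A rough sketch only, assuming shell access to the host with Python available, a Linux /proc/diskstats file, and a device name such as mmcblk0 (SD card) or sda (USB SSD). The function names and the threshold are made up for illustration and are not part of Home Assistant:

```python
import time
from typing import List


def sectors_written(diskstats_text: str, device: str) -> int:
    """Return the cumulative sectors-written counter for `device`.

    Each /proc/diskstats line is: major, minor, name, then the I/O counters;
    sectors written is the 10th whitespace-separated field (index 9).
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise KeyError(device)


def is_quiet(samples: List[int], window: int = 3, threshold: int = 2048) -> bool:
    """True if each of the last `window` increases is at most `threshold` sectors.

    2048 sectors of 512 bytes is 1 MiB per sampling interval (an arbitrary guess).
    """
    if len(samples) < window + 1:
        return False
    recent = samples[-(window + 1):]
    return all(later - earlier <= threshold
               for earlier, later in zip(recent, recent[1:]))


def watch(device: str, interval: float = 10.0) -> None:
    """Sample the counter every `interval` seconds until writes settle."""
    samples: List[int] = []
    while not is_quiet(samples):
        with open("/proc/diskstats") as stats:
            samples.append(sectors_written(stats.read(), device))
        time.sleep(interval)
    print("Disk writes have settled; try refreshing the browser.")
```

Calling `watch("mmcblk0")` (or whatever your device is called) blocks until three consecutive samples show little write activity. Quiet disks only suggest the restore is done; the refresh is still what confirms it.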

Outstanding issues with restoring backups

There are at least four issues outstanding.

1. Errors are not shown on upload failures

Problem: Backups just fail silently (usually immediately), which manifests as it taking hours to fail when in reality it usually never actually started. I suspect this is what most people here are facing.

Workaround: Periodically refreshing the page seems to be fine. If you see the restore backup page again after a refresh, your restore has failed and you will need to investigate further.

Solution: I have described the issue of needing to refresh the page once the restore is completed here and proposed a fix here too.

2. Version migration

Problem: If your backup was created using a newer supervisor (2023.12.1) than what is pre-installed on your new HA install (2023.12.0), your restore will fail.

Workaround: I wouldn’t recommend it, but strictly speaking it is possible to modify the backup.json file inside your backup archive. I did this and experienced issues with some of my add-ons reinstalling, because there were changes in the way the supervisor stored config data in the newer version. Again, downgrading your backup isn’t recommended.

Solution: The supervisor needs to automatically update itself to the latest version when restoring a backup, but it doesn’t. An issue has been raised for this.
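To check for this mismatch before starting a restore, you can read backup.json out of the archive without modifying anything. A minimal sketch, assuming the backup is an ordinary .tar with backup.json at its top level and that the file has a `supervisor_version` key (that key name is my assumption, so check your own file). It also assumes purely numeric versions like 2023.12.1; the function names are made up for illustration:

```python
import json
import tarfile
from typing import Tuple


def read_backup_metadata(path: str) -> dict:
    """Return the parsed backup.json from a backup .tar archive (read-only)."""
    with tarfile.open(path) as archive:
        for member in archive.getmembers():
            # The entry may be stored as "backup.json" or "./backup.json".
            if member.name in ("backup.json", "./backup.json"):
                handle = archive.extractfile(member)
                if handle is not None:
                    return json.load(handle)
    raise FileNotFoundError("backup.json not found in " + path)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2023.12.1' into (2023, 12, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def backup_is_newer(backup_version: str, installed_version: str) -> bool:
    """True if the backup was made with a newer version than the one installed."""
    return parse_version(backup_version) > parse_version(installed_version)
```

For example, `backup_is_newer("2023.12.1", "2023.12.0")` is true, which is the failing case described above. The comparison is numeric on purpose: a plain string comparison would wrongly rank 2023.9.0 above 2023.12.0.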

3. Unknown errors

Problem: Even if we did show the error message from the API when a restore fails, that error most often is just “unknown error”.

Workaround: You will need to check the supervisor logs. Unfortunately, this just isn’t always possible, i.e. with headless devices, like a Raspberry Pi stored in your networking cupboard.

Solution: The Onboarding API should have a way to check the supervisor logs. A new issue or PR will need to be raised for this.

4. HTTP server goes down during a restore

Problem: It seems like the HTTP server goes down during a restore. This can look like the restore failed but it’s usually fine and you just need to wait a few minutes.

Workaround: You can check the supervisor port to see whether the install is still healthy.

Solution: There is currently no handling in the frontend or Supervisor APIs to update the progress while the server is down during the restart. It should show something like a “Restarting… Please wait 5 minutes.”-type message. A PR has been raised for this.
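Until that lands, the “wait a few minutes” step can be scripted from another machine. A minimal sketch, assuming the instance answers at http://homeassistant.local:8123/ (the address mentioned earlier in the thread; yours may differ). The function names are hypothetical, and 30 attempts at 10 seconds apart simply adds up to the 5 minutes suggested above:

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional

# Assumption: adjust the host, port and scheme for your own install.
HA_URL = "http://homeassistant.local:8123/"


def is_up(url: str = HA_URL, timeout: float = 5.0) -> bool:
    """Return True if anything answers HTTP at `url`, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status; it is still up.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure or timeout.
        return False


def wait_until_up(
    probe: Callable[[], bool] = is_up,
    attempts: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds in between.

    Returns the 1-based attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

`wait_until_up()` returning a number only tells you the HTTP server is back; it cannot tell a finished restore from a fresh onboarding page, so you still need to refresh the browser and look.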

For me it worked to just reboot the system 30 minutes after the restore started. HA then booted normally with the backup fully restored. You can probably reboot after just 5 minutes; my guess is that the restore does not really take very long, it just does not show that it has finished.

I did have a small backup (1 MB) and I run HA on an old laptop, which performs much faster than the RPi4.

Looks like there should be significant fixes to the backup / restore processes in 2024.2 with some of the latest changes that have gone in.

The first time I tried to restore from a backup I let it run and it was still showing the “in progress” spinner 24 hours later.

This link helped ease my doubts:

https://www.derekseaman.com/2023/04/home-assistant-pt-3-restoring-your-configuration.html

Here is a quote from the article:

Waiting on the Restore to Complete

At this point the Home Assistant UI fails us. There is ZERO feedback on if the restore is working, if it failed, or had errors. And you have no idea when it is done. The Restore in progress window remains, even after the restore is complete. Huge fail. For reference, I restored a 1GB HAOS backup and it was completed (per Proxmox stats) in about 10 minutes. But 40 minutes later the Restore in progress window was still there.

If you are restoring HAOS on Proxmox, once the restore starts, flip to the VM Summary screen and monitor CPU, network and disk activity. After things quiet down, proceed. See the screenshot below for what the performance stats of my HAOS VM looked like during the restore process.

After you think the restore has finished, refresh your web browser and see if you get a regular Home Assistant login box. If not, the restore is not yet complete or it failed. Wait a few minutes and try again. If your production HA server was using SSL certificates, then before you refresh your browser change the URL to use HTTPS. If you don’t do this you’ll be sitting there all day refreshing even if the restore is complete. You may need to open a new browser window to get the refreshed HA login page.

The person in the comment above you made a number of PRs to fix many of those restore issues. They wrote a post with a full rundown of their PRs, etc., a few comments above that: How long should a restore from backup take? - #50 by codyc1515

After a power outage HA stayed in recovery mode. I restored an older backup of 1.5GB and that took about 30 minutes. Now I’m trying to restore the latest backup, which is 22GB (I know, way too big; it also contains all the recordings from my recently installed Frigate), but after 7-8 hours it’s still not finished. Should I wait a little longer or try rebooting my NUC?

Home Assistant observer

Supervisor: Connected
Supported: Supported
Healthy: Healthy

Since this topic started, many things have changed in how the Supervisor deals with backups, but the restore process still seems to be an issue, even though it has had some upgrades to speed it up. If you struggle with a large backup, I would suggest doing it manually, if you are techy enough. See my post above referencing a GitHub issue on how to copy things manually from the archive.

I think the only thing I did a bit differently from the instructions is that I started HA and the add-on, and once I had made sure the add-on was running, I shut down HA from the terminal so that it would not interfere while the files were being overwritten.

So so so… Time to share my experience, just sharing, no judgment :grin:.

After happily running my Home Assistant for around 4 years with no breakdown whatsoever, I bumped into a post saying that SD cards do not last forever, so I decided to be ready in case the worst happened. Bought a new SD card, flashed it with the Raspberry Pi Imager and made a full backup before swapping the cards. So far, so good. I have a Raspberry Pi 4, the latest OS, latest HA version, latest supervisor, no update pending, clean and neat HA, “by the book”. Backup is not that big, less than 400MB.

I put in the fresh SD Card and after a few minutes I get the “Onboarding” screen. I select my backup, launch the full restore and… nothing happens, for like hours: no blinking of the green led, but apparently heavy tcp traffic. Connecting to the HA simply shows the Onboarding screen, refreshing the page does not help. I have restarted it completely, same onboarding screen. I decided to do the process a second time (maybe I did something wrong the first time, thick fingers or something like it), same result.

I let it run the whole night and the day after, still nothing, same situation.

Because I had Grafana/influxDb stuff in there I thought this could be the part taking ages (when one is desperate, any action is an option), so I went for a partial restore without this block: Restore completed in < 15 min!

I was so surprised that I flashed the SD Card again to try a partial restore selecting all items to validate that indeed the Grafana/InfluxDB combo was guilty as charged: Restore done in < 15 min again!

I hope I am not creating confusion, but I must say I was pretty baffled by the result. Basically, the full restore did not do anything while the partial restore with all items selected went super smoothly and the end result looks identical to the previous instance on the old SD Card: only exception is that I had to activate “Remote Control” for Home Assistant Cloud, but that’s just a check box to tick in the UI, easy peasy.

So, if you are desperate, you can always try that :slight_smile:

Cheers,
Rom.

Exactly this happened to me. The supervisor of the new install was the previous release (2024.11.2) instead of the 2024.11.3 of the backup. I waited an hour before I checked the logs and found out that the restore hadn’t even begun.

Because I’m only reading your answer now, after fixing it, I used another method:
The command supervisor update told me it was already the newest version. I think they reverted the stable release to the previous one (.2) because .3 had some issues.

I found this script: https://version.home-assistant.io/update-supervisor.txt
which can update (change) the supervisor version. Because it automatically fetched the latest stable release, which was no longer the .3 version, I had to type all the docker commands by hand in the terminal using a German keyboard with an English layout :face_exhaling:

But luckily it did work and the supervisor is updated now. Just need to wait until the restore - hopefully - finishes (yes, there are already other errors in the logs…)