Outstanding issues with restoring backups
There are at least four issues outstanding.
1. Errors are not shown on upload failures
Problem: Restores fail silently, usually immediately. Because no error is shown, a restore can appear to be running for hours when in reality it never started. I suspect this is what most people here are facing.
Workaround: Periodically refreshing the page seems to work. If you see the restore backup page again after a refresh, your restore has failed and you will need to investigate further.
Solution: I have described the issue of needing to refresh the page once the restore is completed here and proposed a fix here too.
2. Version migration
Problem: If your backup was created using a newer supervisor (2023.12.1) than what is pre-installed on your new HA install (2023.12.0), your restore will fail.
Workaround: I wouldn’t recommend it, but strictly speaking it is possible to modify the backup.json file inside your backup archive. I did this and had issues with some of my add-ons reinstalling, because the way the supervisor stores config data changed in the later version. Again, downgrading your backup isn’t recommended.
Solution: The supervisor needs to automatically install the latest version when restoring a backup, but currently it doesn’t. An issue has been raised for this.
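Before considering the workaround in item 2, it may help to just look at what backup.json says without changing anything. Below is a minimal read-only sketch, assuming the backup is a tar archive with a backup.json file in it, as described above. The function name read_backup_metadata is my own, and I make no assumptions about which keys the file contains, so it simply returns whatever is there.

```python
import json
import tarfile
from pathlib import PurePosixPath


def read_backup_metadata(archive_path):
    """Return the parsed backup.json from a backup archive (read-only)."""
    # "r:*" lets tarfile detect the compression (none, gzip, ...) itself.
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            # Compare the file name only, so both "backup.json" and
            # "./backup.json" are matched.
            is_metadata = PurePosixPath(member.name).name == "backup.json"
            if member.isfile() and is_metadata:
                with archive.extractfile(member) as handle:
                    return json.load(handle)
    raise FileNotFoundError(f"no backup.json found in {archive_path}")
```

Calling read_backup_metadata("my_backup.tar") (the file name is a placeholder) and printing the result shows the recorded versions, which you can compare against what your fresh install is running.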
3. Unknown errors
Problem: Even if we did show the error message from the API when a restore fails, that error most often is just “unknown error”.
Workaround: You will need to check the supervisor logs. Unfortunately, this isn’t always possible, e.g. with headless devices like a Raspberry Pi stored in your networking cupboard.
Solution: The Onboarding API should have a way to check the supervisor logs. A new issue or PR will need to be raised for this.
4. HTTP server goes down during a restore
Problem: It seems like the HTTP server goes down during a restore. This can look like the restore failed but it’s usually fine and you just need to wait a few minutes.
Workaround: You can check the supervisor port to see whether the install is still healthy.
Solution: Neither the frontend nor the Supervisor APIs currently handle the server being down during the restart, so the progress is never updated. There should be something like a “Restarting… Please wait 5 minutes.”-type message. A PR has been raised for this.
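The "wait and check the port" workaround in item 4 can be scripted from another machine on your network. This is only a rough sketch: it checks that something accepts TCP connections on a port, which is not the same as a full health check. The function names, the five-minute default and the polling interval are my own choices, and the host and port are whatever applies to your install.

```python
import socket
import time


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def wait_for_port(host, port, total_wait=300.0, interval=10.0):
    """Poll host:port until it accepts connections or total_wait seconds pass."""
    deadline = time.monotonic() + total_wait
    while True:
        if port_is_open(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

If wait_for_port(host, port) returns True, the server is back up and refreshing the page should work. If it returns False after the full wait, it is probably time to look at the logs.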