I run HAOS in a VM on Proxmox. During some cluster reconfiguration I created a problem where my HA VM started up on more than one node. This corrupted Boot Slot A; Boot Slot B is still fine.
I can see that boot slot A is /dev/sda3 and B is /dev/sda5. I thought I would be clever and just copy /dev/sda5 → /dev/sda3 with dd, which went fine. However, HA still will not start on that slot, and “os info” shows A as “status: bad, version: null”.
I have nightly HA backups so I have no concern with being able to spin up a new HA VM and reload my backup - however I am a stubborn nerd and this strikes me as an opportunity to learn something.
Where is the metadata for the “boot slots” (i.e. these disk partitions) stored? I started poking around in the GRUB configuration but didn’t see anything obvious. What do I need to do to mark Slot A as good? Is there any reason that my bitwise copy of Slot B to Slot A shouldn’t lead to a working Slot A?
Thanks for any help! Again, I can restore from backup if I need to - but I wouldn’t mind just fixing what I have and learning something in the process.
HAOS can now boot just fine from Slot A - HOWEVER “os info” still tells me Slot A is “version: null”. I would still like to fix that!
Looking around in /mnt/boot/EFI/BOOT I can see where the EFI boot config is all set up, but I don’t see anything there about status, version, etc. - so I’m still not sure where this “boot slot” metadata is being stored…
For anyone who finds this later - I was able to completely fix Slot A by making sure I was booted to Slot B, downloading the “ova” raucb file for 14.2 (the current version) from GitHub (home-assistant/operating-system releases), and then running “rauc install {imagefile}”. Decent documentation is available on the “Update system” page of the Home Assistant Developer Docs.
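For reference, the steps described above might look roughly like the sketch below, run from the HAOS host shell while booted from the healthy slot. The helper name `reinstall_slot` and the bundle file name are assumptions for illustration (use whatever file you actually downloaded); `rauc status` and `rauc install` are the standard RAUC commands. Note the caution in the reply that follows before trying this yourself.

```shell
# Hypothetical helper (not part of HAOS): reinstall a downloaded bundle
# into the slot that is NOT currently booted.
reinstall_slot() {
    bundle="$1"

    # Refuse to continue if the bundle file is missing.
    if [ ! -f "$bundle" ]; then
        echo "bundle not found: $bundle" >&2
        return 1
    fi

    # Show the slot state before changing anything.
    rauc status

    # RAUC installs to the inactive slot, so boot from the good slot first.
    rauc install "$bundle" || return 1

    # Show the slot state again after the install.
    rauc status
}
```

Usage, assuming the downloaded file is called `haos_ova-14.2.raucb` (an assumed name):

    reinstall_slot ./haos_ova-14.2.raucb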
Please, don’t do this and don’t recommend others to do it, as it puts RAUC metadata out of sync and it leads to confusion when troubleshooting different issues. There is no need to “repair the damaged slot” - the “bad” status only indicates that boot didn’t finish successfully on the last attempt - it can be fixed either by a successful boot (maybe you just turned it off before it finished booting) or by re-installing another version that works (in case the version was somehow broken on your device). It’s not something that should bother you or that would hinder future update attempts.