Help: My HA won't boot

Hello. I have been a fairly basic user of HA for about 15 months and have gradually added more devices and automations. I run a standard setup on an RPi 4 using an SSD as the boot device. It’s been working fine for a long time, and I’ve kept it updated and dealt with the various niggles caused by updates, but yesterday my world came crashing down. I’d updated the core to 24.4 a few days ago and that was fine. Yesterday I noticed that HACS and some of the HACS integrations had updates, so I ran the updates, checked the HA configuration was OK and restarted HA from the GUI. It never came back up.

I didn’t know what had happened. Watching in a browser whilst it was booting, I could see that HA was starting, but then it just stopped responding. If I connect a screen to the HDMI port I can see HA Supervisor start and I can access the CLI, so I thought: OK, what I need to do is restore from backup. Typing help gave me the backup command, so all I needed to do was find the slug of the most recent backup (I back up to a Samba share using the built-in Samba support in HA). Eventually I found out that I needed to type login to be able to run
ha backup | more
to see the backup details and find the slug.
Great, I thought, so I went back into the CLI and typed the restore command:

Error: 'BackupManager.do_restore_full' blocked from execution, system is not running - startup

Oh, now what? I googled and didn’t find anything useful, so I thought that system might mean core, so I looked for a core command and tried

core start

and got

Error: Another job is running for job group home_assistant_core

I’m stuck. Where do I go next? I’m not finding any guide or helpful info. I thought I would be better off using a standard setup if things went wrong, but I’ve realised that I have no real idea how HA works or how to recover. Even if I get things running again, I don’t know how much time I will lose, how dependent I’ve become on HA, or what choices I need to make to cope with it going down.

Any advice will be gratefully received.

To see the logs, enter

ha core logs

or try

ha core rebuild

Thanks for responding, Francis. I tried core rebuild and that eventually completed, then I tried core restart, and after about 20 minutes the web page was visible and the startup completed, albeit the web page was very unresponsive. Now I get connection lost. My Zigbee adaptor was unplugged as I’d moved the Pi to my bench, so I’ve plugged it back in and restarted, and eventually I have a web page, but it’s extremely unresponsive and the log seems to have lots of timeout errors.

Looking at logs I see tons of errors, but nothing stands out and I don’t know how I can paste the results here.

Try disabling all your custom components, and then restart HA. If it is better, then you can enable your custom components one by one to find the problem.

As I can’t get to the GUI, can I do this from the CLI?

Not that I know.
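
For anyone landing here with the same question, here is a minimal sketch of one possible approach. It assumes you can reach a host shell by typing login at the console, and it assumes the configuration folder lives at /mnt/data/supervisor/homeassistant (the usual location on Home Assistant OS, but check yours before running anything). Renaming the custom_components folder should stop Core from loading any custom integrations on the next start. The CONFIG_DIR variable and the function name are made up for this example.

```shell
#!/bin/sh
# ASSUMPTION: default config location on Home Assistant OS; override CONFIG_DIR if yours differs.
CONFIG_DIR="${CONFIG_DIR:-/mnt/data/supervisor/homeassistant}"

# Hypothetical helper: move custom_components aside so Core skips it on the next start.
disable_custom_components() {
    src="$CONFIG_DIR/custom_components"
    if [ -d "$src" ]; then
        mv "$src" "$src.disabled" && echo "disabled: $src"
    else
        echo "nothing to disable in $CONFIG_DIR"
    fi
}

disable_custom_components
# Then restart Core from the HA prompt:  ha core restart
```

To re-enable the components, rename the folder back to custom_components and restart Core again.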

I gave up trying to restore from HA in the end, created a new install on an SD card and then restored from my file share. I was able to upgrade HA Core and HACS, but when I updated Tuya Local I was back to being locked out, so it seems I have the culprit. With such a catastrophic effect I would have thought there would be other reports of this, but I don’t see any. I can’t point to anything specific, so I’m not sure how the devs would go about trying to address whatever it is. Maybe it’s something to do with my setup? Who knows?

I haven’t seen an upgrade of local tuya in months, but there are different forks. I would open an issue on their GitHub.


Well, now I’m truly mystified. I was able to restore using the CLI (I discovered you can only restore full backups) and then gradually applied updates to all the custom components in HACS and to HACS itself, taking full backups as I went.

Finally I was left with the Tuya Local update. So I took a full backup again, having set up the Samba backup storage, and blow me if it doesn’t all work fine now. So I’m none the wiser, but I now have a functioning HA again.

Thanks, Francis, for responding in my hour of need. At least I’ve learnt a bit more about HA, even if I’ve no idea what happened. What I take away from this is to update components one at a time and take frequent full backups just in case. I’d just got too blasé about updates just working.
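
In case it helps anyone else, the "full backup before each update" habit can be sketched as a small script around the ha CLI. This is only an illustration: the run and backup_before_update helpers and the backup naming scheme are made up here, and by default the script only prints the command it would run (set DRY_RUN=0 on the HA host to actually execute it).

```shell
#!/bin/sh
# Sketch: take a named full backup before updating a single component.
# DRY_RUN=1 (the default here) only prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 is a free-form label for the component about to be updated.
backup_before_update() {
    run ha backups new --name "before-$1-update"
}

backup_before_update "tuya-local"
```

After each update, restart and check that the GUI comes back before moving on to the next component. If it doesn’t, the backup taken just before that update is the one to restore.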
