Recovering a half-dead HAOS system

I have a remotely located Pi 4 running HAOS, HA and a number of add-ons. It’s been running fine for ages and was last rebooted for the most recent HAOS update. Various containers have restarted as and when they were updated. I’m now in a situation that I’m trying to recover from without having to drive to the remote site. I also want a way to recover in future.

  • I can access a previously logged-in session to HA (though I can’t log in from a new private window).
  • The default dashboard can be accessed and I can control devices that don’t require the services of an add-on.
  • I can see that any integration that relies on an add-on is showing ‘Unavailable’.
  • I cannot access the System, Developer or Notifications pages.
  • I cannot access any of the add-ons.
    • I cannot ssh into the terminal add-on (which I normally can).

So Home Assistant Core is running, but (maybe) nothing else is, and I have no way to reboot the system.

Before I schedule a remote site visit to physically power-cycle the Pi, is there anything else I can try?

Moving ahead, is there anything I could have done ahead of time to enable recovery?

Gareth

No suggestions for fixing it remotely, but for future recovery I use a Shelly Plug with cloud enabled so I can remotely power cycle my Pi4. It’s allowed me to recover a hung OS several times.
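
A software-side complement to the plug, as a rough sketch rather than a tested recipe: if a long-lived access token is created in the HA user profile ahead of time, the Core REST API can be probed, and asked to reboot the host, from any machine that can reach the instance. The base URL, the token and the function names below are placeholders of mine. The `hassio.host_reboot` service goes through the Supervisor, so in a state like the one described above, where the Supervisor-backed pages were unreachable, the call may well fail and the plug remains the fallback.

```python
import json
import urllib.request


def build_request(base_url, token, path, payload=None):
    """Build an authenticated request for the Home Assistant REST API.

    A request without a payload is sent as GET, one with a payload as POST.
    """
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        method="GET" if payload is None else "POST",
    )
    req.add_header("Authorization", f"Bearer {token}")
    req.add_header("Content-Type", "application/json")
    return req


def core_api_alive(base_url, token, timeout=10):
    """Return True if GET /api/ answers with HTTP 200, False on any network or HTTP error."""
    try:
        with urllib.request.urlopen(
            build_request(base_url, token, "/api/"), timeout=timeout
        ) as resp:
            return resp.status == 200
    except OSError:  # URLError, HTTPError and timeouts are all OSError subclasses
        return False


def request_host_reboot(base_url, token, timeout=30):
    """Call the hassio.host_reboot service; raises if Core or the Supervisor refuses."""
    req = build_request(
        base_url, token, "/api/services/hassio/host_reboot", payload={}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

If `core_api_alive(...)` returns True but `request_host_reboot(...)` raises, that would match the half-dead state described in the first post, and a power cycle is probably the only option left.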

Thanks, @mightybosstone. Good suggestion.

By way of an update to this.

While I was exploring what I could do, Home Assistant restarted and I found I could access Settings and initiate a reboot. I can now see a big gap in the history for any device integrated through an add-on. Those integrated directly continued to function. Also:

  • MariaDB, which I use to store history, was working.
  • InfluxDB was working and collecting data.

Maybe the problem wasn’t that the add-ons stopped, but that API access to them failed.

Unfortunately, the system log doesn’t go back before the latest boot. I can see that the folder /var/log/journal exists and contains files, so there are logs from before the reboot, but the Terminal add-on doesn’t have journalctl installed to read them.
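
One possible workaround for the missing journalctl, as a sketch only: recent Supervisor versions expose the host journal over their HTTP API, including earlier boots by offset. The code below assumes it runs inside an add-on that has a `SUPERVISOR_TOKEN` environment variable, that Python is available there, and that the installed Supervisor provides the `/host/logs/boots/<offset>` endpoint. I have not verified any of those for this particular system, and the function names are mine.

```python
import os
import urllib.request

# Hostname under which add-ons reach the Supervisor API.
SUPERVISOR_URL = "http://supervisor"


def previous_boot_log_request(boot_offset=-1, token=None):
    """Build the request for the host journal of an earlier boot.

    boot_offset=-1 means the boot before the current one, -2 the one before that.
    """
    if token is None:
        token = os.environ.get("SUPERVISOR_TOKEN", "")
    req = urllib.request.Request(f"{SUPERVISOR_URL}/host/logs/boots/{boot_offset}")
    req.add_header("Authorization", f"Bearer {token}")
    return req


def fetch_boot_log(boot_offset=-1, timeout=60):
    """Download the journal text for the given boot; raises on HTTP or network errors."""
    req = previous_boot_log_request(boot_offset)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Printing the tail of `fetch_boot_log(-1)` should then show what the host logged just before the reboot, if the endpoint is available.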

Hi,

I have the same type of issue. The last time this freeze happened, a couple of weeks ago, I enabled the diagnostic sensors for the add-ons.

Yesterday HAOS froze again. What I found was that all the other sensors were normal, but MariaDB has some activity every night at 04:15, and yesterday it made HAOS freeze. This is the best finding I have so far. I haven’t had time yet to check what MariaDB does every night.