HELP! Home Assistant OS has crashed

I am running Home Assistant OS on an Intel NUC.

This morning I went into Configuration - Automations, and got this:

Error while loading page automation

I had never seen this before, so I wanted to try a restart. I went to Configuration - Settings, and got this:

Error while loading page server_control.

Not good. I knew I could restart the server from my profile, but there I got:

Error while loading page profile.

The last thing to try was a power off/on reset of the server.
Nothing changed. Still lots of errors.

Ok, my last backup was only 13 hours ago, so config - backups, select the last snapshot and…:

Could not restore backup

Restoring a backup is not possible right now because the system is in startup state.


Now what? Is my HA installation hosed?
Unless someone has another idea, my plan is to start completely over: re-flash the M2 SSD in the NUC with the NUC image, then do a restore.

Is it the OS that is in startup state, or Home Assistant itself? In any case, I would check the system logs.

Check for config files on disk. If they exist, you’re OK.

Check inside the config folder for the .storage folder. If there are files in it, you are OK.
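
If you want to script that check, here is a minimal Python sketch. The `/config` path, the `check_config` name, and the assumption that every file in .storage is JSON are mine, not something confirmed in this thread, so adjust as needed.

```python
import json
from pathlib import Path


def check_config(config_dir):
    """Return a list of problems found in a Home Assistant config folder."""
    root = Path(config_dir)
    problems = []
    if not (root / "configuration.yaml").is_file():
        problems.append("configuration.yaml is missing")
    storage = root / ".storage"
    if not storage.is_dir():
        problems.append(".storage folder is missing")
        return problems
    files = sorted(p for p in storage.iterdir() if p.is_file())
    if not files:
        problems.append(".storage folder is empty")
    for path in files:
        # Assumption: files in .storage are JSON documents.
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (ValueError, OSError) as err:
            problems.append(f"{path.name}: {err}")
    return problems


if __name__ == "__main__":
    # "/config" is an assumed location; point this at your own config folder.
    for line in check_config("/config") or ["no obvious problems found"]:
        print(line)
```

An empty list only means the files exist and parse. It does not prove the contents are what you expect.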

What version are you using? Did it update on its own? I would back up the config folder, making sure to get .storage, before doing anything else.
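
A rough sketch of that kind of manual copy, using only the Python standard library. The function name and paths are hypothetical, and this is not a substitute for the built-in backups. Write the archive somewhere outside the config folder so it does not try to include itself.

```python
import tarfile
from pathlib import Path


def backup_config(config_dir, archive_path):
    """Archive config_dir (hidden folders included) as a .tar.gz file.

    Returns True if a .storage folder ended up inside the archive.
    """
    root = Path(config_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # tar.add recurses and does not skip dot-folders such as .storage.
        tar.add(str(root), arcname=root.name)
    with tarfile.open(str(archive_path), "r:gz") as tar:
        names = tar.getnames()
    return any(Path(name).parts[1:2] == (".storage",) for name in names)
```

For example, `backup_config("/config", "/share/config-copy.tar.gz")` would be one way to call it, assuming those two paths exist on your system. A return value of False means .storage did not make it into the archive.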

Do you use add-ons? Just curious, probably not relevant.

It’s in Home Assistant. And basic functions still work. I can turn lights on and off from HA, and sensor values show normally in Lovelace… Cameras and HASP are just blank pages.

I am not running Docker; it’s HA OS from the image file.

Log file (configuration - system) shows what looks like a normal startup. Just a couple of warnings from Samba Backup and from Node-Red. But I see these all the time.

Looks normal here.

Looks normal, I think. I never look here. What’s a pickle file?

Host: HA OS 7.2
Supervisor: 2022.01.1
Core: 2022.2.2
I have auto-updates on.

They are backed up on another NUC running a NAS, but what is your concern here? If I do a restore, then isn’t everything restored, including .config?

I have most recently been doing stuff in Node-Red.

Update
Still limping along. I was able to find a page where I could restart the server: configuration - system - host. But still no difference.

I am noticing that all of the errors say something like “Error while loading page history.” Could this be a Lovelace problem? I ran ui-lovelace.yaml through a YAML lint program, but it found no errors.

Automations are still working, Node Red is still working, so functionally Home Assistant is mostly working. So, I am in no rush to re-flash HA OS on the M2 disc. I ordered another one.

QUESTION:
Am I on the right track to flash the HA OS on a new M2 SSD then restore from an offline snapshot?

There is a method for repairing the database. Try that.
This is not necessarily critical, however.
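
One low-risk way to at least check the database is SQLite's built-in `PRAGMA integrity_check`. The sketch below is only that, a sketch: it assumes the recorder is using the default SQLite file, and the `integrity_check` function name is made up for illustration. Run it against a copy of the file, ideally taken while Home Assistant is stopped.

```python
import sqlite3


def integrity_check(db_path):
    """Return the messages reported by SQLite's PRAGMA integrity_check.

    A healthy database yields the single message "ok".
    Note: sqlite3.connect creates an empty file if db_path does not exist.
    """
    conn = sqlite3.connect(db_path)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
```

Anything other than a single "ok" suggests the file is damaged. This only checks the file; it does not repair anything.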

Do you mean to delete home-assistant_v2.db?

Tried that. No difference.

I was preparing to flash a spare M2 SSD, and when I went to shut down the server from Home Assistant, all issues had resolved themselves.

While that saved me some work, it is also worrisome because I don’t know why it failed in the first place.

Did it actually fail, or just show an error message?

Maybe an integration auto-update created a small DB issue that required a restart to resolve. If everything is still working, this is not a big worry?

Did you read the whole post?
There was nothing in the logs that looked suspicious.
Deleting the database made no difference.
When clicking on various pages in Lovelace, many of them simply said “Error while loading page…”.
Several reboots later the error messages kept coming.
This persisted for more than 24 hours.

Then suddenly, those pages were good again.

Wouldn’t you be worried about why this happened in the first place?

Yup

I didn’t completely understand the original symptom, but that didn’t necessarily matter, since the question was “Now what? Is my HA installation hosed?”

Since you want to know the cause, I am now asking about the failure state. I know the supervisor did not work, but were you still able to access and control devices? I can’t tell from the above.

Thanks for caring. I was delayed in responding because of a problem in one of my servers.

Yes, devices, except cameras, worked fine. Automations worked and I could control lights from HA. HASP devices still worked as well as ESPHome devices.

I don’t know HA OS or Supervised well, but it sounds like a problem with the supervisor, since the server controls and some config pages did not work but the integrations were OK.

If the integrations remained functional, I would not worry, but I would look into it. It is like temporarily losing SSH access to the HA server: it would concern me, but if the server still functioned I would not see it as a big issue, and I would assume SSH had a problem unless it kept occurring. Loss of integration control is big. Camera loss is big; maybe this is a hint to the cause. A memory issue, maybe?

I just saw someone who had a problem where the automation page would not load due to a browser issue.

It occurs to me that maybe this was your issue.

Did you test from other devices when the problem occurred? Cameras and only some pages not working could be a sign of a browser issue, specifically the fact that all integrations were OK except the cameras.

Tried three browsers on two computers.

Here is some information on the info tab:

host_os: Home Assistant OS 7.2
update_channel: stable
supervisor_version: supervisor-2022.01.1
docker_version: 20.10.9
disk_total: 457.7 GB
disk_used: 12.6 GB
healthy: true
supported: true
board: generic-x86-64
supervisor_api: ok
version_api: ok
installed_addons: Terminal & SSH (9.3.0), Samba share (9.5.1), Node-RED (10.4.0), TasmoAdmin (0.16.0), ESPHome (2022.1.3), Samba Backup (5.0.0), File editor (5.3.3)

I forget how much RAM is installed, but sensor.memory_free shows 14 MB.
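
To put that number in context: on Linux, a low "free" figure on its own does not necessarily mean memory pressure, because the kernel uses otherwise idle RAM for cache. `MemAvailable` in `/proc/meminfo` is usually the more telling value. Below is a small sketch for reading both from a shell on the host. The function names are hypothetical, and I am assuming `/proc/meminfo` is readable from wherever you run it.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory summary (Linux only)."""
    with open(path, encoding="ascii") as handle:
        return parse_meminfo(handle.read())
```

Calling `read_meminfo()` and comparing `MemTotal`, `MemFree`, and `MemAvailable` should show both how much RAM is installed and whether the system is actually short of memory.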