HA OS reboots/crashes a few times per day... where to start to fix?

Hello Everyone,

I’m running Home Assistant OS on my Raspberry Pi 4, booting from a self-powered SSD drive.
This setup has been rock solid for the past few weeks.
I also have the conbee II USB key. Nothing else is connected to the Raspberry Pi USB ports.
But it looks like since upgrading to OS 6.0, it started rebooting/crashing a few times per day.

Recorder is running on MariaDB, with auto purge at 7 days and commit interval set to 30. I also have a few devices/entities excluded.

Before running Home Assistant OS, I was running Raspberry Pi OS with Homebridge on my micro SD card. This setup was also rock solid. I then installed Home Assistant with Docker, also running from a micro SD card. This was also very stable.
I switched to Home Assistant OS shortly after. It ran fine for a few weeks, until I found out it was rebooting by itself when I saw some HomeKit notifications about my Blink camera system status (which is updated every time I restart Home Assistant).

I’m somewhat new to Home Assistant OS. Where do I start to find what is causing this?
Where/how can I look at the log to see if there’s anything going on before the crash?

Could be power after all. Check “Pi4+SSD too much to handle for powersupply? replace with - #6 by Mariusthvdb” for my findings on the matter.

Though I must admit, I have the SSD plugged in directly now, because it feels quicker somehow, and I agree that it started only after 6.0…

Still, as I was told on Discord, it was probably just luck that the self-powered SSD worked OK for so long… we just can’t tell.

I have a canakit 3.5A power supply… so technically this one should be fine.
I’ll check your thread for sure

yeah but the PS for the PI is not the real cause here, power consumption of the attached USB devices is. The USB ports are capped at 1.2 A so if they draw more at some peak point, you’re in trouble.

However, you’re in just as much trouble if you use a powered USB hub, because many, if not most, back-power the Pi, bypassing the fuse in the Pi and, which is what I experienced, preventing the device from rebooting.

So, all in all, it’s not that obvious.
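
To make the 1.2 A cap mentioned above concrete, here is a small sketch that adds up the peak draw of everything on the USB ports and reports the headroom. The device names and milliamp figures are made-up placeholders, not measurements; real peak values would have to come from a USB power meter or the device datasheets.

```python
# Shared current limit for the Pi's USB ports, as quoted in the post above.
USB_BUDGET_MA = 1200


def usb_headroom_ma(peak_draw_ma):
    """Return the remaining budget in mA; a negative value means over the cap."""
    return USB_BUDGET_MA - sum(peak_draw_ma.values())


# Hypothetical peak figures for illustration only.
devices = {"ssd_enclosure": 900, "zigbee_stick": 100}
headroom = usb_headroom_ma(devices)
print("over budget" if headroom < 0 else f"{headroom} mA of headroom")
```

Note that an idle reading (like a figure taken while the drive is doing nothing) says little about the peak draw during heavy writes, which is the number that matters here.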

did you have any other usb connected device?
I also have a CONBEE II USB key… I can surely disconnect it temporarily.

Last time I checked my external SSD enclosure on my Mac, it was drawing around 100 mA… so that shouldn’t be an issue either.

Yours was freezing… mine is rebooting. At least with the SSD it’s quick.
The PS integration is saying all ok too on my side.
I removed my CONBEE II USB key for now… will see if it’s more stable.

still rebooting without the CONBEE II.
I’m back to a microSD card… let’s see if it’s more stable.

/edit just rebooted again using my microsd card.
I’m a little bit lost… I don’t know what to do.

I have the same problem with a Raspberry Pi 3B (w/ SD card). It was running stable, but for a few months now it has been rebooting up to a few times a day. Load goes up to 15 before a reboot.

How do I debug what is causing the high load? I already disabled many integrations and automations.

Is there a way to increase the watchdog timeout?

For me, load or cpu use only goes up when it’s booting, which is normal.
It goes down quick and stays there.

For me this happened when I had some add-ons installed and the Pi 3B became (memory) overloaded. Although it would often work, it would reboot 2, 3, 4 times a day. Since I disabled Grafana, the issue has (almost) vanished. Now I am waiting on my Odroid N2+ to fix the issue completely while still being able to use Grafana.

Guys, check your logs… see if there are any breaking changes. If there are and errors occur, your HASS will surely not perform at its best, which causes problems.

memory usage is very low … I’ve never seen it go over 900MB. same for swap… most of the time, it’s 0MB. Also for add-ons, I don’t have any memory intensive modules. google drive backup, lets encrypt, mariadb, ssh&web terminal, phpmyadmin (not running).

but I will monitor this more closely to see if something happens right before a reboot.
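
One way to monitor this is to write samples to disk as you go, so the last values before a reboot are still there afterwards. Below is a minimal sketch, assuming a Linux system where `/proc/meminfo` is readable and the script can run somewhere persistent (for example from an SSH add-on). The function names and the CSV layout are my own choices for illustration, not anything Home Assistant provides.

```python
import os
import time
from datetime import datetime


def read_mem_available_mb(meminfo_path="/proc/meminfo"):
    """Return MemAvailable in MB from a Linux meminfo file, or None if absent."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) // 1024  # the value is reported in kB
    return None


def format_sample(now, load1, mem_mb):
    """Build one CSV row: ISO timestamp, 1-minute load average, available MB."""
    return f"{now.isoformat(timespec='seconds')},{load1:.2f},{mem_mb}\n"


def append_sample(path, row):
    """Append a row and force it to disk so it survives a sudden reboot."""
    with open(path, "a") as f:
        f.write(row)
        f.flush()
        os.fsync(f.fileno())


def monitor(path, interval_s=10):
    """Sample forever; stop with Ctrl+C."""
    while True:
        load1, _, _ = os.getloadavg()
        row = format_sample(datetime.now(), load1, read_mem_available_mb())
        append_sample(path, row)
        time.sleep(interval_s)
```

Calling `monitor("/share/samples.csv")` (the path is just an example) would leave a file whose last rows show whether load or memory was climbing right before the reboot. If the last rows look normal, that points more toward power than toward software.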

the only error I have is in my core log… I see this when it boots… but no error after that.
2021-06-21 06:24:23 ERROR (MainThread) [pyhap.characteristic] SecuritySystemCurrentState: value=0 is an invalid value.
2021-06-21 06:24:23 ERROR (MainThread) [pyhap.characteristic] SecuritySystemTargetState: value=0 is an invalid value.

This is either from the Blink or the Envisalink integration. No other errors after reboot.
There’s probably some variable not set by one of these integrations.
Both integrations work fine.
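
The two log lines above follow a fixed layout (timestamp, level, thread, logger, message), so a saved copy of the log can be filtered for errors and warnings with a few lines of Python. This is only a sketch: the regular expression is derived from the lines shown in this post, and the optional fractional seconds are an assumption about other log versions.

```python
import re

# Layout taken from the log lines quoted above:
# "2021-06-21 06:24:23 ERROR (MainThread) [pyhap.characteristic] message"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)? "
    r"(?P<level>[A-Z]+) \((?P<thread>[^)]+)\) "
    r"\[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)


def find_entries(lines, levels=("ERROR", "WARNING")):
    """Return a dict per matching log line whose level is in `levels`."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            entries.append(match.groupdict())
    return entries
```

For example, `find_entries(open("home-assistant.log"))` (the file name is whatever copy of the log you saved) returns the entries in order, which makes it easy to see the last error logged before a restart.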

All logs shown happened after a reboot. how can I check logs that happened before a reboot?

To get to the previous log files you need an SSH connection on port 22222. There are many guides on how to set this up. With the reboots you are having, I would think you have some type of hardware issue. Most of the posts since November about issues with the Pi 4 have been about the system freezing after HA OS 5.4 updates. The update affects a small group of users (I am one), and the fix for us was either to stay below 5.5 or to go to Debian with a supervised HA install. My system is rock stable with either method.

seems weird that I would suddenly have a hardware issue right after upgrading to HA OS 6.0.

for now, I’m transferring my setup to a virtualbox on my Mac mini (which is running 24/24h anyways for my plex server)… and I’ll do some more testing with the raspberry pi with a fresh new install.

I agree that it is weird, but I’m not seeing others with this same issue. Try setting up SSH on port 22222 and publish the logs. You can also try the Debian install with a supervised HA. There are very good directions in the forums. If this works, it would confirm that it isn’t a hardware issue. Snapshot restores make it fairly easy to test different configurations. I was going back and forth with different setups to figure out what would be stable for me.

ok you can laugh a little (maybe)…

Before going on vacation for a week, I plugged all my network equipment, including the Raspberry Pi, into a HomeKit Meross wifi plug to be able to reset everything remotely if needed.
I also have the same model of power plug on my security camera in my living room.
The power plug in the living room started to turn off/on by itself a few times per day, pretty much at the same time I upgraded to HA OS 6.0.
I’m not sure yet if the one connected to my network equipment is doing the same… but that could explain why my Raspberry Pi is rebooting somewhat randomly.

I removed this power plug, reinstalled HA OS and restored my snapshot… let’s see if all is stable now.
I’ll report back tomorrow.

Confirmed… no reboot after removing this wifi plug.
Lesson learned.

For me, it seems to have been low memory. As soon as the free RAM goes below 100 MB, the load climbs higher and higher.

Resolution: When free RAM < 100 MB, I purge the recorder, which frees enough memory. (Although I configured recorder to only keep the data for the current day, it was hogging the RAM quickly.)
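
The post does not say how the purge is triggered. One possible way to automate it from outside Home Assistant is sketched below, under several assumptions: the script reads `MemAvailable` from `/proc/meminfo` as a stand-in for "free RAM", it calls the `recorder.purge` service through the Home Assistant REST API, and the base URL, token and `keep_days` value are placeholders you would replace with your own.

```python
import json
import urllib.request

THRESHOLD_MB = 100  # from the post: trouble starts below about 100 MB free


def mem_available_mb(meminfo_text):
    """Extract MemAvailable (reported in kB) from /proc/meminfo text, in MB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) // 1024
    raise ValueError("MemAvailable not found")


def build_purge_request(base_url, token, keep_days=1):
    """Build (but do not send) a POST to the recorder.purge service."""
    body = json.dumps({"keep_days": keep_days, "repack": True}).encode()
    return urllib.request.Request(
        f"{base_url}/api/services/recorder/purge",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",  # long-lived access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def purge_if_low(meminfo_text, base_url, token):
    """Call recorder.purge only when available memory is under the threshold."""
    if mem_available_mb(meminfo_text) >= THRESHOLD_MB:
        return False
    with urllib.request.urlopen(build_purge_request(base_url, token), timeout=30):
        pass
    return True
```

Run periodically, something like `purge_if_low(open("/proc/meminfo").read(), "http://homeassistant.local:8123", TOKEN)` would mimic the manual resolution above. The same check could also be done inside Home Assistant with an automation on a memory sensor, which avoids the external script entirely.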