HA regularly becoming unresponsive

I’m running HA on a Raspberry Pi 4 and have had no real issues for a long time. We’ve come to rely on HA for the alarm, lights, presence and so on, so it’s frustrating when it all stops working. Over the last week, however, I’ve had to restart HA (literally pulling the power) because it becomes inaccessible via the web front end (both local IP and Nabu Casa).

I’ve got the latest versions of everything installed. Could anyone suggest any troubleshooting tips?

Check your log files and keep an eye on system resource monitor sensors.

Most likely just an SD card on the way out though.

I monitored the logs before HA stopped again. Nothing out of the ordinary appeared; the front end just suddenly showed a “Connection Lost” message. It was only up for 3 hours this time before stopping.

I have an automation linked to a hardwired button on the Pi (via StatusPI) and that does not respond, so it’s pretty much stopped working.

Going to clone the SD card and try again.

You will need a new SD card, not just cloning the old one, if this is the issue.

The other thing to try is to see if you can connect via SSH and view the logs after the web UI stops responding. The SSH add-on runs in its own container, so it should continue to function if this is a Home Assistant issue.

So I can’t clone the SD card onto a new one, I have to start from scratch?

No that will not be necessary.

Install a clean copy of Home Assistant on the new SD card, install the SAMBA or SSH add-on, copy across one of your backup snapshots, hit refresh on the snapshot page, then restore the snapshot.

Before going to all the trouble of doing this though I’d try looking at the logs via SSH when the web interface fails.
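If it helps to tell "web UI dead but the box is alive" apart from "whole machine hung", a small probe run from another computer on the network can record when the front end stops answering. This is only a rough sketch: the function name web_ui_responds is made up for illustration, and the URL in the comment assumes the default port 8123, so substitute your own address.

```python
import urllib.error
import urllib.request


def web_ui_responds(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the server at `url` sends back any HTTP response.

    `opener` is injectable so the function can be exercised without a
    real network; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is still alive.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or host unreachable.
        return False


# Example call (assumed address, replace with your own):
#   web_ui_responds("http://192.168.1.10:8123/")
```

If this returns False while SSH still works, the problem is more likely in Home Assistant itself than in the Pi or the network.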

Had several freezes today. SSH’ing in to look at the logs showed that it was pretty much ticking along, as the aarlo extension was happily logging messages. I ran “ha core restart”, which restarted it, but it immediately froze again; only pulling the power gets it started again.

Will have a look at rebuilding the SD card next.

Reinstalled HA on a brand new SD card at the weekend and restored from a backup. Everything worked well until today when I’ve had two crashes.

I can SSH into it and there’s nothing in the logs; it’s just the interface that is inaccessible from local IP and Nabu Casa.

Anything else I can look at?

I’m having similar issues with hangs on Hassio 0.113.0, and all my logs show no errors or anything out of the ordinary.

My Swap Use is hitting 100% and the CPU use climbs to between 60% and 80%. My Rpi stops working a few times a day and I need to pull the plug on the whole thing and reboot. I’m using an SSD via USB, so it’s not a failing SD card.

All my Teckin wifi plugs connect to the Rpi, and if it hangs and stops responding then all the devices reset. The relays click and some devices turn on (!), such as a back-up light that kicks in if the UPS triggers.

I’ve tried disabling add-ons one at a time today, and the culprit looks like it might be InfluxDB ver 3.7.2. This updated today to 3.7.3 and didn’t restart, and during the period it wasn’t running the CPU stayed down at 8% and the Swap Use was down to 40%. A few minutes after it started back up, it went straight to 100% Swap Use and 80% CPU.

What add-ons do you use? Have you tried monitoring your CPU use and other Rpi parameters? Have you tried disabling add-ons?
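For anyone who wants to watch swap and load from an SSH session rather than from sensors in the UI (which are no use once the UI hangs), here is a minimal sketch that reads the standard Linux files /proc/meminfo and /proc/loadavg. The function names and the output file path are my own choices for illustration, not anything Home Assistant provides.

```python
from datetime import datetime


def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_used_percent(meminfo):
    """Percentage of swap in use; 0.0 when no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - meminfo.get("SwapFree", 0)) / total


def sample_line(meminfo_text, loadavg_text, now):
    """Build one log line from raw /proc contents and a timestamp string."""
    info = parse_meminfo(meminfo_text)
    load1 = loadavg_text.split()[0]  # 1-minute load average
    return f"{now} load1={load1} swap_used={swap_used_percent(info):.1f}%"


def log_sample(out_path="resource_samples.log"):
    """Append one sample to out_path (hypothetical file name)."""
    with open("/proc/meminfo") as f:
        meminfo_text = f.read()
    with open("/proc/loadavg") as f:
        loadavg_text = f.read()
    now = datetime.now().isoformat(timespec="seconds")
    with open(out_path, "a") as out:
        out.write(sample_line(meminfo_text, loadavg_text, now) + "\n")
```

Calling log_sample() every minute or so while disabling add-ons one at a time gives a record on disk of which change made swap and load drop, even if the machine later hangs.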

Hi, I’m new here. I installed HA on a new Rpi4 this weekend. I only have 2 KNX switches working to start with, but after 2 to 5 hours I can’t connect to the web UI and I can’t ping the Rpi.

I have a WLAN connection through an Aruba AP 303 with Instant Aruba. When I looked through Aruba, no signal was reported from the Rpi. I then deleted it from the Aruba access point and after 30 seconds it reconnected. So now I will connect it via LAN overnight to see what happens.
So when your system becomes unresponsive, please try disconnecting it from the WiFi router and see if it reconnects.

I’ve also been suffering from hass hangs since yesterday. Some sensors are surely updating, as I can see activity on the network, but I can’t get into Lovelace or connect via the app. Automations don’t seem to work either. Which logs can I check for the reason? I haven’t been playing around with it for a couple of weeks, so it can’t be something I broke. I also have ZoneMinder on that system and it works fine. And it’s not an Rpi but a NUC with an SSD.

I am experiencing the same thing with HA Blue. HA becomes unresponsive every few days and I need to cycle the power to bring it back online. Any idea what is going on?

Possibly same issue as this one: Better logging - #21 by erik3

And since Home Assistant deletes all logs when it reboots, one can never find out what is wrong.
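One possible workaround for losing the log on reboot is to copy it somewhere else at regular intervals, so the last copy taken before a hang is still there afterwards. The sketch below is just that, a sketch: archive_log is a hypothetical helper, and the paths in the comment assume the log lives at /config/home-assistant.log, which may not match every install.

```python
import shutil
import time
from pathlib import Path


def archive_log(log_path, archive_dir, keep=10):
    """Copy log_path into archive_dir with a timestamp suffix.

    Keeps only the newest `keep` copies (keep must be at least 1).
    Returns the path of the new copy, or None if the log does not exist.
    """
    log_path = Path(log_path)
    archive_dir = Path(archive_dir)
    if not log_path.is_file():
        return None
    archive_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = archive_dir / f"{log_path.name}.{stamp}"
    shutil.copy2(log_path, target)
    # Timestamped names sort chronologically, so the oldest come first.
    copies = sorted(archive_dir.glob(f"{log_path.name}.*"))
    for old in copies[:-keep]:
        old.unlink()
    return target


# Example call (assumed paths, adjust to your setup):
#   archive_log("/config/home-assistant.log", "/config/log_archive")
```

Run on a schedule (every few minutes, say), this leaves a handful of recent copies to read after the next forced power cycle.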

OK, I have performed some tests and came to the conclusion that HA Blue becomes unresponsive due to something NOT related to heat. I attached a cooling fan to my Odroid and left it running as normal. HA still became unresponsive regardless of the cooling fan. I hope something can be done about it, because I did not pay 200 bucks for an HA Blue that is not reliable!