Home Assistant hangs after a few hours

mrloddo · January 12, 2020, 7:30am

Hi all,
I switched my Home Assistant to an RPi 4 in December. It started to hang 2-3 weeks ago. In the homeassistant logs I can find several connection and connection-timeout exceptions.
Each time it hangs I have to restart the homeassistant container over SSH (port 22222).

I suspected the SD card at first, but the other containers (like mariadb and node-red) work fine. So I don’t know how to troubleshoot this issue.

Any hints?

Thanks
Marco

mrloddo · January 20, 2020, 7:19pm

No ideas?

Marco

VolkerKa · January 20, 2020, 7:28pm

Hello Marco. I’m not a professional and I probably can’t help you either. However, problems often arise if the Raspberry Pi is not supplied with sufficient power. Do you have a power supply with sufficient power? Could that be the cause?

mrloddo · January 21, 2020, 6:31pm

Hi,
I’ve checked the power supply and also the SD card. But I presume it’s not related to those components, because when the hang occurs I only need to restart the homeassistant Docker container. The other containers (nodered, supervisor, duckdns, samba, portainer) keep working.

Marco

mrloddo · January 29, 2020, 7:34am

I’ve tried to dig deeper into the issue:

I’ve removed all Lovelace modules from the configuration
I’ve removed almost all entries in configuration.yaml (keeping, obviously, the vital rows for http, authentication, etc.)
I’ve removed all automations and scripts

Now homeassistant hangs after 15-16 hours. The homeassistant logs still contain exceptions related to upnp, probably because the entities are still present in the database.

Is there a way to dump the thread stack traces of the python3 process that runs homeassistant?
Marco
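
On the question of dumping thread stack traces from a Python process: the standard library's faulthandler module can do this. The sketch below is a minimal illustration, not something Home Assistant is known to ship. It assumes you control how the Python process starts, which is not a given for a containerised install. The helper names install_dump_handler and dump_now are hypothetical. For a process that is already running, an external sampler such as py-spy is the more usual route.

```python
import faulthandler
import signal
import sys


def install_dump_handler(log_file=sys.stderr, signum=signal.SIGUSR1):
    """Dump the stack of every thread to log_file each time signum arrives.

    log_file must stay open (and have a real file descriptor) for as long
    as the handler is registered. SIGUSR1 is an arbitrary choice here.
    """
    faulthandler.register(signum, file=log_file, all_threads=True)
    return signum


def dump_now(log_file=sys.stderr):
    """Write the current stack of every thread to log_file immediately."""
    faulthandler.dump_traceback(file=log_file, all_threads=True)
```

With the handler installed, sending the chosen signal to the process (for example `kill -USR1 <pid>` on Linux) writes one traceback per thread, which shows where each thread is blocked at the moment of the hang.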

code-in-progress · January 29, 2020, 12:25pm

Is it Home Assistant itself that hangs (as in no automations run) or is it just the browser frontend? I’ve had this issue on and off when using Chromium-based browsers (Google Chrome, Brave, etc). On other browsers (FireFox, Opera, Edge), I don’t have any issues.

mrloddo · January 29, 2020, 7:27pm

Hi,
all of Home Assistant hangs (no light automations run while it is hung).

Marco

Cobi · April 3, 2020, 8:10pm

Did you ever find the root cause? I’m having the same issue.

xpoh · May 16, 2020, 5:47pm

Same situation here! Did you guys find a way to fix it?

kernelkraut · July 11, 2020, 3:36pm

I have exactly the same issue. I have no errors showing up in the logs that I can find. It seems to happen whenever my automations kick off. I have some automations for notifications (power outage, turn lights on, etc.), and the hangs seem to coincide with those automations firing. My front end is just blank, but I can still log in via ssh and run ha core restart. Then it’s fine again for a while. It seems to have started for me with 0.112.

Jpsy · September 5, 2020, 12:34pm

I have the same problem for some months now:
HA gets unresponsive. A restart (i.e. ha core restart in HA’s shell) solves the problem for a short time.
I am investigating and stripping down the system step by step, but so far with only limited luck.
My system is Hassio on RPi3b. My data partition is on SSD, so I don’t think the SD card holding the system is damaged and responsible.

I have switched off anything that could create heavy CPU load. MotionEye is installed but stopped. All cameras have been temporarily removed. I have analyzed the DB and excluded all entities from recorder that create unnecessary amounts of state records.

One week ago I deleted the database (~2.4 GB at that time). After that the system was fast as lightning. But some four days later it started to lag again every now and then. Now, after 7 days, it is back to real hangs (database size now ~130 MB).

I think my next step will be to replace HA’s SQLite database with MariaDB. I will run MariaDB on the same RPi using the corresponding HA add-on. I don’t think that the RPi is too weak to handle that, especially considering that I have removed all excessive system load. I also don’t want to change to a NUC or other beefier system, for reasons of low energy consumption and carbon footprint.

I will report whether Maria did change anything.

flave · September 6, 2020, 8:06am

I also seem to have that problem. At the moment I’m also running an RPi 3. And like you, @Jpsy, I have a KNX installation. So I’m very interested in the result of your test with MariaDB.

jaruba · November 6, 2020, 12:23pm

@Jpsy

Did switching to MariaDB help / fix the issue? HA also hangs for me after a few hours or 2 days max. I’m using Deconz and I’ve found many others that use this addon too with similar issues in this community and on GitHub.

In my case, I’m running HA on an RPi3b+ with an SSD. If I disable Deconz everything works fine… But I’d prefer not to disable it…

Can’t figure out the exact cause of this, logs don’t seem to hold any valuable info but I’m also suspecting the recorder…

jaruba · November 8, 2020, 12:39am

I managed to fix my issues by optimising the more intensive integrations.
I set memory_init: 256 in Unifi Controller’s config, disabled query log in AdGuard Home, and tweaked the recorder with these settings:

recorder:
  purge_keep_days: 1
  commit_interval: 10

Not only does it not hang anymore, all the automations seem a lot faster too.

Jpsy · November 8, 2020, 8:08am

NO! Sorry for not reporting back.
My issue, and probably the issue of most HA users that experience laggy performance, is/was memory (RAM). HA has evolved to a point where an average installation with some add-ons will easily consume most of the 1 GB of memory that an RPi3 provides. Even if only some 700 to 800 MB of memory are allocated, Linux will still start to get much slower. The reason is that Linux uses otherwise free memory for disk caching, so the less memory is left over, the less cache there is. And HA reads from and writes to disk a lot for its history data.

So the real solution for this problem is to upgrade to an RPi4 with 4 or 8 GB of RAM. I have done that and my system flies! You can use my extensive installation guide to get HA running on an RPi4:

jaruba · November 8, 2020, 10:05am

@Jpsy This definitely seems to have been the case for me too. But an RPi3b+ still works beautifully with the tweaks I posted above; dare I say it works better than it ever did before. I think I can safely continue with my current setup.
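
To check whether the memory explanation above applies to a given system, one option is to look at how much memory Linux reports as available and how much is being used for cache. The following is only a rough sketch: it parses the text of /proc/meminfo (a standard Linux interface), and the function names and the fields chosen for the summary are illustrative choices, not anything taken from HA.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: kibibytes}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(text):
    """Summarise total, available and page-cache memory in MiB."""
    info = parse_meminfo(text)
    total = info["MemTotal"]
    # MemAvailable is missing on very old kernels; fall back to MemFree.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    cached = info.get("Cached", 0)
    return {
        "total_mib": total // 1024,
        "available_mib": available // 1024,
        "cached_mib": cached // 1024,
        "available_pct": round(100 * available / total, 1),
    }


def read_summary(path="/proc/meminfo"):
    """Read the live values from the running system (Linux only)."""
    with open(path) as handle:
        return memory_summary(handle.read())
```

Calling read_summary() on the Pi a few times between a restart and the next hang would show whether available memory shrinks over that period.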

hdehaseleer · January 5, 2021, 5:36pm

I’m relatively new to HA.
I’m running HA on a Raspberry model 4.
At the moment, I’m only using the integrations Node-red, Denon Heos, Onvif, Zwave and KNX.

The core on my Raspberry hangs nearly every week. A ping to the Pi is still OK, but a connection to the UI times out. Until now, I’ve always done a hard reboot (power off). But this also empties the log file.

Is there a way to keep an archive of the log files? So that at least I can try to find the problem in the older log file from before the hard reboot.

Does a watchdog mechanism exist for the Core on my Pi that restarts it automatically?
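
As an illustration of the watchdog idea asked about here, a small external script could probe the UI and restart the core after several failed probes in a row. This is a sketch under assumptions, not a built-in HA mechanism: the URL, thresholds and function names are hypothetical, and it assumes the script runs somewhere the `ha core restart` command mentioned earlier in this thread is available.

```python
import http.client
import subprocess
import time
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Return True if url gives any HTTP response within timeout seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (OSError, http.client.HTTPException):
        return False  # refused, unreachable, timed out or garbled reply


def restart_core():
    """Restart the core with the CLI command quoted earlier in the thread."""
    subprocess.run(["ha", "core", "restart"], check=False)


def watch(check, restart, max_failures=3, interval=60.0, rounds=None,
          sleep=time.sleep):
    """Call restart() after max_failures consecutive failed checks.

    rounds=None loops forever; a number limits the loop (useful for tests).
    Returns how many restarts were triggered.
    """
    failures = 0
    restarts = 0
    done = 0
    while rounds is None or done < rounds:
        if check():
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0
        done += 1
        sleep(interval)
    return restarts
```

A possible invocation would be watch(lambda: probe("http://homeassistant.local:8123"), restart_core), where the address is only a placeholder. Requiring several consecutive failures avoids restarting on a single slow response.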

vinz486 · March 7, 2021, 5:13pm

Hi,
same issue here.

RPI4 8GB Ram.

Few entities, few automations, deconz and conbee setup.

Hangs every 2 weeks. Maybe some memory leak going on?

This is the type of issue that makes me think that HA is “not ready for production”.

lwolfs · March 8, 2021, 8:23am

Same problem over here:
Raspberry 4 w 4G RAM, boot from SSD.
Hang occurs daily
Minimal addons (SMB, Grafana, InfluxDB).

sparkydave · March 8, 2021, 8:31am

Have you guys checked your logs for entries along the lines of ‘malformed database’? Have you tried stopping HA, deleting the database file, and restarting HA?
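
Before deleting the database file, it may be worth checking it directly. The sketch below runs SQLite's own PRAGMA integrity_check against a database file, opened read-only. It assumes the recorder is still using the default SQLite backend and that HA has been stopped first. The function name is hypothetical, and the path to pass in depends on your install (commonly home-assistant_v2.db in the config directory, but treat that as an assumption).

```python
import sqlite3
from pathlib import Path


def check_database(db_path):
    """Return a list of problems SQLite reports; an empty list means 'ok'."""
    # Open read-only so the check itself cannot modify the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error as err:
        return [f"cannot open: {err}"]
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as err:
        # Raised for example when the file is not a valid database at all.
        return [str(err)]
    finally:
        conn.close()
    messages = [row[0] for row in rows]
    return [] if messages == ["ok"] else messages
```

An empty result means SQLite found nothing wrong with the file, which would point away from database corruption as the cause of the hangs.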