I have exactly the same issue. I can't find any errors in the logs. I have some automations for notifications (power outage, turn lights on, etc.), and the hangs seem to coincide with those automations kicking off. My front end is just blank, but I can still log in via ssh and run ha core restart. Then it’s fine again for a while. It seems to have started for me with 0.112.
I have the same problem for some months now:
HA gets unresponsive. A system restart (i.e. ha core restart in HA’s shell) solves the problem for a short time.
I am investigating and stripping down the system step by step, but so far with only limited luck.
My system is Hassio on RPi3b. My data partition is on SSD, so I don’t think the SD card holding the system is damaged and responsible.
I have switched off anything that could create heavy CPU load. MotionEye is installed but stopped. All cameras have been temporarily removed. I have analyzed the DB and excluded all entities from recorder that create unnecessary amounts of state records.
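Excluding noisy entities from the recorder usually looks something like the sketch below. The domain, glob, and entity names are hypothetical placeholders, not the poster's actual configuration.

```yaml
# configuration.yaml (sketch; all names below are illustrative assumptions)
recorder:
  exclude:
    domains:
      - camera              # skip a whole domain
    entity_globs:
      - sensor.*_uptime     # skip entities matching a pattern
    entities:
      - sensor.date_time    # skip one specific entity
```

Excluded entities still work in automations and the UI; they just stop producing history rows in the database.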
One week ago I deleted the database (~2.4 GB at that time). After that the system was fast as lightning. But some four days later it started to lag again every now and then. Now, after 7 days, it is back to real hangs (database size now ~130 MB).
I think my next step will be to replace HA’s SQLite database with MariaDB. I will run MariaDB on the same RPi using the corresponding HA add-on. I don’t think that the RPi is too weak to handle that, especially considering that I have removed all excessive system load. I also don’t want to change to a NUC or other beefier system, for reasons of low energy consumption and carbon footprint.
I will report whether Maria did change anything.
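For anyone trying the same switch, pointing the recorder at the MariaDB add-on typically comes down to a single db_url entry. This is a minimal sketch, assuming the add-on's usual hostname core-mariadb, a database and user both named homeassistant, and a placeholder password that has to match the add-on's own configuration.

```yaml
# configuration.yaml (sketch; user, password and database name are assumptions)
recorder:
  db_url: mysql://homeassistant:YOUR_PASSWORD@core-mariadb/homeassistant?charset=utf8mb4
```

HA starts with an empty history after the change, since the existing SQLite data is not migrated automatically.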
I also seem to have that problem. At the moment I’m also running an RPi 3. And like you, @Jpsy, I have a KNX installation. So I’m very interested in the result of your test with MariaDB.
Did switching to MariaDB help / fix the issue? HA also hangs for me after a few hours or 2 days max. I’m using Deconz and I’ve found many others that use this addon too with similar issues in this community and on GitHub.
For my case, I’m running HA on a RPI3b+ with a SSD, if I disable Deconz everything works fine… But I’d prefer not disabling it…
Can’t figure out the exact cause of this, logs don’t seem to hold any valuable info but I’m also suspecting the recorder…
I managed to fix my issues by optimising the more intensive integrations.
I set memory_init: 256 in Unifi Controller’s config, disabled query log in AdGuard Home, and tweaked the recorder with these settings:
recorder:
  purge_keep_days: 1    # keep only one day of history
  commit_interval: 10   # write to the database every 10 seconds
Not only does it not hang anymore, all the automations seem a lot faster too.
NO! Sorry for not reporting back.
My issue (and probably the issue of most HA users that experience laggy performance) is/was memory (RAM). HA has evolved to a point where an average installation with some add-ons will easily consume most of the 1 GB of memory that an RPi3 provides. Even if only some 700 to 800 MB of memory are allocated, Linux will still start to get much slower. The reason is that free memory is used for disk caching, and HA reads from and writes to disk a lot for its history data.
So the real solution for this problem is to upgrade to an RPi4 with 4 or 8 GB of RAM. I have done that and my system flies! You can use my extensive installation guide to get HA running on an RPi4:
@Jpsy This definitely seems to have been the case for me too. But an RPi3b+ still works beautifully with the tweaks I posted above; dare I say it works better than it ever did before. I think I can safely continue with my current setup.
I’m relatively new to HA.
I’m running HA on a Raspberry Pi 4.
At the moment, I’m only using the integrations Node-RED, Denon HEOS, ONVIF, Z-Wave and KNX.
The core on my Raspberry Pi hangs nearly every week. A ping to the Pi is still OK, but a connection to the UI times out. So far I have always done a hard reboot (power off), but this also empties the log file.
Is there a way to keep an archive of the log files? So that at least I can try to find the problem in the older log file from before the hard reboot.
Does a watchdog mechanism exist for the Core on my Pi that restarts it automatically?
Hi,
same issue here.
RPI4 8GB Ram.
Few entities, few automations, deCONZ and ConBee setup.
Hangs every 2 weeks. Maybe some memory leak going on?
This is the type of issue that makes me think that HA is “not ready for production”.
Same problem over here:
Raspberry Pi 4 with 4 GB RAM, booting from SSD.
Hang occurs daily
Minimal addons (SMB, Grafana, InfluxDB).
Have you guys checked your logs for entries along the lines of ‘malformed database’? Have you tried stopping HA, deleting the database file, restarting HA?
I checked the logs ( ha core logs ) and grepped for malformed.
Nothing there …
(I tried to copy/paste the output, but the terminal add-on doesn’t facilitate copy/paste.)
I added the logger config. Think there’s more info over there.
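A logger block for chasing this kind of hang might look like the sketch below. Which components to raise to debug is an assumption here; the recorder is picked only because several posts in this thread suspect it.

```yaml
# configuration.yaml (sketch; the choice of components is an assumption)
logger:
  default: warning                              # keep everything else quiet
  logs:
    homeassistant.components.recorder: debug    # verbose recorder output
    homeassistant.core: info
```

Debug logging adds to disk writes, so it is probably best switched back off once the cause is found.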
Check out https://github.com/home-assistant/operating-system/issues/1119
Lots of people have similar issues there. When it freezes, most of the time the logs don’t seem to give a clue. Most people in issue 1119 can run fine on OS versions below 5.5. This has been a problem for a lot of people since the kernel was changed for 5.5.
Yesterday I changed the recorder backend to MariaDB. Since then no hang
A good video on how to achieve this: https://youtu.be/0Nf70avId0w
And one day later, HA is dead again
This is not good. Stability is crucial.
Have you tried downgrading the OS to 5.4? I have yet to have a crash on it. Just this morning I tried updating the firmware on my StarTech controller. I rebooted, went to 5.12, and 4 hours later it crashed. 5.4 is very stable for me.
RPi 4, clean install, no temperature or memory problems. Basically a decent installation, but I have still been experiencing hangs once a day for approximately the last 2 months.
Any hints or news on whether someone is looking into this rather common issue?
I have had the same issue for several months now. The system freezes after 1 day. It is a simple installation: new Raspberry Pi with SD card, original power supply, 2% CPU load, 58 degrees Celsius, disk 8%, memory use 12%.
recorder:
  purge_keep_days: 1
  commit_interval: 10
I’m still trying to resolve this issue: HA becomes unresponsive once every day. I have set up an automation to reboot HA every 6 hours, hoping this will solve the issue. I will update this ticket once it eliminates the issue.
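A scheduled restart like the one described could be sketched as follows. The alias is a hypothetical name, and this version restarts the HA core rather than rebooting the whole host, which may differ from what the poster actually set up.

```yaml
# configuration.yaml (sketch; alias is illustrative, restarts core only)
automation:
  - alias: "Restart Home Assistant every 6 hours"
    trigger:
      - platform: time_pattern
        hours: "/6"      # fires when the hour is divisible by 6
        minutes: 0
    action:
      - service: homeassistant.restart
```

If the core is already hung the trigger will not fire, so this only works as a preventive restart, not as a recovery mechanism.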
Same here. I tried almost everything: SD to SSD, cooling, disabling integrations, etc.
I’m wondering if the devs are aware of this thread. I would post logs if I knew what they needed to diagnose the issue.