Here I see that at some point HA stops registering the sensor and then, after about 15 minutes, starts again, usually coming down from high CPU usage. This makes me think that some process (maybe an add-on) spirals out of control and then recovers. While HA is unresponsive, ssh also becomes unresponsive, so I cannot look at top, for example.
I have no (simple) idea of how to debug this because the log files (both supervisor and core) show nothing of relevance. Does anyone have any suggestions?
I have been having the same problem as well. Granted, I am running off a RPi 3B+, but it didn’t happen until I updated to 0.110. I hope this gets fixed soon.
I am not entirely sure whether this happened for me since 0.110 though. I have been having issues for a few weeks and recently moved from Ubuntu to a Proxmox setup because of the (reverted!) deprecation announcement. So I am not sure whether it’s because of that or some HA bug.
Home Assistant is built around an event loop. This means that there is always only a single task running at the core of the system. When a task needs to do I/O, it schedules an I/O task and suspends itself until the I/O is done.
I don’t know if it is the cause here, but one reason for Home Assistant not acting for several seconds is if an incorrectly coded integration is doing I/O inside the task, instead of scheduling an I/O task. Now the whole event loop blocks and no other tasks are processed until the I/O is done.
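To illustrate the difference, here is a minimal sketch in plain asyncio. It is not Home Assistant's actual code: the names `slow_io`, `bad_integration`, `good_integration` and `worst_gap` are made up for this example, and `time.sleep` stands in for real blocking I/O. A heartbeat task measures how long the loop goes without servicing it.

```python
import asyncio
import time


def slow_io():
    """Hypothetical stand-in for blocking I/O (slow disk read, HTTP call, ...)."""
    time.sleep(0.5)


async def heartbeat(gaps, interval=0.05):
    """Record how long each short sleep really took; a long gap means the loop was blocked."""
    while True:
        start = time.monotonic()
        await asyncio.sleep(interval)
        gaps.append(time.monotonic() - start)


async def bad_integration():
    # Blocking call made directly inside a task: nothing else can run meanwhile.
    slow_io()


async def good_integration():
    # Hand the blocking call to a worker thread and suspend until it finishes.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, slow_io)


async def worst_gap(integration):
    """Run one integration next to the heartbeat and return the longest observed gap."""
    gaps = []
    ticker = asyncio.create_task(heartbeat(gaps))
    await asyncio.sleep(0.1)  # let the heartbeat tick a couple of times first
    await integration()
    await asyncio.sleep(0.1)  # give the heartbeat a chance to record the stall
    ticker.cancel()
    try:
        await ticker
    except asyncio.CancelledError:
        pass
    return max(gaps)


if __name__ == "__main__":
    print("blocking call in task:", round(asyncio.run(worst_gap(bad_integration)), 2), "s")
    print("scheduled in executor:", round(asyncio.run(worst_gap(good_integration)), 2), "s")
```

With the blocking version, the heartbeat should stall for roughly as long as the I/O takes. With the executor version, it should keep ticking at about its normal interval.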
A good first step is to make sure you don’t have any warnings about I/O in the event loop in your logs. If you do, getting those fixed should be step 1.
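If the log is large, a small script can do the searching. This is only a rough sketch under assumptions: it treats any line that contains `WARNING` and the phrase "event loop" as a hit, which may not match the exact wording your version logs, and the function name `find_event_loop_warnings` is made up for this example. Pass it the path to your `home-assistant.log`.

```python
import sys


def find_event_loop_warnings(log_path, needle="event loop"):
    """Return (line_number, line) pairs for WARNING lines that mention `needle`.

    Assumption: warnings carry the word WARNING and the phrase "event loop";
    adjust `needle` if your log words it differently.
    """
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if "WARNING" in line and needle in line.lower():
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_warnings.py /path/to/home-assistant.log
    for number, line in find_event_loop_warnings(sys.argv[1]):
        print(f"{number}: {line}")
```

An empty result does not prove the loop is never blocked; it only means no line matched the assumed wording.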
Thanks @balloob! However, I do not think it is merely the event loop getting blocked, because not only does Home Assistant become unresponsive, so do all add-ons.
So ssh doesn’t work, glances stops reporting (preventing me from finding the culprit), and I am even unable to get into the Proxmox image.
When everything unblocks I see that a Python process crashed:
Might not be related at all, but I had similar issues. For me, two things helped.
1. I disabled zeroconf. That already helped a lot in bringing the Python process down a notch in CPU usage.
2. I disconnected my VLANs from the host. As soon as my host is connected to more than one subnet, Home Assistant uses a lot more CPU.
After doing step 2, I re-enabled zeroconf and had no more issues.
I noticed something similar this last week. Maybe related? I’m running hass supervised on a NUC. I noticed my NUC’s fan was at full speed following a restart and would not come down, and the temperature was spiking. Upon investigation, the python3 process for the supervisor was using 100% CPU and not going down. Since I’m running in Docker, I restarted the supervisor and everything went back to normal. No idea what caused the problem, as I didn’t see anything in the logs, but restarting seemed to be a pretty good band-aid.
Just leaving a link to my original post with what I believe to be the same error: Python3 high CPU Usage
What you could do is run py-spy, a sampling profiler that can attach to a running Python process, to see what that process is spending its time on.