Tonight, I noticed that our dashboard tablet suddenly lost the connection to the Raspberry Pi 4 running HA 2022.5.5. I checked HA, and everything seemed extremely slow. htop shows that the process:
python3 -m homeassistant --config /config
sits at 100+% CPU. I have not installed any new extensions or updated any of the existing ones.
Any tips on how to debug this? I have read about “py-spy” for profiling Python programs, but I am not sure how to install it on RPi4/Raspbian.
I have now disabled most integrations and stopped most add-ons, and the system seems to get back to normal once in a while, but it is not clear to me when that actually happens. Over the past days I have noticed that even when I somehow get the system to run normally, about once per day HA hogs the CPU again for close to an hour. This happens at random times of the day. This is the history of the CPU utilization for last night:
Nothing else was happening on the RPi4 in that time period.
Please please, can someone help me with troubleshooting this?
Is it enough to disable/stop integrations and add-ons, or should they be uninstalled entirely to be “ineffective”?
It happened again… I now see that the log file fills up with hundreds of thousands of these lines:
2022-06-01 12:23:54 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection]  Client exceeded max pending messages : 2048
Does that ring a bell with anyone?
These messages are normally caused by the mobile_app or a browser tab that’s left open. Most likely the mobile_app.
Do you have it installed on the tablet? Try deleting all the data from the app and after a restart connect it again.
I have this every other week, because my tablet is running the ha_mobile_app in the background (so the sensors for battery and so on work) and I’m using FullyKioskBrowser to show my tablet-dashboard.
@paddy0174 Thank you very much for your reply. It is highly appreciated. I shut down the app on the tablet, and it is now just showing the dashboard using the built-in Safari browser. I will keep an eye on HA for 24 hours while holding my breath.
You’re welcome, but tbh I’m not entirely sure that this is causing your high CPU load. That remains to be seen.
In the meantime you could raise the debug or log level for specific things, to see if something is going “the wrong way”.
You could just turn on the highest log level for the mobile_app (and if you suspect other things, these too) and have a look. And to rule out that the “log spam” is causing the high load, you can also disable the logging for that component.
It seems like a tedious task, but until you know where this comes from, you need to dig deeper.
What I’d do: If you can pinpoint some time frame, raise the log level for everything in that time frame and go back to “normal” afterwards. This should at least bring up something. And before I forget, you should also take a closer look at the (sys)logs of your Pi (assuming you run RaspbianOS and not HA-OS).
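One way to pinpoint such a time frame would be to count the “Client exceeded max pending messages” lines per minute. This is only a rough sketch, assuming every log line starts with a timestamp like the one quoted above. The function name count_per_minute is made up for illustration and is not part of HA.

```python
import re
from collections import Counter

# Assumption: log lines start with "YYYY-MM-DD HH:MM:SS", as in the
# line quoted earlier in this thread. Only the minute part is captured.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}")

# Text of the error we are hunting for (without the trailing number).
NEEDLE = "Client exceeded max pending messages"


def count_per_minute(lines):
    """Count matching log lines, grouped by minute (YYYY-MM-DD HH:MM)."""
    counts = Counter()
    for line in lines:
        if NEEDLE not in line:
            continue  # some other log message, ignore it
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

To use it, open the log file (probably home-assistant.log in your config folder, but check your setup), pass the file object to count_per_minute, and print the sorted items. The minutes with the highest counts should line up with the CPU spikes if the two are related.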
And just out of interest (maybe it will become useful), let us know exactly how you run HA and on what device (Pi4 ?GB RAM, Raspbian Lite?, SSD or SD-card, boot from ?). You get the idea. The more info, the better.
Good luck in the meantime!
I have spent the past days troubleshooting this, and everything points to the tablet indeed causing the issue. The problem occurs after 8-ish hours, so it takes time to debug, but it seems that it is caused by the tablet displaying a dashboard, and it doesn’t matter whether it is the standard browser (Safari) or the mobile app. The problem occurs with Safari displaying the dashboard with the mobile app completely uninstalled.
I think I have narrowed it down further to a specific dashboard, which makes sense, as I refuse to believe that there is a general error in HA with having a device display a dashboard over long periods of time. Lots of people must be doing that with no problems.
So… it must be something I do on that particular dashboard that annoys HA. I use the following standard cards:
and the following custom cards:
@paddy0174 Do any of these match the cards you use on your dashboard, the one that is also causing this kind of issue?
Not sure at all but I would look in the direction of the custom cards, not the standard ones.
@Hansen I just recently noticed a huge slowdown on my iPad.
I can’t find any obvious signs of why it is happening.
Yep, that seems to be right, something is not responding the way it should. But great job pinning it down. What I did was to filter that specific message in the logs… But in my case it is not slowing things down, so it was merely cosmetic.
The tablet that gives me headaches has one dashboard with the following cards used:
So the one custom-card that stands out seems to be
I’ll take that one out of my dashboard and see what comes up (or hopefully not).