After 6 months of tinkering on and off, I finally have HA in a pretty stable state, relying on very few cloud platforms.
Now, however, it seems that once HA has been up for a few weeks without issue, it slows down unexpectedly to the point that nothing works. I can’t connect to the UI, but the service still shows as running, and the logs just show that everything is taking forever to respond. Google Assistant won’t control anything either.
Right now I have alerts to tell me when HA is running or not, but those aren’t helpful if the service is technically running… just super slow so that it’s basically not doing anything…
First question, what can I even check for when this happens? Are there common causes? I already have my logs set to flush daily so that shouldn’t be an issue.
Second question, is there any way to monitor this type of thing so I can be alerted before I find out from my wife that something isn’t working?
To help track down the issue you can add some system monitors:
- platform: systemmonitor
  resources:
    - type: processor_use
    - type: memory_use_percent  # <- keep an eye on this, make a notification alert
    - type: swap_use_percent
    - type: disk_use_percent
      arg: /
Also, adding this custom component will tell you if your Pi is being throttled due to temperature or voltage. This is unlikely to be your problem, but it is good to eliminate.
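For the notification alert mentioned in the config comment above, something like the following automation might work. This is only a rough sketch: the entity ID `sensor.memory_use_percent`, the 80% threshold, and the 10-minute window are all assumptions, so check the real entity name under Developer Tools > States and pick limits that suit your machine.

```yaml
# Sketch only: entity ID, threshold, and duration are assumed values.
automation:
  - alias: "Warn on high memory use"
    trigger:
      platform: numeric_state
      entity_id: sensor.memory_use_percent  # assumed entity ID, verify yours
      above: 80                             # assumed threshold, in percent
      for:
        minutes: 10                         # ignore short spikes
    action:
      service: persistent_notification.create
      data:
        title: "HA host memory high"
        message: "Memory use has been above 80% for 10 minutes."
```

The same pattern should apply to the swap and disk sensors. Bear in mind that an alert sent from inside HA may never arrive if HA itself has already ground to a halt.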
I’m running on an Ubuntu server with a 500GB SSD. I won’t have any of the hardware issues a Pi might have, as this is a Dell PowerEdge server. My memory usage is always quite low: I’ve allocated 4GB of RAM to this server and I’ve never seen it go above 1GB.
Lol, makes me think of the silly phrase “did you just assume my hardware?!”
Anyway, no problem. I actually adjusted the logger to only log errors and above… Didn’t realize there was a setting for that… So it’s probably been slowing down due to all the debug and info items written to the logs… I assumed the logs purged with history but now after reading more I don’t think that is the case… We’ll see if that fixes it.
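For anyone else wanting to do the same, limiting the logger to errors and above probably looks something like this in configuration.yaml. The per-component override is just an illustration, not something from this thread.

```yaml
logger:
  default: error   # only log errors and above
  logs:
    # optional and illustrative: keep one component slightly more verbose
    homeassistant.components.recorder: warning
```

Note that this only controls what is written to the log file. History is stored separately by the recorder, so the two are not purged together.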
Would be great if you could keep us posted. Just like @tom_l, I was puzzled to see that you had performance issues with that setup.
Seemed very similar to what I saw with my RPi3 before I managed to get the recorder settings under control.
What kind of DB are you using? I was using the default SQLite and saw some speed improvement after switching to MySQL. Then I had some issues with InnoDB and switched to PostgreSQL. The system has been stable and responsive so far.
Thanks for all the discussion guys. As I mentioned above I cut down the logger to only log errors and I just pared back the recorder component significantly to stop tracking history on things I don’t care about.
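Paring back the recorder could look roughly like the sketch below. The retention period and the excluded domain and entity are placeholders (the entity name is hypothetical), so substitute whatever you don’t care about tracking.

```yaml
recorder:
  purge_keep_days: 7        # assumed retention, adjust to taste
  exclude:
    domains:
      - sun                 # example of a domain with little history value
    entities:
      - sensor.example_noisy_sensor   # hypothetical entity ID
```

Excluding chatty entities tends to matter more than the retention period, since they account for most of the writes to the database.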
For now it seems to be stable, and my DB is smaller. I guess SQLite isn’t the quickest DB around? I’ve only ever used larger DB systems.
I have been intending to switch over to Postgres for some time, so I will be doing that in the near future as well.
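For reference, pointing the recorder at Postgres should come down to a `db_url` entry along these lines. The user, password, host, and database name are all placeholders, and this assumes the database already exists and the PostgreSQL driver (psycopg2) is available in HA’s Python environment.

```yaml
recorder:
  # placeholders throughout: replace user, password, host, and database name
  db_url: postgresql://ha_user:CHANGE_ME@localhost/homeassistant
```

If you already have a `recorder:` section, the key goes inside it rather than in a second one. As far as I know, existing history in the SQLite file is not migrated automatically, so expect to start with an empty database.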