There seems to be a growing issue with command_line sensors and ping sensors breaking at random intervals after booting, and it is proving very difficult to debug at the moment.
The problem:
- Occurs at random times after booting
- Results in the error:
Updating xxxx took longer than the scheduled update interval xx:xx:xx
- Results in a sudden and sustained increase in memory usage, usually around 5-10% more than baseline
- Can only be fixed by restarting Home Assistant
- Appears to be limited to Hass.IO installations (but may be more universal)
- Seems to affect all command_line sensors regardless of whether they communicate externally or internally to the device
- Does not stop the Configurator add-on from firing off shell commands once it occurs, even if the broken command_line sensor contains a shell_command
- Is hella irritating
Things that haven’t fixed it:
- Complete fresh rebuild and re-write of Home Assistant from scratch
- Fresh SD card
- Switching from booting off an SD card to booting off an SSD on my Raspberry Pi 3B+
- Trying to get more detailed error logging within Home Assistant - the error listed above is as detailed as I can get
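Since the warning above is the only trace the problem leaves, one way to at least pin down when it starts and which entities are hit is to scan the log for it. The following is a minimal sketch, not a tested tool: it assumes the log lines contain the warning text exactly as quoted above, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Pattern built from the warning text quoted in this post. The prefix of each
# log line (timestamp, level, logger name) is deliberately not matched,
# because that part of the format is an assumption.
SLOW_UPDATE = re.compile(
    r"Updating (?P<name>.+?) took longer than the scheduled update interval "
    r"(?P<interval>\d+:\d{2}:\d{2})"
)


def slow_updates(lines):
    """Yield (name, interval) for every slow-update warning found in lines."""
    for line in lines:
        match = SLOW_UPDATE.search(line)
        if match:
            yield match.group("name"), match.group("interval")


def count_slow_updates(lines):
    """Count how many times each name shows up in a slow-update warning."""
    return Counter(name for name, _ in slow_updates(lines))
```

To use it, open the Home Assistant log file (its location depends on the install, so the path is left to the reader) and pass the file object to count_slow_updates. Comparing the counts across a few reboots should show whether every sensor breaks at the same moment or one goes first.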
Things I’m currently doing:
- Changing all my command_line sensors that communicate externally into scripts that make sensors via REST, driven by an automation (leaving me with only one command_line sensor that monitors the Pi’s CPU temperature)
- Gnashing my teeth and looking desperately for answers or alternatives
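For anyone wanting to try the same REST workaround, here is a rough sketch of the idea rather than my actual scripts: run the command outside Home Assistant with a hard timeout, then push the result in through the REST API's states endpoint (POST /api/states/<entity_id> with a long-lived access token). The base URL, token, entity ID and function names are placeholders I have assumed for illustration.

```python
import json
import shlex
import subprocess
import urllib.request


def run_command(command, timeout_s=10.0):
    """Run a command and return its stripped stdout, or None if it fails or hangs."""
    try:
        result = subprocess.run(
            shlex.split(command),  # no shell, so the timeout kills the process itself
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def build_state_request(base_url, token, entity_id, state, attributes=None):
    """Build the POST request that sets an entity's state over the REST API."""
    url = f"{base_url.rstrip('/')}/api/states/{entity_id}"
    payload = json.dumps({"state": state, "attributes": attributes or {}})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def push_state(base_url, token, entity_id, state, attributes=None, timeout_s=10.0):
    """Send the state to Home Assistant and return the HTTP status code."""
    request = build_state_request(base_url, token, entity_id, state, attributes)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status
```

The point of the timeout is that a hung command returns None instead of blocking forever, so the caller can skip the push or send a state such as "unavailable" on that run. Whether a hung command is really what triggers the warning is only a guess on my part.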
This problem has also been documented here (Binary_sensor.ping stopped working after a while) and here (Updating ping binary_sensor took longer than the scheduled update interval), and a GitHub issue has been logged here (https://github.com/home-assistant/home-assistant/issues/24899). I’ve also tried to get help on the Discord channels, but with no luck.
It seems to be a pretty fundamental issue, and whilst I appreciate how much work goes into Home Assistant, I’m surprised that nobody has looked into it yet. I’m not a developer, but I’m keen to help debug and fix this any way I can (see the above for what I’ve figured out so far). I would really appreciate some help in tackling this, as I’m now running out of workarounds!
Many thanks all.