Ping and command_line sensors break randomly

There seems to be a growing issue with command_line sensors and ping sensors breaking at random intervals after booting, and it is proving very difficult to debug at the moment.

The problem:

  • Occurs at random times after booting
  • Results in the error: Updating xxxx took longer than the scheduled update interval xx:xx:xx
  • Results in a sudden and sustained increase in memory usage, usually around 5-10% more than baseline
  • Can only be fixed by restarting Home Assistant
  • Appears to be limited to Hass.IO installations (but may be more universal)
  • Seems to affect all command_line sensors regardless of whether they communicate externally or internally to the device
  • Does not stop the Configurator add-on from firing off shell commands once it occurs, even if the broken command_line sensor contains a shell_command
  • Is hella irritating

Things that haven’t fixed it:

  • Complete fresh rebuild and re-write of Home Assistant from scratch
  • Fresh SD card
  • Switching from SD card to SSD booting on my Raspberry Pi 3B+
  • Trying to get more detailed error logging within Home Assistant - the error listed above is as detailed as I can get

Things I’m currently doing:

  • Changing all my command_line sensors that communicate externally into scripts that make sensors via REST, driven by an automation (leaving me with only one command_line sensor that monitors the Pi’s CPU temperature)
  • Gnashing my teeth and looking desperately for answers or alternatives
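
For anyone wanting to try the same REST workaround, here is a minimal sketch of an external script that pushes a value into Home Assistant instead of having a command_line sensor poll for it. It assumes the standard REST API endpoint `/api/states/<entity_id>` and a long-lived access token. The base URL, token and entity name are hypothetical placeholders, not taken from my setup.

```python
import json
import urllib.request


def build_state_request(base_url, token, entity_id, state, attributes=None):
    """Build a POST request that sets an entity's state via the REST API."""
    payload = {"state": str(state), "attributes": attributes or {}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/states/{entity_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def push_state(base_url, token, entity_id, state, attributes=None, timeout=10):
    """Send the request and return the HTTP status code.

    The timeout keeps a stuck connection from blocking the script forever.
    """
    request = build_state_request(base_url, token, entity_id, state, attributes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call would look something like `push_state("http://hassio.local:8123", "YOUR_TOKEN", "sensor.router_uptime", 42)`, run from cron or triggered by an automation. A sensor created this way only exists until the next restart unless the script pushes it again.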

This problem has also been documented here (Binary_sensor.ping stopped working after a while) and here (Updating ping binary_sensor took longer than the scheduled update interval), and a GitHub issue has been logged here (https://github.com/home-assistant/home-assistant/issues/24899). I’ve also tried to get help on the Discord channels, but with no luck.

It seems to be a pretty fundamental issue and whilst I appreciate how much work goes into Home Assistant, I’m surprised that nobody has looked into it yet. I’m not a developer but I’m keen to help debug and fix this any way I can - see the above for what I’ve figured out so far - but would really appreciate some help in tackling this, as I’m now running out of workarounds!

Many thanks all.

I am having the same problem with my command_line sensors.
I still haven’t been able to solve it.

Seeing the same behavior with the ping binary sensor. After 1-2 days it stops recognizing when my Shield TV and desktop are on or offline. I’ve been noticing this issue for a while now, and it causes power cuts to my Shield/desktop while they’re still on :flushed:.

Will start attempting to debug.
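
One thing that may help with debugging is running the same command outside Home Assistant with a hard timeout, to see whether the command itself hangs or whether only the sensor update does. This is a rough sketch, not how the ping or command_line platforms work internally. `timed_run` is a hypothetical helper name.

```python
import subprocess
import time


def timed_run(command, timeout):
    """Run a command with a hard timeout.

    Returns (exit_code, seconds_elapsed). exit_code is None when the
    command had to be killed because it exceeded the timeout.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        code = result.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        code = None
    return code, time.monotonic() - start
```

For example, `timed_run(["ping", "-c", "1", "-W", "1", "192.168.1.10"], timeout=5)` would ping once, assuming a Linux `ping` that accepts `-c` and `-W`. The IP address is a placeholder. If this keeps returning quickly while the sensor is stuck, the hang is more likely inside Home Assistant than in the command.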

Issue reported, hopefully something happens. I’ll be using the nmap integration instead for now.
Pls only report related stuff in there :wink:

I also have several PING sensors set up for network control (they have been more stable than nmap). After updating last night from 104.1 to 104.3, the problems started. About 1 hour after a reboot, all the ping sensors ‘die’ and my log contains

Updating command_line sensor took longer than the scheduled update interval 0:01:00

and

Updating verisure binary_sensor took longer than the scheduled update interval 0:00:30

Frustrating as I can’t find anything in the logs pointing me to what should be wrong…
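
Since the log gives so little else to go on, a small script can at least tally which platforms report the warning and how often. A minimal sketch, assuming the warning text always matches the two examples above. `count_slow_updates` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches e.g. "Updating command_line sensor took longer than the
# scheduled update interval 0:01:00"
SLOW_UPDATE = re.compile(
    r"Updating (?P<name>.+?) took longer than the scheduled update interval "
    r"(?P<interval>\d+:\d{2}:\d{2})"
)


def count_slow_updates(lines):
    """Count slow-update warnings per (platform name, interval) pair."""
    counts = Counter()
    for line in lines:
        match = SLOW_UPDATE.search(line)
        if match:
            counts[(match.group("name"), match.group("interval"))] += 1
    return counts
```

It can be pointed at the log with something like `count_slow_updates(open("home-assistant.log"))`, assuming the log file sits under that name in the config folder. Seeing which platform shows up first after a reboot might hint at where the trouble starts.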

I have now removed all PING from my device_tracker but do heavily use cli for local sensors. It starts failing about 30-60 minutes after a reboot. At some point a lot of other integrated platforms start to fail on updates. So this is not related to PING… Image is taken as an illustration from last night.