Home Assistant Add-on: Hardware watchdog service


This add-on activates /dev/watchdog, the hardware watchdog device, to restart the server when it stops responding. For details about the watchdog API see https://www.kernel.org/doc/Documentation/watchdog/watchdog-api.txt.
I tested it on my Raspberry Pi 4 - it has a Broadcom BCM2835 watchdog timer, enabled by default.
The service sends a keepalive to the watchdog timer every 5 seconds. If the system hangs or some other software problem stops the keepalives, the hardware restarts the system after 15 seconds without one.
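
For anyone curious what the keepalive loop looks like, here is a minimal Python sketch of the idea. It is not the add-on's actual code; the function name `feed_watchdog` and its parameters are made up for illustration. It relies only on what the kernel watchdog API document describes: opening the device arms the timer, any write counts as a keepalive, and writing `V` before closing ("magic close") disarms the timer on drivers that support it. The 15 second timeout is not set here, it is assumed to be whatever the driver is configured with.

```python
import time


def feed_watchdog(device="/dev/watchdog", interval=5.0, count=None):
    """Write keepalives to a watchdog device; count=None means loop forever.

    Hypothetical helper for illustration, returns the number of keepalives sent.
    """
    sent = 0
    # Opening the device arms the timer. Unbuffered mode makes every
    # write reach the driver immediately.
    with open(device, "wb", buffering=0) as wd:
        try:
            while count is None or sent < count:
                wd.write(b"\0")  # any write is treated as a keepalive
                sent += 1
                time.sleep(interval)
        finally:
            # "Magic close": asks drivers that support it to disarm the
            # timer, so a deliberate stop does not trigger a reboot.
            wd.write(b"V")
    return sent
```

If this process hangs or is killed without reaching the `finally` block, the writes stop and the hardware resets the machine once the timeout expires, which is exactly the behaviour described above.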

Repository on GitHub


Cool, I’ve been having occasional issues with my home assistant becoming completely unresponsive so I’m happy to see this exists and I’m giving it a try.

Seems like this should really just be a default part of Home Assistant OS.

One thing that would be useful is to track how frequently this is triggered.

I do have an uptime sensor configured (Uptime - Home Assistant) but that won’t let me distinguish restarts due to software updates or config changes from watchdog resets.

I also have set up a notification when HA restarts (similar to the example here Home Assistant restart notification) so at least I can note & track manually.

Personally, I added notifications for HA shutting down and starting up, and I think this is enough, because a watchdog-triggered restart is not a frequent occurrence.
I think the main problem with counting these events is that watchdog restarts are outside the scope of the software - it’s a hardware restart that happens when the software becomes unresponsive. You can’t write anything down at that moment because you (as a script, as a program) may be, and probably are, dead.
Another way is to count “incorrect” OS startups - create some flag on proper shutdown, and if there is no such flag on startup, interpret this situation as a bad/watchdog shutdown.
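
A rough Python sketch of that flag idea, in case it helps. All names here (`mark_clean_shutdown`, `check_startup`, the file names) are hypothetical, and it assumes there is some persistent directory that survives reboots. Note that it cannot tell a watchdog reset from a power loss, and the very first startup is counted as unclean because no flag exists yet.

```python
from pathlib import Path

FLAG_NAME = "clean_shutdown.flag"   # hypothetical file names
COUNTER_NAME = "unclean_boots.txt"


def mark_clean_shutdown(state_dir):
    """Call from the normal shutdown path: leave a flag behind."""
    Path(state_dir, FLAG_NAME).touch()


def check_startup(state_dir):
    """Call once on startup; returns the number of unclean startups so far."""
    flag = Path(state_dir, FLAG_NAME)
    counter = Path(state_dir, COUNTER_NAME)
    count = int(counter.read_text()) if counter.exists() else 0
    if flag.exists():
        # Previous shutdown was clean: consume the flag for the next cycle.
        flag.unlink()
    else:
        # No flag means the last run ended without a proper shutdown.
        count += 1
        counter.write_text(str(count))
    return count
```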
If anyone has an idea how to count this, please share it.
Also, if you know how to increment a counter in HA from a Supervisor container, give me an example and I’ll add this feature.
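
One possible approach, as an untested sketch: an add-on can reach the Home Assistant Core REST API through the Supervisor proxy at `http://supervisor/core/api` using the `SUPERVISOR_TOKEN` environment variable, and call the `counter.increment` service. This assumes the add-on config enables `homeassistant_api: true` and that a counter helper already exists. The entity name `counter.watchdog_resets` and both function names are made up for illustration.

```python
import json
import os
import urllib.request


def build_increment_request(entity_id, token,
                            base_url="http://supervisor/core/api"):
    """Build a POST request that calls the counter.increment service."""
    body = json.dumps({"entity_id": entity_id}).encode()
    return urllib.request.Request(
        f"{base_url}/services/counter/increment",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def increment_counter(entity_id="counter.watchdog_resets"):
    """Send the request from inside the add-on container (assumed setup)."""
    token = os.environ["SUPERVISOR_TOKEN"]  # provided by the Supervisor
    req = build_increment_request(entity_id, token)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

The call would have to be made on the next startup after an unclean shutdown is detected, since nothing can run at the moment the watchdog fires.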