Has anyone set up a fail-safe for HA?

The other day I came home to a 29°C house. Like many, I have a myriad of devices, entities and automations. But when HA core goes down, or worse, the box HA is running on, the automated devices simply hold their current states until HA commands otherwise, which is what happened here. With my server box down, all my physical devices stayed “stuck” in their last state until I got home and figured out that the server needed rebooting. All was good then.

Clearly this is a problem.

I was wondering how some kind of fail-safe ‘watchdog’ might be implemented to monitor the server box and ensure HA is running. Has anyone actually achieved this successfully? The focus, I think, needs to be on the words ‘actually achieved’: there are lots of ideas and ways this could be done, but I am looking for verified solutions.

Thanks
So I wish to turn an old Raspberry Pi 2, sitting in a drawer doing nothing, into a simple server watchdog. The idea being

For me, I have automations that trigger on Home Assistant shutdown and startup to make sure anything mission-critical is touched. For instance, whenever HA starts, I have it re-run my climate control automation for the thermostat, because I’ve had this happen before as well.

I’m sure there are a myriad of ways to accomplish this, but this was an out-of-the-box solution I deployed.

The answer to that is redundant hardware, and a cluster type setup.

For climate control I automate on/off and set a target temperature in the device, then let the device use its own thermostat.

thanks guys

I’m leaning toward the startup trigger. That should be better than what I have now.

Some Linux builds have a software watchdog system, but the much more robust implementation is to have an independent hardware watchdog attached to your server. There are a couple of these for the Raspberry Pi; I have not tried a hardware solution for Pis yet. I have used the software watchdog that Raspberry Pi OS includes on a couple of Raspberry Pis with pretty good success. See the links below, and note that the name of the watchdog subsystem has changed between versions of the Pi OS…

https://linux.die.net/man/5/watchdog.conf
RUNNING FOREVER WITH THE RASPBERRY PI HARDWARE WATCHDOG
https://diode.io/raspberry%20pi/running-forever-with-the-raspberry-pi-hardware-watchdog-20202/

It is not clear which version of the Pi and OS is needed for the setup, but I did the watchdog install described above.
https://raspberrypi.stackexchange.com/questions/108080/watchdog-on-the-rpi4

WatchDog for Raspberry Pi
24 September 2016, by Hödlmoser
https://blog.kmp.or.at/watchdog-for-raspberry-pi/

Enabling Watchdog on Raspberry Pi
Arslan Zahid, Dec 30, 2019
https://medium.com/@arslion/enabling-watchdog-on-raspberry-pi-b7e574dcba6b

I do not understand your code snippet, sorry. What message are you trying to get across? P.S. I’m familiar with the watchdog service, but it requires software (such as HA) to be “compliant”: the kernel driver “expects” a write at some defined frequency. I don’t think HA “knows” anything about this.
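To make the “write at some defined frequency” point concrete, here is a rough Python sketch of a small helper that feeds the kernel watchdog on HA’s behalf, but only while HA still answers HTTP. This is an illustration, not something from this thread: the function names `ha_responds` and `feed_while_healthy` are made up, and the URL assumes HA is on the same box on its default port 8123. The `/dev/watchdog` node and the “V” magic-close character are the standard Linux watchdog interface.

```python
import time
import urllib.error
import urllib.request

HA_URL = "http://localhost:8123/"  # assumption: HA on this box, default port
WATCHDOG_DEV = "/dev/watchdog"     # standard Linux watchdog device node


def ha_responds(url=HA_URL, timeout=5):
    """Return True if something answers HTTP at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means the web server is alive.
        return True
    except (urllib.error.URLError, OSError):
        return False


def feed_while_healthy(check, device=WATCHDOG_DEV, interval=5,
                       max_loops=None, sleep=time.sleep):
    """Write to the watchdog device only while check() returns True.

    Returns the number of writes made. If check() fails, the function
    returns without writing "V", so the timer keeps running and the
    kernel resets the machine once the timeout expires.
    """
    feeds = 0
    with open(device, "wb", buffering=0) as wd:
        while max_loops is None or feeds < max_loops:
            if not check():
                return feeds  # stop feeding; let the watchdog fire
            wd.write(b"\0")   # any write restarts the countdown
            feeds += 1
            sleep(interval)
        wd.write(b"V")        # magic close: ask the driver to disarm
    return feeds
```

Something like `feed_while_healthy(ha_responds)` would need to run as root, with `interval` shorter than the driver’s timeout. Only one process can hold `/dev/watchdog` at a time, so this would conflict with the `watchdog` daemon or systemd if either already has the device open.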

??

Can you re-express your reply please.

I’ve been doing extensive research into this option myself and plan to build it out. Not because I need this much redundancy in my home automation, but because it’s cool and relatively cheap with Pis.

Apologies for my messy share; these were just good guides I found on the watchdog for the RPi. Glad you are ahead of me on watchdogs in Linux. I have not used the Home Assistant OS bare-metal install on a Raspberry Pi. It is a bummer to hear that Home Assistant OS does not have a watchdog solution built in, since it runs on two platforms, the Pi and the Odroid, that have good solutions for both a software and a hardware watchdog.

Seems like more reason to run HA in a Docker container or VM under a solid Linux distro. My experience is that if you are not seeing years in your Linux ‘uptime’, something is very wrong. And if HA were to lock up in these setups, it seems practical to detect this from outside and restart the VM or container.
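As a rough illustration of detecting a lock-up from outside, here is a minimal Python sketch that polls HA over HTTP and restarts the container after several consecutive failures. Nothing here comes from a tested setup: the container name `homeassistant`, the URL, the threshold of 3 and the 60 second interval are all assumptions, and `FailureTracker`, `restart_container` and `monitor` are hypothetical names. It only relies on the standard `docker restart` command.

```python
import subprocess
import time
import urllib.error
import urllib.request

HA_URL = "http://localhost:8123/"  # assumption: default HA port on this host
CONTAINER = "homeassistant"        # hypothetical container name


def ha_responds(url=HA_URL, timeout=5):
    """Return True if something answers HTTP at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # an error status still proves the server is up
    except (urllib.error.URLError, OSError):
        return False


class FailureTracker:
    """Counts consecutive failed checks and says when a restart is due."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def record(self, ok):
        """Record one check result; return True when a restart is due."""
        if ok:
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.failures = 0  # start counting again after the restart
            return True
        return False


def restart_container(name=CONTAINER):
    """Ask Docker to restart the named container."""
    subprocess.run(["docker", "restart", name], check=False)


def monitor(check=ha_responds, restart=restart_container,
            interval=60, threshold=3):
    """Poll forever, restarting after `threshold` failures in a row."""
    tracker = FailureTracker(threshold)
    while True:
        if tracker.record(check()):
            restart()
        time.sleep(interval)
```

Calling `monitor()` from a systemd service or a cron-started script on the host would be one way to run it. Requiring several failures in a row avoids restarting HA over a single slow response, for example while it is still starting up.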

Good hunting!