Hass.io needs a watchdog

I noticed that Home Assistant sometimes dies without warning, even though SSH keeps running. For a home automation system, that must not happen. I therefore suggest a watchdog that checks whether Home Assistant is accessible and restarts it, or even reboots the machine, when it no longer is.

Definitely. This is the issue that I have right this minute. My HA server has been unreachable all day for no apparent reason. Since I don’t get home until tomorrow I have no idea if my alarm system is active or my gardens are getting water…

Also, you should probably vote for your own feature request :wink:

+1

For me, it is not only home assistant but additional things like Pilight, Grafana, Deconz, etc. that are on the same system. But for the hassio components alone it would already be useful. I have HA stall the system (on a pi2) on a regular basis because of memory leaks somewhere in the code.

My “watchdog” is currently another HA instance running on a different raspberry pi. It monitors the availability of the different APIs on the hassio system and also its RAM usage. The main system can be rebooted remotely by cutting the power because it is plugged into a DECT switch.
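
For anyone who wants to try something similar without a second HA instance, here is a rough Python sketch of the idea: probe an HTTP endpoint and trigger a recovery action only after several failures in a row. This is not my actual setup. The names `http_probe`, `Watchdog` and `run_forever` are made up for illustration, and the URL and the recovery action (for example switching a smart plug off and on) are assumptions you would have to fill in yourself.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the server at `url` answers with a status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server answered (e.g. 401 without credentials), so it is alive.
        return err.code < 500
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, broken response, ...
        return False


class Watchdog:
    """Calls `recover` after `max_failures` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool],
                 recover: Callable[[], None], max_failures: int = 3):
        self.probe = probe
        self.recover = recover
        self.max_failures = max_failures
        self.failures = 0

    def tick(self) -> bool:
        """Run one check; return True if the recovery action was triggered."""
        if self.probe():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            # Reset the counter so the system gets time to come back up.
            self.failures = 0
            self.recover()
            return True
        return False


def run_forever(dog: Watchdog, interval: float = 60.0) -> None:
    """Check once per `interval` seconds until the process is stopped."""
    while True:
        dog.tick()
        time.sleep(interval)
```

You would wire it up with something like `Watchdog(lambda: http_probe("http://<your-ha-host>:8123/"), power_cycle)` and pass that to `run_forever`, where `power_cycle` is whatever function toggles your switch. Requiring several consecutive failures avoids rebooting the system because of a single slow response.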

I’m running this: https://github.com/nragon/keeper
It monitors HA through MQTT and also checks the MQTT connection status.

Is that running on another RPi? The readme for it isn’t written for the less Linux-minded people such as myself :confused:

Currently it’s running on the same machine as HA. If you want to run it on another machine, the only thing you have to change is the commands, which can be run through ssh, for instance. Open to suggestions too :slight_smile:

So it can be done on Hass.io?

Well, I’ve never tried it on Hass.io, nor do I know the requirements.

You can restore the service after a crash with systemd: just add the following line to the [Service] section of homeassistant.service:

[Service]
...
...
Restart=always

The same applies to mosquitto.service and others…

However, it is better to use this together with something more specific, like keeper.

Indeed, systemd cannot check whether services can actually talk to each other. For instance, HA and MQTT might both be up and running, yet for some reason HA is unable to receive or send messages. Keeper helps you detect this, whereas Restart=always only makes sure the service is restarted whenever its process exits.
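
To illustrate the "process is up but no messages are flowing" case, here is a minimal Python sketch of a heartbeat check. It is only meant to show the general idea and is not a description of how keeper is implemented. The class name `HeartbeatMonitor` and the 120 second threshold are assumptions made for the example.

```python
import time
from typing import Callable


class HeartbeatMonitor:
    """Tracks how long ago the last heartbeat message arrived."""

    def __init__(self, max_silence: float = 120.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_silence = max_silence
        self._clock = clock
        # Start counting from creation so a fresh start gets a grace period.
        self._last_seen = clock()

    def beat(self) -> None:
        """Record that a heartbeat message was just received."""
        self._last_seen = self._clock()

    def is_alive(self) -> bool:
        """Return False once the silence exceeds `max_silence` seconds."""
        return self._clock() - self._last_seen <= self.max_silence
```

The idea would be to call `beat()` from the message callback of your MQTT client whenever HA publishes to a heartbeat topic, and to restart HA when `is_alive()` returns False. A crashed process is still handled by Restart=always, so the two approaches cover different failures.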

I’m unable to add the GitHub link to the add-on store in Hass.io.