Hello,
My second post on the forums here was asking how to make the Zigbee2MQTT Watchdog keep attempting to start the add-on if it kept failing (as it gave up incredibly quickly - something like a minute or two). Unfortunately, nobody had any suggestions.
The reason I asked is that if Zigbee2MQTT cannot reach the Zigbee coordinator (in my case, a network-connected coordinator) for 10 minutes or so, it will stop the service. The watchdog will do its thing restarting it for a little while, but eventually (within another 10 or 15 minutes, if I recall), it will give up too. Note that if you start the server while the coordinator is unreachable, Zigbee2MQTT will fail and the watchdog will give up within about a minute.
This is not ideal in a home lab where I mess around with things quite frequently, and after spending some time in my lab, I’ve had a handful of occasions where I will wander off to go to the bathroom or something, and find out none of my crap works (even though I’d reconnected the coordinator).
Anyway, why spend the few minutes occasionally trying to troubleshoot something that comes up infrequently, when I can spend an entire evening trying to figure out how to prevent it altogether? lol. Here is my over-engineered, automation-based add-on watchdog:
First, we need to enable the Zigbee2MQTT “Running” entity so that we have a condition to check in the automation. I’ve not found any other means by which to get its status.
- Go to Settings>Devices and Services>Devices. Search “Zigbee2MQTT.” Select it.
- Under “Sensors,” expand the entity list by clicking “+4 entities not shown.”
- Select “Running.”
- Select the “Settings” tab, expand “Advanced settings,” and change the Entity Status to “hidden.” Click “Update.”
- It will take about 30 seconds for HA to enable the entity. Refresh the page, and ensure it now automatically shows the entity in the “Sensors” list (without having to unhide it), and shows it as “Running (Hidden).”
Then, add the following automation. It forces an entity update because, for some reason, HA only refreshes the Zigbee2MQTT “Running” entity every 5 minutes. As a result, the automation may not start trying until up to 5 minutes after the add-on stops. You can optionally create a separate automation that will force an update of the entity on a schedule, but I figured up to 5 minutes was fine for my use. If you happen to know where I can view/edit the YAML of automatically-created entities, please let me know - I’d modify the “scan_interval” on it if that were an option.
This automation checks every minute whether the add-on is running. If it is, the automation does nothing. If it isn’t, it attempts to start the add-on, waits, force-updates the entity, and loops back again. It keeps looping, checking the add-on’s status and attempting to start it again as necessary, and only exits the loop once the add-on has been running for 5 minutes.
alias: Zigbee2MQTT Watchguard
description: ""
trigger:
  - platform: time_pattern
    minutes: "*"
condition:
  - condition: state
    entity_id: binary_sensor.zigbee2mqtt_running
    state: "off"
    for:
      hours: 0
      minutes: 0
      seconds: 0
action:
  - repeat:
      until:
        - condition: state
          entity_id: binary_sensor.zigbee2mqtt_running
          state: "on"
          for:
            hours: 0
            minutes: 5
            seconds: 0
      sequence:
        - if:
            - condition: state
              entity_id: binary_sensor.zigbee2mqtt_running
              state: "off"
          then:
            - service: hassio.addon_start
              data:
                addon: 45df7312_zigbee2mqtt
            - delay:
                hours: 0
                minutes: 1
                seconds: 0
                milliseconds: 0
            - service: homeassistant.update_entity
              data: {}
              target:
                entity_id: binary_sensor.zigbee2mqtt_running
          else:
            - delay:
                hours: 0
                minutes: 0
                seconds: 30
                milliseconds: 0
            - service: homeassistant.update_entity
              data: {}
              target:
                entity_id: binary_sensor.zigbee2mqtt_running
mode: single
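If waiting up to 5 minutes for the “Running” entity to refresh is too slow for you, here is a rough sketch of the optional scheduled-refresh automation mentioned earlier. It uses the same `homeassistant.update_entity` service and the same entity ID as the automation above. The alias and the one-minute interval are just my assumptions, so adjust them to taste. I haven’t measured whether polling this often has any downside.

```yaml
# Hypothetical companion automation: force-refresh the "Running" entity
# on a schedule so the watchdog notices a stopped add-on sooner.
alias: Zigbee2MQTT Running Refresh  # assumed name, pick your own
description: ""
trigger:
  - platform: time_pattern
    minutes: "*"  # assumed interval: once per minute
action:
  - service: homeassistant.update_entity
    target:
      entity_id: binary_sensor.zigbee2mqtt_running
mode: single
```

With this in place, the watchdog’s condition should see the “off” state within a minute or so of the add-on stopping, rather than waiting for HA’s regular refresh.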