Get notified when an automation fails to run

liorfranko · May 14, 2024, 7:28pm

I’m looking for an easy way to be notified when any of my automations fails to run.

Troon · May 14, 2024, 7:31pm

What does “fails to run” mean? Give some examples.

liorfranko · May 15, 2024, 5:57am

I have a sunset automation that turns on lights at sunset; it fails when one of the lights is unreachable.
I also have automations for alarms; when I replaced my wife’s phone, the notifications failed to send.

Troon · May 15, 2024, 6:18am

So for the first one, you would turn the light on, then use Wait For Trigger or delay + conditional check to see if it came on or not:

trigger
turn lights on
wait a short time
count how many lights are on
either notify or try again if it’s not what’s expected.
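
The steps above could be sketched as a single automation along these lines. This is only a sketch: the entity IDs `light.porch` and `light.garden`, the 10-second delay, and the generic `notify.notify` service are assumptions for illustration, not details from the original setup.

```yaml
# Sketch only: entity IDs, delay and notify service are placeholders.
automation:
  - alias: "Sunset lights with check"
    trigger:
      - platform: sun
        event: sunset
    action:
      # Turn the lights on
      - service: light.turn_on
        target:
          entity_id:
            - light.porch
            - light.garden
      # Wait a short time so the devices can report their new state
      - delay: "00:00:10"
      # Count how many lights are on; notify if fewer than expected
      - if:
          - condition: template
            value_template: >
              {{ expand('light.porch', 'light.garden')
                 | selectattr('state', 'eq', 'on')
                 | list | count < 2 }}
        then:
          - service: notify.notify
            data:
              message: "Sunset automation: not all lights turned on."
```

An unreachable light normally reports `unavailable` rather than `on`, so it is left out of the count and the notification is sent. A retry could be added in the `then` branch by calling `light.turn_on` again before notifying.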

The second one just sounds like a configuration thing.

liorfranko · May 15, 2024, 6:29am

Thanks @Troon
So I understand there isn’t something generic that checks automations?
I think something generic for failed automations would be a good use case.
Those two examples just popped up for me recently, but I think a generic outcome for each automation run (Failed/Succeeded) is missing.

parautenbach · May 15, 2024, 6:39am

Again, what constitutes a failure needs to be defined.

It looks to me to include:

1. Failure to trigger (when it was expected to).
2. Failure to meet a condition.
3. Failure to execute an action.
4. Silent failures.

In case 1, I’m not aware of anything to help you there. For case 2 and similar, it needs to be handled in ways as described by Troon, because it depends on the semantics of your automation. For case 3, listening to log events could help.
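
For case 3, one way to listen to log events might look like the sketch below. It assumes the `system_log` integration is configured with `fire_event: true`, so that logged entries are also fired as `system_log_event` events. The alias, notification title and message layout are illustrative choices.

```yaml
# configuration.yaml: have the system log fire an event for each logged entry
system_log:
  fire_event: true

automation:
  - alias: "Notify on logged errors"
    trigger:
      - platform: event
        event_type: system_log_event
        event_data:
          level: ERROR
    action:
      # Show the logger name and the first message line in the UI
      - service: persistent_notification.create
        data:
          title: "Error logged"
          message: "{{ trigger.event.data.name }}: {{ trigger.event.data.message[0] }}"
```

This only catches failures that actually reach the log as errors, so silent failures (case 4) would still go unnoticed. On a noisy system, a template condition on `trigger.event.data.name` could narrow it down to the loggers of interest.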

Troon · May 15, 2024, 6:45am

Exactly. The automations didn’t fail: they sent the request to the light and sent the notification event.

It was the light that failed, and your phone configuration.

I do understand what you’re saying, but the feedback loop for each potential point of failure is hugely varied: you’ll need to design and implement your own.

The simple ones might be something that HA could look at implementing — switch something on, check if it’s on. But there are so many device types that this approach will still have edge cases.

There was a post yesterday about a cat feeder automation which operates by setting a number entity to 1 even though it was 1 to start with. There was no mechanism for checking if the feeder had actually dispensed the food.

liorfranko · May 15, 2024, 7:36am

I see.
Yeah, I think it’s a bigger issue in HA. I would expect every action in an automation to have a structured exception map or something like it, so that we could use that instead of watching logs.
Thanks anyway!

parautenbach · May 15, 2024, 8:55am

I think there are sufficient mechanisms for debugging. I personally don’t think there is a problem with HA. Building a dedicated, generic feature (if even remotely possible) would be managing a symptom. I have close to zero errors in my log (and over a thousand entities). I believe that should be the aim.