I have a notification script that triggers several notifications - Growl, Slack, and Google TTS… the problem is that if one fails, none of the subsequent steps get executed.
The script itself is pretty simple - and it is triggered by a few different automations (doors, leak detectors, etc.).
The simplest “fix” is to put the highest priority notifications first, but that isn’t a fix, and the other notifications are there for a reason. Is there any way to handle the exceptions raised in script execution so that subsequent steps can be executed? The script is provided below - in this example, if the Growl target (my desktop PC) is asleep or off, the other notifications won’t fire.
The script is included below for detail. The message comes from templating in the automations that trigger the actions. Please don’t suggest splitting this into separate scripts: this script is called from several different automations, and having separate actions for each one would be a bit silly.
Now each call to a notification service runs in its own script. If that script fails it doesn’t block/abort execution of whatever called it (like script.send_notification).
Thanks! Seems like a bit of a kludge, but if it does the trick, I’ll take it. I kind of wish there was real exception handling available - I have never really been a fan of “happy path or fail completely” programming. I suppose that’s where I’d have to turn to something more advanced or external to HA for scripting.
Hmm tried it out, still stops if an exception triggers.
Might have to dig into some other mechanism - looks like the script failing even prevents subsequent automation steps as well.
Any other ideas? I started digging into the code to see if I could implement a catch branch that doesn’t re-throw the exception, but that touches multiple files.
I know what you mean. It’s far from proper error handling but this is as close as you’re going to get to preventing a failure from aborting the script.
FWIW, the home automation software I’ve used for years employs VBScript. Even that has “on error resume next”, which allows you to gracefully handle a failure.
    on error resume next
    err.clear
    <<<potentially failure-prone code goes here>>>
    if err.number <> 0 then
        debugout "<" & this.name & "> ERROR! Unable to announce Door Status."
        err.clear
    end if
    on error goto 0
From my experience, you could configure Growl notifications that go nowhere.
Thanks for your help. I’m still pondering / considering digging into the HA script action code to see if it would be possible to put in a configurable catch block for actions, and revert to the default if there is no catch configured. Not sure how much I want to dig into that code though, especially since I have a habit of making changes that I don’t end up putting into a pull request.
Further to previous testing - I poked around in \helpers\script.py and put an ugly little hack in the async_run method that checks for the presence of a variable called “ignore_exception”. If that key is present in the variables, the exception is ignored (allowing the script to continue); otherwise normal behavior is maintained.
    except Exception:
        _LOGGER.info("Script exception occurred")
        if 'ignore_exception' in variables:
            pass
        else:
            # Store the step that had an exception
            self._exception_step = cur
            # Set script to not running
            self._cur = -1
            self.last_action = None
            # Pass exception on.
            raise
I would really prefer not to have a hack like this (since I have to re-apply it after every update), but at least for now I can let scripts continue selectively. To see why I consider this an issue, take water leak detection:
1. Water detected
2. Take some sort of action (e.g. close valve)
3. Notify via Slack
4. Notify via Growl / Google Home / etc.
Should any step fail, I still want (even require) the others to happen - while there is a little bit of priority ordering here, you can see how a failure / exception in say taking action should absolutely not prevent notification and vice versa.
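For illustration only, here is a minimal standalone sketch of the behavior I am after, written in plain Python rather than against the real Home Assistant code. The run_steps function and the idea of passing steps as callables are hypothetical; only the "ignore_exception" key comes from the hack above. Every step is attempted, failures are logged and collected, and the exception is re-raised only when the caller has not opted in.

```python
import logging

_LOGGER = logging.getLogger(__name__)


def run_steps(steps, variables=None):
    """Run each callable in order and return a list of (index, exception) failures.

    Hypothetical helper: if "ignore_exception" is present in `variables`,
    a failing step is logged and skipped; otherwise the exception is
    re-raised and the remaining steps never run (the default behavior).
    """
    variables = variables or {}
    failures = []
    for index, step in enumerate(steps):
        try:
            step()
        except Exception as err:
            _LOGGER.info("Script exception occurred in step %d: %s", index, err)
            if "ignore_exception" not in variables:
                # Default: pass the exception on and abort the sequence.
                raise
            # Opt-in: remember the failure and continue with the next step.
            failures.append((index, err))
    return failures
```

With the leak-detection steps as hypothetical callables, run_steps([close_valve, notify_slack, notify_growl], {"ignore_exception": True}) would attempt all three and return whichever ones failed, so a dead Growl target or a stuck valve cannot silence the rest.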
Again, thanks for your help. This seems like a fairly big flaw to me, but maybe I don’t know the correct way to do this. Coming from an industrial automation background, it seems to me like this would be useful functionality.
My understanding of how failed scripts behave was flawed. pnbruckner provides a detailed explanation of how they work and how to make them fail gracefully (i.e. non-blocking) by simply adding an initial delay:
@123 I did try the way that comment talks about, but it didn’t seem to work. I ended up modifying the exception handling code in the component to make it work. The step - delay - step pattern stops even when other scripts are called.
I would love to fix this in the mainline code, but I have never set up for contributing. Right now I am weighing the effort of re-applying patches after every update against sorting out how to set up a PR.
I believe phil is currently in the process of improving the way scripts work (and, judging by recent forum topics, that delay trick might no longer work).
We just need to wait a bit until it’s completed.
“… the calling script has a choice of whether or not to wait for the called script to finish. This is now documented. See: Waiting for Script to Complete. This also affects whether or not the calling script will abort if the called script causes an error.”
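As a rough analogy only (plain asyncio, not Home Assistant's actual implementation), the difference between waiting and not waiting can be sketched like this. The notify_growl and notify_slack coroutines are hypothetical stand-ins for the notification steps discussed in this thread. When the caller awaits the failing step, the error propagates and aborts the caller; when the step is started as a background task, the caller carries on and the failure is only logged.

```python
import asyncio
import logging

_LOGGER = logging.getLogger(__name__)


async def notify_growl():
    # Hypothetical step that fails, e.g. the desktop PC is asleep.
    raise ConnectionError("Growl target is unreachable")


async def notify_slack(sent):
    # Hypothetical step that succeeds; records that it ran.
    sent.append("slack")


def _log_failure(task):
    # Retrieve the background task's exception so it is reported, not lost.
    if not task.cancelled() and task.exception() is not None:
        _LOGGER.warning("Background step failed: %s", task.exception())


async def caller_waits(sent):
    # Waiting: the exception surfaces here and the Slack step never runs.
    await notify_growl()
    await notify_slack(sent)


async def caller_does_not_wait(sent):
    # Not waiting: start the step in the background and move on.
    task = asyncio.create_task(notify_growl())
    task.add_done_callback(_log_failure)
    await notify_slack(sent)
    # Yield once so the background task gets a chance to run and finish.
    await asyncio.sleep(0)
```

Running asyncio.run(caller_waits(sent)) raises ConnectionError before Slack is notified, while asyncio.run(caller_does_not_wait(sent)) completes and leaves "slack" in the list.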