Sometimes an action fails because of a temporary problem (unavailable device, connection problem…), and it would be useful to retry until the action works.
For example, I try to turn off a light, but it is unavailable, so the action fails. Right now I need to add a loop that calls light.turn_off with a delay until the light is off…
And sometimes the reply is just lost, or the device is just slow to act. Do you want your radiator to keep increasing the temperature only because the reply did not get back to HA?
And how long a delay should be allowed for slow devices?
What do you do if you start an action on a device that cannot be undone?
Like a pet feeder, where an extra feeding could create issues, and multiple feedings even more so.
The retry must be optional. Yes, sometimes it could make things worse, but in other cases it is innocuous.
For example, a radiator where you set the temperature (not increase, set) is not dangerous if you set it multiple times, and neither is sending multiple turn-off commands to a light.
How many times, or what timeout? All of that must be configurable.
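To illustrate the set versus increase point, here is a minimal Python sketch. The `Radiator` class and `send_repeatedly` helper are made up for illustration and are not HA code; they only show why repeating an absolute command is harmless while repeating a relative one is not.

```python
from typing import Callable


class Radiator:
    """Hypothetical device model, used only for this illustration."""

    def __init__(self, target_c: float) -> None:
        self.target_c = target_c

    def set_temperature(self, value_c: float) -> None:
        # Absolute command: repeating it leaves the same end result.
        self.target_c = value_c

    def increase_temperature(self, step_c: float) -> None:
        # Relative command: every repeat moves the target further.
        self.target_c += step_c


def send_repeatedly(command: Callable[[], None], times: int) -> None:
    """Simulate a retry that fires the same command several times."""
    for _ in range(times):
        command()


safe = Radiator(target_c=19.0)
send_repeatedly(lambda: safe.set_temperature(21.0), times=3)

risky = Radiator(target_c=19.0)
send_repeatedly(lambda: risky.increase_temperature(1.0), times=3)
```

After three sends, `safe.target_c` is still 21.0, while `risky.target_c` has gone from 19.0 to 22.0, which is the radiator problem described above.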
On my end I had to implement a “digital twin” + remediation loop.
With an input select (scenes) + input boolean (room), each time there's a change:
An automation tries to set the lights to the correct state, then starts a 1s timer.
When that timer ends, the automation mentioned above runs again, but this time it starts a 60s timer afterwards.
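Roughly, the idea could be sketched like this in Python. This is not the actual automation: `reconcile`, `recheck_delays` and the entity ids are hypothetical names, and the sketch assumes the 60s timer keeps re-triggering the check for as long as the loop runs.

```python
from typing import Callable, Dict, Iterator, List


def reconcile(
    desired: Dict[str, str],
    actual: Dict[str, str],
    apply: Callable[[str, str], None],
) -> List[str]:
    """Re-send the wanted state to every entity that differs from the twin."""
    corrected = []
    for entity_id, wanted in desired.items():
        if actual.get(entity_id) != wanted:
            apply(entity_id, wanted)
            corrected.append(entity_id)
    return corrected


def recheck_delays(first_s: float = 1.0, steady_s: float = 60.0) -> Iterator[float]:
    """Yield the wait before each re-check: 1s after a change, then 60s each time."""
    yield first_s
    while True:
        yield steady_s


# Hypothetical example: the twin wants both lights off, one is still on.
desired = {"light.kitchen": "off", "light.hall": "off"}
actual = {"light.kitchen": "on", "light.hall": "off"}
sent = []
fixed = reconcile(desired, actual, lambda entity_id, state: sent.append((entity_id, state)))
```

Here only `light.kitchen` gets a new command, because `light.hall` already matches the twin.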
The devs need to make this without knowing what device it will be used on, so that is the first hurdle.
Then the timeout needs to be configured, because that is the trigger for the retry.
Without knowing the device, the devs can’t set a default value.
That means it is up to the user to know this value in order to be able to use the retry function, and very few will be able to figure this out. Trial and error will not work for setting this value, because it will have to be set for all situations and not just the ones you can test for, or else it might fail when you need it.
It is still possible to make a retry function with automations and there are also third party integrations, but they still have those requirements in order to actually work when you need it.
Besides that, you also need to take counter/new actions into consideration.
What happens if you get a new command for the device that is relative in action?
Do you carry on with the retry and apply the new command once it succeeds, or do you assume the new command should cancel the command that has not succeeded yet?
I think this WTH request can be more generalized by a feature that returns the result of an ‘action’ in a variable, similar to how HTTP response codes work. This way the user can himself implement a retry function (e.g. while action_success == FALSE, do action).
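As a rough Python sketch of what that could look like: `ActionResult`, its `code` values and `call_with_retry` are all hypothetical, since HA does not expose a result like this today. The loop only retries on codes that look temporary, so a result such as a wrong alarm code is handed straight back to the caller.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass
class ActionResult:
    """Hypothetical result of an action, loosely modelled on HTTP status codes."""

    success: bool
    code: str  # assumed values: "ok", "unavailable", "timeout", "invalid_code"


def call_with_retry(
    action: Callable[[], ActionResult],
    max_attempts: int = 3,
    retry_on: FrozenSet[str] = frozenset({"unavailable", "timeout"}),
) -> ActionResult:
    """Repeat the action while it fails with a code that is worth retrying."""
    result = action()
    attempts = 1
    while not result.success and result.code in retry_on and attempts < max_attempts:
        result = action()
        attempts += 1
    return result


# Simulated device: unavailable twice, then the action goes through.
replies = iter([
    ActionResult(False, "unavailable"),
    ActionResult(False, "unavailable"),
    ActionResult(True, "ok"),
])
final = call_with_retry(lambda: next(replies), max_attempts=5)
```

The same returned `code` could then be used in a condition, for example to react to `"invalid_code"` instead of retrying.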
Another use case is that you might want to know within an automation if an alarm code was entered incorrectly. Currently there is no option to act on an alarm code that has been entered incorrectly. A more detailed description can be found here:
I completely support this request. Having a reliable method for retrying operations after a delay is critical. Having the option of logging that retry and the subsequent results would then be the icing on the strawberry pie.
That is pretty much how it works.
You send a request, and you get your reply in the form of the message that the device state has changed.
Maybe alarm codes are different and need some work, though.
Remember that HA does not make the devices, so HA cannot demand a reply, and a reply message before the state change would just be another point of possible failure, since you could then get an OK reply but no state change. Might as well stay with just the state change, which conveys the same thing.
There just needs to be a check for a change of state of a defined entity within a timeout period. If that state change completes, finish the process. Otherwise wait for a further timeout period and apply the request again, wait for the change of state within the timeout period, and repeat. The repeat count, timeouts and the object being monitored for a change are the important inputs.
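The steps above could be sketched in Python along these lines. This is only an illustration under assumptions, not how HA or any integration implements it: `retry_until_state`, its parameter names and the `FlakyLight` test device are made up, and the state is polled here, where a real implementation would more likely listen for state change events.

```python
import time
from typing import Callable


def retry_until_state(
    send_request: Callable[[], None],
    read_state: Callable[[], str],
    target_state: str,
    max_attempts: int = 3,
    timeout_s: float = 5.0,
    retry_delay_s: float = 5.0,
    poll_interval_s: float = 0.5,
) -> bool:
    """Return True once the monitored entity reports target_state, else False."""
    for attempt in range(max_attempts):
        if attempt > 0:
            # No state change in time: wait a further period before re-sending.
            time.sleep(retry_delay_s)
        send_request()
        deadline = time.monotonic() + timeout_s
        while True:
            if read_state() == target_state:
                return True
            if time.monotonic() >= deadline:
                break
            time.sleep(poll_interval_s)
    return False


class FlakyLight:
    """Hypothetical stand-in for a device that ignores the first few requests."""

    def __init__(self, ignore_first: int) -> None:
        self.state = "on"
        self.calls = 0
        self.ignore_first = ignore_first

    def turn_off(self) -> None:
        self.calls += 1
        if self.calls > self.ignore_first:
            self.state = "off"


light = FlakyLight(ignore_first=2)
confirmed = retry_until_state(
    light.turn_off,
    lambda: light.state,
    target_state="off",
    max_attempts=5,
    timeout_s=0.02,
    retry_delay_s=0.01,
    poll_interval_s=0.005,
)
```

The repeat count, both timeouts and the monitored state reader are all inputs here, so the user still has to pick values that fit the device, which is the hard part discussed earlier in the thread.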