Improving automation reliability

Most of my scripts and automations that use Retry no longer work in the visual editor: they display "unknown" in place of the step to perform, and the editor complains specifically about the service keyword.

@srescio, which versions of Home Assistant Core and the Retry integration are you running? Did you restart HA after upgrading the Retry integration? Something is still old, as the service key is no longer used.

Here is how it looks on my system - HA 2024.8.1 with Retry 3.0.2:

I am on HA 2024.8.1 and Retry 3.0.2. Yes, I just restarted everything to get Alexa Media Player working again, but some of my scripts are now in this state. If there was an auto-update function, it probably did not catch everything.

Anyway, I went into my script and automation YAML files, and after a find/replace-all everything is back to normal.
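For readers hitting the same symptom, here is a rough sketch of what such a find/replace might look like. The thread does not show the exact replacement, so this assumes it was the rename of the service key to action inside a Retry sequence; the entity name is borrowed from a later post purely as a placeholder.

```yaml
# Before (assumed): an inner step still using the old key
action: retry.actions
data:
  sequence:
    - service: light.turn_off
      target:
        entity_id: light.group_family_lamps
---
# After (assumed): the same step with the key renamed
action: retry.actions
data:
  sequence:
    - action: light.turn_off
      target:
        entity_id: light.group_family_lamps
```

Back up the YAML files before running a replace-all, since a blanket replacement can also touch unrelated keys or text.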

New to this addon and having issues getting it to work. I have v3.0.3 installed and HA v2024.8.2. For some reason, the Retry addon just doesn’t appear in the ‘Perform action’ list when I try to add it to a script. Am I missing something? Is there some other way I should be calling it?

The most common error is to miss the HA restart step:

Home Assistant restart is required once the integration files are copied (either by HACS or manually).

Ah, fixed it. For some reason it was missing from the Integrations list; I must have accidentally removed it at some point. Thanks!

I have a lamp group whose bulbs can be turned off, and often are, by family members. I like the retry logic and the way it expands groups, because it sometimes takes 1-2 tries to get all the lamps to comply with the on/off command. However, if someone has turned off a lamp, I get the error message. I thought validation would let me check whether the group as a whole is off, instead of checking each bulb, but it does both, so I still get alerts.

Short version:

  1. check every entity and perform requested action
  2. repeat x number of times until compliance
  3. stop and don’t alert if the group entity is in compliance

Does this look right for the above requirement?

data:
  sequence:
    - action: light.turn_off
      metadata: {}
      data: {}
      target:
        entity_id:
          - light.group_family_lamps
          - light.button_off
  on_error:
    - action: notify.mobile_app_not_yours
      data:
        message: Check Night Lights
  state_delay: 2
  retries: 1
  validation: “[[ is_state(light.group_family_lamps, off )]]”
action: retry.actions

validation: “[[ is_state(light.group_family_lamps, off )]]”

Should be:

validation: "[[ is_state('light.group_family_lamps', 'off') ]]"

However, a better approach is to drop the validation parameter and use this instead:

expected_state: "off"

Thank you, and I understand, but I want to validate that the group is off. The family has turned off some of the entities, so I want to validate against the whole group rather than each lamp.

Where is that code being used? In a card?

The syntax of that validation looks like the “Javascript” templating often used by cards, not the Jinja template of the back end.

Also, the code you pasted above has “smart quotes” in it, which won’t work if they are also in your real code.

Yours:

  validation: “[[ is_state(light.group_family_lamps, off )]]”

What I’d expect, depending on where the code is:

  validation: "{{ is_state('light.group_family_lamps', 'off') }}"

If you need more help, suggest you start a new topic: you have “hijacked” this thread for an off-topic discussion.

(For the sake of others who will read through this:)
You are correct that this is not the regular syntax of Jinja. However, this is a backend template expression. The special syntax is needed to prevent HA from rendering the template to a regular string in advance. See also here.

It would have been helpful to explain that. Nevertheless, your quoting is wrong. Instead of:

  validation: “[[ is_state(light.group_family_lamps, off )]]”

you need:

  validation: "[[ is_state('light.group_family_lamps', 'off') ]]"
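Putting that correction back into the original action gives something like the sketch below. It is the poster's own configuration with only the validation line changed and the empty metadata and data keys dropped; whether it fully meets the "don't alert if the group is off" requirement is not confirmed in this thread.

```yaml
action: retry.actions
data:
  sequence:
    - action: light.turn_off
      target:
        entity_id:
          - light.group_family_lamps
          - light.button_off
  on_error:
    - action: notify.mobile_app_not_yours
      data:
        message: Check Night Lights
  state_delay: 2
  retries: 1
  # Straight quotes only; the entity ID and the state are quoted strings.
  validation: "[[ is_state('light.group_family_lamps', 'off') ]]"
```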

I have been using this successfully. Thank you!

I use it with some lights that occasionally need a few retries to go to the correct setting.

When the retries are exhausted, default 7, I get a repair created. This is great because I can see problem lights and most likely reboot them (power cycle) to get them working again.

My problem is that I can’t get the repairs to go away. My only option is “Ignore”, which I don’t want to use because it persists over Home Assistant reboots. My thinking is that if I ignore a repair, I will never get another one for that particular bulb.

Is there a way for me to mark the repair resolved?

My options are “ignore” and “learn more”.

The system tries to delete an old repair ticket (with the same ID) before issuing a new one. This means that you can safely use “ignore” and it won’t hide future tickets.


I don’t even know where to start, but is there a way for “On Error” to tell which entity failed when using a “Notifications” action?

Just found this while trying to improve handling of a temperamental cloud-based integration, but I’m confused about the wording of when it’s not suitable for use, specifically:

When the order of the actions matters: the background retries are running independently to the rest of the actions.

Does this mean that I can’t use Retry to ensure a device is set to a particular state before moving on to the next stage of an automation? Ideally I want to use Retry to set the state of one device, and then use Retry again to set the state of another device, but only after the first state is achieved.

Thanks for your work on this!

@MaximumFish, your understanding is correct: the sequencing is not guaranteed.
The way to achieve the scenario is by using the following:

wait_for_trigger:
  - trigger: state
    entity_id: domain.device_state_1
    to: "on"

This wait step should be placed between the two steps. It makes sure that the 2nd step happens only after the 1st step succeeds.
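As a rough illustration of that placement, the sketch below puts the wait between two Retry calls in a script sequence. The entity names switch.device_one and switch.device_two are hypothetical placeholders, and the timeout with continue_on_timeout is an optional addition that is not part of the suggestion above.

```yaml
# Sketch only: entity names are hypothetical placeholders.
sequence:
  # Step 1: retries run in the background, so this call returns early.
  - action: retry.actions
    data:
      sequence:
        - action: switch.turn_on
          target:
            entity_id: switch.device_one
  # Block here until the first device reports the desired state.
  - wait_for_trigger:
      - trigger: state
        entity_id: switch.device_one
        to: "on"
    # Optional: stop the script if the state is never reached.
    timeout:
      minutes: 2
    continue_on_timeout: false
  # Step 2: runs only after the wait above has completed.
  - action: retry.actions
    data:
      sequence:
        - action: switch.turn_on
          target:
            entity_id: switch.device_two
```

Note that a state trigger fires only on a state change, so if the first device might already be in the target state when the wait begins, a wait_template step may be a better fit.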

@kobejo34 ,

I don’t even know where to start, but is there a way for “On Error” to tell which entity failed when using a “Notifications” action?

on_error can use templates, and it receives the failed entity in the entity_id variable. That variable can be placed in the message of the notification.

That makes sense, thanks. Can I then use the “on error” condition to abort the automation if it’s already moved onto the next steps?