Help writing script to ping z-wave nodes every night?

finity · March 26, 2025, 3:08am

If there is only one dead node and the device entity_ids are named consistently, you could do the following in the entity_id section of the button.press action:

# Press the ping button that matches the dead node's node_status sensor
- action: button.press
  target:
    entity_id: >
      {% set dead_list = states | selectattr('entity_id', 'search', 'node_status') |
               selectattr('state', 'in', ['dead', 'unavailable', 'unknown']) |
               map(attribute='entity_id') | list %}
      {{ dead_list | map('replace', 'sensor.', 'button.') | map('replace', 'node_status', 'ping') | list }}

If there is more than one dead node at a time, it gets more complicated. I'm not entirely sure it's possible to iterate over a list in an action like that. At least I can't think of a way off the top of my head.

But if there is more than one, you could ping the first dead node in the list, wait a bit, then ping the next one. That isn't too hard: as long as the ping was successful, the revived node drops out of the list and the next dead node becomes the new first entry.

So something like this:

repeat:
  # Keep looping while at least one node_status sensor reports a dead-like state
  while:
    - condition: template
      value_template: >
        {% set dead_count = states | selectattr('entity_id', 'search', 'node_status') |
             selectattr('state', 'in', ['dead', 'unavailable', 'unknown']) |
             map(attribute='entity_id') | list | count %}
        {{ dead_count | int > 0 }}
  sequence:
    # Ping the first dead node, then give it time to recover before re-checking
    - action: button.press
      target:
        entity_id: >
          {% set dead_list = states | selectattr('entity_id', 'search', 'node_status') |
               selectattr('state', 'in', ['dead', 'unavailable', 'unknown']) |
               map(attribute='entity_id') | list %}
          {{ dead_list[0] | replace('sensor.', 'button.') | replace('node_status', 'ping') }}
    - delay:
        minutes: 5
There may be better ways to do it, but I think that should get you started.

The concern is that a dead node never comes back to life, so the automation runs forever trying to revive it and never gets past that node.

You can limit the number of times it runs to the original number of dead nodes, which would mitigate that situation:

repeat:
  while:
    # Condition 1: there is still at least one dead node
    - condition: template
      value_template: >
        {% set dead_count = states | selectattr('entity_id', 'search', 'node_status') |
             selectattr('state', 'in', ['dead', 'unavailable', 'unknown']) |
             map(attribute='entity_id') | list | count %}
        {{ dead_count | int > 0 }}
    # Condition 2: the pass counter has not exceeded the current dead-node count
    - condition: template
      value_template: >
        {% set dead_count = states | selectattr('entity_id', 'search', 'node_status') |
             selectattr('state', 'in', ['dead', 'unavailable', 'unknown']) |
             map(attribute='entity_id') | list | count %}
        {{ repeat.index <= dead_count }}
  sequence:
    - action: button.press
      target:
        entity_id: >
          {% set dead_list = states | selectattr('entity_id', 'search', 'node_status') |
               selectattr('state', 'in', ['dead', 'unavailable', 'unknown']) |
               map(attribute='entity_id') | list %}
          {{ dead_list[0] | replace('sensor.', 'button.') | replace('node_status', 'ping') }}
    - delay:
        minutes: 5

All of that is totally untested, though.
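One thing to watch in the capped loop: dead_count is re-evaluated on every pass, so the cap shrinks as nodes recover. With two dead nodes, if the first one recovers after its ping, the second pass compares repeat.index 2 against a count of 1 and stops before the second node is pinged. A possible workaround is to snapshot the count once before the loop. This is an untested sketch; initial_dead_count is a made-up variable name, and the two entries are assumed to sit in an automation's actions list.

```yaml
# Sketch only, untested. `initial_dead_count` is a hypothetical variable name.
- variables:
    # Evaluated once, before the loop, so the cap stays fixed
    initial_dead_count: >
      {{ states | selectattr('entity_id', 'search', 'node_status')
                | selectattr('state', 'in', ['dead', 'unavailable', 'unknown'])
                | list | count }}
- repeat:
    while:
      # Stop as soon as no node_status sensor reports a dead-like state
      - condition: template
        value_template: >
          {{ states | selectattr('entity_id', 'search', 'node_status')
                    | selectattr('state', 'in', ['dead', 'unavailable', 'unknown'])
                    | list | count > 0 }}
      # Never run more passes than there were dead nodes at the start
      - condition: template
        value_template: "{{ repeat.index <= initial_dead_count | int }}"
    sequence:
      - action: button.press
        target:
          entity_id: >
            {% set dead_list = states | selectattr('entity_id', 'search', 'node_status')
                                      | selectattr('state', 'in', ['dead', 'unavailable', 'unknown'])
                                      | map(attribute='entity_id') | list %}
            {{ dead_list[0] | replace('sensor.', 'button.') | replace('node_status', 'ping') }}
      - delay:
          minutes: 5
```

Even with the snapshot, a node that refuses to recover stays first in the list and is simply pinged again on the next pass, so the cap only guarantees that the loop ends, not that every dead node gets a turn.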

hansaplast26 · March 26, 2025, 8:07am

Thanks for this, I have a lot to learn about Jinja. I came up with this and am testing it now. So far it does what I want.

alias: "Timepattern: ping dead zwave nodes"
description: ""
triggers:
  - trigger: time_pattern
    minutes: /10
conditions:
  - condition: numeric_state
    entity_id: sensor.zwave_failed_nodes
    above: 0
actions:
  - action: button.press
    metadata: {}
    data: {}
    target:
      entity_id: >
        {% set dead_list = states | selectattr('entity_id', 'search', 'node_status') |
             selectattr('state', 'in', ['dead', 'unavailable', 'unknown']) |
             map(attribute='entity_id') | list %}
        {{ dead_list[0] | replace('sensor.', 'button.') | replace('node_status', 'ping') }}
mode: single

PeteRage · March 26, 2025, 1:29pm

If you have many dead nodes and try to ping them all at the same time, that may create too much traffic for the Z-Wave network, overwhelming it and causing more nodes to go dead. So I would try to use a repeat block to iterate through the list with at least a 15 second delay between each ping.
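A rough, untested sketch of that idea, assuming the same sensor.*_node_status / button.*_ping naming used earlier in the thread. The variable name dead_pings is made up for illustration, and the two entries are meant to go in an automation's actions list. The list is built once up front, so each dead node is pinged exactly once per run:

```yaml
# Sketch only, untested. `dead_pings` is a hypothetical variable name.
- variables:
    # Snapshot of the ping buttons for every node that currently looks dead
    dead_pings: >
      {{ states | selectattr('entity_id', 'search', 'node_status')
                | selectattr('state', 'in', ['dead', 'unavailable', 'unknown'])
                | map(attribute='entity_id')
                | map('replace', 'sensor.', 'button.')
                | map('replace', 'node_status', 'ping')
                | list }}
- repeat:
    for_each: "{{ dead_pings }}"
    sequence:
      # repeat.item holds the current list entry, here one ping button entity_id
      - action: button.press
        target:
          entity_id: "{{ repeat.item }}"
      # Spacing between pings so the mesh is not flooded
      - delay:
          seconds: 15
```

Because the loop walks a fixed list, it always terminates, even if some nodes never come back.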

hansaplast26 · March 26, 2025, 2:15pm

Thanks. I have 12 nodes and occasionally 2 go dead, due to distance and interference, I assume. So far I have been pinging them manually without 0.5 s delay without any issues.
Good tip though, time will tell what is best here.

finity · March 27, 2025, 12:07am

Instead of running the automation every 10 minutes even when it's unnecessary, why don't you only run it when you have dead nodes? That is an easy trigger to set up, since you already have the sensor that tells you whether you have any dead nodes. Instead of using that sensor in the condition, use it to trigger the automation in place of the time pattern trigger. It's more efficient.
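For illustration, the trigger section could look roughly like this, reusing the sensor.zwave_failed_nodes entity from the automation posted above (a sketch, not tested against that setup):

```yaml
triggers:
  # Fires when the failed-node count crosses from 0 (or below) to above 0
  - trigger: numeric_state
    entity_id: sensor.zwave_failed_nodes
    above: 0
```

A numeric_state trigger only fires at the moment the value crosses the threshold, not continuously while it stays above it.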

hansaplast26 · March 27, 2025, 4:56am

I agree with you that this would be more logical. However, multiple triggers are treated as OR, whereas one trigger plus a condition is treated as AND.

finity · March 27, 2025, 5:12am

I don’t know why that’s an issue.

hansaplast26 · March 27, 2025, 6:38am

Because the trigger will then be: every 10 minutes OR (immediately) when a node fails. If a node fails and the ping does not bring it back to life, the trigger will repeat as fast as it can, which will for sure overload the Z-Wave network and possibly the Home Assistant service. The 10-minute trigger will basically be ignored completely.

finity · March 27, 2025, 9:33am

That's not true.

The trigger needs to go back to false and then back to true before it will fire again. If the node never goes back to alive, the trigger will never be false, so the automation will only run one time.

hansaplast26 · March 27, 2025, 10:20am

OK, that makes sense; it will avoid loops. But I think it will create a new problem:
Say you have a node that failed. The automation will trigger and ping the node. If that node doesn't recover with the ping, the automation will not be triggered again, since the trigger stays true.
If a new node fails in the meantime, the automation will not trigger and the nodes will not be pinged.

I have tested the automation with a trigger every 10 minutes and a dead node as the condition, and it runs every 10 minutes to ping the dead node.
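One possible way to reconcile both points, offered as an untested sketch: a plain state trigger on the count sensor fires on every change of its value (for example 1 to 2 when a second node dies), while the time pattern stays as a periodic retry for nodes that did not recover. The condition keeps either trigger from doing anything when no node is dead. The entity name is the one from the automation above.

```yaml
triggers:
  # Fires on any change of the failed-node count, including 1 -> 2
  - trigger: state
    entity_id: sensor.zwave_failed_nodes
  # Periodic retry for nodes that stayed dead after the first ping
  - trigger: time_pattern
    minutes: "/10"
conditions:
  # Only continue while at least one node is reported as failed
  - condition: numeric_state
    entity_id: sensor.zwave_failed_nodes
    above: 0
```

With mode: single, a trigger that arrives while a previous run is still in progress is skipped rather than queued, so the two triggers should not pile up pings.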