Automate ZwaveJS Ping Dead Nodes?

Nice and sweet!

However, you could simplify the template to:

{{
  states
  | selectattr("entity_id", "search", "node_status")
  | selectattr('state', 'in', 'dead, unavailable, unknown')
  | map(attribute="object_id")
  | map('regex_replace', find='(.*)_node_status', replace='button.\\1_ping', ignorecase=False)
  | list
}}

and you should use the

{{ states('sensor.dead_zwave_devices') }}

format in both lines in the automation to prevent startup issues on these templates

Can you clarify the syntax of what you’re suggesting? I’m trying to get this set up, but am struggling with the YAML.

Here is what I have, but it’s obviously incorrect when trying to input your
{{ states('sensor.dead_zwave_devices') }} suggestion.

- alias: Ping Dead ZWave Nodes
  description: ''
  trigger:
  - platform: state
    entity_id: sensor.dead_zwave_devices
  condition:
  - condition: template
    value_template: '{{ states('sensor.dead_zwave_devices') != "[]" }}'
  action:
  - service: button.press
  target:
      entity_id: '{{ states('sensor.dead_zwave_devices') }}'
  mode: single

@Mariusthvdb - I wanted to reach out and see if you could comment on the proper syntax for what you proposed, or paste the whole automation. Thank you :slight_smile:

Didn’t get a ping, so sorry for the response lag…

You are using single quotes for both the inner and the outer quotes in the value_template and the entity_id lines.

Use double outer quotes ("double-quote") or multi-line notation using > to solve that syntax issue.

eg

value_template: "{{ states('sensor.dead_zwave_devices') != '[]' }}"

or my preferred style:

value_template: >
  {{ states('sensor.dead_zwave_devices') != '[]' }}

The same goes for the entity_id line.
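Putting both quoting fixes together, the automation from the question might look like the sketch below. It assumes, as in the earlier posts, that the state of sensor.dead_zwave_devices holds the rendered list of ping button entity IDs, so an empty result is the string '[]'. The extra checks for 'unknown' and 'unavailable' are an addition for illustration, not something from the original posts. Keep in mind that entity states are capped at 255 characters, so a long list may not fit in the state.

```yaml
- alias: Ping Dead ZWave Nodes
  description: ''
  trigger:
    - platform: state
      entity_id: sensor.dead_zwave_devices
  condition:
    - condition: template
      # states() always returns a string, so compare against strings here.
      # The unknown/unavailable checks are an added assumption.
      value_template: >
        {{ states('sensor.dead_zwave_devices') not in ['[]', 'unknown', 'unavailable'] }}
  action:
    - service: button.press
      target:
        # Assumption: the sensor state is the list of button.*_ping entity IDs
        entity_id: >
          {{ states('sensor.dead_zwave_devices') }}
  mode: single
```

Note that target: is indented under the service call here; in the snippet from the question it sat at the same level as action:.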

Can you please post the final iteration of all the parts needed to make this work? I am having the same issue with my ZST10-700. Devices randomly going dead.

I agree it can be confusing when people only post part of their configuration, especially since YAML is such a non-descriptive format. I’ll try to elucidate.

I put all of the Z-Wave ping handling into a “package” file that I load from my configuration.yaml.

In the configuration.yaml, near the top:

group: !include groups.yaml
automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml

# added to the default includes to pull in packages
homeassistant:
  packages: !include_dir_named packages

And in the home-assistant directory (where the configuration.yaml file resides), create the directory “packages”. I put a file in the packages directory I called “zwave-ping.yaml”, with the following contents:

template:
  - sensor:
      - name: "Dead ZWave Devices"
        unique_id: dead_zwave_devices
        unit_of_measurement: entities
        state: >
          {% if state_attr('sensor.dead_zwave_devices','entity_id') != none %}
            {{ state_attr('sensor.dead_zwave_devices','entity_id') | count }}
          {% else %}
            {{ 0 }}
          {% endif %}
        attributes:
          entity_id: >
            {% set exclude_filter = ['sensor.700_series_based_controller_node_status'] %}
            {{
              expand(integration_entities('Z-Wave JS') )
              | rejectattr("entity_id", "in", exclude_filter)
              | selectattr("entity_id", "search", "node_status")
              | selectattr('state', 'in', 'dead, unavailable, unknown')
              | map(attribute="object_id")
              | map('regex_replace', find='(.*)_node_status', replace='button.\\1_ping', ignorecase=False)
              | list
            }}

automation:
  - id: ping_dead_zwave_devices
    alias: Ping Dead ZWave Devices
    description: ''
    trigger:
      - platform: state
        entity_id:
          - sensor.dead_zwave_devices
    condition:
      - condition: template
        value_template: >
          {{ states('sensor.dead_zwave_devices') | int(0) > 0 }}
    action:
      - service: button.press
        target:
          entity_id: >
            {{ state_attr('sensor.dead_zwave_devices','entity_id') }}
    mode: single

Note the sensor can be added to your dashboard and will show a count of dead devices.
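As a rough illustration, a minimal dashboard card for the sensor could look like this. The card type and the display name are only one possible choice, not something taken from the original setup.

```yaml
# Lovelace card (YAML editor); the name below is a made-up label
type: entities
entities:
  - entity: sensor.dead_zwave_devices
    name: Dead Z-Wave devices
```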

You can enable or disable the automation from the settings, but you cannot edit the automation using the visual editor. You can change that by putting the automation part in your automations.yaml file if you prefer.
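For anyone taking that route: automations.yaml holds a bare list, so the top-level automation: key is dropped and each entry starts at the dash. A sketch of the same automation in that form follows. It assumes the template sensor stays in the package and that the automation block is removed from the package so the id is not defined twice.

```yaml
# automations.yaml (no top-level "automation:" key in this file)
- id: ping_dead_zwave_devices
  alias: Ping Dead ZWave Devices
  description: ''
  trigger:
    - platform: state
      entity_id:
        - sensor.dead_zwave_devices
  condition:
    - condition: template
      # int(0) falls back to 0 while the sensor is still unknown at startup
      value_template: >
        {{ states('sensor.dead_zwave_devices') | int(0) > 0 }}
  action:
    - service: button.press
      target:
        entity_id: >
          {{ state_attr('sensor.dead_zwave_devices','entity_id') }}
  mode: single
```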

If you disable the automation, you’ll see the count go up and down over time. Since enabling the automation, I’ve not had any persistent dead nodes.

F***ing AWESOME!! Thank You.

zwave_ping.yaml :slight_smile:
Thanks for this.

I simply repackaged the hard work of other folks in this thread, but I’m glad it was helpful.

It seems this no longer works. Had a node go dead shortly after I implemented the code and the automation failed to trigger. When I attempted to run the automation manually, nothing happened. Not sure what is going on. I’m running Home Assistant 2022.7.5.

I just implemented this as per bretton.wade’s package just above. Perfect timing since I had one node needing pinging.

I was able to run the automation manually to clear the node and will monitor the automation.

I note, however, that at least in V2022.7.6 there is no longer an entity for the master controller, so I deleted these two lines:

{% set exclude_filter = ['sensor.700_series_based_controller_node_status'] %}
and
| rejectattr("entity_id", "in", exclude_filter)
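With those two lines removed, the attributes section of the template sensor would read roughly as follows. This is just the package above with the exclusion dropped; nothing else is changed.

```yaml
        attributes:
          entity_id: >
            {{
              expand(integration_entities('Z-Wave JS') )
              | selectattr("entity_id", "search", "node_status")
              | selectattr('state', 'in', 'dead, unavailable, unknown')
              | map(attribute="object_id")
              | map('regex_replace', find='(.*)_node_status', replace='button.\\1_ping', ignorecase=False)
              | list
            }}
```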

I updated my HA, but the automation still works for me. This is what you love to hear, right?

Sadly, the ping only resolves one type of problem in the ZWave 7 universe. Try connecting to your ZWaveToMQTT.js web interface and see if the device you want to ping is having some other problem.

Caveat: this package does not actually fix your nodes, so when they do go offline, they don’t respond until the ping brings them back. For some automations, this means things like a light switch stays on. I’ve begun to implement a shadow that tries to check if a switch is in the state HA expects it to be… I hate that this is necessary. I’ve also changed the automation so that it runs every 15 seconds instead of in response to a new dead entity. I found that it was unreliable the other way.
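For reference, a polling version of the automation could look something like the sketch below. It swaps the state trigger for a time_pattern trigger that fires every 15 seconds and leaves the condition and action from the package as they were. The id and alias are made up for illustration, and this is not the exact automation described above.

```yaml
automation:
  - id: ping_dead_zwave_devices_polling   # hypothetical id
    alias: Ping Dead ZWave Devices (polling)
    description: ''
    trigger:
      - platform: time_pattern
        seconds: "/15"   # fire every 15 seconds
    condition:
      - condition: template
        # only press the ping buttons when the sensor reports dead nodes
        value_template: >
          {{ states('sensor.dead_zwave_devices') | int(0) > 0 }}
    action:
      - service: button.press
        target:
          entity_id: >
            {{ state_attr('sensor.dead_zwave_devices','entity_id') }}
    mode: single
```

Because the condition is checked on every tick, nodes that stay dead are pinged again 15 seconds later rather than only once when the count changes.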

I can say with complete certainty that my nodes are being marked as either dead or unavailable. It’s definitely caused by the known 700 series device bug and I’ve submitted logs to the devs in the hopes they can identify the cause. I’ve done everything necessary in an attempt to mitigate the problem (firmware upgrade, USB cable extension, etc.). Sadly, until a fix is found, tools like this are our only available workaround.

FWIW, I am able to “revive” the nodes by manually running the Ping Button feature. I suspect, as you’ve already pointed out, the expediency of my current automation may be a causal factor which interferes with the automation’s efficacy.

Quite by accident, I discovered that an easy way to trigger the automation is to force the Zwave driver to restart. Changes made to the Zwave settings panel in Z2M result in a restart of the driver. It was during such a restart that I observed the automation attempting to run, which it probably shouldn’t do since the entire Z-wave environment is being completely reset at that point in time. I think a delay of some sort, perhaps an “if dead for X minutes” condition, might help offset any conflicts that could arise.
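One way to express an “if dead for X minutes” delay is a numeric_state trigger with a for: duration, so the automation only fires once the dead count has stayed above zero for a while. This is an untested sketch against the package posted earlier in the thread; the five-minute value is an arbitrary placeholder and the id and alias are hypothetical.

```yaml
automation:
  - id: ping_dead_zwave_devices_delayed   # hypothetical id
    alias: Ping Dead ZWave Devices (delayed)
    description: ''
    trigger:
      - platform: numeric_state
        entity_id: sensor.dead_zwave_devices
        above: 0
        for:
          minutes: 5   # placeholder for "X minutes"
    action:
      - service: button.press
        target:
          entity_id: >
            {{ state_attr('sensor.dead_zwave_devices','entity_id') }}
    mode: single
```

A numeric_state trigger fires when the value crosses the threshold, so nodes that remain dead after the first ping would probably not be retried until the count drops back to zero and rises again. That trade-off is worth weighing against a polling trigger.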

Agreed - I follow the zwavejs slack channel and have submitted many debug logs for them to send to SiLabs before the latest SiLabs firmware was released 5 months ago. While the dead zwave devices being “resuscitated” by pinging has reduced dramatically since the 7.17.2 stick firmware, it still happens (just not daily like before).

There is still the entity for the stick, but it’s now disabled by default. Good catch on calling that out though, it broke my automation as well.

ZST10 700, current firmware.

I think instead of chasing my own tail with the ongoing 700-series issues (multiple dead nodes every week), I’m going to bite the bullet and move every Z-Wave device over to a Gen5 stick. I can’t think of any real-world benefit offered by the 700 that’s worth the issue of devices dropped off the network.

While this won’t solve every issue, it’ll solve the most frequent. Some devices, including some from Zooz like Zen26 and Zen27 have other firmware issues that cause them to crash and become totally unresponsive until they’re power cycled (at the breaker). This happens much less frequently, and typically from a too-brief grid power failure or brownout.

After switching to a Gen5 controller, would all existing entities/devices be lost?

I was getting ready to dump ZWave entirely - literally searching Amazon for equivalent WiFi devices - when I did some more Googling and found this thread. Pinging dead nodes from an automation would be a lot easier than physically replacing all the devices with WiFi ones.

I see an entity called sensor.700_series_based_controller_node_status, but I can’t enable it, it’s marked “unavailable” and my attempt to enable it apparently fails.

Depends on how you ‘switch’ (see later)

It’s supposed to be - the ‘node status’ entity for the controller itself isn’t valid and should never have been available in the first place. Recent versions of ZWaveJS and ZWaveJS2mqtt fixed that, which is why you can’t enable it. Not sure why the node status entity is still available on your install; have you updated your ZWave integration? In any case, this part isn’t an issue.

Making some assumptions based on what you posted… It sounds like you have a 700-series based Zwave network and have been dealing with unresponsive nodes. Your first recourse should be locating the latest firmware for that stick. Verify it has a version of 7.17.2 or BETTER (the patch that was directly intended to help the dead node issue). If you’re not at that firmware level, upgrade to that first, then try the ping automation in this thread. Upgrading the firmware on the stick is nondestructive.

If the firmware and ping script don’t work (it’s by no means perfect, I have a node drop now and then, but for the most part they just work now) and you want to try rolling back from a 700 to a 500 stick (again, probably not necessary): if you’re running ZWaveJS2mqtt at least, it can back up the NVM of the Z-stick and restore it to a 500 stick, meaning you don’t lose your network, entities, etc. I do not know if you can do that procedure with base ZWaveJS. I’m sure someone knows that answer.

So the solution in this thread - the sensor and ‘ping’ automation - is already obsolete and can no longer be made to work?

I get it now - I just need to remove those lines from the yaml. The rest may still work.

I decided not to try upgrading the firmware. Too many posts from people who said they did that and lost everything, or that it didn’t even solve the dead node problem. If I could do it from within HA on my RPi, then maybe, but I’m not jumping through all the hoops to do it on a Windows machine.