Zwave JS frequently dropping connections to endpoints

No, it is a 500 series. It is all I could get at the time.

Hmm… I was hoping that the 500 series might be the path to fix/resolve the issue. Perhaps I’ll hold off on migrating over (and buying a new USB stick) and see if there are any other suggestions. I have seen some automation in the Blueprint Exchange that sends a ping to the device as soon as the node shows up as dead and wakes it back up. I’m going to test that out and see if it helps…
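For anyone wanting to see what that ping-on-dead idea boils down to outside of a blueprint, here is a minimal sketch. It assumes each node has a node status sensor that reads `dead` when the node drops, and that the Z-Wave JS integration exposes a `zwave_js.ping` service reachable through the Home Assistant REST API; check the integration docs for what your version offers. The entity IDs, base URL, and token below are made up, and the request is only built here, not sent.

```python
import json
import urllib.request
from typing import Dict, List


def dead_nodes(node_status: Dict[str, str]) -> List[str]:
    """Return the entity IDs whose node status reads 'dead'."""
    return [entity for entity, status in node_status.items() if status == "dead"]


def build_ping_request(base_url: str, token: str,
                       entity_ids: List[str]) -> urllib.request.Request:
    """Build (but do not send) a service call that pings the given nodes.

    Assumption: the service is zwave_js.ping and it accepts an entity_id list.
    """
    body = json.dumps({"entity_id": entity_ids}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/zwave_js/ping",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical snapshot of node status sensors.
statuses = {
    "sensor.fan_switch_node_status": "dead",
    "sensor.hall_light_node_status": "alive",
}
targets = dead_nodes(statuses)
request = build_ping_request("http://homeassistant.local:8123", "YOUR_TOKEN", targets)
print(request.full_url, targets)
```

Passing the request to `urllib.request.urlopen` would actually send it. In practice the blueprint does the same thing with a state trigger, so this is only meant to show the moving parts.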

To be fair, I have seen a few posts about my stick not being very good so it could be that 🤷

Is there any chance that you have a bad z-wave controller? I lost a 500 series USB stick about 4 years ago. It was erratic like you have explained.

The only way I know how to test this is to buy another and transfer over all the nodes. Have you set the log level to silly in the integration and looked for a problem there? It may give you some guidance.

Could the RPI have port problems?

I have had to unplug the dongle a couple of times to get z-wave back. It is like the computer and the dongle quit talking to each other. When I unplugged and plugged the dongle back in and rebooted everything worked.

I run a VM on Linux with supervisor, so it is different from the RPI. Do you have another computer on which you could run an HA virtual machine to test your dongle and see if it is something on the RPI?

I am throwing out ideas to see what sticks in the hopes that one might help.

All great thoughts, thanks.

  1. Yes, absolutely possible that the controller is flaking out/going bad. I’ve started to look at other options, but this will be a bit further down my list of troubleshooting steps as I’d like to do a bit of research on what I’d buy to replace it.

  2. The Pi could also have a bad port, although, I suspect that this is not the cause. I have switched the USB sticks (I have 1 for Zwave and 1 for Zigbee) around and they seem to continue to function fine - the Zigbee network has (by and large) been solid and I’ve tested it running in the same USB port.

  3. I will try out turning the log level to Silly - Question Regarding This: Once I do that, I presume that I’d need to let the system run for a period of time to then be able to review the log, correct? Is the log automatically saved?

  4. I unplugged the device when the HA server was powered off (and unplugged – apparently just shutting it down doesn’t necessarily cut all the power). It seemed to help, HOWEVER, the issues seem to return after a period of time…

  5. I do have another machine, however, I’m not familiar with running virtual machines. I suspect this will be a step I’d try after testing a new ZWave stick.

Correct. No, the log is not saved automatically, and it will clear if you leave the integration configuration.

I just see each communication between the nodes and the controller.

2022-08-23T17:48:42.481Z SERIAL « 0x01090004002403250300f3                                            (11 bytes)
2022-08-23T17:48:42.483Z CNTRLR   [Node 036] [~] [Binary Switch] currentValue: false => false       [Endpoint 0]
2022-08-23T17:48:42.484Z SERIAL » [ACK]                                                                   (0x06)
2022-08-23T17:48:42.485Z DRIVER « [Node 036] [REQ] [ApplicationCommand]
                                  └─[BinarySwitchCCReport]
                                      current value: false
2022-08-23T17:49:19.968Z SERIAL « 0x010b0004004d0531050301008e                                        (13 bytes)
2022-08-23T17:49:19.970Z CNTRLR   [Node 077] [Multilevel Sensor] Illuminance: metadata updated      [Endpoint 0]
2022-08-23T17:49:19.972Z CNTRLR   [Node 077] [~] [Multilevel Sensor] Illuminance: 0 => 0            [Endpoint 0]
2022-08-23T17:49:19.973Z SERIAL » [ACK]                                                                   (0x06)
2022-08-23T17:49:19.974Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[MultilevelSensorCCReport]
                                      type:  Illuminance
                                      scale: Percentage value
                                      value: 0
2022-08-23T17:49:57.253Z SERIAL « 0x010b00040022053105030103e2                                        (13 bytes)
2022-08-23T17:49:57.255Z CNTRLR   [Node 034] [Multilevel Sensor] Illuminance: metadata updated      [Endpoint 0]
2022-08-23T17:49:57.256Z CNTRLR   [Node 034] [~] [Multilevel Sensor] Illuminance: 5 => 3            [Endpoint 0]
2022-08-23T17:49:57.258Z SERIAL » [ACK]                                                                   (0x06)
2022-08-23T17:49:57.259Z DRIVER « [Node 034] [REQ] [ApplicationCommand]
                                  └─[MultilevelSensorCCReport]
                                      type:  Illuminance
                                      scale: Percentage value
                                      value: 3
2022-08-23T17:50:19.975Z SERIAL « 0x010b0004004d0531050301008e                                        (13 bytes)
2022-08-23T17:50:19.977Z CNTRLR   [Node 077] [Multilevel Sensor] Illuminance: metadata updated      [Endpoint 0]
2022-08-23T17:50:19.979Z CNTRLR   [Node 077] [~] [Multilevel Sensor] Illuminance: 0 => 0            [Endpoint 0]
2022-08-23T17:50:19.980Z SERIAL » [ACK]                                                                   (0x06)
2022-08-23T17:50:19.982Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[MultilevelSensorCCReport]
                                      type:  Illuminance
                                      scale: Percentage value
                                      value: 0

You can turn devices on and off and watch the communication in the logs.
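If you save a chunk of that log to a file, a small script can pull the node and command class out of each inbound SERIAL line, which makes it easier to spot which nodes are talking and which have gone quiet. This is only a sketch: the byte offsets are my assumption, inferred from the frames above lining up with the CNTRLR lines (0x24 is node 36, 0x25 is Binary Switch, 0x31 is Multilevel Sensor), and it only handles the application-command frames shown in this log.

```python
import re
from typing import Optional

# Matches inbound lines such as:
# 2022-08-23T17:48:42.481Z SERIAL « 0x0109...f3      (11 bytes)
SERIAL_LINE = re.compile(r"^(\S+) SERIAL « 0x([0-9a-fA-F]+)\s+\((\d+) bytes\)")

# Only the two command classes that appear in the log above.
COMMAND_CLASSES = {0x25: "Binary Switch", 0x31: "Multilevel Sensor"}


def checksum_ok(frame: bytes) -> bool:
    """XOR 0xFF with every byte between the start byte and the checksum byte."""
    acc = 0xFF
    for byte in frame[1:-1]:
        acc ^= byte
    return acc == frame[-1]


def decode_line(line: str) -> Optional[dict]:
    """Decode one inbound SERIAL log line, or return None if it does not fit."""
    match = SERIAL_LINE.match(line)
    if not match:
        return None  # ACK lines, CNTRLR lines, DRIVER lines, etc.
    frame = bytes.fromhex(match.group(2))
    if len(frame) != int(match.group(3)) or len(frame) < 9:
        return None
    # Assumed layout: [0]=start 0x01, [1]=length, [2]=type, [3]=function,
    # [4]=rx status, [5]=node ID, [6]=payload length, [7]=command class.
    if frame[0] != 0x01 or frame[3] != 0x04:
        return None
    return {
        "time": match.group(1),
        "node": frame[5],
        "command_class": COMMAND_CLASSES.get(frame[7], hex(frame[7])),
        "checksum_ok": checksum_ok(frame),
    }


sample = "2022-08-23T17:48:42.481Z SERIAL « 0x01090004002403250300f3      (11 bytes)"
print(decode_line(sample))
# node 36, Binary Switch, checksum_ok True
```

A frame that fails the checksum, or a node that stops appearing at all while others keep reporting, would be the thing to look for when a device shows as dead.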

I have the impression as well that Zwave reliability changed after one of the recent updates.

I just got a new Zwave update recently. The documentation was minimal. I tried it, and it doesn’t seem to have changed much in either direction.

The problem is becoming more troublesome as I’m now getting comments from my wife. WAF is at risk 😦

Good luck - that’s a tough one 😉

Is there a chance that the sd card is going bad on the pi? I know you say you recently switched to HA. It could be a case of infant mortality.

Edit: Have you looked here to see if this simple hardware fix would work for you? Is it possible the Gen7 has a similar issue?

I’ve had this exact problem for over a year. I have HA on a PC, so it’s not RPi related. I started with the Aeotec 5 controller, upgraded to the 5+, and then migrated to a Zooz 700 controller. The problem continues to pop up. I switched to a fresh install on a new PC. Same issue…. I’m pulling my hair out.

Any solutions people have would be appreciated.

Just my observations, having a relatively large Z-Wave network… (102 devices currently)

Solid until a Z-Wave JS or Z-Wave JS UI update comes. Then trying to do the restart add-on dance (stop UI, JS starts, etc.) never works. Easiest to just reboot the host after an update.

For reference, I also have this stick and it works perfectly fine, though I have it plugged into a USB 2.0 port on a Home Assistant Blue - I had issues with it on the USB 3.0 port (which it didn’t need anyway). I can do OTA updates from Z-Wave JS UI. It’s a good stick.

  • 30% of my devices do not have security.
  • 5% of my devices are S0: Legacy
  • remainder are S2 Authenticated

So I do not think any mix of security is going to change the performance of your network.

Other things to note:
Go through ALL of your devices and ensure, in Z-Wave JS UI, that each device joined the network properly, i.e.:

  • Beaming is Checked
  • Wave Version reports a 1 or 2
  • Security enrollment is the maximum the device is capable of
  • Latest OTA firmware for your devices is applied

When I have had issues, it was after I joined a new device and it didn’t join properly, so ‘Beaming’ was negative for that device. Should a route ever go through that device, other devices may not respond. Exclude the device and re-include it, and ensure it joined with Beaming. I overlooked this a few times when adding multiple devices at once. All nodes should beam, including battery nodes. Repeat the exclude/include dance until all nodes are beaming.
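With a large network, eyeballing every row gets tedious, so the checklist above can be sketched as a small filter. This is purely illustrative: the `NodeInfo` record and its field names are hypothetical, not an actual Z-Wave JS UI export format, and I am reading the "Wave Version" item as the Z-Wave Plus version column.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NodeInfo:
    """Hypothetical per-node record, filled in by hand or from your own export."""
    node_id: int
    beaming: bool
    zwave_plus_version: Optional[int]  # assumed meaning of "Wave Version"
    security: str                      # what the node actually joined with
    max_security: str                  # the best the device is capable of
    firmware_up_to_date: bool


def join_problems(node: NodeInfo) -> List[str]:
    """List which checklist items a node fails."""
    problems = []
    if not node.beaming:
        problems.append("beaming not reported")
    if node.zwave_plus_version not in (1, 2):
        problems.append("version is not 1 or 2")
    if node.security != node.max_security:
        problems.append("security below what the device supports")
    if not node.firmware_up_to_date:
        problems.append("firmware update pending")
    return problems


def nodes_to_reinclude(nodes: List[NodeInfo]) -> Dict[int, List[str]]:
    """Map node ID to its problems, skipping nodes that pass every check."""
    report = {}
    for node in nodes:
        problems = join_problems(node)
        if problems:
            report[node.node_id] = problems
    return report


nodes = [
    NodeInfo(36, True, 1, "S2 Authenticated", "S2 Authenticated", True),
    NodeInfo(77, False, 2, "None", "S2 Authenticated", True),
]
print(nodes_to_reinclude(nodes))
```

Anything that shows up in the report is a candidate for the exclude/include dance.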

700 sticks just aren’t there yet. I’ve tried the Zooz 700 S2, and Aeotec 7, but had problems with response time with just a handful of nodes, and went back to a 500 stick.

Thanks for the tips. I’ll check the beaming to see if that’s the issue.

This type of issue affects 100% of people with the Zooz stick and Home Assistant. There’s nothing you can do as an admin/user to eliminate the issue.

As it stands, there are a lot of threads about this and people have tried all manner of band-aids, including auto-ping scripts to bring dead nodes back online.

I’ve yet to find reports that point to Z-Wave 700 sticks (as is commonly mentioned here) in forums for other automation platforms. IMO, it’s still up in the air whether this is 700-series firmware bugs or HA-specific.

Any update on this? It is getting to the point where I lose 2-3 devices a week. I had one go unavailable 4 minutes ago; an automation ran on it 10 minutes ago. I’ve been stable for a long time, then about two months ago things started going wrong. The only devices I added were some Zooz ZSE41 (window/door sensors). I had 4 of them connected for a long time, so I decided to get one for each window. One of the ZSE41s shows up as an “Unknown product 0xe001”. I’ve had to add a fan switch back 3 times.

Getting very frustrating.

I’m seeing the same issues. It seems to only happen when the network is under load (I’m executing a flow that issues a lot of commands), and thus far it seems to only happen to non-Zwave+ devices that are being commanded from the flow, not to the other devices (a mix of Zwave+ devices is being commanded by the same flow as well).

Throughout the day under low load, I don’t seem to lose devices.

Well, after a long period of time of fighting all things I could think of, my entire Zwave Network went completely dead about 2 weeks ago… at least I think it was. The USB controller was present and said it was communicating. As a test, I deleted all of my Zwave devices (because, well, nothing to lose at that point) and added a single device back. It was online and fairly stable for about a day or two before it decided to go belly up. I caved and bought a different USB controller stick - I grabbed an Aeotec 700 Series (convenience from the large, orange, smiling internet store) and installed it. At the same time, I moved from Zwave JS to Zwave JS UI. Added a couple of devices, and everything seemed to be running smoothly. Over the weekend, I added the remaining ~40 devices and was pleased that everything was running smoothly by Sunday (Mar 19, 2023) evening. On Monday the 20th, I reviewed all my automations and dashboards and got everything back to operational.
Monday night, a bit after 9:00 PM CDT, ~80% of my Zwave devices went offline - Nearly all at the same time. Here’s a sample of the Zwave devices from the historical report at the time:

This morning I moved the new USB controller onto a longer USB extension cord (going from ~1’ to 10’) to get it further away from the Pi (and the Zigbee controller), and I am currently healing the network. A number of the devices appear to be coming back online… but it still makes me wonder why it decided to tank out the way that it did.

FWIW, I had 2 devices (TZ67 socket and an Aeotec switch) which kept falling off my Aeotec 7 ZWave network, but would instantly come back if I interviewed them; I set up a schedule to ping them every 30 minutes and they have never fallen off since. Not sure what the issue was - the Aeotec switch is in the ceiling above the stick!!

I have a Zooz 800 USB stick, and all but two (both Kwikset Locks) of my devices are Zooz devices. I constantly have to deal with “dead” devices that I have to ping multiple times to get back up, or I have to physically turn the switches on and off to get them to ping. This has been going on for well over a year and it’s driving me insane. I have tried everything, including replacing devices; however, every single one does it from time to time. The battery powered devices NEVER have issues. It is only the hardwired ones. I have a total of 20 hardwired devices and 4 battery powered.

Battery ones aren’t expected to talk all that much. It would take far longer for them to show dead.

This has been a problem ever since the switch to Zwave JS. It’s something in their base level code. Not a week goes by that I don’t have to fix 2-3 devices.

Aeotec 500 stick.
Jasco/GE and Zooz switches, Dimmers, Light On/Off. Fan Controllers.

The only commonalities are the stick (since everything talks through it) and Zwave JS. It was rock solid stable, though not as well optioned, on the previous zwave control software package. Finding such a sporadic bug, though, is exceptionally difficult since it doesn’t have a specific device type or manufacturer that shows the problem.

I’m going to have to do as someone above did: add a Ping command on a node that goes unavailable. It’s annoying to have to create a workaround for it, but it seems to be the only option currently.