Hello! I am looking for advice on how to troubleshoot what is wrong with my Z-Wave implementation.
I started investing in Z-Wave because the inside of my home is inundated with 2.4 GHz noise from the surrounding neighborhood, and I’ve struggled to get more than 10-30 Mbps out of the band. The main reason I decided to try Z-Wave is that I have a Ring security system with no repeaters and components that are quite far from the base through multiple walls, and it performs phenomenally with rarely a shred of lag.
It’s a small start, but I have a ZWA-2 and 9 connected devices running on Z-Wave JS, all Shelly and Zooz, except for an Ultraloq and a Philio multi-sensor. It’s absolutely abysmal. There is a ton of lag and erratic behavior. Two of my Shelly devices have firmware updates that I can’t get pushed to the devices. For the sake of troubleshooting, I’ve put my HA Green with the ZWA-2 connected right in the open in a room in the middle of my home, and it’s had little to no effect on the performance. I’m at a loss as to how to investigate what exactly is going on, and I’m kind of at my wit’s end with the protocol. Before throwing several hundred worth of tech in the trash, though, I’m hoping there is something that I’m not understanding about how to interpret what’s in the logs or otherwise identify what could make this perform so incredibly badly.
I definitely don’t understand why the meshing behaves like it does. One of the devices is a Shelly Wave Plug US, which I have two of in my laundry room for the purpose of monitoring when the washer and dryer are finished running. They are far from the ZWA-2 and have a weak signal, but everything tries to connect through the washer plug, including a Zooz switch that is about 10 ft from the ZWA in the current placement. The plugs have -90 to -100 RSSI no matter where I place the hub, and everything still insists on connecting through them, even when I move the hub and rebuild the routes. I’m absolutely baffled by how this works.
Does anyone have any pointers on how to unscrew this mess?
A Z-Wave mesh is, by design, slow to recover from moved or broken nodes. If you do nothing, network recovery could take days, because nodes only update routes when they communicate with the controller. You might try re-interviewing the nodes and healing (rebuilding routes for) them.
If you move the Z-Wave controller, everything that was working may no longer work. The network conforms to the device locations.
A good Z-Wave network needs a few AC-powered devices. Battery devices don’t act as mesh nodes. If they are close to the controller this is not an issue, but if the devices are spread out you may need an AC-powered device or a dedicated repeater.
If you check the device info for a device in HA, there are 3 dots next to “Configure” that show the device statistics. Lots of failed TX/RX or repeated transmissions for a device will cause network-wide issues. Check your devices for this. “Rebuild route” on the individual device may improve the connection, but only do it if a problem is found.
Z-Wave JS UI shows the same statistics and offers the same actions as the settings in HA.
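As a rough illustration of what to look for in those statistics, here is a minimal Python sketch that flags nodes with a high share of dropped outgoing commands. The field names loosely mirror the TX/RX counters shown in the statistics dialog, but the `NodeStats` layout, the helper names, the sample numbers, and the 5% threshold are all assumptions for illustration. You would copy the real counters by hand from each device.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Counters copied by hand from a device's statistics dialog (hypothetical layout)."""
    node_id: int
    commands_tx: int          # assumed: outgoing commands that were sent successfully
    commands_dropped_tx: int  # assumed: outgoing commands that were dropped
    commands_rx: int          # assumed: incoming commands that were received
    commands_dropped_rx: int  # assumed: incoming commands that were dropped


def tx_failure_rate(stats: NodeStats) -> float:
    """Share of outgoing commands that were dropped (0.0 if nothing was sent)."""
    attempts = stats.commands_tx + stats.commands_dropped_tx
    if attempts == 0:
        return 0.0
    return stats.commands_dropped_tx / attempts


def flag_problem_nodes(all_stats, threshold=0.05):
    """Return (node_id, failure_rate) pairs above the threshold, worst first."""
    flagged = [
        (s.node_id, tx_failure_rate(s))
        for s in all_stats
        if tx_failure_rate(s) > threshold
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up numbers, purely to show the output format.
    sample = [
        NodeStats(3, commands_tx=90, commands_dropped_tx=10,
                  commands_rx=80, commands_dropped_rx=2),
        NodeStats(7, commands_tx=100, commands_dropped_tx=0,
                  commands_rx=100, commands_dropped_rx=0),
    ]
    for node_id, rate in flag_problem_nodes(sample):
        print(f"Node {node_id:03d}: {rate:.0%} of outgoing commands dropped")
```

Nodes that show up in this list are the ones worth a closer look before rebuilding any routes.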
The only two battery-powered devices are the Ultraloq deadbolt and the Philio multi-sensor, the latter of which shows in a sleeping state 99 percent of the time, as it only wakes for a fraction of a second and transmits data when a sensor is triggered. All of the other devices are AC powered: two Shelly Wave Plugs, two Wave 2PMs, one Wave 1PM Mini, and two Zooz ZEN71 switches. I also have another 1PM Mini and a Zooz ZEN77 that I haven’t even installed yet, since I haven’t been able to stabilize the network.
Another thing that I’m not understanding is the logs. When I go into the new UI at Settings > Z-Wave, there is a Logs tab at the top that indicates it is for Z-Wave JS logs, but it’s always empty, while if I go to System > Logs and select Z-Wave JS from the dropdown, there is a lot of activity. Why is that? Are these not supposed to be the same logs?
At any rate, I physically unplugged the one Wave Plug that the other nodes were trying to route through, and they’ve now switched over to the other Wave Plug. The plug that I removed was creating a lot of these messages in the log:
CNTRLR [Node 005] failed to decode the message, retrying with SPAN extension…
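To see which nodes these messages come from and how often, a small script can tally them from a saved copy of the log. This is only a sketch: it assumes the log is available as plain text and that every such line contains the `[Node NNN] failed to decode the message` pattern quoted above. The function name and the sample lines are made up for illustration. It counts occurrences only and says nothing about what causes them.

```python
import re
from collections import Counter

# Matches the node number in lines like:
# CNTRLR [Node 005] failed to decode the message, retrying with SPAN extension…
DECODE_FAILURE = re.compile(r"\[Node (\d+)\] failed to decode the message")


def count_decode_failures(log_text: str) -> Counter:
    """Count 'failed to decode the message' lines per node ID."""
    return Counter(int(m.group(1)) for m in DECODE_FAILURE.finditer(log_text))


if __name__ == "__main__":
    # Illustrative sample; paste the real log text here instead.
    sample_log = "\n".join([
        "CNTRLR [Node 005] failed to decode the message, retrying with SPAN extension…",
        "CNTRLR [Node 005] failed to decode the message, retrying with SPAN extension…",
        "CNTRLR [Node 003] failed to decode the message, retrying with SPAN extension…",
    ])
    for node_id, count in count_decode_failures(sample_log).most_common():
        print(f"Node {node_id:03d}: {count} decode failures")
```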
It sounds like you may have rushed or done things in the wrong order. When I set up new Z-Wave networks I follow these steps, and the result is always rock solid:
1. Never move the Z-Wave device. Always include the device in the place where it will be mounted. If the controller has to be closer to the device during inclusion, move the controller instead of the device.
2. When adding devices, add them in a circle from the controller outward.
3. Add battery devices last.
4. Make sure each device is 100% included before moving on to the next one.
5. Z-Wave is really sensitive to metal and electrical transformers. Make sure no metal, such as clothes hangers, is between the device and the controller. If your electrical box is metal or the outlets have a metal frame, make sure the device's antenna is not behind the metal.
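The ordering in steps 2 and 3 above can be sketched as a simple sort: mains-powered devices first, nearest to the controller first, then battery devices. The `Device` type, the function name, and the distances below are hypothetical and only illustrate the idea. Real distances would come from your own floor plan.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    distance_ft: float  # rough distance from the controller (made-up values below)
    battery: bool


def inclusion_order(devices):
    """Mains-powered devices nearest-first, then battery devices nearest-first."""
    # False sorts before True, so AC-powered devices come ahead of battery ones.
    return sorted(devices, key=lambda d: (d.battery, d.distance_ft))


if __name__ == "__main__":
    devices = [
        Device("Philio multi-sensor", 15, battery=True),
        Device("Shelly Wave Plug (washer)", 40, battery=False),
        Device("Zooz ZEN71 switch", 10, battery=False),
        Device("Ultraloq deadbolt", 25, battery=True),
    ]
    for step, device in enumerate(inclusion_order(devices), start=1):
        print(f"{step}. {device.name}")
```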
Since I removed the one plug, all of the devices have switched to meshing through the other plug, which I still don’t understand, since the plugs’ RSSIs are not good. Now I’m getting “[Node 003] failed to decode the message, retrying with SPAN extension…” messages in the log, Node 3 being the other plug. Until the other devices started to mesh through it, I had not seen any of these messages for that plug in the log. What exactly does this message mean? Is it normal to see this, or does it mean there is a problem?
And to better illustrate the scenario, one of the Zooz switches is in my master bedroom on the first floor. This room is against one exterior wall. On the opposite side of the floor plan, the laundry room is against the opposite exterior wall. The Zooz switch is connecting through a plug in the laundry room. To do that, the signal it broadcasts has to physically pass by the controller, which sits between those two nodes, in order to reach the plug. Physically, it’s simply not possible that it is more efficient for the traffic to and from the bedroom switch to pass through the plug in the laundry room, some tens of feet past the controller antenna. Neither device has a direct line of sight to the controller antenna, and the laundry room has more metal between it and the controller antenna, such as the appliances in the laundry room and kitchen.