Is ZWA-2 Flaky?

Has anyone noticed flakiness in their Z-Wave network that wasn’t there prior to upgrading to a ZWA-2 controller? I’ve noticed quite a bit of weird behavior in some of my devices that I never saw when using the Aeotec Z-Stick 5+ (which I’ve used since 2020). For example, I’ve had the ZWA-2 installed for about 2 weeks now, and twice in that time my Ecolink garage door tilt sensor has not registered a close action (still showing OPEN) on my dashboard. Battery is at 100%, and I never saw this behavior with the Aeotec.

Also, just now, when one of those tilt sensor issues happened, two of my motion sensors in the garage also showed motion and never cleared, which is really odd.

Most things seem to be working OK, but I do think there are some issues with the ZWA-2, especially in robustness. I did immediately notice what appeared to be faster routes to many of my devices (based upon link speeds and number of hops), especially powered ones following the switch. Some of the routes were odd, but it was pointed out by the Z-Wave JS team that they are working it and some of it is due to the routing function in the chips being a “black box” algorithm. I tried making a few priority routes to some devices where the routes seemed roundabout, so perhaps that may be some of the issue, IDK. I think I’ll go delete all the priority routes I created and see if that helps.

But I would like to know if anyone else is seeing any of this behavior with the ZWA-2, or has heard of any issues.

Hi,

After upgrading from a hardware-fixed Aeotec Z-Stick 5 to the ZWA-2, the Z-Wave network has been rock-solid; however, the bindings of a couple of devices changed.

This may have been a side-effect of the controller migration or ZWA version, but I doubt it is the controller itself.

For example, a Fibaro FGMS001 stopped reporting motion. Well, what really happened is the previously used binary sensor wasn’t in a later version of the device definition, so I had to locate the new binary sensor for motion, and rename my automations. This was pretty obvious due to the lack of status change in the history, but still a little unexpected.

On the device page, Recreate entity IDs helped in some cases, not in others.

If this helps, :heart: this post!

That’s good to hear. However, just today, after updating HA to 2025.11.2, my door chime wasn’t working, and when I checked the logs I saw a couple of the following (which, actually, I have seen before since adding the ZWA-2):

Activate Door Chime (ATM): Error executing script. Error for call_service at pos 1: Unable to set value 85-121-0-toneId: zwave_error: Z-Wave error 200 - Timeout while waiting for a callback from the controller (ZW0200)

Error while executing automation automation.activate_door_chime_atm: Unable to set value 85-121-0-toneId: zwave_error: Z-Wave error 200 - Timeout while waiting for a callback from the controller (ZW0200)

Not sure what it is telling me, other than something is broken and not working as it used to, or as expected. Maybe somebody knows.
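For anyone else trying to read that error, here is a minimal sketch of how the value ID in the message can be picked apart. It assumes the string follows a `node-commandClass-endpoint-property` layout, which is what `85-121-0-toneId` looks like; the `ValueId` tuple and `parse_value_id` helper are hypothetical names made up for this illustration, not part of Home Assistant or Z-Wave JS.

```python
from typing import NamedTuple, Optional


class ValueId(NamedTuple):
    """Hypothetical container for the pieces of a value ID string."""
    node_id: int
    command_class: int
    endpoint: int
    property_name: str
    property_key: Optional[str]


def parse_value_id(value_id: str) -> ValueId:
    # Assumed layout: node-commandClass-endpoint-property[-propertyKey]
    parts = value_id.split("-", 4)
    if len(parts) < 4:
        raise ValueError(f"unexpected value ID format: {value_id!r}")
    node_id, command_class, endpoint = (int(p) for p in parts[:3])
    property_key = parts[4] if len(parts) == 5 else None
    return ValueId(node_id, command_class, endpoint, parts[3], property_key)


parsed = parse_value_id("85-121-0-toneId")
print(parsed)
print(hex(parsed.command_class))  # command classes are usually listed in hex
```

Read that way, the failing call is a write to the `toneId` property on node 85, endpoint 0, command class 121 (0x79, which should be the Sound Switch command class, though that is worth double-checking). The timeout itself says the controller never confirmed the transmission, so the message points at the controller side rather than at the chime's configuration.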

Thanks for your response and report.

I too am having issues. I went from an old z-wave.me USB controller that has been doing a solid job. I think my expectations were too high for the new ZWA-2 device with its purportedly better reach. In my case, the farthest devices no longer have such a solid connection and often die. Possibly because the ZWA-2 device indeed has longer reach and the devices try to connect directly rather than via a better transit node? Hard to tell, but a better reach changes the landscape and may require relocating of some devices? But weird that not everything gets better? It’s just decoration lights in an apple tree in my case, so not critical.

My guess is that your garage is at the range limit? I would check the RSSI values and how the garage nodes are trying to connect.

Could be, but all my devices worked fine with the Z-Stick 5+, even with a (supposedly) shorter range. It could be what you are saying in that now the mesh is “shorter” in that intermediate nodes have been eliminated with the ZWA-2. But that seems odd that it would choose such a route that makes things less stable. I should have recorded the RSSI with my Z-Stick prior to conversion for the most distant runs to compare them after, but I was assuming this controller would be SO MUCH better that I’d have no problems.

What would happen if I swapped controllers again? Would HA see it and use it effectively? Or would it screw things up royally? I have added a few new devices since I changed over to the ZWA-2, so those would definitely be missing. Any thoughts on what would happen?

Before going back, I’d try rebuilding routes one at a time.

a) for each line powered device, rebuild the routes one at a time, starting with the ones closest to the controller
b) for the battery operated devices that have been problematic, rebuild the routes one at a time (you’ll need to wake-up the device for it to rebuild)
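The ordering in (a) and (b) can be sketched as a simple sort. This is only an illustration: the `Node` record, the distance estimates, and the sample devices are placeholders made up for the example, and the rebuild itself still has to be triggered per node in the UI.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    mains_powered: bool
    distance_m: float          # rough guess of distance to the controller
    problematic: bool = False  # only relevant for battery devices


def rebuild_order(nodes: List[Node]) -> List[Node]:
    # (a) line-powered devices first, closest to the controller first
    powered = sorted((n for n in nodes if n.mains_powered),
                     key=lambda n: n.distance_m)
    # (b) then only the battery devices that have been causing trouble
    battery = sorted((n for n in nodes if not n.mains_powered and n.problematic),
                     key=lambda n: n.distance_m)
    return powered + battery


# Placeholder devices, not taken from anyone's real network.
nodes = [
    Node("garage tilt sensor", False, 15.0, problematic=True),
    Node("hall wall switch", True, 3.0),
    Node("garage plug", True, 12.0),
    Node("bedroom motion sensor", False, 6.0),
]

for step, node in enumerate(rebuild_order(nodes), start=1):
    print(step, node.name)
```

Battery devices that have been behaving are left out on purpose, since each one has to be woken up manually for its rebuild to run.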

Theory is that a different stick in a slightly different spot will have different RF characteristics.

Also look at the health of the network using the zwave diagnostic counters (timeout, tx/rx failures)

Please post an update on whether this works or not. I’m on an Aeotec Gen 5+ stick. With all the problems with Gen 7 and Gen 8 sticks over the last few years, I haven’t moved. I’m strongly considering migrating to a ZWA-2 now that the Gen 8 firmware seems more stable; however, reading your reports, I’m not sure it is stable.

Ultimately when adding zwa-2 you must do this.

All radio characteristics have changed. It may eventually do this over time but really it’s best to rebuild one by one

Thanks. I did choose to rebuild routes automatically after the upgrade as suggested, but not one at a time, initially. After I started having issues, I did go back and, looking at the routing map in the zwjsui, built fixed routes that made more sense than the ones being used, trying to utilize the closest intermediate powered node, like an in-wall light switch (which are all over my house, in addition to multiple powered plug-in devices, like the ZEN15 and two Leviton plug switches which my routing seems to LOVE for some reason).

That didn’t seem to help a lot, even though I did it for several of my battery powered devices as well, going to the trouble to continually keep them awake long enough to finish rerouting. I also, during that time, rebuilt troublesome routes individually, but sometimes saw no change (but I’m not really sure how quickly the maps are updated in the UI).

Unfortunately, I didn’t take a snapshot of the controller stats before upgrading, since I assumed the ZWA-2 would be so much better, based on the advertising. My experience has not been even a little better over the Z-Stick 5+, and in fact some days I regret making the switch. I’m hoping that there is something a FW update can fix, but at this point I’m not real happy about it.

My controller stats from the zwjsui show the following:

Total Commands:
TX: 19671
Dropped TX: 0
RX: 284584
Dropped RX: 553

Messages:
TX: 32784
Dropped TX: 0
RX: 72928
Dropped RX: 208

Maybe this is normal. I’m trying hard to remember what these numbers were like with the Z-Stick 5+, but with no luck. I suggest anyone converting take note of these beforehand for comparison purposes.
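To put those counters in proportion, here is a quick worked calculation of the dropped-RX share. It assumes the dropped count can simply be compared against the RX counter shown next to it; whether the RX figure already includes dropped frames is not something the UI output above makes clear.

```python
def drop_rate_percent(received: int, dropped: int) -> float:
    """Dropped frames as a percentage of the RX counter."""
    return 100.0 * dropped / received


# Figures copied from the controller stats posted above.
commands_rx_drop = drop_rate_percent(received=284584, dropped=553)
messages_rx_drop = drop_rate_percent(received=72928, dropped=208)

print(f"Commands dropped RX: {commands_rx_drop:.2f}%")
print(f"Messages dropped RX: {messages_rx_drop:.2f}%")
```

That works out to roughly 0.19% of commands and 0.29% of messages dropped on receive, with nothing dropped on transmit. Without a baseline from the old stick, those percentages cannot say whether things got worse.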

At any rate, I’ll make one last attempt at rebuilding at least the powered routes as both you and @tmjpugh suggested. But at this point in time, after over 5 years of experience with the Z-Stick 5+, I can still confidently state that with this new ZWA-2, my Z-Wave network is neither as reliable nor as robust as it was with the Z-Stick 5+. YMMV.

EDIT:
What’s interesting to me is that when I look at the Network Graph, as I reported earlier here and elsewhere, I see more direct routes and fewer hops than I had with the Z-Stick 5+, which I attribute to the power/range advantage of the ZWA-2. So, maybe the issue is NOT with the radio, but with the comms FW. Here’s the latest graph:

Previously, there were many more red lines (slower links) and many more 2, 3, and even 4 hop, routes. Also, I have only a couple of new LR devices, and chose NOT to include them as LR, but as mesh. So, comparing this NW Graph to my previous one with the Z-Stick 5+, by all indications I should have a faster, more robust network. But I don’t.

Those stats do not look great. Here are mine from Gen 5 stick.

You have a lot of nodes. In my study of Gen 7 / Gen 8 issues, there seems to be a challenge when dealing with very busy networks. There are many examples of networks that were fine on Gen 5 and then had issues on Gen 7 / Gen 8. Take a look at the RX stats to identify the nodes that are chatty and see if they can be reduced via config. Also, look at the failure stats for each node to see if it is specific nodes. Check out this link for a script
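As a rough illustration of that kind of triage (this is not the linked script), the sketch below ranks nodes by how much they send and by how often transmissions to them fail. The field names `commandsTX`, `commandsRX`, `commandsDroppedTX` and `timeoutResponse` are assumed to match what the per-node statistics expose, and the numbers are placeholders, so treat both as assumptions to check against your own export.

```python
from typing import Dict, List, Tuple

NodeStats = Dict[str, int]


def failure_ratio(stats: NodeStats) -> float:
    """Share of outgoing commands that were dropped or timed out."""
    sent = stats["commandsTX"]
    if sent == 0:
        return 0.0
    return (stats["commandsDroppedTX"] + stats["timeoutResponse"]) / sent


def rank_nodes(all_stats: Dict[int, NodeStats], top: int = 3) -> Tuple[List[int], List[int]]:
    # Chattiest nodes: highest number of commands received from them.
    chatty = sorted(all_stats, key=lambda n: all_stats[n]["commandsRX"], reverse=True)
    # Least reliable nodes: highest failure ratio on commands sent to them.
    failing = sorted(all_stats, key=lambda n: failure_ratio(all_stats[n]), reverse=True)
    return chatty[:top], failing[:top]


# Placeholder numbers keyed by node ID, for illustration only.
sample = {
    2: {"commandsTX": 100, "commandsRX": 5000, "commandsDroppedTX": 0, "timeoutResponse": 1},
    3: {"commandsTX": 200, "commandsRX": 300, "commandsDroppedTX": 20, "timeoutResponse": 10},
    4: {"commandsTX": 50, "commandsRX": 1200, "commandsDroppedTX": 0, "timeoutResponse": 0},
}

chatty, failing = rank_nodes(sample, top=2)
print("Chattiest nodes:", chatty)
print("Least reliable nodes:", failing)
```

If the chattiest and the least reliable lists point at the same few nodes, that suggests a device-specific problem rather than a controller-wide one.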

Thanks for the response. I have always, from the beginning, eliminated any and all unnecessary traffic by disabling reporting for devices in the configuration when I add the device to my network. I don’t think it is a “too much traffic” issue, tbh. For example, I restarted the UI about 30 minutes ago and current stats are Total Commands: TX-95, Drop-0, RX-353, Drop-7. Zero message dropped so far. However, I’ve restarted ZWJS UI a couple of times in the last couple hours and each time, several of my battery devices are showing up as “Unavailable”, seemingly randomly each time. I’ve even tried relocating the ZWA-2 a bit to see if that helps. Not really.

The logs for one of the devices reports the following:

The node did not respond after 1 attempts, it is presumed dead.
The node is dead.
ping failed: The node did not acknowledge the command (ZW0204)

So, ZW pings a battery device ONCE and, if it doesn’t respond, considers it DEAD?! That is what the logs are saying each time one of these devices shows up “unavailable” after a UI restart. Granted, the devices doing this are the farthest from the controller, but this rarely happened with the 5+. I was under the impression that the ZWA-2 was optimized and more powerful than the 5+ USB sticks, but apparently that is not the case. Maybe it’s optimized for LR only and is less capable for mesh, idk, but I’m now seriously considering reverting back to the 5+ for now. I’ll have to re-add the new devices I added after converting. However, I never heard back from anyone as to whether there are any issues with “reverting” back to the 5+, or what the correct procedure would be.

Any advice from anyone? I’ll be traveling soon and really need my stability back. Having my garage door randomly show as OPEN in 10 deg F weather while I’m 5,000 miles away, like it did the other day, wouldn’t be good.

It may not be of much comfort or help, but I migrated from a Gen 5+ to the ZWA-2 and indeed have a stable mesh which is much more responsive. I do not turn off any reporting. I did rebuild routes when I got it. That took a long time; before that, I had many devices that were really slow to respond.

I do include devices without security because very few of my devices support S2 security and I avoid S0 security like the plague. I also do not have locks and such requiring better security. So my question would be: do any of your devices use S0 security? If so, re-include them with anything other than S0, because S0 security is known for problems like these.

That’s interesting. Your network looks a lot like mine. Maybe I need to manually rebuild ALL routes as suggested. How did you go about that? First powered, closest proximity to furthest, then battery?

To answer your question on security, most of my devices have no security (a red “-”) under the Security column in the UI. A handful have a combination of S2 Authenticated and S2 Unauthenticated, everything is “Beaming”, and most of them are Z-Wave+ V1, with a handful that are V2, and 3 that are not Z-Wave+, if that matters here.

I used the big bang: rebuild all routes. I woke up battery powered devices. It will lead to times where the whole network goes down for quite some time, but when all is done it all comes back to life. At the end I looked at the graph and manually did a rebuild routes on single nodes that seemed to have not so great connections.

That was exactly the approach I took: followed the install advice, did the “rebuild all routes”, slow and painful, especially for the battery-powered ones, and then looked at the graph. That’s when I saw quite a few routes that made no sense: routes that went from the middle of the house to one end before bouncing back to the opposite end, far worse than a direct route. I then, unlike you, created priority routes for those devices, even routing some through wired wall switches, spreading the load where necessary. But that seemed to make no difference, or even made it worse. I may try what I said, and do them one at a time, ordered by proximity, and see if that helps.

If not, I may try to revert, but suspect that will totally hose everything, i.e., HA having/showing devices and entities that no longer exist on the network, and then adding effectively duplicates (for the 5+) of the devices I added after the upgrade. :-/

I did not set manual priority routes; I just did a rebuild route on individual devices to make sure they picked the right one once most of the mesh had settled. IMHO you cannot assume that a route that seems to make sense to you actually makes sense unless that is backed up by link quality data. Is it possible that you set priority routes through devices that do not work well?

That’s what I’m doing now. I agree with you on which route makes sense, but some were just too bad, based upon distance, walls, floors, etc., to be logical. As I (think I) said previously on this thread, the ZW JS guys indicated that the routing mechanism is somewhat of a “black box” that they don’t have access to, and they are either using it as is or have reverse engineered it, I don’t know which. They admitted sometimes routes weren’t the best. But you are right, the only way to really know is to look at signal strength. I’ll let you know how the rebuild goes and if things get better. I also removed a couple of powered devices from the network that I had added but left unplugged (since I wasn’t using them). I read today that sometimes this can cause routing issues if the controller isn’t smart enough to not try to route through dead nodes. Cheers.

Oh, also, I keep getting a notice (top right Controller status block in UI flashes red and says) that “Controller is Unable to Transmit”. It only lasts a second or two, but wonder why that would be. Can’t say I ever noticed it with the 5+.

You may want to review this issue to see if it applies to your system. Does the controller ever report jammed?

This is not low, it is an RX every 5 seconds.
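The “every 5 seconds” figure can be checked from the numbers quoted earlier in the thread (353 RX and 7 dropped RX about 30 minutes after a restart). The 30 minutes is the poster's own estimate, so the result is only approximate.

```python
# Figures quoted earlier in the thread; the uptime is an approximation.
uptime_seconds = 30 * 60
rx_count = 353
rx_dropped = 7

seconds_per_rx = uptime_seconds / rx_count
dropped_percent = 100.0 * rx_dropped / rx_count

print(f"One RX roughly every {seconds_per_rx:.1f} s")
print(f"Dropped RX: {dropped_percent:.1f}%")
```

That gives about one received command every 5.1 seconds, with close to 2% of them dropped in that window. The sample is too small to read much into the drop share.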

I have not noticed that particular message in the box at top right, just the “unable to transmit” message, which doesn’t really tell me much on the surface. It could be because it is jammed. I’ll check the link. Thanks.

After the rebuild, I seem to have more red connections than I did before, and many nodes with multiple links, which I didn’t have before. Like, for example, what is this graph telling me?

This particular device is on a floor below the controller, and a floor above two of the three relay devices. The Ecolink chime is about midway to the Fibaro, which is on the same floor, yet it chose to go through the floor twice to reach it instead. Why? There are multiple wall switches and plug-in devices on the same floor with a more direct, much shorter route than any of those three listed. Maybe this is the best it could figure out with a battery device that is only awake every 2 hours for 5 seconds. It appears most battery devices don’t stay awake in the middle of a “process”, like rebuilding routes or anything else that might last longer than the programmed “wake” cycle time. That means constantly reawakening them when doing these things, which seems like a network design flaw to me.