Would Z-Wave Plus help my performance?

I found that polling was a disaster for my zwave network. I don’t poll at all, and I limit the frequency of push updates to prevent network congestion.

You can sometimes learn a lot by watching the open zwave log. If you see failures, those can inhibit performance greatly.

You can try using associations to group devices together. Use the Z-Wave panel in HA and its node group associations.

You only get the benefits of ZW+ if ALL the devices are ZW+. In a mixed environment you do not get the higher speed. You do get the longer range between ZW+ nodes which could be helpful.

@anderson110 how do you disable polling completely? I couldn’t see how in the HA docs. I agree with you on the polling too. I wonder if it should be off by default.

@zarthan do you know if you can associate a Z-Wave (non-Plus) device with a secure Z-Wave Plus device?

So it essentially downgrades the speed between all devices if any device in the entire mesh network is not ZWPlus? What about a command that goes straight from the controller to a Plus device, or makes a hop but only does so over Plus devices? That wouldn’t remain at the higher speed?

You can’t make a blanket statement that there is only a benefit to ZW+ if all your devices are Plus. It all depends on how the devices are laid out, which devices are neighbors, etc. I would not shy away from getting Plus devices because you already have a bunch of non-Plus devices. In your case, you would need a Gen 5 stick to get started.

@halfbaked I have no idea if associating with a secure device would work. Secure devices do participate in the mesh with non-secure devices so I don’t see why it couldn’t work. I would just give it a try.

A few issues come into play.

First is the controller. If you have a ZWP capable controller but have ANY ZW devices in the network the controller will run in ZW only mode. You will get no speed or protocol benefits in this situation. If all your devices are ZWP it will operate in ZWP mode.

Next are your nodes. In a mixed ZW/ZWP environment you will get ZW speed. Where you will see a benefit is that ZWP devices have a more powerful radio, which helps in larger or noisy environments: longer range and better performance through walls. As a result you get more mesh associations and can cover longer distances between nodes. In general your mesh gains redundancy, which is always a plus.

ZWP nodes also tend to be much better on battery life, as the newer chipset is more efficient and does some tricks to conserve power. You get that regardless of whether the network is mixed.

So as pointed out you should ONLY be buying ZWP nodes. The cost difference is really small and even in a mixed environment you get some good benefits with the longer range.

One way around this mixed network is to use two Zwave controllers; one dedicated and associated with just your ZWP devices, and the other to use on your legacy ZW network.

As for polling: on a ZWP network the devices are all new enough that they typically support status reporting, so you can set the polling interval really high. Mine is at 600 seconds, and those devices are happy.

On ZW you are not so lucky as many old devices do not support status reporting so you must poll them. Here the tuning is tricky. By having fewer devices polled you can increase the polling frequency as well without overloading the network.
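To make that trade-off concrete, here is a rough back-of-the-envelope sketch in Python. The device counts and intervals are made-up examples, not measurements from anyone's network, and `polls_per_minute` is just an illustrative helper name.

```python
def polls_per_minute(num_devices: int, interval_s: float) -> float:
    """Poll requests the controller must issue per minute.

    Each polled device is asked once per interval, so the load grows
    with the number of devices and shrinks as the interval gets longer.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return num_devices * 60.0 / interval_s


# Hypothetical numbers, purely for illustration.
all_devices = polls_per_minute(num_devices=30, interval_s=30)
legacy_only = polls_per_minute(num_devices=5, interval_s=10)

print(f"polling all 30 devices every 30 s: {all_devices:.0f} polls/min")
print(f"polling 5 legacy devices every 10 s: {legacy_only:.0f} polls/min")
```

In this made-up case, polling only the 5 legacy devices three times as often still produces half as many requests as polling everything.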

One thing to remember is that ZW and ZWP are SLOW networks: 20 kbps and 40 kbps. This is not a lot of throughput. It is very easy to bring a ZW network to its knees.

I didn’t do anything to disable it that I can remember. I think I just didn’t turn it on.

That is very interesting information that is new to me. How did you learn it? I am trying to learn about z-wave but the information available seems rather scattered, disorganized, and marketing oriented, or is written in spec-speak that doesn’t really bring out crucial pieces of information like this.

I would offer that even though 20 kbps and 40 kbps are both “slow”, I have my doubts that either is a likely bottleneck when people are experiencing performance problems. Rather, the latencies between sends and receives are probably the dominant issue. The messages seem quite short, so if there were no latencies the network should fly even at 20 kbps.
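A quick bit of arithmetic backs this up. The sketch below assumes a 20-byte message, which is a guess at a typical size rather than a figure from the spec, and it ignores preambles, acknowledgements and routing hops.

```python
def airtime_ms(frame_bytes: int, rate_kbps: float) -> float:
    """Time to transmit a frame, ignoring all protocol overhead.

    1 kbps is one bit per millisecond, so bits / kbps gives milliseconds.
    """
    if rate_kbps <= 0:
        raise ValueError("rate_kbps must be positive")
    return frame_bytes * 8 / rate_kbps


ASSUMED_FRAME_BYTES = 20  # assumption for illustration only

for rate in (20, 40):
    ms = airtime_ms(ASSUMED_FRAME_BYTES, rate)
    print(f"{ASSUMED_FRAME_BYTES} bytes at {rate} kbps: {ms:.1f} ms")
```

Under those assumptions a single message is on the air for a few milliseconds, which is far below the half-second-to-several-seconds delays people report.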

In looking at the open zwave library, I kind of wonder if there might be performance issues lurking in there. It seems the library was written by reverse engineering before the specs were made public. It’s possible there are optimizations which could be made now with better knowledge of the specifications. But it doesn’t seem like anybody is working on it.

Great explanation thank you. And yes, going forward, I am only buying Z-Wave Plus devices, but it would cost a couple thousand dollars to do them all at once and I just would rather not.

To separate the two, I’m thinking I might put my older ZW devices on SmartThings and bridge them to HA over MQTT, and put the ZWP devices on the HUSBZB-1 stick I have coming tomorrow. Then I can slowly sunset my reliance on SmartThings.

I did a little experiment last night to try to understand if the low-ish speed of my z-wave network (lights take generally about 1/2 sec to several seconds to respond to on or off commands) is more in the hardware or the software.

So I studied the z-wave protocol and came up with a way to directly feed the bytes of a z-wave command into my serial controller with home assistant off, no open z-wave, no software layers at all, just a direct feed of bytes into the controller.
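For readers wondering what such a byte sequence might look like, here is a sketch in Python. It is not the script used in this experiment. The frame layout (start byte, length, type, function ID, payload, XOR checksum) and every constant value are assumptions based on what the community has pieced together in projects like OpenZWave, not an official specification, and `basic_set_frame` is a hypothetical helper name.

```python
SOF = 0x01                  # start of frame (assumed)
REQUEST = 0x00              # frame type: request (assumed)
FUNC_SEND_DATA = 0x13       # "send data" function ID (assumed)
COMMAND_CLASS_BASIC = 0x20  # Basic command class (assumed)
BASIC_SET = 0x01            # Basic Set command (assumed)
TX_OPTIONS = 0x05           # ACK + auto-route transmit options (assumed)


def checksum(data: bytes) -> int:
    """XOR of all bytes, starting from 0xFF."""
    result = 0xFF
    for byte in data:
        result ^= byte
    return result


def basic_set_frame(node_id: int, value: int, callback_id: int = 0x01) -> bytes:
    """Build a frame asking the controller to send Basic Set to a node."""
    payload = bytes([COMMAND_CLASS_BASIC, BASIC_SET, value])
    body = (
        bytes([REQUEST, FUNC_SEND_DATA, node_id, len(payload)])
        + payload
        + bytes([TX_OPTIONS, callback_id])
    )
    length = len(body) + 1  # bytes after the length byte, checksum included
    covered = bytes([length]) + body
    return bytes([SOF]) + covered + bytes([checksum(covered)])


# Node 6 is a made-up node ID; 0xFF means "on" for Basic Set.
frame = basic_set_frame(node_id=6, value=0xFF)
print(frame.hex(" "))
```

Writing the resulting bytes to the controller's serial port would be the "direct feed" step. That part is left out here because it depends on the stick and port settings.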

And the response was instant on the light for the room I was working in (which is in the same room as the controller). No perceptible delay at all. I had to quit for the night at that point, but I am really interested now in repeating the experiment for the most remote nodes on my network, and for batching a bunch of light on/off commands together to see how quick the response is. It seems possible that somewhere between HA and openzwave, we are not achieving the full potential of our zwave devices.

More later.

Very interesting! Can’t wait to hear more.

You can see throughput issues when the polling interval is set far too low. You can also see them with some older-style dimmers that use Hail as a workaround for status updates.

With old dimmers that “Hail”, continually pushing their buttons can easily bring down a Z-Wave network. This is because Hail is treated as a priority and causes a broadcast storm. By flooding the network with hails you eat all the bandwidth.

But you are 100% correct - in most day to day situations on a healthy mesh it’s network latency that causes “slow” performance.

Things get more “interesting” when you start trying to do more than just one command - which by itself is instantaneous.

Just trying to issue an on and then an off 1 sec later doesn’t work - it seems like seconds need to pass before you can issue an off after issuing an on. There is probably some handshake in the protocol here that I’m not respecting, and it has to time out waiting for something before accepting a new command initiation. (I’m guessing here - I’m not exactly sure what’s going on yet.)

Trying to send two “turn light on” commands to different nodes in quick succession also doesn’t seem to work. The first node responds, the second one does not. (Again, this is probably because it’s waiting for more follow-up from the first command before accepting new command initiation. But I need to look more carefully at the protocol.)

I will dig a little deeper on the protocol and see if I can get to the bottom of this.

Ok, I got the basic protocol message sequence sorted. For every switching message you need to receive an ACK, then receive a response, then send an ACK for the response. Once you do that, the controller is ready to go for the next command without delay. Of course, it’s not quite that simple, as there can be other unsolicited messages coming in, or the command can fail to be acknowledged for reasons that are unclear to me so far. But if these complications don’t arise, that’s a sufficient sequence.
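The happy-path sequence described here could be sketched like this in Python. This is an illustration, not the actual script: `FakePort` is a hypothetical stand-in for a serial port so the example runs without hardware, and the control byte values (0x01 for start of frame, 0x06 for ACK) are assumptions taken from community reverse engineering.

```python
import io

SOF = 0x01  # start-of-frame marker (assumed value)
ACK = 0x06  # single-byte acknowledgement (assumed value)


class FakePort:
    """Replays canned incoming bytes and records everything written."""

    def __init__(self, incoming: bytes) -> None:
        self._incoming = io.BytesIO(incoming)
        self.written = bytearray()

    def read(self, size: int = 1) -> bytes:
        return self._incoming.read(size)

    def write(self, data: bytes) -> int:
        self.written += data
        return len(data)


def read_frame(port) -> bytes:
    """Read one framed message and acknowledge it."""
    header = port.read(2)  # start byte + length byte
    if len(header) < 2 or header[0] != SOF:
        raise IOError("expected start of frame")
    body = port.read(header[1])  # length covers the rest, checksum included
    if len(body) < header[1]:
        raise IOError("truncated frame")
    port.write(bytes([ACK]))  # step 3: ACK the response
    return header + body


def send_command(port, frame: bytes) -> bytes:
    """Send a command, wait for its ACK, then read and ACK the response."""
    port.write(frame)
    if port.read(1) != bytes([ACK]):  # step 1: controller ACKs the command
        raise IOError("command was not acknowledged")
    return read_frame(port)  # step 2: controller sends its response


# Canned exchange with illustrative bytes.
command = bytes.fromhex("010a001306032001ff050139")
response = bytes.fromhex("0104011301e8")
port = FakePort(bytes([ACK]) + response)
print(send_command(port, command).hex(" "))
```

A real port would also need read timeouts and a way to handle unsolicited frames arriving in between, which this sketch deliberately ignores.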

So I tried turning on 2 lights in sequence using this protocol sequence. It is fast. You can perceive the order they turn on, but not by much. But here’s the catch… it is lightning fast like this when the command sequence goes off without a hitch. When it doesn’t (which is pretty common), then I’m not sure what the optimal handling is, and how fast it would be. I see code in openzwave which says things like “wait around for 500 ms to see if we get what we’re expecting before giving up”. I suspect it is this kind of condition which causes the slow zwave performance we typically see.

I’m not giving up yet… I think there’s more to be learned here about how to structure robust communication that will be fast even in the presence of “errors”.

When you first start up OpenZwave/HA, there is a 5 to 10 minute period when it is obtaining capabilities of devices and that takes some time, so messages to turn devices on and off will be delayed.
I would say, tail the OpenZwave log when you are switching things on and off and see if messages are being sent (actually getting turned on and off) or queued (waiting to be sent). Check your node statistics for averageRequestRTT and averageResponseRTT to see how much time it is taking for your node to respond. Also check sentCNT and sentFailed for problem nodes to see if individual devices have an issue. You could have a device that is out of range that is causing problems.
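If you copy those statistics out by hand or from a dump, a small Python sketch like the one below can flag suspicious nodes. The field names are the ones mentioned here; the sample numbers, the thresholds and the `problem_nodes` helper are all made up for illustration, and how you actually export the statistics depends on your setup.

```python
from typing import Dict, List


def problem_nodes(
    stats: Dict[int, Dict[str, float]],
    max_fail_ratio: float = 0.05,  # arbitrary threshold
    max_rtt_ms: float = 500.0,     # arbitrary threshold
) -> List[int]:
    """Return node IDs with a high send failure ratio or a slow round trip."""
    flagged = []
    for node_id, s in sorted(stats.items()):
        sent = s.get("sentCNT", 0)
        failed = s.get("sentFailed", 0)
        fail_ratio = failed / sent if sent else 0.0
        worst_rtt = max(
            s.get("averageRequestRTT", 0), s.get("averageResponseRTT", 0)
        )
        if fail_ratio > max_fail_ratio or worst_rtt > max_rtt_ms:
            flagged.append(node_id)
    return flagged


# Hypothetical statistics for three nodes.
sample = {
    2: {"sentCNT": 200, "sentFailed": 1, "averageRequestRTT": 40, "averageResponseRTT": 60},
    7: {"sentCNT": 150, "sentFailed": 30, "averageRequestRTT": 45, "averageResponseRTT": 80},
    12: {"sentCNT": 90, "sentFailed": 0, "averageRequestRTT": 900, "averageResponseRTT": 1200},
}
print(problem_nodes(sample))
```

With this sample data, node 7 is flagged for its failure ratio and node 12 for its round-trip time.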

Someone mentioned that adding secure nodes will slow down OpenZwave. You can try re-adding devices that don’t need to be secure in a regular manner to see if it helps.

Also, I don’t really recommend this, but I have a script that restarts Zwave every day at a specific time and things seem more snappy for me, but it might not exactly help you.

- alias: Restart Zwave Every Day at 2pm
  trigger:
    platform: time
    at: "14:00:00"
  action:
    - service: zwave.stop_network
    - delay: "00:01:00"
    - service: zwave.start_network

Again, these are not solutions necessarily, but things to try.

I followed up a little more with this today. I wrote a python script which is a little smarter about handling incoming messages, and sequencing things with the controller. My script turns on two lights one after the other as quickly as possible while respecting the handshake sequence, and it is damn fast. Close to simultaneous. It is also rock solid, handling other incoming unsolicited messages without any issues. I ran it for an hour or so and it never “got lost” with the serial message streams. No delays at all.
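One way to keep from “getting lost” when unsolicited messages arrive is to set them aside while waiting for the response you expect. The Python sketch below shows that idea only; it is not the script described here. It assumes frames have already been split out of the serial stream, and that byte 2 is the frame type (0x01 for a response) and byte 3 the function ID, which are assumptions from community reverse engineering. `wait_for_response` is a hypothetical name.

```python
from typing import Iterable, List, Optional, Tuple

RESPONSE = 0x01  # frame type for a response (assumed value)


def wait_for_response(
    frames: Iterable[bytes], func_id: int
) -> Tuple[Optional[bytes], List[bytes]]:
    """Scan incoming frames for the response to `func_id`.

    Anything else that arrives first is collected and returned so the
    caller can process it later instead of losing its place.
    """
    set_aside: List[bytes] = []
    for frame in frames:
        if len(frame) >= 4 and frame[2] == RESPONSE and frame[3] == func_id:
            return frame, set_aside
        set_aside.append(frame)
    return None, set_aside


# Illustrative bytes: an unsolicited report, then the expected response.
unsolicited = bytes.fromhex("010900040006032503ff2e")
expected = bytes.fromhex("0104011301e8")
response, extras = wait_for_response([unsolicited, expected], func_id=0x13)
print(response.hex(" "), len(extras))
```

A fuller version would still need to ACK every frame as it is read and give up after a timeout.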

So I’m definitely thinking there are some software issues here limiting the responsiveness of zwave in HA. I’m not entirely sure if the issues are more in HA or more in openzwave, but I suspect the latter.

(Part of the problem here is that while some of the Zwave docs have been opened, the API for communicating with a z-wave stick has not, making it a reverse engineering chore to deal with. Sigma Designs seems to have no intentions of opening up this piece of the standard.)

I’m not sure what the next step is here. Trying to profile all of this within HA, probably, to see where the slippage is, but that’s a pretty daunting task.
