Z-Wave graph doesn't seem to match up with physical topology

I’m trying to make sense of the logical topology of my network, and I just can’t seem to understand how it got there based on the physical topology.

A few examples:

  1. I understand why 26 & 27 are communicating directly with the hub - they’re both not Z-Wave+, and they’re relatively close. However, node 11 is just as close as 26, yet has 26 as its neighbor and not the hub.

  2. Node 9 is at the same distance from the hub as 26, just in the complete opposite direction, yet still has 26 as a neighbor and not the hub.

  3. Node 28 is right next to node 26, but not only is it not a neighbor of the hub like 26 is, it’s not even a neighbor of 26!

I tried healing the network several times, and while the graph changes, there are always examples like these. The only constant is that 26 & 27 are neighbors with the hub, and almost all other nodes are 2 hops away.

Any idea what’s going on here?

Anyone have any idea?

Yes, I do know a bit about this, relatively speaking. On an absolute scale, I am not so sure about all the ins and outs of routing at all… There are not many public documents describing the algorithms. Everything I know is from reading the scarce documents, experimenting and reading the official SDK. I also have a Zniffer to diagnose my network. I cannot link to official sources because a single, good document AFAIK does not exist.

Z-Wave has no true concept of “distance” or “topology”. It prefers lower node IDs in routing and does use some statistics to select routes, but basically if something is a “neighbor” and mains powered then it can act as a repeater… And I also have been looking at the open-zwave code and I think they also manipulate routing at startup, but don’t take my word on this, I am still trying to understand everything it does…

Please note Z-Wave routing is not like IP routing. Routing decisions are made by the sending node and the hops are just repeating the message… It is “source routed”. You’ll need a Z-Wave sniffer to actually diagnose Z-Wave routing, this chart does not have enough information. Or maybe you can find out enough information in the open-zwave log, that is quite possible. I haven’t studied it in great detail yet! But still that log might be incomplete because the controller evidently cannot hear sending nodes when it has no direct connection (that is why it is using routing in the first place). With a Zniffer you can move closer to the sending node and see what kind of routes it attempts.
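To make the “source routed” idea a bit more concrete, here is a minimal Python sketch. It is only a toy model, not the real Z-Wave frame format: the `Frame` class, its field names and the helper functions are all made up for illustration, and I am assuming the hub is node 1. The point is only that the sender writes the full list of repeaters into the frame, and each repeater passes it on without deciding anything.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Toy frame (hypothetical layout, not the real Z-Wave header)."""
    source: int
    destination: int
    repeaters: list = field(default_factory=list)  # route chosen by the sender
    hop_index: int = 0  # number of repeaters that already forwarded the frame


def next_receiver(frame):
    """Return the node ID that should pick up the frame next."""
    if frame.hop_index < len(frame.repeaters):
        return frame.repeaters[frame.hop_index]
    return frame.destination


def repeat(frame, node_id):
    """A repeater forwards the frame; it never picks a different route."""
    if frame.hop_index >= len(frame.repeaters):
        raise ValueError("no repeaters left in the route")
    if frame.repeaters[frame.hop_index] != node_id:
        raise ValueError(f"node {node_id} is not the next repeater")
    frame.hop_index += 1
    return frame


def deliver(frame):
    """Walk the frame along its pre-selected route and return the path."""
    path = [frame.source]
    while frame.hop_index < len(frame.repeaters):
        hop = next_receiver(frame)
        repeat(frame, hop)
        path.append(hop)
    path.append(frame.destination)
    return path


# Node 11 decides on its own to go through node 26 to reach the hub
# (assumed to be node 1 here).
print(deliver(Frame(source=11, destination=1, repeaters=[26])))
```

If node 11 has a poor idea of who its neighbors are, it will keep writing node 26 into the frame even when a direct path would work, which is the kind of thing you can only really confirm with a sniffer near node 11.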

I discuss some of that stuff in this topic on the Fibaro forum.

Unfortunately I have not yet found the time to consolidate all the information in a single consistent post. So I’m sorry, you’ll have to read quite a few posts.

You can make your own Zniffer, which I recommend if you have > 50 nodes (rule of thumb).

Does this help?

Thanks, really helpful.

Given that

I guess the question is then - why aren’t those nodes that are close to the hub its neighbors? They’re really close, and Z-Wave+, so should have a larger range, yet are not neighbors with the hub.

I do believe this sub-optimal routing is affecting my network in some ways (e.g. startup time, some delays in some nodes, etc…)

From a distance (between you and me, I mean), it is sometimes hard to diagnose, but I accept the challenge…

However, I would not start from that graph but rather have a look at the “raw” data gathered by open-zwave.

Can you send me a PM with your OZW_log file (in your hass config dir)? If it is too big, please restart hass and wait a few minutes, until you are certain that Z-Wave has started (or watch hass for the event “zwave.network_ready”)

open-zwave will gather information of all mains powered devices and also of battery devices if you wake them up by clicking the B-button.

In that log you’ll find stuff like this

2018-08-11 10:31:35.892 Info, Node001, Received reply to FUNC_ID_ZW_GET_ROUTING_INFO
2018-08-11 10:31:35.892 Info, Node001,     Neighbors of this node are:
2018-08-11 10:31:35.892 Info, Node001,     Node 2

This is clearly from a very small test setup at my house, this is a minimote and it really has only one neighbor (node 2) because that is all there is…
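If you want to pull those neighbor lists out of a big log yourself, something like the Python sketch below could work. It assumes every neighbor list looks exactly like the three lines above (a “Neighbors of this node are:” line followed by one “Node N” line per neighbor); I have not checked it against other open-zwave versions, and the function name `parse_neighbors` is just something I made up. It stops collecting at the first line that is not a neighbor entry, so interleaved lines from other nodes would cut a list short.

```python
import re

# Patterns based only on the three sample log lines shown above.
HEADER = re.compile(r"Info, Node(\d+),\s+Neighbors of this node are:")
ENTRY = re.compile(r"Info, Node(\d+),\s+Node (\d+)\s*$")


def parse_neighbors(lines):
    """Return {node_id: [neighbor_id, ...]} from open-zwave log lines."""
    neighbors = {}
    current = None  # node whose neighbor list we are currently reading
    for line in lines:
        header = HEADER.search(line)
        if header:
            current = int(header.group(1))
            neighbors[current] = []
            continue
        entry = ENTRY.search(line)
        if entry and current is not None and int(entry.group(1)) == current:
            neighbors[current].append(int(entry.group(2)))
        else:
            current = None  # any other line ends the current list
    return neighbors


SAMPLE = """\
2018-08-11 10:31:35.892 Info, Node001, Received reply to FUNC_ID_ZW_GET_ROUTING_INFO
2018-08-11 10:31:35.892 Info, Node001,     Neighbors of this node are:
2018-08-11 10:31:35.892 Info, Node001,     Node 2
""".splitlines()

print(parse_neighbors(SAMPLE))  # {1: [2]}
```

For a real log you would pass an open file object instead of `SAMPLE`, since the function only needs something it can iterate over line by line.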

It is possible… The log file also contains “round trip times” and other useful info.

I am very new at diagnosing issues with OZW, so it might take me a while to get back to you with useful information (I am used to diagnosing my own Z-Wave network, about 90 nodes on a Fibaro HomeCenter 2, but it has different logging).

AFAIK the startup of ozw takes a while because it tries to contact sleeping devices and it waits until they time-out. More battery devices (and more mains devices) means slower startup of Z-Wave.

I recently installed this automation to find out when Z-Wave is ready

- id: '1547804144105'
  alias: Z-Wave network is ready
  trigger:
  - event_data: {}
    event_type: zwave.network_ready
    platform: event
  action:
  - data:
      message: Z-Wave network is ready...
      notification_id: zwave
      title: Z-Wave
    service: persistent_notification.create

I am 100% noob when it comes to automation yaml, but I do get that notification after a few minutes.

Thanks. PMing you the log.

In case other people join the conversation - I only have one battery powered device. The rest are mains powered, and there aren’t that many of them. I suspect that with more nodes directly contacting the hub, things will go smoother.

Hmm, says you’re not accepting messages…

Sorry about that, sending PM was indeed disabled on my config page. Should be OK right now.

@DudeShemesh

Recently I’ve run into the same question. My network has about as many nodes as yours, and I recently added a few Qubino Z-Wave+ relays in the same room as my hub, where a few existing non-Plus switches are still installed. Instead of directly connecting to the hub, I’m finding they are hopping through the non-Plus switches. I made some changes to the orientation of the relays and one did eventually connect directly. After that I’ll see something else change, like the non-Plus lock in the same room that has always connected directly now neighboring every other switch in the house but no longer the hub itself. I think it’s like @petergebruers said, Z-Wave must not use a true concept of distance to make these decisions. I am wondering about the statement re: preferring lower node IDs in routing. Would it make sense to include your nodes in order of distance, closest first?

On the topic of lower IDs, I’ve seen that the controller stick issues a new node ID if something is excluded and re-included, even if you clear all nodes and entities from the device/entity registry. I read somewhere that the stick itself keeps some memory of these old nodes. I’m curious whether that just keeps going forever, or does it eventually go back to node 1 after hitting the Z-Wave 232 node limit?

Hi! I’m still talking to @DudeShemesh via PM, looking at log files and discussing options. If I find out something worth mentioning, I’ll report back here.

Sounds as if they are close to the edge of direct connection. Moving controller and device antenna does have an effect. But hard to predict…

It is hard to find good explanations. What I am almost certain of is that older devices use this simple logic: they reduce transmission power by 50% and try to contact a node. If it succeeds, that is a “neighbor” and a candidate repeater. I’d say that is a very crude notion of “distance” and depends on obstacles and sources of noise. That is why I say it does not have “distance” as we know it. Some posts mention that later versions of Z-Wave add response time and RSSI (receive strength) so they possibly have a better strategy to select nodes. Anyway, if nodes cannot reach the controller directly they will go through neighbours, possibly scan all nodes and resort to “explorer” frames. I don’t know any more details about the algorithm.
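A bit of arithmetic shows why this crude test makes nodes “flip” when you move an antenna. Halving the transmit power costs about 3 dB, so a link that works at full power with less than roughly 3 dB to spare will fail the neighbor test. The Python sketch below is only my simplified model of that idea: `link_margin_db` is a hypothetical number (how far the received signal sits above what the receiver needs at full power), and the real firmware logic is not something I can confirm.

```python
import math


def power_loss_db(power_fraction):
    """Signal loss in dB when transmitting at a fraction of full power."""
    return -10 * math.log10(power_fraction)


def passes_neighbor_test(link_margin_db, power_fraction=0.5):
    """Simplified model: the link must still work at reduced power.

    link_margin_db is a hypothetical figure: the dB to spare at full power.
    """
    return link_margin_db - power_loss_db(power_fraction) > 0


print(round(power_loss_db(0.5), 2))   # 3.01
print(passes_neighbor_test(10.0))     # True: plenty of margin
print(passes_neighbor_test(2.0))      # False: works at full power only
```

In this model a node with 2 dB of margin can talk to the controller directly at full power, yet it would not be listed as a neighbor, and a small change in orientation or an extra obstacle is enough to push a borderline link over or under the line.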

I’d say yes, I would start by adding at least 3 mains powered nodes about 5 m away. Then add more mains nodes further away. That will be the “backbone” of the mesh. A Fibaro Homecenter supports NWI = network wide inclusion on recent devices, which is nice. You can include (add to controller) the device in its final place, no need to go close to the controller. This saves you from doing a “heal node” (because you do not have to move it. If you move a node, I suggest “heal” on that node). I don’t know if open-zwave supports NWI.

Yes, this is to avoid accidentally having two nodes with the same node ID which would not work and give weird issues.

To be honest, I do not know. I’ve been dealing with Z-Wave for about 5 years now and I’ve never seen anyone post “hey, I’ve hit the 232 limit”. I’ve asked Fibaro years ago and that information might not apply anymore. If I remember correctly, they said you would not get past node 232…

I read somewhere that once you cross 232, unused numbers get recycled. I’ve yet to see it happen though.

Thanks @petergebruers for all the information.
I decided to move my controller to a different location. It has always been right in the middle of the house, which seemed ideal, but more experience and information can help make decisions. I moved it down the hall to the laundry room, sat it right on the washing machine just for a quick test, did a heal, restarted HA, and now 18 out of my 19 devices are directly connecting to the controller. I’m going to test some other locations now…

Hi! That is good news… It is plausible that the centre is not the best place, it is hard to predict because it depends on obstacles, reflection, …

On the Fibaro forum I often post this picture: