70 Zwave device network stuck healing for over 24hrs

HeyImAlex · August 24, 2021, 9:24am

But does it? I just dug into the sources of zwavejs2mqtt and zwavejs to find out.

To build the graph, zwavejs2mqtt uses only the node neighbor lists (see ZwaveGraph.vue). Hops and links between nodes are determined by looking at the first neighbor when multiple are present. That is almost certainly wrong, unless a node has only a single neighbor (and even then it can be wrong, see below). When I do a neighbor update on my zwavejs, the returned lists are just sequential node IDs. They carry no information about LWR/NLWR (what you would call the preferred route), nor any information about signal levels.
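To make the problem concrete, here is a minimal sketch of the "take the first neighbor" heuristic described above. It is not the actual ZwaveGraph.vue code; the function name `first_neighbor_links`, the node IDs, and the neighbor lists are all made up for illustration.

```python
# Hypothetical neighbor lists, each in ascending node-ID order,
# which is how the lists look after a neighbor update.
neighbors = {
    1: [2, 3, 4],      # node 1 is the controller in this example
    2: [1, 3],
    3: [1, 2, 4, 5],
    4: [1, 3, 5],
    5: [3, 4],
}

def first_neighbor_links(neighbors, controller_id=1):
    """Sketch of the heuristic: draw one link per node, to the
    first entry of its neighbor list."""
    links = {}
    for node_id, node_neighbors in neighbors.items():
        if node_id == controller_id or not node_neighbors:
            continue
        links[node_id] = node_neighbors[0]
    return links

links = first_neighbor_links(neighbors)
print(links)
```

In this example node 5 gets linked to node 3 rather than node 4 only because 3 sorts before 4. Nothing in the input says which of the two (if either) the node actually routes through.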

To create the neighbor lists, zwavejs calls the ZW_GetRoutingInfo method of the zwave API (see GetRoutingInfoResponse). The zwave API reference gives no guarantee that the order of the returned neighbors is in any way related to the APR/LWR/NLWR. It only states (section 4.4.15):

ZW_GetRoutingInfo is a function that can be used to read out neighbor information from the protocol. This information can be used to ensure that all nodes have a sufficient number of neighbors and to ensure that the network is in fact one network.
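The two uses named in that passage (enough neighbors per node, and the network being one network) can be sketched as follows. This is only an illustration: the helper names `under_connected` and `is_one_network`, the threshold of 2, and the sample data are assumptions, not anything taken from zwavejs.

```python
from collections import deque

def under_connected(neighbors, minimum=2):
    """Return the nodes that report fewer than `minimum` neighbors."""
    return sorted(n for n, lst in neighbors.items() if len(lst) < minimum)

def is_one_network(neighbors):
    """Breadth-first search over the neighbor lists: True if every
    node can be reached from an arbitrary starting node."""
    if not neighbors:
        return True
    start = next(iter(neighbors))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in neighbors.get(node, []):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return set(neighbors) <= seen

# Hypothetical data: nodes 4 and 5 only see each other.
example = {1: [2, 3], 2: [1, 3], 3: [1, 2], 4: [5], 5: [4]}
print(under_connected(example))
print(is_one_network(example))
```

Neither check needs the lists to be in any particular order, which fits the point that the order carries no routing meaning.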

In fact, the standard clearly defines when neighbor lists are used for routing, and that’s almost never (section 3.10.2):

The routing attempts done by a static controller to reach the destination node are as follows:

* If APR, LWR and NLWR all are non-existing and TRANSMIT_OPTION_ACK set. Try direct when neighbors with retries.
* If APR exist and TRANSMIT_OPTION_ACK set. Try the APR. If APR fails then try LWR if it exist and if it also fails then remove the LWR and try direct if neighbor.
* If APR do not exist, LWR exist and TRANSMIT_OPTION_ACK set. Try the LWR. In case the LWR fails, ‘exile’ it to become NLWR and try old NLWR if it exist. if the NLWR also fails, remove it and try direct if neighbor.
* If APR do not exist, LWR do not exist, NLWR exist and TRANSMIT_OPTION_ACK set. Try the NLWR. In case the NLWR fails remove it and try direct if neighbor.
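The four cases quoted above can be condensed into a small sketch of the attempt order. It assumes TRANSMIT_OPTION_ACK is set, models only the order of attempts (not retries, or the removal and "exile" of failed routes), and the function name `attempt_order` is made up for illustration.

```python
def attempt_order(apr, lwr, nlwr, is_direct_neighbor):
    """Return the routes tried, in order, for a destination node.

    apr, lwr, nlwr: whether that kind of route currently exists.
    is_direct_neighbor: whether the destination is a direct neighbor.
    """
    if apr:
        # APR first, then LWR if present (NLWR is not tried in this case).
        attempts = ["APR"] + (["LWR"] if lwr else [])
    elif lwr:
        # LWR first, then the old NLWR if present.
        attempts = ["LWR"] + (["NLWR"] if nlwr else [])
    elif nlwr:
        attempts = ["NLWR"]
    else:
        attempts = []
    # Direct transmission is the last step, and only for a neighbor.
    if is_direct_neighbor:
        attempts.append("direct")
    return attempts

print(attempt_order(apr=False, lwr=True, nlwr=True, is_direct_neighbor=True))
# ['LWR', 'NLWR', 'direct']
```

In every case where at least one stored route exists, the neighbor information is only consulted at the very end.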

So basically, neighbors are only used as a last-resort fallback when everything else fails. And even then, it will route through the neighbor with the strongest signal, and signal strength is neither available in zwavejs nor used by the network graph.

So in conclusion, what the zwavejs network graph displays has nothing to do with the routing. It’s completely random and should not be relied upon.

Yeah, that discussion made me smile: how to display random data in a pretty way.

About the MQTT, there are pros and cons. I’ll reply to this later, I’m on the go right now.