Problem with ZHA network visualization

I am running the latest HA (2020.12.1 and OS 5.9) on an RPi 4B with 4 GB of RAM. For the ZHA integration, I am using a ConBee II USB radio module.

For testing purposes, I have a ConBee coordinator, a Philips lamp (router), an Aqara temperature sensor (end device) and an Aqara motion sensor (end device).

In Configuration | Integrations | ZHA | Configure | Visualization, all four devices are shown in the graph, but there is only one connection, between the ConBee and the Philips lamp. The two Aqara end devices are displayed without any connection. The deprecated zha-network-visualization-card shows the exact same graph (except for the time information).

In Lovelace, the custom:zha-network-card correctly displays a table with the 4 devices. The two Aqara devices have valid RSSI and LQI values (-42 and 255).

I shut down HA for 30 minutes and restarted it to force the devices to rebuild the Zigbee mesh, but this did not change anything. The end devices do seem to communicate with the coordinator without problems. This is my first test with ZED devices, and I was expecting to see them connected to the ZC, if not to ZR devices (star or mesh)?

Is this a problem of the network visualization or a problem with Aqara integration?

I used to have zha_map: in configuration.yaml, but I removed it since it does not seem necessary anymore with the new network visualization.

Should I try ZigZag?

2 Likes

Oops, now that I have read your post in this section, I see you are already using this tool…
I think my end devices may show up 'unlinked' at times. The network graph is a work in progress, I believe. It shows some interesting info, but it is not the only 'view' of your Zigbee network that will be helpful in managing it. You might try the tool shown below, by the same author as the network graph, I believe. The columns 'last seen', 'rssi' and 'lqi' per device add some helpful data for managing the Zigbee network. I believe they update in near real time as well:

title: ZHA Info
# icon: mdi:home-outline
cards:
  # https://github.com/dmulcahey/zha-network-card
  - clickable: true
    columns:
      - name: Name
        prop: name
      - attr: available
        id: available
        modify: x || "false"
        name: Online
      - attr: manufacturer
        name: Manufacturer
      - attr: manufacturer_code
        name: Manufacturer Code
      - attr: model
        name: Model
      - attr: ieee
        name: IEEE
      - attr: device_reg_id
        name: Device Reg ID
      - name: NWK
        prop: nwk
      - attr: rssi
        name: RSSI
      - attr: lqi
        name: LQI
      - attr: last_seen
        name: Last Seen
      - attr: power_source
        name: Power Source
      - attr: quirk_class
        name: Quirk
      - attr: quirk_applied
        name: Quirk Applied
    sort_by: available
    type: 'custom:zha-network-card'


Thanks for the information. I am already using the custom:zha-network-card and yes it provides useful information.

I am learning how ZigBee works, and visualizing the internal information helps a lot in understanding how the mesh network is built. When I add a new router or a new end device, I would like to see how connection paths are added to the network.

Unfortunately, as mentioned, it currently seems that not all devices are shown as connected. I have several other router devices (all Philips) and other end devices (Philips and Xiaomi), and I will add them to the network one by one. I was hoping to be able to check that an end device connects to the closest router or to the coordinator.

The zha-network-card provides useful info but, as far as I know, it does not show the connections between devices.

I am also experimenting with zha_map: I call the service zha_map.scan_now in the developer tools and, apparently, it creates neighbours_xxx.txt files (in config/neighbours) for nodes that have neighbours. No files are created for the two end devices. I do not know where all this information comes from, but I am wondering whether, for some reason, the information related to my two Xiaomi devices is not detected correctly?

I am also looking at zigbee.db, but it is difficult for me to interpret at this point.

I think a useful measure of a sensor's connection quality is a combination of the LQI, the RSSI and the number of 'packets or readings' over time. This is where the data shown in 'near real time' on the ZHA Network Card is really more helpful than the Network Graph. As the author states: "This implementation leverages the ZHA websocket API to get ZHA device information instead of using hass.states."

You want an RSSI value as close as possible to zero and an LQI value as close as possible to 255, combined with, in the example of the Aqara Lumi temperature sensors, a consistent number of readings per period.

While the connections between Zigbee devices in the map are interesting, these devices are constantly updating their routes and tables to find the 'best' path. And unless you are doing some peer-to-peer Zigbee app, such as a switch directly connected to a light or socket, the connection between two Zigbee devices, other than a device to the coordinator, is going to change over time and is nothing you can really control. What you want is to know that your devices are connecting to the coordinator (as far as the coordinator is seeing the picture) with an RSSI close to zero and a high LQI.

Good hunting! Happy New Year!

The two quotes below are from the Silicon Labs web site and might be helpful.

"The minimum and maximum LQI values (0x00 and 0xff) should be associated with the lowest and highest quality IEEE 802.15.4 signals detectable by the receiver, and LQ values in between should be uniformly distributed between these two limits. At least eight unique values of LQ shall be used."

https://silabs-prod.adobecqms.net/content/usergenerated/asi/cloud/content/siliconlabs/en/community/wireless/zigbee-and-thread/knowledge-base/jcr:content/content/primary/blog/lqi_in_silicon_labs-vvSq.social.0.10.html

"There is no direct relationship between RSSI and LQI. In a quiet environment, LQI (reliability) will decrease as RSSI decreases. But if there is any interference, it is possible for LQI (reliability) to decrease with no change or an increase in RSSI. It is important to be aware that these are fundamentally different quantities, and therefore no direct equation can be applied to convert one value to the other.

Remember that LQI is measuring the reliability of a link to a particular neighboring radio, based on the BER (bit error rate) of the current packet. This is not a linear measurement, as link reliability tends to drop dramatically (almost logarithmically) as BER increases. RSSI readings simply measure peak amounts of radio energy on the channel over a given period, regardless of where that radio energy comes from."

https://www.silabs.com/community/wireless/zigbee-and-thread/knowledge-base.entry.html/2012/07/05/can_i_convert_lqiva-R42W

So I have three Aqara Lumi temperature sensors that I am testing with a Sonoff Tasmota Zigbee bridge. I average about two and a half readings per hour from each, see query and table below. Not a super science study as yet, but you can see the unit that has two walls between it and the coordinator has the lowest number of readings.

Now, as soon as I can get the RSSI and LQI values for the devices over time, perhaps this will offer more insight. I have done a similar analysis for my 4-station mesh WiFi system and about 20 Bluetooth Low Energy sensors that I am currently using to monitor temperatures around the property. Seeing how devices behaved in the same 2.4 GHz region was interesting and useful for placing the BLE collector and WiFi hubs.


-- aqara lumi zigbee AS008 temperature, humidity, pressure sensors update frequency
select
  date_trunc('hour', last_updated), 
  count(1)
from states_archive where (
last_updated > now() - interval '1 day'
AND
entity_id like 'sensor.lumi_lumi_weather_c205ca02_temperature')
group by 1;
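
For anyone who would rather do the same count outside the database, here is a minimal Python sketch of what the query above computes: truncate each timestamp to the hour and count the rows per hour. The function name and the sample timestamps are made up for illustration; they are not data from my sensors.

```python
from collections import Counter
from datetime import datetime


def readings_per_hour(timestamps):
    """Count readings per clock hour, like date_trunc('hour', ...) with count(1)."""
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    # Sort by hour so the result reads like the grouped query output
    return dict(sorted(buckets.items()))


# Hypothetical sample timestamps, for illustration only
sample = [
    datetime(2021, 1, 1, 10, 5),
    datetime(2021, 1, 1, 10, 50),
    datetime(2021, 1, 1, 11, 20),
]
hourly = readings_per_hour(sample)
for hour, count in hourly.items():
    print(hour, count)
```

A sensor that sits behind more walls would be expected to show lower counts per hour, which is the pattern I am looking for.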

1 Like

For reference, I've been running 3 Aqara Temperature/Humidity/Pressure sensors for about 6 months now. The zha-network-card always shows them as having an LQI of 255 or very close. In the now deprecated zha-map AND in the new ZHA Visualization tab, they typically show up without a link, even when they're operating properly. The neighbors*.txt files also typically have no entries for them, though I did see them pop up occasionally when I was monitoring their contents a few months ago.

This is what my ZHA network looks like right now, and the two devices off to the left with no parents linked are the Aqara sensors (that are operating properly).

I've noticed that these sensors don't seem to send an update for 45+ minutes if the temp/humidity/pressure haven't changed much since the last time the device checked in. Other sensors seem to check in much more frequently. The only ones that check in less frequently are some TP-Link door/window sensors that don't have an integrated temp sensor… they only check in every 90 min or so if the state hasn't changed. But they have a green link in the graph :wink:

However, I'm also running deCONZ with a second ConBee II stick on my PC, and the deCONZ visualization consistently shows a link to the Aqara sensor I have connected to it:

[image: deCONZ visualization]

So I suspect something in the ZHA integration and/or visualization at this point.

1 Like

I have 7 Aqara temp/humidity sensors, 3 Aqara motion sensors and 2 Aqara door magnets, with more sensors coming soon. I am also running a ConBee II stick with the ZHA integration. Everything works fine but, like yours, they are not all linked in the visualization, especially right after restarting HA.

The thing with these battery-powered devices is that most of them don't send data if nothing changes (like temperature or movement). After a while some of them get linked in the visualization and some do not, and the linked ones will later show as not linked again. It is just confusing, but they work without problems.

The same goes for zha-network-card (https://github.com/dmulcahey/zha-network-card): when HA has just been restarted, all battery-powered devices show N/A for LQI, even the IKEA on/off switches, until they get used or send data. But zha-network-card keeps showing the last LQI information, and the ZHA network visualization does not do that (at least that is what I think).

This is my Zigbee network, and you can see 4 not linked devices, again all battery powered devices:
[image: zha_network]

1 Like

Slightly off topic, but does anyone know how to get the RSSI, LQI and even Last Seen values as entity attributes? I switched over from zigbee2mqtt, where those values were available, but with ZHA they are not, as far as I can see. Thx

1 Like

Thanks for the information.
It is good to know that I am not the only one having problems with the visualization. I have several other Aqara devices to connect and I will see if they behave the same (on top of the weather and motion types, I also have 'magnet' and 'button' types).
As you have indicated, data is refreshed at a very low rate, I guess to save power. Looking at the 'last seen' values, it seems that the weather and motion devices can go around one hour without sending data. Of course, as soon as motion is detected, the motion sensor immediately sends data for zone and occupancy. Zone stays on for about 2 min and occupancy for 10 min.

I have one of my weather sensors very far from the ConBee, and the LQI was as low as 55, but the device still operated very well. I was surprised to see that ZigBee seems to propagate better than WiFi!

One of the experiments I want to try is to use a Philips bulb as a router between the Aqara end device located very far away and the ConBee. But as Aqara ZEDs do not show as connected, it will be hard to know for sure that it is indeed connected to the Philips ZR. What I can test is to place the Philips bulb midway between the ConBee ZC and the Aqara ZED and see if RSSI/LQI improve when the Philips bulb is powered?

@fantangelo use the link provided by @Sp4wN to install the zha-network-card; it will display all the information you are looking for.

I use the following YAML to display ZigBee information (you will have to enter this in the Raw configuration editor):

  - title: ZHA
    path: zha
    panel: true
    icon: 'mdi:zigbee'
    badges: []
    cards:
      - clickable: true
        columns:
          - name: Name
            prop: name
          - attr: available
            id: available
            modify: x || "false"
            name: Online
          - attr: manufacturer
            name: Manufacturer
          - attr: model
            name: Model
          - attr: device_type
            name: Type
          - attr: ieee
            name: IEEE
          - name: NWK
            prop: nwk
          - attr: rssi
            name: RSSI
          - attr: lqi
            name: LQI
          - attr: last_seen
            name: Last Seen
          - attr: power_source
            name: Power Source
          - attr: quirk_class
            name: Quirk
        sort_by: device_type
        type: 'custom:zha-network-card'

Yes, I already use zha-network-card and it is a great visual aid for seeing what is happening. Additionally, what I would like to see are the RSSI, LQI and Last Seen values as attributes of the entities that are created. This would allow me to create automations that trigger if any of those values are out of whack. I have done quite a bit of googling to see if this is possible, without success, and I do not know of any other way to access those values in templates or automations. Thx
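
To make the idea concrete, here is a rough Python sketch of the kind of 'out of whack' check I have in mind, limited to Last Seen. It does not solve the real problem of getting the values out of ZHA: the list of dicts, the key names (borrowed from the card's column names), the ISO 8601 timestamp format and the 2 hour limit are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def stale_devices(devices, now, max_age=timedelta(hours=2)):
    """Return the names of devices whose last_seen is older than max_age."""
    stale = []
    for dev in devices:
        # Assumption: last_seen is an ISO 8601 string without a timezone
        last_seen = datetime.fromisoformat(dev["last_seen"])
        if now - last_seen > max_age:
            stale.append(dev["name"])
    return stale


# Hypothetical device records, for illustration only
devices = [
    {"name": "weather", "last_seen": "2021-01-01T10:00:00"},
    {"name": "motion", "last_seen": "2021-01-01T07:00:00"},
]
late = stale_devices(devices, now=datetime(2021, 1, 1, 11, 0))
print(late)
```

The limit is deliberately generous because battery-powered sensors can stay quiet for a long time when nothing changes.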

If you want to see your temperature sensors 'in action', a hair dryer or floor heater is a good 'tool' :wink:
Interested to hear what you find. My experience is with three Aqara end devices: LUMI Weather, LUMI Motion and LUMI Sensor Cube (it can be argued that this is a very different 'end device' from the other two). Only the LUMI Sensor Cube will 'hop', i.e. have its parent be a device (router) other than the coordinator. In comparison with the LUMI Motion, I have an Ikea motion sensor that does 'hop': its parent is a router and not the coordinator.

As far as I know, the ConBee coordinator can have a maximum of 32 children (including end devices and routers). Therefore, above 32 devices you must use routers.

There is an interesting article about building a ZigBee network as I indicated here

Many YouTubers mention that in order to add a new device you have to place it close to the coordinator, and this definitely seems to be bad practice. The best solution seems to be to build your network incrementally, starting from the coordinator, adding first the intermediate routers and finally the end devices that are meant to connect to these routers. Normally the network is self-healing, so in the end it should always work, but it seems a good idea to build it as close as possible to its final "shape". Currently I am only experimenting, but in the future home I am about to build I intend to use smart plugs as routers (reliable, always-on routers).

I found that if you click on a ZigBee "router device" (in "configuration | devices" or directly on the router device in zha-network-card) you will see an "add devices via this device" command. From what I understand, this forces the new device to be added as a child of this router, whereas if you use add new device from the ZigBee integration, the adoption of the new device is initiated by the coordinator (eventually this might end up in adoption through the router). I have not yet tried this command.
While talking about the commands available from the ZigBee device display, I did not find any documentation about the following commands:

  • "reconfigure device": what does this command do?
  • "manage cluster": what is it for, and how do you use it?
  • "ZigBee device signature": what does the information returned by this command mean?
  • "remove device": does it just remove the adoption by the coordinator, or does it also tell the device to reset itself? This is especially important for devices like Philips bulbs that have to be reset in order to be readopted.
3 Likes

Good info, thanks! And good questions. This 'add via a device' is a new one to me today, via you and another member. The wording seems to flow into the idea of having end devices 'hang off' of routers rather than only the coordinator. I think the proper term is an end device's 'parent'.
Much to learn!

2 Likes

I just completed new tests.

First, I tried to add a new ZigBee device (Aqara magnet) the "standard way": configuration | Integration | ZigBee "Configure" | + Add Device. On purpose, the device was located beyond two thick walls and was therefore out of range of the coordinator. A router device (Philips Hue bulb) was placed midway between the coordinator and the Aqara device. The add device command failed.

Second, I used the "add devices via this device" command on the router device (Philips). A window similar to the "normal" add device window opened and, after a few seconds, the Aqara device was correctly discovered. I tested the behavior of the device in Lovelace and it correctly reports the closed/open state.

This is quite interesting because it seems to indicate that, when using the "normal" add device command, only the coordinator looks for new devices and the search is not "pushed" to the routers. This may be why most people recommend adding new devices close to the coordinator. Hopefully, once devices are added to the network and placed in their final location, the mesh should heal itself and the parent of the end device should be transferred from the coordinator to the router?

I am still unclear about the meaning of the LQI and RSSI values (see Reported LQI values for Zigbee devices with ZHA integration for an interesting question and information). As stated in this reference, it is unclear whether the LQI / RSSI only refer to the last hop or whether the values take into account the complete path from the ZC to the ZED. In my test, the LQI for the ZR (Philips bulb) and the ZED (Aqara magnet) are both equal to 255, and this seems very high, as the devices are located far away through walls?


Also interesting is the fact that the connection between the router and the end device shows up on the network graph! As you know, the arcs that tie the nodes together display LQI value(s). On a coordinator (ZC) to router (ZR) connection, there are two values, v1/v2. Originally, I thought these two values represented max/min values. However, by looking at the content of zigbee.db, I found that these two values actually correspond to the LQI in each direction of the link (strange, no? why different values?). On a router (ZR) to end device (ZED) connection, only one LQI value is displayed. This probably makes sense because only the child can initiate an exchange with its parent.

As I have moved the Philips router device to a location between the coordinator and the Aqara weather end device, the network should normally "heal itself" and switch the parent of the Aqara end device from the coordinator to the router. By doing so, the LQI would probably change from around 100 to around 255. There are a lot of articles on the subject of ZigBee network construction and healing. If I understand correctly, in my case the Aqara end device should keep track of its working but bad link to the coordinator, and use a better link to the coordinator through the router. All the different possible paths should be displayed on the graph (extracted from the routing table kept in the devices?). As I understand it, the process takes time, and therefore I will allow 24 hours for it to complete. If the link is not modified automatically, I will try to force it using the "add devices via this device" command.

4 Likes

Found the following interesting information:
An LQI of less than 255 indicates packet loss. RSSI should ideally be something like -75 dBm or above (e.g. -75 is better than -85).

Just installed the new ZigZag custom panel that displays a graphical layout of Zigbee devices and the connections between them. It was released yesterday, but unfortunately it does not work for me (see Zigzag-panel showing no graph).

Anyone tried it?

Your results are consistent with my experience and with what I believe is expected behavior of the Zigbee mesh.

One note about signal range… 2.4 GHz signals can pass through many building materials quite well. Interior walls made of wood and drywall shouldn't cause much of a problem. Concrete or brick is another matter, though. As is metal.

As for adding devices, my understanding is that ideally, you add your devices from the location in which they will be ultimately installed. When you click on "add devices from this device" in ZHA, my understanding is that ZHA will use that device to search for and configure new devices. If that device (coordinator or router) can't pick up the signal from the new device you're trying to add, the new device will not be added.

Whenever I add a new device, I try to pick a Router device that's fairly close to the new device's intended location (i.e. pick a device I think will make a good parent) and then select the "add devices from this device" option from that device's Device page in HA. That works 95% of the time. In the few cases where it doesn't, I move the new device a little closer to the Router device I'm trying to use to add the new device, and then it has always worked.

2 Likes

This may be accurate. My understanding is that an LQI of 255 indicates no packet loss, not necessarily a perfect analog signal. Also, if the wall in question is wood/drywall, it shouldn't attenuate the signal too much.

Excellent point about bidirectional communication - depending on the hardware and the environment, the RSSI and LQI values could certainly be asymmetric. In one direction, you have the Router/Coordinator transmitter interacting with the device's receiver. In the other direction, you have a completely different transmitter (in the device) sending information to the Router/Coordinator's receiver. One transmitter might be significantly more powerful than the other, or tx/rx antennas might be better or worse.

In the deCONZ UI, end devices always have a 0 for the second number - e.g., 255/0 or 110/0. Looking at my mesh, most of the pairs are within 10 of each other, but a few have larger differences, such as 174/100.
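
If you want to scan a list of these pairs rather than eyeball them, something like the following Python sketch would do. It assumes the labels are plain "first/second" strings as in the examples above; the function names and the tolerance of 10 are my own choices for illustration, not anything taken from deCONZ or ZHA.

```python
def parse_lqi_pair(label):
    """Split a link label such as '174/100' into two integer LQI values."""
    first, second = (int(part) for part in label.split("/"))
    return first, second


def is_asymmetric(label, tolerance=10):
    """Flag a link whose two LQI values differ by more than tolerance."""
    first, second = parse_lqi_pair(label)
    # A second value of 0 is treated as "no reverse value reported",
    # not as a real measurement, so such links are never flagged
    if second == 0:
        return False
    return abs(first - second) > tolerance


for label in ["255/0", "110/0", "174/100"]:
    print(label, is_asymmetric(label))
```

Only the 174/100 link is flagged here; the pairs ending in 0 are skipped on purpose.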

1 Like

I am in a very old house (built before 1800!) with walls that are about 60 cm thick, and 2.4 GHz has a hard time passing through. I have been obliged to add a UniFi mesh network with a lot of access points!
But for some reason ZigBee seems a bit better.

Good to know that we are getting the same kind of results. I just added a new device (Aqara push button) via the intermediate router. In order to correctly adopt the device I had to add it several times (?), but finally it works. It is not so easy to test, as a button does not have a state, so I had to test it in the developer tools by listening to "zha_event" events.

I have also read somewhere (not sure if it is true) that, in the Xiaomi implementation, end devices, once connected to a parent (router or coordinator), do not try to find another 'path'. Once connected, they do not change link even if the connection is no longer working…

The device newly added through the router is not connected by a link in the graph. So one is connected (the magnet) and the other one is not (the button). Therefore I am wondering if the problem of devices not showing is due to Xiaomi devices not correctly reporting the link information to the coordinator. So far the only one that reports the link seems to be the magnet type device. As I have other magnet devices from Xiaomi, I will try to connect them directly to the coordinator and maybe they will show up.

As I have mentioned, the 'ZigZag network display' reports exactly the same information…

New information since yesterday!

First, contrary to what I reported based on information from the Internet, the Xiaomi Aqara end devices (like the Aqara weather device) do perform network mesh optimization correctly.

Reminder about my current ZHA ZigBee test network topology (excluding the Zigbee devices managed by the Hue hub and therefore using a different PAN):

  • A Conbee II coordinator (ZC) connected to my RPi4 running HA.
  • Close to it an Aqara weather sensor + an Aqara motion sensor.
  • Far from the ZC (but still in RF range) an Aqara weather sensor.
  • Very far from the ZC (meaning out of range of the ZC) an Aqara button + an Aqara magnet sensor.
  • Located halfway between the ZC and the "far / very far" ZEDs, I have a Philips Hue bulb acting as a router (ZR). The ZR is in range of the ZC and in range of the "far" and "very far" ZEDs.

In order to add the "very far" ZED devices, I had to adopt them using the appropriate command from the ZR (i.e. not adopting directly from the ZC). The "close" and "far" ZED devices were adopted directly by the ZC.

Within less than two hours (unfortunately I can't be more precise), the "far" Aqara weather device, with an LQI of about 100, was automatically re-routed from the ZC to the Philips ZR! Now the LQI for this device is 255.

Originally (yesterday evening), on the Zigbee network visualization graph, only the link between the ZC and the ZR showed up. This morning, to my surprise, all the links between the ZR and the three connected ZEDs also show up! Not sure why it took so much time for the graph to display these links. So now all the links are showing EXCEPT the links between the ZEDs and the ZC.


I have learned a lot about ZigBee networking and routing concepts by reading the Silicon Labs UG103.2 document. LQI, as its name implies, is an indicator of the quality of the link between two nodes. It is used to find the best possible routes. In the case of a ZED, it helps find the best 'parent'. This explains why the Aqara weather ZED device switched from the ZC (with LQI=103) to the ZR (with LQI=255).
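
As a toy illustration of that parent choice, here is a Python sketch that simply picks the candidate with the highest LQI. This is a deliberate simplification of my own, not how a Zigbee stack is actually implemented (a real stack likely weighs more than this single number); the function name and the dictionary layout are made up, and only the two LQI values come from my network.

```python
def best_parent(candidates):
    """Return the name of the candidate parent with the highest LQI."""
    # candidates maps a node name to the LQI seen towards that node
    return max(candidates, key=candidates.get)


# LQI values observed for the "far" weather sensor
candidates = {"ZC": 103, "ZR": 255}
chosen = best_parent(candidates)
print(chosen)
```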

As a reminder, an LQI of 255 means that no messages have been lost between the ZED and its parent; this can happen even if the RSSI (received signal strength) is not very strong.
Generally for RSSI:

  • anything -60 dBm and above (meaning -50, -40, etc.) would be considered a very strong signal.
  • -75 dBm is usually OK for ZigBee home automation transmissions.
  • Anything at -80 dBm and below (meaning -85, -90, etc.) is mostly noise and you risk losing messages.

In my case, I had an LQI of 103 and a RSSI of -81 dBm for the far ZED (note that the device was still working correctly).
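
The rule of thumb above can be written down as a small Python helper. This is only a sketch: the list names -60 dBm, -75 dBm and -80 dBm, so treating everything strictly between -60 and -80 as "usually OK" is my own assumption, and the function name and labels are made up.

```python
def classify_rssi(rssi_dbm):
    """Map an RSSI reading in dBm to a rough quality label."""
    if rssi_dbm >= -60:
        return "very strong"
    # Assumption: the whole range between -60 and -80 counts as usable,
    # although the rule of thumb only names -75 dBm explicitly
    if rssi_dbm > -80:
        return "usually OK"
    return "weak, risk of losing messages"


for value in (-42, -75, -81):
    print(value, classify_rssi(value))
```

With these thresholds my far ZED at -81 dBm lands in the weakest group, even though it was still working.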


Sorry for this long message, but hopefully it might be helpful…

4 Likes

After removing the manually installed zha-network-visualization-card and moving to ZHA Network Visualization, I noticed that the network address (nwk) shown in the device map is now reported in decimal, rather than the traditional hexadecimal.
This is not consistent with the nwk format shown in the Lovelace card, which is reported in hex.
It would be nice if the nwk address value were reported in the same format in all places.

Is there a way to change the network address (nwk) in the device map to be shown in hex again?
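
In the meantime, converting by hand is easy enough. A minimal Python sketch, assuming the nwk is a 16-bit value and that a lowercase, zero-padded 0x form is acceptable for comparison (the function name and the sample value are made up; this does not change what the map displays):

```python
def nwk_to_hex(nwk):
    """Format a decimal network address as a 4-digit hex string."""
    return f"0x{nwk:04x}"


# Hypothetical decimal value as it might be read off the device map
print(nwk_to_hex(43981))
# Going the other way, from hex back to decimal
print(int("0xabcd", 16))
```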

The ZHA network visualization uses the code developed for the custom:zha-network-visualization-card (now deprecated), and it has always displayed the NWK in decimal (just checked now).
The visualization is now included in HA core and, as far as I know, there are no configuration options other than database_path and enable_quirks.