Zigbee devices with poor LQI, not finding better routes that I am certain exist

FYI, relocating your Zigbee Coordinator using a long USB extension cable can make a huge difference.

See general troubleshooting tips here → https://github.com/home-assistant/home-assistant.io/commit/970295a277e8f01d3ee39eeeaacf453625b988d3

That is from ZHA docs PR → https://github.com/home-assistant/home-assistant.io/pull/18864

Also, always follow the best-practice tips here → https://www.home-assistant.io/integrations/zha#best-practices-to-avoid-pairingconnection-difficulties

@Hedda I have always kept my coordinator on top of my networking rack, as shown in the picture in this post above: Zigbee devices with poor LQI, not finding better routes that I am certain exist - #10 by aruffell

Before moving to the Sonoff coordinator, I was using a ConBee II in the same spot and everything was super solid except for the occasional meltdown of the entire mesh (after updates, reboots, etc.) that finally led me to upgrade to a Sonoff Zigbee 3.0 USB Dongle Plus coordinator.

I got the impression that some devices do not pair well sometimes, meaning something goes wrong in the process, and they then cause trouble for all the devices depending on them. I say this because the issues seem to be in 2 spots of the house, and when I re-pair routers in those areas things seem to improve (but I often have to re-pair all the battery devices too)… Still, I have battery devices going unavailable and not working, or showing as unavailable but working. That might just be because they don’t check in often enough: my current “Consider battery powered devices unavailable after (seconds)” is set to 28800 seconds (8 hours). Is that too short?

I am definitely learning a lot about Zigbee while I troubleshoot this. I believe I saw one of your comments elsewhere that led me to find how to get and set how often the device checks in with the coordinator. I have not tried it yet, but I am guessing I can do so by using “Get Zigbee Attribute” for the “PollControl” cluster and the “checkin_interval” attribute. I realize that if I were to override this setting to make the device check in more frequently, the battery would deplete faster, so I am not planning on doing so.

EDIT: While having these issues I am also having to replace batteries frequently. Seems like this issue is causing extra power consumption (messages being repeated?).

I may have found the cause for some of my devices going unavailable… can someone confirm this could be it?

I have a number of devices, mostly battery operated, that keep dropping off as unavailable. One of the following actions often revives them: pulling battery, repeatedly pressing “reconfigure device”, just waiting, etc

Just now I was trying to get the checkin_interval of a Samjin motion sensor that I revived by pulling the battery. The setting came back as 0… I tried several times with the same 0 being returned. I pressed “reconfigure device” and the setting changed to 13200. Could it be that my devices are often NOT completing the configuration properly and are therefore misbehaving? In the case of this sensor, was the checkin_interval of 0 possibly causing it not to check in, and HA / ZHA therefore marking it as unavailable?
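For reference, a rough sketch of the arithmetic behind those numbers. It assumes the value ZHA returns is the raw Zigbee Cluster Library check-in interval, which the ZCL expresses in quarter-seconds, with 0 meaning check-ins are turned off. The function names are made up for illustration; 13200 and 28800 are the figures quoted in this thread.

```python
from typing import Optional

QUARTER_SECONDS_PER_SECOND = 4


def checkin_interval_seconds(raw_value: int) -> Optional[float]:
    """Convert a raw Poll Control check-in interval to seconds.

    Assumes ZCL units of quarter-seconds; 0 is treated as "check-in disabled".
    """
    if raw_value == 0:
        return None
    return raw_value / QUARTER_SECONDS_PER_SECOND


def checkins_per_window(raw_value: int, unavailable_after_s: int) -> float:
    """Number of check-ins that fit in the availability window (0 if disabled)."""
    interval = checkin_interval_seconds(raw_value)
    if interval is None:
        return 0.0
    return unavailable_after_s / interval


if __name__ == "__main__":
    for raw in (0, 13200):
        print(raw, checkin_interval_seconds(raw), checkins_per_window(raw, 28800))
```

Under that assumption, 13200 works out to 3300 seconds (55 minutes), so several check-ins would fit inside a 28800-second window, whereas 0 would mean the sensor never checks in on its own.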

On a side note, the pairing process used to be quite fast; now it is most often slow, and at times I don’t even get the green box where you complete the name/location. When this happens I try again, as I assume something did not work/complete. I have a similar issue with Z-Wave, and there it is due to too much traffic.

Been following this thread with some interest. I too have recently switched to the SONOFF Zigbee 3.0 USB Dongle and have had a bit of a journey in the configuration. First I used the supplier’s own integration in HA, which was just truly awful and a bit of a disappointment considering all the devices I have are from that same manufacturer. So I reverted to HA MQTT and the Zigbee2mqtt integration.

Although my network is 10x smaller than yours and the devices (all battery) are within 12ft of either of the two routers on the network, I have been experiencing drop-outs every evening with most of the motion sensors. I run the latest firmware on the coordinator (attached to the Raspberry Pi running HA) and on the two routers (which coincidentally are SONOFF Zigbee 3.0 dongles as well): CC1352P2_CC2652P_launchpad_coordinator_20220219 and CC1352P2_CC2652P_launchpad_router_20220125 respectively. I have also swapped the supplied antennas for 22dB high yield ones. The coordinator is sitting, as yours is, on top of my machine rack, and the routers are placed centrally to all the sensors on the ground and first floor.

I thought at first that the house’s structure could be causing this issue (it’s five hundred years old and has a thumping oak frame), but the LQI for all devices has been showing in the hundreds (this obviously would drop to zero when the devices disconnected).

I started right from the beginning, after some research before implementation, using a combination of timeouts and device parameters in the zigbee2mqtt configuration.yaml and I think that this was a mistake. For the last couple of weeks I have been running around resetting devices on this unstable network.

I have now stripped out all the generic timeouts and added them only where needed for individual devices using the devices.yaml file, segregating them from the configuration.yaml. I also found a parameter that seems to have helped considerably: transmit_power in the advanced section of the configuration.yaml, which defaults to 5 dBm when not otherwise stated. I updated mine to 12 (not wanting to shout too loudly) and this seemed to give more accurate and decisive communication.
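To put those two settings in perspective, here is a small sketch converting them to milliwatts. It assumes transmit_power is expressed in dBm; the helper name is just for illustration.

```python
def dbm_to_mw(dbm: float) -> float:
    """Convert a power level in dBm to milliwatts (0 dBm is 1 mW)."""
    return 10 ** (dbm / 10)


if __name__ == "__main__":
    default_mw = dbm_to_mw(5)
    raised_mw = dbm_to_mw(12)
    print(f"5 dBm  = {default_mw:.2f} mW")
    print(f"12 dBm = {raised_mw:.2f} mW")
    print(f"ratio  = {raised_mw / default_mw:.2f}x")
```

Because the scale is logarithmic, a 7 dB increase is roughly a fivefold increase in radiated power, not a doubling.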

Yesterday and today have been quite stable, but only time will tell…

Just a few remarks:

  • “high yield” antennas don’t provide that yield/gain for free: you’re giving up omnidirectionality. In other words, such antennas are more directional.

  • higher transmit power can work paradoxically: because the coordinator’s signal can overpower the signal of routers, battery-powered devices may think that connecting directly with the coordinator will yield a better route. However, those battery-powered devices will still be transmitting on their (low) level and increasing transmit power doesn’t mean the coordinator can pick up those signals any better.
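The asymmetry in the second remark can be sketched with a very simplified link budget: received level = transmit power + both antenna gains - path loss. All numbers below are hypothetical and only meant to show the shape of the effect; real links also depend on noise, interference, and receiver sensitivity.

```python
def received_dbm(tx_power_dbm, tx_gain_dbi, rx_gain_dbi, path_loss_db):
    """Simplified received signal level for one direction of a link."""
    return tx_power_dbm + tx_gain_dbi + rx_gain_dbi - path_loss_db


def link_levels(coord_tx_dbm, device_tx_dbm, coord_gain_dbi, device_gain_dbi, path_loss_db):
    """Return (downlink, uplink) received levels in dBm.

    Downlink: coordinator -> device. Uplink: device -> coordinator.
    """
    down = received_dbm(coord_tx_dbm, coord_gain_dbi, device_gain_dbi, path_loss_db)
    up = received_dbm(device_tx_dbm, device_gain_dbi, coord_gain_dbi, path_loss_db)
    return down, up


if __name__ == "__main__":
    # Hypothetical: device transmits at 0 dBm, 85 dB path loss, 3 dBi coordinator antenna.
    print(link_levels(5, 0, 3, 0, 85))   # coordinator at 5 dBm
    print(link_levels(12, 0, 3, 0, 85))  # coordinator at 12 dBm
```

In this toy model, raising the coordinator's transmit power improves only the downlink figure; the uplink figure stays where it was. Antenna gain, by contrast, appears in both directions.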


Why don’t you just try to move that antenna to a different place? It can be the easiest solution. The argument that your previous setup worked in that spot is just invalid and you are being stubborn.


That’s interesting, thanks, in particular about the coordinator overpowering the routers. I have tried multiple combinations of antenna configurations on the coordinator/routers within my set-up and have found, so far, that this works best for me (at least for the time being). It also helps that the routers are placed almost centrally to the device clusters on their respective floors…

I did have a question about the mapping function within zigbee2mqtt. On previous occasions it would seem that one or two of the devices become disassociated from the router and are spinning without an LQI line, but working fine. Could this be a mapping error? If so, is there a more surefire way of determining what’s connected to what?

I’ve seen similar situations with the z2m map.

Because it takes a bit of time to generate the map, I assume that it’s asking all router devices in the network about which connections they have (to other routers and to end devices). Perhaps when a router takes too long to answer it will not be taken into account, and since z2m knows which devices are in the network, it will show the devices “floating by themselves”.
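A toy model of that guess, not Zigbee2MQTT's actual code: build the map only from the routers that answered, then list the known devices that ended up with no link. The device names and the report format are invented for illustration.

```python
def build_map(known_devices, neighbor_reports):
    """Build links from per-router neighbor lists.

    neighbor_reports maps a router name to its list of neighbors,
    or to None if that router did not answer in time.
    Returns (links, floating), where floating is the sorted list of
    known devices that appear in no link.
    """
    links = set()
    for router, neighbors in neighbor_reports.items():
        if neighbors is None:
            continue  # no answer: this router contributes nothing to the map
        for neighbor in neighbors:
            links.add(frozenset((router, neighbor)))
    linked = {device for link in links for device in link}
    floating = sorted(set(known_devices) - linked)
    return links, floating


if __name__ == "__main__":
    devices = ["coordinator", "router_a", "router_b", "sensor_1", "sensor_2"]
    reports = {
        "coordinator": ["router_a", "router_b"],
        "router_a": ["coordinator", "sensor_1"],
        "router_b": None,  # timed out; sensor_2 is actually its child
    }
    print(build_map(devices, reports)[1])
```

Here sensor_2 would be drawn without any line even though it is still attached to router_b and working, which matches what was described.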

Stubborn? lol - I can’t move it because the computer running it has to be in the rack and the USB extension is only so long. I don’t want the complexity of having another system, even an RPI, running ZHA just to be able to relocate it especially knowing that the network was rock solid prior to 2022.4 and my switch to the Sonoff coordinator.

I am testing the antenna in a horizontal position though, as this way the lobes envelop more devices. The thought is that it might receive signals from the floor below better (see omnidirectional antenna lobes).


I recently noticed that devices the coordinator shows with very low LQI have tons of children with very strong LQI. This one device I was looking at was also very close to the coordinator itself. It just leaves me puzzled on what this LQI actually refers to… the worst possible route it finds? lol

I have also been testing (but don’t know how to confirm) source based routing. I have yet to notice any improvements after several days. Anyone do this and have pointers?

I bought a couple of extension cables for the antenna from amazon for a few quid.

https://www.amazon.co.uk/Antenna-Extension-Coaxial-Extensionl-Wireless/dp/B08D9HNPY5/ref=sr_1_15?keywords=antenna+cable&qid=1651326100&sprefix=antenna+%2Caps%2C78&sr=8-15

Saved me ripping the ‘stick on’ motion sensors off the wall and I could wander around the various areas when they needed pairing. It also eliminated any possibility that the machine rack may be causing the issue.

I did consider buying the small coax cable for the antenna so I can relocate just the antenna, but my options are really limited by the metal network cabinet and its location. I am actually tempted to install an antenna with less gain, as I keep thinking about the imbalance created by a coordinator with way better reception and transmission gain compared to the average zigbee device… it is a two-way communication, so I believe the imbalance may introduce more issues than benefits in some cases.

A few comments.

The LQI numbers are just some number they show. It is probably some signal strength multiplied by something; however, I have no indication that a high number performs any better than a low one. When I was using ZHA and the ConBee II, most of my devices were at 255 and the lowest was above 100. Using the Sonoff 3.0 dongle and Z2M I have no numbers above 200 and most are below 100. However, the zigbee network is significantly better than before, to the extent that I do not experience any errors.

I have tried to “over-engineer” the zigbee mesh: changing routers, choosing where to install additional routers, and moving the ConBee II dongle. I have concluded that all this makes little to no difference.

My belief is that a few simple steps are important.

  • Have a long extension cable and put the coordinator far away from anything. Several meters.
  • Make sure your Wi-Fi channels are not overlapping
  • For the Sonoff 3.0 dongle, use Z2M
  • Do not over-engineer the setup, do not change any numbers. This will lead to problems.
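On the channel point, here is a rough sketch for checking which Zigbee channels fall inside a given Wi-Fi channel. It assumes the advice refers to Wi-Fi overlapping the Zigbee channel, and it assumes a 22 MHz Wi-Fi channel width and a 2 MHz Zigbee channel width; wider Wi-Fi channels (40 MHz) would overlap more. The centre frequencies follow the standard 2.4 GHz channel plans.

```python
WIFI_WIDTH_MHZ = 22    # assumed width of a 2.4 GHz Wi-Fi channel
ZIGBEE_WIDTH_MHZ = 2   # assumed width of a Zigbee channel


def wifi_center_mhz(channel: int) -> int:
    """Centre frequency of 2.4 GHz Wi-Fi channels 1-13."""
    return 2412 + 5 * (channel - 1)


def zigbee_center_mhz(channel: int) -> int:
    """Centre frequency of 2.4 GHz Zigbee channels 11-26."""
    return 2405 + 5 * (channel - 11)


def overlapping_zigbee_channels(wifi_channel: int) -> list:
    """Zigbee channels whose band intersects the given Wi-Fi channel."""
    limit = (WIFI_WIDTH_MHZ + ZIGBEE_WIDTH_MHZ) / 2
    wifi = wifi_center_mhz(wifi_channel)
    return [z for z in range(11, 27) if abs(zigbee_center_mhz(z) - wifi) < limit]


def clear_zigbee_channels(wifi_channels) -> list:
    """Zigbee channels that overlap none of the given Wi-Fi channels."""
    blocked = {z for w in wifi_channels for z in overlapping_zigbee_channels(w)}
    return [z for z in range(11, 27) if z not in blocked]


if __name__ == "__main__":
    print(overlapping_zigbee_channels(1))
    print(clear_zigbee_channels([1, 6, 11]))
```

With Wi-Fi on channels 1, 6 and 11, this model leaves only Zigbee channels 15, 20, 25 and 26 clear, which is why those are commonly suggested.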

This is all based on my 1 year of using zigbee and HA. A personal experience and nothing else.


Yeah, I noticed that Sonoff and Conbee report LQI differently. If we knew exactly what it is trying to show it would be easier to optimize the network. I am not too fussed about the LQI other than the fact that I keep running into unavailable devices. Some come back on their own, some do not. It is frustrating not to be able to figure out the cause.

Most recently some of the troublesome devices were reacting to a command 4 times… meaning I turn a switch on and then off, and the device turns on and off 4 times with varying delays. I have read that zigbee sometimes sends the command up to 3 times (?) if there are issues. So, this leads me to guess that the coordinator is not hearing back from the device and therefore tries to repeat the command.
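That guess can be illustrated with a toy model: one original transmission plus up to 3 retries, where the sender stops as soon as it hears an acknowledgement. It assumes every transmission reaches the device and only the acknowledgements get lost. Real Zigbee stacks have their own acknowledgement and duplicate-handling rules, which this does not attempt to reproduce; the names and the dedupe flag are purely illustrative.

```python
def executions_for_one_command(ack_heard, max_retries=3, dedupe=False):
    """Count how often the device acts on a single command.

    ack_heard: list of booleans, one per transmission attempt, saying
    whether the sender heard the acknowledgement for that attempt.
    dedupe: if True, the device ignores repeats of a command it already saw.
    """
    executions = 0
    already_seen = False
    for heard in ack_heard[: max_retries + 1]:
        if not (dedupe and already_seen):
            executions += 1  # device acts on this copy of the command
        already_seen = True
        if heard:
            break  # sender is satisfied and stops retrying
    return executions


if __name__ == "__main__":
    all_acks_lost = [False, False, False, False]
    print(executions_for_one_command(all_acks_lost))
    print(executions_for_one_command(all_acks_lost, dedupe=True))
```

In this model, a device that hears every attempt but whose acknowledgements never make it back would act 4 times on one command, which lines up with the 4 reactions observed.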

Your metal cabinet, which seems to include some CPU (maybe 2.4 GHz) devices plus a disk rack and whatever else (i.e. a few high-traffic USB 3.0 devices), is actually one of the worst places to have a Zigbee coordinator. Your metal rack will most likely generate so much interference that the nearby “router/wall-plug” (that you mention) seems like the best “connection” point for these distant devices. I would for sure look for an Ethernet-connected Zigbee coordinator, so you have the best options for placing the coordinator where it doesn’t have to “compete” or struggle to receive messages on the 2.4 GHz frequency.

@boheme61 On a good note, there is no Wi-Fi inside the cabinet, and no USB 3.0 devices. The only device emitting RF is a GLEDOPTO LED controller so that my cabinet lights up red when you open the door. I am guessing there is some amount of EMI from all the equipment, but it is also all grounded and the cabinet itself should act as a shield to some extent… well except on the front ;).

The metal cabinet is surely going to block or impair signals coming from the opposite side of the cabinet from the antenna’s perspective (when the metal is in between), but a mesh should not care and should simply route accordingly. In the media room, where the cabinet is, there are at least 4 zigbee routers, with 3 more right above it (in the attic) and a ton below it and on its sides. As mentioned before, the Conbee was rock solid other than the occasional meltdown which got me to upgrade. For the 7 years prior to HA, I was using a SmartThings hub sitting on top of this rack and my zigbee devices worked flawlessly.

While I concur with many of the objections on the placement, especially those regarding its location (top floor, side of house, instead of ground floor in the middle which I consider the best spot), I have 8 years of flawless zigbee mesh network… most of which with that rack.

I am not discounting any of the feedback and suggestions… they are all making me think.

I do like your suggestion of using an ethernet to zigbee interface with the exception that where it would be best for me to put the coordinator I do not have the option to bring wired ethernet. This also means that when I update, reboot, etc my networking gear (20 switches, 5 APs, etc you get the idea) I would bring down part of my automation as well. That was the case with ST and I was enjoying the fact it was no longer the case with HA.

It’s all food for thought… thank you for the flow of suggestions everyone!

Edit: Correction… the networking gear does emit Bluetooth (so 2.4GHz) at least during setup. I’ll check, but I believe it is off, as it would have no use, after the gear is set up. The ethernet to RF interface for my Chamberlain garage opener is also in there, but that operates at 3xx/4xx MHz so should not be an issue… mentioning for completeness as I forgot it was even there.

Edit2: The core systems of the house are in there so that they can all connect to the core switch to minimize points of failure. Once I update my UDMP (router) to the newer version, I will likely move some of those critical devices, such as DNS, VPN, and possibly HA to be connected directly to the embedded switch on the newer UDMP (the old one does not have POE that I use to power some of that gear). HA is running on one of the 2 Lenovo m920q (i7-8700T, 32GB, SSD, etc) so it needs to be in the cabinet for direct connectivity to the core switch and for the UPS that is in there too… plus other important VMs run on it too.

Ok, that gave me a more clear picture of your Rack, and situation, i just got caught up by

So my thought was that this nearby device was maybe not built to handle a huge bunch of children, and was therefore part of your problem.
Besides, the “rack” looked like a busy server rack with a disk cabinet etc. :) Also, I was not aware you had so many “repeaters” in your Zigbee mesh; I still have a very small number of devices by comparison. With the various topics I’ve seen here, besides the various hardware (coordinators), I came to think about that (“mesh” in all honor). My new Wi-Fi router supports Wi-Fi mesh, and 2 plugs can act as Zigbee repeaters, but I have never been a fan of the popular “mesh” term, because I still stick to old proven tech and have not looked deeper into the “new” mesh tech, other than noticing the various problems people seem to experience, like less control over the route (in particular), confused devices, etc.
Hope you “narrow” down your issue soon… I think I will invest in a “signal analyzer” if (when) my system grows too large :) but I definitely got many hints, pros and cons since starting here in this forum.
PS: in regards to LQI, I have at least 7 devices working fine beneath 70-80

Hi, what app do you use to view this on your mobile?

a special app to see that?

No, that screen grab was from the device info in the ZHA integration.

Go to Integrations -> ZHA -> Devices and click on the desired device.