Zigbee Connectivity in my house sucks 😢

Hi guys,

I'm still trying to figure out why Zigbee is so buggy in my environment.

I started a thread about a month ago asking why connectivity was so bad in my house. People recommended buying and installing a few Zigbee power plugs, which I did. Almost every room has one dangling around now, so if there's anyone out there who has a Zigbee mesh, it's me. I also got myself a SkyConnect stick, which is dangling on a USB cable, away from other devices (WiFi, SSD, router etc). I also have a power plug right next to the SkyConnect, so I have an immediate second Zigbee router/mesh device. HA is running on a Pi 4 with 4 GB RAM.

I'm still having trouble with several Zigbee devices, foremost the PIR sensors. I have several Sonoff devices, many of which I have heard are of bad quality, so I bought a more expensive one from Aqara. Same issue.

Sometimes it works, sometimes it doesn't. In the beginning, it worked fine, now it's like once per day, maybe twice. Sometimes it works better for a couple of hours after I restart HA, sometimes it doesn't at all. I'm so lost.

I have zero idea what is going on here. In the ZHA network diagram, I see some/many devices that have no connection to a hub, but when I zoom in, it doesn't say they are offline, so there's no indication of an error. It can't be a connectivity issue, given how many Zigbee power plugs I have. It also can't be low batteries: the Aqara is new, and I bought brand new batteries for the Sonoff devices too. The Pi is not too slow, and the Zigbee stick is of good quality (I guess?). I have no clue at all what could be going wrong here. I don't think I have too many Zigbee devices, either. There are about 15 connected devices, plus 10 offline ones that were there for testing some time ago.

Do you guys have any recommendations? I'd love to have a stable environment where the PIRs are working properly. This makes me so sad :sob::sob::sob:

Looking forward to what you might say!

Hey mombro, did you already check the channels of ZigBee and 2.4 GHz WiFi for interference?

Here are some explanations: ZigBee and Wi-Fi Coexistence | MetaGeek

I pinned my WiFi to channel 6 and set my ZigBee to channel 11.
This works out pretty well for me with 28 devices.

What?! How's that possible, and why have I never heard of this before?

That sounds extremely plausible! I'll check this ASAP :thinking::ok_hand::+1:

I also have my 2.4 GHz WiFi set to channel 6 and Zigbee on channel 11. And I have 80+ Zigbee devices.
Just check whether there is a Bluetooth speaker or something else that uses Bluetooth, as it can produce a lot of interference and can cause frequent device dropouts.

I switched my 3 APs to WiFi channels 1 and 6 and migrated Zigbee from 20(?) to 11.

Let's see what happens tomorrow! :heart:

It should be OK now. As far as I know, this channel setup works best for many people.

I recommend reading and trying to follow all the tips here → Zigbee networks: how to guide for avoiding interference and optimize for getting better range + coverage

That was very helpful, thank you!

I must say, I think the coverage is pretty fine now. But I think something is wrong with HA. Somehow, no entities were connected anymore today, not even the closest and most reliable ones. I reloaded the integration and it worked fine again immediately.

Any recommendations on what I should do next time to troubleshoot? Any logs I should look at or post here?

An entities card with all your LQI values on it can be illuminating sometimes, especially if you group them by area.

Individual LQI values don't mean much - they change all the time and different manufacturers calculate them differently - but over time you may be able to identify areas where LQI is consistently lower. That's where to put an extra routing device.
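If you log LQI for a few days, a rough way to spot the consistently weak areas is to average per area. This is only a sketch: the sample readings and area names below are made up for illustration, and `mean_lqi_by_area` is a hypothetical helper, not part of ZHA.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (area, LQI) samples logged over a few days; not real data.
samples = [
    ("living room", 180), ("living room", 150), ("living room", 200),
    ("kitchen", 120), ("kitchen", 90), ("kitchen", 140),
    ("basement", 40), ("basement", 70), ("basement", 35),
]


def mean_lqi_by_area(readings):
    """Return (area, mean LQI) pairs, weakest area first."""
    by_area = defaultdict(list)
    for area, lqi in readings:
        by_area[area].append(lqi)
    averages = ((area, mean(values)) for area, values in by_area.items())
    return sorted(averages, key=lambda item: item[1])


ranked = mean_lqi_by_area(samples)
for area, average in ranked:
    print(f"{area}: {average:.0f}")
```

The area at the top of the list is the first candidate for an extra router. Only the relative ranking matters here, since absolute LQI values differ between manufacturers.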

You mention the ZHA network visualisation in your OP. This is a snapshot of all the possible routes Zigbee is aware of, not of the ones it is using. A lot of them (red and grey) it won't use if there's an alternative. It's not uncommon to see end devices that don't appear to be connected to anything - it usually means they're sleeping to conserve battery. They should wake immediately if a sensor value changes, and periodically they will wake anyway to check in with their parent router.

For the same reason end devices will sometimes have a value of “unavailable” in the device info card, particularly after a restart. It can take some time for them all to check in, but again if a sensor changes they should wake immediately and update all their entity values (battery level, etc). Wave your hand in front of a PIR before you assume it's not working!

You mention a power plug in each room to act as a router - that may not be enough. Zigbee works by having many, many connections between routers. When a message goes from the coordinator to a device at the other end of the house, its route is determined by evaluating the quality of the connections at each hop - it needs to have a choice or you get bottlenecks. End devices like motion sensors don't contribute to this at all.

Finally, Zigbee networks settle over time. After a few days you may find link quality improving a little.

Here's what I got… Left is the basement, right is the living room + kitchen.

No connection on most Zigbee devices, or rather “only power plugs are working”.

I turned on the LAM(P)_Garage and the PIR_Keller (basement) went online too, which is odd. But all the other PIRs are not working despite really good coverage by power plugs :frowning:

“Finally, Zigbee networks settle over time. After a few days you may find link quality improving a little.”

For me, it's the opposite. I restart HA, most devices work, I wait 4-6 days, none work at all.

Is there really no log or something like that where I can find debug information? :frowning:

You can turn on debug logging on the integration card in Settings | Devices & Services | Integrations

If you really want to deep dive, there's a HACS integration, ZHA Toolkit.

Before you dig any deeper, there is one thing you can try. You have problems with battery-powered devices. Buy a new battery, replace it in at least one of the devices and pair it again close to the coordinator.
If the battery is low, devices tend to fall off the network even though the battery level may still be reported as 100%.

I thought about batteries, too, so I already bought new ones on Amazon. Also, my Aqara motion sensor is brand new and its batteries should last 2 years :see_no_evil:

Speaking of the Aqara, I will attach a screenshot. LQI is above 100, but it hasn't updated for 2 hours already :unamused:

WiFi channel one is right on top of Zigbee 11.

That said, LQI is just a vague hint, and different devices calculate it differently. I have values ranging from single digits to over 200, all devices work well.

Should I have used 1 and 11 for WiFi and 6 for zigbee, then?

Also, I don't care much about the LQI value, but what's interesting is that it didn't change for 2 hours in the mentioned screenshot. I think there is a connection… I think other devices stop working for the same duration. I must check and get evidence next…

I'm guessing you didn't actually look at the link in the first reply to you?

For simplicity:


WiFi on 1 and 6 would mean Zigbee on 25 - assuming no neighbours have WiFi on 12 or higher.
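To make the overlap concrete, here is a rough Python sketch. It assumes the standard 2.4 GHz channel plan (Wi-Fi channel n centred at 2407 + 5n MHz for channels 1 to 13, Zigbee channel k centred at 2405 + 5(k - 11) MHz) and nominal widths of 22 MHz for Wi-Fi and 2 MHz for Zigbee. The function names are made up for illustration, and real interference also depends on signal strength and on what the neighbours are running.

```python
WIFI_HALF_WIDTH_MHZ = 11   # assumption: nominal 22 MHz wide Wi-Fi channel
ZIGBEE_HALF_WIDTH_MHZ = 1  # assumption: nominal 2 MHz wide Zigbee channel


def wifi_center_mhz(channel):
    """Centre frequency of 2.4 GHz Wi-Fi channels 1-13."""
    return 2407 + 5 * channel


def zigbee_center_mhz(channel):
    """Centre frequency of Zigbee channels 11-26."""
    return 2405 + 5 * (channel - 11)


def overlaps(zigbee_channel, wifi_channel):
    """True if the nominal Zigbee and Wi-Fi bands overlap."""
    gap = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return gap < WIFI_HALF_WIDTH_MHZ + ZIGBEE_HALF_WIDTH_MHZ


def clear_zigbee_channels(wifi_channels):
    """Zigbee channels that overlap none of the given Wi-Fi channels."""
    return [z for z in range(11, 27)
            if not any(overlaps(z, w) for w in wifi_channels)]


print(overlaps(11, 1))                # Zigbee 11 vs Wi-Fi 1
print(clear_zigbee_channels([1, 6]))  # Wi-Fi on channels 1 and 6
```

Under these assumptions Zigbee 11 overlaps Wi-Fi 1, and with Wi-Fi on 1 and 6 the channels left clear on paper are 15 and 20 through 26. Channel 25 sits well away from both, which matches the suggestion above as long as nobody nearby runs Wi-Fi on 12 or higher.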

Not everything is reported at the same rate. A device may poll its parent every few seconds to see if there are any buffered messages, update its LQI every five minutes and update its battery status every couple of hours. This is a design choice on the part of the manufacturer to conserve battery. Different devices will do it differently. Most battery powered devices will be unavailable when you start HA - how long they take to appear will depend on their polling cycle.

On the channel thing, if you go to Download diagnostics on the ZHA integration card you'll get a list of channels, with current usage (it's right at the bottom of the report).

  "energy_scan": {
      "11": 82.35373987514762,
      "12": 73.50699819621309,
      "13": 78.25348754651363,
      "14": 84.164247274957,
      "15": 43.057636198227904,
      "16": 55.9836862725909,
      "17": 88.70042934643088,
      "18": 10.914542804728702,
      "19": 19.00785284282869,
      "20": 1.5075412082833717,
      "21": 6.789392891308996,
      "22": 10.914542804728702,
      "23": 17.086630587133605,
      "24": 4.15070068297423,
      "25": 1.7132450748239665,
      "26": 70.89933442360993
  }

You need to check this several times over a period of days to allow for neighbours' noise. Unlike Wi-Fi scans, it includes Zigbee loading (channel 11 in this example).
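If you save the scan from several diagnostics downloads, a few lines of Python can average them and rank the quietest channels. This is a minimal sketch: the values are the ones from the example above, rounded to two decimals, and `average_scans` and `quietest` are hypothetical helper names. How you pull `energy_scan` out of the downloaded file is left to you.

```python
from statistics import mean

# Values from the energy_scan example above, rounded (channel -> % usage).
scan = {
    "11": 82.35, "12": 73.51, "13": 78.25, "14": 84.16,
    "15": 43.06, "16": 55.98, "17": 88.70, "18": 10.91,
    "19": 19.01, "20": 1.51, "21": 6.79, "22": 10.91,
    "23": 17.09, "24": 4.15, "25": 1.71, "26": 70.90,
}


def average_scans(scans):
    """Average usage per channel across scans taken on different days."""
    return {ch: mean(s[ch] for s in scans) for ch in scans[0]}


def quietest(usage, count=3):
    """Channel numbers with the lowest usage, quietest first."""
    ranked = sorted(usage, key=usage.get)
    return [int(ch) for ch in ranked[:count]]


# Pass every scan you have collected; only one is available here.
averaged = average_scans([scan])
print(quietest(averaged))
```

With this single scan the quietest channels come out as 20, 25 and 24. One scan is only a snapshot, so the ranking is worth trusting only after averaging several.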

Just checking up on this thread and I didn't know this. Sounds super useful. Unfortunately, Download Diagnostics seems to be broken for me. The JSON serialisation fails. Is it just me?

Failed to serialize to JSON: config_entry/bd15ef695cfdf244bf2b96a4b557fc06. Bad data at $.application_state.network_info.nwk_addresses<key: 00:17:88:01:0b:76:dd:b6>=00:17:88:01:0b:76:dd:b6(<class...

Edit: Restarting HA and then trying again worked.

I'm not having problems at the moment, but I'm always worried by that usage warning in the logs (e.g., channel 15 is 95.7%…). Unfortunately the channels recommended (15, 20) are crowded (usage above 95%) and channel 25 (specifically not recommended in the current docs) is particularly empty (only 8% usage). I'm reluctant to try that channel.

Oh, ok, sorry man, I have a 14-month-old child and my life is super stressful :cry: I was thinking this Zigbee thing would be a hell of a lot easier, and now it seems it's a science of its own!

I'm gonna swap over to channel 25 on Zigbee then!

It would be odd, though, that the LQI reports continuously and then suddenly stops doing so! It has been off now for 6 hours…