Zigbee super unreliable?

I’m really at a loss here what to do. I keep having devices that have to go thru a repeater (and the repeaters themselves) dropping out for no apparent reason and can’t figure out how to diagnose it.

Most forums and such say “just turn the gateway off for 15 minutes then wait 24-48 hours and it will heal” but that only works about half the time, and then I have to un-pair and re-pair each sensor and test them all again, not a trivial task.

Presently half my IKEA TRADFRI repeaters are showing in deCONZ as “unreachable” and I have no idea why. Unplugged them and re-plugged to try and reboot, no dice. About 1/4 of my motion sensors are also totally unresponsive, though I have no idea if it’s related because they don’t seem to pick the nearest repeater to pair to; it’s totally random from what I can tell.

I don’t see how it could be a range thing. In the case of the office one it’s literally right above the gateway thru the floor, and 1 wall away from the garage…the garage one is “online” but the one closer is not. Almost all of these things are within 50ft and 1-2 walls of the gateway.

I’m really out of ideas, and spending so much time debugging makes me think this is why Zigbee is so much cheaper, and that I should simply invest more money and get Z-Wave sensors, which seem far more bulletproof.

Does anyone have ideas how to troubleshoot the “just works” when it “just doesn’t”?

You’ve gone down most of the normal questions but you haven’t covered…

What coordinator stick/device you’re using
How it’s connected to the machine running deCONZ (yes, it can matter)
What Zigbee channel your system is using
What WiFi channel(s) your 2.4GHz WiFi is using

RF interference can knock even the best-planned Zigbee network to its knees. As low power as Zigbee is, 50ft could still be too far in a noisy RF environment.

Conbee II that is on a USB extension so it sticks straight up away from my server rack while the host is mounted inside the enclosure. I’m running HassOS. This seemed to provide the best performance having it stick straight up and away from the other cables and such. See photo.

I changed it to Zigbee channel 25 after some suggested that may be better, but it still seems horribly unreliable.

There’s WiFi on all the channels around me. I only use 2.4GHz for a couple ESPHome devices; all my phones/cameras/laptops/tablets are on 5GHz. On my own network the 2.4GHz APs are on 1, 6, 11 but I have neighbors on 1, 4, 6, 7, 9, 11.

I also have a spectrum analyzer I’ve used for chasing WiFi issues but it’s not showing anything obvious to me right now…and I assume that interference would have to be non-stop in order to keep it broken? Or would losing connectivity briefly make it give up until it’s manually removed and re-paired?

That answer is unfortunately VERY brand specific. How IKEA handles a disconnect vs. how Aqara or another brand does are all different.

You have 1, 6, and 11 on your 2.4GHz APs and Zigbee on 25? Your 2.4GHz WiFi channel 11 and Zigbee 25 are in direct conflict (someone will link the MetaGeek image here I’m sure - I can’t find it right now). Short version - in your case I would immediately stop using 11 on your 2.4GHz WiFi.

Compared to Wifi Signals, Zigbee is like whispering next to a bullhorn. Or someone on that same bullhorn next to a cruise ship horn… Orders of magnitude. Wifi ALWAYS wins in an overlap situation. The net result is knocking out the Zigbee mesh, sometimes temporarily - sometimes, not until a manual intervention - depends on your routing device.

In my case I had a great zigbee mesh I thought was running perfectly on a ST hub for two years, then one morning Poof - the entire western half of the system went unresponsive. Turns out my neighbor got new wifi gear and MY gear compensated by switching channels. Guess how long THAT took to figure out. (And three more days to remember to switch off the auto channel optimization on my Unifi gear AFTER I fixed it the first time…)

Back to your original core question, which I read as: is there a magic bullet to make this bulletproof for me? No. Neither will there be for Z-Wave, nor ESP32 devices on WiFi, nor Bluetooth LE… They ALL have their quirks.

I was under the impression Zigbee 25 is past WiFi channel 11?
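For anyone wanting to check this kind of claim by arithmetic rather than from a chart, here is a rough sketch. It assumes the standard 2.4 GHz channel plans (Zigbee channels 11-26 at 2405 MHz plus 5 MHz steps, WiFi channels 1-13 at 2412 MHz plus 5 MHz steps), a nominal 22 MHz wide WiFi channel and a 2 MHz wide Zigbee channel. The function names are made up for illustration, and real transmitters leak energy beyond the nominal width, so "no overlap" here does not guarantee "no interference".

```python
ZIGBEE_WIDTH_MHZ = 2   # nominal Zigbee (IEEE 802.15.4) channel width
WIFI_WIDTH_MHZ = 22    # assumption: nominal 20 MHz-class WiFi channel mask


def zigbee_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz Zigbee channel (11-26)."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz Zigbee channels are 11-26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz WiFi channel (1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers WiFi channels 1-13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel: int, wifi_channel: int,
             wifi_width_mhz: float = WIFI_WIDTH_MHZ) -> bool:
    """True if the nominal Zigbee and WiFi channel bands intersect."""
    distance = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return distance < (ZIGBEE_WIDTH_MHZ + wifi_width_mhz) / 2


# Zigbee 25 sits at 2475 MHz, WiFi 11 at 2462 MHz (nominal edge about 2473 MHz).
print(overlaps(25, 11))  # False under the 22 MHz assumption
print(overlaps(24, 11))  # True
```

Under these assumptions Zigbee 25 sits just above the nominal upper edge of WiFi 11, while Zigbee 24 and below fall inside it.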

I’m also not sure that would matter given there are so many other networks on all the other channels anyway. For example, I had to move some of my APs away from channel 1 because someone got some kind of high power thing that is now overpowering my own devices in my own house from wherever it is, and based on the OUI it’s some kind of Chinese mesh thing now spamming a lot (I think that’s what the activity bars are on the first third of the spectrum analyzer). I’ve got 4 APs to cover the house reliably so I also need to be careful I’m not interfering with myself, since that is already more than the number of non-overlapping WiFi channels.

It also gets better - I have played the “tuning” game myself - I think some of the neighbors’ WiFi jumps around channels…so it may be a futile effort.

So far all my Z-Wave stuff has been rock solid, as has my ESPHome (though mDNS doesn’t behave reliably). The Z-Wave even seems to quickly reconfigure itself: if (for example) a circuit is shut off to do maintenance, nothing noticeable happens. Once in a while, one will fail to shut off with an automation but I think that’s an automation glitch, not radio, because it is always fully responsive if I manually toggle it in the app.

Matt, I fully agree with your points, however I have come to the conclusion that a lot of this theory is a little outdated and I cannot find good information on WiFi as it is used today.

The problem with the original drawings of the overlap is that they rely on 20 MHz WiFi channels, and most modern WiFi access points use 40 MHz as standard. This creates much more overlap. Furthermore, they have “performance boost” modes, hence will spam even more in shorter periods. In the EU we have 13 WiFi channels, creating full overlap with Zigbee channels 25 and 26, and I suspect channel 11 at 40 MHz is using the same frequencies. I have not been able to find good information on how all this influences Zigbee.

Personally I use Zigbee channel 11 and channel 13, due to neighbors using mainly WiFi channels 6 and 11. I have my own WiFi on channel 6 (20 MHz), 9 (20 MHz) and 13 (40 MHz). It works super solid for me, both for WiFi and Zigbee. I have only limited traffic on my 2.4 GHz WiFi, most devices being on 5 GHz.

As far as I know, it still has to stay in the same overall width allocated to the country - for example here in the USA the highest WiFi channel is 11, and with 40MHz it can’t go higher, so it actually runs on 6+11 while channel 1 with 40MHz runs on 1+6. So that’s why being “above” the outer edge of the highest 20MHz WiFi channel should, at least in theory, mean reduced interference no matter what combination of WiFi you are on.
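A quick way to sanity-check the 40 MHz reasoning is to compute the span a bonded channel occupies. This is only a sketch under stated assumptions: the secondary channel is taken to sit 20 MHz (four channel numbers) from the primary, so primary 11 bonds down to roughly channel 7 rather than exactly 6, and the bonded block is treated as a clean 40 MHz rectangle with no leakage. The helper names are hypothetical.

```python
from typing import Tuple


def wifi40_span_mhz(primary: int, secondary_above: bool) -> Tuple[int, int]:
    """Nominal (low, high) edges in MHz of a 40 MHz bonded 2.4 GHz WiFi channel."""
    primary_center = 2407 + 5 * primary
    # Assumption: the secondary 20 MHz channel is offset by exactly 20 MHz.
    secondary_center = primary_center + 20 if secondary_above else primary_center - 20
    low = min(primary_center, secondary_center) - 10
    high = max(primary_center, secondary_center) + 10
    return (low, high)


def zigbee_hit(zigbee_channel: int, span: Tuple[int, int]) -> bool:
    """True if the 2 MHz wide Zigbee channel intersects the given span."""
    center = 2405 + 5 * (zigbee_channel - 11)
    low, high = span
    return low < center + 1 and center - 1 < high


us_top = wifi40_span_mhz(11, secondary_above=False)   # (2432, 2472)
eu_top = wifi40_span_mhz(13, secondary_above=False)   # (2442, 2482)

print(zigbee_hit(25, us_top))  # False: Zigbee 25 (2475 MHz) is above the span
print(zigbee_hit(25, eu_top))  # True
print(zigbee_hit(26, eu_top))  # True
```

With these numbers a 40 MHz network on primary channel 11 still ends below Zigbee 25, while a 40 MHz network on channel 13 covers both Zigbee 25 and 26.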

You can kind of see it on the spectrum analyzer too, where 1-6 is more or less a solid block of RF energy consumed by the 40MHz networks on 1 and 6 which I guess were more active at that moment.

At least on Linux I can see in the WiFi channel list whether there is a secondary channel and if it’s above/below the primary channel.

Also, inexplicably, after yet another round of “turn it off and on again” which failed last night, suddenly it’s working again today. You’ll ask me what changed, and as far as I can tell nothing changed.

My graph is also magically complete again, which I also can’t explain. It was still broken when I went to bed giving up at the end of yesterday.

I hate it. It’ll probably break again in 0-30 days.

Agree, it should stay within the total spectrum. So, your setup in the US is good. It does not work in the EU due to our 13 Wi-Fi channels.

One thing you could try: move the coordinator far away from everything. I have it on a 2m long extension cable with nothing electronic within the closest 1.8m. When I started it was only 0.3-0.5m from the RPi, and I got much better stability after moving it much further away.

Tried to do that to a degree already by having it on the top exterior of my 42U rack.

I would not think the “repeater” plugs need to be on extension cords in separate, farther away locations, would they?

It’s the coordinator I would move away from the rack, as the rack is made of iron. It may not shield the electronics in the rack enough, and it limits the coordinator’s radio in the direction of the rack. Hence my idea of moving it far away from everything.

I’m also limited by where the computer gear is at all; it’s already on the top of a 6ft rack and the ceiling is only 7ft tall. Can’t relocate the server because it’s down with the switches and patch panel there.

The other strange thing is it was actually some of the closer by things that quit working? For example stuff easily twice as far away continued to work and it was located in the same direction as some sensors that stopped working. That’s the other strange thing, you would expect things physically nearer to work better not worse.

Remember the coordinator does not need to go up. It needs to be far away from other electronic parts, or other parts interfering with the signal. So moving it right, left or even down will be just as effective.

If interference is the reason, then the direction of the interference will influence which devices are affected, maybe even more than the distance.

Moving it is easier said than done, the only other place it could really reach would be sitting on the mass of network cables at the patch panel.

Physical access and possibilities are a real problem for everybody trying to make “dumb” houses into smart versions. I’m especially challenged with access to the power cables; for most of them there is no real access at all, as they are in the walls and too short to change the original switches.

You could try buying an additional 2m long USB extender and use that instead. Then try for some days to have it hanging somewhere in the middle of the air and see if it helps. If yes, then you need to figure out something. If NOT, then you will know it is something else.

I expect “middle of the room” would be far worse, since then it’d be with a basement wall on 1 side and between the server rack and steel shelving in the aisle instead of above it near the ceiling. The room is only like 7ft x 8ft.

I thought that’s supposed to be the whole point of it being a “mesh” that the repeater nodes like 4ft away thru the floor above it in the office room outlet should be helping it get out - along with the repeater node in the garage ceiling plug, livingroom plug, and 2nd floor spare room plug.

Then try buying a 10m long USB extender cable and move it outside the room for testing. The aim is to determine if it gets much more stable when away from the rack.

I have quite a similar experience. I have tried 3 different zigbee stick vendors, and have approx 30 ZB devices from 5 different vendors.

Some devices are quite reliable and others are junk. Some including powered sockets remain on the network quite reliably, but some powered devices even within 3m & no walls drop off the network. Huge annoyance when I have to disassemble the light fitting to re-pair a ZBmini. The fitting is already broken now from so many disassemblies.

My overall experience: 90% of the time there’s at least one device that’s come off the network for no good reason that needs re-pairing.

My devices in order of reliability (best first, with approx length of time before I have to re-pair).
-Ikea light bulbs (9months+)
-Ikea power outlets (6months+)
-salus power outlets (~4months)
-sonoff ZB minis (~3months)
-sonoff battery temperature humidity sensors (~1month)
-sonoff battery door sensors (~1month)
-aqara battery powered door sensors (~1month)
-aqara battery push button (~1month)
-aqara battery temperature/humidity sensor (2weeks)
-tuya battery water leak sensor (~1week)
-tuya power outlets (~5minutes!)

If there really are ‘weak network/interference’ issues, I think we need better diagnostics & recommendations. E.g. overall health of the network based on SNR between nodes, and concrete recommendations, such as ‘add more repeaters to get your network to 5 stars’.

As it is, my ZB network is a time thief and not something I can depend upon.

3 different coordinators? Which one do you use?
This is a firmware or interference issue. Check you are on a known good firmware and have a long extension cable on.

I tried the following one at a time
-cc2531
-electrolama ZZH
-skyconnect

I updated each of them to the newest firmware and used a 1m usb extension.

I honestly doubt it is a stick firmware or interference issue.

OK, on an RPi4 or? Remember to use the USB2 port. If an SSD is connected over USB 3 it will create interference, as USB 3 radiates noise in the 2.4 GHz band, so keep a good distance to the coordinator.

You use ZHA? The standard is Zigbee channel 15, so make sure your Wi-Fi is using high channels, like 11 (in the US) or 13 (in Europe). Search for a Zigbee channel overview and you will easily find drawings of the interference between Zigbee and Wi-Fi. Remember Wi-Fi will always win.
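If you know which Wi-Fi channels are busy around you, a few lines of Python can rank the Zigbee channels by how far they sit from the nearest Wi-Fi center frequency. This is a rough sketch only: it assumes the standard channel center formulas, looks at center-to-center distance in MHz and nothing else (no signal strength, no 40 MHz bonding, no device quirks), and `rank_zigbee_channels` is a made-up helper name.

```python
from typing import Iterable, List, Tuple


def rank_zigbee_channels(wifi_channels: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (zigbee_channel, gap_mhz) pairs, largest gap to any Wi-Fi center first."""
    wifi_centers = [2407 + 5 * ch for ch in wifi_channels]
    if not wifi_centers:
        raise ValueError("give at least one Wi-Fi channel")
    ranked = []
    for zigbee_channel in range(11, 27):
        zigbee_center = 2405 + 5 * (zigbee_channel - 11)
        gap = min(abs(zigbee_center - w) for w in wifi_centers)
        ranked.append((zigbee_channel, gap))
    # Biggest gap first; ties are broken by the lower channel number.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Wi-Fi channels reported in use earlier in this thread (own APs plus neighbors).
busy = [1, 4, 6, 7, 9, 11]
for channel, gap in rank_zigbee_channels(busy)[:3]:
    print(f"Zigbee {channel}: {gap} MHz from the nearest Wi-Fi center")
```

A gap of roughly 12 MHz or more means the Zigbee channel sits outside the nominal 22 MHz Wi-Fi channel; anything smaller is inside somebody's Wi-Fi signal.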

More specifically, the Aqara battery devices: remember to help them connect to a mains-powered device close by. They will not optimize later.

And, I agree. The diagnosis tools are weak or even non-existent…