Unreliable Zigbee Network

Hi guys!

I am on the verge of despair over the Zigbee network in my smart home.

I really did everything I could to get a reliable network:

  • I got the SONOFF Zigbee 3.0 USB Dongle Plus as a coordinator, which is recommended for zigbee2mqtt (which I use in HA), and I put it on a USB cable to keep it away from the PC/WLAN
  • I got a lot of repeaters (IKEA Trådfri) to make sure there is a tight network with short connections
  • I made sure no WLAN network is active on the channel I use (11)
  • I made sure all sensors (especially the Aqara ones) were paired at their intended location and to a repeater near them

But despite all of my effort… the network is terribly unreliable. The Aqara door/window sensors especially keep missing activity. If I open and close my door 10 times, I am lucky to get 3 or 4 open/close events reported…

If I set new temperature values on my 4 thermostats (MOES), I can be sure one of them does not get updated…

If I look at my network map, I see that the LQI values are really low, I think, with an average of 100… I don't know why because, as I said, the repeaters are only a few meters apart from each other.

Does anyone have a suggestion for solving these problems that I have not tried yet?

Thanks and best regards

You said your WLAN is on 2.4 GHz Wi-Fi channel 11? Or is your Zigbee channel 11? (They’re separate, unrelated numbering schemes, and ironically Wi-Fi 2.4 GHz channel 11 and Zigbee channel 11 don’t overlap, so it’s a valid strategy.)

Reading the rest while we wait for that answer.

Actually, I cannot tell what’s what in your map… but what we do know is that Aqara :heart: IKEA. On top of that, Aqara devices are in general quite “sticky”, so they tend to stay on the router they were first paired to. All that said, try to force each Aqara device to connect only to a nearby IKEA router, i.e. it is not enough just to have IKEA routers somewhere in your mesh.

This has been key for me at least: my Zigbee mesh has 74 devices in total, of which 16 are IKEA routers, with 36 connected and reliable Aqara end devices.

Sorry, yes, that was unclear: my Zigbee channel is 11 and my WLAN channel is also 11, which means they are on opposite sides of the frequency spectrum, so there should be no interference…

And yes, I read that about the Aqara sensors, so I made sure to pair each one at its location to the specific IKEA router that is closest to that sensor.
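
For anyone who wants to check the channel numbers discussed above, here is a small sketch. It assumes the usual 2.4 GHz channel plans (Zigbee channels 11 to 26 spaced 5 MHz apart starting at 2405 MHz, Wi-Fi channels 1 to 13 centered at 2407 + 5n MHz) and treats a Wi-Fi channel as roughly 22 MHz wide and a Zigbee channel as roughly 2 MHz wide. The widths are approximations and the function names are made up for this example.

```python
def zigbee_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz Zigbee channel (11-26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels run from 11 to 26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz Wi-Fi channel (1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers Wi-Fi channels 1 to 13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel: int, wifi_channel: int,
             wifi_width_mhz: float = 22.0,
             zigbee_width_mhz: float = 2.0) -> bool:
    """True if the two channels' (approximate) bands intersect."""
    gap = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return gap < (wifi_width_mhz + zigbee_width_mhz) / 2


if __name__ == "__main__":
    for zb, wifi in [(11, 11), (11, 1), (24, 1), (24, 11)]:
        state = "overlap" if overlaps(zb, wifi) else "no overlap"
        print(f"Zigbee {zb} vs Wi-Fi {wifi}: {state}")
```

This only compares nominal channel positions. It says nothing about non-Wi-Fi noise sources, which can only be found by measuring.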

I also scanned the WLAN networks with a smartphone app to see if any neighbouring WLAN might interfere, but there is none. Zigbee channel 11 should be as free as possible.

The LQI values are a tricky thing and sometimes not that reliable, but I tend to agree that an average of 100 seems to be on the lower side of it all. That said, it might still look like you are having some interference problems so this link can perhaps give you some pointers.

There are no other networks I could find on Zigbee channel 11… most of the WLAN networks are on channel 11, some on 6, but that's it…

Is there a possibility that some other devices are interfering? I once read about a microwave that was the problem (although I do not own one). But if so… how can I search for such a thing?

Or is it possible that one (or more) of my repeaters is not functioning correctly? How does the mesh work… does a message take one path (so that it gets lost if a faulty repeater is on that path), or does it travel to the coordinator via multiple different paths?

The message should take the best of many different paths.

In theory all your routers should be continually re-evaluating and changing their connections, so that messages get the optimum path, and if one router fails the network self-heals. End devices connected to a failed router should connect to another, although as @Calle says, Aqara devices can be quite “sticky” and may not.

Rather than a router failing, it is much more likely that one or more of your routers has a weak connection with the others, so that the devices “stuck” to it can’t get a reliable link to the co-ordinator.

Basically, you need more routers.

I know you have a lot of “repeaters”, but it may still not be enough. It’s not just the distance between them that counts, but also the density of the mesh - each device has to have many different ways of talking to each of the others.

Unfortunately it’s impossible to predict how many are needed - depends on the design of the building, size of the rooms, structure of the walls etc. etc. Bear in mind too that not all routers are equal - wall sockets and plugs tend to have a stronger signal than bulbs.

LQI values… Well, how long is a piece of string? But if your average is 100 some will probably be much lower. As you add routing devices you should see the average creep up.

Something I’ve only recently discovered: if you have Zigbee groups, sending quick-fire commands to them (in an automation, for example) can swamp the network and cause messages to be lost. This is because messages to a group are blasted out to every node in the mesh, just to make sure all the members of the group get them. Better to address devices individually, in automations at least. (I use ZHA, but presumably the same applies to Z2M.)
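
A minimal sketch of the "address devices individually" idea: instead of one group command, loop over the members and leave a short pause between commands. The `send` callable is a placeholder for whatever actually delivers the command (a service call, an MQTT publish, ...), and the 0.2 s gap is an arbitrary starting value, not a recommended setting.

```python
import time
from typing import Callable, Iterable


def send_individually(devices: Iterable[str],
                      payload: dict,
                      send: Callable[[str, dict], None],
                      gap_s: float = 0.2,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Send the same payload to each device in turn, pausing between sends.

    `send` is a hypothetical callback supplied by the caller; this function
    only handles the pacing. Returns the number of commands sent.
    """
    count = 0
    for index, device in enumerate(devices):
        if index > 0:
            sleep(gap_s)  # give the mesh a moment before the next command
        send(device, payload)
        count += 1
    return count


if __name__ == "__main__":
    # Device names below are made up for the example.
    send_individually(
        ["lamp_hall", "lamp_kitchen", "lamp_sofa"],
        {"state": "ON"},
        send=lambda device, payload: print(device, payload),
    )
```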

Hi there,

a short update on my struggles towards a reliable Zigbee network.

…there is simply no way to get it to work reliably and I am SOOOO frustrated right now.

I tried switching channels and moved to channel 24 (while changing the Wi-Fi to channel 1, of course)… still the same problems.

I switched from the SONOFF to the SkyConnect dongle and re-paired all devices to it. The LQI got a bit better, with all devices now > 120, but ultimately still the same problems.

I double- and triple-checked for interfering WLANs, correct bindings of sensors to their nearest routers, and overall good coverage of repeaters, and I made sure the dongle is far away from the USB port and on USB 2…

The strange thing is, the problems seem to come and go… for 10 minutes the network is rock solid, then suddenly devices start acting up, e.g. door/window sensors not reporting changes at all… then some time later everything works perfectly again until the next problematic phase…

It seems like there is some kind of interference that occurs not constantly but at intervals, but I have no clue how to verify that or find the source of this disturbance…

Is there any way to scan the 2.4 GHz band for interference?
Anyone got any further ideas?

These kinds of radio connectivity problems can in fact drive you nuts. And if it’s intermittent, even worse. You’d need specialized equipment for accurate scanning of the WiFi spectrum. It could very well be that sudden interference buries the Zigbee channel. But without dedicated scanning tools that’s hard to prove. If you really are hitting a wall, check on Amazon for a not too expensive spectrum analyzer.

I once had a problem more or less like that. Somehow I convinced my ISP that it was related to their cabling. They came to my home and scanned the surroundings several times until they triangulated the source to inside my own house. It turned out to be a loose coaxial connection.

There are tools to scan for deep 2.4 GHz network data, but you are looking at some coin. The low end that I know of is going to set you back USD 300 to 500. And the sky is the limit from there. A pretty solid and useful low- to medium-cost system is in the link below:

As far as I can see in your posts, you do not describe how you are running Zigbee2MQTT and your MQTT broker. I am not a big fan of running these services from within Home Assistant HAOS, if that is what you are doing. Too many ‘nooks and crannies’ of things you are not in full control of, and too tightly coupled IMHO.

If I can spend some more of your coin and sweat equity…

I would set this up either on a dedicated Linux machine running Docker, such as a Raspberry Pi 4, or in Docker containers on your current Home Assistant host if it runs a generic Linux. Either way, in parallel with your HA server (but not as a service under HA in HAOS or any other ‘parent’), spin up Docker and run two instances of Zigbee2MQTT (each with its own TI CC2652 coordinator USB dongle) plus a Mosquitto MQTT broker. The reason for two instances (make sure each has a unique MQTT base topic) is that one can be your production Zigbee network and the other your test network. Build up your production Zigbee2MQTT network at a pace you are comfortable with, and make sure to solidly factory reset each Zigbee device before you add it to either network. Once the production network is stable, test any new device in the test network for a while first; when you are comfortable that it works okay, move it to the production network (again, factory reset it before adding it).
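
To make the "unique MQTT base topic" point concrete, here is a rough sanity check for two instance configurations, written as plain Python dicts. It assumes the dicts mirror the `mqtt` → `base_topic` and `serial` → `port` settings of each instance's Zigbee2MQTT configuration; the topic names and device paths in the example are made up.

```python
def shared_settings(prod: dict, test: dict) -> list[str]:
    """Return the settings that the two instances must not share but do."""
    must_differ = {
        "mqtt.base_topic": ("mqtt", "base_topic"),
        "serial.port": ("serial", "port"),
    }
    clashes = []
    for label, (section, key) in must_differ.items():
        prod_value = prod.get(section, {}).get(key)
        test_value = test.get(section, {}).get(key)
        if prod_value is not None and prod_value == test_value:
            clashes.append(label)
    return clashes


if __name__ == "__main__":
    # Example values only; use your own topics and dongle paths.
    production = {
        "mqtt": {"base_topic": "zigbee2mqtt_prod"},
        "serial": {"port": "/dev/ttyUSB0"},
    }
    testing = {
        "mqtt": {"base_topic": "zigbee2mqtt_test"},
        "serial": {"port": "/dev/ttyUSB1"},
    }
    print(shared_settings(production, testing) or "no clashes")
```

The two networks should also not end up as the same Zigbee network over the air (channel, PAN ID, network key), but that is not checked here.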

From your Zigbee2MQTT network map, you seem to have a good mesh of devices. That said, it is NOT possible to see the specific manufacturer, firmware and other details of each device. And one of the BIG holes in diagnosing and monitoring a Zigbee mesh is that the map is just a ‘point in time’ look into your network. What is really needed, and hard to get, is a look over time at changes in your network mesh.

Also, you say you are using a ‘recommended’ device as a coordinator. But the devil is in the details. Is it a Sonoff P or E device? Pretty big difference. What firmware version is installed on the coordinator? Again, lots of differences between firmware versions. I’m hoping you are not running the whole multi-protocol SkyConnect-style firmware on your coordinator; if you are, STOP and just use Zigbee-ONLY firmware.
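
One way to approximate a ‘look over time’ is to save the map data at intervals and compare snapshots. The sketch below assumes you have already reduced each snapshot to a dict that maps a (source, target) pair to an LQI value; that format is invented for this example and is not something Zigbee2MQTT exports directly. Comparing the same link against itself over time also sidesteps the problem of different vendors computing LQI differently.

```python
from __future__ import annotations

Link = tuple[str, str]  # (source device, target device)


def lqi_changes(before: dict[Link, int],
                after: dict[Link, int],
                threshold: int = 30) -> dict[Link, tuple[int | None, int | None]]:
    """Return links that appeared, vanished, or moved by >= threshold LQI.

    Each value is (old_lqi, new_lqi); None marks a link missing from that
    snapshot. The default threshold is an arbitrary starting point.
    """
    changes: dict[Link, tuple[int | None, int | None]] = {}
    for link in before.keys() | after.keys():
        old = before.get(link)
        new = after.get(link)
        if old is None or new is None or abs(new - old) >= threshold:
            changes[link] = (old, new)
    return changes


if __name__ == "__main__":
    # Made-up device names and values.
    morning = {("plug_hall", "coordinator"): 120, ("bulb_sofa", "plug_hall"): 100}
    evening = {("plug_hall", "coordinator"): 60, ("bulb_sofa", "plug_hall"): 105}
    for link, (old, new) in lqi_changes(morning, evening).items():
        print(link, old, "->", new)
```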

Physical layout of your devices is also something that would be helpful to know. The zigbee2mqtt map does not show ‘wall and door material’ nor physical distance.

IMHO, the LQI stuff is worthless. This is because each manufacturer can come up with its own algorithm for calculating LQI. So it's total ‘apples to oranges’ in most comparisons.

To your question on ‘abby normal’ router devices, I would look there before worrying about any 2.4 GHz interference. (I’m not saying interference cannot be a cause of Zigbee issues, just that in my experience network conflict and noise are down the list of possible causes.) From my experience, some Aqara devices have been highly problematic. I’ve not had any issues with Moes Zigbee devices. With Sonoff, I’ve not had any issues that seemed to impact the network, but some of their firmware versions have been shown to cause issues as Zigbee end devices.

With the large number of new Zigbee devices coming on the market, it is unfortunate that Zigbee as a ‘standard’ does not have more good and reasonable cost tools to diagnose issues and monitor network operation.

Good hunting! Keep the faith, I think zigbee with zigbee2mqtt can be a very solid part of Home Assistant. Divide and test! Having a test zigbee2mqtt network has been very helpful to me.

I know, I know, I said don’t give much weight to LQI. However, looking at your Zigbee2MQTT map, if I read it correctly you have a number of ‘zero’ values for coordinator-to-router connections. Those seem odd. In a map from one of my Zigbee networks (below), there are no zeros for router-to-coordinator connections anywhere.

hsh_keSteckdos??uere
eg_bz_LampeMitte
hsh_sz_Steckdose
eg_sz??uftentfeucher

and perhaps some more.

I know your pain! At that point I just decided to switch technologies and went for ESPHome (Wi-Fi) devices, and I have not regretted it for even one second. Today I have over 50 ESPHome nodes and all have worked brilliantly since the moment they were installed. Not one Zigbee device is left in my home (I sold them all), and it is such a relief to know everything is stable and I don’t need to fear that a (Zigbee) device will just stop working.

Hi guys!

My Zigbee reliability problems seem to be fixed now!
At least I hope so. Since my last changes two days ago everything has been rock solid, with not a single message lost so far (fingers crossed).

I just wanted to share my findings regarding the solution:
After nearly giving up, I finally decided to give it one last try and switch from Z2M to ZHA.

In hindsight I don't know if this switch alone did anything for the better, BUT I discovered a feature in ZHA that I could not find in Z2M (please tell me if it is available there too): the ability to perform an energy scan on all the Zigbee channels.

And that scan showed a really clear picture: the upper channels (where my WLAN operates) were really busy, but the most crucial finding was that Zigbee channel 11 (where I had been running my Zigbee mesh the whole time) was also very, very busy at >80%!! So a clear indication of some interference going on on that channel.

Therefore I switched over to channel 13, which was shown as very empty, and VOILA… everything has been working like a charm since then. And even when my network is very active, the energy scan never goes above 20%.

I am sooo happy right now, that energy scan feature saved the day… it's all about "what you can measure, you can control"! :slight_smile:

Awesome, congratulations! Maybe I’ll be there one day, too. Still much to do, my network is still a mess. If you’d like to help, you could share your knowledge about that:

Where and how can I do that?

Of course… you can find it under:
Settings → Devices → SkyConnect (or whichever ZHA device) → ⋮ menu → Download diagnostics
The results of the energy scan can be found at the bottom of the downloaded file
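
If you don't want to read the numbers by eye, something like the sketch below can rank the channels from quietest to busiest. It assumes the downloaded file is JSON and that the scan sits under `data` → `energy_scan` as a mapping from channel number to percent; check your own file, since the layout may differ between versions. The file name and the sample values are made up.

```python
import json


def rank_channels(diagnostics: dict) -> list[tuple[int, float]]:
    """Return (channel, percent) pairs sorted from quietest to busiest.

    Assumes the scan is stored under diagnostics["data"]["energy_scan"].
    """
    scan = diagnostics["data"]["energy_scan"]
    pairs = [(int(channel), float(percent)) for channel, percent in scan.items()]
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical file name; use the file you downloaded.
    with open("zha-diagnostics.json", encoding="utf-8") as handle:
        diagnostics = json.load(handle)
    for channel, percent in rank_channels(diagnostics):
        print(f"channel {channel}: {percent:.1f}%")
```

A single scan is only a snapshot, so with intermittent interference it is worth downloading the diagnostics a few times, including during a bad phase.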