My Zigbee network was working really well until part of the house lost power for a while during electrical work.
Initial Condition - Everything Working:
The network was well balanced, with end devices well distributed across all routers and some connecting directly to the coordinator.
Disruptive Event:
I cut off the power for the HA server and some of the routers to add a new power outlet.
Current Situation - Unstable Network:
After restoring power, all devices are (trying) to connect directly to the Coordinator, and apparently the connection limit has been exceeded. Some devices are offline and I constantly get errors (Delivery of BROADCAST failed).
I waited overnight hoping devices would reorganise themselves back to a balanced network, but unfortunately it didn’t happen.
How can I solve this?
I can turn off some devices, but it would be a lot of work (removing batteries one by one, etc…). Anything more practical maybe?
My suggestion: with Z2M you can pull up the map and see what is routing through what. That shows you how balanced the network is. Turn things off and/or force-pair them through better routers. Some devices just connect to the first thing they see and won’t try to improve their connection.
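If reading the map by eye gets tedious, the same check can be scripted. Below is a minimal sketch, assuming you have saved the raw network map that Z2M returns and that it has the shape I remember: a "nodes" list with "ieeeAddr", "friendlyName" and "type", and a "links" list where "source" is the neighbour and "target" is the device that reported it. The function name children_per_parent is made up for illustration, and stale neighbour-table entries may inflate the counts, so treat the numbers as a rough guide.

```python
from collections import Counter


def children_per_parent(networkmap):
    """Count end devices hanging off each parent in a raw network map.

    Assumed (not verified against your Z2M version) structure:
      networkmap["nodes"]: [{"ieeeAddr", "friendlyName", "type"}, ...]
      networkmap["links"]: [{"source": {"ieeeAddr"}, "target": {"ieeeAddr"}}, ...]
    where "source" is the child/neighbour and "target" is the reporting device.
    """
    names = {n["ieeeAddr"]: n.get("friendlyName", n["ieeeAddr"])
             for n in networkmap["nodes"]}
    types = {n["ieeeAddr"]: n.get("type") for n in networkmap["nodes"]}

    counts = Counter()
    for link in networkmap["links"]:
        child = link["source"]["ieeeAddr"]
        parent = link["target"]["ieeeAddr"]
        # Only end devices count as "children"; router-to-router links are mesh links.
        if types.get(child) == "EndDevice":
            counts[names.get(parent, parent)] += 1
    return counts


if __name__ == "__main__":
    # Tiny hand-made example, not real data.
    sample = {
        "nodes": [
            {"ieeeAddr": "0x00", "friendlyName": "Coordinator", "type": "Coordinator"},
            {"ieeeAddr": "0x01", "friendlyName": "hall_plug", "type": "Router"},
            {"ieeeAddr": "0x10", "friendlyName": "door_1", "type": "EndDevice"},
            {"ieeeAddr": "0x11", "friendlyName": "door_2", "type": "EndDevice"},
        ],
        "links": [
            {"source": {"ieeeAddr": "0x01"}, "target": {"ieeeAddr": "0x00"}},
            {"source": {"ieeeAddr": "0x10"}, "target": {"ieeeAddr": "0x00"}},
            {"source": {"ieeeAddr": "0x11"}, "target": {"ieeeAddr": "0x01"}},
        ],
    }
    for parent, n in children_per_parent(sample).most_common():
        print(f"{parent}: {n} end device(s)")
```

A parent with a far higher count than the others (typically the coordinator after an outage) is where you would start moving devices.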
I experience the same problems. ZigBee network works fine for months but then after a power outage I always have a few end devices drop off and have to be re-paired. Not sure how this can be avoided. Mostly it’s the Aqara sensors that drop off. Tuya ones seem to reconnect ok.
I did exactly that.
Everything was connected directly to the Coordinator.
I have 44 door/window sensors that are battery powered.
All the rest was offline. I suspect the coordinator was dropping requests from these to connect for some reason.
This way, routers couldn’t get online, and the network wouldn’t rebalance itself.
I’m not sure if what I’m saying is reasonable.
Since then, I turned off a few of the door/window sensors by removing the battery, and the routers managed to get online.
Apparently the network started to rebalance, even though some of the devices are still offline.
I think I’ll have to turn off all door/window sensors, wait a bit, and then bring them back one by one, so these connect to one of the routers.
Is it possible that battery-powered devices manage to reconnect to the coordinator before hardwired devices come back from a power outage? That way the coordinator gets overwhelmed and the routers, which would offer a way for the system to stabilise again, can’t really get online.
Check my previous message for more info.
I wonder if there is a way to assign priorities to routers, forcing the coordinator to establish connection with these first. This way the network would definitely converge to stability.
There are still a lot of devices hanging from the coordinator.
But right after the issue, the coordinator looked like a Dandelion, and quite a few of the devices on the left side were offline.
The problem, as I see it, is that you have 60 or more things connected directly to the coordinator. A Zigbee network is intended to be a mesh, where you have a dozen or two (maybe) devices on the coordinator directly and the rest connected to routing devices. So for a better, more resilient network, add more routing devices: power plugs, bulbs that are always powered, etc., in places near the end devices.
Check out some of the wikis in the Cookbook list on Zigbee for more explanation, but put a Zigbee wall plug in each room and it will be much more stable.
I added a few Ikea smart plugs to increase the number of routers across the house.
It apparently increased stability, but it is also clear I’ll have to reinstall quite a few of my sensors in order to rebalance the network, as too many devices are still linked directly to the coordinator.
After a lot of back and forth, I decided to nuke my whole Zigbee network and restart from scratch (stopped Z2M and deleted the DB file). The old network was unstable for some reason and I had trouble getting devices to join it. Starting over was easier than I was expecting, as devices are recognised instead of being paired like brand new (entity IDs are basically kept).
Now that’s the final result: a much more balanced network. Only 2 devices connecting directly with the coordinator.
Seems to be working well most of the time. Except for:
1. I had to re-pair a few devices a few times, for no apparent reason.
2. A few devices insist on showing as unconnected on the Map (top right). I believe that’s because these are battery powered and aren’t really active all the time. Still, I had to do (1) for some of them as they started to misbehave, e.g. not reporting that a window was open.
Marcus,
Have you had another blackout/power outage since you reconfigured your network, and have you had any issues?
My experience is similar - let me explain. I have recently added a UPS to my HA server/coordinator, so it now doesn’t go down when there’s a blackout. I have 37 devices in my ZHA Zigbee network, and 13 of the devices are routers, so the end devices are generally connected to the closest router. When the power goes down and the server/coordinator keeps on running, the end devices go and connect directly to the coordinator. However, when the power comes back on, they end up staying connected to the coordinator, just like the dandelion you describe above. This also used to happen (partially) when I didn’t have a UPS connected, and the end result was the same: flaky Zigbee reception on distant end devices with marginal signal strength trying to connect directly to the coordinator. Sometimes it can take ages for the problem to resolve itself and for the end devices to connect to the closest router; sometimes they just never make that superior connection.
Is there any way to make the end devices connect back to the optimum/closest router, rather than to the coordinator? It just doesn’t make sense that an end device stays connected to the coordinator (some 6 or 7 metres away) when there is a router with good signal and connectivity less than a metre away.
Can the end devices somehow be paired or assigned to the closest/optimum router, so that that path is used if it’s available, rather than trying to reach the coordinator directly?
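For anyone on Z2M rather than ZHA, the closest thing I know of to "assigning" a parent is opening joining on one specific router and re-pairing the sensor next to it. Here is a minimal sketch that only builds the MQTT message, assuming the default base topic "zigbee2mqtt" and a permit_join request that accepts "time" and "device" fields; the helper name permit_join_via and the router name "hall_plug" are made up, and older Z2M versions may also expect "value": true. The end device still chooses its own parent afterwards, so this is a nudge, not a guarantee.

```python
import json


def permit_join_via(router_name, seconds=120, legacy_value=False):
    """Build (topic, payload) to open joining on a single router.

    Assumptions (check against your Z2M version's MQTT docs):
      - base topic is the default "zigbee2mqtt"
      - the request accepts {"time": <seconds>, "device": <friendly name>}
      - older releases may additionally need {"value": true}
    """
    topic = "zigbee2mqtt/bridge/request/permit_join"
    body = {"time": seconds, "device": router_name}
    if legacy_value:
        body["value"] = True
    return topic, json.dumps(body)


if __name__ == "__main__":
    topic, payload = permit_join_via("hall_plug", seconds=60)
    # Publish with whatever MQTT client you already use, e.g. mosquitto_pub.
    print(topic)
    print(payload)
```

Re-pair the sensor while standing close to that router, then check the map again to see where it ended up.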