My Zigbee network was working really well until part of the house lost power for a while during electrical work.
Initial Condition - Everything Working:
The network was well balanced, with end devices well distributed across all routers and some connecting directly to the coordinator.
Disruptive Event:
I cut power to the HA server and some of the routers to add a new power outlet.
Current Situation - Unstable Network:
After restoring power, all devices are trying to connect directly to the coordinator, and apparently its connection limit has been exceeded. Some devices are offline and I constantly get errors (Delivery of BROADCAST failed).
I waited overnight hoping devices would reorganise themselves back to a balanced network, but unfortunately it didn’t happen.
How can I solve this?
I can turn off some devices, but it would be a lot of work (removing batteries one by one, etc…). Anything more practical maybe?
My suggestion: in Z2M, pull up the map and see what is routing through what. That shows you where the balance is (or isn't). Turn things off and/or force-pair them through better routers. Some devices just connect to the first thing they see and won't try to improve their connection.
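If you can read the parent of each device off the map, a few lines of Python will show which parents are carrying too many children. This is only a rough sketch: the `(child, parent)` pairs, the device names, and the `limit` value are made up for illustration, and this is not the format Z2M exports, so you would have to build the list yourself.

```python
from collections import Counter


def children_per_parent(links):
    """Count how many devices hang off each parent.

    `links` is an iterable of (child, parent) name pairs. This shape is an
    assumption for illustration, not Zigbee2MQTT's own export format.
    """
    return Counter(parent for _child, parent in links)


def overloaded_parents(links, limit):
    """Return {parent: child_count} for parents with more than `limit` children."""
    counts = children_per_parent(links)
    return {parent: n for parent, n in counts.items() if n > limit}


# Hypothetical data copied by hand from the map view.
links = [
    ("door_1", "Coordinator"),
    ("door_2", "Coordinator"),
    ("door_3", "Coordinator"),
    ("window_1", "plug_kitchen"),
]

print(overloaded_parents(links, limit=2))  # {'Coordinator': 3}
```

Anything that shows up in the result is a candidate for moving some of its children to a nearby router.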
I experience the same problems. ZigBee network works fine for months but then after a power outage I always have a few end devices drop off and have to be re-paired. Not sure how this can be avoided. Mostly it’s the Aqara sensors that drop off. Tuya ones seem to reconnect ok.
I did exactly that.
Everything was connected directly to the Coordinator.
I have 44 door/window sensors that are battery powered.
Everything else was offline. I suspect the coordinator was dropping join requests from the remaining devices for some reason.
As a result, the routers couldn't get online, and the network wouldn't rebalance itself.
I’m not sure if what I’m saying is reasonable.
Since then, I turned off a few of the door/window sensors by removing the battery, and the routers managed to get online.
Apparently the network started to rebalance, even though some of the devices are still offline.
I think I’ll have to turn off all door/window sensors, wait a bit, and then bring them back one by one, so these connect to one of the routers.
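To keep track of the order when bringing things back, something like this could help. It is just a sketch of the "routers first, then sensors one by one" idea above; the device names and the `"router"` / `"end_device"` labels are hypothetical, and nothing here talks to Z2M or the coordinator.

```python
def rejoin_order(devices):
    """Return device names with routers first, then end devices.

    `devices` maps a device name to "router" or "end_device"
    (hypothetical labels chosen for this example).
    """
    routers = sorted(n for n, kind in devices.items() if kind == "router")
    end_devices = sorted(n for n, kind in devices.items() if kind == "end_device")
    return routers + end_devices


# Hypothetical inventory.
devices = {
    "door_2": "end_device",
    "plug_hall": "router",
    "door_1": "end_device",
    "plug_kitchen": "router",
}

# Print a numbered checklist: power up each device in this order.
for step, name in enumerate(rejoin_order(devices), start=1):
    print(step, name)
```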
Is it possible that battery-powered devices manage to reconnect to the coordinator before the mains-powered devices come back from a power outage? In that case the coordinator gets overwhelmed and the routers can't get online, so the network has no way to stabilise again.
Check my previous message for more info.
I wonder if there is a way to assign priorities to routers, forcing the coordinator to establish connection with these first. This way the network would definitely converge to stability.
There are still a lot of devices hanging from the coordinator.
But right after the issue, the coordinator looked like a dandelion, and quite a few of the devices on the left side were offline.
The problem as I see it is that you have 60 or more things connected directly to the coordinator. A Zigbee network is intended to be a mesh, where you have maybe a dozen or two devices on the coordinator directly and the rest connected to routing devices. So for a better, more resilient network, add more routing devices: power plugs, bulbs that are always powered, etc., in places near the end devices.
Check out some of the Zigbee wikis in the Cookbook list for more explanation, but put a Zigbee wall plug in each room and it will be much more stable.
I added a few Ikea smart plugs to increase the number of routers across the house.
Apparently it increased the stability, but it is also clear I'll have to reinstall quite a few of my sensors in order to rebalance the network, as too many devices are still linked directly to the coordinator.
After a lot of back and forth, I decided to nuke my whole Zigbee network and restart from scratch (stopped Z2M and deleted the DB file). For some reason the network was unstable and I had trouble getting devices to join. Starting over was easier than I was expecting, as devices are recognised when they rejoin instead of being paired like brand new (basically, entity IDs are kept).
This is the final result: a much more balanced network. Only 2 devices connect directly to the coordinator.
Seems to be working well most of the time. Except for:
1. I had to re-pair a few devices a few times, for no apparent reason.
2. A few devices still show up unconnected on the map (top right). I believe that's because these are battery powered and aren't active all the time. Still, I had to do (1) for some of them as they started to misbehave, like not reporting that a window was open, etc.