I don’t think it’s a general network problem, because most other devices work just fine.
I’ve tried power cycling the devices in question, but it doesn’t change anything.
I can repeat the request 5, 10 or 15 times, but it never succeeds.
Occasionally, if I reconfigure neighbouring devices or reload the Zigbee integration, the device will work for a little while.
My logs show an error like this:
zigpy.exceptions.DeliveryError: Failed to deliver message: <sl_Status.ZIGBEE_DELIVERY_FAILED: 3074>
I’m using a SkyConnect v1.0 with firmware 7.4.4.0 build 0, which was just updated.
I could be wrong, but this seems like a software issue rather than a Zigbee mesh network issue. Reloading the Zigbee integration sometimes leads to an improvement, though sometimes only a temporary one. That implies to me that something is going on with Home Assistant rather than the network, but this is just my uninformed guess.
Can anyone suggest steps I could take to investigate this issue further? Could this be related to interference somehow?
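One low-effort step is to check whether the delivery failures cluster at particular times. Below is a minimal Python sketch that counts `ZIGBEE_DELIVERY_FAILED` lines per hour in a log file. It assumes each regular log line starts with a timestamp like `2024-05-01 10:15:32.123`, and that traceback lines (which carry no timestamp) belong to the most recent timestamped line. The function name `failures_per_hour` is made up for illustration.

```python
import re
from collections import Counter

# Assumed log line prefix: "2024-05-01 10:15:32.123 ERROR (MainThread) [...]".
# Only the date and hour are captured, so failures are bucketed per hour.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}")

# Marker taken from the error message quoted in this thread.
MARKER = "ZIGBEE_DELIVERY_FAILED"


def failures_per_hour(lines):
    """Count delivery failures, keyed by the hour of the nearest preceding timestamp."""
    counts = Counter()
    current_hour = "unknown"  # used until the first timestamped line is seen
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            current_hour = match.group(1)
        if MARKER in line:
            counts[current_hour] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `failures_per_hour` returns a `Counter` keyed by strings such as `2024-05-01 10`. If the failures turn out to be bunched into certain hours, that points towards something periodic, such as a device being switched off or a burst of traffic, rather than a constant fault.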
The affected devices are a mixture: a handful of Hue bulbs and at least one socket. Of the 57 devices on the Zigbee network, about 40% are battery powered. The rest are mostly bulbs and sockets, including lots of Nous A1Z sockets. I presume the bulbs and sockets are all routers.
But one of the lights that has been playing up is 4m away from the dongle. During the problem, all other devices, lights, and so on, seem to work fine. The network often runs smoothly for months at a time.
Not a good indicator. Messages to each end device will take different routes, and the routes will change over time.
More routers. Zigbee is self-configuring and self-healing, but only if there are lots of different paths for messages to take between two points. Sounds as if you have a bottleneck somewhere so that a small number of paths are being overloaded. A solid mesh depends on redundancy.
So one misbehaving device could be blocking signalling to the devices which are not working? It seems weird that it works for a while after reloading the zigbee integration.
How can I figure out what is an appropriate number of routers? My sense is that I have multiple routers in every room, and strong coverage throughout the building. I’ve specifically added devices in such a way as to try to provide good mesh coverage and multiple paths.
I do have a couple of devices offline, so maybe that’s an issue. People are always turning off the light in one room with the switch instead of via Alexa. That might be causing the network to reorganise and cause knock-on problems.
I’m realising the problem is more widespread than I thought. It seems like a bunch of devices are generating the delivery failed message. Sockets, lights from different manufacturers. All while a bunch of lights and motion sensors are working perfectly.
Yes, I’m using the extension cable on the dongle. I’ve now tried moving its location to be higher up, above the rack, on top of a cardboard box.
I ordered a 3m shielded USB 2 extension cable. I will try adding that tomorrow to see if it helps. Then I can put the dongle far away from any potential source of interference.
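On the interference question, one thing that can be checked on paper is whether the Zigbee channel overlaps a nearby Wi-Fi channel. Wi-Fi has not come up in this thread, so treat it as only one possible source. A rough Python sketch follows, using the standard 2.4 GHz centre frequencies. The channel widths (22 MHz for Wi-Fi, 2 MHz for Zigbee) are assumed round numbers, and the function names are made up for illustration.

```python
def zigbee_center_mhz(channel):
    """Centre frequency in MHz of a 2.4 GHz Zigbee channel (11 to 26)."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz Zigbee channels are 11 to 26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Centre frequency in MHz of a 2.4 GHz Wi-Fi channel (1 to 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers Wi-Fi channels 1 to 13")
    return 2407 + 5 * channel


def overlapping_zigbee_channels(wifi_channel, wifi_width_mhz=22.0, zigbee_width_mhz=2.0):
    """List the Zigbee channels whose band overlaps the given Wi-Fi channel.

    Two bands overlap when their centres are closer than half the sum of
    their widths. Both widths are assumptions, not measured values.
    """
    wifi = wifi_center_mhz(wifi_channel)
    limit = (wifi_width_mhz + zigbee_width_mhz) / 2
    return [
        ch for ch in range(11, 27)
        if abs(zigbee_center_mhz(ch) - wifi) < limit
    ]
```

With these assumptions, `overlapping_zigbee_channels(6)` returns Zigbee channels 16 to 19. Overlap on paper does not prove interference, but it does show which channel combinations are worth looking at first.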
Holy smokes, I turned that light back on, and it seems like my whole network is working again. It’s in a completely different part of the building, but apparently it was really messing things up. I will try to replace it with one that doesn’t have a switch!
This morning the same devices seem to be back to not working, same error code, but this time the light I thought was causing these issues has been powered continuously.
I also found a few issues about this topic; the most recent one is here. It seems like there are sometimes Zigbee network issues, maybe caused by adding new devices, maybe by some routers behaving weirdly, and fixing them can be time-consuming and unpredictable. I will try removing devices from the network to see if I can figure out how to get back to a reliable mesh. I’ll also be more cautious about adding new devices in the future, since it seems like that can cause issues, even if not immediately.
I have the same problem. It occurs frequently, but not always, so it’s difficult to troubleshoot. The affected devices change as well. I’ve been dealing with this for about six months now.