Big problems with Zigbee all of a sudden - Please help me debug

For about 1-2 weeks now, I’ve all of a sudden been suffering from issues with my Zigbee devices. Many, if not all, of them stop responding until I reload ZHA, and this happens multiple times a day.

As a relative newbie to HA and the intricacies of Zigbee, I’d really appreciate any insights or tips on how to debug this.

Here’s the log information from a recent failure of trying to turn on a light which is only about 10 feet from the controller:

> 2024-04-11 11:51:14.734 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [547772682560] Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>
> Traceback (most recent call last):
>   File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 64, in wrap_zigpy_exceptions
>     yield
>   File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 84, in wrapper
>     return await RETRYABLE_REQUEST_DECORATOR(func)(*args, **kwargs)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.12/site-packages/zigpy/util.py", line 131, in retry
>     return await func()
>            ^^^^^^^^^^^^
>   File "/usr/local/lib/python3.12/site-packages/zigpy/zcl/__init__.py", line 377, in request
>     return await self._endpoint.request(
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.12/site-packages/zigpy/endpoint.py", line 253, in request
>     return await self.device.request(
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.12/site-packages/zigpy/device.py", line 339, in request
>     await send_request()
>   File "/usr/local/lib/python3.12/site-packages/zigpy/application.py", line 841, in request
>     await self.send_packet(
>   File "/usr/local/lib/python3.12/site-packages/bellows/zigbee/application.py", line 931, in send_packet
>     raise zigpy.exceptions.DeliveryError(
> zigpy.exceptions.DeliveryError: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>
> 
> The above exception was the direct cause of the following exception:
> 
> Traceback (most recent call last):
>   File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 239, in handle_call_service
>     response = await hass.services.async_call(
>                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/src/homeassistant/homeassistant/core.py", line 2543, in async_call
>     response_data = await coro
>                     ^^^^^^^^^^
>   File "/usr/src/homeassistant/homeassistant/core.py", line 2580, in _execute_service
>     return await target(service_call)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 971, in entity_service_call
>     single_response = await _handle_entity_call(
>                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 1043, in _handle_entity_call
>     result = await task
>              ^^^^^^^^^^
>   File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 631, in async_handle_light_on_service
>     await light.async_turn_on(**filter_turn_on_params(light, params))
>   File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 361, in async_turn_on
>     result = await self._on_off_cluster_handler.on()
>              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 83, in wrapper
>     with wrap_zigpy_exceptions():
>   File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
>     self.gen.throw(value)
>   File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 75, in wrap_zigpy_exceptions
>     raise HomeAssistantError(message) from exc
> homeassistant.exceptions.HomeAssistantError: Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>

Is your Zigbee channel on “auto”?
Asking as I saw your error on this thread…

How many devices do you have, and what kind? What are you running HA on?

@VPC, Yes it’s on auto and it’s selected channel 25.

@Stiltjack, I have 11 devices total:

  • 1 x SkyConnect on a USB extension cable (reporting as EZSP by Silican Labs, Firmware: 7.1.1.0 build 273)
  • 5 x Singled E11-G13 bulbs
  • 3 x CentraLite 3326-L motion sensors
  • 1 x SmartThings motionv4 sensor
  • 1 x CentraLite 4257050-RZHAC switch
    This is running on a RPi 4 with 4GB RAM.
    Core: 2024.4.2
    Supervisor: 2024.04.0
    Operating System: 12.2
    Frontend: 20240404.1

After reading through the thread linked above by @VPC, I realize that I have only 1 routing device (the switch). So I’ve ordered a 4 pack of Third Reality smart plugs to see if they help out.

Update the firmware on your SkyConnect, to 7.3.2.0 build 212.

UPDATE:

Firmware upgrade is probably a good idea and when doing that do change away from RCP Multi-PAN (multiprotocol) so move to Zigbee NCP only firmware if you have not already, see → https://skyconnect.home-assistant.io/about-firmware-options/

Also note that EMF interference and not using enough/good Zigbee Router devices can prevent proper Zigbee communication so make sure to follow all best practice tips, see → Zigbee networks: how to guide for avoiding interference + optimizing using Zigbee Router devices (repeaters/extenders) to get best possible range and coverage

1 Like

Yeah, you do not have enough Zigbee Router devices in your Zigbee wireless mesh network!

Zigbee is not only very sentitive to EMF interference but due to poor radio propegration and low-power radios these devices uses they totally depends on mesh networking technology in order for all messages send to reach its intended target without them having to be re-sent over and over, meaning you always need many Zigbee Routers as for robustness Zigbee devices need to have multiple paths to reach the Zigbee Coordinator. Read:

Again, start by follow all these tips (including adding a few “known good” Zigbee Router devices):

You should know how Zigbee works so also recommend reading all these additional community guides:

Thanks for the wealth of information.

I’m still waiting for my smart plugs to come in, but after this morning, I’m not convinced they will solve the issue. Things stopped working again today for some of my lights, even though most all of them have a green connection right now.

And the log is scrolling with information like this:

> 2024-04-13 08:42:56.568 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'26a6a1a9602a675278f87e'
> 2024-04-13 08:42:56.569 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'83401b7e'
> 2024-04-13 08:42:56.570 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendUnicast: [<EmberStatus.MAX_MESSAGE_LIMIT_REACHED: 114>, 224]
> 2024-04-13 08:42:56.570 DEBUG (MainThread) [bellows.zigbee.application] Request (0xF6C0, 159) failed to enqueue, retrying in 0.5s: EmberStatus.MAX_MESSAGE_LIMIT_REACHED
> 2024-04-13 08:42:56.773 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0xBEE7](E11-G13): Device seen - marking the device available and resetting counter
> 2024-04-13 08:42:56.774 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0xBEE7](E11-G13): Update device availability -  device available: True - new availability: True - changed: False
> 2024-04-13 08:42:57.072 DEBUG (MainThread) [bellows.ezsp.protocol] Send command sendUnicast: (<EmberOutgoingMessageType.OUTGOING_DIRECT: 0>, 0xf6c0, EmberApsFrame(profileId=0, clusterId=32768, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_RETRY|APS_OPTION_ENABLE_ROUTE_DISCOVERY: 320>, groupId=0, sequence=245), 159, b'\xf5\x00%\xb0\xfd\xfe\xffA-\x14\x00\x00\x00\x00')
> 2024-04-13 08:42:57.076 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'63a721a9602a1572af944a252a5592099d4e275e72c0928bd8769e77033f12b3ebcdde6f5a947e'
> 2024-04-13 08:42:57.087 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'37a7a1a9602a6753aadf7e'
> 2024-04-13 08:42:57.087 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'8430fc7e'
> 2024-04-13 08:42:57.089 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendUnicast: [<EmberStatus.MAX_MESSAGE_LIMIT_REACHED: 114>, 225]
> 2024-04-13 08:42:57.091 DEBUG (MainThread) [bellows.zigbee.application] Request (0xF6C0, 159) failed to enqueue, retrying in 1.0s: EmberStatus.MAX_MESSAGE_LIMIT_REACHED
> 2024-04-13 08:42:57.107 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0xAD17](E11-G13): Device seen - marking the device available and resetting counter
> 2024-04-13 08:42:57.107 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0xAD17](E11-G13): Update device availability -  device available: True - new availability: True - changed: False
> 2024-04-13 08:42:58.092 DEBUG (MainThread) [bellows.ezsp.protocol] Send command sendUnicast: (<EmberOutgoingMessageType.OUTGOING_DIRECT: 0>, 0xf6c0, EmberApsFrame(profileId=0, clusterId=32768, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_RETRY|APS_OPTION_ENABLE_ROUTE_DISCOVERY: 320>, groupId=0, sequence=245), 159, b'\xf5\x00%\xb0\xfd\xfe\xffA-\x14\x00\x00\x00\x00')
> 2024-04-13 08:42:58.096 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'74a421a9602a1572af944a252a5592099d4e275e72c0928bd8769e77033f12b3ebcdde6f995e7e'
> 2024-04-13 08:42:58.107 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'40a4a1a9602a6750352a7e'
> 2024-04-13 08:42:58.107 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'8520dd7e'
> 2024-04-13 08:42:58.109 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendUnicast: [<EmberStatus.MAX_MESSAGE_LIMIT_REACHED: 114>, 226]
> 2024-04-13 08:42:58.109 DEBUG (MainThread) [bellows.zigbee.application] Request (0xF6C0, 159) failed to enqueue, retrying in 1.5s: EmberStatus.MAX_MESSAGE_LIMIT_REACHED
> 2024-04-13 08:42:59.248 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'50a4b5a9362aba43ad9e4a4aa755d0047e'
> 2024-04-13 08:42:59.248 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'8610be7e'
> 2024-04-13 08:42:59.250 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received incomingSenderEui64Handler: [00:0d:6f:00:0a:f4:f1:af]
> 2024-04-13 08:42:59.250 DEBUG (MainThread) [bellows.zigbee.application] Received incomingSenderEui64Handler frame with [00:0d:6f:00:0a:f4:f1:af]
> 2024-04-13 08:42:59.255 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'60a4b1a9112a11b259944a25aa5596499caf97632d389874f63046390180c0e6c6d9de6f59187e'
> 2024-04-13 08:42:59.255 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'87009f7e'
> 2024-04-13 08:42:59.260 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received incomingMessageHandler: [<EmberIncomingMessageType.INCOMING_BROADCAST: 4>, EmberApsFrame(profileId=0, clusterId=0, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_SOURCE_EUI64: 1024>, groupId=0, sequence=225), 176, -56, 0xf6c0, 255, 255, b'\xf6%\xb0\xfd\xfe\xffA-\x14\x00\x00']
> 2024-04-13 08:42:59.261 DEBUG (MainThread) [bellows.zigbee.application] Received incomingMessageHandler frame with [<EmberIncomingMessageType.INCOMING_BROADCAST: 4>, EmberApsFrame(profileId=0, clusterId=0, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_SOURCE_EUI64: 1024>, groupId=0, sequence=225), 176, -56, 0xf6c0, 255, 255, b'\xf6%\xb0\xfd\xfe\xffA-\x14\x00\x00']
> 2024-04-13 08:42:59.262 DEBUG (MainThread) [zigpy.application] Received a packet: ZigbeePacket(timestamp=datetime.datetime(2024, 4, 13, 13, 42, 59, 262272, tzinfo=datetime.timezone.utc), src=AddrModeAddress(addr_mode=<AddrMode.NWK: 2>, address=0xF6C0), src_ep=0, dst=AddrModeAddress(addr_mode=<AddrMode.Broadcast: 15>, address=<BroadcastAddress.ALL_ROUTERS_AND_COORDINATOR: 65532>), dst_ep=0, source_route=None, extended_timeout=False, tsn=225, profile_id=0, cluster_id=0, data=Serialized[b'\xf6%\xb0\xfd\xfe\xffA-\x14\x00\x00'], tx_options=<TransmitOptions.NONE: 0>, radius=0, non_member_radius=0, lqi=176, rssi=-56)
> 2024-04-13 08:42:59.263 DEBUG (MainThread) [zigpy.zdo] [0xf6c0:zdo] ZDO request ZDOCmd.NWK_addr_req: [14:2d:41:ff:fe:fd:b0:25, <AddrRequestType.Single: 0>, 0]
> 2024-04-13 08:42:59.264 DEBUG (MainThread) [bellows.zigbee.application] Sending packet ZigbeePacket(timestamp=datetime.datetime(2024, 4, 13, 13, 42, 59, 264650, tzinfo=datetime.timezone.utc), src=AddrModeAddress(addr_mode=<AddrMode.NWK: 2>, address=0x0000), src_ep=0, dst=AddrModeAddress(addr_mode=<AddrMode.NWK: 2>, address=0xF6C0), dst_ep=0, source_route=None, extended_timeout=False, tsn=246, profile_id=0, cluster_id=<ZDOCmd.NWK_addr_rsp: 0x8000>, data=Serialized[b'\xf6\x00%\xb0\xfd\xfe\xffA-\x14\x00\x00\x00\x00'], tx_options=<TransmitOptions.ACK: 1>, radius=0, non_member_radius=0, lqi=None, rssi=None)
> 2024-04-13 08:42:59.265 DEBUG (MainThread) [bellows.ezsp.protocol] Send command sendUnicast: (<EmberOutgoingMessageType.OUTGOING_DIRECT: 0>, 0xf6c0, EmberApsFrame(profileId=0, clusterId=32768, sourceEndpoint=0, destinationEndpoint=0, options=<EmberApsOption.APS_OPTION_RETRY|APS_OPTION_ENABLE_ROUTE_DISCOVERY: 320>, groupId=0, sequence=246), 160, b'\xf6\x00%\xb0\xfd\xfe\xffA-\x14\x00\x00\x00\x00')
> 2024-04-13 08:42:59.266 DEBUG (Thread-32) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f3d616b60>, 'UPDATE devices_v12\n                    SET last_seen=:ts\n                    WHERE ieee=:ieee AND :ts - last_seen > :min_update_delta', {'ts': 1713015779.262272, 'ieee': 00:0d:6f:00:0a:f4:f1:af, 'min_update_delta': 30.0})
> 2024-04-13 08:42:59.268 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'07a521a9602a1572af944a252a5592099d4e275d4dc0918bd8769e77033f12b3ebcdde6f7d38f47e'
> 2024-04-13 08:42:59.268 DEBUG (Thread-32) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f3d616b60>, 'UPDATE devices_v12\n                    SET last_seen=:ts\n                    WHERE ieee=:ieee AND :ts - last_seen > :min_update_delta', {'ts': 1713015779.262272, 'ieee': 00:0d:6f:00:0a:f4:f1:af, 'min_update_delta': 30.0}) completed
> 2024-04-13 08:42:59.270 DEBUG (Thread-32) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f3d616b60>)
> 2024-04-13 08:42:59.270 DEBUG (Thread-32) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f3d616b60>) completed
> 2024-04-13 08:42:59.278 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'71a5a1a9602a6756ec827e'
> 2024-04-13 08:42:59.278 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'8070787e'
> 2024-04-13 08:42:59.280 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received sendUnicast: [<EmberStatus.MAX_MESSAGE_LIMIT_REACHED: 114>, 228]
> 2024-04-13 08:42:59.281 DEBUG (MainThread) [bellows.zigbee.application] Request (0xF6C0, 160) failed to enqueue, retrying in 0.5s: EmberStatus.MAX_MESSAGE_LIMIT_REACHED
> 2024-04-13 08:42:59.537 DEBUG (MainThread) [zigpy.application] Feeding watchdog
> 2024-04-13 08:42:59.538 DEBUG (MainThread) [bellows.ezsp.protocol] Send command readCounters: ()
> 2024-04-13 08:42:59.540 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'10aa21a9a52a7d5d0a7e'
> 2024-04-13 08:42:59.558 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'02aaa1a9a52a34b35994a525aa5592499c4ed9abedce408bfdc66389fc7e3fa7ebcdde6f8fffc7dbd5d2698c4623a9ec763ba5ea758241984c2613b1e070381c0e07bbe5ff648a45984d9e4f9ff7c3d9d46a35a251904824a57c7e'
> 2024-04-13 08:42:59.558 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'8160597e'
> 2024-04-13 08:42:59.561 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received readCounters: [[289, 0, 239, 0, 0, 0, 254, 0, 39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 309, 0, 2, 0, 0, 0, 0, 0, 0, 0]]
> 2024-04-13 08:42:59.563 DEBUG (MainThread) [bellows.ezsp.protocol] Send command getValue: (<EzspValueId.VALUE_FREE_BUFFERS: 3>,)
> 2024-04-13 08:42:59.565 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'21ab21a9fe2a16ddbe7e'
> 2024-04-13 08:42:59.571 DEBUG (bellows.thread_0) [bellows.uart] Data frame: b'13aba1a9fe2a15b39dfce87e'
> 2024-04-13 08:42:59.572 DEBUG (bellows.thread_0) [bellows.uart] Sending: b'82503a7e'
> 2024-04-13 08:42:59.573 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame received getValue: [<EzspStatus.SUCCESS: 0>, b'\xc4']
> 2024-04-13 08:42:59.574 DEBUG (MainThread) [bellows.zigbee.application] Free buffers status EzspStatus.SUCCESS, value: 196
> 2024-04-13 08:42:59.574 DEBUG (MainThread) [bellows.zigbee.application] ezsp_counters: [MAC_RX_BROADCAST = 19607, MAC_TX_BROADCAST = 7536, MAC_RX_UNICAST = 137263, MAC_TX_UNICAST_SUCCESS = 107830, MAC_TX_UNICAST_RETRY = 8069, MAC_TX_UNICAST_FAILED = 1660, APS_DATA_RX_BROADCAST = 10943, APS_DATA_TX_BROADCAST = 48, APS_DATA_RX_UNICAST = 58450, APS_DATA_TX_UNICAST_SUCCESS = 50588, APS_DATA_TX_UNICAST_RETRY = 0, APS_DATA_TX_UNICAST_FAILED = 6522, ROUTE_DISCOVERY_INITIATED = 137, NEIGHBOR_ADDED = 1, NEIGHBOR_REMOVED = 0, NEIGHBOR_STALE = 22, JOIN_INDICATION = 0, CHILD_REMOVED = 0, ASH_OVERFLOW_ERROR = 0, ASH_FRAMING_ERROR = 0, ASH_OVERRUN_ERROR = 0, NWK_FRAME_COUNTER_FAILURE = 0, APS_FRAME_COUNTER_FAILURE = 0, UTILITY = 0, APS_LINK_KEY_NOT_AUTHORIZED = 0, NWK_DECRYPTION_FAILURE = 0, APS_DECRYPTION_FAILURE = 0, ALLOCATE_PACKET_BUFFER_FAILURE = 0, RELAYED_UNICAST = 0, PHY_TO_MAC_QUEUE_LIMIT_REACHED = 0, PACKET_VALIDATE_LIBRARY_DROPPED_COUNT = 0, TYPE_NWK_RETRY_OVERFLOW = 66415, PHY_CCA_FAIL_COUNT = 117044, BROADCAST_TABLE_FULL = 128, PTA_LO_PRI_REQUESTED = 0, PTA_HI_PRI_REQUESTED = 0, PTA_LO_PRI_DENIED = 0, PTA_HI_PRI_DENIED = 0, PTA_LO_PRI_TX_ABORTED = 0, PTA_HI_PRI_TX_ABORTED = 0, ADDRESS_CONFLICT_SENT = 0, EZSP_FREE_BUFFERS = 196]

Im convinced you have no prayer of a stable mesh without the extra routers. You’ll be very suprised what happens when you no longer have to deal with signal issues.

Also the map is only good for a snapshot I time troubleshooting point connections. Don’t take ‘it’s green’ as anything being good. Save yourself time and wait until the routers show up before you try to fight it.

3 Likes

OK, I get it. My 4 new Third Reality smart plugs are scheduled to arrive on Monday. I’ve also just finished the SkyConnect firmware upgrade to 7.3.2.0 build 212.

So let’s see what happens after I get the new routers added into the mesh.

1 Like

Note that it normally takes at least 24-hours for the Zigbee network mesh to stabilze after you add addional Zigbee Router devices, and note that you might need to manually re-pair some buggy Zigbee End Devices if they do not move to a better/closer router after that.

1 Like

After upgrading the firmware on my SkyConnect and before even adding the 4 new switches to my network, I didn’t have any issues for a few days. And now, I’ve got the 4 new switches added in and spread around the house and so far, so good.

Thanks everyone for the help!

5 Likes

dose some one tested 7.4.0.0 ?

I still get errors like "
Fehler beim Aufrufen des Diensts switch/turn_on. Failed to send request: Failed to deliver message: <EmberStatus.DELIVERY_FAILED: 102>"
with Firmware : “7.4.0.0”