One brand of ZigBee devices becomes unresponsive

A few days after an HA restart, some of my ZigBee devices become unresponsive and log the error below when I try to switch them on or off. They are all SZ-ESW01-AU smart sockets.

Most of my other ZigBee devices stay functional although one smart light switch seems to randomly turn itself on when this happens.

So far my Googling has only come up with old reports that were apparently patched…

Error:

Logger: homeassistant.components.websocket_api.http.connection
Source: components/websocket_api/commands.py:239
integration: Home Assistant WebSocket API (documentation, issues)
First occurred: May 9, 2024 at 20:20:49 (8 occurrences)
Last logged: 05:42:57

[139697706285248] Unexpected exception
[139697218499904] Unexpected exception
[139697203075648] Unexpected exception
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 64, in wrap_zigpy_exceptions
    yield
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 84, in wrapper
    return await RETRYABLE_REQUEST_DECORATOR(func)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/util.py", line 131, in retry
    return await func()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/zcl/__init__.py", line 377, in request
    return await self._endpoint.request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/endpoint.py", line 265, in request
    return await self.device.request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/device.py", line 338, in request
    with self._pending.new(sequence) as req:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/zigpy/util.py", line 290, in new
    raise ControllerException(f"Duplicate TSN: {sequence}")
zigpy.exceptions.ControllerException: Duplicate TSN: 98

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 239, in handle_call_service
    response = await hass.services.async_call(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2738, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2779, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 975, in entity_service_call
    single_response = await _handle_entity_call(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 1047, in _handle_entity_call
    result = await task
             ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/template/switch.py", line 169, in async_turn_off
    await self.async_run_script(self._off_script, context=self._context)
  File "/usr/src/homeassistant/homeassistant/components/template/template_entity.py", line 563, in async_run_script
    await script.async_run(
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 1731, in async_run
    return await asyncio.shield(create_eager_task(run.async_run()))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 451, in async_run
    await self._async_step(log_exceptions=False)
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 503, in _async_step
    self._handle_exception(
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 533, in _handle_exception
    raise exception
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 501, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 736, in _async_call_service_step
    response_data = await self._async_run_long_action(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 699, in _async_run_long_action
    return await long_task
           ^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2738, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2779, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 975, in entity_service_call
    single_response = await _handle_entity_call(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 1047, in _handle_entity_call
    result = await task
             ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 638, in async_handle_light_off_service
    await light.async_turn_off(**filter_turn_off_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/zha/light.py", line 471, in async_turn_off
    result = await self._on_off_cluster_handler.off()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 83, in wrapper
    with wrap_zigpy_exceptions():
  File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/usr/src/homeassistant/homeassistant/components/zha/core/cluster_handlers/__init__.py", line 75, in wrap_zigpy_exceptions
    raise HomeAssistantError(message) from exc
homeassistant.exceptions.HomeAssistantError: Failed to send request: Duplicate TSN: 98
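
For readers unfamiliar with the message: the traceback shows the error being raised when a new request is registered under a sequence number (TSN) that already has a request pending. The sketch below is a toy model of that kind of bookkeeping, not zigpy's actual code. `PendingRequests`, `leak` and `DuplicateTSNError` are hypothetical names, and the "stuck entry" scenario is only one assumed way a duplicate could arise once the 8-bit counter wraps around.

```python
from contextlib import contextmanager


class DuplicateTSNError(Exception):
    """Stand-in for the ControllerException seen in the traceback."""


class PendingRequests:
    """Toy registry of in-flight requests keyed by transaction sequence number."""

    def __init__(self):
        self._pending = {}

    @contextmanager
    def new(self, tsn):
        # Refuse to register a second request under a TSN that is still in flight.
        if tsn in self._pending:
            raise DuplicateTSNError(f"Duplicate TSN: {tsn}")
        self._pending[tsn] = "waiting"
        try:
            yield tsn
        finally:
            # Normal path: the entry is cleared once the request finishes.
            self._pending.pop(tsn, None)

    def leak(self, tsn):
        # Hypothetical failure mode: an entry that is never cleaned up.
        self._pending[tsn] = "stuck"


def next_tsn(tsn):
    # ZCL sequence numbers are 8-bit, so the counter wraps after 255.
    return (tsn + 1) % 256


pending = PendingRequests()
pending.leak(98)  # simulate a request that never completed

tsn = 98
errors = []
for _ in range(256):  # one full trip around the 8-bit counter
    tsn = next_tsn(tsn)
    try:
        with pending.new(tsn):
            pass  # request sent and answered normally
    except DuplicateTSNError as exc:
        errors.append(str(exc))

print(errors)  # ['Duplicate TSN: 98']
```

In this model every other request succeeds and only the one that lands on the stuck number fails. Because the table lives in memory, a restart would clear it, which would be consistent with a restart bringing the devices back.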

Describe your Zigbee mesh.

Some Zigbee tips

I have 35 devices running off a Conbee2 stick which is plugged into my NUC via a 2m extension cable.

There are 5 of these Sercomm SZ-ESW01-AU sockets (the ones that are playing up), which have worked fine for years. There is one dual light switch, which is the newest addition (my first thought of where the issue might be coming from), 3 hard-wired presence sensors (2x Moes House and 1 random something), 2x IKEA bulbs, a bunch of IKEA wireless multibuttons and a bunch of Xiaomi wireless buttons.

When the issue arises it only seems to affect the hard-wired devices (the Sercomm plugs and the dual light switch).

All the hard-wired devices are quite evenly spread throughout the house and are all acting as routers except the dual light switch.

This probably isn’t much help…

The 2 red ovals are the IKEA bulbs, which I haven’t operated in weeks, so I’m not sure what is up with those, and the red circle is a dead battery in an IKEA remote.

Also, for clarity given the error message, the Sercomm plugs show up in HA as lights.

So fewer than 10 routers in a mesh of 35 is not a great ratio. If they are not dispersed perfectly you will have some devices dropping off. I suggest more routing devices. Also, a Conbee is not a top-of-the-line device (any more), so you might look at upgrading it. It can probably handle 35 devices, but I’m not sure what happens if you get to 50.

That link I sent has these and other tips on helping your mesh.

The router devices are quite well spread through the house which is only single story so the distances between them aren’t huge at all.

The issues only started about a week or so ago, which is roughly when I added the dual light switch. Got any tips for seeing if that device specifically is causing trouble on the network?

In the meantime I’m going to investigate if I can force it to be a router given that it’s hard-wired.

Have you reset that dual light switch and re-paired it with the setup to sanity check it?

Not yet, will try that over the weekend.

I had a look just to check things. This error seems to indicate a duplicate device issue, i.e. the device sends the response twice, indicating a weak signal between the device and the nearest router.

All of these devices are within range of the coordinator, with the farthest being maybe 8m and so all within very close range of each other.

When this happens it takes an HA restart to get them working again (although I haven’t tried just reloading ZHA in this instance).

It could just be that dual switch sending out too much chatter on the network causing the issue perhaps.

Yes but through what kind of materials/walls? Also, are the routers closer to the ground or higher up? Etc…

I am not saying you are wrong, just that in my experience lots of things can affect radio propagation, besides distance.

Examine network map, and/or look at LQI of connected devices. Although I guess you are using ZHA, as your map looks different than mine (I use Z2M).

Alternatively, disconnect power from the new device, to see if things work better. You might have to re-pair some other routers, or wait some time for them to ‘heal’ the network. And then of course, some devices from certain mfrs. are known to be ‘sticky’ instead of switching to some other device with a better signal.

Could be some combination of issues with this specific hardware in combination with those other router devices. Maybe try moving some things around or adding some other router devices, perhaps from different mfrs. even?

Interesting. I just had a look and I have 44 devices, 24 end and 20 router.

I seem to have a few sensors dropping off here and there, but I have not even begun to look into that yet, and I don’t want to hijack this thread, so I will make my own thread later.

For now I will just follow along, and perhaps learn a little more about Zigbee generally which may help me with my own network.

Cheers! :)

I have 13 router devices (9 smart plugs, 3 Hue bulbs and 1 Hue LED strip) and 10 end devices (3 motion sensors, a temp sensor, a leak sensor being used as a bed occupancy sensor, and the rest are light switches for the ceiling lights).

All are stable at this time in my small unit.

Waiting to see what is said about that double light switch that was added when the issue started in his setup.

I agree, this also sounds to me like the problem in this case. At least that’s where I would also begin looking, since that seems to be when the problems started.

Eight metres is quite a long way. Mine are 10-15 feet apart. They don’t need to be in range of the coordinator - messages get passed from router to router and the important thing is to make sure there are lots of alternative routes for them to take.

Another thing to bear in mind is that not all routers are created equal. Bulbs are likely to be less effective than plugs, for example. You might try adding two or three “repeaters” (ie routers that do nothing else) - I have at least one in every room.

Has anyone given you this link?

I linked them the entire Zigbee cookbook section in my first link.

That’s to the furthest router though; there are others within that distance. The thing is, this issue is not frequent at all, e.g. it hasn’t happened again since my first post 4 days ago.

The dual switch which is the latest addition to the Zigbee network is only roughly 3m from the coordinator. I have another one or two of the Sercomm sockets so I’ll add that in somewhere close by as well. Unfortunately I didn’t have time to work on this over the weekend.

UPDATE: yesterday the dual light switch became unresponsive so I’m definitely thinking this is the culprit for network issues. Currently trying to investigate further.

There are a lot of factors that play into network issues. The first thing to do is review that cookbook that @Sir_Goodenough linked because it’s great information. On a high level, here’s what I look for in my network:

  • Am I on the right channel given my 2.4 GHz Wi-Fi?
  • Do I have routers spread evenly throughout, compensating for walls? This is important because a signal crossing a 6" wall at a 20 degree angle has to get through roughly three times as much material as one hitting it square on. So make sure there are 90 degree angles as much as possible for your base network.
  • Have I disabled every device entity that means little or nothing to me, such as current or voltage? I know my voltage, I know my current, I don’t need 40 devices chatting up my network trying to report it.
  • Are there other Wi-Fi networks close enough to interfere (neighbors, commercial, etc.)?
  • Are my battery devices charged up? I’ve seen weak battery devices get very chatty at the end of their battery.
  • Are all my devices name brand?
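
To put rough numbers on the wall-angle point in the list above, here is a minimal geometric sketch. It assumes a flat, uniform wall and a straight-line signal path, and it measures the angle between the path and the wall surface (90 means straight through). `effective_thickness` is a name made up for this illustration, and real attenuation also depends on the wall material, which this ignores.

```python
import math


def effective_thickness(thickness, angle_deg):
    """Length of the straight-line path through a flat wall.

    angle_deg is the angle between the signal path and the wall surface
    (90 = straight through, small values = grazing the wall).
    """
    if not 0 < angle_deg <= 90:
        raise ValueError("angle must be in (0, 90] degrees")
    return thickness / math.sin(math.radians(angle_deg))


# A 6 inch wall crossed at a few different angles.
for angle in (90, 45, 20, 5):
    print(f"{angle:>2} deg: {effective_thickness(6, angle):.1f} in")
```

At 20 degrees the path through a 6 inch wall comes out at about 17.5 inches, and it grows quickly as the angle gets shallower, which is the reason for keeping router-to-router paths close to square-on through walls.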

Those are my go-to’s. Aside from that there’s some outstanding advice already given.

I have gone through all that, which has led me to where I am.

Using Zigbee channel 15
Routers are throughout the house, which is quite open plan.
Issue only arose recently, which seems to coincide with the dual light switch being installed.
Nearest wifi device would be the nearest AP which is about 4m away on the other side of a brick wall from the suspect light switch.
There are a couple of low battery devices which I’ll look into, so that’s a good one.
The devices that are playing up… light switch is Tuya (so not a great start) and the others that get affected during this issue are Sercomm (rebranded to Telstra which is an Australian telco)
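
As a quick sanity check on the channel question, the sketch below compares Zigbee channel 15 with the three common non-overlapping 2.4 GHz Wi-Fi channels, using nominal centre frequencies. The Wi-Fi channel of the nearby AP is not stated in the thread, so channels 1, 6 and 11 are only assumed examples, and the 20 MHz and 2 MHz widths are nominal figures. Real transmitters leak outside these bounds, and 40 MHz Wi-Fi would cover more spectrum.

```python
def zigbee_center_mhz(channel):
    # 2.4 GHz Zigbee (IEEE 802.15.4) channels 11-26 are spaced 5 MHz apart.
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz Zigbee channels run from 11 to 26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    # 2.4 GHz Wi-Fi channels 1-13 are also spaced 5 MHz apart.
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers Wi-Fi channels 1 to 13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel, wifi_channel, wifi_width_mhz=20, zigbee_width_mhz=2):
    # Two bands overlap when their centres are closer than the sum of half-widths.
    gap = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return gap < (wifi_width_mhz + zigbee_width_mhz) / 2


for wifi in (1, 6, 11):
    print(f"Zigbee 15 vs Wi-Fi {wifi}: overlap={overlaps(15, wifi)}")
```

With these nominal widths, Zigbee 15 (2425 MHz) sits in the gap between Wi-Fi 1 and Wi-Fi 6 and overlaps none of the three, so the channel choice looks reasonable unless the AP is on an in-between channel or using wide channels.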

I’ve contacted the supplier of the light switch because I think this is causing the network issues, given that everything was perfectly fine prior to its install. I guess my question now is how to debug that particular device…
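
One low-effort starting point is to tally the errors already in the log. The sketch below counts how often each "Duplicate TSN" value appears in a block of log text; `count_duplicate_tsn` is a hypothetical helper and the sample lines are copied from the traceback earlier in this thread. It will not identify the offending device on its own, but it shows whether the same sequence number keeps recurring.

```python
import re
from collections import Counter

# Matches the message text seen in the traceback in this thread.
DUPLICATE_TSN = re.compile(r"Duplicate TSN: (\d+)")


def count_duplicate_tsn(log_text):
    """Return a Counter mapping TSN value -> number of log lines mentioning it."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DUPLICATE_TSN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = """\
zigpy.exceptions.ControllerException: Duplicate TSN: 98
homeassistant.exceptions.HomeAssistantError: Failed to send request: Duplicate TSN: 98
"""
print(count_duplicate_tsn(sample))  # Counter({98: 2})
```

To run it against a real log, read the file into a string first and pass that in. The file's location depends on the install, so check where your setup writes its log.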

Sounds like a good start. Why not just turn off the power to it and see how or if things react to that?