Cannot get any Zigbee devices to stay online (ZHA)

I have a Zigbee network based on the SONOFF 3.0 dongle, six IKEA TRADFRI repeaters, a USB repeater and a Zigbee socket that acts as a repeater as well.

The devices have all operated for over a year, but following intermittent problems I revised the layout last October.

  1. I moved the hub away from a TV that was producing 2.4 GHz-band interference for some reason
  2. The hub and dongles are in a central location
  3. I moved the SONOFF dongle and a Bluetooth dongle about 2m away from the hub and each other to prevent mutual interference
  4. I ensured that the Zigbee and WiFi bands are well separated
  5. I redistributed the repeaters, keeping them away from WiFi access points and other known sources of interference

I was delighted that the network became much more stable and there were very few intermittent problems.

However, three months later it all fell apart. Pretty much all the Zigbee devices are offline, except the coordinator. If I reconnect the repeaters, they configure successfully but go offline again within minutes (I have the “Consider mains powered devices unavailable after” set to 60 seconds). If the repeaters are all offline, it is not surprising that the end devices are too.

I cannot think of anything I have changed apart from keeping the system up to date.

The general log shows no relevant errors.

If I enable debug logging and reload the Zigbee Coordinator, I get a debug log that I cannot decipher – I attach an extract from the end as the whole thing is apparently too big:


[...] 
2024-02-15 21:41:05.064 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=sensor.bed_occupied, old_state=<state sensor.bed_occupied=Unknown; icon=, friendly_name=Main bedroom bed occupancy @ 2024-02-15T21:40:50.586152+00:00>, new_state=<state sensor.bed_occupied=Unoccupied; icon=mdi:bed-empty, friendly_name=Main bedroom bed occupancy @ 2024-02-15T21:41:05.064420+00:00>>
2024-02-15 21:41:05.072 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': 00:12:4b:00:24:45:1b:a8, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.930135, 'min_update_delta': 30.0})
2024-02-15 21:41:05.072 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': 00:12:4b:00:24:45:1b:a8, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.930135, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.110 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.110 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:05.110 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:85:97:9e:2f:76, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.933835, 'min_update_delta': 30.0})
2024-02-15 21:41:05.111 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:85:97:9e:2f:76, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.933835, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.111 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.111 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:05.112 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:cb:fb:ec:c4:0d, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.936185, 'min_update_delta': 30.0})
2024-02-15 21:41:05.112 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:cb:fb:ec:c4:0d, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.936185, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.112 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.112 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:05.113 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:24:95:e2:78:8c, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.938618, 'min_update_delta': 30.0})
2024-02-15 21:41:05.113 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:24:95:e2:78:8c, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.938618, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.114 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.114 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:05.114 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:f9:65:93:0b:81, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.939173, 'min_update_delta': 30.0})
2024-02-15 21:41:05.115 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:f9:65:93:0b:81, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.939173, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.115 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.115 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:05.116 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:a0:c5:64:f8:c5, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.940265, 'min_update_delta': 30.0})
2024-02-15 21:41:05.116 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:a0:c5:64:f8:c5, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.940265, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.116 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.116 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:05.117 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:e6:4c:10:e7:04, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.943709, 'min_update_delta': 30.0})
2024-02-15 21:41:05.117 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:e6:4c:10:e7:04, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.943709, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.117 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.117 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:05.118 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:dc:4c:9a:f2:6e, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.94485, 'min_update_delta': 30.0})
2024-02-15 21:41:05.118 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:dc:4c:9a:f2:6e, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.94485, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.118 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.118 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:05.119 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:22:c0:ad:89:9f, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.945994, 'min_update_delta': 30.0})
2024-02-15 21:41:05.119 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method execute of sqlite3.Connection object at 0x7f46c65058a0>, '\n            INSERT INTO attributes_cache_v12\n            VALUES (:ieee, :endpoint_id, :cluster_id, :attrid, :value, :timestamp)\n                ON CONFLICT (ieee, endpoint_id, cluster, attrid) DO UPDATE\n                SET value=excluded.value, last_updated=excluded.last_updated\n                WHERE\n                    value != excluded.value\n                    OR :timestamp - last_updated > :min_update_delta\n            ', {'ieee': a4:c1:38:22:c0:ad:89:9f, 'endpoint_id': 1, 'cluster_id': 1280, 'attrid': 2, 'value': <ZoneStatus: 0>, 'timestamp': 1708033264.945994, 'min_update_delta': 30.0}) completed
2024-02-15 21:41:05.119 DEBUG (Thread-33) [aiosqlite] executing functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>)
2024-02-15 21:41:05.119 DEBUG (Thread-33) [aiosqlite] operation functools.partial(<built-in method commit of sqlite3.Connection object at 0x7f46c65058a0>) completed
2024-02-15 21:41:08.136 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event logging_changed[L]>
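
An extract like the one above is dominated by `aiosqlite` and `homeassistant.core` lines, which makes it hard to spot anything Zigbee-related. A minimal sketch of one way to thin such a log out, assuming the line layout shown above (`date time LEVEL (thread) [logger] message`). The list of logger prefixes to keep is an assumption, not an official list, and continuation lines without a timestamp are simply dropped:

```python
import re
from collections import Counter

# Matches the log line layout seen in the extract above:
# "<date> <time> <LEVEL> (<thread>) [<logger>] <message>"
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} "
    r"(?P<level>[A-Z]+) \((?P<thread>[^)]*)\) \[(?P<logger>[^\]]+)\] "
)

# Assumption: logger names starting with these prefixes are the
# Zigbee-related ones worth keeping. Adjust to taste.
KEEP_PREFIXES = ("zigpy", "bellows", "homeassistant.components.zha")


def logger_of(line):
    """Return the logger name of a log line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.group("logger") if match else None


def count_loggers(lines):
    """Count lines per logger, to see what is flooding the log."""
    return Counter(name for name in map(logger_of, lines) if name)


def keep_zigbee_lines(lines, prefixes=KEEP_PREFIXES):
    """Keep only lines whose logger name starts with one of the prefixes."""
    return [ln for ln in lines if (logger_of(ln) or "").startswith(prefixes)]
```

To use it, read the log file into a list of lines (for example with `Path("home-assistant.log").read_text().splitlines()`, where the file name is hypothetical), call `count_loggers` to see which loggers produce the noise, then `keep_zigbee_lines` to get a much shorter extract to post.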

By “hub” do you mean your Home Assistant server? What are you running HA on?

Yes

A DELL Optiplex 3050 MFF

Any close neighbours? They may have got a new wi-fi router.

No, it is a detached house.
The WiFi Analyzer app cannot see any WiFi networks at all apart from two in this house.

I have a 2.4 GHz-only TP-Link Deco M5 mesh WiFi that has automatically selected channel 6 (the channel cannot be fixed, unfortunately), and a 5 GHz-only WAP on the original WiFi router that is not used by HA. Zigbee is on channel 15. I suppose I could move it up to 25 … ?

Zigbee ch25 should be a good choice if you use wifi ch6. Even better would be moving your wifi to ch1 as well.
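
The reasoning behind that advice can be sketched with the standard channel-centre formulas for the 2.4 GHz band. This is a rough illustration only: the 22 MHz Wi-Fi width and 2 MHz Zigbee width are nominal assumptions, and real Wi-Fi signals spill somewhat beyond their nominal width, so a channel just outside the computed range (such as 15) may still see some energy.

```python
WIFI_WIDTH_MHZ = 22.0   # assumption: nominal width of a 2.4 GHz Wi-Fi channel
ZIGBEE_WIDTH_MHZ = 2.0  # assumption: nominal width of an 802.15.4 channel


def zigbee_center_mhz(channel):
    """Centre frequency of Zigbee (802.15.4) channels 11..26."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11..26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Centre frequency of 2.4 GHz Wi-Fi channels 1..13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers Wi-Fi channels 1..13 only")
    return 2407 + 5 * channel


def overlapping_zigbee_channels(wifi_channel):
    """Zigbee channels whose nominal band intersects the Wi-Fi channel's band."""
    wifi = wifi_center_mhz(wifi_channel)
    limit = (WIFI_WIDTH_MHZ + ZIGBEE_WIDTH_MHZ) / 2
    return [
        ch for ch in range(11, 27)
        if abs(zigbee_center_mhz(ch) - wifi) < limit
    ]


if __name__ == "__main__":
    for wifi_channel in (1, 6, 11):
        print(wifi_channel, overlapping_zigbee_channels(wifi_channel))
```

Under these assumptions Wi-Fi channel 6 (2437 MHz) covers Zigbee channels 16 to 19, while Zigbee channel 25 (2475 MHz) sits well clear of it.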

FYI you can look at your energy scan too:

Unfortunately, the Deco M5 does not have the facility to set the channel or restrict the choice of channels. It is supposed to find a channel that is not busy, so should work around Zigbee traffic.

Wow. What is to stop it changing to Ch11 after you move your zigbee channel?

It is supposed to choose the least used channel but a quick search reveals this is not always the case and users have been calling for manual control for years.

I’d definitely be looking into alternate wifi hardware.

I have

    "energy_scan": {
      "11": 9.411764705882353,
      "12": 11.764705882352942,
      "13": 7.0588235294117645,
      "14": 15.294117647058824,
      "15": 12.941176470588236,
      "16": 31.764705882352942,
      "17": 49.411764705882355,
      "18": 52.94117647058823,
      "19": 30.58823529411765,
      "20": 2.3529411764705883,
      "21": 17.647058823529413,
      "22": 25.88235294117647,
      "23": 23.529411764705884,
      "24": 29.41176470588235,
      "25": 25.88235294117647,
      "26": 54.11764705882353
    }

What does that tell me?

Channel 20 has the least traffic, so in theory it is the best one to use. (If I have that right…)
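
A quick way to check that reading is to sort the scan by value. The sketch below uses the numbers posted above and assumes that a lower value means less energy detected on that channel; the function name is made up for illustration.

```python
# Energy scan values copied from the post above (channel -> reading).
energy_scan = {
    "11": 9.411764705882353,
    "12": 11.764705882352942,
    "13": 7.0588235294117645,
    "14": 15.294117647058824,
    "15": 12.941176470588236,
    "16": 31.764705882352942,
    "17": 49.411764705882355,
    "18": 52.94117647058823,
    "19": 30.58823529411765,
    "20": 2.3529411764705883,
    "21": 17.647058823529413,
    "22": 25.88235294117647,
    "23": 23.529411764705884,
    "24": 29.41176470588235,
    "25": 25.88235294117647,
    "26": 54.11764705882353,
}


def rank_channels(scan):
    """Return (channel, reading) pairs sorted from quietest to busiest."""
    return sorted(
        ((int(channel), value) for channel, value in scan.items()),
        key=lambda pair: pair[1],
    )


if __name__ == "__main__":
    for channel, value in rank_channels(energy_scan):
        print(f"channel {channel}: {value:5.1f}")
```

With these numbers channel 20 comes out quietest and channel 26 busiest. The block of higher readings on channels 16 to 19 is consistent with the WiFi mesh sitting on channel 6.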

I appreciate all the comments, but we seem to be wandering off-topic. I am not looking for advice in setting up a new network. I am looking for a way of diagnosing a working network that suddenly stopped working. Almost, but not quite, completely. Practically the whole network is down and I do not know how to go about diagnosing the problem. The debug log seemed the obvious answer, but I cannot make head or tail of it. :anguished:

Does the debug log extract that I posted not give any clues?
Would a different part of the debug log help more?
Is there a way of adding a zip file (of the whole log) to a post on this forum?

In a possibly related problem, I cannot get my Zigbee network to move away from channel 15 :man_shrugging: If I try to change the channel it goes busy for a long while, then shows the channel as still 15. This is odd because I set up a different network before on channel 25 using the same type of dongle :thinking: Maybe I need to do some kind of reset on the dongle??

I was thinking that Zigbee was useless here until I added one of these in each room of my house. Two in large rooms.

Zigbee is susceptible to degradation by other 2.4 GHz devices because it has, by regulation, the lowest legal transmit power. Any radiation in the 2.4GHz band will decrease the signal to noise ratio of the Zigbee device, and if the background radiation is strong enough, the Zigbee device will not hear the Zigbee network. This is called desense.

I don’t recall if you said what Zigbee controller you are using, but if it’s a Zigbee Dongle, do not plug it into a USB 3.0 port.

@AndySymons it is probably best to open an issue against Home Assistant core and provide debug logs from ZHA, as judging by the short snippet of debug messages it might be a problem with the new OTA update features → https://www.home-assistant.io/integrations/zha#reporting-issues (i.e. create a new issue at Issues · home-assistant/core · GitHub and post all your details there so that the ZHA developers can properly analyze the full logs).

This forum is not the right place to post full debug logs, and the ZHA developers do not hang out here :stuck_out_tongue:

However if you wanted then you could upload the full logs to pastebin.com and post the URL to it.

That could be related so you should definitely mention that when you start the issue as per above.

It is possible to back up your Zigbee network to a file and then reset the Zigbee Coordinator before restoring the network from the file (you can do all that using zigpy-cli as a stand-alone command line tool, either on a different computer or with the ZHA integration disabled), but I would advise first opening a new issue against Home Assistant core and waiting for a reply from the ZHA developers after they have analyzed your full debug logs.

That procedure would work similarly to migrating to a new Zigbee Coordinator adapter inside ZHA, with the difference that you reset the adapter instead of switching to a new adapter → https://www.home-assistant.io/integrations/zha#migrating-to-a-new-zigbee-coordinator-adapter-inside-zha

Understand that a Zigbee Coordinator NCP (Network Co-Processor) adapter is not just a dumb radio adapter. The Zigbee stack for the Zigbee Coordinator runs locally on that adapter and controls the whole Zigbee network, while the host application running on the computer more or less just presents the objects and sends commands telling the Zigbee Coordinator what to change.

Yes, if your computer only has USB 3.x/4.x ports then it is recommended to buy a powered USB 2.0 hub for connecting Zigbee, Thread, and Bluetooth Low Energy (BLE) radio adapter dongles, because USB 3.x/4.x is infamous for causing serious interference for all of those low-powered radios that use the 2.4 GHz frequency range.

Read and follow → Zigbee networks: how to guide for avoiding interference and optimize for getting better range + coverage

That includes making sure all Zigbee, Thread, and BLE radio adapter dongles and devices are not located physically close to any other sources of EMF/EMI/RFI (like WiFi routers/access points, alarm systems, electric appliances/peripherals/devices, power supplies, power cables/wires, etc.). Best to put on a tin-foil hat and overkill this whole task :wink:

If you have the ITead’s “Sonoff Zigbee 3.0 USB Dongle Plus” (model “ZBDongle-P” based on Texas Instruments CC2652P) then you should upgrade the Z-Stack Zigbee Coordinator NCP firmware on it → ITead's "Sonoff Zigbee 3.0 USB Dongle Plus" (model "ZBDongle-P") based on Texas Instruments CC2652P +20dBm radio SoC/MCU

If you instead have ITead’s “Sonoff Zigbee 3.0 USB Dongle Plus V2” (model “ZBDongle-E” based on Silicon Labs EFR32MG21) then you probably do not need to update, as the older Zigbee EmberZNet 6.10.x firmware they still ship on them is known to be very stable. However, if you cannot solve your problems by other means then you should consider upgrading to the latest Zigbee EmberZNet 7.3.x bug-fix version (but not EmberZNet 7.4.x or later yet) → ITead’s “Sonoff Zigbee 3.0 USB Dongle Plus V2” (model "ZBDongle-E") based on Silicon Labs EFR32MG21 +20dBm radio SoC/MCU

FYI, unfortunately, Zigbee is low-energy, and thus the signals are so weak that your Wi-Fi channels will not even notice that they are there. That is, Wi-Fi can jam Zigbee/Thread/BLE but not vice versa.

PS: This was just one of the many reasons that made me migrate from having six TP-Link Deco M5 WiFi mesh routers to four Unifi U6 Lite access points (and the Unifi Dream Machine) a few years ago.

This is by design. Stupid, but still an artifact of the USB3 protocol. In order to get 5Gb data from USB3, you need a clock frequency of half or 1/4 of the data rate. This is the source of the RF noise at 2.5GHz that desenses receivers in the 2.4GHz band.

What does that mean exactly?

I said in the original post that it is a SONOFF 3.0. It is already plugged into a USB 2.0 port (the DELL Optiplex provides both USB 2.0 and USB 3 Gen 1) via a 2m extension cable. I also mentioned that I already have several repeaters. I am not setting up a new network; I am trying to debug one that was working and then stopped.

It means that you should open an issue on the GitHub repository for Home Assistant core and provide debug logs from ZHA there → Issues · home-assistant/core · GitHub

Follow the instructions from this ZHA docs section on reporting issues for details on what is needed → https://www.home-assistant.io/integrations/zha#reporting-issues

It will help to be more precise. Itead sell two different Zigbee dongles based on different radio chips (using different Zigbee stacks with different firmware), the Sonoff ZBDongle-P based on Texas Instruments CC2652P and the Sonoff ZBDongle-E based on Silicon Labs EFR32MG21.

What exact firmware version are you using?

FYI, the most common problem symptoms are due to a Zigbee network that is not optimized or to user error in the setup/configuration. That is why it is generally a good idea to make sure you are following all the best practices before troubleshooting further, as not doing so makes troubleshooting much harder, see → Zigbee networks: how to guide for avoiding interference and optimize for getting better range + coverage

Not a good idea to assume that you are already following all best practices when most other people are not.

That is too technical for me!

What I did do today, in the absence of a reset procedure, was to unplug the SONOFF Zigbee dongle, delete the Zigbee coordinator device in HA, restart HA, then plug in the dongle and start from scratch. At least that unstuck whatever was stopping me change the channel, so I took the opportunity to change it to 25, though I do not expect that to solve the problem.

I then went around the house re-attaching all 47 devices – a good couple of hours' work, because HA had lost the friendly names for my devices and so did not match them up. However, eventually everything worked.

I then made the mistake of updating the HA core to 2024.2.1 without making my own backup. I did tick the box for HA to make one. After the restart, it started falling apart again – some of the repeaters went offline.

Thinking that the upgrade had caused the problem, I reverted to the previous release using the HA backup. That only made matters worse. Now the repeaters are online but 29 of the battery-operated devices have gone offline.

I have not had a good HA day!

I assure you I did all that back in October 2023. I even got a spectrum analyser so that I could find interference sources (like the TV I mentioned) and check the operation of the WiFi mesh. I have applied this in two houses now.