ZHA went from stable to unusable, looking for clues

I would appreciate some pointers to debug further. For the past few weeks, my Zigbee installation has been almost unusable, without any deliberate change on my side.
Initially, I observed that there was some issue when HA was restarting - the integration was not able to finish loading; from the message it pointed to the serial device being in use. But that fixes itself after a while - and I have not seen it lately. Hm.
But the real troubles began a bit later, when the lights and switches all just stopped responding, despite still being “available” in the graph. I trigger a light change from HA, then just get errors like NWK_NO_ROUTE 205.
As a remedy, I powered off ALL mains devices in the network, restarted HA, and turned them on one by one. Result: a stable network - for a while. After a reboot of the HA PC (updates), the network was again in a broken state. Interestingly, even in this state, battery-powered devices close to the coordinator (motion detectors, buttons, etc.) still publish their info.
From this, I deduce that there may be some sort of routing problem, which I have trouble figuring out further. Is there some state somewhere I can force to reset, to rebuild cleanly (similar to the staged repowering I did, but from a coordinator/routing perspective)?
Should I look into HW/radio issues (unlikely as there was no change there)?
Could it be that a Zigbee device through which most messages are routed suddenly went haywire and is blocking the network?
What can be checked from software level?
Should I try to use source routing?
Is there a known issue as of late?

Zigbee network:
Several Aqara sensors, TRADFRI (on/off switch, control outlet, driver, bulbs E27 and GU10, switched outlet), TS0502B 24V dimmer, VINDSTYRKA. Lots of powered routers, but somewhat awkwardly spaced across 4 floors (so some are needed to bridge the distances).

HW Environment:
zha/znp stack, Sonoff_Zigbee_3.0_USB_Dongle_Plus TI CC2652P, not on the latest firmware (newer firmwares would be affected by some warnings I see, so I stayed with “working”).
Stick connected to 12V powered USB hub (nothing else there), connection by USB2.0 only

SW Environment:
x64 Linux Fedora, Home Assistant Container (privileged), /dev/ mapped, running 2025.11.1

{
  "home_assistant": {
    "installation_type": "Home Assistant Container",
    "version": "2025.11.1",
    "dev": false,
    "hassio": false,
    "virtualenv": false,
    "python_version": "3.13.9",
    "docker": true,
    "arch": "x86_64",
    "timezone": "Europe/Berlin",
    "os_name": "Linux",
    "os_version": "6.17.7-200.fc42.x86_64",
    "container_arch": "amd64",
    "run_as_root": true
  },
  "custom_components": {
    "config_editor": {
      "documentation": "https://github.com/junkfix/config-editor-card",
      "version": "5.0.0",
      "requirements": []
    },
    "rpi_gpio": {
      "documentation": "https://github.com/thecode/ha-rpi_gpio",
      "version": "2022.7.0",
      "requirements": [
        "RPi.GPIO==0.7.1"
      ]
    },
    "hacs": {
      "documentation": "https://hacs.xyz/docs/use/",
      "version": "2.0.5",
      "requirements": [
        "aiogithubapi>=22.10.1"
      ]
    },
    "average": {
      "documentation": "https://github.com/Limych/ha-average",
      "version": "2.4.0",
      "requirements": [
        "pip>=21.3.1"
      ]
    },
    "dwd": {
      "documentation": "https://github.com/hg1337/homeassistant-dwd#readme",
      "version": "2025.5.0",
      "requirements": [
        "defusedxml==0.7.1"
      ]
    },
    "thermal_comfort": {
      "documentation": "https://github.com/dolezsa/thermal_comfort/blob/d7a81bbcd38110742124d46e7c0d2a7e5d011831/README.md",
      "version": "0.0.0",
      "requirements": []
    },
    "dwd_pollenflug": {
      "documentation": "https://github.com/mampfes/hacs_dwd_pollenflug",
      "version": "1.0.4",
      "requirements": [
        "pytz"
      ]
    },
    "blitzortung": {
      "documentation": "https://github.com/mrk-its/homeassistant-blitzortung",
      "version": "1.3.11",
      "requirements": [
        "paho-mqtt>=1.5.0"
      ]
    },
    "daily": {
      "documentation": "https://github.com/jeroenterheerdt/HADailySensor",
      "version": "v2025.7.1",
      "requirements": []
    },
    "wemportal": {
      "documentation": "https://github.com/erikkastelec/hass-WEM-Portal",
      "version": "1.5.14",
      "requirements": [
        "scrapyscript==1.1.*",
        "scrapy==2.12.*",
        "fuzzywuzzy==0.18.0",
        "twisted<=24.11.0"
      ]
    }
  },
  "integration_manifest": {
    "domain": "zha",
    "name": "Zigbee Home Automation",
    "after_dependencies": [
      "hassio",
      "onboarding",
      "usb"
    ],
    "codeowners": [
      "dmulcahey",
      "adminiuga",
      "puddly",
      "TheJulianJES"
    ],
    "config_flow": true,
    "dependencies": [
      "file_upload",
      "homeassistant_hardware"
    ],
    "documentation": "https://www.home-assistant.io/integrations/zha",
    "iot_class": "local_polling",
    "loggers": [
      "aiosqlite",
      "bellows",
      "crccheck",
      "pure_pcapy3",
      "zhaquirks",
      "zigpy",
      "zigpy_deconz",
      "zigpy_xbee",
      "zigpy_zigate",
      "zigpy_znp",
      "zha",
      "universal_silabs_flasher"
    ],
    "requirements": [
      "zha==0.0.77"
    ],
    "usb": [
      {
        "description": "*2652*",
        "known_devices": [
          "slae.sh cc2652rb stick"
        ],
        "pid": "EA60",
        "vid": "10C4"
      },
      {
        "description": "*slzb-07*",
        "known_devices": [
          "smlight slzb-07"
        ],
        "pid": "EA60",
        "vid": "10C4"
      },
      {
        "description": "*sonoff*plus*",
        "known_devices": [
          "sonoff zigbee dongle plus v2"
        ],
        "pid": "55D4",
        "vid": "1A86"
      },
      {
        "description": "*sonoff*plus*",
        "known_devices": [
          "sonoff zigbee dongle plus"
        ],
        "pid": "EA60",
        "vid": "10C4"
      },
      {
        "description": "*tubeszb*",
        "known_devices": [
          "TubesZB Coordinator"
        ],
        "pid": "EA60",
        "vid": "10C4"
      },
      {
        "description": "*tubeszb*",
        "known_devices": [
          "TubesZB Coordinator"
        ],
        "pid": "7523",
        "vid": "1A86"
      },
      {
        "description": "*zigstar*",
        "known_devices": [
          "ZigStar Coordinators"
        ],
        "pid": "7523",
        "vid": "1A86"
      },
      {
        "description": "*conbee*",
        "known_devices": [
          "Conbee II"
        ],
        "pid": "0030",
        "vid": "1CF1"
      },
      {
        "description": "*conbee*",
        "known_devices": [
          "Conbee III"
        ],
        "pid": "6015",
        "vid": "0403"
      },
      {
        "description": "*zigbee*",
        "known_devices": [
          "Nortek HUSBZB-1"
        ],
        "pid": "8A2A",
        "vid": "10C4"
      },
      {
        "description": "*zigate*",
        "known_devices": [
          "ZiGate+"
        ],
        "pid": "6015",
        "vid": "0403"
      },
      {
        "description": "*zigate*",
        "known_devices": [
          "ZiGate"
        ],
        "pid": "EA60",
        "vid": "10C4"
      },
      {
        "description": "*bv 2010/10*",
        "known_devices": [
          "Bitron Video AV2010/10"
        ],
        "pid": "8B34",
        "vid": "10C4"
      },
      {
        "description": "*sonoff*max*",
        "known_devices": [
          "SONOFF Dongle Max MG24"
        ],
        "pid": "EA60",
        "vid": "10C4"
      },
      {
        "description": "*sonoff*lite*mg21*",
        "known_devices": [
          "sonoff zigbee dongle lite mg21"
        ],
        "pid": "EA60",
        "vid": "10C4"
      }
    ],
    "zeroconf": [
      {
        "name": "tube*",
        "type": "_esphomelib._tcp.local."
      },
      {
        "name": "*zigate*",
        "type": "_zigate-zigbee-gateway._tcp.local."
      },
      {
        "name": "*zigstar*",
        "type": "_zigstar_gw._tcp.local."
      },
      {
        "name": "uzg-01*",
        "type": "_uzg-01._tcp.local."
      },
      {
        "name": "slzb-06*",
        "type": "_slzb-06._tcp.local."
      },
      {
        "name": "xzg*",
        "type": "_xzg._tcp.local."
      },
      {
        "name": "czc*",
        "type": "_czc._tcp.local."
      },
      {
        "name": "*",
        "type": "_zigbee-coordinator._tcp.local."
      }
    ],
    "is_built_in": true,
    "overwrites_built_in": false
  },
  "setup_times": {
    "null": {
      "setup": 4.6135000001612525e-05
    },
    "f19fe224a762633698aa3ce3626362b1": {
      "wait_import_platforms": -0.009448023000004468,
      "wait_base_component": -0.0003382189999996399,
      "config_entry_setup": 9.554294364999997
    }
  },
  "data": {
    "config": {},
    "config_entry": {
      "created_at": "1970-01-01T00:00:00+00:00",
      "data": {
        "device": {
          "baudrate": 115200,
          "flow_control": null,
          "path": "/dev/serial/by-id/usb-ITead_Sonoff_Zigbee_3.0_USB_Dongle_Plus_9ade2dce9e12ec11865d20c7bd930c07-if00-port0"
        },
        "radio_type": "znp"
      },
      "discovery_keys": {},
      "disabled_by": null,
      "domain": "zha",
      "entry_id": "f19fe224a762633698aa3ce3626362b1",
      "minor_version": 1,
      "modified_at": "2025-11-09T07:30:41.239434+00:00",
      "options": {
        "custom_configuration": {
          "zha_options": {
            "consider_unavailable_mains": 3000,
            "default_light_transition": 0,
            "enhanced_light_transition": false,
            "light_transitioning_flag": true,
            "group_members_assume_state": true,
            "enable_identify_on_join": true,
            "consider_unavailable_battery": 21600,
            "enable_mains_startup_polling": true
          }
        }
      },
      "pref_disable_new_entities": false,
      "pref_disable_polling": false,
      "source": "user",
      "subentries": [],
      "title": "TI CC2531 USB CDC, s/n: __0X00124B001CD4E27D - Texas Instruments",
      "unique_id": "**REDACTED**",
      "version": 5
    },
    "application_state": {
      "node_info": {
        "nwk": 0,
        "ieee": "**REDACTED**",
        "logical_type": 0,
        "model": "CC2652",
        "manufacturer": "Texas Instruments",
        "version": "Z-Stack 20210708"
      },
      "network_info": {
        "extended_pan_id": "**REDACTED**",
        "pan_id": 63184,
        "nwk_update_id": 0,
        "nwk_manager_id": 0,
        "channel": 15,
        "channel_mask": 32768,
        "security_level": 5,
        "network_key": "**REDACTED**",
        "tc_link_key": {
          "key": [
            90,
            105,
            103,
            66,
            101,
            101,
            65,
            108,
            108,
            105,
            97,
            110,
            99,
            101,
            48,
            57
          ],
          "tx_counter": 0,
          "rx_counter": 0,
          "seq": 0,
          "partner_ieee": "**REDACTED**"
        },
        "key_table": [
          {
            "key": "3f:51:22:03:ff:71:0e:78:94:29:4f:05:1a:12:07:01",
            "tx_counter": 0,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "58:8e:81:ff:fe:01:7d:72"
          },
          {
            "key": "a0:b6:5a:7e:df:0a:d2:82:a6:53:39:77:a6:a1:aa:ef",
            "tx_counter": 6149,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "cc:86:ec:ff:fe:aa:b6:5d"
          },
          {
            "key": "36:a8:b9:03:ff:1c:06:ec:9d:d0:d4:05:1a:7f:0f:95",
            "tx_counter": 6864,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "cc:86:ec:ff:fe:9a:84:7b"
          },
          {
            "key": "69:61:8f:0e:7f:cc:60:98:04:67:6a:6d:76:b5:cb:e0",
            "tx_counter": 6622,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "cc:86:ec:ff:fe:8f:9c:4a"
          },
          {
            "key": "74:73:c2:aa:b1:dc:18:17:7d:0a:69:d2:dc:da:fd:74",
            "tx_counter": 6732,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "84:fd:27:ff:fe:24:53:f4"
          },
          {
            "key": "a6:5e:f6:0e:7f:83:83:d8:cb:58:13:6d:76:fa:28:a0",
            "tx_counter": 3828,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "8c:65:a3:ff:fe:f6:a3:85"
          },
          {
            "key": "97:96:b2:86:a4:b1:98:e9:ef:fb:b4:63:c7:b8:e1:42",
            "tx_counter": 7018,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "a4:c1:38:37:63:49:d8:c3"
          },
          {
            "key": "06:ec:25:4c:07:75:ed:87:00:09:46:45:7e:de:95:ea",
            "tx_counter": 7018,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "a4:c1:38:5e:c5:b6:09:fd"
          },
          {
            "key": "0f:35:74:aa:b1:95:e7:2f:06:4c:df:d2:dc:93:02:4c",
            "tx_counter": 6205,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "bc:02:6e:ff:fe:92:15:8f"
          },
          {
            "key": "44:b4:81:0e:7f:79:d6:14:29:b2:64:6d:76:00:7d:6c",
            "tx_counter": 3828,
            "rx_counter": 0,
            "seq": 0,
            "partner_ieee": "40:30:59:ff:fe:81:49:67"
          }
        ],
        "children": [
          [
            130,
            165,
            253,
            254,
            255,
            60,
            11,
            0
          ],
          [
            180,
            184,
            139,
            6,
            0,
            141,
            21,
            0
          ],
          [
            54,
            171,
            183,
            6,
            0,
            141,
            21,
            0
          ],
          [
            48,
            190,
            129,
            7,
            0,
            141,
            21,
            0
          ],
          [
            114,
            125,
            1,
            254,
            255,
            129,
            142,
            88
          ],
          [
            169,
            165,
            103,
            254,
            255,
            114,
            2,
            92
          ],
          [
            110,
            242,
            141,
            254,
            255,
            114,
            2,
            92
          ],
          [
            106,
            198,
            200,
            254,
            255,
            114,
            2,
            92
          ],
          [
            199,
            177,
            245,
            254,
            255,
            35,
            164,
            96
          ],
          [
            101,
            52,
            98,
            254,
            255,
            80,
            75,
            128
          ],
          [
            116,
            95,
            84,
            254,
            255,
            39,
            253,
            132
          ],
          [
            234,
            231,
            130,
            254,
            255,
            236,
            134,
            204
          ],
          [
            147,
            137,
            244,
            83,
            36,
            254,
            255,
            39
          ],
          [
            215,
            213,
            74,
            156,
            143,
            254,
            255,
            236
          ],
          [
            151,
            154,
            244,
            83,
            36,
            254,
            255,
            39
          ],
          [
            47,
            234,
            93,
            182,
            170,
            254,
            255,
            236
          ],
          [
            195,
            232,
            123,
            132,
            154,
            254,
            255,
            236
          ],
          [
            93,
            75,
            151,
            8,
            0,
            141,
            21,
            0
          ],
          [
            87,
            230,
            244,
            83,
            36,
            254,
            255,
            39
          ],
          [
            5,
            196,
            239,
            254,
            255,
            80,
            75,
            128
          ],
          [
            65,
            135,
            162,
            6,
            0,
            141,
            21,
            0
          ],
          [
            224,
            129,
            131,
            8,
            0,
            141,
            21,
            0
          ]
        ],
        "route_table": {},
        "tx_power": null,
        "nwk_addresses": {
          "00:0b:3c:ff:fe:fd:a5:82": 51265,
          "00:15:8d:00:06:33:6e:cc": 52735,
          "00:15:8d:00:06:8b:b8:b4": 61238,
          "00:15:8d:00:06:b2:7a:7f": 27919,
          "00:15:8d:00:06:b7:ab:36": 60826,
          "00:15:8d:00:07:81:be:30": 17239,
          "58:8e:81:ff:fe:01:7d:72": 60394,
          "5c:02:72:ff:fe:67:a5:a9": 14071,
          "5c:02:72:ff:fe:8d:f2:6e": 37231,
          "5c:02:72:ff:fe:c8:c6:6a": 31836,
          "60:a4:23:ff:fe:f5:b1:c7": 3543,
          "80:4b:50:ff:fe:62:34:65": 52640,
          "84:fd:27:ff:fe:54:5f:74": 48205,
          "cc:86:ec:ff:fe:82:e7:ea": 19085,
          "cc:86:ec:ff:fe:aa:b6:5d": 59951,
          "cc:86:ec:ff:fe:9a:84:7b": 59587,
          "cc:86:ec:ff:fe:8f:9c:4a": 39676,
          "84:71:27:ff:fe:23:8c:4d": 26427,
          "ec:ff:fe:aa:b6:5d:ea:2f": 52358,
          "84:fd:27:ff:fe:24:53:f4": 58967,
          "8c:65:a3:ff:fe:f6:a3:85": 31117,
          "a4:c1:38:37:63:49:d8:c3": 3437,
          "a4:c1:38:5e:c5:b6:09:fd": 27393,
          "00:15:8d:00:08:97:4b:5d": 57872,
          "5c:02:72:ff:fe:bb:d7:5e": 23273,
          "27:ff:fe:24:53:f4:e6:57": 34045,
          "bc:02:6e:ff:fe:92:15:8f": 52684,
          "80:4b:50:ff:fe:ef:c4:05": 50975,
          "00:15:8d:00:07:05:ac:35": 27535,
          "00:15:8d:00:06:b6:34:8f": 12154,
          "00:15:8d:00:06:a2:87:41": 49878,
          "00:15:8d:00:08:7d:ee:3e": 3121,
          "40:30:59:ff:fe:81:49:67": 16676,
          "00:15:8d:00:08:a6:f5:a6": 33260,
          "00:15:8d:00:08:83:81:e0": 34386
        },
        "stack_specific": {
          "zstack": {
            "tclk_seed": "9389594d2c23fd00f08020e6544efbe5"
          }
        },
        "metadata": {
          "zstack": {
            "TransportRev": 2,
            "ProductId": 1,
            "MajorRel": 2,
            "MinorRel": 7,
            "MaintRel": 1,
            "CodeRevision": 20210708,
            "BootloaderBuildType": 0,
            "BootloaderRevision": null
          }
        },
        "source": "[email protected]"
      },
      "counters": {},
      "broadcast_counters": {},
      "device_counters": {},
      "group_counters": {}
    },
    "energy_scan": {
      "11": 18.823529411764707,
      "12": 17.647058823529413,
      "13": 17.647058823529413,
      "14": 12.941176470588236,
      "15": 15.294117647058824,
      "16": 8.235294117647058,
      "17": 0.0,
      "18": 0.0,
      "19": 0.0,
      "20": 0.0,
      "21": 0.0,
      "22": 0.0,
      "23": 0.0,
      "24": 0.0,
      "25": 0.0,
      "26": 0.0
    },
    "versions": {
      "bellows": "0.47.2",
      "zigpy": "0.86.0",
      "zigpy_deconz": "0.25.5",
      "zigpy_xbee": "0.21.1",
      "zigpy_znp": "0.14.2",
      "zigpy_zigate": "0.13.4",
      "zhaquirks": "0.0.148",
      "zha": "0.0.77"
    },
    "devices": [
      {
        "manufacturer": "Texas Instruments",
        "model": "CC2652",
        "logical_type": "Coordinator"
      },
      {
        "manufacturer": "TUYATEC-zn9wyqtr",
        "model": "RH3040",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.sensor_wleak.aq1",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.sensor_motion.aq2",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "TUYATEC-zn9wyqtr",
        "model": "RH3040",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.sensor_wleak.aq1",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.sensor_magnet.aq2",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.sensor_magnet.aq2",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.sensor_magnet.aq2",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI control outlet",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI control outlet",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI bulb E27 CWS opal 600lm",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI bulb E27 CWS opal 600lm",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI Driver 10W",
        "logical_type": "Router"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.weather",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI control outlet",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI on/off switch",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI control outlet",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI on/off switch",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI control outlet",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI on/off switch",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "Paulmann Licht",
        "model": "500.46",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI on/off switch",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI Driver 30W",
        "logical_type": "Router"
      },
      {
        "manufacturer": "_TZ3210_jtifm80b",
        "model": "TS0502B",
        "logical_type": "Router"
      },
      {
        "manufacturer": "_TZ3210_jtifm80b",
        "model": "TS0502B",
        "logical_type": "Router"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.weather",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.weather",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.weather",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.weather",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "VINDSTYRKA",
        "logical_type": "Router"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.weather",
        "logical_type": "EndDevice"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI bulb GU10 CWS 345lm",
        "logical_type": "Router"
      },
      {
        "manufacturer": "IKEA of Sweden",
        "model": "TRADFRI bulb GU10 CWS 345lm",
        "logical_type": "Router"
      },
      {
        "manufacturer": "_TYZB01_mfccmeio",
        "model": "TS0204",
        "logical_type": "Router"
      },
      {
        "manufacturer": "LUMI",
        "model": "lumi.sensor_magnet.aq2",
        "logical_type": "EndDevice"
      }
    ]
  },
  "issues": []
}
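
For anyone cross-checking the dump: the `children` entries look like IEEE addresses stored as eight little-endian bytes. That is an inference from the data itself, not something stated in the diagnostics. Under that assumption, a minimal Python sketch (the helper name `eui64_from_le_bytes` is hypothetical) decodes them so they can be compared with the keys in `nwk_addresses`:

```python
def eui64_from_le_bytes(raw: list[int]) -> str:
    """Render 8 little-endian byte values as a colon-separated EUI-64 string."""
    if len(raw) != 8 or any(not 0 <= b <= 255 for b in raw):
        raise ValueError("expected exactly 8 byte values in the range 0-255")
    # Assumption: the list is least-significant byte first, so reverse it.
    return ":".join(f"{b:02x}" for b in reversed(raw))


# First two entries of "children" from the diagnostics above.
children = [
    [130, 165, 253, 254, 255, 60, 11, 0],
    [180, 184, 139, 6, 0, 141, 21, 0],
]

for entry in children:
    print(eui64_from_le_bytes(entry))
# 00:0b:3c:ff:fe:fd:a5:82
# 00:15:8d:00:06:8b:b8:b4
```

Both decoded values also appear as keys in `nwk_addresses`, which supports the byte-order assumption.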

Has your neighbour got a new wi-fi provider?

Most Zigbee problems are not hardware/software issues. A lot of cheap devices are pretty flaky, but if they’ve been working in the past, they’re probably not the first thing to check.

This may be a good place to start. It may be that your network was only borderline stable in the past and some small change (like the neighbour’s wi-fi) has tipped it over the edge. Even moving the furniture can do it.

Have you read this?

Step zero would be rebooting your wifi and/or checking which wifi channel it lands on. Wifi and Zigbee share some frequencies, so you don’t want them on overlapping channels. Likewise, as Jack already mentioned, it may be a neighbour’s wifi. You have less control over that, but a wifi scanner app will show the other SSIDs and their bands.

This would be the first thing I’d check: remove the USB hub and plug the dongle directly into your HA machine, let it run for a few days and see if it stabilizes.

This would be the second thing I would check. I would work on getting the dongle flashed to the most current firmware. Check the release notes or their support website for a proper version migration path (they may want you to update in smaller version steps instead of going straight to the latest).

Make sure NONE of these bulbs (or any other routing Zigbee devices) are being physically powered down and up from mains; that can freak out routing in the mesh. Non-routing bulbs like Sengled are the exception.

Make sure you are using a minimum 1 meter USB cable to get your radio away from other radios.

Plus what they said… /\ /\ /\

Thank you for your inputs!

My Wifi APs are pinned to Channel 1, while Zigbee is at Channel 15 which should be ok in this case. Of course there are neighbours, at 4/6/11, but they are weak.
The 1m cable thing (and it being closer to the other nodes) is the reason why I have the hub in the first place; but I’ll also try the other configuration of plugging it directly into the mini-PC (it will be further away and under a desk then).
I do not power down any routers (bulbs), they are not switched from mains (stay on all the time and only toggled via zigbee). I think I followed the guide while upgrading to this new coordinator stick, but will have a look again. I might also try the firmware and source routing at some point.

What gets me is that it was rock-solid after repowering all routers step by step - and then suddenly went into failure mode with the HA PC reboot (but of course it also did otherwise, some time in the night a few days before…).


The PC was booted at 4:41, and this zigbee outlet is ~4m from the coordinator.
Could be coincidence with something in the neighbourhood or a node going crazy, but not the first guess I would say.

What could lead to such a failure, and how could we get more info (debug logs maybe)?

I would say it’s the AP on Channel 4. It’s directly on Zigbee channel 15

Channels 15, 20 and 25 fill the gaps between wifi 1, 6 and 11.

It’s the USB socket that causes the interference - whether it’s in the PC or the powered hub, you need a cable - mine’s 3m.

On the channels, if you download diagnostics (overflow menu on the ZHA integration page) towards the end of the file you’ll get something like this:

    "energy_scan": {
      "11": 3.2311094587038967,
      "12": 4.15070068297423,
      "13": 75.96022321405563,
      "14": 89.93931580241996,
      "15": 89.93931580241996,
      "16": 80.38447947821754,
      "17": 5.317630506738386,
      "18": 4.69985354430736,
      "19": 0.9017765778954641,
      "20": 39.90320178295578,
      "21": 13.711043742539033,
      "22": 59.15797905332195,
      "23": 55.9836862725909,
      "24": 4.15070068297423,
      "25": 2.509919386096536,
      "26": 92.0598007161209
    }

It shows the percentage noise on each channel, including Zigbee and wi-fi. In this example, my Zigbee’s on 15.
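
As a rough illustration of how to read such a scan, here is a minimal Python sketch that ranks the channels from the example above by their reading, lowest first. The function name `quietest_channels` is hypothetical. In a downloaded diagnostics file, the same block sits under `data` → `energy_scan`, as in the dump in the first post.

```python
# Values copied from the example energy scan above (channel -> reading).
energy_scan = {
    "11": 3.2311094587038967, "12": 4.15070068297423,
    "13": 75.96022321405563, "14": 89.93931580241996,
    "15": 89.93931580241996, "16": 80.38447947821754,
    "17": 5.317630506738386, "18": 4.69985354430736,
    "19": 0.9017765778954641, "20": 39.90320178295578,
    "21": 13.711043742539033, "22": 59.15797905332195,
    "23": 55.9836862725909, "24": 4.15070068297423,
    "25": 2.509919386096536, "26": 92.0598007161209,
}


def quietest_channels(scan: dict[str, float], count: int = 3) -> list[tuple[int, float]]:
    """Return the `count` channels with the lowest reading, rounded for display."""
    ranked = sorted(scan.items(), key=lambda item: item[1])
    return [(int(channel), round(level, 1)) for channel, level in ranked[:count]]


print(quietest_channels(energy_scan))
# [(19, 0.9), (25, 2.5), (11, 3.2)]
```

A low reading is only one input: a single scan is a snapshot, and wifi traffic varies over the day.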

This is how channels map onto each other:
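
The mapping can also be worked out from the channel centre frequencies. The sketch below is an approximation under stated assumptions: 2.4 GHz Zigbee channels 11-26 are centred at 2405 + 5 × (channel − 11) MHz and treated as about 2 MHz wide, and wifi channels 1-13 are centred at 2407 + 5 × channel MHz and treated as 22 MHz wide (40 MHz wifi would cover more). The function names are hypothetical.

```python
ZIGBEE_WIDTH_MHZ = 2   # assumption: nominal width of a 2.4 GHz Zigbee channel
WIFI_WIDTH_MHZ = 22    # assumption: classic 22 MHz wifi channel


def zigbee_center_mhz(channel: int) -> int:
    """Centre frequency of a 2.4 GHz Zigbee channel (11-26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11-26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Centre frequency of a 2.4 GHz wifi channel (1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers wifi channels 1-13")
    return 2407 + 5 * channel


def overlapping_zigbee_channels(wifi_channel: int) -> list[int]:
    """Zigbee channels whose band intersects the given wifi channel's band."""
    low = wifi_center_mhz(wifi_channel) - WIFI_WIDTH_MHZ / 2
    high = wifi_center_mhz(wifi_channel) + WIFI_WIDTH_MHZ / 2
    half = ZIGBEE_WIDTH_MHZ / 2
    return [
        ch for ch in range(11, 27)
        if zigbee_center_mhz(ch) + half > low and zigbee_center_mhz(ch) - half < high
    ]


for wifi in (1, 4, 6, 11):
    print(wifi, overlapping_zigbee_channels(wifi))
# 1 [11, 12, 13, 14]
# 4 [14, 15, 16, 17]
# 6 [16, 17, 18, 19]
# 11 [21, 22, 23, 24]
```

Under these assumptions, Zigbee 15, 20 and 25 fall between wifi 1, 6 and 11, while wifi channel 4 covers Zigbee 15.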


Trying to optimize the RF/EMI situation has not yielded improvements: I connected the coordinator via a USB extension cord to a USB 2.0 port on the PC, powered down the USB3.0 hub and tried to optimize antenna placement a bit also. No change in behaviour.
The survey of the Wifi environment tells me that, sure enough, Zigbee Ch 15 is not the best choice :) but it has run that way all along (with my own APs being the loudest there). Moving to Ch 25 seems a good option, but according to the documentation it means re-binding everything. Fun. But I guess I’ll bite the bullet (and also get the -E dongle as coordinator, which seems to be a bit more popular nowadays; the -P dongle will go on router duty).

Nevertheless, so far, the behaviour is unchanged: sometimes a few nodes “heal” after a few days, but it is GUARANTEED that after a server restart, nothing will work anymore. The TRADFRI switched outlet right next to the coordinator will not toggle, despite being “present” in ZHA.
Powering the outlet off and on again makes it work instantly.
These outlets form the backbone of my routers. Could it be that one or more of them are going bad, or that some background FW update was pushed to them and broke things? (I once had one that kept toggling its relay on and off, which killed the network completely until I got rid of it. That is not the case now, though.) Any way to find out?

    "energy_scan": {
      "11": 18.823529411764707,
      "12": 17.647058823529413,
      "13": 17.647058823529413,
      "14": 12.941176470588236,
      "15": 15.294117647058824,
      "16": 8.235294117647058,
      "17": 0.0,
      "18": 0.0,
      "19": 0.0,
      "20": 0.0,
      "21": 0.0,
      "22": 0.0,
      "23": 0.0,
      "24": 0.0,
      "25": 0.0,
      "26": 0.0
    },

First thing I would do is check that the antenna on your coordinator is properly screwed in. Your energy scan readings for channel 15 seem too low for the number of routers & environment you describe.

At the very least I’d expect channels 17-26 to not be 0 if you say you have neighbours on wifi channels 4/6/11.
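
To make the point concrete, a small Python sketch (with a hypothetical helper, `silent_channels`) that lists the channels in the scan posted above that read exactly zero:

```python
# Values copied from the energy scan posted above (channel -> reading).
energy_scan = {
    "11": 18.823529411764707, "12": 17.647058823529413,
    "13": 17.647058823529413, "14": 12.941176470588236,
    "15": 15.294117647058824, "16": 8.235294117647058,
    "17": 0.0, "18": 0.0, "19": 0.0, "20": 0.0, "21": 0.0,
    "22": 0.0, "23": 0.0, "24": 0.0, "25": 0.0, "26": 0.0,
}


def silent_channels(scan: dict[str, float]) -> list[int]:
    """Channels whose reading is exactly 0.0, in ascending order."""
    return sorted(int(channel) for channel, level in scan.items() if level == 0.0)


silent = silent_channels(energy_scan)
print(silent)
# [17, 18, 19, 20, 21, 22, 23, 24, 25, 26]
print(f"{len(silent)} of {len(energy_scan)} channels read exactly 0")
# 10 of 16 channels read exactly 0
```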


I tried to simplify the setup further by powering down all devices except one tradfri outlet that was 2m from the coordinator (hopefully preventing RF issues, after implementing all measures discussed before and double-checking antenna firmness). Reproducible every single time:

  1. Toggling the outlet works
  2. Restart HA container
  3. After waiting for HA to load, the outlet is “available” in ZHA
  4. Try to toggle the outlet - fail.
  5. Unplug and replug the outlet.
  6. Toggling the outlet works instantly.

Hoping to rule out other sources of error, I wanted to migrate to Channel 25, knowing that this would likely nuke the network. I tried clicking the “change channel” button in ZHA, but it only showed a spinner on the “OK” button for a few hours without saying what it was doing. After I cancelled, the channel had not changed, and a lot of battery-powered devices were no longer available.
So I tried a different way: I downloaded a radio backup, modified the JSON so that the channel is 25, and redeployed it to the same radio (knowing, of course, that with this approach almost all devices would need a re-pair).
IMMEDIATELY the same scenario with the outlet described before worked flawlessly - stable connection after restarting.
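
For reference, the JSON edit described above can be sketched in Python. This is only an illustration under a loud assumption: that the backup carries the same `channel` and integer bitmask `channel_mask` fields that appear under `network_info` in the diagnostics (15 and 32768, i.e. bit 15 set). The actual backup file may lay these out differently, so inspect it before editing. The function name `with_channel` is hypothetical.

```python
import copy


def with_channel(network_info: dict, new_channel: int) -> dict:
    """Return a copy of `network_info` with the channel and its bitmask updated."""
    if not 11 <= new_channel <= 26:
        raise ValueError("2.4 GHz Zigbee channels are 11-26")
    updated = copy.deepcopy(network_info)
    updated["channel"] = new_channel
    # Assumption: the mask is an integer with one bit set per allowed channel.
    updated["channel_mask"] = 1 << new_channel
    return updated


# Field values as they appear in the diagnostics (other fields omitted).
current = {"channel": 15, "channel_mask": 32768}

print(with_channel(current, 25))
# {'channel': 25, 'channel_mask': 33554432}
```

The input dictionary is left untouched, so the original values stay available for comparison or rollback.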

I am inclined to think that this relates more to something in the coordinator having been corrupted, rather than RF issues. Good news is that (of course after spending almost a day re-pairing), the setup is stable again and now even on the Wifi-coexistence optimized Channel 25 :)

I now have some woes regarding Groups. These do not seem to be in the backup or diagnostics JSON one can download, so have to be recreated.
Doing that, I always seem to switch members of one group along with members of another group. Even after removing members from groups, deleting the group, and creating new groups with different IDs, the problem persists: some combination of members still reacts to commands sent to the other group.
I also noticed that the Confirm button to create groups is always greyed out but works nevertheless. Anyone know something about this? The existing groups worked fine of course before wiping the radio and reinstalling the backup.

Regarding the group issue, I figured out a workaround. For the lights that erroneously react to a group: ADD them to that group, then REMOVE them. Hope to remember this in case it occurs again…

On the original issue, after the radio reset / restore of the backup, it is working reliably again (and even more stable due to the change to Ch. 25). Case closed.