Hi there, I hope someone can help me with problems that recently started cropping up with my Zigbee network.
My ~130-device Zigbee network, which uses ZHA and a TubesZB CC2652 coordinator, is the backbone of my HA setup and has been largely stable for years. About a week ago, I started to get strange behaviour when trying to pair new devices. After a device successfully paired, it was able to communicate with HA (send/receive commands). However, after a few seconds, ZHA would restart the pairing process. This would loop until the pairing process timed out. Some never-before-seen errors appeared in the logs:
- Cancelling previous initialization task for device xx:xx:xx:xx:xx:xx:xx:xx
- [0x1265:1:0x1000]: Couldn’t get list of groups: Device has re-joined the network
Error doing job: Exception in callback Gateway.device_initialized.<locals>.<lambda>() at /usr/local/lib/python3.13/site-packages/zha/application/gateway.py:457 (None)
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/asyncio/events.py", line 89, in _run
    self._context.run(self._callback, *self._args)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/zha/application/gateway.py", line 457, in <lambda>
    init_task.add_done_callback(lambda _: self._device_init_tasks.pop(device.ieee))
                                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: xx:xx:xx:xx:xx:xx:xx:xx
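For what it's worth, my reading of the traceback is that two initialization tasks for the same device each try to remove the same dictionary entry when they finish, and the second removal fails. The sketch below only illustrates that pattern with plain asyncio. The names `device_init_tasks`, `fake_init` and `on_done` are my own stand-ins, not the real zha code, and I am assuming this is roughly what happens when the "Cancelling previous initialization task" message appears.

```python
import asyncio

# Hypothetical stand-in for the gateway's per-device task registry (not zha code).
device_init_tasks: dict[str, asyncio.Task] = {}


async def fake_init() -> None:
    """Placeholder for a device initialization coroutine."""
    await asyncio.sleep(0)


async def main() -> list[str]:
    errors: list[str] = []
    ieee = "xx:xx:xx:xx:xx:xx:xx:xx"

    def on_done(_: asyncio.Task) -> None:
        # Same shape as the lambda in the traceback: pop without a default.
        try:
            device_init_tasks.pop(ieee)
        except KeyError as exc:
            errors.append(repr(exc))

    # First pairing attempt registers a task under the device's IEEE address.
    first = asyncio.create_task(fake_init())
    device_init_tasks[ieee] = first
    first.add_done_callback(on_done)

    # Device re-joins: the previous task is cancelled and a new one replaces it,
    # but the old task's done callback is still attached.
    first.cancel()
    second = asyncio.create_task(fake_init())
    device_init_tasks[ieee] = second
    second.add_done_callback(on_done)

    await asyncio.gather(first, second, return_exceptions=True)
    await asyncio.sleep(0)  # give any pending done callbacks a chance to run
    return errors


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In this sketch the first callback removes the entry and the second one hits the KeyError. Using `pop(ieee, None)` in the callback would make the second removal a no-op, but I don't know whether that is the right fix for zha itself.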
I tried restarting HA, but ZHA then refused to start at all, complaining of a corrupted Zigbee database. Restoring a backup from before the restart allowed ZHA to start properly, but it now seems that I can’t add any more devices to my network. I’m also unsure if I can restart HA safely.
I tried to use this zigpy-cli command to repair the database from the Advanced SSH Terminal add-on in a virtual environment, but it complains of FileNotFoundError: [Errno 2] No such file or directory: 'sqlite3'. I then copied the database to a Windows machine, installed zigpy-cli under WSL in a virtual environment, and ran the command again. I can see the process start, but then it fails with sql error: no such table: sqlite_dbpage (1).
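In case it helps anyone reproduce this, here is a minimal sketch of the checks I can run on a copy of the database with Python's built-in `sqlite3` module. It assumes the first error means the `sqlite3` command-line tool is not on the PATH in that environment, which is my guess from the FileNotFoundError and not something I have confirmed. The function names and the `zigbee_copy.db` path are just placeholders of mine.

```python
import shutil
import sqlite3
from pathlib import Path


def sqlite_cli_available() -> bool:
    """Return True if a `sqlite3` executable can be found on the PATH."""
    return shutil.which("sqlite3") is not None


def integrity_report(db_path: str) -> list[str]:
    """Run PRAGMA integrity_check on a database opened read-only.

    Returns ["ok"] for a healthy file, otherwise the problems SQLite reports.
    If the file is too damaged to be read, the error message is returned instead.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error as exc:
        return [str(exc)]
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        return [str(exc)]
    finally:
        conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Placeholder path: always work on a copy, never on the live zigbee.db.
    print("sqlite3 CLI found:", sqlite_cli_available())
    print(integrity_report("zigbee_copy.db"))
```

This only diagnoses the file and does not repair anything, so it is no substitute for the recovery command itself.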
At this point, I feel like I am out of options. Can anyone guide me through the process to repair zigbee.db? Or am I stuck with trying to restore the database from backups, failing which I need to rebuild the Zigbee network from scratch?
Please let me know if I can provide more information. I can collect some more logs if it helps!