Ever since I updated my Home Assistant installation to the latest release, 2020.12.1, I’ve been running into trouble with my ZHA integration. About once a day my entire Zigbee network goes down without warning, and I can’t find any pattern in what leads up to the crash. Sensors and buttons all become unresponsive, and they return to normal right after I restart the server from inside the GUI.
In the meantime I’ve activated logging, and this shows up every time, right around the moment everything goes dark:
2020-12-24 11:36:14 ERROR (MainThread) [homeassistant] Error doing job: Exception in callback SerialTransport._read_ready()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/local/lib/python3.8/site-packages/serial_asyncio/__init__.py", line 119, in _read_ready
    self._protocol.data_received(data)
  File "/usr/local/lib/python3.8/site-packages/zigpy_deconz/uart.py", line 85, in data_received
    self._api.data_received(frame)
  File "/usr/local/lib/python3.8/site-packages/zigpy_deconz/api.py", line 359, in data_received
    fut.set_result(data)
asyncio.exceptions.InvalidStateError: invalid state
2020-12-24 11:36:14 WARNING (MainThread) [zigpy_deconz.api] No response to 'Command.aps_data_indication' command with seq id '0xa3'
2020-12-24 11:36:14 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/zigpy_deconz/api.py", line 305, in _command
    return await asyncio.wait_for(fut, timeout=COMMAND_TIMEOUT)
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 498, in wait_for
    raise exceptions.TimeoutError()
asyncio.exceptions.TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/zigpy_deconz/api.py", line 462, in _aps_data_indication
    r = await self._command(
  File "/usr/local/lib/python3.8/site-packages/zigpy_deconz/api.py", line 310, in _command
    self._awaiting.pop(seq)
KeyError: 163
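My reading of the log, which may well be wrong: the two tracebacks look like one race. The KeyError value 163 is the same number as the seq id 0xa3 in the warning, so it seems a reply for that request arrived just as the request timed out. Below is a minimal sketch in plain asyncio that produces the same pair of exceptions. It is not the zigpy_deconz code; the `awaiting` dict and the `late_reply` helper are hypothetical stand-ins I made up for illustration.

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()
    awaiting = {}        # hypothetical table of pending requests, keyed by seq id
    seq = 0xA3           # 163 in decimal, as in the log above
    fut = loop.create_future()
    awaiting[seq] = fut
    events = []

    def late_reply():
        # Assumed behaviour: the reader removes the pending entry, then
        # tries to complete its future.
        pending = awaiting.pop(seq)
        try:
            pending.set_result(b"late data")
        except asyncio.InvalidStateError:
            # wait_for() already cancelled the future when it timed out.
            events.append("InvalidStateError")

    try:
        await asyncio.wait_for(fut, timeout=0.01)
    except asyncio.TimeoutError:
        # Simulate the reply landing right after the timeout fired.
        late_reply()
        try:
            awaiting.pop(seq)  # entry is already gone
        except KeyError as exc:
            events.append(f"KeyError: {exc}")
    return events


events = asyncio.run(main())
print(events)  # ['InvalidStateError', 'KeyError: 163']
```

If that is what is happening, the exceptions would be a symptom of the stick answering too slowly rather than the root cause, but that is a guess on my part.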
Background info:
I’m running HA on a Raspberry Pi 4 with 4GB of RAM, inside a Docker container. ZHA is handled through a Conbee II stick that I plugged in through a USB extension cable. Everything worked great until the update, but now it barely manages a full day without intervention.
What I’ve tried myself:
I’ve unplugged and replugged the Conbee stick, I’ve updated the stick’s firmware through the Phoscon app on my laptop, and I’ve switched between the stable and latest images of HA hoping a bug had been fixed in the meantime… but all to no avail.
Edit: completely removed the integration and added it again, no cigar…
I’m a bit stuck on what to try next, so any tips or pointers would be greatly appreciated!