After stopping the containers to make a backup and restarting them, the devices react slowly or not at all.
Here’s what I have and what I did (lots of information below; I tried as much as possible before posting and I hope someone can help out).
Home Assistant runs in a Docker container, as do Mosquitto and Zigbee2MQTT (all in separate containers).
Zigbee2MQTT stick (coordinator): SONOFF ZigBee 3.0 USB Dongle Plus, TI CC2652P
No changes were made in any configuration file or docker-compose.yaml
Automations with time triggers still work (switch on light at 8.30, open curtain at sunset, etc.)
Automations based on states work only at random: the states are no longer passed on (e.g. switch on light at 8.30 if state == off)
When controlling devices from a dashboard, they sometimes react immediately, sometimes not at all:
Switching on a lamp works immediately at first, but switching it off within a few seconds of switching it on gives no reaction. The same goes for dimming or adjusting the color.
The error log shows timeouts when trying to pass states:
- error: z2m: Publish ‘set’ ‘brightness’ to ‘Slimme lamp dimbaar kleur’ failed: ‘Error: ZCL command 0x08ddebfffe46e95a/1 genLevelCtrl.moveToLevelWithOnOff({“level”:97,“transtime”:0}, {“timeout”:10000,“disableResponse”:false,“disableRecovery”:false,“disableDefaultResponse”:false,“direction”:0,“srcEndpoint”:null,“reservedBits”:0,“manufacturerCode”:null,“transactionSequenceNumber”:null,“writeUndiv”:false}) failed (Data request failed with error: ‘Timeout’ (9999))’
- error: z2m: Publish ‘set’ ‘state’ to ‘eetkamer_lampen’ failed: ‘Error: Command 1 genOnOff.on({}) failed (AREQ - AF - dataConfirm after 3000ms)’
- error: z2m: Failed to read state of ‘Slimme lamp dimbaar kleur’ after reconnect (ZCL command 0x08ddebfffe46e95a/1 genOnOff.read([“onOff”], {“timeout”:10000,“disableResponse”:false,“disableRecovery”:false,“disableDefaultResponse”:true,“direction”:0,“srcEndpoint”:null,“reservedBits”:0,“manufacturerCode”:null,“transactionSequenceNumber”:null,“writeUndiv”:false}) failed (Data request failed with error: ‘Timeout’ (9999)))
Another series of errors is:
[2025-03-04 22:20:58] error: z2m: Error while starting zigbee-herdsman
[2025-03-04 22:20:58] error: z2m: Failed to start zigbee
[2025-03-04 22:20:58] error: z2m: Check https://www.zigbee2mqtt.io/guide/installation/20_zigbee2mqtt-fails-to-start.html for possible solutions
[2025-03-04 22:20:58] error: z2m: Exiting…
[2025-03-04 22:20:58] error: z2m: Error: Failed to connect to the adapter (Error: SRSP - SYS - ping after 6000ms)
at ZStackAdapter.start (/app/node_modules/zigbee-herdsman/src/adapter/z-stack/adapter/zStackAdapter.ts:119:27)
at Controller.start (/app/node_modules/zigbee-herdsman/src/controller/controller.ts:127:29)
at Zigbee.start (/app/lib/zigbee.ts:63:27)
at Controller.start (/app/lib/controller.ts:139:27)
at start (/app/index.js:154:5)
These seem to point at the coordinator as well.
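To get a feel for how often each kind of error occurs, the log can be tallied with a small helper. This is only a rough sketch: the function name `summarize_z2m_errors` is made up, the three search patterns are copied from the messages quoted above, and splitting them into "device" and "adapter" problems is my own assumption.

```shell
#!/bin/sh
# Count Zigbee2MQTT error lines by kind. Reads log text on stdin.
# Patterns come from the error messages quoted in this post; the labels
# are an assumption for triage, not official Zigbee2MQTT terminology.
summarize_z2m_errors() {
    log="$(cat)"
    # grep -c prints the number of matching lines (0 when none match);
    # "|| true" keeps a zero-match result from counting as a failure.
    device_timeouts="$(printf '%s\n' "$log" | grep -c "Data request failed with error" || true)"
    data_confirm="$(printf '%s\n' "$log" | grep -c "dataConfirm after" || true)"
    adapter_ping="$(printf '%s\n' "$log" | grep -c "SRSP - SYS - ping" || true)"
    echo "device data request timeouts: $device_timeouts"
    echo "dataConfirm timeouts: $data_confirm"
    echo "adapter ping failures: $adapter_ping"
}
```

With the container name used in this post it would be called as `docker logs zigbee2mqtt 2>&1 | summarize_z2m_errors`.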
I tried to analyse this in a structured way. The coordinator is:
- visible (symlink visible)
- active (‘adapter attached to ttyUSB0’)
- on the correct port (/dev/serial/by-id/…)
- not used by another process
- accessible to my user (the user is in the dialout group)
- possibly not reported in the logs: ‘docker logs zigbee2mqtt | grep "Coordinator"’ gives no output. (I must admit I don’t know whether there was output when it worked normally.)
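For anyone who wants to repeat the host-side checks from the list above, they can be sketched roughly like this. The function name `check_serial_port` is hypothetical and the device path is passed in as an argument; it only uses standard tools (`readlink`, `id`, and `fuser` if it happens to be installed), and it does not check anything inside the container.

```shell
#!/bin/sh
# Report basic facts about a serial device path on the host.
# Returns 0 if the path resolves to a character device, 1 otherwise.
check_serial_port() {
    port="$1"
    if [ ! -e "$port" ]; then
        echo "missing: $port"
        return 1
    fi
    # Entries under /dev/serial/by-id are symlinks; show the real device node
    echo "resolves to: $(readlink -f "$port")"
    if [ ! -c "$port" ]; then
        echo "not a character device: $port"
        return 1
    fi
    echo "character device: yes"
    # Is the current user a member of the dialout group?
    if id -nG | tr ' ' '\n' | grep -qx "dialout"; then
        echo "dialout group: yes"
    else
        echo "dialout group: no"
    fi
    # fuser (from psmisc) reports processes holding the device open.
    # Without root it may only see the current user's own processes.
    if command -v fuser >/dev/null 2>&1; then
        if fuser "$port" >/dev/null 2>&1; then
            echo "held open according to fuser: yes"
        else
            echo "held open according to fuser: no"
        fi
    fi
    return 0
}
```

It would be called with the full by-id path of the adapter, for example `check_serial_port /dev/ttyUSB0` or the matching symlink under /dev/serial/by-id/.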
I placed the coordinator closer to the devices: no improvement.
I have stopped and restarted the containers and the NUC many times before and did not have this problem. Two weeks ago this problem occurred. I tested the same things as described above, finally unplugged and replugged the coordinator, and after plugging it into another USB port it seemed solved. I stopped and restarted the containers a few times after that and it kept on working… until 3 days ago, when I stopped the containers again and restarted them
(docker-compose stop; docker-compose up; with and without specifying the containers, to get them started in the correct order, waiting ten minutes in between just to be sure: first Mosquitto, wait ten minutes, then Zigbee2MQTT. Ten minutes, since I went to get some tea to kill the waiting time ;))
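The ordered restart I did by hand could be scripted along these lines. This is a sketch, not what I actually ran: `start_in_order`, `COMPOSE_CMD` and `WAIT_SECONDS` are names invented for the example, and the service names in the usage line are assumptions that have to match the ones in docker-compose.yaml.

```shell
#!/bin/sh
# Start compose services one at a time, pausing between them.
# COMPOSE_CMD and WAIT_SECONDS are hypothetical knobs for this sketch.
COMPOSE_CMD="${COMPOSE_CMD:-docker-compose}"
WAIT_SECONDS="${WAIT_SECONDS:-30}"

start_in_order() {
    first=1
    for service in "$@"; do
        # Pause before every service except the first one
        if [ "$first" -eq 0 ]; then
            sleep "$WAIT_SECONDS"
        fi
        first=0
        # -d starts the service detached so the loop can continue;
        # stop at the first service that fails to start
        $COMPOSE_CMD up -d "$service" || return 1
    done
}
```

Usage would be something like `start_in_order mosquitto zigbee2mqtt homeassistant`, so the broker is up before Zigbee2MQTT connects to it.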
I re-paired a device: it gave a timeout first, then paired. Time-triggered automation works. Controlling via a dashboard: same as before, so immediate at first, then slow or not at all.
My final thought is that the coordinator is broken, but since this behaviour disappeared before and has now started again, I find that hard to believe.
What more can I try to find out what’s going on, and what would be a solution?