ZwaveJS: is this normal? "Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms"

Hi, is there anything else I can do to get rid of these disconnects?

EDIT: created an issue:

Maybe it’s something with the hypervisor you’re running on? I see you mentioning ESXi; what version? There are quite a few issues with v7.0 Update 3x. As you’re running in a VM environment, can you double-check the console for any dmesg logging about USB (re)connection? If so, please also check the same on your host to see if maybe your hardware is starting to fail.

Otherwise I’m all out of ideas, and I’ll continue to monitor the conversation/issue created for future reference.

I run 6.7 U3 (if it ain’t broken, don’t fix it… well, it might actually be broken :-)).

I can get the dmesg log via the SSH & Web Terminal addon but cannot copy/paste it… tips?

via the ESXi host I can get it… but I do not wish to share publicly…

Just wondering if you see your Z-Wave USB device triggering the logs every X seconds. Don’t need to actually see the logs :slight_smile: Should you ever need to paste long texts, you can use Pastebin (a web service).

Hi Rick, not sure how I can see that in the logs. I do not know how to look that up in Linux…

Some more info. I investigated the Home Assistant host and the hypervisor host and discovered some weird errors. I googled them and saw some hits but no solutions, and all of them were about ESXi 6.5 and 7. I have 6.7.

I preventively rebooted the ESXi host, which had been running for 364 days :-).

But there are still errors in Home Assistant.

The errors below were in ESXi before the reboot. They are now gone.

2021-12-06T11:16:42.685Z cpu3:2097577)WARNING: vmkusb: transfer error [18]: endpt = 0x82, nframes = 1, timeout = 0, interval = 0, max_packet_size = 32, max_frame_size = 32: Failure
2021-12-06T11:16:42.686Z cpu3:2097577)WARNING: vmkusb: transfer error [18]: endpt = 0x82, nframes = 1, timeout = 0, interval = 0, max_packet_size = 32, max_frame_size = 32: Failure
[… the same warning repeated another ~50 times between 11:16:42.686Z and 11:16:42.710Z …]
2021-12-06T11:16:42.710Z cpu3:2097577)WARNING: vmkusb: transfer error [18]: endpt = 0x82, nframes = 1, timeout = 0, interval = 0, max_packet_size = 32, max_frame_size = 32: Failure
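Side note for anyone scanning similar vmkernel logs: whether these transfer errors come in one tight burst or trickle in every X seconds is easy to check by counting warnings per timestamp second. A rough sketch; the sample lines and the regex are mine, mimicking the format above (vmkernel wraps these lines in ANSI inverse-video escapes, \x1b[7m…\x1b[0m, which show up as debris when copy-pasted):

```python
import re
from collections import Counter

# Strip ANSI escape sequences (\x1b[7m / \x1b[0m inverse-video codes).
ANSI = re.compile(r"\x1b\[[0-9;]*m")
# Match the timestamp (to the second) on vmkusb transfer-error lines.
TS = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+Z.*vmkusb: transfer error")

def bursts(lines):
    """Count vmkusb transfer errors per second of log time."""
    counts = Counter()
    for line in lines:
        m = TS.search(ANSI.sub("", line))
        if m:
            counts[m.group(1)] += 1
    return counts

# Hypothetical sample lines in the vmkernel log format shown above
sample = [
    "\x1b[7m2021-12-06T11:16:42.685Z cpu3:2097577)WARNING: vmkusb: transfer error [18]: endpt = 0x82: Failure\x1b[0m",
    "\x1b[7m2021-12-06T11:16:42.693Z cpu3:2097577)WARNING: vmkusb: transfer error [18]: endpt = 0x82: Failure\x1b[0m",
    "2021-12-06T11:16:43.001Z cpu3:2097577)WARNING: vmkusb: transfer error [18]: endpt = 0x82: Failure",
]
print(bursts(sample))  # Counter({'2021-12-06T11:16:42': 2, '2021-12-06T11:16:43': 1})
```

If one second shows dozens of hits and then silence (as in the log above), it was a single burst rather than a steady drumbeat.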

I have had this host (6.7 U3) running for over 2 years now with USB passed through all the time:

[[email protected]:~] lsusb
Bus 001 Device 006: ID 0781:5583 SanDisk Corp. Ultra Fit
Bus 001 Device 005: ID 8087:0aaa Intel Corp.
Bus 001 Device 004: ID 0403:6001 Future Technology Devices International, Ltd FT232 Serial (UART) IC
Bus 001 Device 003: ID 0658:0200 Sigma Designs, Inc. Aeotec Z-Stick Gen5 (ZW090) - UZB
Bus 001 Device 002: ID 1cf1:0030 Dresden Elektronik
Bus 001 Device 001: ID 0e0f:8003 VMware, Inc. Root Hub

Never had issues… even with the OZW beta. Only now, with ZwaveJS, do I see the disconnects…
I do see some topics about 6.5:
https://kb.vmware.com/s/article/2147650

https://floating.io/2017/02/openhab-a-z-stick-and-esxi-6-5/

But none of them clearly mention 6.7, only 6.5.

I do not “dare” to execute the command esxcli system module set -m=vmkusb -e=FALSE on my 6.7 machine, since I might get into worse trouble than I have now.

Any advice?

One more thing to add: I have no errors at all for my deCONZ stick…

Did you install the update yet? There was a bug in the old version. I posted about it here and it could be the cause of your errors. I would update and check again if you haven’t already.

Yes, I did that right away.

The logs appear to be from the addon. What do the logs set at “silly” level from the zwavejs server itself show? They can give more of a clue as to what’s going on. To get them go through these screens/steps:

Screenshot from 2021-11-04 20-12-36

Screenshot from 2021-12-10 18-39-05

Someone hinted that the USB is disconnecting, but I don’t think it’s that. These are what logs look like for a disconnecting USB, and yours don’t look at all like that.

Yours seems more like a routing issue: the controller can’t get a command through to a specific node and keeps retrying. You mentioned there is sometimes a lag, so maybe it fails the first time and succeeds on the retry? The silly-level zwavejs logs should reveal more clues about which command exactly is failing to go through and to which node(s). Try posting them here. It’s likely the developers will also need those on your GitHub issue to diagnose further.


Hi Tim, thanks for taking the time to help here!

Is the log of the addon different from the one you pointed to? I see the same output from silly in the addon. I think it is just a matter of selecting the right choice in the integration?

I have captured a part of the log this way. I did try to heal around 2 weeks ago, but that took 2 days and I stopped it…

Then I tried to heal again this morning, and after around 3 hours I stopped it.

Has anyone ever had a “successful” heal?

I don’t use the addon myself (I use zwavejs2mqtt in Docker), but I think if you set them to “silly” in the addon configuration they’re probably the same. I know the ones generated on that integration configuration screen are the right ones for the zwavejs server, though.

Rarely, and there was some discussion on another thread that the network heal logic might actually be flawed and could, at least temporarily, make issues worse.

70 Zwave device network stuck healing for over 24hrs - #20 by HeyImAlex .

I heal individual problem devices rather than the whole network now, which works better for me. Nodes with log errors, or devices responding slowly, are the ones I heal.

I uploaded more logging to the GitHub issue.

Does anyone know how to interpret the logging, or where to find out how? What do CNTRLR, SERIAL, and DRIVER mean?

This seems to be such an event of mine:

2021-12-11T17:17:41.027Z SERIAL « [CAN]                                                                   (0x18)
2021-12-11T17:17:41.028Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                                  n 100 ms.
2021-12-11T17:17:41.130Z SERIAL » 0x0109001348022602250aa4                                            (11 bytes)
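For what it’s worth, the SERIAL lines can be decoded by hand. As far as I understand it, CAN (0x18) is the controller telling the driver it discarded the frame (typically a collision with an incoming transmission), which is why the driver schedules the retry. The Serial API frame layout is SOF (0x01), length, type, function ID, payload, checksum, where the checksum is 0xFF XOR’d with every byte from length through payload. A sketch of checking the frame from the log above; the field names past the function ID are my own interpretation of a SendData request, not an authoritative parse:

```python
def checksum_ok(frame: bytes) -> bool:
    """Z-Wave Serial API checksum: 0xFF XOR'd over length..payload bytes."""
    cs = 0xFF
    for b in frame[1:-1]:
        cs ^= b
    return cs == frame[-1]

# The outgoing frame from the log above
frame = bytes.fromhex("0109001348022602250aa4")

assert frame[0] == 0x01  # SOF marker
assert frame[1] == 0x09  # length of type..payload plus checksum byte
assert frame[2] == 0x00  # 0x00 = request
assert frame[3] == 0x13  # FUNC_ID_ZW_SEND_DATA
# My reading of the SendData payload (hypothetical field mapping):
# frame[4] = target node id, frame[5] = CC payload length,
# frame[6:8] = 0x26 0x02 (Multilevel Switch CC, Get),
# then tx options and callback id.
print(checksum_ok(frame))  # True
```

So the frame itself is well-formed; the controller just refused to accept it at that moment.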

Take a look at the zwavejs documentation, specifically for the “failed to send” error:

https://zwave-js.github.io/node-zwave-js/#/troubleshooting/no-device-response

Failed to send the message after 3 attempts
If this happens, your network is either extremely busy at the moment or something is really wrong with the Z-Wave controller. Try waiting a bit, try removing and re-inserting the stick or try restarting your host.

The documentation overview also explains the difference between controller, driver, etc

https://zwave-js.github.io/node-zwave-js/#/api/overview

I looked at your logs and there seems to be a lot of traffic in your network. Node 61 seems to cause the most issues. What is it, how far is it away, and have you noticed any issues with its performance? I’d try healing it individually (go to the device in the zwavejs integration and select “heal device”).

You might also be “stuck” in a previous heal attempt. Try removing the stick or restarting the host as suggested by the documentation, and don’t do another whole-network heal. Also, are you polling devices at all? If so, try to decrease the polling frequency. It’s also possible you have an automation stuck in a loop, continuously switching something on/off. I’d check those too.

Update

I see they are working on this under a new issue, hopefully the developers can get to the bottom of it for you

Hi Tim, very much appreciated!!

Node 61 is a cover device… I must say it’s an old one, but I have more of them (8 in total).
I successfully healed node 61 now.

I have restarted a lot already (also with the recent 2021.12 update, OS update, ESXi reboot, etc.).

May I ask, how did you see ID61 was a troublesome one?

And now you mention it…

All 8 (!) do not send back reliable instant status, so I have an automation running to poll them all at the same time every minute via zwave_js.refresh_value.
I instantly disabled that now and will monitor…

Could this be the culprit? And if so, how could I overcome it? Instead of polling all 8 every minute at once, should I stagger them 5 seconds apart, or…?
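To make the staggering idea concrete, here is a toy sketch of the spacing math: spread the 8 refresh calls evenly over the one-minute window instead of firing them in one burst. The entity names are made up, and how you wire the delays into an actual HA automation is a separate question:

```python
# Spread N refresh calls evenly across a polling window instead of firing
# them all at the same second, to avoid a burst of Z-Wave traffic.
def stagger(entities, window_s=60):
    step = window_s / len(entities)
    return [(round(i * step, 1), e) for i, e in enumerate(entities)]

covers = [f"cover.roller_{n}" for n in range(1, 9)]  # hypothetical entity ids
for offset, entity in stagger(covers):
    print(f"t+{offset:>4}s -> zwave_js.refresh_value on {entity}")
# 8 covers in a 60 s window -> one refresh every 7.5 s
```

Eight devices in a 60-second window works out to one refresh every 7.5 seconds, which keeps the serial line far less congested than 8 simultaneous calls.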

I chimed in on the GitHub issue. It looks like you’re well on your way to solving this. I think the key to your issues is reducing traffic: avoid S0 security, decrease polling, and decrease the frequency of power management reports.

After a lot of the failed-to-send 1/3-attempts messages, the next message was something to node 61, so I figured that was the one causing most of the issues. I usually just look for log entries with the same node ID over and over. Those are “chatty” nodes that usually generate too much traffic.
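That eyeballing can be semi-automated: tally which node IDs show up most often in the log. A small sketch, assuming the `[Node 061]`-style prefix that zwavejs driver logs use (adjust the regex if your log format differs; the sample lines are made up):

```python
import re
from collections import Counter

# zwavejs driver logs tag node-specific lines with "[Node 0XX]"
NODE = re.compile(r"\[Node (\d+)\]")

def chattiest(lines, top=3):
    """Return the node ids that appear most often in zwavejs log lines."""
    counts = Counter(m.group(1) for line in lines for m in NODE.finditer(line))
    return counts.most_common(top)

sample = [
    "2021-12-11T17:17:41.130Z CNTRLR   [Node 061] querying node info...",
    "2021-12-11T17:17:42.002Z CNTRLR   [Node 061] timed out after 1600 ms",
    "2021-12-11T17:17:43.500Z CNTRLR   [Node 007] received wakeup notification",
]
print(chattiest(sample))  # [('061', 2), ('007', 1)]
```

Running something like this over a few hours of silly-level logs makes the problem nodes jump out immediately.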

I also mentioned that I think you can start a new thread to ask for the best way to set up the polling automations so they don’t all fire at once. I’m not great with Home Assistant automations (I primarily use Node-RED), so starting a new thread will likely get you a very good response on that.

This smells like ESXi vmkusb issues, give this a shot.

You need to disable the new vmkusb module and load the older legacy USB drivers with this command:

esxcli system module set -m=vmkusb -e=FALSE

Execute this on the ESXi host via SSH.

Hi, thanks! Since I stopped polling, that error is totally gone… I thought (and still think) this had nothing to do with ESXi, luckily!

Could just be a symptom of the vmkusb driver.