Zwave JS Freezing Frequently

My Z-wave JS module seems to keep freezing and I am having multiple issues which makes troubleshooting difficult.

First, some background. I have set up about 55 Z-Wave devices in my Home Assistant instance, which has been running for less than a week, and this problem has been present the whole time. Sometimes, when I try to control a device, nothing happens. If I restart the Z-Wave JS service, everything works fine for a while after it boots, but it eventually stops again within several hours. I am running this in a VM on a Dell server, and resources seem fine.

Some of the other problems I’m having:

  • Z-Wave devices (switches, outlets, etc., not the Zooz 800 itself) never finish a firmware update; they eventually freeze, and I have to restart.
  • Automations will stop functioning even though I can directly operate a device.
  • Sometimes when adding a new device, the stick and the z-wave JS will freeze and I have to reset both.
  • I set logging to silly, but every time I go back into the configuration, it has reset to info.

Below is a typical error message. I wasn’t even checking for device firmware, but I’m guessing it was checking anyway.

2023-06-29T18:50:26.493Z CNTRLR   [Node 027] Timed out while waiting for a response from the node (ZW0201)
Z-Wave error ZWaveError: Cannot check for firmware updates for node 27: Failed to query firmware version from the node! (ZW0260)
    at ZWaveController.getAvailableFirmwareUpdates (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5815:10)
    at runNextTicks (node:internal/process/task_queues:60:5)
    at processTimers (node:internal/timers:509:9)
    at Function.handle (/usr/src/node_modules/@zwave-js/server/dist/lib/controller/message_handler.js:205:30)
    at Client.receiveMessage (/usr/src/node_modules/@zwave-js/server/dist/lib/server.js:106:62) {
  code: 260,
  context: undefined,
  transactionSource: undefined
}
2023-06-29T18:56:30.551Z CNTRLR   [Node 043] did not respond after 1/3 attempts. Scheduling next try in 500 ms.

Here is another:

2023-06-29T19:27:12.117Z DRIVER   Dropping message because it could not be deserialized: The command class Controller Replication is not implemented (ZW0303)
2023-06-29T19:29:25.197Z CNTRLR   No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2023-06-29T19:29:36.230Z CNTRLR   No response from controller after 2/3 attempts. Scheduling next try in 1100 ms.
2023-06-29T19:29:38.269Z DRIVER   unexpected response, discarding...
2023-06-29T19:30:24.361Z CNTRLR   No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2023-06-29T19:33:24.562Z CNTRLR   No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2023-06-29T19:33:54.648Z CNTRLR   No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2023-06-29T19:33:55.751Z CNTRLR   Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2023-06-29T19:33:59.678Z DRIVER   unexpected response, discarding...
2023-06-29T19:36:24.935Z CNTRLR   No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2023-06-29T19:36:55.658Z CNTRLR   [Node 009] Timed out while waiting for a response from the node (ZW0201)
2023-06-29T19:36:57.368Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2023-06-29T19:38:49.880Z CNTRLR   [Node 055] did not respond after 1/3 attempts. Scheduling next try in 500 ms.
2023-06-29T19:42:10.825Z CNTRLR » [Node 009] pinging the node... 

Overall I’m thinking this is either a series of bugs, I have a bad stick, or both.

On a side note, I think it’s interesting that it uses UTC time instead of my current time zone. Any way of changing this?
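The thread does not establish whether the logger itself can be switched to local time. As a workaround, the timestamps can be converted after the fact. The sketch below is a minimal, hypothetical post-processing helper (the name `localize_line` and the UTC-6 offset are illustrative assumptions, not part of Z-Wave JS) that rewrites the leading UTC timestamp of each log line and leaves continuation lines alone.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the UTC timestamp at the start of a log line, e.g.
# "2023-06-29T18:50:26.493Z".
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z")


def localize_line(line, tz):
    """Return the line with its leading UTC timestamp converted to tz."""
    m = TS_RE.match(line)
    if m is None:
        # Stack-trace or continuation line: nothing to convert.
        return line
    utc = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%f")
    local = utc.replace(tzinfo=timezone.utc).astimezone(tz)
    return local.isoformat(timespec="milliseconds") + line[m.end():]


# Hypothetical offset for illustration only (UTC-6); substitute your own zone.
example_tz = timezone(timedelta(hours=-6))
print(localize_line(
    "2023-06-29T18:50:26.493Z CNTRLR   [Node 027] Timed out", example_tz))
```

A fixed offset ignores daylight saving time; `zoneinfo.ZoneInfo` with a named zone could be passed as `tz` instead if the system has time zone data installed.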

Thanks in advance for any help!

There is a known set of issues with running Z-Wave in a VM on Windows. The primary one is that USB passthrough does not work.

Hi Pete, thanks for your response.

This is a VMware system, not Windows or Hyper-V. I downloaded the image from Home Assistant. From what I can tell, for whatever reason, there is a Docker setup inside the VM and HA runs on top of that. I do have USB passthrough set up to the VM. Does this affect VMware as well?

I’m checking now myself too.

Here is another thread which is very similar, but never mentioned if it was solved:

It did mention it, about 3-4 posts from when you posted on there. I also posted a reply to your question on that thread.

Your issue reads like a problem Z-Wave device. I know from past experience that when one goes bad, it can wreak havoc on your entire Z-Wave network. It might also be a failing controller; I replaced mine for good measure and to have a spare device in case I’m ever in a jam.

The errors you are seeing

Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.

indicate that the USB stick is not responding. In a healthy system you will see these occasionally, because sometimes the controller is actually busy receiving a message and can’t respond. The fact that you are seeing lots of these means something is wrong with the stick, the USB cable, the USB port, the device driver, the VM passthrough, etc.
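To separate "the stick is not answering" from "one node is not answering", the two kinds of message can be tallied per log. This is a rough sketch that assumes the message wording matches the excerpts posted above; the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Controller-level failures: the serial link to the stick itself.
CONTROLLER_RE = re.compile(
    r"No response from controller|Failed to execute controller command")
# Node-level failures: a specific device did not answer over the air.
NODE_RE = re.compile(r"\[Node (\d+)\] (?:Timed out|did not respond)")


def summarize(lines):
    """Count controller-level and per-node failures in log lines."""
    counts = Counter()
    for line in lines:
        if CONTROLLER_RE.search(line):
            counts["controller"] += 1
            continue
        m = NODE_RE.search(line)
        if m:
            counts[f"node {int(m.group(1))}"] += 1
    return counts


sample = [
    "2023-06-29T19:29:25.197Z CNTRLR   No response from controller after 1/3 attempts. Scheduling next try in 100 ms.",
    "2023-06-29T19:36:55.658Z CNTRLR   [Node 009] Timed out while waiting for a response from the node (ZW0201)",
    "2023-06-29T19:36:57.368Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i",
    "2023-06-29T19:38:49.880Z CNTRLR   [Node 055] did not respond after 1/3 attempts. Scheduling next try in 500 ms.",
]
for key, n in summarize(sample).most_common():
    print(f"{key}: {n}")
```

A "controller" count that dominates, spread across many minutes, points at the stick or its USB path rather than at any single device. The patterns match on the first part of each message, so lines wrapped by the logger are still counted.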

Are you able to take an NVM backup of the stick?

If it were me, I’d order another stick, a powered USB 2.0 hub, and a 3’ USB extension cable, and see if some combination of those helps.

You may also want to dig around and see if you have any power saving profiles active. Sometimes these can turn off USB ports or lower the voltage (especially on laptops).
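On a Linux system where a shell with Python is available, one place to look is the kernel's per-device USB runtime power setting, exposed in sysfs as `power/control` (`on` keeps the device powered, `auto` lets the kernel autosuspend it). This is only a sketch: whether it applies inside a given VM guest or appliance OS is an assumption, and the helper name `usb_power_settings` is hypothetical.

```python
from pathlib import Path


def usb_power_settings(root="/sys/bus/usb/devices"):
    """Map each USB device directory name to its power/control value."""
    settings = {}
    for control in sorted(Path(root).glob("*/power/control")):
        try:
            settings[control.parent.parent.name] = control.read_text().strip()
        except OSError:
            # Device disappeared or is unreadable: skip it.
            continue
    return settings


if __name__ == "__main__":
    for device, mode in usb_power_settings().items():
        note = "  <- autosuspend allowed" if mode == "auto" else ""
        print(f"{device}: {mode}{note}")
```

The function only reads the current values; it changes nothing. On systems without that sysfs directory it simply returns an empty mapping.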

It is a new Zooz 800 Stick, but it’s definitely possible.

I am running this through a super long extension cable to a closet. I can try to see how this affects things. I have run a Gen 5 stick with InControl in the past and it worked fine.

I can swap this to see if it has any effect.

These are a bit more difficult to deal with since I have no past versions of the driver, and I have no easy way of testing various configurations. I’ll try some of the above ones first.

This may be the ticket…

This is a Dell server with VMware. So far, I haven’t been able to find any settings like the ones Windows has, unless this is something that HA does in its OS.

I will probably go ahead and do this too, it couldn’t hurt.

Perhaps try a shorter cable if you have one? Perhaps it’s a problem with the signal, that cable, or voltage drop across the cable.

I was thinking that. Both the stick and the machine are different from what I was once using. I’ll report back later when I have a chance to test…

I have two HA instances running, both with 6’ cables, and they work fine. Perhaps that’s the limit, but I’m just noting it for reference.

Well, I feel silly… It was the cable. I put in a 3 ft cable instead of the 30 footer and it works like a champ. The only question remaining, and I guess it’s what made me think it wasn’t the cable, is why does Z-Wave JS brick itself if the Comms are bad? When I simply restarted it, it would work fine again. Oh well, I’m just happy it’s working.
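For a rough sense of scale on the voltage-drop side, here is a back-of-the-envelope sketch comparing the 3 ft and 30 ft runs. The wire gauge (28 AWG, about 0.213 Ω per metre) and the 100 mA draw are illustrative assumptions, not measurements of this cable or stick.

```python
# Assumed values for illustration only.
OHMS_PER_METRE_28AWG = 0.213  # approximate resistance of 28 AWG copper wire
METRES_PER_FOOT = 0.3048


def voltage_drop(length_ft, current_a, ohms_per_m=OHMS_PER_METRE_28AWG):
    """Estimate the supply voltage lost across a cable of the given length."""
    # Current flows out on the 5 V wire and back on ground, so the
    # conductor path is twice the cable length.
    round_trip_m = 2 * length_ft * METRES_PER_FOOT
    return current_a * ohms_per_m * round_trip_m


for feet in (3, 6, 30):
    print(f"{feet:>2} ft: {voltage_drop(feet, 0.1):.2f} V drop at 100 mA")
```

Under these assumptions the drop scales linearly, from roughly 0.04 V at 3 ft to roughly 0.39 V at 30 ft. Separately from power, the USB 2.0 specification limits a passive cable to about 5 m (roughly 16 ft), so a 30 ft passive run is out of spec for signalling regardless of the voltage at the far end.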

After Samsung killed its automations, I was forced onto something else. Hubitat really didn’t work well, but so far, this is working a good bit better.

@CO_4X4 I will still probably get a spare, they are cheap enough.

Thanks All!

On startup, Z-Wave JS does a soft reset of the stick to get it back to a default state, which is why a restart gets things working again for a while. Later on, the controller becomes unresponsive again due to voltage sag, garbled commands, a firmware crash, etc.

Thanks, that makes sense. I wish I had known that while troubleshooting.