This week I decided to take the plunge and migrate from ZWave (deprecated) to ZWave-JS. It didn’t go well…
My installation is an RPi3B+ with an Aeotec Gen5 USB stick (ZW090-A). It’s a largish network, about 125 ZWave devices of various ages and manufacturers, but it’s been working well for several years, and I’ve updated frequently to keep up with the HA software releases. I haven’t done anything to the Gen5 stick firmware though; it’s just as it came from the factory.
Using the guidance on the HA website, I tried the “Start Migration” process. It ran for a while and then said “Migration failed.” It hadn’t gotten even to the point of telling me what it would be able to migrate. No other info about why it failed.
So I figured I should try the manual method. I powered down and removed the SD chip so that I could go back to the old system that had been working for years. I used a new SD chip (SanDisk 32GB “high endurance”) and flashed it using balenaEtcher with the current HA release: https://github.com/home-assistant/operating-system/releases/download/7.4/haos_rpi3-64-7.4.img.xz
I brought that up and did a restore from a backup I had taken a few days earlier. Everything seemed to come back online fine, including other integrations (Sonos, Vizio, etc.). I deleted the ZWave integration, power-cycled, rebooted, and installed the ZWave-JS addon, using the network key I had saved from the old install. I looked at the log and it seemed to be methodically working through my ZWave devices. Hours later (125+ devices to interview…) the system showed it had found 125 devices, 124 of which were not ready. I then installed the ZWave-JS integration. It showed lots of devices, some of which had the expected manufacturer and device model information. But nothing was “ready”.
Going to Plan B, I powered the RPi down and put the original SD chip back into service. Booted up, but after a few hours all ZWave devices were still in “Unknown” state. In the past, all but battery-powered devices would be fully functional in that time. So now the “old” system no longer works either. Tried again, using older backups, with the same result.
Going to Plan C, I decided to try a totally fresh install. I reflashed an SD chip with the same current HA release, booted it up, and did a new setup rather than a restore from backup. I’d have to set up all my other integrations again of course, but a fresh install seemed the most likely way to get ZWave-JS running.
Plan C didn’t work either. I captured parts of the ZWave-JS add-on log as it went through the startup process, and watched the LEDs on the Gen5 stick as it worked. Everything looked good at first, but after somewhere around 100 devices the log started showing errors (e.g., “Status No Ack (ZW0204)”) and the Gen5 stick seemed to become “stuck”. Excerpts from the log are below if anyone’s curious.
At this point, I’m suspecting that my USB stick died – while it was in the process of migrating. Not sure if that might be just coincidence or because of something that ZWave-JS does that the old ZWave integration didn’t trigger. Perhaps ZWave-JS is going through the interview process much faster than the ZWave 1.4 software and somehow that overloads something and starts getting errors? Curious though that even the old deprecated 1.4 configuration no longer works. Conclusion – the Gen5 stick has died somehow.
So, I’ve ordered a new USB stick (sticking to 500 series with a HUSBZB-1). Wanted to add Zigbee capability anyway. Hopefully I can get my setup working again after getting each ZWave device onto the new stick. It will take a while; some of the devices aren’t easy to reach.
Meanwhile, maybe someone sees some clues in these logs. The symptoms (large network, lots of devices, a dozen or so battery-powered devices, loss of ability to communicate, etc.) sound very similar to me to the problem recently reported with the new 700 devices. But my controller is a 500 device. Maybe there was a similar bug in the 500 system too? Or perhaps there’s something in the Gen5’s network data, left over from dead devices, that is somehow causing the behavior I see? I remember a bunch of devices that took several tries to include and left “dead” devices that I never managed to remove, but they didn’t seem to be a problem. Or perhaps my Gen5 stick just died; it’s been running continuously for 6 or 7 years.
Jack Haverty
============ excerpts from ZWave-JS addon logs, with a few comments I added =============
ZWave-JS addon started up, looks like it’s progressing normally. ZStick Gen5 LEDs are cycling yellow/red/blue every few seconds as expected.
2022-02-12T18:21:26.880Z CNTRLR [Node 068] The node is alive.
2022-02-12T18:21:26.889Z CNTRLR « [Node 068] ping successful
2022-02-12T18:21:26.891Z CNTRLR [Node 068] Interviewing Manufacturer Specific…
2022-02-12T18:21:26.892Z CNTRLR » [Node 068] querying manufacturer information…
2022-02-12T18:21:26.980Z CNTRLR [Node 069] The node is alive.
2022-02-12T18:21:27.002Z CNTRLR « [Node 069] ping successful
2022-02-12T18:21:27.004Z CNTRLR [Node 069] Interviewing Manufacturer Specific…
2022-02-12T18:21:27.005Z CNTRLR » [Node 069] querying manufacturer information…
2022-02-12T18:21:27.086Z CNTRLR [Node 070] The node is alive.
2022-02-12T18:21:27.096Z CNTRLR « [Node 070] ping successful
2022-02-12T18:21:27.097Z CNTRLR [Node 070] Interviewing Manufacturer Specific…
2022-02-12T18:21:27.098Z CNTRLR » [Node 070] querying manufacturer information…
2022-02-12T18:21:27.154Z CNTRLR [Node 071] The node is alive.
2022-02-12T18:21:27.163Z CNTRLR « [Node 071] ping successful
2022-02-12T18:21:27.164Z CNTRLR [Node 071] Interviewing Manufacturer Specific…
2022-02-12T18:21:27.165Z CNTRLR » [Node 071] querying manufacturer information…
2022-02-12T18:21:27.256Z CNTRLR [Node 072] The node is alive.
2022-02-12T18:21:27.282Z CNTRLR « [Node 072] ping successful
2022-02-12T18:21:27.284Z CNTRLR [Node 072] Interviewing Manufacturer Specific…
2022-02-12T18:21:27.285Z CNTRLR » [Node 072] querying manufacturer information…
2022-02-12T18:21:27.357Z CNTRLR [Node 073] The node is alive.
2022-02-12T18:21:27.368Z CNTRLR « [Node 073] ping successful
2022-02-12T18:21:27.370Z CNTRLR [Node 073] Interviewing Manufacturer Specific…
2022-02-12T18:21:27.371Z CNTRLR » [Node 073] querying manufacturer information…
2022-02-12T18:21:27.468Z CNTRLR [Node 078] The node is alive.
2022-02-12T18:21:27.478Z CNTRLR « [Node 078] ping successful
2022-02-12T18:21:27.480Z CNTRLR [Node 078] Interviewing Manufacturer Specific…
2022-02-12T18:21:27.481Z CNTRLR » [Node 078] querying manufacturer information…
2022-02-12T18:21:27.552Z CNTRLR [Node 088] The node is alive.
2022-02-12T18:21:27.562Z CNTRLR « [Node 088] ping successful
2022-02-12T18:21:27.563Z CNTRLR [Node 088] Interviewing Manufacturer Specific…
2022-02-12T18:21:27.564Z CNTRLR » [Node 088] querying manufacturer information…
2022-02-12T18:21:27.612Z CNTRLR [Node 098] The node is alive.
2022-02-12T18:21:27.661Z CNTRLR « [Node 098] ping successful
2022-02-12T18:21:27.667Z CNTRLR » [Node 098] Querying securely supported commands (S0)…
2022-02-12T18:21:27.764Z CNTRLR [Node 103] The node is alive.
2022-02-12T18:21:27.775Z CNTRLR « [Node 103] ping successful
2022-02-12T18:21:27.777Z CNTRLR » [Node 103] Querying securely supported commands (S0)…
.
.
.
ZWave JS startup continues, but after a minute or two starts getting errors. Stick LEDs are still cycling but very slowly.
2022-02-12T18:22:20.881Z CNTRLR [Node 122] The node did not respond after 1 attempts, it is presumed dead
2022-02-12T18:22:20.884Z CNTRLR [Node 122] The node is dead.
2022-02-12T18:22:20.907Z CNTRLR [Node 122] ping failed: Failed to send the command after 1 attempts (Status No Ack) (ZW0204)
2022-02-12T18:22:20.908Z CNTRLR » [Node 122] querying node info…
2022-02-12T18:22:20.912Z CNTRLR » [Node 122] pinging the node…
.
.
.
ZWave-JS interview continues, but everything fails. Stick LEDs are stuck in a single color, not cycling at all.
2022-02-12T18:23:07.577Z CNTRLR [Node 135] The node did not respond after 1 attempts, it is presumed dead
2022-02-12T18:23:07.580Z CNTRLR [Node 135] The node is dead.
2022-02-12T18:23:07.601Z CNTRLR [Node 135] ping failed: Failed to send the command after 1 attempts (Status No Ack) (ZW0204)
2022-02-12T18:23:07.602Z CNTRLR » [Node 135] querying node info…
2022-02-12T18:23:07.603Z CNTRLR » [Node 135] pinging the node…
2022-02-12T18:23:11.806Z CNTRLR [Node 139] The node did not respond after 1 attempts, it is presumed dead
2022-02-12T18:23:11.809Z CNTRLR [Node 139] The node is dead.
2022-02-12T18:23:11.835Z CNTRLR [Node 139] ping failed: Failed to send the command after 1 attempts (Status No Ack) (ZW0204)
2022-02-12T18:23:11.836Z CNTRLR » [Node 139] querying node info…
2022-02-12T18:23:11.838Z CNTRLR » [Node 139] pinging the node…
2022-02-12T18:23:11.972Z CNTRLR [Node 145] The node is alive.
2022-02-12T18:23:11.985Z CNTRLR « [Node 145] ping successful
2022-02-12T18:23:11.986Z CNTRLR [Node 145] Interviewing Manufacturer Specific…
2022-02-12T18:23:11.987Z CNTRLR » [Node 145] querying manufacturer information…
2022-02-12T18:23:13.624Z CNTRLR No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2022-02-12T18:23:14.013Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2022-02-12T18:23:15.563Z CNTRLR [Node 146] The node is alive.
2022-02-12T18:23:15.578Z CNTRLR « [Node 146] ping successful
2022-02-12T18:23:15.579Z CNTRLR » [Node 146] querying node info…
2022-02-12T18:23:15.640Z CNTRLR [Node 147] The node is alive.
2022-02-12T18:23:15.653Z CNTRLR « [Node 147] ping successful
2022-02-12T18:23:15.654Z CNTRLR [Node 147] Interviewing Manufacturer Specific…
2022-02-12T18:23:15.655Z CNTRLR » [Node 147] querying manufacturer information…
2022-02-12T18:23:19.865Z CNTRLR [Node 148] The node did not respond after 1 attempts, it is presumed dead
2022-02-12T18:23:19.867Z CNTRLR [Node 148] The node is dead.
2022-02-12T18:23:19.885Z CNTRLR [Node 148] ping failed: Failed to send the command after 1 attempts (Status No Ack) (ZW0204)
2022-02-12T18:23:19.886Z CNTRLR » [Node 148] querying node info…
2022-02-12T18:23:19.887Z CNTRLR » [Node 148] pinging the node…
ZWave stick LEDs are now off, no color shown at all.
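In case it helps anyone who wants to go through a longer log like this, here is a minimal Python sketch that tallies the last "alive" or "dead" status reported for each node. It assumes the log lines look exactly like the excerpts above (timestamp, CNTRLR, [Node NNN], "The node is alive." or "The node is dead."), each on a single unwrapped line. The function names summarize and dead_nodes are just names I made up for illustration; this is not part of ZWave-JS or HA.

```python
import re

# Matches status lines in the format seen in the excerpts above, e.g.:
#   2022-02-12T18:22:20.884Z CNTRLR [Node 122] The node is dead.
# Assumption: each log entry is on one line (not wrapped mid-message).
STATUS_RE = re.compile(
    r"^(?P<ts>\S+Z)\s+CNTRLR\s+\[Node (?P<node>\d+)\] The node is (?P<state>alive|dead)\."
)


def summarize(lines):
    """Return {node_id: (timestamp, state)} with the last status seen per node."""
    last = {}
    for line in lines:
        m = STATUS_RE.match(line.strip())
        if m:
            # Later lines overwrite earlier ones, so the final state wins.
            last[int(m.group("node"))] = (m.group("ts"), m.group("state"))
    return last


def dead_nodes(lines):
    """Return a sorted list of node IDs whose last reported state is 'dead'."""
    return sorted(
        node for node, (_, state) in summarize(lines).items() if state == "dead"
    )
```

To use it, read the saved log into a list of lines (for example with open(path).read().splitlines(), where path is wherever you saved the add-on log) and pass that list to dead_nodes. Comparing the timestamps of the first dead nodes against when the stick LEDs stopped cycling might show whether the failures begin all at once or creep in gradually.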