Upgrades today have caused constant unresponsive warnings, resulting in repeated driver restarts and Z-Wave interruptions.

That’s my exact concern too. Dead nodes being related to constant crashes has been mentioned more than once by different people.

Do you have any debug driver logs that demonstrate this? That’s not what I’m seeing: a dead node is simply detected as dead and the controller is not restarted. Maybe there is more to it than that.

I don’t have logs yet (when I have time to deal with the fallout of the upgrade again I’ll be sure to get them), but in one example of logs posted earlier, reactions to dead nodes (or the lack of an ACK relating to them) caused the driver restarts:

The comment directly below that stated that removing the dead node stopped it happening. It does seem something is happening with dead nodes or how they are being handled.

The logs look the same as most already posted.

I did a test before I did a complete reset.

I installed a completely fresh HA on another RPi 4, moved the USB stick over, installed only Z-Wave and nothing else, and added the old keys. It just kept crashing.

As soon as the network with the dead node was removed, everything worked.

It seems that for a lot of people who have had this issue, whether they got lucky and could remove the dead node or, like me, did a complete reset, the issue got resolved.
So I feel pretty comfortable pointing to that as the crash reason.

OK. It may be the case that removing these dead nodes solves the problem in certain cases, but it’s also true that having a dead node doesn’t mean you will hit the problem (I’m proof: a dead node does not reset my controller). It could also be a 500 vs 700 series issue, as I have a 700.

If I were in this position and had access to a Windows PC, I would try using PC Controller to remove the dead nodes first, instead of rebuilding the entire network. Or just roll back to a previous version of the add-on that is not affected. These new add-on versions aren’t required until HA 2023.10, so there’s no reason to stay on them if they’re causing this much trouble.

Still having issues using Z-Wave JS (ZJS) 1.93. I had to revert to HA 2023.8.x and ZJS 1.87.

I read on one of the GitHub threads that ZJS 1.9 wasn’t supported on HA 2023.8. Either way, ZJS keeps losing connection to the controller.

My Setup:

Here are my troubleshooting steps:

  1. Restore to HA Core 2023.8.2 – confirm working, leave for 24+ hours
  2. Update ZJS to 1.93 – Controller soft reset attempt fails
  3. Reboot entire server just to make sure there’s nothing that needs to be cleared out
  4. Restore to #1, confirm working.
  5. Update HA Core to 2023.9.3 – Controller soft reset attempt fails (logs attached)
    – Note that HA Core 2023.9.3 installs ZJS v1.93. I read on one of the GitHub issue threads that ZJS 1.8x isn’t compatible with HA 2023.9.x
  6. Restore to #1, confirm working.
  7. Switch to ZWave UI v2.0.1, disable Soft Reset in UI, still fails (logs attached) :exploding_head:

What started this whole debacle instead of just leaving the restore well enough alone was that I needed to set associations between devices, thanks to a whole slew of new Inovelli dimmers and a plethora of 3-way switches (previous owner was an electrician…) :smile:

What I finally ended up doing is installing ZWave JS UI v1.16.0:
I forked the add-on repo from the official one and set the version I wanted so I could downgrade (v1.16.0). NOW, I can upgrade HA to the latest until this gets fixed for good.

After figuring out how to downgrade, I followed the excellent guide to switch from Z-Wave JS to Z-Wave JS UI by @freshcoast for the rest. The only thing that is different is the add-on identifier in Step 9, which should be the same if you use my repo:

ws://ed7eb4c2-zwavejs2mqtt:3000
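
For anyone who wants to sanity-check that URL before pasting it into the integration, here is a minimal Python sketch. The helper names `ws_endpoint` and `is_reachable` are made up for illustration. It only checks that a TCP connection opens on that host and port, not that the Z-Wave JS server is actually answering. As far as I know the add-on hostname only resolves from inside the Home Assistant container network, so expect `False` if you run it anywhere else.

```python
import socket
from urllib.parse import urlsplit


def ws_endpoint(url):
    """Split a ws:// or wss:// URL into (hostname, port)."""
    parts = urlsplit(url)
    if parts.scheme not in ("ws", "wss") or parts.hostname is None:
        raise ValueError(f"not a WebSocket URL: {url!r}")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "wss" else 80)
    return parts.hostname, port


def is_reachable(url, timeout=3.0):
    """Return True if a plain TCP connection to the URL's host and port opens."""
    host, port = ws_endpoint(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


if __name__ == "__main__":
    url = "ws://ed7eb4c2-zwavejs2mqtt:3000"
    print(ws_endpoint(url), "reachable:", is_reachable(url))
```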

You could be right on the 500 vs 700 dead node statement. I’m using a 500 controller now (downgraded from a 700, as the not-really-dead nodes that kept occurring daily on the 700, despite the latest firmware, were driving me mad and kept breaking heating automations, which have to work all the time). The network has been perfectly stable ever since.

So when 2023.10 is out, will this help the situation so we can upgrade safely to JS UI v2.0.1?

No, HA versions have nothing to do with the problems expressed here. If you upgrade to HA 2023.10 you will be forced to upgrade your add-on.

Given that, @freshcoast, what does the best preventative look like right now? Being on a 700 series coordinator or turning off soft reset on a 500?

I’ve got this update on hold because the thought of reconnecting 80 devices on ladders is not exactly thrilling. But I have also been waiting on something in 2023.10 to fix something else.

I’m confused then, there seems to be a big problem with JS UI V2.0.1 where you can lose access to all your devices and HA 2023.10 will require it…

So what do we do? Is someone working on a fix, does an issue need to be raised, logs provided somewhere?

Yes it’s being worked on:
:rescue_worker_helmet: Version 12: Problems communicating with the controller :rescue_worker_helmet: · Issue #6341 · zwave-js/node-zwave-js (github.com)

This specific issue you’ve linked to is not something that is “being worked on”. It is a set of instructions telling users how to fix problems related to migrating to v12.

Apologies. There are four so far that I’ve read this morning, across both JS and JS UI, that seem to be interconnected. This seemed to be the most concise summary. Which one is the root?
(and why we’re all confused)

No apology needed, just clarifying as it is indeed all confusing. I don’t think there is a single root issue, and not everyone has a problem even.

Here’s my take on the current situation.

First, I think many of the reported problems are related to the new soft-reset behavior and 500-series controllers. Configuring your VM (if using one) and using the by-id paths, as described in the linked issue, will go a long way. If you are unable to perform those fixes, then both add-ons have the ability to manually disable soft-reset as of right now. You can find reports that doing either of those has solved some people’s problems.

Second, I think users are affected in different ways by the new behavior for handling unresponsive controllers.

Some subset of users seem to have a supposedly unresponsive 500-series controller (according to the logs) despite either fixing the above soft-reset config requirements or disabling soft-reset entirely. So there seems to be an unsolved problem there. Yet not everyone is affected by this. I don’t think you will know until you try.

If you have a 700-series, you may be affected by a possible 7.19 firmware issue that supposedly causes unresponsive controllers. The workaround here is also the automatic soft-reset recovery. This has been reported to Silicon Labs, but I personally haven’t heard of any update. I’m staying on 7.18 for now.

There is also a fix in v12.0.2 related to unresponsive controllers (jammed). The newest version of the official add-on has this version of the driver, but the ZUI add-on does not yet (at this very moment).

So prior to upgrading any add-ons, if you’re using a VM and a 500-series controller, I’d say it’s essential to get USB pass-through working completely. Second, use the /dev/serial/by-id paths when available, whether VM or bare-metal. These are already provided by HAOS. If you are using Docker (or Snap?), make sure your OS is providing the path for you. There are some versions of systemd which broke this; updating can solve that. If none of those are possible, as a last resort you can disable soft-reset too.
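
To see whether your OS is actually providing the by-id paths mentioned above, something like this rough Python sketch can help. The function name `list_serial_by_id` is just an illustration, and it assumes a Linux host where udev populates `/dev/serial/by-id`. It lists each stable path next to the device node it currently points at.

```python
from pathlib import Path


def list_serial_by_id(by_id_dir="/dev/serial/by-id"):
    """Return {stable_path: resolved_device} for every entry in by_id_dir.

    An empty dict means the directory is missing, which is what you would
    see when no USB serial device is attached or the OS is not creating
    the by-id links.
    """
    root = Path(by_id_dir)
    if not root.is_dir():
        return {}
    # Each entry is normally a symlink to something like /dev/ttyACM0.
    return {str(p): str(p.resolve()) for p in sorted(root.iterdir())}


if __name__ == "__main__":
    mapping = list_serial_by_id()
    if not mapping:
        print("No /dev/serial/by-id entries found")
    for stable, device in mapping.items():
        print(f"{stable} -> {device}")
```

The path on the left is the one to put in the add-on configuration, since the `/dev/tty*` name on the right can change when the stick re-enumerates.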

So all I can really advise is to make sure you have a backup of any working add-on so you can roll back if needed, and be aware that HA 2023.10 does have breaking requirements (needs official add-on 0.1.91, ZUI add-on 2.0.0, or standalone ZUI 9.0.0, all of which use driver v12), so keep that in mind if you plan an HA update this week.
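
As a quick way to compare what you have installed against those minimums, here is a small Python sketch. The keys in `MINIMUMS` and the function names are my own labels, not anything HA exposes, and the version numbers are simply the ones quoted above.

```python
# Minimum versions quoted above for HA 2023.10 (all of which use driver v12).
# The dictionary keys are illustrative labels only.
MINIMUMS = {
    "official_addon": (0, 1, 91),
    "zui_addon": (2, 0, 0),
    "zui_standalone": (9, 0, 0),
}


def parse_version(text):
    """Turn '2.0.1' or 'v2.0.1' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def meets_minimum(kind, installed):
    """Return True if the installed version satisfies the 2023.10 minimum."""
    return parse_version(installed) >= MINIMUMS[kind]


if __name__ == "__main__":
    # Example: the downgraded ZUI add-on mentioned earlier in the thread.
    print(meets_minimum("zui_addon", "1.16.0"))
    print(meets_minimum("zui_addon", "2.0.2"))
```

So a ZUI add-on pinned at v1.16.0 would need to move to 2.0.0 or later before taking the 2023.10 update.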

Do we have an ETA on 12.0.2? Sounds like that’s where ZUI users want to be before next week.

Only the add-on maintainer can answer that; community add-ons are a one-person show. They’re probably busy with the HA release.

HA release is tomorrow, not next week.

I don’t really know how prevalent the issue fixed in 12.0.2 is.

I have access to the repo, so I’m going to try to push the release as long as we can get CI to pass

EDIT: pushed!

Will this create an incremental update for ZUI in HA, on top of the 2.0.1 I’m currently seeing as available?

Z-Wave JS UI 2.0.2 was released 12hrs ago according to Release v2.0.2 · hassio-addons/addon-zwave-js-ui · GitHub

I don’t know how often HA checks for new add-on versions.