Zwave nodes dropping out

My system was always very stable, as I managed to create a dense mesh of Zwave devices in my house. But a couple of days ago I started having problems with different nodes no longer responding. They get stuck in the CacheLoad or Probe state and stop refreshing their sensors. When the first sensor dropped out three days ago I just thought it had broken down, but since then another three sensors have dropped out. My controller is a Zwave Stick gen 5.

Here’s the Zwave log: https://pastebin.com/YCHmkiCS
The problematic nodes are: 4, 5, 9, 14.

What devices are node 4, 5, 9 and 14? Motion sensors? Light switches?

1x Fibaro Smoke Sensor, 2x Aeotec Multisensor 6 and a Yale Keyfree Lock (in constant use, so it's being woken up constantly). Also, each of those 4 devices is located in a completely different direction from the Zwave hub.

Per the log you posted, the devices are sleeping and queued to get their states when they wake up and report in.

Do they stay in the “CacheLoad” state forever?
Have you waited for them to report in on their own?

Yeah, they’ve all been “sleeping” for at least the last 4 days. That said, the door lock is in constant use, so it should send Zwave reports every time it is operated. This hasn’t happened before - my Zwave network was rock solid until now.
Switching the devices off and on again by removing the battery doesn’t help either, but the devices themselves work (for example, the smoke sensor reacts to my e-cigarette’s smoke).

PS. One of the devices, the smoke alarm (node 5), is actually stuck on “Probe”. As I said before, it detects smoke and the alarm turns on, but it doesn’t communicate over Zwave. Weird, as it’s located less than 2m from the Zwave USB stick.

Very strange, have you tried a network heal?

I have exactly the same problem here, with almost the same log output. I did not have any problems before with my Z-Wave network of almost 30 devices… But since my last updates to 0.69.x this strange problem occurs, and the network becomes unusable…

BTW, I’m on an RPi 3 with a RaZberry GPIO module.

Yes, I’ve tried “Heal network” without any luck.
Just noticed that when I try to “heal” a node I get the following HA error:
Node state must a minimum set to awake

And here is the Zwave log after trying to heal it:

2018-05-26 15:01:38.210 Detail, Node005, AdvanceQueries queryPending=0 queryRetries=0 queryStage=Probe live=1
2018-05-26 15:01:38.210 Detail, Node005, QueryStage_Probe
2018-05-26 15:01:38.210 Info, Node005, NoOperation::Set - Routing=true
2018-05-26 15:01:38.210 Detail, Node005, Queuing (NoOp) NoOperation_Set (Node=5): 0x01, 0x09, 0x00, 0x13, 0x05, 0x02, 0x00, 0x00, 0x25, 0x0c, 0xcb
2018-05-26 15:01:38.210 Detail, Node005, Queuing (Query) Query Stage Complete (Probe)
2018-05-26 15:01:38.927 Detail, Node005, Notification: NodeAdded
2018-05-26 15:01:38.942 Detail, Node005, Notification: NodeProtocolInfo
2018-05-26 15:01:38.944 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.946 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.947 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.950 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.952 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.954 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.959 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.961 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.970 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.972 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.975 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:38.979 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.004 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.014 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.016 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.017 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.020 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.032 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.039 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.048 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.062 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.071 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.074 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.076 Detail, Node005, Notification: Group
2018-05-26 15:01:39.090 Detail, Node005, Notification: Group
2018-05-26 15:01:39.090 Detail, Node005, Notification: Group
2018-05-26 15:01:39.110 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.111 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.113 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.114 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.116 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.121 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.122 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.124 Detail, Node005, Notification: Group
2018-05-26 15:01:39.160 Detail, Node005, Notification: Group
2018-05-26 15:01:39.166 Detail, Node005, Notification: Group
2018-05-26 15:01:39.172 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.178 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.186 Detail, Node005, Notification: ValueAdded
2018-05-26 15:01:39.191 Detail, Node005, Notification: NodeNaming
2018-05-26 15:01:49.766 Info, Node005, Sending (NoOp) message (Callback ID=0x0c, Expected Reply=0x13) - NoOperation_Set (Node=5): 0x01, 0x09, 0x00, 0x13, 0x05, 0x02, 0x00, 0x00, 0x25, 0x0c, 0xcb
2018-05-26 15:01:49.779 Detail, Node005, Received: 0x01, 0x04, 0x01, 0x13, 0x01, 0xe8
2018-05-26 15:01:49.779 Detail, Node005, ZW_SEND_DATA delivered to Z-Wave stack
2018-05-26 15:01:54.053 Detail, Node005, Received: 0x01, 0x07, 0x00, 0x13, 0x0c, 0x01, 0x01, 0xac, 0x4b
2018-05-26 15:01:54.053 Detail, Node005, ZW_SEND_DATA Request with callback ID 0x0c received (expected 0x0c)
2018-05-26 15:01:54.053 Info, Node005, WARNING: ZW_SEND_DATA failed. No ACK received - device may be asleep.
2018-05-26 15:01:54.053 Info, Node005, Node 5 has been marked as asleep
2018-05-26 15:01:54.053 Info, Node005, Node not responding - moving controller command to Wake-Up queue: Delete All Return Routes
2018-05-26 15:01:54.053 Info, Node005, Node not responding - moving controller command to Wake-Up queue: Assign Return Route
2018-05-26 15:01:54.053 Info, Node005, Node not responding - moving QueryStageComplete command to Wake-Up queue
2018-05-26 15:01:54.053 Detail, Node005, Notification: Notification - NoOperation
2018-05-26 15:01:54.054 Detail, Node005, Notification: Notification - Node Asleep

There are no errors in that log: OpenZwave is getting probe data from the node, then your device goes to sleep. That is all normal for a battery-operated device.
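
For anyone who wants to check a longer OZW log for the same pattern, here is a minimal Python sketch that lists which nodes were marked asleep and which commands were parked on their wake-up queue. The function name `summarize_sleepers` and the regular expression are my own assumptions, based only on the line format in the excerpt above; other OpenZWave versions may log these events differently.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpt above:
# "2018-05-26 15:01:54.053 Info, Node005, Node 5 has been marked as asleep"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>\w+), Node(?P<node>\d{3}), (?P<msg>.*)$"
)


def summarize_sleepers(lines):
    """Return {node_id: {"asleep_at": [...], "queued": [...]}} (hypothetical helper)."""
    summary = defaultdict(lambda: {"asleep_at": [], "queued": []})
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # controller lines and malformed lines are skipped
        node = int(match.group("node"))
        msg = match.group("msg")
        if "has been marked as asleep" in msg:
            summary[node]["asleep_at"].append(match.group("ts"))
        elif "Wake-Up queue" in msg:
            # Keep the command name after the colon when there is one,
            # otherwise keep the whole message.
            _, _, command = msg.partition("Wake-Up queue: ")
            summary[node]["queued"].append(command or msg)
    return dict(summary)
```

Run over the excerpt above, this reports node 5 as marked asleep once, with three entries moved to its wake-up queue. A node that shows up here for days without ever waking is the one worth looking at.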

Are any of the sensors reporting anything?
You can check them in the <> menu.

The sensors were stuck with the same values for days before I noticed that something was wrong. Right now, after some of my actions (trying to heal nodes, test nodes, etc.), they’ve disappeared from HA completely, except from the Z-Wave configuration menu.
When I run “Z-Wave Range Test” on the device, the LED turns green, which means that the device has a good direct connection to the controller.

Restart Home Assistant to get the nodes to come back.
As for what’s going on, it could be a configuration setting on the device causing it not to report.

I have got something similar to this. It seems that battery-operated devices (Fibaro smoke sensors, Coolcam motion sensors, etc.) disappear from Home Assistant, usually after an upgrade and restart. I am running Home Assistant on a NUC and upgraded a working setup to version 0.96.1 this morning. After that, half of my sensors seem to be gone. If I open the Zwave log, I do see all the sensors communicating.
It seems that this has something to do with the complex registration process in Home Assistant, which is now also stored in .storage. Most of the time I can fix this by restoring a backup of .storage, zwave_device_config.yaml and *.xml, but it keeps me off the street for a couple of hours each update :wink:

I analyzed again and discovered this in my docker log:

homeassistant.exceptions.HomeAssistantError: Entity id already exists: sensor.neo_coolcam_battery_powered_pir_sensor_alarm_level_4. Platform zwave does not generate unique IDs

2019-07-19 15:11:43 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved

I do not quite understand how HA manages the Zwave devices and why it is not respecting the ultimate source: the Zwave stick combined with the Zwave stack (OZW). It seems that there are differences between the two that generate these issues.
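
To see how many entities are affected, a small Python sketch like the one below can pull the conflicting entity IDs out of a log. It only relies on the "Entity id already exists" message quoted above; the helper name `duplicate_entity_ids` is hypothetical, and the pattern is an assumption that may need adjusting for other HA versions.

```python
import re
from collections import Counter

# Pattern assumed from the error message quoted above.
DUPLICATE_RE = re.compile(
    r"Entity id already exists: (?P<entity_id>[a-z0-9_]+\.[a-z0-9_]+)"
)


def duplicate_entity_ids(log_lines):
    """Count how often each entity ID is reported as already existing."""
    counts = Counter()
    for line in log_lines:
        match = DUPLICATE_RE.search(line)
        if match:
            counts[match.group("entity_id")] += 1
    return counts
```

Feeding it the lines of the Docker log (or a saved copy of it) gives one counter entry per conflicting entity ID, which makes it easier to compare against what is stored in .storage before restoring a backup.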

Exactly my situation too.
I have upgraded HA several steps along the way without any difference. Heal Network does not seem to work, nor does the “wake node” procedure of pressing the button on the device three times make any progress.