Z-Wave concurrency issue - please help

I’m using an Aeotec Z-Stick Gen5 and have a bunch of Z-Wave devices. All my devices are Z-Wave Plus with security enabled. Some devices are battery operated with the FLiRS feature (thermostats and lock).
I’m getting a bit frustrated with latency issues due to the way the Z-Wave network (at least in my case) appears to work (see description below).
I wonder whether others are having similar issues. In particular, whether this is a general Z-Wave problem, an OZW problem, an Aeotec Z-Stick problem or just a problem with my config.

From what I’ve observed and from the OZW logs I’ve checked, it seems that the Z-Wave network processes messages one by one even if they are sent to multiple devices.
The most common scenario is when the lock is being (un)locked (which takes ~1 second) and a light is being turned on. On its own, the light will turn on instantly, but if the lock is doing its work, the light will wait until the lock finishes. In the OZW logs, it appears that the two commands (lock and turn on light) are sent at the same time, however they seem to get processed in sequence (one by one).

The most recent example happened after my changes to automations: my arrival triggers changes in 5 thermostats and an unlock event. Due to sequencing issues the lock got unlocked after ~10 seconds (logs here).

Question to folks using Fibaro Home Center integrated with HA: does it exhibit similar concurrency issues? What’s the delay between the Home Center and HA? Is it at the ping level (i.e. instant) or is there some inherent delay in the Home Center?

So,
I tried using Z-Wave with Home Assistant + Aeotec Z-Stick and gave up on it.
I had issues ranging from long (10s+) delays to desynchronized states of switches and lights.
I think the issue is that OZW sends a command (probably followed by a get command to read the status back) and then waits for a status report from the device before doing anything else. If the report doesn’t arrive, OZW does not resend the command; it just waits for about 10s and times out. From my experience writing quite a few device handlers for Z-Wave devices on the SmartThings platform, I know that many Z-Wave devices don’t like that behavior: it makes them freeze, while otherwise they would happily send a status report when they are done.

Due to this I moved back to Fibaro Home Center Lite connected to Home Assistant.
Average delays for device reactions range from instant to about 1s, but can reach 3s if you are sending many commands to the same device (like rapidly toggling the outputs of a dual switch). Those are related to Z-Wave behavior more than to issues between Home Assistant and the Fibaro controller. Overall the experience is much more stable and pleasant than with OZW.

The only things I left on my OZW network are scene controllers like remotes, since they are not visible in Home Assistant when connected to the Fibaro HCL (not sure why, the Fibaro API seems to list them).

The issue is that Z-Wave is a low-bandwidth radio network and by its very nature is sequential. So lots of commands sent at once will queue up and be sent one at a time.

The standalone hubs tend to appear to do a better job, but that’s just because they are tuned better, often just poll the devices, and in some cases offer some sort of priority queue for transmitting.
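
To make the two points above concrete, here is a rough sketch (not how OZW or any hub actually schedules traffic) comparing a plain first-in-first-out send queue with a priority queue. The device names, airtimes in milliseconds and priority values are all made up for illustration, and every command is assumed to be queued at the same moment.

```python
import heapq
from itertools import count

def drain(commands, prioritized):
    """Send commands one at a time and return {name: finish_time_ms}.

    commands: list of (name, airtime_ms, priority); a lower priority
    number is sent first when prioritized is True. All values are
    hypothetical, chosen only to illustrate the queueing effect.
    """
    arrival = count()  # tie-breaker that preserves arrival order
    heap = []
    for name, airtime_ms, priority in commands:
        key = priority if prioritized else 0
        heapq.heappush(heap, (key, next(arrival), name, airtime_ms))

    clock_ms = 0
    finished = {}
    while heap:
        _, _, name, airtime_ms = heapq.heappop(heap)
        clock_ms += airtime_ms  # the radio handles one command at a time
        finished[name] = clock_ms
    return finished

commands = [
    ("thermostat_1", 2000, 5),
    ("thermostat_2", 2000, 5),
    ("lock", 1000, 1),
    ("light", 100, 0),
]

fifo = drain(commands, prioritized=False)
prio = drain(commands, prioritized=True)
print("FIFO:    ", fifo)
print("Priority:", prio)
```

With these made-up numbers the light finishes last in the FIFO case and first in the prioritized case, while the total time to empty the queue is the same in both. Prioritizing only changes who waits, not how much airtime is needed.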

This is true, but OZW seems to stall completely when it does not receive an expected response, making it IMHO unusable. I can’t be sure if it’s the fault of OZW or of the Aeotec Z-Stick Gen5, but my well-meshed small network of about 30 devices that worked fine on SmartThings (despite the horrible cloud execution model) and works much better on Fibaro Home Center Lite (despite how horribly underpowered this device is) is unusable on OZW + Z-Stick.

I’m running Pi+Razberry and always had the feeling it was OZW causing it. Then I heard that people on the Aeotec stick did not see these issues, but reading this makes me think again that it’s an OZW issue. OZW is reverse engineered and obviously not perfect. I know the HA team is discussing a move to another Z-Wave implementation after Sigma released the SDK.
For me, running OZW, Z-Wave feels too unreliable, but I know others running Z-Wave with Fibaro HC, and others with Homey, without these delays, lost commands and devices being out of sync.

Do you have something like this to see how well your nodes are distributed?

Yep, I installed it when testing the network. Even with 4 devices that could all see each other and the controller, I was still getting delays and sync issues.

OK. I have an Aeotec with 4 devices and was getting delays until I used a USB extension cable to move the stick toward my devices.

I’ve got a Z-Stick G5 and OZW and it works no problem with about 50 devices.

Have you spent any time tuning your z-wave stack or just left it at the defaults?

What do you have in mind by ‘tuning the Z-Wave stack’?

I haven’t done anything except setting device parameters, polling and healing the network. I’m new to HA and OZW and I’m not aware of any setting exposed to the user (in the UI or in config files) that would change the Z-Wave stack behavior, but I’m really interested. Is there any documentation I can read? Did you have similar issues before tuning?

What platform are you on, a rPI?

iocage jail running on FreeNas.

Thanks for this particular reply and thanks for replies in general.
On one hand I would be eager to experiment with Fibaro HC, but on the other hand my problems are not as big as yours and switching all devices to another controller is quite a hassle. Can you elaborate a bit more on the reliability of Fibaro HC?
How often do you need to restart it? How long does it take? Does it always succeed? I’m looking at this from the perspective of restarting HA, which restarts the Z-Wave stack and sometimes leads to Z-Wave going crazy and needing additional restarts.
The 1-3 seconds max response time from Fibaro HC isn’t perfect either. Here I’m comparing with my Zigbee stack driven by deCONZ. I have 41 devices and the network is very reliable and fast. It is able to handle dimming with a push/release button, which in the background keeps sending new brightness settings in ping-pong fashion (set brightness X, wait for the light to report brightness X, set brightness Y, …). Doing the same thing in my Z-Wave stack leads to Z-Wave hanging for a while.
Did you notice similar queueing to what I’ve described in my case?

Regarding optimizing Z-Wave stack. I’ve spent some time tweaking my stack:

  • I’m on a NUC. This has significantly improved the responsiveness of the stack.
  • The Z-Stick is on a USB extension cable positioned at the center of my house. Almost all devices are directly reachable from it.
  • I have a bunch of repeaters (dimmers, shutters) that make the mesh quite dense
  • All my devices are Z-Wave Plus
  • Spontaneous reporting of temperature, power or other stuff is disabled. Basically, I don’t have any sensors in the stack.
  • Polling is more or less disabled (set to 86400000)

It would be great to hear from @balloob a more concrete ETA (the likelihood of making this happen in 2019) regarding Z-Ware in HA and more details of the anticipated benefits.

I never had to restart the HCL with the newest firmware (4.530), but I know it’s not the case for everyone. It takes a few minutes (about 5) to boot after a restart. I never had an issue with the Fibaro HCL booting.

When restarting HA the HCL keeps on going so you can restart HA as many times as you want without worrying about the z-wave network.

While z-wave response was much faster with OZW + Z-Stick, the Fibaro HCL integration is much more consistent and stable. The 3s delay is an exception caused by the device itself not the Fibaro/HA integration.

I had no issue with similar loops using Z-Wave through Fibaro HCL, but in the case of Z-Wave things like this should be handled through Z-Wave Associations. The hanging you experience is, I think, the core of the issue with OZW. It is what causes desynchronization of the switch states in my case.

My setup is not too dissimilar:

  • I’m running HA in jail on a FreeNas server
  • Z-Stick is in the center of the home connected with USB over CAT 5e adapter (the server is in basement)
  • most of my devices are repeaters
  • all but 2 of my devices are Z-Wave +
  • I have many sensors but they do not report too often
  • I tried with polling both disabled and enabled; the problem persists.

I’m suffering the same issue with my AeoTec USB and Nano switches. The really frustrating thing is that it works fine with my cheap EverSpring plug sockets. They switch on and off instantly and using the manual switch gives an instant status update on Home Assistant.

Hi, just to provide an update on this. I am having exactly the same issue with the Aeotec USB and Dual Nano Switch on HA. I have raised a support query with Aeotec on this and pointed them at this thread.
I have also ordered a Samsung SmartThings Hub to test out the Aeotec devices; I’m assuming Samsung don’t use the OZW stack.

If that works then perhaps the HA just gets used to bridge Homekit…

Thanks for the feedback again.
I’m reading through the Fibaro client API implementation and it seems to be done through constant polling. This should be fast enough since there are no waits between calls, however it is a bit sad that Fibaro does not provide a push API for changes (e.g. via WS). Also, having more visibility into the latency of these calls and the error rate (an error leads to a 1-second sleep) would be great.

The other issue I’ve noticed is that this integration (and the Fibaro API) does not provide data regarding activated scenes. I have a few dimmers wired with buttons and I use scenes to trigger automations in HA. I wonder whether this functionality could be implemented using Fibaro LUA that would call a webhook in HA on scene activation in Fibaro. Is that possible in HCL?
EDIT: After reading the HCL documentation I’m quite convinced that the approach with LUA based scene activation would work, however LUA is not supported in HCL (HC2 is required). From what I understand, normal scenes (in HCL) can’t call a URL.
Adding @Peter_Balogh for additional feedback.

Thanks for raising this issue further. I wonder how folks with tens of Z-Wave devices and OZW deal with this issue. Yesterday I had to wait ~10 seconds for motion activated Z-Wave light to turn on, because it was exactly 9 PM and at 9 PM all my Z-Wave thermostats get updated with new temperature …

EDIT: I’ve spent some time going through the OZW code and checking the forum, and it seems to be an OZW issue that might be due to limitations of some Z-Wave sticks. In 2014 they talked about adding support for concurrent writes.
In OZW there is just one queue for messages to be sent. That queue is processed sequentially. For secure devices there are additional messages sent in order to encrypt the actual message. Comments in the OZW log indicate that the hardware doesn’t handle concurrent messages that well.
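
As a back-of-the-envelope illustration of why security makes the single queue hurt more, here is a small sketch. It is not OZW code. It assumes an S0-style handshake where each secure command turns into a nonce request, a nonce reply and then the encrypted command, and it models all three as slots in one sequential exchange. The command names are made up.

```python
from collections import deque

def frames_for(command, secure):
    """Frames exchanged on the air for one command.

    Assumption: secure devices use an S0-style nonce handshake, so one
    command becomes three frames instead of one.
    """
    if secure:
        return [
            f"NonceGet({command})",
            f"NonceReport({command})",
            f"Encrypted({command})",
        ]
    return [command]

def build_exchange(commands):
    """Flatten (command, secure) pairs into one strictly sequential queue."""
    queue = deque()
    for command, secure in commands:
        queue.extend(frames_for(command, secure))
    return queue

# Hypothetical "arrival" automation: 5 secure thermostats, then the lock.
commands = [(f"thermostat_{n}.set_temperature", True) for n in range(1, 6)]
commands.append(("lock.unlock", True))

queue = build_exchange(commands)
frames_before_unlock = list(queue).index("Encrypted(lock.unlock)")
print(f"{len(queue)} frames in total, "
      f"{frames_before_unlock} before the actual unlock")
```

Six commands become 18 frames under this assumption, and the frame that actually unlocks the door is the last one. If any frame ahead of it has to wait for a sleepy device or a timeout, everything behind it waits too.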

It’s not polling. There is a feature in the HCL API that opens a connection which stays open until there are changes, then it reports the changes and is closed. After it’s closed another one is opened immediately. It’s not polling, as HA does not send requests constantly, only after the changes are received. Changes are reported immediately. AFAIK Fibaro uses the same API to update the web interface.
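
The pattern described above is usually called long polling. Here is a minimal sketch of the client loop. It is not the actual HA Fibaro integration: `fetch_changes` is a hypothetical stand-in for the blocking HTTP request, and the fake server below only replays prepared batches so the example runs without a controller.

```python
def long_poll(fetch_changes, handle, max_rounds):
    """Ask for changes, wait for the answer, then immediately ask again.

    fetch_changes(last) is assumed to block until the server has something
    newer than `last`, and to return (new_last, list_of_changes).
    """
    last = 0
    for _ in range(max_rounds):
        last, changes = fetch_changes(last)  # blocks while nothing changes
        for change in changes:
            handle(change)
    return last

def make_fake_server(batches):
    """Return a fetch_changes stand-in that replays prepared batches."""
    pending = list(batches)

    def fetch_changes(last):
        # A real server would hold the request open here until a device
        # changes state; the fake one answers straight away.
        changes = pending.pop(0) if pending else []
        return last + len(changes), changes

    return fetch_changes

seen = []
fetch = make_fake_server([
    [{"id": 12, "value": "on"}],
    [{"id": 7, "value": 21.5}, {"id": 12, "value": "off"}],
])
final = long_poll(fetch, seen.append, max_rounds=2)
print(final, seen)
```

The client only sends a new request after the previous one has been answered, so an idle network produces almost no traffic, yet a change is delivered as soon as the server releases the pending request.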

When it comes to scenes, I’m experimenting right now with a virtual device sending GET requests (to Node-RED in my case). The scenes call the virtual device button, which in turn calls a URL.
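
For anyone who wants to try the same idea without Node-RED, the receiving end can be a very small HTTP endpoint. The sketch below is an assumption-heavy example: the `/scene?device=...&scene=...` URL format, the port and the handler name are all invented here, and the virtual device button would have to be configured to call whatever URL you pick.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

def parse_scene_request(path):
    """Extract (device, scene_id) from e.g. /scene?device=dimmer_hall&scene=26.

    Returns None when the path or the parameters do not match this
    (hypothetical) format.
    """
    parsed = urlparse(path)
    if parsed.path != "/scene":
        return None
    query = parse_qs(parsed.query)
    try:
        return query["device"][0], int(query["scene"][0])
    except (KeyError, ValueError):
        return None

class SceneHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        event = parse_scene_request(self.path)
        if event is None:
            self.send_response(400)
            self.end_headers()
            return
        # Forward the event to your automation system here.
        print("scene activated:", event)
        self.send_response(204)
        self.end_headers()

def serve(port=8099):
    """Run the receiver until interrupted (port chosen arbitrarily)."""
    HTTPServer(("0.0.0.0", port), SceneHandler).serve_forever()
```

Calling `serve()` starts the listener. A GET to `/scene?device=dimmer_hall&scene=26` is then parsed into `("dimmer_hall", 26)`, and anything else is rejected with a 400.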

You can also hack LUA scenes onto the HCL (it stores all scenes as LUA, and there is a jailbreak that allows you to FTP into the HCL and edit the scene files).