Zwave stops working shortly after boot

Recently after a reboot zwave started to stop working about 2-5min after reboot. I’ve had some success by deleting the zwcfg file and doing a reboot via power toggle. It usually “works” find 80% of the nodes. If I reboot (via hass) it goes back into the zwave crash cycle.

  • Hass.io on RasPi3
  • Aeotec Z-wave stick 5
  • I’ve tried a powered usb hub in case that was an issue

Here is a log example

2018-04-23 21:29:36.110 Always, OpenZwave Version 1.4.2926 Starting Up
2018-04-23 21:29:52.640 Info, Setting Up Provided Network Key for Secure Communications
2018-04-23 21:29:52.640 Info, mgr,     Added driver for controller /dev/ttyACM0
2018-04-23 21:29:52.640 Info,   Opening controller /dev/ttyACM0
2018-04-23 21:29:52.641 Info, Trying to open serial port /dev/ttyACM0 (attempt 1)
2018-04-23 21:29:52.642 Info, Serial port /dev/ttyACM0 opened (attempt 1)
2018-04-23 21:29:52.645 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-04-23 21:29:52.645 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_MEMORY_GET_ID: 0x01, 0x03, 0x00, 0x20, 0xdc
2018-04-23 21:29:52.645 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_CONTROLLER_CAPABILITIES: 0x01, 0x03, 0x00, 0x05, 0xf9
2018-04-23 21:29:52.645 Detail, contrlr, Queuing (Command) FUNC_ID_SERIAL_API_GET_CAPABILITIES: 0x01, 0x03, 0x00, 0x07, 0xfb
2018-04-23 21:29:52.645 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_SUC_NODE_ID: 0x01, 0x03, 0x00, 0x56, 0xaa
2018-04-23 21:29:52.645 Detail, 
2018-04-23 21:29:52.645 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x15) - FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-04-23 21:29:53.646 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-04-23 21:29:53.646 Detail, contrlr, Removing current message
2018-04-23 21:29:53.646 Detail, contrlr, Notification: Notification - TimeOut
2018-04-23 21:29:53.651 Detail, 
2018-04-23 21:29:53.651 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x20) - FUNC_ID_ZW_MEMORY_GET_ID: 0x01, 0x03, 0x00, 0x20, 0xdc
2018-04-23 21:29:54.651 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-04-23 21:29:54.651 Detail, contrlr, Removing current message
2018-04-23 21:29:54.652 Detail, contrlr, Notification: Notification - TimeOut
2018-04-23 21:29:54.667 Detail, 
2018-04-23 21:29:54.667 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x05) - FUNC_ID_ZW_GET_CONTROLLER_CAPABILITIES: 0x01, 0x03, 0x00, 0x05, 0xf9
2018-04-23 21:29:55.668 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-04-23 21:29:55.668 Detail, contrlr, Removing current message
2018-04-23 21:29:55.668 Detail, contrlr, Notification: Notification - TimeOut
2018-04-23 21:29:55.670 Detail, 
2018-04-23 21:29:55.670 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x07) - FUNC_ID_SERIAL_API_GET_CAPABILITIES: 0x01, 0x03, 0x00, 0x07, 0xfb
2018-04-23 21:29:56.670 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-04-23 21:29:56.670 Detail, contrlr, Removing current message
2018-04-23 21:29:56.670 Detail, contrlr, Notification: Notification - TimeOut
2018-04-23 21:29:56.671 Detail, 
2018-04-23 21:29:56.671 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x56) - FUNC_ID_ZW_GET_SUC_NODE_ID: 0x01, 0x03, 0x00, 0x56, 0xaa
2018-04-23 21:29:57.671 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-04-23 21:29:57.671 Detail, contrlr, Removing current message
2018-04-23 21:29:57.671 Detail, contrlr, Notification: Notification - TimeOut
2018-04-23 21:34:13.294 Detail,   Received: 0x01, 0x10, 0x01, 0x15, 0x5a, 0x2d, 0x57, 0x61, 0x76, 0x65, 0x20, 0x33, 0x2e, 0x39, 0x35, 0x00, 0x01, 0x99
2018-04-23 21:34:13.294 Detail, 
2018-04-23 21:34:13.295 Info, Received reply to FUNC_ID_ZW_GET_VERSION:
2018-04-23 21:34:13.295 Info,     Static Controller library, version Z-Wave 3.95
2018-04-23 21:34:13.295 Detail, CAN received...triggering resend
2018-04-23 21:34:13.296 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-04-23 21:34:13.296 Always, 
2018-04-23 21:34:13.296 Always, Dumping queued log messages
2018-04-23 21:34:13.296 Always, 
2018-04-23 21:34:13.296 Always, 
2018-04-23 21:34:13.296 Always, End of queued log message dump
2018-04-23 21:34:13.296 Always, 
2018-04-23 21:34:13.296 Detail, WriteMsg CAN m_currentMsg=00000000
2018-04-23 21:34:13.296 Detail, CAN received...triggering resend
2018-04-23 21:34:13.296 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-04-23 21:34:13.297 Always, 
2018-04-23 21:34:13.297 Always, Dumping queued log messages
2018-04-23 21:34:13.297 Always, 
2018-04-23 21:34:13.297 Always, 
2018-04-23 21:34:13.297 Always, End of queued log message dump
2018-04-23 21:34:13.297 Always, 
2018-04-23 21:34:13.297 Detail, WriteMsg CAN m_currentMsg=00000000
2018-04-23 21:34:13.297 Detail, CAN received...triggering resend
2018-04-23 21:34:13.298 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-04-23 21:34:13.298 Always, 
2018-04-23 21:34:13.298 Always, Dumping queued log messages
2018-04-23 21:34:13.298 Always, 
2018-04-23 21:34:13.298 Always, 
2018-04-23 21:34:13.298 Always, End of queued log message dump
2018-04-23 21:34:13.298 Always, 
2018-04-23 21:34:13.298 Detail, WriteMsg CAN m_currentMsg=00000000
2018-04-23 21:34:13.298 Detail, CAN received...triggering resend
2018-04-23 21:34:13.298 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-04-23 21:34:13.298 Always, 
2018-04-23 21:34:13.298 Always, Dumping queued log messages
2018-04-23 21:34:13.299 Always, 
2018-04-23 21:34:13.299 Always, 
2018-04-23 21:34:13.299 Always, End of queued log message dump
2018-04-23 21:34:13.299 Always, 
2018-04-23 21:34:13.299 Detail, WriteMsg CAN m_currentMsg=00000000
2018-04-23 21:34:13.299 Detail, Node041,   Received: 0x01, 0x09, 0x00, 0x04, 0x00, 0x29, 0x03, 0x20, 0x01, 0xff, 0x06
2018-04-23 21:34:13.299 Detail, 
2018-04-23 21:34:13.394 Detail, Node052,   Received: 0x01, 0x11, 0x00, 0x04, 0x00, 0x34, 0x0b, 0x71, 0x05, 0x07, 0x00, 0x00, 0xff, 0x07, 0x00, 0x01, 0x08, 0x00, 0x57
2018-04-23 21:34:13.394 Detail, 
2018-04-23 21:39:12.344 Detail, Node052,   Received: 0x01, 0x11, 0x00, 0x04, 0x00, 0x34, 0x0b, 0x71, 0x05, 0x07, 0x00, 0x00, 0xff, 0x07, 0x00, 0x01, 0x08, 0x00, 0x57
2018-04-23 21:39:12.345 Detail, 
2018-04-23 21:49:34.408 Detail,   Received: 0x01, 0x07, 0x00, 0x13, 0x36, 0x01, 0xae, 0x97, 0xe5
2018-04-23 21:49:34.408 Detail,   ZW_SEND_DATA Request with callback ID 0x36 received (expected 0x00)
2018-04-23 21:49:34.408 Warning, WARNING: Unexpected Callback ID received
2018-04-23 21:49:35.604 Detail,   Received: 0x01, 0x07, 0x00, 0x13, 0x36, 0x01, 0xaf, 0x0f, 0x7c
2018-04-23 21:49:35.604 Detail,   ZW_SEND_DATA Request with callback ID 0x36 received (expected 0x00)
2018-04-23 21:49:35.605 Warning, WARNING: Unexpected Callback ID received
2018-04-23 21:49:46.312 Detail, Node052,   Received: 0x01, 0x09, 0x00, 0x04, 0x00, 0x34, 0x03, 0x20, 0x01, 0xff, 0x1b
2018-04-23 21:49:46.312 Detail, 
2018-04-23 21:49:46.497 Detail, Node052,   Received: 0x01, 0x10, 0x00, 0x04, 0x00, 0x34, 0x0a, 0x71, 0x05, 0x07, 0xff, 0x00, 0xff, 0x07, 0x08, 0x00, 0x00, 0xa9
2018-04-23 21:49:46.498 Detail, 
2018-04-23 21:52:34.804 Detail, Node017,   Received: 0x01, 0x08, 0x00, 0x04, 0x00, 0x11, 0x02, 0x84, 0x07, 0x63
2018-04-23 21:52:34.804 Detail, 
2018-04-23 21:53:03.339 Detail, Node052,   Received: 0x01, 0x09, 0x00, 0x04, 0x00, 0x34, 0x03, 0x20, 0x01, 0x00, 0xe4
2018-04-23 21:53:03.340 Detail, 
2018-04-23 21:53:03.525 Detail, Node052,   Received: 0x01, 0x11, 0x00, 0x04, 0x00, 0x34, 0x0b, 0x71, 0x05, 0x07, 0x00, 0x00, 0xff, 0x07, 0x00, 0x01, 0x08, 0x00, 0x57
2018-04-23 21:53:03.526 Detail,

Check that your controller is really /dev/ttyACM0

Yes and no, I think this was an example of the stick not turning on after a soft reboot. (no disco ball or any lights at all).

This log is probably a better example. This is a hard reboot with the zwcfg file kept in tact from previous fresh boot with no cfg.

http://openzwave.net/log-analyzer/public/323cec90-5966-458f-ad16-2157f50c0873/

The disco ball lights get going, the GUI shows Initializing (CacheLoad) for everything and then about 2-5 minutes later the lights freeze on the stick. the UI doesn’t update. some 20-30min later zwave disapears from the GUI.

Of note, during this state if I press the reset button the stick it will start up, the devices turn to Ready, but I can’t control them and they don’t show the correct states. I think when I do the reset the stick moves to a different USB path.

This might help

I’ve been able to narrow the scope of the issue a little bit. Basically z-wave dies shortly after boot if there is a zwcf file present. If there is no file, it will typically boot okay, but it is re-finding all the devices and I can’t customize/edit or move forward because on the next boot s wave will crash.

Here is a log of what happens when I boot with a zwcfg file present.

2018-05-02 15:35:51.704 Always, OpenZwave Version 1.4.2926 Starting Up
2018-05-02 15:36:12.006 Info, Setting Up Provided Network Key for Secure Communications
2018-05-02 15:36:12.007 Info, mgr,     Added driver for controller /dev/ttyACM0
2018-05-02 15:36:12.007 Info,   Opening controller /dev/ttyACM0
2018-05-02 15:36:12.007 Info, Trying to open serial port /dev/ttyACM0 (attempt 1)
2018-05-02 15:36:12.008 Info, Serial port /dev/ttyACM0 opened (attempt 1)
2018-05-02 15:36:12.011 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-05-02 15:36:12.011 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_MEMORY_GET_ID: 0x01, 0x03, 0x00, 0x20, 0xdc
2018-05-02 15:36:12.011 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_CONTROLLER_CAPABILITIES: 0x01, 0x03, 0x00, 0x05, 0xf9
2018-05-02 15:36:12.011 Detail, contrlr, Queuing (Command) FUNC_ID_SERIAL_API_GET_CAPABILITIES: 0x01, 0x03, 0x00, 0x07, 0xfb
2018-05-02 15:36:12.012 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_SUC_NODE_ID: 0x01, 0x03, 0x00, 0x56, 0xaa
2018-05-02 15:36:12.012 Detail, 
2018-05-02 15:36:12.012 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x15) - FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-05-02 15:36:13.012 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-05-02 15:36:13.012 Detail, contrlr, Removing current message
2018-05-02 15:36:13.012 Detail, contrlr, Notification: Notification - TimeOut
2018-05-02 15:36:13.015 Detail, 
2018-05-02 15:36:13.015 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x20) - FUNC_ID_ZW_MEMORY_GET_ID: 0x01, 0x03, 0x00, 0x20, 0xdc
2018-05-02 15:36:14.015 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-05-02 15:36:14.016 Detail, contrlr, Removing current message
2018-05-02 15:36:14.016 Detail, contrlr, Notification: Notification - TimeOut
2018-05-02 15:36:14.019 Detail, 
2018-05-02 15:36:14.019 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x05) - FUNC_ID_ZW_GET_CONTROLLER_CAPABILITIES: 0x01, 0x03, 0x00, 0x05, 0xf9
2018-05-02 15:36:15.019 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-05-02 15:36:15.020 Detail, contrlr, Removing current message
2018-05-02 15:36:15.028 Detail, contrlr, Notification: Notification - TimeOut
2018-05-02 15:36:15.028 Detail, 
2018-05-02 15:36:15.029 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x07) - FUNC_ID_SERIAL_API_GET_CAPABILITIES: 0x01, 0x03, 0x00, 0x07, 0xfb
2018-05-02 15:36:16.029 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-05-02 15:36:16.029 Detail, contrlr, Removing current message
2018-05-02 15:36:16.029 Detail, contrlr, Notification: Notification - TimeOut
2018-05-02 15:36:16.030 Detail, 
2018-05-02 15:36:16.030 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x56) - FUNC_ID_ZW_GET_SUC_NODE_ID: 0x01, 0x03, 0x00, 0x56, 0xaa
2018-05-02 15:36:17.031 Error, contrlr, ERROR: Dropping command, expected response not received after 1 attempt(s)
2018-05-02 15:36:17.031 Detail, contrlr, Removing current message
2018-05-02 15:36:17.031 Detail, contrlr, Notification: Notification - TimeOut
2018-05-02 15:44:50.623 Detail,   Received: 0x01, 0x04, 0x01, 0x13, 0x00, 0xe9
2018-05-02 15:44:50.623 Error, ERROR: ZW_SEND_DATA could not be delivered to Z-Wave stack
2018-05-02 15:44:50.630 Detail, CAN received...triggering resend
2018-05-02 15:44:50.631 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-05-02 15:44:50.631 Always, 
2018-05-02 15:44:50.631 Always, Dumping queued log messages
2018-05-02 15:44:50.631 Always, 
2018-05-02 15:44:50.631 Always, 
2018-05-02 15:44:50.631 Always, End of queued log message dump
2018-05-02 15:44:50.631 Always, 
2018-05-02 15:44:50.631 Detail, WriteMsg CAN m_currentMsg=00000000
2018-05-02 15:44:50.631 Detail, CAN received...triggering resend
2018-05-02 15:44:50.631 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-05-02 15:44:50.631 Always, 
2018-05-02 15:44:50.632 Always, Dumping queued log messages
2018-05-02 15:44:50.632 Always, 
2018-05-02 15:44:50.632 Always, 
2018-05-02 15:44:50.632 Always, End of queued log message dump
2018-05-02 15:44:50.632 Always, 
2018-05-02 15:44:50.632 Detail, WriteMsg CAN m_currentMsg=00000000
2018-05-02 15:44:50.632 Detail, CAN received...triggering resend
2018-05-02 15:44:50.632 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-05-02 15:44:50.632 Always, 
2018-05-02 15:44:50.632 Always, Dumping queued log messages
2018-05-02 15:44:50.632 Always, 
2018-05-02 15:44:50.633 Always, 
2018-05-02 15:44:50.633 Always, End of queued log message dump
2018-05-02 15:44:50.633 Always, 
2018-05-02 15:44:50.633 Detail, WriteMsg CAN m_currentMsg=00000000
2018-05-02 15:44:50.633 Detail, CAN received...triggering resend
2018-05-02 15:44:50.633 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-05-02 15:44:50.633 Always, 
2018-05-02 15:44:50.633 Always, Dumping queued log messages
2018-05-02 15:44:50.633 Always, 
2018-05-02 15:44:50.633 Always, 
2018-05-02 15:44:50.633 Always, End of queued log message dump
2018-05-02 15:44:50.633 Always, 
2018-05-02 15:44:50.633 Detail, WriteMsg CAN m_currentMsg=00000000
2018-05-02 15:44:50.634 Detail, CAN received...triggering resend
2018-05-02 15:44:50.634 Warning, m_currentMsg was NULL when trying to set MaxSendAttempts
2018-05-02 15:44:50.634 Always, 
2018-05-02 15:44:50.634 Always, Dumping queued log messages
2018-05-02 15:44:50.634 Always, 
2018-05-02 15:44:50.634 Always, 
2018-05-02 15:44:50.634 Always, End of queued log message dump
2018-05-02 15:44:50.634 Always, 
2018-05-02 15:44:50.634 Detail, WriteMsg CAN m_currentMsg=00000000
2018-05-02 15:44:50.634 Detail, Node005,   Received: 0x01, 0x09, 0x00, 0x04, 0x00, 0x05, 0x03, 0x20, 0x01, 0xff, 0x2a
2018-05-02 15:44:50.635 Detail, 
2018-05-02 15:44:50.636 Detail,   Received: 0x01, 0x07, 0x00, 0x13, 0x33, 0x01, 0x66, 0x7f, 0xc0
2018-05-02 15:44:50.636 Detail,   ZW_SEND_DATA Request with callback ID 0x33 received (expected 0x00)
2018-05-02 15:44:50.636 Warning, WARNING: Unexpected Callback ID received
2018-05-02 15:44:50.637 Detail, Node005,   Received: 0x01, 0x08, 0x00, 0x04, 0x00, 0x05, 0x02, 0x84, 0x07, 0x77
2018-05-02 15:44:50.637 Detail, 
2018-05-02 15:44:54.752 Detail,   Received: 0x01, 0x07, 0x00, 0x13, 0x33, 0x01, 0x68, 0x1c, 0xad
2018-05-02 15:44:54.752 Detail,   ZW_SEND_DATA Request with callback ID 0x33 received (expected 0x00)
2018-05-02 15:44:54.752 Warning, WARNING: Unexpected Callback ID received
2018-05-02 15:44:56.806 Detail, Node005,   Received: 0x01, 0x08, 0x00, 0x04, 0x00, 0x05, 0x02, 0x84, 0x07, 0x77
2018-05-02 15:44:56.806 Detail, 
2018-05-02 15:46:44.841 Detail, Node069,   Received: 0x01, 0x08, 0x00, 0x04, 0x00, 0x45, 0x02, 0x84, 0x07, 0x37

Is it a bad Aeon stick? any ideas? I’m in hell right not basically none of my 95 devices can be controlled.

And yes the stick is 100% at /dev/ttyACM0 when this is happening. The LED is however freezing on whatever color it is on in the disco ball status.

I know it was long time ago but what was the root cause of the issue? I’m facing something similar now, sensor entities are being updated but I can’t control any device anymore…

First thing to try is always removing and re-plugging the stick, I find the z-stick in particular has a habit of getting into dodgy contact, probably because it’s so large and relatively heavy for a USB device.

Bear in mind that depending on your setup, doing this may change the path of your device from /dev/tty/ACM0 to something else.

This is why it’s recommended to use the /dev/serial/by-id/.... path.

I’ve tried that right after just restarting HA and the whole hardware. No luck. It is really strange as some devices work ok (e.g. one smart plug) and in general the sensors are properly updated (with some exceptions and with some delays).

Check your zwavejs logs, crank em up to debug to see what’s happening. Then post your logs.

I’m still using the ozw integration but actually this probably will make me migrate to zwavejs2mqtt. And will see if my network works on it…

I don’t recall what the solution was. I am on the built in zwave js now and I highly recommend it, fixed every complaint I ever had with zwave.

Migration is not very smooth. Some devices were difficult to interview and I have a couple of switch modules still to be reset probably - simple power cycle does not help and it ends up with the following error in the log:

ERROR ZWAVE: Interview of node 24 has failed: Maximum interview attempts reached

Looks like RF issues.

What’s the best way to solve it? Seems that nodes that were not able to connect directly were not interviewed by ZwaveJS properly until I brought them close to the zwave stick. The ones pending now are modules mounted in some switch boxes outside the house so it is more difficult to bring them in, close to the stick… Is there any other way to make them interviewed with a hop through a zwave switch plugged in the middle?

Incrementally heal nodes is your only option unless you have a zniffer.

I managed to connect most of the nodes now but so far they all required bringing them close to the coordinator. With plugs or battery sensors this was not so painful but I still have a couple of modules installed in switch boxes, is there any way to force them to connect via a router (zwave plug)? Or my only chance is to remove them from the switch boxes and interview close to the coordinator?

Technically you should only bring old non-S2 locks to the stick to pair them, most things supporting S2 should be able to be included in place.

Heal the nodes you’ve moved around, once that’s done try your switches and the like.

Healing didn’t solve it. The last two modules were qubino flush 2 relays but they required factory reset and inclusion from scratch. Now the network seems to work. A bit strange is that all devices, except one, connect to the coordinator directly - so I don’t understand why most of them were not getting re-interviewed until I have brought them close to the coordinator. Hope this was the last problem with my zwave network :slight_smile: