Help - Z-wave not working despite devices report as ready - massive log file

HA version: hass.io 0.63
Hardware: Raspberry Pi3, Aeon Z-Stick Gen5

I have been using z-wave with my hass.io install for several months and also been adding more and more devices. As of lately, my z-wave network has become unstable and unreliable. I get frequent error messages, e.g. “Z-Wave not ready after 301 seconds, continuing anyway” and sometimes following a HA restart or reboot, there are no z-wave devices at all visible in the system. Usually they are back after another restart or two. Also, once initiated and all AC powered devices reporting themselves as “Ready”, many of them won’t respond to on/off commands. I thought z-wave networks were supposed to become increasingly stable with more devices added, but as I said, I have been experiencing more instability during the past few weeks.

Any ideas about what is going on?

My z-wave network map below:

Something is really fishy here. I deleted my OZW_Log.txt this morning and now about 8 hours later the new log file has grown to over 160MB!!!

The beginning of it looks like this:

2018-02-12 09:47:12.101 Always, OpenZwave Version 1.4.2926 Starting Up
2018-02-12 09:48:01.083 Info, Setting Up Provided Network Key for Secure Communications
2018-02-12 09:48:01.083 Warning, Failed - Network Key Not Set
2018-02-12 09:48:01.084 Info, mgr,     Added driver for controller /dev/ttyACM0
2018-02-12 09:48:01.084 Info,   Opening controller /dev/ttyACM0
2018-02-12 09:48:01.084 Info, Trying to open serial port /dev/ttyACM0 (attempt 1)
2018-02-12 09:48:01.085 Info, Serial port /dev/ttyACM0 opened (attempt 1)
2018-02-12 09:48:01.085 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-02-12 09:48:01.085 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_MEMORY_GET_ID: 0x01, 0x03, 0x00, 0x20, 0xdc
2018-02-12 09:48:01.085 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_CONTROLLER_CAPABILITIES: 0x01, 0x03, 0x00, 0x05, 0xf9
2018-02-12 09:48:01.085 Detail, contrlr, Queuing (Command) FUNC_ID_SERIAL_API_GET_CAPABILITIES: 0x01, 0x03, 0x00, 0x07, 0xfb
2018-02-12 09:48:01.085 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_SUC_NODE_ID: 0x01, 0x03, 0x00, 0x56, 0xaa
2018-02-12 09:48:01.086 Detail, 
2018-02-12 09:48:01.086 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x15) - FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-02-12 09:48:01.086 Detail, Unsolicited message received while waiting for ACK.
2018-02-12 09:48:01.087 Detail, Node013,   Received: 0x01, 0x15, 0x00, 0x04, 0x08, 0x0d, 0x0f, 0x32, 0x02, 0x21, 0x64, 0x00, 0x00, 0x0e, 0xfe, 0x2d, 0xe6, 0x00, 0x00, 0x00, 0x00, 0x00, 0xaa
2018-02-12 09:48:01.087 Detail, 
2018-02-12 09:48:01.088 Detail, contrlr, CAN received...triggering resend
2018-02-12 09:48:01.088 Detail, 
2018-02-12 09:48:01.088 Info, contrlr, Sending (Command) message (Attempt 2, Callback ID=0x00, Expected Reply=0x15) - FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-02-12 09:48:01.089 Detail, Unsolicited message received while waiting for ACK.
2018-02-12 09:48:01.089 Detail, Node013,   Received: 0x01, 0x15, 0x00, 0x04, 0x10, 0x0d, 0x0f, 0x32, 0x02, 0x21, 0x64, 0x00, 0x00, 0x0e, 0xfe, 0x2d, 0xeb, 0x00, 0x00, 0x00, 0x00, 0x00, 0xbf
2018-02-12 09:48:01.089 Detail, 
2018-02-12 09:48:01.091 Detail, contrlr, CAN received...triggering resend
2018-02-12 09:48:01.091 Detail, 
2018-02-12 09:48:01.091 Info, contrlr, Sending (Command) message (Attempt 3, Callback ID=0x00, Expected Reply=0x15) - FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-02-12 09:48:01.091 Detail, Unsolicited message received while waiting for ACK.
2018-02-12 09:48:01.094 Detail, Node026,   Received: 0x01, 0x15, 0x00, 0x04, 0x00, 0x1a, 0x0f, 0x32, 0x02, 0x21, 0x64, 0x00, 0x00, 0x20, 0xda, 0x65, 0x6a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x7b
2018-02-12 09:48:01.106 Detail, 
2018-02-12 09:48:01.106 Detail, contrlr, CAN received...triggering resend

This is my z-wave config:

zwave:
  usb_path: /dev/ttyACM0
  polling_interval: 60000

It’s saying the is no device at /dev/ttyACM0

Try: ls -al /dev/tty* the assignment may have changed.

I get this:

core-ssh:~# ls -al /dev/tty*
crw-rw-rw-    1 root     root        5,   0 Feb 12 17:53 /dev/tty

also:

core-ssh:~# lsusb
Bus 001 Device 002: ID 0424:9514
Bus 001 Device 001: ID 1d6b:0002
Bus 001 Device 003: ID 0424:ec00
Bus 001 Device 004: ID 0403:6001
Bus 001 Device 005: ID 0658:0200

When I do a dmesg I get 3 references to tty:

3f201000.uart: ttyAMA0 at MMIO 0x3f201000 (irq = 87, base_baud = 0) is a PL011 rev2
cdc_acm 1-1.5:1.0: ttyACM0: USB ACM device
usb 1-1.2: FTDI USB Serial Device converter now attached to ttyUSB0

But there is no path /dev/tty****

I can also add this to the info:

core-ssh:~# hassio host hardware
{
    "result": "ok",
    "data": {
        "serial": [
            "/dev/ttyUSB0",
            "/dev/ttyACM0",
            "/dev/ttyAMA0"
        ],
        "input": [],
        "disk": [],
        "gpio": [
            "gpiochip0",
            "gpiochip100"
        ],
        "audio": {
            "0": {
                "name": "bcm2835 - bcm2835 ALSA",
                "type": "ALSA",
                "devices": {
                    "0": "digital audio playback",
                    "1": "digital audio playback"
                }
            }
        }
    }
}

Does anybody have a clue?

Try setting it to /dev/ttyAMA0

I think I already tried that, but trying again just to make sure. Everything is just super sluggish. I ssh into my hass.io RPi, but doing a hassio ha restart takes at least 5 minutes. Just returning to the prompt after that command in hass.io takes a couple of minutes.

Update:
Z-wave devices doesn’t show and no nodes under Z-wave config. The start of the OZW_Log.txt looks like this:

2018-02-12 22:12:04.841 Always, OpenZwave Version 1.4.2926 Starting Up
2018-02-12 22:12:42.571 Info, Setting Up Provided Network Key for Secure Communications
2018-02-12 22:12:42.571 Warning, Failed - Network Key Not Set
2018-02-12 22:12:42.571 Info, mgr,     Added driver for controller /dev/ttyAMA0
2018-02-12 22:12:42.572 Info,   Opening controller /dev/ttyAMA0
2018-02-12 22:12:42.572 Info, Trying to open serial port /dev/ttyAMA0 (attempt 1)
2018-02-12 22:12:42.572 Info, Serial port /dev/ttyAMA0 opened (attempt 1)
2018-02-12 22:12:42.572 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-02-12 22:12:42.572 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_MEMORY_GET_ID: 0x01, 0x03, 0x00, 0x20, 0xdc
2018-02-12 22:12:42.572 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_CONTROLLER_CAPABILITIES: 0x01, 0x03, 0x00, 0x05, 0xf9
2018-02-12 22:12:42.572 Detail, contrlr, Queuing (Command) FUNC_ID_SERIAL_API_GET_CAPABILITIES: 0x01, 0x03, 0x00, 0x07, 0xfb
2018-02-12 22:12:42.572 Detail, contrlr, Queuing (Command) FUNC_ID_ZW_GET_SUC_NODE_ID: 0x01, 0x03, 0x00, 0x56, 0xaa
2018-02-12 22:12:42.573 Detail, 
2018-02-12 22:12:42.573 Info, contrlr, Sending (Command) message (Callback ID=0x00, Expected Reply=0x15) - FUNC_ID_ZW_GET_VERSION: 0x01, 0x03, 0x00, 0x15, 0xe9
2018-02-12 22:12:42.577 Detail, contrlr, Notification: DriverFailed

Try restarting the whole thing.

Unfortunately I lost count of how many times I have done that. I rolled back to previous state through a snapshot, while still hass.io 0.62.1 to see if the problem was somehow related to my config. Sadly, it seems there is something related to the way the Aeon Z-stick operates. I am trying to find a way to test the actual hardware with some other software.

Anyone knows what this means?

core-ssh:~# find / -name "ttyACM0"
/sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.4/1-1.4:1.0/tty/ttyACM0
/sys/class/tty/ttyACM0

There is no /dev/ttyACM0 on my system, but there are two other instances as seen above.

You restarted the host as well not just HassIO?

Yes, of course.

Just being thorough, some users are unfamiliar with docker and it’s workings :slight_smile:

Absolutely fine :-), I know what you mean. I have been googling a lot today, but not yet found any probable culprit to this problem. There seem to be others that have issues with the Z-stick and errors such as:

ERROR: Dropping command, expected response not received after 6 attempt(s)

but no solutions. I will keep looking and maybe buy another Z-stick and try. Hopefully I will not need to start from scratch with my z-wave network.

If you get a new Z-stick, you will need to pair all your devices again.
Aeotec does have backup software where you can copy the pairing information from the old Z-stick to the new one but there will be some loss.

Oh, but they do, but it is for Windows only. I used it yesterday to back up my Z-stick. You can also turn off the “disco lights” on the stick with that program. Remains to be seen how a restore works, though.

Since you are using Hass.io, I would suggest you try reinstalling to a new SD card and restore your configuration before swapping the Z-stick.

After testing virtually everything including fresh hass.io install with old z-stick, old hass.io install with brand new and empty z-stick etc. etc. I started to analyze my mesh network and the errors I got a little bit more carefully.

I found that some nodes were showing up in the log 20 000 times or more. All three of them were NodOn MSP-3-1-XX Micro Smart Plugs. I have four of them in the same room, and in two of them in the same dual power socket. I pulled those out and restarted HA, and things started to go back to normal. Most nodes started to show neighbours again and seem to react to commands.

There are, however, some strange things in my z-wave network when it comes to neighbouring. Maybe I will have to shuffle devices around a bit to make it work as good as possible. I will continue to work with this, but it is interesting that a few bad nodes can choke the whole mesh network. A bit too sensitive in my view.