SOLVED - Upgraded to 0.108.3. Zwave is Borked

Argh!

Had my Zwave rock solid, upgraded to 0.108.3 and now my Zwave config is all messed up.

Running Home Assistant via Docker with an Aeotec stick. Upon querying the Zwave log, it’s filled with the messages below. I downgraded to 0.107.3 and the issue remains.

I’ve read in other posts with similar error messages that this is often happened when another process is also querying the Zwave stick, but I’ve not installed anything that access it and I don’t think that I have multiple instances of Home-Assistant running.

I did recently install a Sonoff Tasmota switch, but that’s via MQTT and I wouldn’t think that would be the issue.

Any ideas? I’m at a complete loss.

2020-04-12 19:16:43.169 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:43.169 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:43.169 Warning, WARNING: Out of frame flow! (0x73).  Sending NAK.
2020-04-12 19:16:43.170 Warning, WARNING: Out of frame flow! (0x00).  Sending NAK.
2020-04-12 19:16:53.021 Error, Node045, ERROR: Dropping command, expected response not received after 1 attempt(s)
2020-04-12 19:16:53.021 Detail, Node045, Removing current message
2020-04-12 19:16:53.021 Detail, Node045, Notification: Notification - TimeOut
2020-04-12 19:16:53.023 Detail, Node045, Query Stage Complete (Session)
2020-04-12 19:16:53.023 Detail, Node045, AdvanceQueries queryPending=0 queryRetries=0 queryStage=Dynamic live=1
2020-04-12 19:16:53.023 Detail, Node045, QueryStage_Dynamic
2020-04-12 19:16:53.023 Detail, Node045, Queuing (Send) SwitchBinaryCmd_Get (Node=45): 0x01, 0x09, 0x00, 0x13, 0x2d, 0x02, 0x25, 0x02, 0x25, 0xea, 0x22
2020-04-12 19:16:53.023 Info, Node045, CentralScene RequestState: 4
2020-04-12 19:16:53.023 Info, Node045, CentralScene: Not a StaticRequest
2020-04-12 19:16:53.023 Detail, Node045, Queuing (Query) Query Stage Complete (Dynamic)
2020-04-12 19:16:53.023 Detail,
2020-04-12 19:16:53.023 Info, Node045, Sending (Send) message (Callback ID=0xea, Expected Reply=0x04) - SwitchBinaryCmd_Get (Node=45): 0x01, 0x09, 0x00, 0x13, 0x2d, 0x02, 0x25, 0x02, 0x25, 0xea, 0x22
2020-04-12 19:16:53.531 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2020-04-12 19:16:53.531 Warning, WARNING: Out of frame flow! (0xe8).  Sending NAK.
2020-04-12 19:16:53.531 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:53.531 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:53.532 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:54.032 Warning, WARNING: 500ms passed without reading the rest of the frame...aborting frame read
2020-04-12 19:16:54.032 Detail, Node045,   Received: 0x01, 0x01, 0x01
2020-04-12 19:16:54.032 Warning, Node045, WARNING: Checksum incorrect - sending NAK
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x14).  Sending NAK.
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x0e).  Sending NAK.
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x32).  Sending NAK.
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x00).  Sending NAK.
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x7c).  Sending NAK.
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x00).  Sending NAK.
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x00).  Sending NAK.
2020-04-12 19:16:56.815 Warning, WARNING: Out of frame flow! (0x05).  Sending NAK.
2020-04-12 19:16:56.816 Warning, WARNING: Out of frame flow! (0x4e).  Sending NAK.
2020-04-12 19:16:56.816 Warning, WARNING: Out of frame flow! (0x14).  Sending NAK.
2020-04-12 19:16:56.816 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:56.816 Warning, WARNING: Out of frame flow! (0x00).  Sending NAK.
2020-04-12 19:16:56.816 Warning, WARNING: Out of frame flow! (0x0e).  Sending NAK.
2020-04-12 19:16:56.816 Warning, WARNING: Out of frame flow! (0x32).  Sending NAK.
2020-04-12 19:16:56.817 Warning, WARNING: Out of frame flow! (0x21).  Sending NAK.
2020-04-12 19:16:56.817 Warning, WARNING: Out of frame flow! (0x05).  Sending NAK.
2020-04-12 19:16:56.817 Warning, WARNING: Out of frame flow! (0x56).  Sending NAK.
2020-04-12 19:16:56.817 Warning, WARNING: Out of frame flow! (0xd4).  Sending NAK.
2020-04-12 19:16:56.817 Warning, WARNING: Out of frame flow! (0x05).  Sending NAK.
2020-04-12 19:16:56.817 Warning, WARNING: Out of frame flow! (0x4e).  Sending NAK.
2020-04-12 19:16:56.818 Warning, WARNING: Out of frame flow! (0x14).  Sending NAK.
2020-04-12 19:16:56.818 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:56.818 Warning, WARNING: Out of frame flow! (0x0e).  Sending NAK.
2020-04-12 19:16:56.818 Warning, WARNING: Out of frame flow! (0x21).  Sending NAK.
2020-04-12 19:16:56.818 Warning, WARNING: Out of frame flow! (0x05).  Sending NAK.
2020-04-12 19:16:56.818 Warning, WARNING: Out of frame flow! (0x56).  Sending NAK.
2020-04-12 19:16:56.819 Warning, WARNING: Out of frame flow! (0xd4).  Sending NAK.
2020-04-12 19:16:56.819 Warning, WARNING: Out of frame flow! (0x4e).  Sending NAK.
2020-04-12 19:16:56.819 Warning, WARNING: Out of frame flow! (0x14).  Sending NAK.
2020-04-12 19:16:56.819 Warning, WARNING: Out of frame flow! (0x04).  Sending NAK.
2020-04-12 19:16:56.820 Warning, WARNING: Out of frame flow! (0x07).  Sending NAK.
2020-04-12 19:16:56.820 Warning, WARNING: Out of frame flow! (0x32).  Sending NAK.
2020-04-12 19:16:56.820 Warning, WARNING: Out of frame flow! (0x02).  Sending NAK.
2020-04-12 19:16:56.820 Warning, WARNING: Out of frame flow! (0x00).  Sending NAK.
2020-04-12 19:16:56.820 Warning, WARNING: Out of frame flow! (0x56).  Sending NAK.
2020-04-12 19:16:56.820 Warning, WARNING: Out of frame flow! (0x00).  Sending NAK.
2020-04-12 19:16:56.820 Warning, WARNING: Out of frame flow! (0x7c).  Sending NAK.
2020-04-12 19:16:56.820 Warning, WARNING: Out of frame flow! (0x4e).  Sending NAK.
2020-04-12 19:16:56.905 Detail, Node045,   Received: 0x01, 0x10, 0x0a, 0x32, 0x02, 0x00, 0x02, 0x00, 0x00, 0x00, 0x01, 0x10, 0x00, 0x04, 0x00, 0x07, 0x02, 0x21
2020-04-12 19:16:56.905 Warning, Node045, WARNING: Checksum incorrect - sending NAK
2020-04-12 19:16:56.908 Detail, Node033,   Received: 0x01, 0x10, 0x00, 0x04, 0x07, 0x21, 0x74, 0x00, 0x02, 0x00, 0x00, 0x00, 0x01, 0x04, 0x00, 0x0a, 0x02, 0x21
2020-04-12 19:16:56.908 Warning, Node033, WARNING: Checksum incorrect - sending NAK
2020-04-12 19:16:56.958 Warning, WARNING: 50ms passed without finding the length byte...aborting frame read

I tried to upgrade and hour or so ago, and had the EXACT same issue. I also have the Aeotec stick. No matter what I tried, I could not get the ZWave Network to start with .108.

I downgraded to .107, and yes it was also still broken for me at first. I shut down my docker container, restored my /config folder from this morning’s backup (whew), restarted, and now all is well. So whatever it did, it also must have borked the zwave config files.

Hopefully you have a good backup, that was my last resort. But also, I think it proves there was no other process, otherwise I’d have the same issue on .107.

I did an upgrade to 0.180.3 from 0.179.5 (I think) any my Z-Wave network with Aeotec USB stick came back up without issue. I’m running the Home Assistant Core docker container on Ubuntu 18.04 with Docker. The errors being reported seem to indicate some sort of physical layer/communications problem, maybe? I wonder if the USB device name for the stick changed when you upgraded? Just a guess.

no no no!

By backups are borked too, but that’s a completely unrelated issue.

I’d go with the USB device name changing too, but here’s the weird thing:

My devices still respond, albeit slowly, but the devices don’t show as updated within the UI.

Argh.

Out of frame flow! usually means there are more than 1 processes accessing the serial port at the same time. Make sure you don’t have ModemManager installed.

Also make sure your Docker container is not overriding the CMD. It should look like this:

$ docker inspect container_name --format "{{ .Config.Cmd }}"
[]

Otherwise you’ll have two copies of HA running.

Are you using Portainer?

If so with the 108.3 image click on duplidate/edit and then make sure the entry point is blanked out and then deploy.

Go into the console for the HA container

Type:

ps

you should only see one instance of HA running, see below for an example.

bash-5.0# ps
PID   USER     TIME  COMMAND
    1 root      0:00 s6-svscan -t0 /var/run/s6/services
   31 root      0:00 s6-supervise s6-fdholderd
  178 root      0:00 udevd --daemon
  194 root      0:00 s6-supervise home-assistant
  197 root      3h21 python3 -m homeassistant --config /config
  467 root      0:00 bash
10423 root      0:00 bash
10429 root      0:00 ps
bash-5.0#
1 Like

I don’t have Portainer or ModemManager. It’s definitely something about 108 because I was able to simply downgrade and restore and get it back up (unless something about the new docker version causes dup running). If it were some outside force or renaming, going to 107 wouldn’t have fixed it.

I can try again later, but I need to pick my spot better. At my house not having ZWave is a big disaster. :slight_smile:

I also saved my post-upgrade configs, so I’ll do a compare of those and see if anything sticks out. That could help get ltoolio running, depending…

Here is mine. So the third and fourth lines from the end show I have two copies?

Weird that it’s causing no issues for me on .107 (and 107 has been the best running version of HA I’ve had yet).

PID   USER     TIME  COMMAND                                                                                  
    1 root      0:00 s6-svscan -t0 /var/run/s6/services                                                       
   30 root      0:00 foreground  if   /etc/s6/init/init-stage2-redirfd   foreground    if     if      s6-echo 
   31 root      0:00 s6-supervise s6-fdholderd                                                                
   42 root      0:00 foreground  s6-setsid  -gq  --  with-contenv  backtick  -D  0  -n  S6_LOGGING   printcon 
  178 root      0:00 udevd --daemon                                                                           
  217 root      0:00 s6-supervise home-assistant                                                              
  219 root      1h17 /usr/local/bin/python3 /usr/src/homeassistant/homeassistant/__main__.py --config /config 
  244 root      0:10 python -m homeassistant --config /config                                                 
31965 root      0:00 /bin/bash                                                                                
32266 root      0:00 ps  

Are you using Portainer?

If so with the 108.3 image click on duplicate/edit and then make sure the entry point is blanked out and then deploy.

I nearly wept when I tried this and it worked. Thank you soooooo much. I was already imagining needing to restart my zWave from scratch.

Thank you, thank you, thank you.

Yes - so you likely have a entry point that is starting it, and the new 107 and later docker contained also has it starting automatically. So you get it started twice…

No worries. I had a similar issues with the upgrade to 107 and tracked down the issue.

Just be aware portainer has a habit to add an entry point back in on upgrades so whenever you upgrade always check to ensure the entry point is blank.

Thanks. Not sure how that happened…probably I made a mistake setting up my container for zwave (since the Synology UI doesn’t support the device param).

I spawned a new container against 107, I only have one process now and things seem to be working fine. I’ll give it a few days and try upgrading again later in the week.

Anyway, thanks again.

2 Likes

Thank you for finding the root issue and posting it here! I wasted hours trying to figure this out and you just saved the day!

I forgot to post back, but I was able to upgrade without incident using the new container.

1 Like