Zwave Fan Control wrong states reported after control

I’ve had these GE/Jasco In-Wall 3-Speed Fan Control 55258 / ZW4002 working reliably for a year+ using the old zwave. Under the new zwave, there appear to be some status reporting issues when turning the fan off. Here’s an extract from the zwavejs log.

In the first line I turn the fan on. 4 seconds later the device responds :frowning:

The next sequence is odd (looks like a timing problem), it reports the value going to off 2ms before it logs the command, then 50ms reports it as being on. I go out and look at the fan and it is indeed of.

The third sequence is me turning the swich off again in lovelace and which point its state is updated correctly.

2022-04-04 16:14:20.832 INFO ZWAVE: Node 12: value updated: 38-0-targetValue 0 => 255
2022-04-04 16:14:24.844 INFO ZWAVE: Node 12: value updated: 38-0-currentValue 0 => 6

2022-04-04 16:14:43.145 INFO ZWAVE: Node 12: value updated: 38-0-currentValue 6 => 0
2022-04-04 16:14:43.147 INFO ZWAVE: Node 12: value updated: 38-0-targetValue 255 => 0
2022-04-04 16:14:43.197 INFO ZWAVE: Node 12: value updated: 38-0-currentValue 0 => 6

2022-04-04 16:15:57.519 INFO ZWAVE: Node 12: value updated: 38-0-currentValue 6 => 0
2022-04-04 16:15:57.520 INFO ZWAVE: Node 12: value updated: 38-0-targetValue 0 => 0
2022-04-04 16:15:57.534 INFO ZWAVE: Node 12: value updated: 38-0-currentValue 0 => 0

Turn switch on OWZ_Log.txt. - First, the response comes back 15ms later, then another 4 seconds later. It seems like zwave_js is missing that first response.

2022-04-04 16:36:23.773 Info, Node012, Value::Set - COMMAND_CLASS_SWITCH_MULTILEVEL - Level - 0 - 1 - 255
2022-04-04 16:36:23.773 Info, Node012, SwitchMultilevel::Set - Setting to level 255
2022-04-04 16:36:23.774 Info, Node012, Sending (Send) message (Callback ID=0x85, Expected Reply=0x13) - SwitchMultilevelCmd_Set (Node=12): 0x01, 0x0a, 0x00, 0x13, 0x0c, 0x03, 0x26, 0x01, 0xff, 0x25, 0x85, 0x91
2022-04-04 16:36:23.798 Info, Node012, Request RTT 24 Average Request RTT 24
2022-04-04 16:36:23.798 Info, Node012, Sending (Send) message (Callback ID=0x86, Expected Reply=0x04) - SwitchMultilevelCmd_Get (Node=12): 0x01, 0x09, 0x00, 0x13, 0x0c, 0x02, 0x26, 0x02, 0x25, 0x86, 0x6c
2022-04-04 16:36:23.823 Info, Node012, Request RTT 24 Average Request RTT 24
2022-04-04 16:36:23.838 Info, Node012, Response RTT 40 Average Response RTT 39
2022-04-04 16:36:23.838 Info, Node012, Received SwitchMultiLevel report: level=6
2022-04-04 16:36:27.816 Info, Node012, Received SwitchMultiLevel report: level=6

Turns off the switch. It does send a 6 back first (.645), few ms later it sends to zero (675).

2022-04-04 16:41:01.596 Info, Node012, Value::Set - COMMAND_CLASS_SWITCH_MULTILEVEL - Level - 0 - 1 - 0
2022-04-04 16:41:01.596 Info, Node012, SwitchMultilevel::Set - Setting to level 0
2022-04-04 16:41:01.597 Info, Node012, Sending (Send) message (Callback ID=0x9f, Expected Reply=0x13) - SwitchMultilevelCmd_Set (Node=12): 0x01, 0x0a, 0x00, 0x13, 0x0c, 0x03, 0x26, 0x01, 0x00, 0x25, 0x9f, 0x74
2022-04-04 16:41:01.622 Info, Node012, Request RTT 25 Average Request RTT 24
2022-04-04 16:41:01.622 Info, Node012, Sending (Send) message (Callback ID=0xa0, Expected Reply=0x04) - SwitchMultilevelCmd_Get (Node=12): 0x01, 0x09, 0x00, 0x13, 0x0c, 0x02, 0x26, 0x02, 0x25, 0xa0, 0x4a
2022-04-04 16:41:01.645 Info, Node012, Response RTT 22 Average Response RTT 30
2022-04-04 16:41:01.645 Info, Node012, Received SwitchMultiLevel report: level=6
2022-04-04 16:41:01.659 Info, Node012, Request RTT 36 Average Request RTT 30
2022-04-04 16:41:01.675 Info, Node012, Response RTT 52 Average Response RTT 41
2022-04-04 16:41:01.675 Info, Node012, Received SwitchMultiLevel report: level=0
2022-04-04 16:41:02.598 Info, Node012, Sending (Send) message (Callback ID=0xa1, Expected Reply=0x04) - SwitchMultilevelCmd_Get (Node=12): 0x01, 0x09, 0x00, 0x13, 0x0c, 0x02, 0x26, 0x02, 0x25, 0xa1, 0x4b
2022-04-04 16:41:02.623 Info, Node012, Request RTT 24 Average Request RTT 27
2022-04-04 16:41:02.638 Info, Node012, Response RTT 40 Average Response RTT 40
2022-04-04 16:41:02.638 Info, Node012, Received SwitchMultiLevel report: level=0
2022-04-04 16:41:03.598 Info, Node012, Sending (Send) message (Callback ID=0xa2, Expected Reply=0x04) - SwitchMultilevelCmd_Get (Node=12): 0x01, 0x09, 0x00, 0x13, 0x0c, 0x02, 0x26, 0x02, 0x25, 0xa2, 0x48
2022-04-04 16:41:03.623 Info, Node012, Request RTT 24 Average Request RTT 25
2022-04-04 16:41:03.638 Info, Node012, Response RTT 40 Average Response RTT 40
2022-04-04 16:41:03.638 Info, Node012, Received SwitchMultiLevel report: level=0

I think there are several issues here:
a) zwave_js is missing messages
b) zwave_js has an ordering issue processing in correct order.

That’s not a node-zwave-js log. This is more useful: https://zwave-js.github.io/zwavejs2mqtt/#/troubleshooting/generating-logs?id=driver-logs

Looks like your device is reporting bad values. OZW is refresh-happy (one if its biggest problems), so it just happens to get a more up-to-date value.

You’re not the first to report this problem with that switch model.

If you look at your driver logs, you’ll probably see the same thing: the device is reporting the wrong value unsolicited and that cancels the polling that was scheduled by zwave-js. OZW does not have such capability and refreshes anyways. OZW is just hiding bad device behavior. Let us know what you see.

1 Like

Thanks for the link. I’ll get the right debug logs, I was just tailing the log I saw.

Looks like I didn’t have it configured right for logging. I’ll rerun the test tomorrow as I switched the system back to the old driver for now.

The art of device integration is working with whatever is out there. In a past life I did industrial device software integration - very few things hit the spec 100% of the time. Always easier to tweak the SW than tell the customer to replace their device……

On the first issue - “Turning fan off in HA results in Fan turning off, but HA shows on” - is occurring because of the Scheduled Poll was cancelled. I have not looked at the second issue of the 4 second delay.

2022-04-05T19:02:13.964Z DRIVER » [Node 012] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      46
                                  └─[MultilevelSwitchCCSet]
                                      target value: 0
2022-04-05T19:02:13.966Z SERIAL « [ACK]                                                                   (0x06)
2022-04-05T19:02:13.972Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2022-04-05T19:02:13.973Z SERIAL » [ACK]                                                                   (0x06)
2022-04-05T19:02:13.973Z DRIVER « [RES] [SendData]
                                    was sent: true
2022-04-05T19:02:14.018Z SERIAL « 0x010700132e000005c0                                                 (9 bytes)
2022-04-05T19:02:14.018Z SERIAL » [ACK]                                                                   (0x06)
2022-04-05T19:02:14.018Z DRIVER « [REQ] [SendData]
                                    callback id:     46
                                    transmit status: OK
2022-04-05T19:02:14.023Z CNTRLR   [Node 012] [~] [Multilevel Switch] currentValue: 6 => 0           [Endpoint 0]
2022-04-05T19:02:14.098Z SERIAL « 0x01090004000c03260306de                                            (11 bytes)
2022-04-05T19:02:14.098Z CNTRLR   [Node 012] [~] [Multilevel Switch] currentValue: 0 => 6           [Endpoint 0]
2022-04-05T19:02:14.099Z CNTRLR   [Node 012] Scheduled poll canceled because value was updated
2022-04-05T19:02:14.099Z SERIAL » [ACK]                                                                   (0x06)
2022-04-05T19:02:14.100Z DRIVER « [Node 012] [REQ] [ApplicationCommand]
                                  └─[MultilevelSwitchCCReport]
                                      current value: 6
2022-04-05T19:02:17.426Z SERIAL « 0x010c0004000f063105012200a74e                                      (14 bytes)

I’d also agree that the device may not be doing the right thing. But since it is “Zwave Certified”, the certification test must not test for this :-(.

Anyways, the solve would be to get a change in this method is zwave_js. To allow this “optimization” to be disabled globally in zwave_js or to do it on a per device basis. This would be a good change, since it would allow a large class of installed in-wall devices to work.

	public cancelScheduledPoll(valueId: ValueID): boolean {
		// Avoid false positives or false negatives due to a mis-formatted value ID
		valueId = normalizeValueID(valueId);

		if (this.scheduledPolls.has(valueId)) {
			clearTimeout(this.scheduledPolls.get(valueId)!);
			this.scheduledPolls.delete(valueId);
			return true;
		}
		return false;
	}

The basic of what we need is to poll the device in less than <1 second after the command is sent. Right now the zwave_js “cancel poll” optimization is cancelling the poll that would resolve the issue. Here are some proposals for fixes that I’d like feedback on to see what is most practical and acceptable.

First, the general solve may to be have a set of configurable “device behaviors”, that can be enabled on a per device basis via config. Reading through lots of issues, it is clear the zwave devices do not all behave the same. So far, I’ve identified two behaviors that would be helpful. I’m sure there are more:

  • Device Poll at network startup. This was the prior behavior. Zwavejs does not do this for some good reasons. But if you need it for a device, or the zwavejs reasons aren’t important to you. Then enable this.
  • Device Poll after command, This was the prior behavior. Zwavejs optimized this out. Again enable it if you need it, or don’t. Specify the delay in MS and the number of polls.

In other words, via device Config the comms tweak needed to work more like the deprecated driver can be enabled and for migration use cases perhaps those are enabled by default?

Next question is where to implement.

  1. Zwave_js2mqtt this would be the best spot, as it would be server Config.
  2. Zwave_js integration - this may be the most practical as it’s our code
  3. Zwave-js custom wrapper integration - this would wrap and zwave_js entity (like a switch) and implement the behaviors. Less ideal as it’s outside of HA and I’d be doing all the maintenance.
  4. Automations - the behaviors identified could be done in automations. Disadvantage is every user is on their own.

Thoughts?

I was able to right an automation to work around the issue. It’s definitely not ideal but removes this as a blocker for migration.

When commanded to off, the switch seems to first reflect the requested state back (a 0), then sends the current state (a 6 for my fan level), the 6 then cancels the zwave_js async poll and then the fan turns off and if it is polled again it will return a 0.

So this automation looks for a change of state, polls it, waits a second, and for good measure polls it again.

- id: "loft_fan_2 refresh switch state"
  alias: loft_fan_2 refresh switch state
  description: "Automation to refresh switch after state changes usually caused by switch control"
  mode: queued
  trigger:
    - platform: state
      entity_id: fan.loft_fan_2_level
  condition: "{{ trigger.to_state.state != trigger.from_state.state }}"
  action:
    - service: system_log.write
      data:
        level: info
        message: 'Polling fan.loft_fan_2_level state change from {{ trigger.from_state.state }} to {{ trigger.to_state.state }}'
        logger: hvac
    - service: zwave_js.refresh_value
      data:
        entity_id: fan.loft_fan_2_level
    - delay: "00:00:01"
    - service: zwave_js.refresh_value
      data:
        entity_id: fan.loft_fan_2_level

Zwavejs team is looking at adding a device compat flag to allow this issue to be fixed.