Automation triggering... but not actioning

I have a Zigbee light switch and Zigbee wireless button. Both are registered with HA using the ZHA integration, and from testing both seem to be stable and responsive - I can turn the switch on and off via the Lovelace card and see events fired when the wireless button is pressed. I don’t remember seeing any misfires or missed events.

I have created a simple automation to toggle the Zigbee light when the wireless button is pressed. This works most of the time, and I’m trying to figure out why it sometimes doesn’t.

I do not think it’s a reception or connection issue, as I see the following in the logbook (I have reversed the timeline for this post):

23:28:31  Toggle Bedroom Lights has been triggered
23:28:40  Toggle Bedroom Lights has been triggered
23:28:43  Toggle Bedroom Lights has been triggered
23:28:51  Toggle Bedroom Lights has been triggered
23:28:52  Bedroom Lights turned on
23:28:54  Toggle Bedroom Lights has been triggered
23:28:59  Toggle Bedroom Lights has been triggered
23:29:02  Bedroom Lights turned off
23:29:09  Toggle Bedroom Lights has been triggered
23:29:12  Bedroom Lights turned on
23:59:56  Toggle Bedroom Lights has been triggered
23:59:58  Bedroom Lights turned off

This matches my experience - I pressed the wireless button a few times before the light turned on, then needed a couple of tries to turn it off, after which it worked as expected for the rest of the night. I suppose in theory HA could be struggling to reach the wall switch, but as I explained above I never see similar issues when I toggle it manually via the UI.

I do have a bunch of these in the logs (with various values of transid) around the same time, which may be related:

2020-08-10 23:28:41 WARNING (MainThread) [zigpy_cc.api] Waiter timeout: <Waiter matcher=<Matcher type=CommandType.AREQ subsystem=Subsystem.AF command=dataConfirm payload={'endpoint': 2, 'transid': 5}> future=<Future cancelled> timeout=10000 sequence=None>
2020-08-10 23:28:51 WARNING (MainThread) [zigpy_cc.api] Waiter timeout: <Waiter matcher=<Matcher type=CommandType.AREQ subsystem=Subsystem.AF command=dataConfirm payload={'endpoint': 2, 'transid': 7}> future=<Future cancelled> timeout=10000 sequence=None>

It may also be interesting to note that I only notice this at night (although that’s the only time I really use it so…). I have NOT tried the web UI at those times to confirm if stability changes with time of day, but my other Zigbee devices seem to work okay. The only other data point I can think of is that the wall switch is an Aqara “no neutral” switch so maybe my electricity voltage drops at night and the magic no longer works when it does.

Otherwise I’m open to hearing the collective wisdom of the forums!
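
For anyone chasing the same "Waiter timeout" warnings, here is a minimal sketch of turning up logging for the Zigbee stack in configuration.yaml. The zigpy_cc logger name is taken from the warnings above; the other two logger names and the chosen levels are assumptions about what is worth watching, not something confirmed in this thread.

```yaml
# configuration.yaml (sketch, not a confirmed fix)
logger:
  # Keep everything else at its usual verbosity.
  default: warning
  logs:
    # Radio library that emitted the "Waiter timeout" warnings.
    zigpy_cc: debug
    # Assumption: the generic Zigbee library and the ZHA integration
    # are also worth logging while reproducing the problem.
    zigpy: debug
    homeassistant.components.zha: debug
```

Debug output from these loggers is verbose, so it is probably best enabled only while reproducing a failed button press and then switched back.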

can you share your automation?

I created it in the UI so I think this is the relevant snippet from the yaml:

- id: '1596040399304'
  alias: Toggle Bedroom Lights
  description: ''
  trigger:
  - device_id: cf8f9e59403f45a6afjsiweyuwbbdb32
    domain: zha
    platform: device
    subtype: button_1
    type: remote_button_short_press
  condition: []
  action:
  - device_id: 87a9b7360a4c4fc2af17kjdhjdh777a182
    domain: switch
    entity_id: switch.bedroom_lights_on_off
    type: toggle
  mode: restart

I changed the mode to restart to see if it fixes the issue. Perhaps interestingly with the default single mode I was getting “automation not finished” warnings in the logs.

Yeah, I think that if you tried to toggle the switch too quickly it could fail if the automation was still running. Though there is no timer, so you’d need to either press the button very quickly, or more likely you have some mechanical bounce on your button which triggers the automation too many times in a short period.

A restart would delay your automation a bit by restarting it (but you only have 1 action, so it should be over rather quickly).
What I’d try instead, if the issue only occurs when pressing the physical button, is to stop the automation from running if it already ran in the past (say) 1 sec.
Replace

  condition: []

with

  condition:
    - condition: template
      value_template: '{{ (as_timestamp(now()) - as_timestamp(state_attr("automation.toggle_bedroom_lights", "last_triggered") | default(0)) | int >= 1)}}'

Agreed, the button press events may be coming multiple times per press, especially considering the “not finished” (i.e., already running) warnings when the automation was in single mode.

It’s actually a little easier now to keep the automation from firing too often. Simply change it back to single mode, then add a delay after the switch toggle:

- id: '1596040399304'
  alias: Toggle Bedroom Lights
  description: ''
  trigger:
  - device_id: cf8f9e59403f45a6afjsiweyuwbbdb32
    domain: zha
    platform: device
    subtype: button_1
    type: remote_button_short_press
  condition: []
  action:
  - device_id: 87a9b7360a4c4fc2af17kjdhjdh777a182
    domain: switch
    entity_id: switch.bedroom_lights_on_off
    type: toggle
  - delay: 3
  mode: single

You’ll still get warnings if the button press events are coming quicker than every 3 seconds (or whatever value you choose to use in the delay), but it should effectively “filter” them out, meaning it won’t toggle the switch any quicker than every 3 seconds.
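
If the remaining warnings are just noise, one option is to ask Home Assistant not to log them. This is a minimal sketch, assuming a release that supports the max_exceeded option for automations; that option is not part of the config posted in this thread, and everything else is copied from the automation above.

```yaml
- id: '1596040399304'
  alias: Toggle Bedroom Lights
  description: ''
  trigger:
  - device_id: cf8f9e59403f45a6afjsiweyuwbbdb32
    domain: zha
    platform: device
    subtype: button_1
    type: remote_button_short_press
  condition: []
  action:
  - device_id: 87a9b7360a4c4fc2af17kjdhjdh777a182
    domain: switch
    entity_id: switch.bedroom_lights_on_off
    type: toggle
  # Keep the automation "running" for 3 seconds so extra presses are ignored.
  - delay: 3
  mode: single
  # Assumption: suppress the warning logged when a trigger arrives
  # while the automation is still running.
  max_exceeded: silent
```

This only hides the log entries; triggers that arrive during the delay are still dropped.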

But warnings aside, shouldn’t the first event do something in single mode (or eventually in restart mode)? My use case isn’t to toggle the lights with a high frequency, but to work when pressed the first and only time, and I only press again (usually after a couple of seconds) because it didn’t work the first time.

More info on the wireless button - it can send “click” as well as “double click” and “long press” events so it seems to understand how often it’s been pressed.

I suppose a fundamental question I have is: what could an automation be doing when toggling a switch that causes instability that isn’t apparent when manually toggling via the UI?

As I write this I realise I was having similar issues when automating via node red. Maybe then for example “toggle” is unstable and I could hardcode “if off then on” etc? Or perhaps it is just a HW issue, in which case how can I test that?
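
One way to separate a flaky button from a flaky toggle is to record every raw event ZHA receives from the button, independently of the toggle automation. This is a minimal sketch, assuming the button's presses arrive as zha_event events and that templates are accepted under data (older releases used data_template instead). The alias, the logger name, and the placeholder IEEE address are hypothetical and need replacing with your own values.

```yaml
- alias: Log Bedroom Button Events
  description: 'Debug helper: one log line per raw button event'
  trigger:
  - platform: event
    event_type: zha_event
    event_data:
      # Hypothetical placeholder: use the IEEE address shown on the
      # button's ZHA device page.
      device_ieee: '00:11:22:33:44:55:66:77'
  action:
  - service: system_log.write
    data:
      level: warning
      # Hypothetical logger name, chosen only to make the lines easy to find.
      logger: bedroom_button_debug
      message: 'Button event: {{ trigger.event.data.command }} {{ trigger.event.data.args }}'
  # Allow overlapping runs so a burst of events is logged in full.
  mode: parallel
```

If a single press produces several log lines, the button (or its driver) is sending duplicates. If it produces exactly one line and the light still does not change, the problem is more likely on the switch side.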

I’ll try the suggestions in the thread otherwise.

FWIW I replaced the single toggle with two “if x then not(x)” automations and am seeing far better results. Of course it could be anecdotal, but I’m offering the data point just in case there’s something fundamentally inefficient about the way HA handles toggling.
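
For reference, here is a minimal sketch of what the two conditional automations described above might look like. The actual automations were not posted, so the ids and aliases are hypothetical; the trigger and the switch entity are copied from the automation earlier in the thread.

```yaml
# Turn the lights on, but only if they are currently off.
- id: bedroom_lights_on_if_off  # hypothetical id
  alias: Bedroom Lights On If Off
  trigger:
  - device_id: cf8f9e59403f45a6afjsiweyuwbbdb32
    domain: zha
    platform: device
    subtype: button_1
    type: remote_button_short_press
  condition:
  - condition: state
    entity_id: switch.bedroom_lights_on_off
    state: 'off'
  action:
  - service: switch.turn_on
    data:
      entity_id: switch.bedroom_lights_on_off
  mode: single

# Turn the lights off, but only if they are currently on.
- id: bedroom_lights_off_if_on  # hypothetical id
  alias: Bedroom Lights Off If On
  trigger:
  - device_id: cf8f9e59403f45a6afjsiweyuwbbdb32
    domain: zha
    platform: device
    subtype: button_1
    type: remote_button_short_press
  condition:
  - condition: state
    entity_id: switch.bedroom_lights_on_off
    state: 'on'
  action:
  - service: switch.turn_off
    data:
      entity_id: switch.bedroom_lights_on_off
  mode: single
```

Both automations listen to the same press, so each one acts on the switch state HA holds at the moment the press arrives. Unlike toggle, a repeated or bounced press that arrives before the state updates asks for the same target state again rather than flipping it back.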