Automations are (sometimes) running very slow

Hi All,

I have just finished migrating from a dockerised HA installation to a dedicated RPi4 (4G) running Home Assistant OS. Most of my smart stuff is on Zigbee and sourced from IKEA, integrated using Zigbee2MQTT.
As I was setting up some of the automations yesterday, I noticed that sometimes there’s a considerable delay in executing the various steps of an automation.
E.g: the stairs and upstairs hallway have 6 spotlights (IKEA), which supposed to come on one-by-one, with 250ms delay in between them. This automation was executed hundreds of times while HA was on docker and had no issues whatsoever. With HA OS there seems to be a 1s+ delay in between each light, but the frustrating thing is that it’s not all the time.

I have looked at the MQTT traffic and see no lag there whatsoever. MQTT event to turn the light on published, light comes on without delay.

Here are the relevant bits from a trace showing the time gap (~1.3sec or so) between each step:

  "logbookEntries": [
    {
      "name": "Stairs lights on",
      "message": "triggered by mqtt topic zigbee2mqtt/Ikea_OnOff_6/action",
      "source": "mqtt topic zigbee2mqtt/Ikea_OnOff_6/action",
      "when": 1655722183.739133,
      "domain": "automation"
    },
    {
      "when": 1655722183.87807,
      "state": "on",
      "entity_id": "light.ikea_gu10_ws_17",
    },
    {
      "when": 1655722185.172218,
      "state": "on",
      "entity_id": "light.ikea_gu10_ws_18",
    },
    {
      "when": 1655722186.474206,
      "state": "on",
      "entity_id": "light.ikea_gu10_ws_19",
    },
    {
      "when": 1655722187.75998,
      "state": "on",
      "entity_id": "light.ikea_gu10_ws_20",
    },
    {
      "when": 1655722189.055157,
      "state": "on",
      "entity_id": "light.ikea_gu10_ws_21",
    },
    {
      "when": 1655722190.353614,
      "state": "on",
      "entity_id": "light.ikea_gu10_ws_22",
    }
  ]

Here’s the first bit of the automation config. The successive steps are the same; turn light on, wait 250msec, rinse & repeat.

    "config": {
      "id": "1655565657810",
      "alias": "Stairs lights on",
      "description": "",
      "trigger": [
        {
          "platform": "device",
          "domain": "mqtt",
          "device_id": "be87781e58504a79b88c1d9384c2729d",
          "type": "action",
          "subtype": "brightness_move_up",
          "discovery_id": "0x50325ffffe47a4d1 action_brightness_move_up"
        }
      ],
      "condition": [],
      "action": [
        {
          "type": "turn_on",
          "device_id": "1503b84a820f0fa6b9fd639423cc5ba4",
          "entity_id": "light.ikea_gu10_ws_17",
          "domain": "light",
          "brightness_pct": 100
        },
        {
          "delay": {
            "hours": 0,
            "minutes": 0,
            "seconds": 0,
            "milliseconds": 250
          }
        },

Any idea what to look at? I saw that the system load is 1.5+ sometimes, but at that point I have not set up proper SSH access to HA OS, so was not able to check what was going on. Not sure if the two are related anyway as the previous trace was taken when the system load was 0.2.

Thanks,
Zoltan

Without actually knowing what is going on, it is just guessing. But one thing to note is that HA is waiting for each step to execute by default. So maybe it is also waiting for the Zigbee lights to respond? It could mean Zigbee is not as stable as it could be. Are you using an USB extension cable to keep the zigbee controller away from the RPI4’s USB3 interfernce?

I have thought about it being a Zigbee issue, so I fired up MQTT explorer and I can see the ‘on’ command going out to Z2M and immediately returning with the updated status showing that the light is on. (Also, I don’t believe that HA waits for the updated status, but I may be wrong there).

I have a 1.5m extension cable and I’m also using the USB2 port of the RPi4. This is to avoid interference with they USB-Sata converter, which is also in the USB2 port.

Just tested it and now it works fine. Trigger for the automation comes in at 2022-06-21T10:12:45.556983, first light triggered at 2022-06-21T10:12:45.557814 (+1 msec), last light triggered at 2022-06-21T10:12:46.874908 (1370 msec).
1370 msec total runtime for an automation that includes 5 * 250 msec waiting. Perfect!

I’ll try to activate a Zigbee trace when the issue resurfaces. In the meantime I’m scratching my head… Mainly because the same Zigbee network was working fine on a plain docker install (HA + Z2M) on the same hardware at the same location.