Incorrectly populated previous context to the state_changed event originated by other actor (should be the new context) - in ~5 seconds window

Tomaszek · March 9, 2021, 1:56pm

Hello, I have a problem with incorrectly populated context in state_changed events.
Current behaviour: old context, with old id is populated to the new events originated by other actor than previous change.
Expected behaviour: new context should be created for these events.

Now more information.

I use MQTT Light configuration (but probably it doesn’t matter and I expect problem is not connected with MQTT integration):

- platform: mqtt
  unique_id: "B1"
  name: "Server room"
  command_topic: "b/B1/command"
  state_topic: "b/B1/status"
  payload_on: "on"
  payload_off: "off"
  qos: 1

Once configured, I can read and change the state from the Home Assistant without any problem (including from the level of Lovelace UI).

Also I have hardware momentary switch witch modifies state (toggles) of the light directly, without coordinating with Home Assistant. Home Assistant just observes new state - and it works fine.

Now I have process built in Node Red (integrated with HA) which turns on/off this light in certain circumstances (based on sensors). And again - this automation works without any problem = turns on/off whenever needed.

Now slowly going to the problem.

I’ve built automation (in Node Red integrated with HA) which reacts to state_change events on the light - but only to those events which were NOT originated by Node Red automations.

How I recognise who originated the event? I base on the context.user_id. Whenever event is originated by Node Red, event has set user_id in the context to the ID of Hass.io autogenerated user. All the events originated from Node Red has it - no exceptions observed by me.

Also events originated from the Lovelace UI has user_id in context set to the logged in user. And events originated from the hardware momentary switch (so originating outside the HA environment) has user_id set to null. Always. Or… almost always.

This is NOT set to null when the momentary switch caused the light state change within approx. 5 seconds after the light state change caused by Node Red automation.

So scenario when everything works OK:

Node Red causes change of the B1 light state to ON. Event with context.user_id set to Hass.io is fired. OK!
Nothing happens for the next 5 or more seconds. (Measured empirically.)
I press momentary switch toggling B1 light state from ON to OFF. Home Assistant observes the change and fires new state_changed event. This event contains context with user_id set to null. OK!

Scenario when user_id / context is set incorrectly:

Node Red causes change of the B1 light state to ON. Event with context.user_id set to Hass.io is fired. OK!
After 0.5 or 1 or 2 or 3 or 4 seconds (within 5 seconds after previous point) I press momentary switch toggling B1 light state from ON to OFF. Home Assistant observes the change and fires new state_changed event. This event contains context with user_id set to Hass.io. More accurately this event has the same context ID as the event in previous point. So it shows that it was caused by the automation, not coming from the “external world” (meaning momentary switch).

For completeness, here is part of the event dump from the point 2. of incorrect scenario:

{
  "event_type":"state_changed",
  "entity_id":"light.server_room_light",
  "event":{
    "entity_id":"light.server_room_light",
    "old_state":{
      "entity_id":"light.server_room_light",
      "state":"on",
      // […]
      "last_updated":"2021-03-09T13:14:44.932896+00:00",
      "context":{
        "id":"77cafd384c0444794f01caab3cce8f08",
        "parent_id":null,
        "user_id":"621ac64ba0e14a95b6e30b614f6b8701"
      },
      "original_state":"on"
    },
    "new_state":{
      "entity_id":"light.server_room_light",
      "state":"off",
      // […]
      "last_updated":"2021-03-09T13:14:46.099566+00:00",
      "context":{
        "id":"77cafd384c0444794f01caab3cce8f08",
        "parent_id":null,
        "user_id":"621ac64ba0e14a95b6e30b614f6b8701"
      },
      "original_state":"off",
      "timeSinceChangedMs":6
    }
  },
  "origin":"LOCAL",
  "time_fired":"2021-03-09T13:14:46.099566+00:00",
  "context":{
    "id":"77cafd384c0444794f01caab3cce8f08",
    "parent_id":null,
    "user_id":"621ac64ba0e14a95b6e30b614f6b8701"
  }
}

You can see the change happened within 1-2 seconds. And new_state.context.id is equal to old_state.context.id - but the new_state was originated by the momentary switch, not Node Red (as old_state) - 100% sure here, I’ve tested this for few hours and different scenarios.

I’ve tried to change optimistic parameter for MQTT Light from default false to true, but it doesn’t make any difference. And I think it is not connected with the MQTT at all. My guess is that it is some kind of heuristic on HA side, where state changes within small time window (~5 seconds) on the same entity are “joined” together and it is assumed that it comes from the same context / actor, if HA do not have certain information who originated the change. Because here HA just notices state change - do not know that it was caused by the momentary switch NOT connected with HA…

Questions:
A) Do you consider this as bug? Any chances to fix?
B) Where is the place in code where I could change this ~5 second time window? I have fast and reliable system and network, I think everything would work correctly also for this time window of size ~0.5 seconds (10x smaller). And I would be fine with this.
C) Any other simple ideas to implement? To recognise on the event level who originated the change (Node Red vs. non-Node Red)?

Tomaszek · March 9, 2021, 2:18pm

Also - it doesn’t matter if event was caused by Node Red. The same applies for events caused by change coming from Lovelace UI. So other example of INCORRECT scenario:

From Lovelace UI I cause change of the B1 light state to ON. Event with context.user_id set to ID of my username is fired. OK!
After 0.5 or 1 or 2 or 3 or 4 seconds (within 5 seconds after previous point) I press momentary switch toggling B1 light state from ON to OFF. Home Assistant observes the change and fires new state_changed event. This event contains context with user_id set to my username. More accurately this event has the same context ID as the event in previous point. So it shows that it was caused by the Lovelace UI, not coming from the “external world” (meaning momentary switch).

Tomaszek · March 9, 2021, 3:48pm

And I think I’ve found place in code which is responsible for this time window for applying the same context:

github.com

home-assistant/core/blob/fd1add8f159761c57aa679f0c1a4258d65ec7112/homeassistant/helpers/entity.py#L239


    updated, not just when the value changes.
    """
    return False

@property
def supported_features(self) -> Optional[int]:
    """Flag supported features."""
    return None

@property
def context_recent_time(self) -> timedelta:
    """Time that a context is considered recent."""
    return timedelta(seconds=5)

@property
def entity_registry_enabled_default(self) -> bool:
    """Return if the entity should be enabled when first added to the entity registry."""
    return True

# DO NOT OVERWRITE
# These properties and methods are either managed by Home Assistant or they

EDITED:
What I did to solve the problem:

I copied mqtt/ component to custom_components directory.
Modified MqttEntity from mixins.py - https://github.com/home-assistant/core/blob/fd1add8f159761c57aa679f0c1a4258d65ec7112/homeassistant/components/mqtt/mixins.py#L519 - overriding context_recent_time() by returning timedelta(milliseconds=400)
PS. If you have the same problem - do not forget to add import (from datetime import timedelta).
PS2. With this change you may want to change mode for your MQTT entities to optimistic mode (aggressively reduced time may be too aggressive and create opposite problem - not marking events coming from automations/UI with proper user_id).

It’s partial answer for my question B) from the original post

I’ve tested the change and it works. Rationale is that human normally does not react so quickly after observing effects of automation - so it works in my case.

But it’s not very convenient to handle mqtt component updates… Would be worth to consider this as a standard configuration entry for entity. What do you think?

Also, any other suggestions to solve the problem?

imagio · November 24, 2021, 5:44pm

This is a pretty significant gotcha when using context to determine the source of state changes! Anybody trying to use context will likely see some unexpected behavior due to this. Were you able to determine if there’s any particular reason why 5 seconds is used or why context is even linked by time in the first place?

Maybe this should be filed as a bug on GitHub. The actual behavior directly contradicts the documentation. Documentation says userId is

Unique identifier of the user that started the change.

when the actual behavior is

Unique identifier of the user that performed any action on the dashboard in the last 5 seconds. This value will be incorrect for any non-user-initiated state changes happening within 5 seconds of a user interaction.

imagio · November 24, 2021, 6:30pm

I filed an issue on GitHub about this Context is incorrect for any state changes that happen within 5s of a user interaction · Issue #60302 · home-assistant/core · GitHub

elkarto405 · November 25, 2021, 9:25am

Thank you for opening the issue.

Yesterday I was opening one for the same thing, and after describing the problem, the browser window closed and I didn’t feel in the mood to start again.

I don’t know if this is an easy bug to fix because it seems to be an root problem. Maybe they can make a quick patch.