Template entities referencing "states" can miss updates - diagnosed

Hi All,
I think I’ve found a bug in HA’s template loop detection code that can cause template entities to miss updates. I’m not sure of the best way to fix it, and I welcome any thoughts. Pardon the lengthy diagnosis below.

The bug can occur if you have a template entity that references states. For example:

template:
  - sensor:
    - unique_id: test1
      name: test1
      state: >
        {{ states | selectattr('entity_id','match', 'sensor.*_foo') | ... }}

In the above example, depending on the timing of when sensors ending in “foo” update and when the template sensor test1 itself updates, the template sensor may miss updates and fall out of sync. This is due to a faulty interaction between the rate-limiting logic for template re-evaluations and the template loop detection logic. If it happens, I think you’ll see a warning in your logs beginning with “Template loop detected” (even though there may be no actual infinite loop).

Background (HA internals)

A tracked template is re-evaluated when any entity it refers to changes state. The re-evaluation code is passed an event object that has the name of the entity that changed state. That event object is used to detect self-reference loops (i.e. if the template is re-evaluating because the template sensor itself changed) and cut them off to prevent infinite loops.

Tracked templates in HA have logic to rate limit re-evaluations. The rate limit is implemented via a timer. When a template’s re-evaluation rate hits the limit, a timer is started. While the timer is running, all state changes are ignored and no re-evaluations are performed. When the timer finishes, a single re-evaluation is done. That single re-evaluation is passed the event object for whichever state change caused the rate limit timer to start.

The rate limit for templates referencing states is once every 60 seconds.

EDIT: Also, whether a template re-evaluation changed the result (tracked via TrackTemplateResultInfo::_last_result) is determined before any decision to call the subsequent logic that checks for loops and eventually updates the template entity state.

The bug

Suppose the state change that happens to trip the rate limit is the template sensor itself changing state. The rate limit timer starts. While the timer is running, say some other state change happens that will cause the template sensor to change state again when it’s re-evaluated. The timer expires, and the template is re-evaluated and changes, which starts the timer again (once more with the captured event being the template sensor itself changing state). When the timer expires a second time, the template is re-evaluated, and the loop detection code sees two re-evaluations in a row attributed to the template itself changing, warns of a possible loop, and aborts the re-evaluation. That effectively causes the template to miss any updates that happened during the timer window.

For reference see where the rate limiter captures the event object: helpers/event.py#1161.

Note, I think the template sensor will correct itself if/when another update comes along that does not cause the sensor to change state.
EDIT: The template sensor may stay out of sync despite future state updates, because TrackTemplateResultInfo::_last_result is updated to the new template result even though TemplateEntity::_handle_results bailed out early and so never updated the template sensor entity state.
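
To make the sequence above concrete, here is a toy Python model of it. This is only a sketch of the behavior as I describe it in this post, not HA's actual code: ToyTemplateEntity, on_timer_expired and the sensor names are made up for illustration, timers are replaced by direct calls, and the "two self-attributed updates in a row" threshold is a simplification.

```python
from dataclasses import dataclass, field

SELF_ID = "sensor.test1"  # the template sensor from the example config


@dataclass
class ToyTemplateEntity:
    """Toy stand-in for a template entity plus its tracker (not HA code)."""

    state: str = "0"         # what the entity actually reports
    last_result: str = "0"   # tracker's cached render (cf. _last_result)
    self_ref_count: int = 0  # consecutive updates attributed to the entity itself
    warnings: list = field(default_factory=list)

    def on_timer_expired(self, captured_entity_id: str, rendered: str) -> None:
        """Single re-evaluation done when the rate limit timer finishes."""
        if rendered == self.last_result:
            return  # result unchanged, so nothing is passed on
        self.last_result = rendered  # cached before the loop check runs
        if captured_entity_id == SELF_ID:
            self.self_ref_count += 1
        else:
            self.self_ref_count = 0
        if self.self_ref_count > 1:
            self.warnings.append("Template loop detected")
            return  # bail out early: the entity state is not updated
        self.state = rendered


entity = ToyTemplateEntity()
# Timer 1 was started by test1's own state change; sensor.a_foo changed meanwhile.
entity.on_timer_expired(SELF_ID, "1")
# Timer 2 was started by the update above; sensor.b_foo changed meanwhile.
entity.on_timer_expired(SELF_ID, "2")
# A later event that is not self-referential renders the same value again.
entity.on_timer_expired("sensor.b_foo", "2")

print(entity.state, entity.last_result, entity.warnings)
```

In this toy run the entity ends up reporting "1" while the cached result is "2", with one "Template loop detected" warning. The third call is treated as "no change" because it is compared against the cached result rather than the entity state, which is why the sensor can stay out of sync.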

Fixes

One solution is to make TrackTemplateResultInfo::_render_template_if_ready smarter about managing the event object the rate limiter holds on to while a timer is running. If the event object is self-referential and another event occurs while the timer is running which isn’t self-referential, then update the event object stored with the rate limiter to be the non self-referential one.
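
A minimal sketch of that idea, in plain Python rather than HA code. pick_event_to_keep is a hypothetical helper, and events are reduced to the entity_id that changed; the real _render_template_if_ready works with event objects and a timer, so treat this as an illustration of the rule only.

```python
from typing import Optional


def pick_event_to_keep(
    stored: Optional[str], incoming: str, self_entity_id: str
) -> str:
    """Return the entity_id whose event the rate limiter should hold on to."""
    if stored is None:
        return incoming  # first event of the window: keep it
    if stored == self_entity_id and incoming != self_entity_id:
        return incoming  # prefer an event that is not self-referential
    return stored  # otherwise keep what was captured first


# Events arriving while the rate limit timer is running (hypothetical names).
stored = None
for changed in ["sensor.test1", "sensor.a_foo", "sensor.test1"]:
    stored = pick_event_to_keep(stored, changed, "sensor.test1")

print(stored)
```

With this rule the event handed to the re-evaluation is the one for sensor.a_foo, so the loop detection would not count the update as self-caused. A genuine loop, where only the template entity itself changes, would still be attributed to the entity.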

Thoughts?
Josh

While this could be a bug, IMO it’s best to avoid the states object altogether. The rate limiting was added to avoid hammering the entire state machine and it was almost entirely removed at the time. I recommend changing your logic to a predefined list of entities and moving on.

There are a number of ways to achieve this:

  1. Make a trigger based template entity that updates once a minute and uses the states object. It’ll be the same behavior as using the states object in a state based template entity, without the issues.
  2. Make a trigger based template entity that updates every minute, and put a list of entities pulled from the states object into an attribute. Then make a state based template entity that uses the list. You’ll get live updates without throttling.
  3. Make an automation that updates an old school group that houses entities. Use that old school group as the basis for a state based template entity.
  4. Use labels to make your group of entities.

I’m sure there’s other options too.

As for fixing the bug, you’ll have to prove that it happens with a repeatable way to reproduce it in a development environment, then you’ll need to fix it, and then you’ll need a test that verifies the fix is working properly.

I personally don’t care what happens with the outcome of the stored event, mainly because we always recommend people avoid using the states object at all costs.

Isn’t this the real bug here? Shouldn’t you avoid this?

It’s not avoidable with template entities; it’s always been an issue. It’s why we added the rate limit in the first place. The templates themselves don’t know what entity they are inside. This is why the circular reference detection was added. It’s been a while since I’ve looked at that code, but the only way we can detect a circular reference is by creating a context and checking the context down the line to see the source. I.e. the first time it resolves, it has the context for what caused the template to resolve; if we get that same context again, it’s a circular reference. There’s almost no way to avoid it when using the states object inside a template entity. That’s why we always suggest avoiding the states object for state based template entities.

In regards to functionality, the states object is rate limited to at most one update per minute. 99.99% of the time, this is no different than using a trigger based template entity where the trigger is a 1 minute time pattern. Essentially, the states object will always converge to the rate limit, so there’s almost no reason to use it in a state based template entity when you can safely control the rate limit with a trigger based template entity.

Does the advice “don’t use the states object” also apply to cases like:

states.switch|selectattr(...)
states['sensor.my_sensor'].name

or only to the “full scan” case: states|...

states.switch is throttled to at most 1 update per second. If you don’t have a ton of entities, it’s not a huge deal. You likely won’t hit that throttle. You may hit that throttle often with states.sensor because it’s typically the largest domain by far. The big takeaway here is that if you put states.switch into a switch template entity, you will hit the circular reference even if you filter out the switch itself, because the act of using states.switch references all switches. It’s not a generator, it’s a list, so every switch entity is touched, thus creating the circular reference.

states['sensor.my_sensor'] is a full state object, and you’re only accessing a single entity. Nothing to worry about there. The concern only applies if you access states without any inner object.

My takeaway from Petro’s explanation is to avoid creating a Template entity with a template that includes itself.

If you’re creating a Template Switch with a template that starts with states.switch then it will be including itself.

Yes.

On top of that, there’s almost always a way to avoid using states or states.<domain> inside state based template entities.

I typically do not avoid them in automations or trigger based template entities. Mainly because they do not create runaway situations.

Thanks guys, you explained it well, I think everything is clear to me now.

To add a bit more detail to Petro’s explanation, in case it helps others:

So that HA doesn’t need to reevaluate every template on every state update, the template tracking engine notes the kinds of entity references in a template. It distinguishes between the following:

  • When you say states('sensor.foo'), the engine notes you referenced the entity sensor.foo and will re-evaluate the template when that entity changes.
  • When you reference states.switch, it notes you referenced the domain “switch”. It will re-evaluate the template whenever any entity in that domain changes state, regardless of whatever jinja filters you have after (e.g., states.switch | selectattr ...). Re-evals are currently rate-limited to one per second.
  • When you reference states by itself, the engine will re-evaluate the template whenever any entity at all changes state. Re-evals are currently rate-limited to one per minute.
  • When you reference time like now(), the engine re-evaluates the template every minute.
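
The rate limits in the list above can be summarized as a small lookup. This is a toy Python sketch, not HA's implementation: Ref and toy_rate_limit_seconds are hypothetical names, the values are the "current" ones quoted in this thread, and the "broadest reference wins" rule is my assumption about how mixed references combine.

```python
from enum import Enum
from typing import Iterable, Optional


class Ref(Enum):
    """Kinds of reference a template can make (toy classification)."""

    ENTITY = "states('sensor.foo')"
    DOMAIN = "states.switch"
    ALL_STATES = "states"


def toy_rate_limit_seconds(refs: Iterable[Ref]) -> Optional[int]:
    """Assumed rule: the broadest reference decides the limit.

    None means no rate limit was mentioned for that kind of reference.
    """
    kinds = set(refs)
    if Ref.ALL_STATES in kinds:
        return 60  # any state change triggers; one re-eval per minute
    if Ref.DOMAIN in kinds:
        return 1  # any change in the domain triggers; one re-eval per second
    return None  # single entities: re-evaluate when that entity changes


print(toy_rate_limit_seconds([Ref.ENTITY, Ref.DOMAIN]))
print(toy_rate_limit_seconds([Ref.DOMAIN, Ref.ALL_STATES]))
```

Time references such as now() are left out here because they schedule a re-evaluation every minute rather than throttle one.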

The loop-detection logic for template entities exists to prevent infinite loops. However, it’s rather conservative and aborts if, twice in a row, it sees a template update whose cause is attributed to the template entity itself changing state. I wonder if the logic that decides attribution is sometimes imprecise. In any case, having a template switch entity that references states.switch, for example, may be enough to trip the infinite-loop detection logic and, due to the suspected bug I describe above, result in updates being missed.

I believe Petro’s suggestion to use trigger-based updates avoids the above issue partly just because it avoids that loop-detection logic I mentioned.

-Josh

It avoids the issue altogether. Trigger based template sensors only update when a trigger is received. That means templates will evaluate at most once per trigger and they will not cause “self referencing” issues (assuming you don’t add the entity to the trigger).

Oh, right. I just looked at triggers again - even template triggers will only trigger when the template val switches from falsey to truthy. So it’s more restrictive than using the internal triggering in tracked templates. For example, I don’t think using triggers will enable you to create a sensor based on all entities ending in “_foo”, or a sensor based on all entities in the “unavailable” state…
J

I think you’re misunderstanding what I’m saying. There are 2 types of template entities: state based and trigger based.

state based

template:
- sensor:
  - name: something
    state: "{{ states | ... }}"

trigger based

template:
- triggers:
  - trigger: ...
  sensor:
  - name: something
    state: "{{ states | ... }}"

When you use {{ states | ... }} the sensor is watching all states in the state machine. This means it will effectively always update once a minute, never more.

However, if you create a trigger based template entity that is forced to update once a minute, you get the same net result without any circular reference issues.

e.g.

template:
- triggers:
  - trigger: time_pattern
    minutes: "/1"
  sensor:
  - name: something
    state: "{{ states | ... }}"

Lastly, if you make a state based template entity that reacts to entities built in a trigger based template entity, you get live updates on your state based template entities.

e.g.

template:
- triggers:
  - trigger: time_pattern
    minutes: "/1"
  sensor:
  - name: something
    state: "Ok"
    attributes:
      entities: "{{ states | selectattr('entity_id','match', 'sensor.*_foo') | map(attribute='entity_id') | list }}"

- sensor:
  - name: something2
    state: "{{ state_attr('sensor.something', 'entities') | ... }}"

Ah, right, I forgot about the “/1” trigger technique. The sensor.something trigger-based template entity is similar to the state-based template entity, except that it moves the logic to re-evaluate the template from HA’s internal template tracking (e.g., TrackTemplateResultInfo) to being explicit in the config, and moves the list of tracked entities from being managed implicitly to being managed explicitly as an attribute.

I wonder if there’s a way to make HA’s internal template change tracker (e.g., template.async_render_to_info() ) smarter so it can do what you’re explicitly doing via triggers + sensor.something + sensor.something2.

It’d be neat if you could write your original state-based template entity and have HA figure the rest out. For example, instead of polling for changes to the list of entities every minute, if HA knew the filter was based on the entity_id, it could check for changes only when entities are added or removed.

Josh