How can I get this template sensor to update in a more timely manner?

123 · July 15, 2021, 9:23pm

I created the topic, everyone else created the ‘huge thread’.

petro · July 15, 2021, 9:24pm

Templates with now() update at least once per minute (on the minute). Templates with states will update at most once per minute. Templates with states.<domain> will update at most once per second.
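
As a rough illustration of what "at most once per interval" means, here is a toy gate in Python. It is only a sketch: the class name `RateLimit` and its methods are hypothetical and are not Home Assistant's implementation. Only the 1-second and 60-second intervals come from the description above.

```python
from typing import Optional


class RateLimit:
    """Toy 'at most once per interval' gate (hypothetical, not HA code)."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self.last_render: Optional[float] = None

    def seconds_until_allowed(self, now: float) -> float:
        # Nothing rendered yet: the first render is always allowed.
        if self.last_render is None:
            return 0.0
        return max(0.0, self.last_render + self.min_interval - now)

    def try_render(self, now: float) -> bool:
        # Refuse the render if the interval has not elapsed yet.
        if self.seconds_until_allowed(now) > 0:
            return False
        self.last_render = now
        return True


domain_limit = RateLimit(1.0)   # states.<domain>: at most once per second
all_limit = RateLimit(60.0)     # states: at most once per minute

# Three state changes arriving at t=0.0s, t=0.4s and t=1.0s.
domain_results = [domain_limit.try_render(t) for t in (0.0, 0.4, 1.0)]

all_limit.try_render(0.0)
wait_at_30s = all_limit.seconds_until_allowed(30.0)
```

In this toy model the change at 0.4 s is not rendered on its own; `seconds_until_allowed` tells the caller how long to wait before trying again.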

petro · July 15, 2021, 9:24pm

Yah, it wasn’t a ‘bash’, I was just pointing out where the discussion took place

123 · July 15, 2021, 9:30pm

FWIW, I just opened two separate browser windows, side-by-side.

Left window displays sensor.notifications in Developer Tools > States
Right window displays Developer Tools > Services, setup for creating a persistent notification.

I mashed the right window’s “Call Service” button repeatedly and the left window’s count for sensor.notifications kept up with the button-mashing.

All this to say, there’s something unique about your situation because it works flawlessly for me. I realize that’s cold comfort because it means what you’re experiencing isn’t nominal.

By “mashing” I mean I banged out about ten notifications in under 3 seconds.

petro · July 15, 2021, 9:30pm

Here’s a post with a diagram that explains the rate limits.

famous.bulb · July 16, 2021, 12:40am

Good to know / that explains the small lag… which is not what I was concerned about, really.

I have a very simple automation that boils down to “when there is a persistent notification, set $someLight to blue”.

The whole reason I started this thread is that one day I noticed two persistent notifications (withings error and a new HA release) and was puzzled to see why the light in question was not blue. I checked the trace on the automation that turns the light blue when the sensor > 0 and discovered that the template sensor still said 0… which is why the light was off and not the expected blue.

At least for now the template seems to be behaving. I’ll wait until the withings integration fires off the bogus error again and see what happens…

famous.bulb · July 19, 2021, 4:37pm

An update!

There was an issue with HACS starting up which generated a persistent notification saying as much. The notification language was something along the lines of “an integration had trouble setting up…”.

As soon as that notification fired, the template sensor went to a state greater than 0 which caused my automation to fire which resulted in a blue light.

However, dismissing the notification did not return the template sensor down to 0!

I opened dev tools / template and noticed that this:

 {{states.persistent_notification|length}}

evaluated to 7 … which is more than 0 so that would explain why the light was still on. But there are zero persistent notifications… so the value should be 0, right!? Another test:

 {{states.persistent_notification}}

evaluated to unknown which does have a length of 7.
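
The 7 is consistent with counting characters rather than notifications. A minimal Python sketch of the same effect, with Python's `len` standing in for the `length` filter and an empty list as a hypothetical stand-in for a domain with no notifications:

```python
# What the template editor displayed, taken as a plain string.
rendered = "unknown"

# Hypothetical stand-in for a domain that holds no notifications.
no_notifications = []

char_count = len(rendered)            # counts the letters u-n-k-n-o-w-n
empty_count = len(no_notifications)   # counts items in the collection

print(char_count, empty_count)
```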

So with that, the template sensor is revised:

# get a count of the number of persistent notifications
##
-  sensor:
    - name: "Active Persistent Notifications"
      # Unique ID is required for mgmt through the web UI
      unique_id: tmpl-active-persistent-notifications
      # I don't understand why, but from time to time the state of persistent notifications can be 'unknown' rather than 0.
      # This means we need to explicitly test for this unique condition; we can't just pipe to len() otherwise we'll get back
      #     the number of letters in the word unknown!
      ##
      icon: >-
        {% if states.persistent_notification == 'unknown' %}
        mdi:bell-badge
        {% else %}
        mdi:bell
        {% endif %}
      state: >
        {% if states.persistent_notification == 'unknown' %}
        0
        {% else %}
        {{states.persistent_notification|length}}
        {% endif %}

123 · July 19, 2021, 5:37pm

You may believe it to be a solution but it reinforces the hypothesis that your system contains a fault that cannot be replicated. The template you posted serves as a means to hide that unexplained fault.

You stated:

famous.bulb:

I opened dev tools / template and noticed that this:
 {{states.persistent_notification|length}}
evaluated to 7
…
But there are zero persistent notifications… so the value should be 0

When there are no persistent notifications, it evaluates to 0 for me.

In addition, the following doesn’t evaluate to unknown for me like you’ve reported:

The template you’ve created works around the fault so it’s possible it may manifest itself in some other way at a later date.

petro · July 19, 2021, 7:14pm

Something is not correct on your system. states.<ANYVALUE> always resolves to a domain object.

Also, to expand on 123’s suggestions, your template checking against unknown will always hit the else state.

What’s actually happening is states.persistent_notification is none and the template editor changes none to "unknown". So your template should actually be:

        {% if states.persistent_notification is none %}
        0
        {% else %}
        {{states.persistent_notification|length}}
        {% endif %}

However, you shouldn’t be getting none from states.xyz, so something is very wrong.
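
To make the difference between the two checks concrete, here is a small Python analogue. It is a sketch under assumptions: the function names are made up for illustration, and a plain list stands in for the domain object.

```python
from typing import Optional, Sequence


def matches_unknown_string(domain_states: Optional[Sequence[str]]) -> bool:
    # Analogue of `== 'unknown'`: None is not equal to the string "unknown".
    return domain_states == "unknown"


def count_with_none_check(domain_states: Optional[Sequence[str]]) -> int:
    # Analogue of `is none`: guard first, then count.
    if domain_states is None:
        return 0
    return len(domain_states)


string_check_hits = matches_unknown_string(None)
guarded_missing = count_with_none_check(None)
guarded_two = count_with_none_check(["notification_a", "notification_b"])

print(string_check_hits, guarded_missing, guarded_two)
```

Because the string comparison is false for a missing value, the first style of check falls through to the else branch, while the `is None` guard returns 0.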

famous.bulb · July 22, 2021, 4:50pm

Another update! This time with the clearest possible depiction of the issue.

I am convinced that there is a missing code path or other bug that prevents the template sensor from being evaluated every time it should.

Background

Some time shortly after midnight, two things happened:

The Withings integration bug fired off again
At least one of the Shelly devices on my network determined that there was a new firmware update available.

Here is the automation that manages the notification when there is a shelly device firmware update.

alias: Notify when Shelly Devices have new FW updates to apply
description: ''
trigger:
  - type: turned_on
    platform: device
    device_id: 6<...>2a
    entity_id: binary_sensor.shelly_2_5_<...>_firmware_update
    domain: binary_sensor
    id: some_id
    # <... this repeats for each shelly device ...>
condition: []
action:
  - service: persistent_notification.create
    data:
      message: Please check shelly devices
      title: Shelly device has a pending FW update  # Yes, i fixed the typo; you will see SHelly in screenshot.
  - service: todoist.new_task
    data:
      content: Update Shelly Firmware
      # ...
mode: single

And here is the automation in question that I keep referencing w/r/t the blue light.

alias: Toggle status light blue when HA has notifications pending
description: >-
  A visual cue that something needs my attention
trigger:
  - platform: state
    entity_id: sensor.active_persistent_notifications
    from: '0'
    id: some-pending
  - platform: state
    entity_id: sensor.active_persistent_notifications
    id: none-pending
    to: '0'
condition: []
action:
  - choose:
      - conditions:
          - condition: trigger
            id: none-pending
        sequence:
          - type: turn_off
            device_id: d<...>b
            entity_id: light.blue
            domain: light
      - conditions:
          - condition: trigger
            id: some-pending
        sequence:
          - type: turn_on
            device_id: d<...>b
            entity_id: light.blue
            domain: light
    default: []
mode: single

So with that in mind, let’s get on with the show!

What I Woke Up to

After waking up and making my way into the kitchen for some coffee, I noticed the indicator light was blue. Since my computer was still asleep, I used my phone to check what the notifications were:

Before dismissing any notification, I woke up my desktop and prepared to start taking screenshots / documenting.
I pulled up three instances of Lovelace and then went back to my phone to dismiss the “withings failure” notification.
I used my phone to dismiss because I wanted to rule out any websocket related issues. As soon as I dismissed the notification on my phone, the notification badge count dropped by one on my mobile and desktop. It is safe to say that there is no problem with websockets here; a button press on one device is reflected in real time on the other.

Look above, I want to call your attention to a few things:

The notification badges are correct.
The template sensor is not correct
The template evaluation in dev tools is correct

After collecting and annotating that screenshot, I dismissed the only remaining notification; the shelly fw notification:

This time:

The notification badges are correct; now reads 0.
The template sensor is not correct. It isn’t an ‘off by one’ error… it straight up did not get evaluated!
The template evaluation in dev tools is correct

With that, I wondered if there was some sort of code path that either the shelly or the withings notification were created through that does not result in the template sensor evaluation. I fired off a new test notification:

This time:

The notification badges are correct; now reads 1. The notification drawer has the expected value.
The template sensor is correct. It has changed from 2 to 1.
The template evaluation in dev tools is correct

So then dismissing the test notification absolutely should result in all counts reading 0, right?! Well, I thought so, too.

This time:

The notification drawer and badge both show 0. The expected result.
The template sensor is not correct. It still appears as though it was never re-evaluated.
The template evaluation in dev tools is correct

So then what would it take to force the template sensor to be re-evaluated!?
I went into server controls and reloaded the template entities. The template sensor was updated and the notification light went from blue to off. Success!

I went to update the firmware on all but one of the Shelly devices and to make another cup of coffee.
While away, the shelly discovery script fired and the one device that didn’t get a FW update caused a notification to fire off again as expected / intended.

This time:

The notification drawer and badge both show 1; the single notification created by the automation shown @ the top of this post.
The template sensor is correct.
The template evaluation in dev tools is correct

So what happens when I dismiss the shelly notification? If you’ve been paying attention, you know that the answer is not “the template sensor worked as it should have”

This time:

The notification drawer and badge both show 0 as expected.
The template sensor is not correct; it is still behaving as though it was never evaluated. The indicator light is still blue / on.
The template evaluation in dev tools is correct

So with that, I am convinced that there is an issue with the logic that schedules template sensors for (re)evaluation. In at least some (not yet fully understood) cases, dismissing a notification does not trigger re-evaluation…

123 · July 22, 2021, 5:34pm

If you’re convinced it’s a bug, I suggest you report it as an Issue in the Core repository on GitHub. However, your challenge is to provide a means of reproducing the error. Without that, a software developer is less likely to take on the challenge when there’s no way to observe the bug in action first hand.

Good luck!

petro · July 22, 2021, 6:00pm

Just to throw a wrench in your hypothesis, both the template tester and template sensors use the same exact code to update.

How about you post your entire template section instead of focusing on the automation.

famous.bulb · July 22, 2021, 7:47pm

This is precisely why I posted here and not directly to a new GH issue. If I can figure out what condition(s) are required to reproduce then I’ll open an issue there.

famous.bulb · July 22, 2021, 7:59pm

This is precisely why I used the word “schedules”. At the end of the day, some code path is triggered; that path ends up calling out to the jinja2 library for evaluation. I do not have an issue with the result of the template… the issue is that the result is not being computed in a timely or consistent manner… hence ‘schedule’.

While I am very certain that the ephemeral template expression via the webUI and template entities use the same evaluation engine, I am less certain that the triggers that lead to that eval are the same… which is why the ephemeral expression seems to do what I want, but the template entity does not appear to be evaluated when I’d expect it to.

I don’t understand what you’re asking for?
The template sensor in question is detailed a few posts up in this thread:

Hopefully this is enough to explain things / satisfy your ask?

me@some-host # grep -C2 "template" configuration.yaml

# Template sensors are now a specific integration
# See: https://www.home-assistant.io/integrations/template/
##
template: !include_dir_merge_list devices/template/


me@some-host # cat devices/template/notifications.yaml 
# get a count of the number of persistent notifications
##
-  sensor:
    - name: "Active Persistent Notifications"
      # Unique ID is required for mgmt through the web UI
      unique_id: tmpl-active-persistent-notifications
      icon: >-
        {% if states.persistent_notification|length > 0 %}
          mdi:bell-badge
        {% else %}
          mdi:bell
        {% endif %}
      state: >
        {{states.persistent_notification|length}}

petro · July 22, 2021, 11:18pm

What you call “schedule” is part of the same code. Listeners are created and the listeners drive the updates. It’s literally identical top to bottom. There’s no difference. Not sure how else to explain it.

famous.bulb · July 23, 2021, 1:42am

Yes, It is a massive codebase.

As is common with large code bases, a given function will often have several different paths to it. Each path presents an opportunity for a different behavior.

One path may immediately try to render the template and return a value, another path may make a note that the template should be re-evaluated and put that note on a queue where some other process may eventually get the note. While the code that evaluates the template may be identical in both paths, the act of evaluating the template is scheduled to take place at different times in the two examples. One is scheduled “as soon as possible” and the other is scheduled “after the other $things that were already in line are dealt with”.
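
A minimal asyncio sketch of that distinction, assuming nothing about Home Assistant's internals: the names are made up, and the list of notifications is a stand-in for real state. The same expression (`len(notifications)`) is evaluated on both paths, but the queued one only runs after the state has changed again.

```python
import asyncio
from typing import List, Tuple


async def main() -> List[Tuple[str, int]]:
    notifications = ["example notification"]
    log: List[Tuple[str, int]] = []
    loop = asyncio.get_running_loop()

    # Path 2: make a note to evaluate later; the callback waits in the
    # event loop's queue until the current task yields control.
    loop.call_soon(lambda: log.append(("queued", len(notifications))))

    # Path 1: evaluate right now.
    log.append(("immediate", len(notifications)))

    # State changes before the queued evaluation gets its turn.
    notifications.clear()

    # Yield to the event loop so the queued callback can run.
    await asyncio.sleep(0)
    return log


log = asyncio.run(main())
print(log)
```

Both paths call the same code, yet they observe different values because of when they run.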

In the case of the dev tools / template page, the code that actually parses the template string and tries to return something useful lives here: core/homeassistant/helpers/template.py at 4c51299dcc7b690e4e6789cee826e2d67b50eed2 · home-assistant/core · GitHub

The Dev Tools page triggers a template evaluation every time the client/browser detects a change in the form’s content; when the text changes, a message is sent over the websocket. There are some route/validate layers I’ll skip over, but eventually the message lands here:

https://github.com/home-assistant/core/blob/470f2dd73f9f476cf5e7ba58885af2aea43169d5/homeassistant/components/websocket_api/commands.py#L328

              {
                  vol.Required("type"): "render_template",
                  vol.Required("template"): str,
                  vol.Optional("entity_ids"): cv.entity_ids,
                  vol.Optional("variables"): dict,
                  vol.Optional("timeout"): vol.Coerce(float),
                  vol.Optional("strict", default=False): bool,
              }
          )
          @decorators.async_response
          async def handle_render_template(
              hass: HomeAssistant, connection: ActiveConnection, msg: dict[str, Any]
          ) -> None:
              """Handle render_template command."""
              template_str = msg["template"]
              template_obj = template.Template(template_str, hass)  # type: ignore[no-untyped-call]
              variables = msg.get("variables")
              timeout = msg.get("timeout")
              info = None
          
              if timeout:

The message looks something like this:

{"type":"render_template","template":"this is an example template string","timeout":3,"id":48}

In this case, the template evaluation is scheduled for the next event loop (call_soon_threadsafe()) as soon as the message is received.

Since the “there is a new notification to show” and the “this notification was dismissed” messages are exchanged over the same WS, it makes sense that the template value is always updated in near real time. On the next loop, it’ll be evaluated and the result pushed back to the browser via WS if the result has changed.

But here’s the kicker: the “state of this $thing has just changed” messages are also sent to the browser over that same WS. This is how the dev tools/states page can show the state of a contact sensor - for example - in near real time based on whether or not the door where the sensor is attached is open or closed.

But this raises the question: why is the devtools template result practically instantaneous but the template sensor is clearly not?

Either:

the template sensor is not being scheduled for timely re-evaluation
or
the template sensor is being evaluated in a timely manner… but the “the state has changed” message is not propagating to the web interface and the other callback function(s) where the automation for the status light is dealt with.
or
some other possibility that eludes me

By definition, I can’t speak to the 3rd option. If you have a plausible explanation that better explains the symptoms, then please share!

The second option seems unlikely just because it’s a less simple failure. Reloading the template entities immediately gets the dev tools/states view to show a correct value and the automation that turns the blue light off responds to the state change and turns the light off. This indicates that there is no issue communicating “the state of the sensor has changed” to the browser and to the other call backs that deal with the particular automation.

Which leads me to the first option. I am not a heavy user of async python and there’s some non-trivial use of it in the HA code base so I’m struggling to identify precisely what about this code path results in a sensor whose template does not reliably get re-evaluated.

finity · July 23, 2021, 3:17am

I would recommend that you submit an issue containing all of the info that you posted in post #18 above.

If that doesn’t get someone’s attention to at least start looking into it then there’s nothing more you can do.

I guess you could (and I’m surprised I am even suggesting it given my dislike for the medium) try to get some attention for this on the HA Discord.

It looks like you’ve done your homework on this (at least as much as you could) so hopefully given that effort on your part someone might take up the issue.

petro · July 23, 2021, 10:24am

Please for the love of god stop trying to mansplain the code I helped with. Thanks.

The real area that you should be looking is this class, which sets up the listeners for templates.

https://github.com/home-assistant/core/blob/470f2dd73f9f476cf5e7ba58885af2aea43169d5/homeassistant/helpers/event.py#L770

              info = async_track_template_result(
                  hass, [TrackTemplate(template, variables)], _template_changed_listener
              )
          
              return info.async_remove
          
          
          track_template = threaded_listener_factory(async_track_template)
          
          
          class _TrackTemplateResultInfo:
              """Handle removal / refresh of tracker."""
          
              def __init__(
                  self,
                  hass: HomeAssistant,
                  track_templates: Iterable[TrackTemplate],
                  action: Callable,
              ) -> None:
                  """Handle removal / refresh of tracker init."""
                  self.hass = hass

If there was a bug, it would be here. Specifically in this area:

https://github.com/home-assistant/core/blob/470f2dd73f9f476cf5e7ba58885af2aea43169d5/homeassistant/helpers/event.py#L975

                  continue
          
              template = track_template_.template
              self._setup_time_listener(template, self._info[template].has_time)
          
              info_changed = True
          
              if isinstance(update, TrackTemplateResult):
                  updates.append(update)
          
          if info_changed:
              assert self._track_state_changes
              self._track_state_changes.async_update_listeners(
                  _render_infos_to_track_states(
                      [
                          _suppress_domain_all_in_render_info(info)
                          if self._rate_limit.async_has_timer(template)
                          else info
                          for template, info in self._info.items()
                      ]
                  )

But as said earlier, I don’t think there is a bug in this code. As stated above, your system for some reason renders domain templates as None, which will update/remove the listeners for the template. That’s where you should be focusing and something is wrong with your system.

EDIT: To show you exactly the problem:

Notice how an invalid domain state still renders to a template DomainState object? This is what drives the update. Yours was rendering unknown, which is actually None because the template editor code changes None to unknown when rendering.

You’re the only one who has that issue. If you can replicate it, that’s what you should be writing up. Not the mess above because the mess above is a result of the underlying issue on your system.

famous.bulb · July 23, 2021, 4:24pm

Apologies if that’s how that came across. I was trying to communicate that irrespective of how the template is evaluated, the when is not consistent. It sounds like we’re on the same page.

The unknown issue hasn’t happened since the one time I was able to observe it. I wish I had the mental clarity at the time to screenshot it. Since I have not been able to observe it before or since, I am treating it like an outlier; something that possibly resulted from some sort of user error in administering the test or observing the results… the most likely source of most outliers.

In zero of the screenshots I posted yesterday did that particular unknown or None behavior re-surface. If it is all the same code underneath, then it is incredibly difficult to explain why the template editor was accurate but the template sensor was not. Since they both consist of the same expression {{states.persistent_notification|length}}, and the expression in the template editor was always the correct integer, it is safe to conclude that the template sensor also would have never evaluated to None.

If, somehow, the template entity had evaluated to something that the length function could not be applied to, then the template editor would have surfaced a TypeError or similar exception… right?

Neither home-assistant.log nor the logs viewer through the web UI indicates any such problem, though. If I had seen any, of course they’d be included in a post!

As a quick test:

me@some-host# cat devices/template/notifications.yaml 
# get a count of the number of persistent notifications
##
-  sensor:
    - name: "Active Persistent Notifications"
      # Unique ID is required for mgmt through the web UI
      unique_id: tmpl-active-persistent-notifications
      icon: >-
        {% if states.persistent_notification|length > 0 %}
          mdi:bell-badge
        {% else %}
          mdi:bell
        {% endif %}
      state: >
        {{states.persistent_notification|length}}

# quick test
-  sensor:
    - name: "TEST PNOT CT"
      state: >
        {{None|length}}

me@some-host# grep -i "error.*template.*$" home-assistant.log
2021-07-23 15:37:01 ERROR (MainThread) [homeassistant.helpers.event] Error while processing template: Template("{{None|length}}")
2021-07-23 15:37:01 ERROR (MainThread) [homeassistant.components.template.template_entity] TemplateError('TypeError: object of type 'NoneType' has no len()') while processing template 'Template("{{None|length}}")' for attribute '_attr_state' in entity 'sensor.test_pnot_ct'
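
The TypeError text in that log line is the same one plain Python produces when `len` is applied to `None`, which is why a None reaching the length filter would be expected to be loud rather than silent. A minimal sketch:

```python
# Applying len() to None fails loudly instead of returning a number.
try:
    len(None)
except TypeError as err:
    message = str(err)

print(message)  # object of type 'NoneType' has no len()
```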

This is good to know and starts to sound plausible. If the callbacks are removed from the template sensor, is it safe to assume that the cached value for the template sensor would be persisted? If that assumption holds, and something went wrong w/ the template sensor evaluation (but not the dev tools template eval?! and nothing in logs!?) between notifications being fired off and dismissed, it would explain why the value does not decrement as expected but usually does increment as expected. Calling reload on the template service ‘fixing’ the issue also lends some credibility that callbacks to re-evaluate are not firing (assuming because they no longer exist).

I posted because it does appear that I’m the only one w/ the issue and the time to document it publicly.

The original intent of this thread was to identify where the cause of these symptoms might exist. This whole time I have been trying to precisely communicate what the issue is and, with some difficulty, when it occurs… so that it becomes a narrower and narrower problem description that can eventually be replicated in an isolated environment.

I am grateful for most of the posts in this thread; they have been helpful in moving me towards the end goal of understanding the problem with the intent of solving it.

I’ll turn up the logging to see if anything else pops up:

me@some-host # grep -A3 -B1 "logger" configuration.yaml
# You can turn specific components to a specific level
# See: https://www.home-assistant.io/integrations/logger/#log-levels
#
# critical, fatal, error, warning, warn, info, debug, notset
##
logger:
  default: info
  logs:
    homeassistant.components.template.template_entity: debug

petro · July 23, 2021, 4:40pm

It appears that way, but I’d expect an error when it fails.

Yes, if your listeners are removed it would stay static, forever. Only a reload (which re-evals the template) would cause the listener to be recreated.

That’s why this is troubling:

If that simple template resolves the domain as None, then the listeners will go away and the template will “freeze”.
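
The "freeze" can be pictured with a toy model. This is a sketch under stated assumptions, not Home Assistant's code: `ToyBus`, `ToyTemplateSensor`, and the `world` dictionary are all hypothetical, and the rule "a None render removes the listener" is a simplified reading of the explanation above.

```python
from typing import Callable, List, Optional


class ToyBus:
    """Hypothetical stand-in for whatever delivers state-change events."""

    def __init__(self) -> None:
        self.listeners: List[Callable[[], None]] = []

    def fire(self) -> None:
        # Iterate over a copy so a listener may remove itself while running.
        for listener in list(self.listeners):
            listener()


class ToyTemplateSensor:
    def __init__(self, bus: ToyBus, render: Callable[[], Optional[int]]) -> None:
        self.bus = bus
        self.render = render
        self.state: Optional[int] = None
        self.reload()

    def reload(self) -> None:
        # Like reloading template entities: re-create the listener, re-render.
        if self._update not in self.bus.listeners:
            self.bus.listeners.append(self._update)
        self._update()

    def _update(self) -> None:
        result = self.render()
        if result is None:
            # Modeled failure: nothing to track, so stop listening.
            # The last good state is kept as-is.
            self.bus.listeners.remove(self._update)
            return
        self.state = result


world = {"notifications": ["example"], "domain_missing": False}


def render() -> Optional[int]:
    return None if world["domain_missing"] else len(world["notifications"])


bus = ToyBus()
sensor = ToyTemplateSensor(bus, render)
states = [sensor.state]            # initial render

world["domain_missing"] = True     # hypothetical glitch: domain renders as None
bus.fire()                         # listener removes itself

world["domain_missing"] = False
world["notifications"].clear()     # notification dismissed
bus.fire()                         # nobody is listening any more
states.append(sensor.state)        # stale value

sensor.reload()
states.append(sensor.state)        # fresh value after reload
```

In this model the sensor keeps reporting 1 after the notification is dismissed and only drops to 0 once `reload()` re-creates the listener, which matches the symptom pattern described in the thread.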

Either way, turning on debug is a good place to start and hopefully the messages can lead to a cause. Because right now, what’s happening doesn’t make sense. I was hoping that you had more to your configuration that would point to a simple solution but that does not seem to be the case. All in all, the path forward is to tabulate this information inside an issue on github.