Heads up! Upcoming breaking change in the Template integration

Not really sure what the difference is between these?

Here you are confusing frontend and backend. Custom-ui is done in the frontend (browser-side computation), just like Lovelace. Jinja is backend (server-side) computation, so yes, that's what shows up in the processor usage.

For the sake of experiment I've just taken out my full package of customizations, which leads to exactly no change at all in processor usage. So that's not it.

But this thread is about breaking changes in the Template integration, and Amelchio has stated what he is doing and why.
We are all eager to see them live.

Those are fine words but a Template Sensor’s entity_id option has existed for a long time and has been used (and understood) by many users since it was implemented. There are countless examples of its use in this community forum.

The removal of entity_id was reported as a Breaking Change for good reason: it breaks existing Template Sensors and requires users to modify them. In some cases, it’s easy whereas in others the official guidance is to replace the Template Sensor with an automation. That’s a clear example of “loss of functionality”.

Moreover, there has been no technical justification for the removal of entity_id. I have yet to see a compelling explanation for why the new entity-identification technique cannot use entity_id. The entities are clearly specified thereby sparing the new technique the task of identifying them in the template (i.e. use these entities and don’t look in the template).

Why is there resistance to re-instating entity_id but a willingness to add options, like scan_interval and auto_update, that attempt to do the same thing?


Again, I am not confusing frontend and backend. I made a comment about cluttering the system versus making use of the core functionality now available.

But it’s clearly pointless so I give up. Good luck with your challenges.

Please let me ask a related question about creating an efficient automation for updating, as we've discussed before:

  - alias: Last Automation
    id: Last Automation
    trigger:
#      platform: event
#      event_type: call_service
#      event_data:
#        domain: automation
#        service: trigger
      platform: event
      event_type: state_changed
    condition:

Since this listens to all state changes, I suppose it could be made more efficient. As you can see, I tried to do it with the call_service event on the automation domain (because this is an automation that tracks all triggered automations…), but that doesn't work. I don't see an error though, so I don't yet understand why this won't work.
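
One possible alternative, sketched here untested: a call_service event for automation.trigger only fires when that service is explicitly called, not when an automation fires from its own triggers, which is probably why the commented-out trigger above never matches. Home Assistant does fire an automation_triggered event whenever any automation runs, so triggering on that would avoid both the call_service filtering and listening to every state_changed. Note that python_script.last_automation below is hypothetical, mirroring the python_script.last_script used further down.

  - alias: Last Automation
    id: Last Automation
    trigger:
      platform: event
      # fired by Home Assistant whenever any automation is triggered
      event_type: automation_triggered
    condition: []
    mode: queued
    max: 50
    action:
      # hypothetical counterpart to python_script.last_script
      - service: python_script.last_automation
        data:
          event: >
            {{ trigger.event }}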

A related automation for tracking the last scripts does work like that:

  - alias: Last Script
    id: Last Script
    trigger:
      platform: event
      event_type: call_service
      event_data:
        domain: script
    condition: []
    mode: queued
    max: 50
    action:
      - delay:
          seconds: 1
      - service: python_script.last_script
        data:
          event: >
            {{trigger.event}}

Am I using the event trigger incorrectly, or is this simply not possible like that?

I updated to 115.2 this morning and, as far as CPU usage goes, mine increased by about 70% (from around 15% to around 25%). And I'm running on a NUC, not on a Pi.

I’ve only had one sensor that started with {{states | …}} (the infamous unavailable entities sensor) but I’ve changed it to only look at a few domains. So that can’t be the issue.

So this update definitely didn’t do me any favors where CPU load is concerned.


How can you be certain that templates are causing the bump? I’m still running @ 10% (like normal) on my nuc.

I hate ‘me too’ posts but…

When I upgraded I went from 3-4% to 7-10%, peaking intermittently at ~50%.
Now, though, I am back down to 3%, often even 2%.

I did a fair bit of investigating and made quite a few changes, including removing (for now) the component counters (although frankly they seem a bit useless now the novelty has worn off, so they may never come back!) but keeping the 'infamous' unavailable entities sensor.

Just sayin’ :wink:

The 0.115 update added slightly more load than one chatty sensor for me:

Though 0.115 did alter my long term daily minimum level:

The cpu load drop from those listener optimisations (0.113?) was sweet while it lasted.

I was defining entity ids for all my templates prior to 0.115 so there was no need to automatically determine what to listen for.


I don't know how the underlying code works, but I do get that automating the entity_id identification from the template itself simplifies things for users. However, would it not have been possible to keep the automatic detection of entity_ids from the template ALONG WITH establishing an additional listener for every entity_id the user specifically includes under an entity_id key? That would seem to resolve the issues being discussed. The code should now exist for both methods; can logic not be included to allow both?
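
Purely as an illustration of that suggestion (this is not an option that exists in 0.115), the combined behaviour could look something like the following, where the explicitly listed entities would add listeners on top of the automatically detected ones:

  sensor:
    - platform: template
      sensors:
        example_sensor:
          # listeners detected automatically from the template, as in 0.115...
          value_template: "{{ states('sensor.a') | float + states('sensor.b') | float }}"
          # ...hypothetical: plus additional listeners for anything listed here
          entity_id:
            - sensor.time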

I'm not saying it was the templates. I actually don't think I have any of the offending templates. I'm saying it was simply the update that caused my CPU load to increase.

So what others in the thread are seeing might be from the update in general instead of the templates.

Just an FYI for anyone seeing the same thing.

Yeah, that’s what I’m trying to figure out. I didn’t see any bump but others are seeing a huge bump. I also don’t use automations or too many templates. When I do use templates, I make sure they all work off entity state changes. I might have to put my cpu data into another software package. Maybe I do have a bump.

Prior to 0.115 it worked like this:

  • Home Assistant inspects the value_template option, identifies entities, and assigns listeners to them. If it can’t find any entities, your Template Sensor is evaluated at startup and never again (until the next restart).

  • If you also include the entity_id option, Home Assistant assigns listeners to each entity you’ve specified and does not inspect value_template. So entity_id serves to supersede value_template. You have complete manual control over what causes value_template to be evaluated.

In 0.115 it works like this:

  • Home Assistant inspects value_template, identifies entities, and assigns listeners to them. If it can’t find any entities, your Template Sensor is evaluated at startup and never again (until the next restart).

  • The entity_id option is no longer available to supersede value_template. You no longer have manual control to override the automatic system.

To be clear, the new entity-identification process is more thorough than the previous one. For example, it now handles expand properly and understands states, state.sensor, etc. I believe this is why it was decided to deprecate entity_id. However, there are situations where you may still want manual control to supersede the automatic system.
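
For reference, the pre-0.115 manual override described above looked roughly like this (entity names are purely illustrative):

  sensor:
    - platform: template
      sensors:
        average_temperature:
          # With entity_id present, value_template was not inspected;
          # only these two entities were assigned listeners.
          entity_id:
            - sensor.temp_upstairs
            - sensor.temp_downstairs
          value_template: >
            {{ ((states('sensor.temp_upstairs') | float)
                + (states('sensor.temp_downstairs') | float)) / 2 }}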

So far, I have not heard a compelling technical reason that prevents restoring the entity_id option. Instead, there’s been talk of adding new options that, irony of ironies, serve as substitutes for entity_id. :man_shrugging:


It seems that I have to point out that I am just a single guy discussing my thoughts, just like all of you. I do not speak for Home Assistant and I am not at all certain that auto_update would even be accepted since it is, honestly, still ugly.

You ask for technical justification. If my previous post was not enough, I am not sure what could satisfy you. There is no reason that entity_id could not be re-implemented if that’s what you want to hear. The technical terms are “feature creep”, “technical debt” and possibly even “second system syndrome”.

It would be very helpful if you could show an example of a template that no longer works for you with the new engine.

And as additional info…

I've also noticed the system having higher latency than it did before as well.

I have a light switch that directly controls a Shelly, and toggling the Shelly's state in turn toggles a smart bulb. It used to be almost instantaneous. Now there's at least a 0.5 to 1 second delay between flipping the switch and the light toggling.

So "the bump" really is having an impact on performance.

Too bad there’s no good/easy way to figure out what’s causing it.

The feature was removed so to call its reinstatement “feature creep” would be disingenuous. If there’s no technical obstacle then I recommend it be restored because it provides manual control over which entities are assigned listeners.

A prime example is the one discussed at length in the posts above, namely the “Sensor - Unavailable/Offline Detection”. Its template uses states which, in previous versions, was not assigned any listeners and that was a good thing for that particular Template Sensor’s purposes. It was sufficient to evaluate the template once a minute by simply specifying entity_id: sensor.time. Effectively, we had manual control over which entity (or entities) served to refresh the template. We could constrain the listeners.
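
In configuration terms, that pattern looked roughly like this (a simplified sketch, not the full sensor from that topic):

  sensor:
    - platform: template
      sensors:
        unavailable_entities:
          # sensor.time changes once a minute, so it was the only listener:
          # the template was re-evaluated once a minute regardless of how
          # many entities it touches internally.
          entity_id: sensor.time
          value_template: >
            {{ states | selectattr('state', 'in', ['unavailable', 'unknown'])
               | map(attribute='entity_id') | list | count }}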

It's no longer possible to do that in 0.115 (by definition, "loss of functionality"). That means instead of one listener (for sensor.time) it now gets the maximum number of possible listeners (one for each entity). Instead of updating once a minute, it now updates at least that often and usually much more. It uses more resources than needed to get the job done and there's no control over it (short of forgoing the use of a Template Sensor altogether and resorting to a python_script … which is what Marius is doing).

I just outlined the issue to someone else who asked where to now include sensor.time within the template:

Yes.
Another possible system killer is a template like:

      {% set ns = namespace(domains=[]) %}
      {% for d in states|groupby('domain') %}
      {% set ns.domains = ns.domains + [d[0]] %}
      {% endfor %}
      {% set list = ns.domains|join('\n') %}
      {{list if list|count < 255 else
        list|replace('input','inp')|truncate(255,true)}}

I have 3 systems: the main system with all other integrations loaded and a rather large backend setup, plus 2 smaller ones dedicated to Z-wave (Aeotec stick) and MQTT, serving as the dedicated broker. The latter 2 take away as much stress from the production system as possible.
All are on 115.3 now. My production system immediately breaks upon loading this template and grinds to a halt. I can only restart it using the command line.
This template didn't break a sweat in 114. The other two (Z-wave and MQTT) can run it without obvious trouble, but of course they have practically nothing to track…

I have to add that the above template is not even the complete Template Sensor, but an attribute_template. The main value_template was the count:

{{states|groupby('domain')|count}}

Now these seem to be solved in 116, but I mention it here to be completely transparent :wink:
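
For what it's worth, the same domain list can probably be built without the namespace loop. A simplified, untested sketch (it still iterates all states, but avoids the groupby sort and the loop, and drops the input→inp abbreviation in favour of a plain truncate):

      {{ states | map(attribute='domain') | unique | join('\n') | truncate(255, true) }}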


With respect finity, with HA that was always a possibility (probability even), as HA works on 1-second updates. If you hit the button 1 ms before the 'scan' (sorry, I can only relate this to PLCs) then the response will be effectively instantaneous (well, 1 ms); if you hit it 999 ms before the scan it will be 1 s. And the average will be about 0.5 s in normal use. If you have Z-wave devices talking directly (having been put in the same group) then you may be able to beat that half-second average. (From what I've read this is also possible with some Philips Hue stuff, but I can't attest.) You could probably also do it with ESPHome but would need dedicated communication. All of these inherently bypass HA as a controller, so you'd lose flexibility and probably some control too.
I don't know what else to say other than you've been very lucky with the timings previously and are maybe noticing it more now that you are 'looking for symptoms'.
:man_shrugging:

With respect mutt,

That’s kind of insulting…I know it wasn’t intended but please consider who you are talking to.

I installed this ceiling fan a year ago and have been operating it as-is since then. I think I would have noticed latency in that amount of time if it had just been luck.

And to be fair I wasn’t even thinking about or even considering that a 10 or 12 percent increase in CPU was causing any issues with response time. As you can see (and as I’ve mentioned in other threads) my CPU usage at times gets up to over 50% and I’ve never seen any latency.

I got home from work and flipped on the light switch and the light didn’t come on…then it did…I thought it was strange but didn’t even think about it being related until it kept happening every single time since then. And even this morning it’s still doing the same thing.

So, no, I wasn’t “lucky” before and I’m not suddenly “unlucky” now. Something is going on.

Understood.
I know who I'm speaking to and would trust your observations above those of 99% of members.
And it is coincident with the recent upgrades etc. Insulting you was the furthest thing from my mind (for which I apologise).
But the 1 second update of HA has been discussed many times, so that will have to be taken into account.
The Shelly is connected by WiFi; the smart bulb is connected via ??? Zigbee, Z-wave, WiFi ???
Given your experience with these, what is your best guess for the various propagation delays involved here?
Switch to Shelly - instantaneous?
Shelly over WiFi to HA - 50 to 120 ms?
HA read, process, write - assuming good timings - 50 ms?
HA to bulb - depends on transport and protocol: WiFi - 50 to 120 ms? Zigbee - your guess is better than mine? Z-wave - I'm a Z-wave fan but I've seen delays varying from, say, a quarter of a second to 3 secs? (Whether the Z-wave is native HA-to-Z-wave or has to be translated to MQTT first and back before passing through the Z-wave controller API, I don't know the speed differences, but there must be some.)
Given this chain, and you having to make an estimate, what would you estimate? (Not from previous experience but in theory.) AND, if your life depended on it, what would you guarantee to (say) a client/friend/relative you had just installed this exact setup for?
Evidently something has changed in your setup, but unless your processor usage is over (say) 80% I wouldn't say that your increased overhead is actually affecting this point-to-point response.
HA is designed to cope with varying loads whilst maintaining a reasonably consistent response (part of the one-second updates; otherwise they'd finish one cycle and immediately start on the next, so massive processing power/speed would pay massive dividends). (Edit: but a LOT of people run quite large systems happily on a Pi3b.)
I genuinely would be interested in anyone's views on what is absorbing this additional time.
And particularly what you think your propagation times 'should' be.
Also, surrounding conditions: was someone streaming a 4K film, clogging your WiFi bandwidth, for example?
(Edit 2: I also know that you know ALL of the above, I'm just putting it in context for everyone.)

Believe me, I get what you are saying.

I'm using ESPHome for the most part for the WiFi stuff. The light in question is Zigbee.

I do expect some small latency, of course. A quarter second seems reasonable, which is what I was previously getting.

As a test I reverted back to v114.2 and I immediately experienced my previous performance.

Here is an example:

Then I updated to 115.2 again, and this is what I get now:

I’d say that’s a pretty significant difference. The videos were taken 16 minutes apart (and everybody else is still in bed…) so literally the only difference is the HA version.
