Heads up! Upcoming breaking change in the Template integration

With respect mutt,

That’s kind of insulting…I know it wasn’t intended but please consider who you are talking to.

I installed this ceiling fan a year ago and have been operating it as it is since then. I think I would have noticed a latency in that amount of time if it was being “just lucky”.

And to be fair I wasn’t even thinking about or even considering that a 10 or 12 percent increase in CPU was causing any issues with response time. As you can see (and as I’ve mentioned in other threads) my CPU usage at times gets up to over 50% and I’ve never seen any latency.

I got home from work and flipped on the light switch and the light didn’t come on…then it did…I thought it was strange but didn’t even think about it being related until it kept happening every single time since then. And even this morning it’s still doing the same thing.

So, no, I wasn’t “lucky” before and I’m not suddenly “unlucky” now. Something is going on.

Understood.
I know who I’m speaking to and would trust your observations above 99% of members.
And it is coincident with the recent upgrades etc. Insulting you was furthest from my mind (for which I apologise).
But the 1 second update of HA has been discussed many times, so that will have to be taken into account.
The Shelly is connected by WiFi; is the smart bulb connected via Zigbee, Z-Wave or WiFi?
Given your experience with these what is your best guess for the various propagation delays involved here ?
Switch to Shelly - instantaneous ?
Shelly over WiFi to HA - 50 to 120 ms ?
HA read, process, write - assuming good timings - 50ms ?
HA to bulb - (depends on transport and protocol): WiFi - 50 to 120 ms? Zigbee - your guess is better than mine. Z-Wave - I’m a Z-Wave fan but I’ve seen delays varying from, say, a quarter of a second to 3 secs. (Whether Z-Wave is native HA to Z-Wave or has to be translated first to MQTT and then passed back through the Z-Wave controller API, I don’t know the speed differences, but there must be some.)
Given this chain, and you having to make an estimate what would you estimate ? (not from previous but in theory). AND, if your life depended on it what would you guarantee to (say) a client/friend /relative you just installed this exact setup for ?
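To put rough numbers on that chain, here is a minimal sketch that simply adds up the per-hop guesses listed above. Every figure is an estimate quoted from this post (in milliseconds), not a measurement, and the hop names are made up for illustration. Zigbee is left out because no figure was offered for it.

```python
# Rough latency budget for the switch -> Shelly -> HA -> bulb chain.
# All values are (low, high) guesses in milliseconds taken from the post,
# not measurements. The dictionary keys are illustrative names only.
COMMON_HOPS = {
    "switch_to_shelly": (0, 0),      # assumed instantaneous
    "shelly_to_ha_wifi": (50, 120),
    "ha_read_process_write": (50, 50),
}

# Final hop from HA to the bulb, depending on the transport.
LAST_HOP = {
    "wifi": (50, 120),
    "zwave": (250, 3000),  # "a quarter of a second to 3 secs"
}


def budget(transport):
    """Return the (best case, worst case) total in ms for one transport."""
    low = sum(lo for lo, _ in COMMON_HOPS.values()) + LAST_HOP[transport][0]
    high = sum(hi for _, hi in COMMON_HOPS.values()) + LAST_HOP[transport][1]
    return low, high


if __name__ == "__main__":
    for transport in LAST_HOP:
        low, high = budget(transport)
        print(f"{transport}: {low} to {high} ms")
```

With these guesses the WiFi case comes out at 150 to 290 ms end to end, so anything approaching a full second would have to come from somewhere outside this simple chain.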
Evidently something has changed in your setup but unless your processor usage is over (say) 80% then I wouldn’t say that your increased overhead is actually affecting this point to point response.
HA is designed to cope with varying loads whilst maintaining a reasonably consistent response (part of the one second updates, else they’d finish one cycle and immediately start on the next, so massive processing power/speed would pay massive dividends). (Edit: But a LOT of people run quite large systems happily on a Pi3b)
I genuinely would be interested for anyone’s views on what is absorbing this additional time.
And particularly what you think your propagation times ‘should’ be.
Also surrounding conditions, was someone streaming a 4k film clogging your WiFi bandwidth for example ?
(edit2: I also know that you know ALL of the above, I’m just putting it in context for everyone)

Believe me, I get what you are saying.

I’m using ESPHome for the most part for the WiFi stuff. The light in question is Zigbee.

I do expect some small latency, of course. A quarter second seems reasonable, which is what I was previously getting.

As a test I reverted back to v114.2 and I immediately experienced my previous performance.

Here is an example:

Then I updated again to 115.2 and this is what I get now:

I’d say that’s a pretty significant difference. The videos were taken 16 minutes apart (and everybody else is still in bed…) so literally the only difference is the HA version.

Wow !
Pretty damning
(Edit: AND a well documented effect !)

So I just checked out my CPU level for the last month and it’s been static @ 1% on average. You can check out my config to compare to yours. I’m also using ESPHome but I only have 2 devices with ~20 entities. The event loop is tight when performing automations on it. I only have 15 template sensors and about 10 or so template entities outside of that. Most of my stuff comes from AppDaemon. My memory dropped 2k from 15.2 to 15.3. But I removed tensorflow 1.0 and added tensorflow 2.0.

There’s a link posted by Bdraco for installing py-spy. It can reveal what is occupying Home Assistant’s time and potentially help the development team to fix it. Someone suggested in a WTH that it be included with Home Assistant to make it easier for users to provide its reports when logging a GitHub Issue.

Direct link: GitHub - benfred/py-spy: Sampling profiler for Python programs

ooh, I didn’t think about that.

I’ll see if I can get that running and maybe it’ll help narrow it down.

At this point I don’t even know what to put into a bug report. I’m sure that “I’m getting a bad latency on a few switches” won’t give them much to go on.

I’d upvote that to become an official Add-On, and a well documented addition to other install methods, so that such diagnostic information is available to all (if running supervised, i.e. able to be monitored by the devs, and enabled)
:+1:

Yep, especially for those amongst us using Home Assistant OS… rather difficult to set up in that.

I don’t know if it can be used as an Add-on (i.e. its own docker container). I may be wrong but it may need to run in the same context as the python program it is profiling (so in the homeassistant docker container).

It’s coded in Rust, not python, so that means it needs Rust-related resources. I don’t know how much space all of this takes but it may be a consideration when deciding whether to include it by default.


EDIT

If I have understood the following blog post correctly, it creates a docker container for py-spy and a separate container for a python test program. It then proceeds to use the containerized py-spy to profile the containerized test program. If this is true then, theoretically, py-spy could be packaged as an Add-on.

Full disclosure: I’m making a lot of assumptions …

An integration then?

Either way, Frenk’s your man !

Edit:
So you could ‘choose’ to install it (or not)
And you could then ‘choose’ to allow remote monitoring, just of py-spy data (or not)

That way everyone gets what they want.

Long term you could look at what processor hit it causes / data bandwidth it consumes and change accordingly.

In case someone has more spare time than I do this weekend. :slightly_smiling_face:

If you do have a spare moment, do check this https://github.com/home-assistant/core/issues/40621#issuecomment-699510926

Seems rather related…

No, if we want this as an “Official Add-On” I think it has to be undertaken by a member of the ‘Core-Team’ - I’m willing to be educated to the contrary though.
:man_shrugging:
As this ‘may’ provide useful data to the same …

I’d run it that way anyway

It certainly seemed that way but ultimately it was decided that it wasn’t the proper place to discuss it.

Moreover, the multiple posts about restoring entity_id were seen as harassing the developer (and hidden). In addition, anyone who brings up entity_id again will be punished by having their account banned from the project.

Obviously I won’t be participating in that thread anymore.

Well how unfriendly.
My contribution was hidden too. Unjustly so, since it was spot on the topic of the issue, adding an extra example of a template killing the instance in the dev template editor.
I’ve asked for it to be unhidden because of that.

My ability to continue using my GitHub account to post in the project is more important to me than this one issue. Whether the decision was “unfriendly” or not, I have no further interest in that thread.

I think “the devs” (whichever ones they are…) are a bit thin skinned and like playing the “harassment” card a bit too easily.

Along with playing that card too easily too.

I’ve run up against a failure to install the py-spy package in my HA Container system following those instructions.

The error I’m getting is the same one apop got in that thread. He said he fixed it but didn’t say how.

Any idea how to overcome that error?

And while I’m here I see that I need to run the “top” command to find the PID number. Can you give a full example of what the command is to get the info I need to run py-spy in the HA container? Everything I’ve found seems to already assume you know what the PID is before you run top.
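One possible way to find the PID without already knowing it is sketched below. It is not taken from the linked instructions. It assumes a Linux `/proc` filesystem, that it runs inside the same container as Home Assistant, and that the Home Assistant command line contains the string `homeassistant`; that search pattern is an assumption, so check it against your own process list. It only prints the py-spy commands (`top`, `dump`, and `record` with `--pid`) rather than running them, and `profile.svg` is just a placeholder file name. The py-spy README also notes that under Docker the container may need the `SYS_PTRACE` capability.

```python
import os


def find_pids(pattern, proc_root="/proc"):
    """Return PIDs whose command line contains `pattern` (Linux /proc layout)."""
    pids = []
    if not os.path.isdir(proc_root):
        return pids
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, name, "cmdline"), "rb") as f:
                # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        # Skip this script itself in case its own path matches the pattern.
        if pattern in cmdline and int(name) != os.getpid():
            pids.append(int(name))
    return sorted(pids)


def py_spy_command(pid, mode="top"):
    """Build a py-spy command line for the given PID."""
    if mode == "top":
        return ["py-spy", "top", "--pid", str(pid)]
    if mode == "dump":
        return ["py-spy", "dump", "--pid", str(pid)]
    if mode == "record":
        # profile.svg is a placeholder output file name.
        return ["py-spy", "record", "-o", "profile.svg", "--pid", str(pid)]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # "homeassistant" is an assumed search pattern, not a guaranteed match.
    for pid in find_pids("homeassistant"):
        print(" ".join(py_spy_command(pid)))
```

If more than one PID is printed, the main Home Assistant process is most likely the one whose command line starts the `homeassistant` module, but that too is a guess to verify on your own system.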