Python3 high CPU Usage

Johann_Edelmuller · October 24, 2020, 11:02pm

Yes, here please!

bdraco · October 24, 2020, 11:03pm

Thanks. Looks like the switch over to httpx is working as expected

tom_l · October 25, 2020, 4:36am

Running 0.117.0b3 Hassio NUC image.

I wish there was some way to work out what is causing this. 100% use of one core:

[image: core]
(It starts at 2 am Home Assistant local time; I’m in a different time zone. There is nothing in the log or logbook around that time.)

It only shows up on the core cpu sensor. The system monitor shows nothing:

[image: sys-mon]

Supervisor cpu is fine:

[image: superv]

All other addons are fine. I tried restarting them all anyway, but this caused no change.

Top shows nothing:

And we’ve tried profiler and py-spy without success.

Restarting tames the runaway cpu instantly.

[image]

Then it happens again a day or three later.

What is the difference between system monitor cpu and core cpu measurements?

bdraco · October 25, 2020, 12:35pm

It is likely an integration that has a runaway thread or loop.

Keep taking py-spy recordings each time it happens. Sometimes you have to compare four or five of them against the idle ones to see the pattern. Make sure to take a py-spy dump as well, as it will show the threads.
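
One way to make that comparison less of an eyeball exercise is to record in py-spy's `raw` format (collapsed stacks, one `frame;frame;... count` line per stack) and diff the share of samples per innermost frame between an idle and a busy recording. A minimal sketch, assuming recordings made with `py-spy record --format raw`; the function names and the sample data are hypothetical:

```python
from collections import Counter


def leaf_shares(collapsed_text):
    """Return {innermost_frame: fraction_of_samples} for collapsed-stack text."""
    counts = Counter()
    for line in collapsed_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "frame1;frame2;...;frameN <count>"; frames may contain spaces.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # skip anything that does not look like a sample line
        counts[stack.split(";")[-1]] += int(count)
    total = sum(counts.values())
    return {frame: n / total for frame, n in counts.items()} if total else {}


def biggest_increases(idle_text, busy_text, top=5):
    """List frames whose share of samples grew most between idle and busy runs."""
    idle, busy = leaf_shares(idle_text), leaf_shares(busy_text)
    deltas = {frame: share - idle.get(frame, 0.0) for frame, share in busy.items()}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real recording.
    idle = "run (loop.py:10);select (selectors.py:468) 90\nrun (loop.py:10);poll (zw.py:5) 10\n"
    busy = "run (loop.py:10);select (selectors.py:468) 20\nrun (loop.py:10);poll (zw.py:5) 80\n"
    for frame, delta in biggest_increases(idle, busy):
        print(f"{delta:+.0%}  {frame}")
```

Frames that climb to the top across several busy recordings are the ones worth looking at first; a single recording can easily mislead.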

You may have some luck with strace as well, if you can figure out which thread is running away and attach to it (you can use the py-spy dump output to narrow it down).

Then run:

strace -p <pid> -s 4096 -o trace.log
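
If the dump alone does not make the busy thread obvious, the per-thread CPU counters under `/proc` can help pick the thread ID to hand to `strace -p`. A rough sketch, assuming Linux and that it runs where the python3 process is visible (for example inside the same container); the helper names are made up for illustration:

```python
import os
import sys
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/task/<tid>/stat line."""
    # The thread name sits in parentheses and may contain spaces, so split after the last ')'.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def thread_ticks(pid):
    """Map each thread ID of pid to its accumulated CPU ticks."""
    ticks = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as handle:
                ticks[int(tid)] = parse_cpu_ticks(handle.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # thread exited between listdir() and open()
    return ticks


def busiest_threads(before, after, top=3):
    """Return (tid, ticks_used) pairs sorted by CPU used between two snapshots."""
    used = {tid: after[tid] - before.get(tid, 0) for tid in after}
    return sorted(used.items(), key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__" and len(sys.argv) > 1:
    pid = int(sys.argv[1])  # PID of the python3 process, e.g. taken from top
    first = thread_ticks(pid)
    time.sleep(5)
    for tid, used in busiest_threads(first, thread_ticks(pid)):
        print(f"tid={tid} used {used} ticks in 5 s")
```

The thread ID with the largest tick count is the candidate to match against the py-spy dump output before attaching strace to it.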

Mariusthvdb · October 26, 2020, 8:55am

Oh, that’s nice, I didn’t realize we could measure per add-on. How do you get all that info per add-on, Tom? Is that yet another REST sensor?

Also, are those buttons with an embedded mini-graph? If so, could you please share the config of one of them? I would definitely like to try that.

tom_l · October 26, 2020, 9:37am

See this topic: HA 0.116 / Supervisor 247 - Core and Supervisor & Addon Stats

QbaF · November 2, 2020, 8:53pm

In those situations, clearing the browser cache usually helps me.

QbaF · November 2, 2020, 8:54pm

In those situations, flushing the browser cache and restarting the browser usually helps me.

tom_l · November 2, 2020, 9:39pm

No, just standard mini-graph cards.

tom_l · November 13, 2020, 8:27am

Is py-spy only recording what is going on in the home assistant container?

Because I don’t think my issue is there.

The system monitor integration does not record an increase in CPU use. Likewise, the supervisor and all my add-ons do not record an increase. Only the Core CPU use rises, which, I take it, includes host OS processes.

So it appears to be something on the host running amok that is reset by the home assistant container restarting. Any idea what that could be?

Unfortunately I have no access to the host for another 4 months until I return home and can share some SSH keys. Though I may be able to get someone to shuffle a USB drive for me before then.

bdraco · November 13, 2020, 3:54pm

Py-spy is only sampling what is going on in the core python process inside the container.

tom_l · November 13, 2020, 4:08pm

Yeah, that’s what I thought. Will see if I can get a sneaker-net volunteer to move some SSH keys for me.

loupiz · November 22, 2020, 9:03am

Hello,

I’m suffering from high CPU usage too since version 0.117.

I think it is related to Z-Wave, because my network stops working when it happens and the CPU is back to normal when I stop the network.

I did a py-spy for 120 seconds. Here is the svg file.

Can anyone help me?

Thanks!

bdraco · November 22, 2020, 9:17pm

Do you have more than core running? It doesn’t look like core is the source of the high cpu from the py-spy.

loupiz · November 24, 2020, 6:45am

No, the only process using CPU is HA. Usually CPU is around 5%.

I did 3 snapshots with py-spy:

First one is after a fresh reboot. HA and Zwave are working fine.

Second one after the CPU reached 25-26% and the Z-Wave network stopped working.

Third one after stopping the Z-Wave network. CPU is back to normal.

Maybe putting the snapshots side by side will show something is wrong?

By the way I noticed I couldn’t restart zwave after this happened. I get this error:
2020-11-24 06:40:18.684 Error, ERROR: Cannot open serial port /dev/ttyACM0. Error code 6
2020-11-24 06:40:18.684 Error, ERROR: Failed to open serial port /dev/ttyACM0

After rebooting everything is back to normal (until it comes back again after more or less 24 hours).

Edit: this seems related to the issue "Z-wave stop working after midnight".

bdraco · November 29, 2020, 12:26am

There isn’t anything interesting in the py-spy recording. Can you do a py-spy top when it happens again?

loupiz · November 29, 2020, 5:52pm

I think I finally found something. The problem seems to come from the Raspberry Pi USB port where the Z-Wave stick is plugged in. The device path actually changes from /dev/ttyACM0 to /dev/ttyACM1, which causes the Z-Wave network to stop working in HA. I’m not sure how it affects the CPU, though, or why it mostly happens around midnight. Anyway, I switched the stick to another port and the problem has not come back since. Thanks for trying to help.
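
For anyone hitting the same renumbering: on Linux, udev usually also exposes USB serial devices under `/dev/serial/by-id/`, with names that stay the same when the kernel moves the stick from ttyACM0 to ttyACM1. A small sketch that lists those links and what they currently point to, assuming that directory exists on your system (the function name is made up for illustration):

```python
import os


def stable_serial_paths(by_id_dir="/dev/serial/by-id"):
    """Map each persistent symlink in by_id_dir to the device node it currently targets."""
    if not os.path.isdir(by_id_dir):
        return {}  # nothing to list if the directory does not exist
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        mapping[link] = os.path.realpath(link)
    return mapping


if __name__ == "__main__":
    for link, target in stable_serial_paths().items():
        print(f"{link} -> {target}")
```

If a link for the stick shows up, pointing the Z-Wave configuration at that path instead of /dev/ttyACM0 should in principle survive the device being renumbered, though that is untested here.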

GiulianoTO · November 30, 2020, 8:48am

Hi, the CPU of my i7 PC, which runs Home Assistant in a VM, jumps to 100% after a few days of just sitting there and not doing much. Only a restart brings it back down to the usual 1-5%, but after a few days it jumps back to 100%, and I have no idea why.

I’m new to py-spy and would appreciate it if someone could help me decipher the output, as I have no idea what I’m looking at.

This is the screenshot from Glances where you can see python3 being above 80% all the time.

This is the py-spy SVG

What I noticed during the sampling is that I was getting a message saying sampling is behind and results may be inaccurate. I don’t know what to do to prevent that or if I got valid results.

bdraco · November 30, 2020, 6:16pm

Would you please post a py-spy dump and top as well?

It looks like py-spy was only able to sample for a very short window, so it’s likely not too useful. You can decrease the sampling rate with --rate.
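
To spell out the three requests, here is a small sketch that only builds the py-spy command lines (it does not run them). The rate of 20 samples per second, the 120 second duration, the output file name and the PID are placeholder assumptions; py-spy's default rate is 100 samples per second, so anything lower reduces the sampling load:

```python
import shlex


def py_spy_commands(pid, rate=20, duration=120, output="profile.svg"):
    """Build argv lists for a reduced-rate recording, a thread dump and a live top view."""
    target = ["--pid", str(pid)]
    return [
        ["py-spy", "record", *target, "--rate", str(rate),
         "--duration", str(duration), "--output", output],
        ["py-spy", "dump", *target],
        ["py-spy", "top", *target],
    ]


if __name__ == "__main__":
    for argv in py_spy_commands(12345):  # 12345 is a placeholder PID
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run` as-is, or the printed lines can be pasted into a shell inside the container.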

GiulianoTO · November 30, 2020, 10:38pm

Hi @bdraco, I have restarted HA, so it is behaving nicely at the moment. As soon as the CPU jumps up again, I will run py-spy.