Python3 high CPU Usage

Thanks for the py-spy recordings. Telegram looks like it's taking up a fair amount of CPU time.

Did you hit the 35% cpu when these were taken?

Also, a dump at the same time would be helpful.

No, it was the day after the reset (peak about 15%). Since then there was an update to 0.114.1, so I have to wait around 2 days to see whether CPU consumption climbs to 35%. I will let you know and try to make a py-spy record again.

Hi @bdraco, I’m also one of the users seeing CPU usage climb until saturation. I’m trying to isolate and fix it locally by applying all the changes being made to the suspected integrations :), so I think I can add some interesting info about what I found:

Suspects

  • Indeed, telegram_bot (configured as “polling”) takes a lot of threads and uses a lot of CPU (but before 0.112 it was not a problem). Changing it to “webhooks” (which, with NabuCasa, works seamlessly) does reduce this, but makes no difference to the underlying CPU issue :frowning: (but I could disable it for a test run if required :slight_smile: )

  • I have 3 cast devices, but I removed the integration and marked it as “ignore”.

  • 3 esphome devices. I locally applied the changes from “Add the ability to use the shared Home Assistant Zeroconf instance” (Pull Request #13, esphome/aioesphomeapi) and “Use the shared Zeroconf instance in esphome” (Pull Request #38747, home-assistant/core), both by bdraco.

  • 8 Shelly devices (custom integration ShellyForHass), which I read somewhere was a suspect, and indeed it is.

    I manually changed the CC repo and the main library (PyShelly) to share the zeroconf instance from HA, and I have both PRs ready to open if it works, but my first impression after the change was that the issue appeared sooner! (CPU saturation in < 15h, when it initially took > 1 day)

  • 2 Sonos devices, which I think I also saw in the suspect list. They take 9 threads (but with no increase over time, and they are not especially present in the py-spy plots)
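To illustrate the "share the zeroconf instance" change mentioned in the list above, here is a minimal sketch of the general pattern: a library accepts an instance from its host and only closes an instance it created itself. `FakeZeroconf` and `DeviceDiscovery` are hypothetical stand-ins written for this example, not code from PyShelly, aioesphomeapi, or python-zeroconf.

```python
from typing import Optional


class FakeZeroconf:
    """Stand-in for a zeroconf instance so the sketch runs without the library."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class DeviceDiscovery:
    """Hypothetical library-side helper that can reuse a host-provided instance."""

    def __init__(self, zeroconf_instance: Optional[FakeZeroconf] = None) -> None:
        # Remember whether we created the instance, so we know who must close it.
        self._owns_instance = zeroconf_instance is None
        if zeroconf_instance is None:
            zeroconf_instance = FakeZeroconf()
        self.zeroconf = zeroconf_instance

    def stop(self) -> None:
        # A shared instance belongs to the host application; leave it open.
        if self._owns_instance:
            self.zeroconf.close()
```

With this shape, several integrations can be handed the same instance instead of each starting its own set of sockets and threads.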

Setup context:

  • RPI4-4GB+SSD+UPS+conbee II stick, solid as a rock :slight_smile:
  • Supervisor install with docker (no pure-hassio), v0.114.0 with custom changes (the ones described above)
  • ADDONS: Appdaemon4, deCONZ, esphome, vcode, ADB, adguard, mqtt, dhcp, mariaDB, samba, nginx-proxy-manager
  • Custom integrations: shelly, xiaomi_miio_fan, eventsensor
  • Integrations: homekit, mobile_app(iOS) x4, sonos x2, androidtv x2, hue x16, nut, tplink x1, nabucasa + alexa, denon_avr, tuya x3, influxdb, recorder, and the ones from addons: deconz x29, esphome x3, adguard, mqtt

Last experiment

I’ve just found and applied all changes from your 2 PRs in zeroconf: “Reduce the time window that the handlers lock is held” (Pull Request #287, python-zeroconf/python-zeroconf) and “Ensure all listeners are cleaned up on ServiceBrowser cancelation” (Pull Request #290, python-zeroconf/python-zeroconf), so I will restart once more to test the behavior. Edited: done, with interesting results:

These are my last HA sessions, plotting a 15-min rolling mean of the reported CPU usage.

The current one, with all changes described, looks stable for now (~7h running), but it is running with a 3-4% increase in CPU usage over the stable reference on v0.112, so maybe there is something more under the hood :frowning:
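For anyone wanting to reproduce the 15-min rolling mean from their own CPU readings, a minimal sketch follows. It assumes the readings are available as `(unix_timestamp_seconds, cpu_percent)` pairs in ascending time order; the function name and input shape are choices made for this example, not how the plots above were produced.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, CPU percent)


def rolling_mean(samples: Iterable[Sample], window_s: float = 900.0) -> List[Sample]:
    """Return (timestamp, mean over the trailing window) for every sample.

    The window covers the interval (ts - window_s, ts], so 900 s = 15 minutes.
    """
    window: Deque[Sample] = deque()
    total = 0.0
    out: List[Sample] = []
    for ts, value in samples:
        window.append((ts, value))
        total += value
        # Drop readings that are now older than the trailing window.
        while window[0][0] <= ts - window_s:
            _, old_value = window.popleft()
            total -= old_value
        out.append((ts, total / len(window)))
    return out
```

Smoothing like this makes a slow creep of a few percent visible that is easy to miss in the raw, noisy readings.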

I can provide py-spy dumps and plots. This dump is from right now:

For PyShelly: is it better once https://github.com/jstasiak/python-zeroconf/pull/290 is applied?

The only other thing that stands out is there have been some recent influxdb changes that another user reported was causing an issue.

For PyShelly: is it better once https://github.com/jstasiak/python-zeroconf/pull/290 is applied?

I think so, but I can’t be precise about that.
The system as a whole is definitely better with it than without it, but I cannot say anything specific about pyshelly (I’m not even sure it is using mdns in my config, as all devices are defined by ip and discovery is off right now :))

The only other thing that stands out is there have been some recent influxdb changes that another user reported was causing an issue.

Thanks, I’ll look into it, but my first impression is that influx is not the cause: I’ve not seen any influx errors in the logs, nor any sign of it in the py-spy dumps or plots. I’ll review it anyway.

The thing that surprised me the most was the pyhap usage. I don’t use homekit_controller, just homekit, to control a few things with Siri on watches :frowning:

BTW, the current run is staying stable, but things are definitely going wrong in the network/HA: I’m getting a lot of hue bridge fetch errors, deconz zigbee sensors sometimes not triggering, small delays on automations, and similar ghostly things… (I even tried restarting everything: router/zigbee hubs/individual devices, but it’s not that)

Tomorrow I’ll try to disable some things to continue testing, and maybe update to 0.114.2 and redo the custom changes in pyshelly, aioesphome and zeroconf again…
Is there anything in the 0.114.1 and 0.114.2 releases related to this issue that could help here?

If you are on python 3.8, https://github.com/home-assistant/core/pull/38821 could fix the executor being overloaded (in 0.114.1).

Saw it yesterday, I think. It is one of my customizations: first with 100, and in the latest runs with the value of 64 selected in the PR. No apparent change in behaviour :frowning:
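For context on the number being discussed, here is a minimal sketch of raising the worker cap on an asyncio default executor. It is not taken from the PR; it only uses the standard library, and `blocking_io` is a placeholder written for this example. On Python 3.8, a `ThreadPoolExecutor` created without `max_workers` defaults to `min(32, os.cpu_count() + 4)` threads.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

MAX_EXECUTOR_WORKERS = 64  # the value mentioned in the thread; hard-coded here


def blocking_io(n: int) -> int:
    """Placeholder for a blocking call that must run off the event loop."""
    return n * 2


async def main() -> list:
    loop = asyncio.get_running_loop()
    # Replace the loop's default executor with a larger thread pool.
    loop.set_default_executor(ThreadPoolExecutor(max_workers=MAX_EXECUTOR_WORKERS))
    # Passing None as the executor selects the default one set above.
    futures = [loop.run_in_executor(None, blocking_io, i) for i in range(5)]
    return await asyncio.gather(*futures)


results = asyncio.run(main())
```

A larger pool only helps when jobs are queueing behind blocked workers; it does not reduce the CPU a busy thread consumes, which would be consistent with seeing no change here.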

Mostly in reference to pyhap:

Also, to analyze the py-spy output, look at the file and line numbers. If a frame is a select()/sleep()-like operation, exclude it from the analysis, since you can assume it is just blocking and using no CPU.
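The filtering described above can be sketched in a few lines. This assumes the recording was exported as collapsed stacks, one per line in the form `frame;frame;leaf (file.py:123) <sample count>`; the list of blocking function names and the helper name are assumptions made for this example, so adjust them to what you actually see in your recordings.

```python
from collections import Counter
from typing import Iterable

# Assumed names of leaf functions that only wait; extend as needed.
BLOCKING_LEAVES = ("select", "sleep", "wait", "poll")


def busy_leaf_counts(collapsed_lines: Iterable[str]) -> Counter:
    """Sum sample counts per leaf frame, skipping stacks that end in a wait."""
    counts: Counter = Counter()
    for line in collapsed_lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated field on the line.
        stack, _, samples = line.rpartition(" ")
        leaf = stack.split(";")[-1]
        func_name = leaf.split(" ", 1)[0]
        if func_name in BLOCKING_LEAVES:
            continue  # blocked thread, not real CPU work
        counts[leaf] += int(samples)
    return counts
```

Calling `busy_leaf_counts(open("profile.txt")).most_common(10)` on such a file (the filename is a placeholder) lists the frames that were actually doing work.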

As requested @bdraco I am sending py-spy records (120, 360s and dump). After two days, CPU usage jumped to 30%. Is there anything else I can do to help fix this problem? Thanks for your help :slight_smile:



Thank you. telegram is the only thing that stands out. There is quite a bit of time in zeroconf. Let’s see if 0.114.3 solves the issue for you as it has the zeroconf fix.

I realized I only posted this in https://community.home-assistant.io/t/high-cpu-usage-after-0-113.

Here is the current status of all the known CPU-related issues that I’m aware of:

0.114.3 has been published. If you still see CPU creep, please post a new py-spy recording.

Thank you for your help. After the update cpu usage doesn’t increase :slight_smile:

The remaining known performance concerns that I’ve been tracking have now all been addressed in 0.115dev which is scheduled for beta tomorrow.

I run Home Assistant Core 0.115.1 in Docker on a NUC. With version 0.114.4 and earlier the processor usage was 6%, but now it does not drop below 25% and I don’t know why. Help please.

Please post a py-spy using the instructions above

Also debug logging may show the issue as well.

It looks like you have a template that is firing very frequently. If you turn on debug logging for homeassistant.event you should be able to see which one it is

How should I configure the logger?

Is this OK?

```yaml
logger:
  default: error
  logs:
    homeassistant.event: debug
```