CPU usage goes over 100% and stays until restart of the Docker container

I have been running HA in Docker (see specs below) for almost three years without any performance issues, but on October 10 I started getting issues with high CPU load from the HA Docker container.

Normally CPU usage is between 5% and 10%, but it suddenly increases to 100% (and above), and the CPU temperature rises from 50 °C to 95 °C and stays there until the container is restarted. After the restart, CPU usage goes back down to 5%.

This issue repeats every day with no deterministic timing: it can start about 5 hours after a restart, sometimes later and sometimes sooner.
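Because the spike has no fixed schedule, it can help to log CPU readings and note when the load first stays high, so the time can be matched against the HA log. Below is a minimal sketch, assuming the readings are percentage strings like the ones that docker stats prints in its CPU column. The sample values, the function names, the 90% threshold and the three-sample window are all made up for illustration.

```python
def parse_cpu_percent(text):
    """Convert a string such as '104.23%' to the float 104.23."""
    return float(text.strip().rstrip("%"))


def first_sustained_spike(samples, threshold=90.0, min_consecutive=3):
    """Return the index where CPU first stays at or above threshold for
    min_consecutive samples in a row, or None if that never happens."""
    run_start = None
    run_length = 0
    for index, value in enumerate(samples):
        if value >= threshold:
            if run_length == 0:
                run_start = index  # a new run of high readings begins here
            run_length += 1
            if run_length >= min_consecutive:
                return run_start
        else:
            run_length = 0  # a single low reading resets the run
    return None


# Hypothetical readings, not taken from the affected system.
readings = ["7.10%", "95.00%", "6.40%", "101.50%", "103.20%", "100.90%"]
samples = [parse_cpu_percent(r) for r in readings]
spike_index = first_sustained_spike(samples)
```

A short blip (the second reading here) is ignored, and only the run that stays above the threshold is reported.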

I can’t see that I installed or updated anything around October 10, but I have tried disabling a few integrations and custom sensors, such as Speedtest, MQTT, Arlo (HACS), ping, and others. I still face the issue.

I have used the Profiler integration and py-spy to get more insight into the CPU load, but I have not succeeded in identifying what is actually causing the increase. Maybe @bdraco could provide some guidance?

Files created with the Profiler integration (you might need to download the SVG files locally for the best view):

Files created with py-spy:

Specs:

  • Docker image: homeassistant/intel-nuc-homeassistant:2022.10.5
  • NUC spec: core i7, 16 GB RAM, 250 GB SSD
  • OS: Ubuntu 22.04.1 LTS (GNU/Linux 5.15.0-52-generic x86_64)
  • Docker: v20.10.17
  • Docker Compose: v2.5.0

It looks like you have something that is reading from a websocket. Either the websocket is closed and the reader doesn’t know, so it keeps reading over and over, or the websocket is flooding it with data.
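To illustrate the first case, here is a minimal sketch of how a read loop can spin once the other side has closed. It uses a plain asyncio.StreamReader rather than a real websocket, and it is not the code of any particular integration. The function names and the iteration cap are assumptions added so the demo terminates.

```python
import asyncio


async def read_loop(reader, *, stop_on_eof, max_iterations=10_000):
    """Read chunks until EOF is noticed (if stop_on_eof) or the cap is hit.

    Returns the number of read calls that were made.
    """
    reads = 0
    while reads < max_iterations:
        chunk = await reader.read(1024)
        reads += 1
        # After EOF, read() returns b"" immediately instead of waiting.
        if not chunk and stop_on_eof:
            break
    return reads


async def demo():
    results = {}
    for stop_on_eof in (False, True):
        reader = asyncio.StreamReader()
        reader.feed_data(b"hello")  # one message arrives
        reader.feed_eof()           # then the connection is closed
        results[stop_on_eof] = await read_loop(reader, stop_on_eof=stop_on_eof)
    return results


results = asyncio.run(demo())
print(results)
```

The loop that checks for the empty chunk stops after two reads, while the one that ignores it keeps going until the cap. Without such a cap it would never pause, which is the kind of behaviour that can pin a CPU core.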

I’d run grep -R graphql_subscriptions_manager over your Python libraries to find which integration uses that.

I have two tablets mounted on the wall, with the Home Assistant app running in the foreground. Could they be the reason? Or do you suspect an integration is causing the issue?

I suspect it’s a specific integration.

I opened an SSH session to the HA container and ran grep -R graphql_subscriptions_manager from within the /config folder, but got empty output.
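One possible reason for the empty output is that /config mostly holds configuration, while installed Python libraries usually live in the interpreter's own library directories. The sketch below is a rough Python stand-in for grep -R that searches the directories on sys.path by default. The function name find_references is hypothetical, and the assumption that the relevant libraries are reachable through sys.path may not hold for every install.

```python
import os
import sys


def find_references(needle, roots=None):
    """Return sorted paths of .py files under roots whose text contains needle."""
    if roots is None:
        # Assumption: installed libraries live in directories listed on sys.path.
        roots = [p for p in sys.path if os.path.isdir(p)]
    hits = set()
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if not name.endswith(".py"):
                    continue
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as handle:
                        if needle in handle.read():
                            hits.add(path)
                except OSError:
                    continue  # unreadable file, skip it
    return sorted(hits)
```

Run inside the container with the same Python that runs Home Assistant, a call such as find_references("graphql_subscription_manager") would list every source file that mentions the module, and the package folders in those paths hint at which library pulls it in.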

I suspect the Tibber integration is causing this issue. I did a quick search for graphql_subscription_manager in the HA Core repo and quickly ended up in pyTibber, where GraphQL is used. After a few more clicks I found an open issue, #76491, where many users are complaining about high CPU usage.

Thanks for pointing me in the right direction, @bdraco!
