Home Assistant - High Memory Usage

Thanks for the answer.
How did you install py-spy? It failed for me with the following errors (cargo and Python):

@danielo515
I did it with pip

## Py-Spy
```shell
# Installation
ssh [email protected] -p 22222
login
docker exec -it homeassistant /bin/bash
echo 'manylinux1_compatible = True' > /usr/local/lib/python3.8/site-packages/_manylinux.py
pip install py-spy
```

Thanks, that method allowed me to install it.
However, something very interesting happened. Today I removed what I think was a faulty switch that was slowing down my entire network like hell.
After that, the CPU usage dropped to normal levels, while the memory usage climbed back to its previous level. See:

Can a faulty network make the CPU of hassio go crazy?

I don’t know, but can you show me which switch you have removed?

Here it is

Does this list of processes, where none is Home Assistant or Python, mean that I am logged inside a container?


I’ll try to deactivate the protection mode on the addon.

Oh, silly me. I just took for granted that by connecting through the addon I was getting into the Home Assistant container. I just followed your instructions step by step. I'll report back.
By the way, I find it easier to find the process using ps than with htop or top; there aren't many processes in the container anyway:

```
bash-5.0# ps -ef
PID   USER     TIME  COMMAND
    1 root      0:00 s6-svscan -t0 /var/run/s6/services
   32 root      0:00 s6-supervise s6-fdholderd
  187 root      0:00 udevd --daemon
  219 root      0:00 s6-supervise home-assistant
  221 root      3h22 python3 -m homeassistant --config /config
  325 root      0:00 /bin/bash
  342 root      0:00 ps -ef
```
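Plain ps can also show memory per process, which is handy for this kind of debugging. A minimal sketch — the `-o` columns shown assume GNU procps; the busybox ps inside the container may support a smaller column set:

```shell
# Sketch: per-process memory (RSS, in kB) via ps, no htop needed.
# $$ (the current shell) is used here as a stand-in; replace it with the
# Home Assistant pid from the listing above (221 in this case).
ps -o pid,rss,comm -p $$
```

RSS is what the process actually holds in RAM, which is usually the number you care about when hunting a leak.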

Here is the output; not sure if these errors could make it wrong:

```
py-spy> Sampling process 100 times a second for 120 seconds. Press Control-C to exit.

py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.10s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.42s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.37s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.46s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.33s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.28s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.26s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.11s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.28s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.12s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.18s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.24s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.21s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.30s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.13s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.01s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> Wrote flamegraph data to '/config/www/spy-0.116.0.svg'. Samples: 12000 Errors: 0
```
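Those "behind in sampling" warnings can be addressed with py-spy's `--rate` flag, which lowers the sampling frequency from the default of 100 samples per second. A sketch — the pid and output path here are illustrative, not taken from this setup:

```shell
# If py-spy keeps falling behind, sample less often (default --rate is 100).
# Pid (221) and output path are examples; adjust to your setup.
py-spy record --pid 221 --rate 50 --duration 120 --output /config/www/spy.svg
```

Halving the rate doubles the time budget per sample; the flamegraph is coarser but no longer skewed by missed samples.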

Here is the resulting svg:

According to the UI of the new release, they are not using as much memory…

So I guess it tries to cache all the memory it can because, hey, it is there!

Can you provide a py-spy dump as well?

Do I need to run another command or does the one I already ran produce the desired output?

By the way, the actual problem was not the switch, but a Raspberry Pi running my main Hass.io instance. Every time I plug it into the main network, it starts to flood it, and all the computers on my home network spin their CPUs up to the sky, including the Raspberry Pi producing the traffic (I can see it in its Glances, for example). If I isolate the Raspberry Pi on a switch, I can connect to it normally and everything is fine. Not sure if having several Home Assistant instances can cause this, but I had two Hass.io instances running on different machines for months without problems.

It sounds like you might have some type of broadcast traffic loop.

Can you run a py-spy while it's happening?

That is what I thought, but how can a single device produce this?
Can you provide the exact py-spy params that you suggest to run? Thanks

```shell
py-spy record --pid 208 --duration 60 --output www/snapshot.svg
py-spy dump --pid 208
py-spy top --pid 208   # hit Ctrl-C after 60 seconds, then copy and paste
```

Adjust the output location and PID.

Sorry, I just removed the IP from the Raspberry Pi and it is not happening anymore.
Do you think the memory issue can still be debugged? Or is it something normal?

If it’s not happening anymore there isn’t much to be done since we would need a recording when the issue is occurring

Yep, that is what I was thinking. However, it was probably not a problem with Hass.io per se, because it was affecting the entire network. Even the router became unresponsive at a certain point.
But what about the memory? People report that 2 GB is enough to run Home Assistant, and I assigned 4 GB to the VM, which is (according to the hypervisor, Proxmox) using 3.5 GB.

2GiB is usually more than enough unless you have thousands of entities. It could be a case of https://www.linuxatemyram.com/
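The linuxatemyram point can be checked from the host: `free` separates reclaimable page cache from memory applications actually hold. A minimal sketch parsing an illustrative `free -m` "Mem:" line — the numbers are made up, and the column order assumes a modern procps `free`:

```shell
# Illustrative `free -m` "Mem:" line from a 4 GB VM (values are made up):
#        total  used  free  shared  buff/cache  available
set -- Mem: 3935 612 132 12 3191 3021
total=$2; used=$3; cache=$6; available=$7
# A hypervisor counts total-free as "used", but most of that is cache the
# kernel will hand back to applications on demand:
echo "really used: ${used} MiB, cache: ${cache} MiB, available: ${available} MiB"
```

On a real system, compare the `used` and `available` columns: a box can look 90% "used" from outside while almost all of that is reclaimable cache.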

Yep, it is probably that, because I don't have enough elements to overwhelm an RPi instance, so I don't think it would eat the RAM of a 4 GB VM.
Thanks

Hello,

I came across this topic because I also have a memory issue, see Help: My HA is restaring, why?.

As far as I can see, the host is killing HA because it consumes too much memory (at the end, it reaches 90%).

How can I know what is causing the issue?

Did you take a look at the link in bdraco’s post above? I’m pretty certain that explains what you are seeing.

Actually yes, I know high memory usage is not an issue by itself. But my memory usage keeps growing over time, and at some point Linux kills HA:

```
[91096.975793] Out of memory: Kill process 6177 (python3) score 475 or sacrifice child
[91096.985885] Killed process 6177 (python3) total-vm:3798692kB, anon-rss:481348kB, file-rss:0kB, shmem-rss:0kB
```

Maybe my question was not as precise as it should have been: how can I know which integration/component/other is causing the issue?
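One low-tech way to narrow it down is to record HA's resident memory over time and correlate the jumps with integration activity in the log. A sketch, assuming a Linux `/proc` layout; the pid and 5-minute interval are just examples:

```shell
# rss_kb PID: resident set size of a process, in kB, read from /proc.
rss_kb() { awk '/^VmRSS:/ {print $2}' "/proc/$1/status"; }

# Example loop (Ctrl-C to stop): log HA's RSS every 5 minutes, then line
# the timestamps up with home-assistant.log to see what was running
# when the memory jumped. Pid 221 is a placeholder.
# while true; do echo "$(date -Iseconds) $(rss_kb 221) kB"; sleep 300; done
```

Taking a `py-spy dump` snapshot at a few points along the growth curve can then show which threads and stacks are active while the RSS climbs.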