Home Assistant - High Memory Usage

Thanks for the answer.
How did you install py-spy? It failed for me with the following errors (cargo and Python):

@danielo515
I did it with pip

## Py-Spy
```shell
# Installation
ssh [email protected] -p 22222
login
docker exec -it homeassistant /bin/bash
echo 'manylinux1_compatible = True' > /usr/local/lib/python3.8/site-packages/_manylinux.py
pip install py-spy
```

Thanks, that method allowed me to install it.
However, something very interesting happened. Today I removed what I think was a faulty switch that was slowing down my entire network like hell.
After that, the CPU usage dropped to normal levels, while the memory usage climbed back to its previous level. See:

Can a faulty network make the CPU of hassio go crazy?

I don’t know, but can you show me which switch you have removed?

Here it is

Does this list of processes, where none is Home Assistant or Python, mean that I am logged inside a container?


I’ll try to deactivate the protection mode on the addon.

Oh, silly me. I just took for granted that by connecting through the addon I was getting into the Home Assistant container. I just followed your instructions step by step. I'll report back.
By the way, I find it easier to find the process using ps than with htop or top; there aren't many processes in the container anyway:

```
bash-5.0# ps -ef
PID   USER     TIME  COMMAND
    1 root      0:00 s6-svscan -t0 /var/run/s6/services
   32 root      0:00 s6-supervise s6-fdholderd
  187 root      0:00 udevd --daemon
  219 root      0:00 s6-supervise home-assistant
  221 root      3h22 python3 -m homeassistant --config /config
  325 root      0:00 /bin/bash
  342 root      0:00 ps -ef
```
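Plain ps can also show memory per process, which is handy for this kind of debugging. A minimal sketch — the `-o` columns shown assume GNU procps; the busybox ps inside the container may support a smaller column set:

```shell
# Sketch: per-process memory (RSS, in kB) via ps, no htop needed.
# $$ (the current shell) is used here as a stand-in; replace it with the
# Home Assistant pid from the listing above (221 in this case).
ps -o pid,rss,comm -p $$
```

RSS is what the process actually holds in RAM, which is usually the number you care about when hunting a leak.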

Here is the output; not sure if these errors could make it wrong:

```
py-spy> Sampling process 100 times a second for 120 seconds. Press Control-C to exit.

py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.10s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.42s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.37s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.46s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.33s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.28s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.26s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.11s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.28s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.12s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.18s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.24s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.21s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.30s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.13s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.00s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 1.01s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> Wrote flamegraph data to '/config/www/spy-0.116.0.svg'. Samples: 12000 Errors: 0
```
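Those "behind in sampling" warnings can be addressed with py-spy's `--rate` flag, which lowers the sampling frequency from the default of 100 samples per second. A sketch — the pid and output path here are illustrative, not taken from this setup:

```shell
# If py-spy keeps falling behind, sample less often (default --rate is 100).
# Pid (221) and output path are examples; adjust to your setup.
py-spy record --pid 221 --rate 50 --duration 120 --output /config/www/spy.svg
```

Halving the rate doubles the time budget per sample; the flamegraph is coarser but no longer skewed by missed samples.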

Here is the resulting svg:

According to the UI of the new release, they are not using as much memory…

So I guess it tries to cache all the memory it can because, hey, it is there!

Can you provide a py-spy dump as well?

Do I need to run another command or does the one I already ran produce the desired output?

By the way, the actual problem was not the switch, but a Raspberry Pi running my main Hass.io instance. Every time I plug it into the main network, it starts to flood it, and all the computers on my home network spin their CPUs up to the sky, including the Raspberry Pi producing the traffic (I can see it in its Glances, for example). If I isolate the Raspberry Pi on a switch, I can connect to it normally and everything is fine. Not sure if having several Home Assistant instances can cause this, but I had two Hass.io instances running on different machines for months without problems.

It sounds like you might have some type of broadcast traffic loop.

Can you run a py-spy while it's happening?

That is what I thought, but how can a single device produce this?
Can you provide the exact py-spy params that you suggest to run? Thanks

```shell
py-spy record --pid 208 --duration 60 --output www/snapshot.svg
py-spy dump --pid 208
py-spy top --pid 208   # hit Ctrl-C after 60 seconds, then copy and paste
```

Adjust the output location and PID.

Sorry, I just removed the IP from the Raspberry Pi and it is not happening anymore.
Do you think the memory issue can still be debugged? Or is it something normal?

If it’s not happening anymore there isn’t much to be done since we would need a recording when the issue is occurring

Yep, that is what I was thinking. However, it was probably not a problem with Hass.io per se, because it was affecting the entire network. Even the router became unresponsive at a certain point.
But what about the memory? People report that 2 GB is enough to run Home Assistant, and I assigned 4 GB to the VM, which is (according to the hypervisor, Proxmox) using 3.5 GB.

2GiB is usually more than enough unless you have thousands of entities. It could be a case of https://www.linuxatemyram.com/
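The linuxatemyram point can be checked from the host: `free` separates reclaimable page cache from memory applications actually hold. A minimal sketch parsing an illustrative `free -m` "Mem:" line — the numbers are made up, and the column order assumes a modern procps `free`:

```shell
# Illustrative `free -m` "Mem:" line from a 4 GB VM (values are made up):
#        total  used  free  shared  buff/cache  available
set -- Mem: 3935 612 132 12 3191 3021
total=$2; used=$3; cache=$6; available=$7
# A hypervisor counts total-free as "used", but most of that is cache the
# kernel will hand back to applications on demand:
echo "really used: ${used} MiB, cache: ${cache} MiB, available: ${available} MiB"
```

On a real system, compare the `used` and `available` columns: a box can look 90% "used" from outside while almost all of that is reclaimable cache.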

Yep, it is probably that, because I don't have enough elements to overwhelm an RPi instance, so I don't think it would eat the RAM of a 4 GB VM.
Thanks

Hello,

I came across this topic because I also have a memory issue, see Help: My HA is restaring, why?.

As far as I can see, the host is killing HA because it consumes too much memory (at the end, it reaches 90%).

How can I know what is causing the issue?

Did you take a look at the link in bdraco’s post above? I’m pretty certain that explains what you are seeing.

Actually yes, I know high memory usage is not an issue by itself. But my memory usage keeps growing over time, and at some point Linux kills HA:

```
[91096.975793] Out of memory: Kill process 6177 (python3) score 475 or sacrifice child
[91096.985885] Killed process 6177 (python3) total-vm:3798692kB, anon-rss:481348kB, file-rss:0kB, shmem-rss:0kB
```

Maybe my question was not as precise as it should have been: how can I know which integration/component/other is causing the issue?
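One low-tech way to narrow it down is to record HA's resident memory over time and correlate the jumps with integration activity in the log. A sketch, assuming a Linux `/proc` layout; the pid and 5-minute interval are just examples:

```shell
# rss_kb PID: resident set size of a process, in kB, read from /proc.
rss_kb() { awk '/^VmRSS:/ {print $2}' "/proc/$1/status"; }

# Example loop (Ctrl-C to stop): log HA's RSS every 5 minutes, then line
# the timestamps up with home-assistant.log to see what was running
# when the memory jumped. Pid 221 is a placeholder.
# while true; do echo "$(date -Iseconds) $(rss_kb 221) kB"; sleep 300; done
```

Taking a `py-spy dump` snapshot at a few points along the growth curve can then show which threads and stacks are active while the RSS climbs.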