Out of memory in 2 days

Do you have any addons installed? Some of them can hog memory, Studio Code Server or Firefox for example.
There are hidden (disabled) sensors for the memory consumption of each addon, like “sensor.studio_code_server_ram_percentage” (the name can be a bit different, since I translated it from Slovenian). Find those sensors, enable them for all addons and monitor whether any of them rises. If you find one, just make an automation which restarts that addon, say, each night at 00:00:

    action: hassio.addon_restart
    data:
      addon: a0d7b954_vscode
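
As a rough sketch, a complete automation around that action could look like the one below. The alias is a made-up name, the addon slug is the one from the snippet above (yours may differ), and the `triggers`/`actions` keys assume a recent HA release; older releases used `trigger`/`platform` and `service` instead.

```yaml
# Sketch only: nightly restart of one addon.
# The alias is hypothetical; replace the slug with your own addon's slug.
- alias: "Restart Studio Code Server nightly"
  triggers:
    - trigger: time
      at: "00:00:00"
  actions:
    - action: hassio.addon_restart
      data:
        addon: a0d7b954_vscode
```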

Here you can see memory rising during the day and falling when the addon is restarted:
[image]
I’d say the RAM perhaps rises when I open the addon and it doesn’t clean up after itself when I close it… but then again, something similar happened with the Firefox addon, which I hadn’t opened for weeks… and its memory still rose each day.

Hi, what happens is simply that HA starts to lag and stops showing trends, historical data and such. Sometimes even the lights are not available anymore.
If I look at the ESXi console, it reports “out of memory”, process killed… bla bla.
So I have to restart manually to get everything back in operation.

Do you have the Visual Studio Code addon? I had something similar and it turned out to be the VS Code addon for me. It’s a known issue, but not everyone experiences it.

Really?! Let me give it a try.

I wrote that 3 days ago and asked you if you have vscode installed, but obviously you missed that…?

See here, also for a link to the GitHub issue:
What is causing this memory leak?

Did the test, nothing changed.
The only recurring error I have is related to a template sensor I have somewhere.
And that is exactly the problem: somewhere.
I enabled the logger at debug level for homeassistant.helpers, but I didn’t get any hint about which sensor is causing the issue.
I made many attempts but nothing changes. Memory ramps up like hell for no apparent reason.

To determine whether, say, Studio Code Server causes the problem, you must do the following:

  • find and enable sensor.studio_code_server_memory_percentage (or a similar name…); the sensor is disabled by default, so you must search among the disabled entities and enable it;
  • make a graph card with this sensor;
  • wait a couple of days and watch whether the graph rises.
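
For the graph card step, a minimal sketch using the built-in history-graph card could be something like this. The title and the 72-hour window are just my assumptions, and the entity name must be replaced with whatever your sensor is actually called.

```yaml
# Sketch only: dashboard card for watching one addon's memory over a few days.
# Title and time window are arbitrary; the entity id may differ on your system.
type: history-graph
title: Addon memory
hours_to_show: 72
entities:
  - entity: sensor.studio_code_server_memory_percentage
```

You can add more addon sensors under `entities` to compare them on one graph.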

If VS Code is not the culprit, do the same with all installed addons; perhaps you’ll find the one.

You won’t find any error in logs or similar about this.

Thanks for the hint.
One question: I’m 95% sure there is a problem in some core templates or similar… because looking at Glances, the only value rising is the memory of the homeassistant container…
I have an error in the log linked to a misconfigured template. I’m afraid that could be the culprit, but I cannot determine with certainty which template is faulty.

Hi,
I’m 99% sure that one of my automations or scripts is looping.
What should I do? Should I disable all automations first and then see which of them is causing the loop?

No, be a bit smarter about it.
When you do not know where the fault is, cut your problem in half by disabling half of it. If the problem is gone, you know that it is in the disabled part; if it is still there, it is in the part that is still enabled. Just repeat this process on the faulty half until only the problem one is left.
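
One way to disable a whole batch at once is the `automation.turn_off` action, run from Developer Tools. This is only a sketch: the entity ids below are placeholders I made up, so list the first half of your own automations instead.

```yaml
# Sketch only: turn off one half of the automations for a test run.
# The entity ids are hypothetical placeholders.
action: automation.turn_off
target:
  entity_id:
    - automation.example_one
    - automation.example_two
    - automation.example_three
data:
  stop_actions: true  # also stop any runs that are currently in progress
```

Use `automation.turn_on` with the same list to bring them back before testing the next half.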

Update: I discovered that this behaviour has been there since January 2023, but clearly I didn’t care enough. At least it explains the times I rebooted because of freezes without investigating.

Now I’ve cleaned up the unused automations and kept only the relevant ones, done some housekeeping of custom components and scripts, removed all the errors in the log linked to template sensors… and removed components causing even the smallest error in the log, without success.

The pattern is still there; it is a never-ending sawtooth.

I don’t know what else to do. As another test I’m now running without camera streams, with all the camera entities disabled, but it looks like that doesn’t help either.

Did you try to disable all installed addons (and turn off their watchdogs to prevent them from restarting themselves) for a couple of days?

(Btw… you still didn’t report whether you checked the RAM consumption of all addons over time like I said, and whether it’s OK, i.e. not rising?)

Hi,
I tried to find the mentioned sensors, but I wasn’t able to…

BUT! Today, after a double check, I managed to track them down.
Cool!
I’ll let you know.

Once I have tracked the memory consumption of the addons, I’ll try disabling them.

Hi,
unfortunately it is the Supervisor that is affected by this issue.
Please have a look at this screenshot:


The violet line is the Supervisor’s memory consumption in percent.
All the addons stay pretty constant at low values.
Who could I report this to?
Thank you!

What did you do on 11 January just before 18:00, when the memory dropped? Did you restart something, remove something…? If I’ve read correctly, a “reset” of the Supervisor’s RAM consumption can only be achieved with a VM reboot; just restarting HA core doesn’t help. So, unless you restarted your VM, try to remember what you did; perhaps you’ll find the culprit.

Do you use z-wave by any chance? Some guys did have problems with it…

Hi,
the drop was due to a VM restart, and after that the sawtooth of course repeats.
From there on, it is the same pattern every time.

I don’t use Z-Wave. :(

Then a temporary emergency fix would be to schedule a VM restart each night, say at midnight or similar, at least to ensure that HA will always be available until you find the cause of this.

It’s not a solution, only a temporary fix, as I said. I’m out of ideas as far as the cause goes, though…

What I would do is create another VM, install a fresh HA on it and let it run in parallel with the existing HA (only at a different IP and hostname, of course), then slowly add stuff to it (every couple of days another addon, then integrations… etc.) and closely monitor the RAM. Perhaps this way you’ll find out when the RAM starts to go wild.

Thank you for the hints, buddy.
I have thought many times about making a fresh one, but I have plenty of templates and entities… how would I do that? It will take decades…
Btw, do you maybe know how to schedule an automatic reboot routine in ESXi?

I guess you could try to run this command from HA:

    action: hassio.host_reboot

This should reboot the VM, but as far as ESXi goes, sorry, I don’t know that one…
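
If you want that to happen on a schedule from within HA, a rough sketch of an automation could be the following. The alias is a made-up name, the midnight time is just the example mentioned earlier, and I’m assuming a recent HA release that uses the `triggers`/`actions` keys.

```yaml
# Sketch only: reboot the host (the VM guest) every night at midnight.
# The alias is hypothetical; adjust the time to whenever HA is least needed.
- alias: "Nightly host reboot"
  triggers:
    - trigger: time
      at: "00:00:00"
  actions:
    - action: hassio.host_reboot
```

Keep in mind this is only the temporary workaround, not a fix for the leak itself.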