Core: 2024.7.4
Supervisor: 2024.08.0
Operating System: 13.1
Frontend: 20240710.0
6 cpu / 16G in a VM on Proxmox 8
And all of this works GREAT as long as I don’t upgrade to a later version.
To test, I restored a backup to another (freshly built) Proxmox host and ALSO to a mini-PC; neither the virtual nor the physical machine can cope with the upgrade.
By now I have logging completely disabled on the test box and ALL add-ons disabled. I have also reverted to SQLite and deleted the original database for good measure.
On the test box I went into the OS and was able to get into the homeassistant container with: docker exec -it homeassistant /bin/bash
If I then run “top”, the result is:
Mem: 3797928K used, 6385148K free, 4180K shrd, 156120K buff, 1057956K cached
CPU: 25% usr 0% sys 0% nic 73% idle 0% io 0% irq 0% sirq
Load average: 1.00 1.00 0.97 2/514 892
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   67    65 root     R    3075m  30%   3  25% python3 -m homeassistant --config /config
The command “python3 -m homeassistant --config /config” sits at 25% CPU or more constantly, whereas on the prod machine it never exceeds 2%.
Although 25% might not look like “a lot”, it makes HA really (really, really) slow to respond.
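A possible explanation for why 25% hurts this much: as far as I know, the BusyBox-style top in the container reports %CPU as a share of all cores combined, so on a 4-core machine 25% would mean one core is completely busy. That would also fit the load average of 1.00. The core count of 4 is an assumption on my side (the CPU column shows core 3, so there are at least 4), and busy_cores is just a made-up helper name. A minimal sketch of the arithmetic:

```python
def busy_cores(percent_of_total, core_count):
    """Convert a %CPU value normalized over all cores into 'cores fully busy'."""
    return percent_of_total / 100 * core_count


# Assumption: the test box has 4 cores (not stated in the post).
print(busy_cores(25, 4))
```

If that reading is right, a single Python thread is spinning at 100%, which would explain the sluggish responses much better than "only 25%" suggests.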
So now the real questions:
How can I further diagnose what is going on?
Is anyone else seeing this?
Although I disabled the add-ons (and actually removed most of them), could an integration be having this impact on Core?
The interesting bit is that the high CPU sometimes starts immediately after startup; mostly it shows up around the 20-minute mark, but this time it took almost an hour.
This is hard to troubleshoot, to be honest, because once it happens I can no longer get to the logs.
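One way to dig further without the logs, as a rough sketch only: from the shell inside the container, sample the per-thread CPU counters under /proc for the Home Assistant process and see which thread is burning the time. This assumes a Linux /proc filesystem and that python3 is available in the container; the function names are my own, and 67 is simply the PID from the top output above. On Linux the main thread has a thread id equal to the PID, so if that one is the busy thread, the main loop itself is being kept busy rather than a worker thread.

```python
import os
import time


def parse_stat(line):
    """Return (name, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # The name is wrapped in parentheses and may itself contain spaces,
    # so split on the last ")" instead of on whitespace.
    head, _, tail = line.rpartition(")")
    name = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime are
    # fields 14 and 15, which puts them at indexes 11 and 12 here.
    return name, int(fields[11]) + int(fields[12])


def read_threads(pid):
    """Map thread id -> (name, cpu_ticks) for every thread of the process."""
    threads = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as handle:
                threads[int(tid)] = parse_stat(handle.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
    return threads


def busiest(before, after, top_n=5):
    """Return (ticks_used, tid, name) for the threads that used the most CPU."""
    deltas = [
        (after[tid][1] - before[tid][1], tid, after[tid][0])
        for tid in after
        if tid in before
    ]
    return sorted(deltas, reverse=True)[:top_n]


def report(pid, interval=10):
    """Sample twice, `interval` seconds apart, and print the busiest threads."""
    before = read_threads(pid)
    time.sleep(interval)
    after = read_threads(pid)
    for ticks, tid, name in busiest(before, after):
        marker = " (main thread)" if tid == pid else ""
        print(f"tid {tid} {name}{marker}: {ticks} ticks in {interval}s")
```

Calling report(67) prints the five busiest threads over ten seconds. The counters are in clock ticks (commonly 100 per second), so a thread using roughly 1000 ticks in 10 seconds is pegging a core. This only tells you which thread is busy, not which integration; for that a real profiler would still be needed.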
OK, for those interested: I was finally able to upgrade to 2024.9.1 and have now been running for half a day without issues.
In the end I don’t think I can pinpoint a single source, but for sure none of the integrations, add-ons or HACS (including components installed through HACS) were the issue.
I removed several YAML-based integrations on my dev system, and that eventually stabilized the installation.
I replicated this on prod, removed a BUNCH of other integrations that I basically did not use, and upgraded successfully.
The only thing still left (I already had it with the older releases too) is a slow memory leak.
So that will probably be a different topic at some point.
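For anyone wanting to put a number on a slow leak like this, here is a minimal sketch that reads the resident memory of the process from /proc and turns two samples into a growth rate. It assumes Linux and that it is run inside the container; the helper names are made up for this example, and 67 is again just the PID from the earlier top output.

```python
def parse_rss_kib(status_text):
    """Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def read_rss_kib(pid):
    """Read the current resident set size of a process in KiB."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_rss_kib(handle.read())


def growth_kib_per_hour(samples):
    """Average growth between the first and last (seconds, rss_kib) sample."""
    (t0, rss0), (t1, rss1) = samples[0], samples[-1]
    return (rss1 - rss0) / ((t1 - t0) / 3600)
```

Collecting a (time.time(), read_rss_kib(67)) pair every few minutes and feeding the list to growth_kib_per_hour gives a rough KiB-per-hour figure. RSS also goes up and down for normal reasons, so only a trend over many hours says anything about a leak.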