Wondering what many of us with bigger HA installs are noticing in the way of slowdown if we are using older tech to run things (appreciate mine might actually NOT be a big install).
My Lenovo M83 with Proxmox, running HAOS as a VM (and PiHole too), works really well… but I’ve noticed a couple of instances lately of automations taking a few seconds longer to execute. It’s not Wi-Fi, as the signals are all great according to my system, but also because if I manually trigger these “things” they execute instantly… so it’s as if the system is a little bogged down for very short periods, and that’s affecting automations being triggered…
These can be native automations or Node-RED ones…
Now I have no idea if my system is getting too big for my existing setup (i3-4150T with 16 GB RAM), or maybe the SSD is starting to have issues (although I think that would be more of an ongoing issue affecting everything).
I have 90 integrations and 14 add-ons, which apparently results in the following:
alarm_control_panel: 9
automation: 130
binary_sensor: 330
button: 210
calendar: 7
camera: 19
climate: 1
conversation: 1
cover: 5
device_tracker: 75
event: 4
fan: 3
image: 18
input_boolean: 19
input_datetime: 3
input_number: 4
input_text: 1
light: 100
lock: 1
media_player: 79
notify: 2
number: 180
person: 3
remote: 10
scene: 7
schedule: 1
script: 25
select: 97
sensor: 1661
sun: 1
switch: 513
text: 2
update: 168
weather: 2
zone: 27
Now I know the above might not be a good measure of how hard a system has to work… but I’m wondering what people’s thoughts are: given the size of my setup, would I be better off upgrading my machine to a ‘NUC’-alike? I’m thinking an N100/N150/Ryzen 5 3550H with 16 or 32 GB of RAM, as anything else might be overkill. These are all around the £150 mark and would allow me to run more VMs down the line.
I’m going to keep running on Proxmox as I am now, and might run Z2M and Node-RED in their own VMs in addition.
This should be more than enough, but it also depends on the other loads (VMs) you are running on it and how much hardware you have presented to the HA VM.
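If you want to sanity-check what the HA VM has actually been allocated versus what the host is doing, Proxmox’s CLI can show both. A quick sketch (the VM ID 100 is a placeholder — substitute your HAOS VM’s ID from `qm list`):

```shell
# On the Proxmox host: list VMs and find the HAOS VM ID
qm list

# Show the cores/memory/ballooning assigned to that VM
qm config 100 | grep -E '^(cores|memory|balloon)'

# Overall host pressure: load average and free memory headroom
uptime
free -h
```

If the host load average regularly exceeds the physical core count while automations lag, the hypervisor is the bottleneck rather than HA itself.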
Does this mean that the Lenovo is using Wi-Fi as its network connection? Or are you running it with a cabled connection (which I would recommend)?
For comparison, during winter (cold garage, so I need to shut down the main server there) I run HAOS (2 vCPU, 6 GB RAM) as a VM together with 7 other VMs on an Intel Skull Canyon NUC, and I barely notice any difference from when I run it on a Dell 740xd server.
I also have a Hades Canyon NUC, which is a super competent machine but far newer and more powerful than my ageing Lenovo M83… so I don’t think I’d need anything as powerful as that, but I assume it wouldn’t blink if it were my system of choice. The M83 is a bit long in the tooth with its 10-year-old CPU; even an N100 would be more powerful and more power-efficient too. A Ryzen 5 (itself almost 6 years old now) would be considerably more powerful but with similar power requirements to the i3.
Out of interest, whilst I’ve learnt about time drift today… does anyone have any thoughts on whether spending £150 on a machine would be a complete waste of time or a way of future-proofing HA for some time? It would be more powerful and more power-efficient than my ageing machine as well…
In my opinion it’s likely that the delays you are experiencing are not related to the hardware you are running on. If you want a good reason to buy more hardware you could always look at running replication, high availability and other clustered features.
I’ve been running HA for more than 3 years. I run the critical HA capabilities on an Odroid N2+ with 4 GB of memory and a 64 GB eMMC drive. This hardware is less capable than yours, yet it has no problem handling my main HA functionality. I have a second system that I use to run additional services: the 3 HA voice processing components, Frigate video processing, and a Modbus interface to my HVAC. That system also runs other functions in support of my home network. On the 2nd system I use containers, as they don’t have the wasted overhead of VMs. It has an AMD Ryzen 5 5600G with 32 GB of memory. Should the 2nd system fail, my basic smart house processing capabilities still work. Separating the architecture allows my HA alarm system, part of the critical components, to run even during a power failure.
My point is instead of just getting a new system to run everything on you might want to consider spreading your capabilities across multiple systems.
I’m actually very surprised by the number of integrations, media players, zones and sensors that you have. I only have 32 integrations and 6 add-ons, of which only two need to run all the time. The other add-ons can be started when they are required, like ESPHome. If you have integrations and add-ons that are not being utilised, you should consider disabling or deleting them.
HA will by default check an NTP service every 12 hours, but you might be, say, 8 hours into that cycle when you see drift.
Time drift occurs when the hypervisor is under heavy load and is therefore shuffling resources between the VMs.
One VM might be doing something heavy, like backing up, compiling, or video processing, which affects the other VMs.
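If you want to check for drift directly rather than infer it from laggy automations, you can ask the guest how far off it thinks its clock is. A sketch, assuming a systemd-based guest (HAOS uses systemd-timesyncd) and chrony on the Proxmox host — adjust for whichever time daemon is actually installed:

```shell
# Inside a systemd-based VM: is the clock synchronised, and is NTP active?
timedatectl status

# On a host or guest running chrony: current offset from the NTP source
chronyc tracking

# One-shot query against a public pool without touching the clock
# (requires ntpdate or sntp to be installed)
ntpdate -q pool.ntp.org
```

An offset that grows between NTP syncs and then snaps back is the classic VM time-drift signature.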
Time drift may not have been my issue, as add-ons didn’t seem to have an impact, so I decided to try something else.
As I used to run HASS on my Synology NAS in Docker, I decided to spin up a test machine on it again, but using VMM this time so I can run add-ons etc…
Initially setting up just two of the affected ESPHome devices, the ones with what I’d call severe delays, I can now happily report that on this incredibly thin build there is NO slowdown at all…
So it’s not Wi-Fi… and not the ESPHome devices themselves.
But that means it’s either my existing build, complete with multiple entries across many YAML files, or the machine itself I’m currently running HASS on… part of me thinks the next logical step is to try a completely fresh install of HASS on my existing Proxmox instance and see if I get the same results.
Worst case, I can go back to a snapshot or backup, which won’t fix any of my issues but means I won’t be without a 99%-functioning system, and it would let me slowly build my system back up again…
I have no doubt I’ve cut many corners over the years and installed stuff I didn’t need or want, and that’s causing little glitches in the back end that I’m just not aware of… so a bit of housekeeping certainly won’t hurt.
Do yourself a favour and check the automation traces the next time you experience lag. The timeline will help you isolate the common culprit.
Once you have a suspect, disable the cause and monitor for a few days. It could be caused by something as simple as too many lights in a group, or it could be a misbehaving integration.
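One low-effort helper while you hunt: HA only keeps the last 5 traces per automation by default, so a laggy run can fall out of the buffer before you get to look at it. You can raise that per automation — a sketch, with placeholder entity names and an arbitrary trace count:

```yaml
automation:
  - alias: "Suspect automation"
    trace:
      stored_traces: 25   # default is 5; keep more history while debugging
    trigger:
      - platform: state
        entity_id: binary_sensor.hallway_motion
        to: "on"
    action:
      - service: light.turn_on
        target:
          entity_id: light.hallway
```

Each trace timeline shows how long the trigger-to-action chain took, which makes intermittent multi-second delays much easier to pin down.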
How many lights would be too many in a group? I think the max I have is 8… so I can’t imagine it’s that…
Automation traces on the ones with issues have shown up nothing so far, but I guess it could be another automation that is causing a conflict at the same time.
Housekeeping is always a good idea and it might even remove the problem… for now. But as you load new stuff onto your system, it will creep back.
Time drift in VMs is a big issue; just do a Google search on it and you will find it mentioned in support articles from Red Hat, Broadcom (VMware), Proxmox, Microsoft and every other provider of hypervisor software.
Can’t disagree with you at all… it just seems odd that it hasn’t ever been an issue in the first 4 years I’ve been using HA… but it certainly doesn’t hurt to have steps in place to stop it becoming an issue.
It can be something as simple as adding encryption to backups.
Encryption and backups are two things that really can put a load on a computer, especially when you encrypt large backups.