I have been having problems with my HA setup since last October. After applying the 2021.10 Core update, my RPi 4 started crashing repeatedly, each time within 24 hours. I had not changed any of my configuration. I also updated the system to 6.0 (IIRC), so I thought I might be suffering from issue #1119 affecting RPi 4 stability.
However, I have now moved to a generic x64 installation and the problem persists.
My investigations have shown that at some point in the day (the time varies) the memory usage starts to climb from ~10%, and within 30 minutes it reaches ~90% with the swap file 100% utilised. This is when the web UI becomes unresponsive and automations no longer trigger.
I have tried a complete fresh configuration (about 5 times) and always end up with the same crash using just add-ons from the Supervisor and official integrations. I have a couple of automations and I have now correlated the triggering of one of them with the memory blow up.
The automation triggers when the temperature sensor reading on my hot water cylinder falls below a set level. When triggered, it calls a script to turn on the hot water boost function on the boiler programmer. The boost cycle lasts for an hour and then turns itself off. I have observed that one cycle of the boost is not always sufficient to get the water back up to temperature. So, the script uses a repeat… while control loop with a 15 minute delay at the end of each loop to check the temperature and the state of the boost switch and turn it on again if necessary.
I have deduced that it's the script causing the memory blow up by running off some hot water from the cylinder until the automation triggers and watching the memory statistics from the systemmonitor sensors.
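For reference, memory and swap figures like these can be exposed with a systemmonitor configuration along the following lines. This is a minimal sketch, assuming the YAML-based systemmonitor sensor platform as it existed in the 2021 releases; the resource list is illustrative and not necessarily the exact one used here.

```yaml
# configuration.yaml (sketch; resource list is an assumption)
sensor:
  - platform: systemmonitor
    resources:
      # Percentage of RAM in use
      - type: memory_use_percent
      # Percentage of swap in use
      - type: swap_use_percent
```

Plotting both sensors in the history graph makes it easier to line up the start of the memory climb with the time the automation fired.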
Since the code was working when I first implemented it, I think it may be a memory leak in the Core's handling of the repeat… while loop.
Can anyone corroborate this issue?
I'm posting my script below in case it is poor and the level of ineptitude is now being caught by the Core code. Please comment if you see anything wrong with it.
hot_water_boost:
  sequence:
    - repeat:
        while:
          - condition: numeric_state
            entity_id: sensor.hot_water_temperature
            below: '50'
        sequence:
          - condition: state
            entity_id: switch.hot_water_boost
            state: 'off'
          - service: switch.turn_on
            target:
              entity_id: switch.hot_water_boost
          - delay:
              hours: 0
              minutes: 15
              seconds: 0
              milliseconds: 0
    - service: switch.turn_off
      target:
        entity_id: switch.hot_water_boost
  mode: single
  alias: Hot water boost
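
One variant that might be worth testing is sketched below. It rests on an unconfirmed assumption: that when the state condition inside the repeat sequence fails (the boost switch is already on), the current iteration ends before the delay is reached, so the loop re-checks the temperature immediately and spins without pausing. Wrapping the switch check in a choose block means the 15 minute delay runs on every pass regardless of the switch state. The entity IDs are the ones from the script above; nothing else is changed.

```yaml
hot_water_boost:
  alias: Hot water boost
  mode: single
  sequence:
    - repeat:
        while:
          - condition: numeric_state
            entity_id: sensor.hot_water_temperature
            below: 50
        sequence:
          # Only turn the boost on if it is currently off; a false
          # condition here skips the turn_on but does not end the iteration.
          - choose:
              - conditions:
                  - condition: state
                    entity_id: switch.hot_water_boost
                    state: 'off'
                sequence:
                  - service: switch.turn_on
                    target:
                      entity_id: switch.hot_water_boost
          # Always wait before re-evaluating the while condition.
          - delay:
              minutes: 15
    - service: switch.turn_off
      target:
        entity_id: switch.hot_water_boost
```

If the memory climb disappears with this version, that would point at the loop running without a delay rather than at a leak in the loop handling itself.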