Found the issue. Notes:
- Memory leak is 140MB/min
- CPU jumps from ~1% to 25% during leak, and remains at 25% “forever”
- Mem leak is caused by a (my) badly programmed Repeat Loop: (no Action Sequence in Loop)
- The “Until” condition looks fine, loop “should” finish ok, but does not
- Mem is not freed via turning the automation off/on - reboot required
Description: I have a reed switch on the front door. If the door is left open too long, the heating system is turned off, then turned back on once the door has been closed for 3 minutes. The problem is the Repeat section, as @tom_l suggested.
I had used a repeat loop, where I should have used a wait for trigger. Likely cause: creating an automation too late at night.
However, I doubt such a severe memory leak, together with high CPU usage, is the ideal response to a (badly programmed?) Repeat section. The entire Home Assistant instance shuts down, and the user cannot even restart it via the web interface, so pulling the power cable is required.
The automation was intended to “wait” for the door to be closed, before turning the heaters back on. Here is the offending section, with **pseudocode for the other sections for brevity:
**Trigger: front door opened (reed switch)
**Action1: Turn off heaters
repeat:
  until:
    - type: is_not_open
      condition: device
      device_id: 4de50e1b1e70e7e80176a1dc763eb3e5
      entity_id: binary_sensor.reed_front_door
      domain: binary_sensor
      for:
        hours: 0
        minutes: 3
        seconds: 0
  sequence: []
**Action2: Turn heaters back on
As you will note, there is no sequence in the repeat loop. I think my internal logic at the time was simply “just do nothing, until the door has been closed for 3 minutes”. This caused high CPU usage, and a drastic memory leak (8.4GB / hr!)
(FWIW: the above also doesn’t work, i.e. even when the door is closed, the until condition is never actually satisfied)
“Obviously” a better way to do that is to use wait_for_trigger and not a repeat..until loop:
wait_for_trigger:
  - type: not_opened
    platform: device
    device_id: 4de50e1b1e70e7e80176a1dc763eb3e5
    entity_id: binary_sensor.reed_front_door
    domain: binary_sensor
    for: 00:03:00
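For anyone landing here with the same problem, the whole automation could look roughly like the sketch below. It is untested and rests on several assumptions: `climate.heaters` is a hypothetical entity standing in for the heating system, the 5-minute "open too long" threshold is made up, the sensor is assumed to report `on` when the door is open, and plain state triggers are used instead of the device triggers from the original.

```yaml
# Sketch only: climate.heaters and the 5-minute threshold are assumptions,
# not values from the original automation.
alias: Heating off while front door is open
mode: single
trigger:
  - platform: state
    entity_id: binary_sensor.reed_front_door
    to: "on"           # assumes "on" means the door is open
    for: "00:05:00"    # assumed value for "open too long"
action:
  - service: climate.turn_off
    target:
      entity_id: climate.heaters   # hypothetical entity
  # Pause here until the door has stayed closed for 3 minutes
  - wait_for_trigger:
      - platform: state
        entity_id: binary_sensor.reed_front_door
        to: "off"
        for: "00:03:00"
  - service: climate.turn_on
    target:
      entity_id: climate.heaters   # hypothetical entity
```

The point of the design is that the script sits idle in `wait_for_trigger` and is woken by a state change, rather than evaluating a condition over and over.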
What were those quotes about “try and make something foolproof…and someone just invents a better fool”? or “people will always try and use things in ways you never intended”? Looks like I am guilty of both
Regardless, HA is hard for newbies, and we should not assume everyone is a programmer. So I don’t feel the behaviour (catastrophic OOM failure) is appropriate. I suggest either simply disallowing an empty sequence or, if the devs wish to allow an empty sequence, having it actually “do nothing” until the “repeat…until” condition is met.
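If a loop really is wanted, one possible workaround is to give the sequence something that yields between checks, such as a short delay. This is an untested sketch, not a confirmed fix for the leak: the 10-second polling interval is an arbitrary assumption, and a state condition (assuming `off` means closed) replaces the device condition from the original.

```yaml
# Sketch only: the 10-second delay is an arbitrary polling interval.
repeat:
  sequence:
    - delay:
        seconds: 10    # pause between checks so the loop does not spin
  until:
    - condition: state
      entity_id: binary_sensor.reed_front_door
      state: "off"     # assumes "off" means the door is closed
      for:
        minutes: 3
```

This still polls, so wait_for_trigger remains the cleaner option.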
Thank you!