Logging via Script w/ Automation triggering on it causing an infinite loop!

I’ve managed to send HA into an infinite logging loop where it’s so busy it’s unusable and requires a power cycle to recover! I assume I have an error in my yaml? Either way, I assume HA should handle this better. Please help!

My goal is to email myself any time a system error (or warning) happens.

I added the following lines to configuration.yaml:

system_log:
  fire_event: true

I wrote the below automation to capture and respond to the trigger:

alias: "HA: Error"
description: ""
triggers:
  - event_type: system_log_event
    event_data:
      level: ERROR
    trigger: event
  - event_type: system_log_event
    event_data:
      level: WARNING
    trigger: event
conditions: []
actions:
  - action: notify.gmail_automation
    metadata: {}
    data:
      title: Home Assistant {{ trigger.event.data.level | title }}
      target: [email protected]
      data: >-
        Time: {{ trigger.event.data.timestamp | timestamp_custom("%a %I:%M:%S%p
        %m/%d/%Y") }}
        Level: {{ trigger.event.data.level }}
        Integration: {{ trigger.event.data.name }}
        Source: {{ trigger.event.data.source }}
        Message:
        {{ trigger.event.data.message }}"

I wrote the following script to test the automation:

sequence:
  - action: system_log.write
    metadata: {}
    data:
      level: error
      message: Test Message
alias: Log Test
description: ""

Every time I run the script, I can see the automation immediately triggers (“Last Trigger” shows “now”), but the system immediately hangs: I get a message that the connection is lost and it’s attempting to reconnect. I’ve waited several minutes but it doesn’t recover. I finally have to power cycle my HA box to recover. Upon restarting, the “Last Trigger” shows “never”, and when I look in the raw logs, I see hundreds of lines:

automation.home_assistant_error: expected dict for dictionary value @ data['data']
2025-05-14 22:11:20.057 ERROR (MainThread) [homeassistant.components.automation.home_assistant_error] HA: Error: Error executing script. Invalid data for call_service at pos 1: expected dict for dictionary value @ data['data']
2025-05-14 22:11:20.057 ERROR (MainThread) [homeassistant.components.automation.home_assistant_error] Error while executing automation automation.home_assistant_error: expected dict for dictionary value @ data['data']
2025-05-14 22:11:20.058 ERROR (MainThread) [homeassistant.components.automation.home_assistant_error] HA: Error: Error executing script. Invalid data for call_service at pos 1: expected dict for dictionary value @ data['data']
2025-05-14 22:11:20.058 ERROR (MainThread) [homeassistant.components.automation.home_assistant_error] Error while executing automation automation.home_assistant_error: expected dict for dictionary value @ data['data']

I’m running 2025.5.1, OS 15.2, on a Yellow.

Heh. I’ve done this. I created a notification automation to send me logged errors, and the automation itself caused errors. Managed to crash HA in about 10 seconds.

You are doing the same thing. Your actions are generating errors. This:

      data: >-
        Time: {{ trigger.event.data.timestamp | timestamp_custom("%a %I:%M:%S%p
        %m/%d/%Y") }}
        Level: {{ trigger.event.data.level }}
        Integration: {{ trigger.event.data.name }}
        Source: {{ trigger.event.data.source }}
        Message:
        {{ trigger.event.data.message }}"

Should be:

      data:
        Time: "{{ trigger.event.data.timestamp | timestamp_custom('%a %I:%M:%S%p %m/%d/%Y') }}"
        Level: "{{ trigger.event.data.level }}"
        Integration: "{{ trigger.event.data.name }}"
        Source: "{{ trigger.event.data.source }}"
        Message: "{{ trigger.event.data.message }}"

I would advise you to add a “cool-down” delay action too in case it starts generating errors for some other reason.

Ah. Of course a script error is causing an infinite loop! Thank you @tom_l. My error was that I had “data:” instead of “message:”. The content did need to be a single string for my notify implementation, so that wasn’t the issue.
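
For anyone following along, the corrected action might look roughly like the sketch below. It assumes the notify service accepts `title`, `target`, and a single-string `message`. The address `you@example.com` is a placeholder, not the one from the original post, and the `|-` block style (which keeps the line breaks) is a choice made here rather than something stated in the thread.

```yaml
actions:
  - action: notify.gmail_automation
    data:
      title: "Home Assistant {{ trigger.event.data.level | title }}"
      target: you@example.com  # placeholder address (assumption)
      # "message" (not "data") carries the email body as one string
      message: |-
        Time: {{ trigger.event.data.timestamp | timestamp_custom('%a %I:%M:%S%p %m/%d/%Y') }}
        Level: {{ trigger.event.data.level }}
        Integration: {{ trigger.event.data.name }}
        Source: {{ trigger.event.data.source }}
        Message:
        {{ trigger.event.data.message }}
```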

Your cool-down suggestion makes a ton of sense. What implementation do you suggest? I tried adding a “delay” action after my notify, but it didn’t work. I thought the default mode (for scripts and automations) was “single”, meaning only one instance runs at a time and any attempt to run concurrently is ignored. If that were the case, I’d expect a 1 minute delay (what I tried) to work fine, since all the script errors happen long before the delay would finish.
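
One likely explanation, offered as an assumption rather than something confirmed in this thread: a failed action aborts the rest of the sequence, so a delay placed after the failing notify never starts, and each run is already finished when the next trigger arrives. Below is a minimal sketch that puts the delay first, so the run stays busy even if the notify errors out later. The `max_exceeded: silent` line is there because, by default, an ignored trigger is logged as a warning, and a warning is exactly what this automation listens for.

```yaml
alias: "HA: Error"
mode: single           # the default, shown here for clarity
max_exceeded: silent   # do not log a warning when a trigger is ignored
triggers:
  - trigger: event
    event_type: system_log_event
    event_data:
      level: ERROR
actions:
  # Cool-down runs first, so it still applies if the notify below fails
  - delay:
      minutes: 1
  - action: notify.gmail_automation
    data:
      title: "Home Assistant {{ trigger.event.data.level | title }}"
      message: "{{ trigger.event.data.message }}"
```

The trade-off is that every email arrives a minute late, and anything logged during the cool-down is dropped rather than queued.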

One [more complex] idea would be to use a datetime helper to store the last execution time, and add a condition to the system_log_event trigger that skips execution if N minutes haven’t yet passed. This could still yield a race condition between checking and setting the helper, but I assume it would be good enough to ensure the system stays stable.
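
A rough sketch of that helper idea, assuming a datetime helper named `input_datetime.last_error_email` (a hypothetical name) has been created with both date and time enabled. It compares the helper's `timestamp` attribute against the current time and updates the helper before sending, which narrows the race window but does not remove it.

```yaml
conditions:
  # Pass only if at least 300 seconds have elapsed since the stored time
  - condition: template
    value_template: >-
      {{ as_timestamp(now())
         - (state_attr('input_datetime.last_error_email', 'timestamp') | float(0))
         > 300 }}
actions:
  # Record the run time first so a failing notify cannot skip this step
  - action: input_datetime.set_datetime
    target:
      entity_id: input_datetime.last_error_email  # hypothetical helper
    data:
      timestamp: "{{ as_timestamp(now()) }}"
  - action: notify.gmail_automation
    data:
      title: "Home Assistant {{ trigger.event.data.level | title }}"
      message: "{{ trigger.event.data.message }}"
```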

I discovered the “last_triggered” attribute on the automation, so I added a condition that the time since last_triggered must be above 5 minutes (an arbitrary time to ensure HA stability and manageable email spam if something does go wrong). I’m open to other suggestions on how to throttle an automation’s execution, but this seems reasonable… and reasonably easy.
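
For reference, a condition along those lines could be sketched as below. It assumes the entity ID `automation.home_assistant_error` (taken from the log lines earlier in the thread) and that `last_triggered` is exposed as a datetime, or is empty if the automation has never run.

```yaml
conditions:
  - condition: template
    value_template: >-
      {% set last = state_attr('automation.home_assistant_error', 'last_triggered') %}
      {# Allow the first-ever run, otherwise require a 5 minute gap #}
      {{ last is none or now() - last > timedelta(minutes=5) }}
```

Because the condition is checked before any action runs, a blocked trigger costs very little, which is the point of throttling here.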