Home Assistant crashing after clocks changed?

Well that and you seem to have jumped forward an hour, rather than repeating an hour. I certainly did not have any problems when we jumped forward an hour in March.

EDIT:

If you have a fancy template though that replicates being able to run an automation every X minutes, I’d be interested - because this is one of the automations that was affected:

alias: Camera Snapshots
description: ''
trigger:
  - platform: time_pattern
    minutes: /5
condition: []
action:
  - service: camera.snapshot
    target:
      entity_id: camera.livingroom_mainstream
    data:
      filename: /config/www/cameras/rtsp_livingroom_main.jpg
  - service: camera.snapshot
    target:
      entity_id: camera.bedroom_mainstream
    data:
      filename: /config/www/cameras/rtsp_bedroom_main.jpg
mode: single
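
One possible way to avoid `time_pattern` entirely is a template trigger: templates that use `now()` are re-rendered once per minute, so a template trigger can emulate an "every X minutes" schedule. This is only a sketch of the automation above with the trigger swapped out (same entity IDs as the original):

```yaml
alias: Camera Snapshots (template trigger sketch)
description: 'Fires when the current minute is divisible by 5'
trigger:
  # Templates containing now() are re-evaluated once per minute,
  # so this becomes true (and fires) at minutes 0, 5, 10, ...
  - platform: template
    value_template: "{{ now().minute % 5 == 0 }}"
condition: []
action:
  - service: camera.snapshot
    target:
      entity_id: camera.livingroom_mainstream
    data:
      filename: /config/www/cameras/rtsp_livingroom_main.jpg
mode: single
```

Whether this behaves any better than `time_pattern` across a DST change is untested, though - it still relies on the same clock.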

I only take a snapshot when movement is detected.

- id: d70f3e5f-8bcc-49bb-8aa4-3fc389f300e8
  alias: 'Drive Camera Motion'
  max_exceeded: 'silent'
  trigger:
    platform: state
    entity_id:
    - binary_sensor.pir_drive
    from: 'off'
    to: 'on'
  action:
  - service: camera.snapshot
    data:
      entity_id: camera.front
      filename: '/config/www/snapshots/drive1.jpg'
  - delay: '00:00:01'
  - service: camera.snapshot
    data:
      entity_id: camera.front
      filename: '/config/www/snapshots/drive2.jpg'
  - delay: '00:00:01'
  - service: camera.snapshot
    data:
      entity_id: camera.front
      filename: '/config/www/snapshots/drive3.jpg'
  - service: script.turn_on
    entity_id: script.drive_cam_telegram
  - service: notify.gmail
    data:
      title: 'Drive Movement'
      message: 'Drive movement detected'
      data:
        images:
          - /config/www/snapshots/drive1.jpg
          - /config/www/snapshots/drive2.jpg
          - /config/www/snapshots/drive3.jpg
  - delay:
      seconds: 5 # trigger every 5 seconds at most

Yeah, everything with time_pattern just fires constantly on all cylinders.

That doesn’t help in this case - the problem is the cameras are using RTSP, and when you load the Lovelace view you don’t get a picture until the first time the picture card refreshes the feed (every 10 seconds or so). This workaround means there is at least a recent image when you look at the camera page. It’s required on multiple installations, not just mine.

Here is another automation that ensures the boiler state stays synchronised across several ESPHome thermostats, in case they have been rebooted or temporarily lost their WiFi connection.

alias: Sensor - Boiler Running
description: Update State of Thermostats
trigger:
  - entity_id: binary_sensor.boiler_working
    platform: state
  - hours: '*'
    minutes: /5
    platform: time_pattern
    seconds: '0'
condition: []
action:
  - data: {}
    target:
      entity_id:
        - switch.boiler_status_led
        - switch.boiler_status_led_2
        - switch.boiler_status_led_3
    service: >-
      switch.turn_{{ 'on' if is_state('binary_sensor.boiler_working','on') else
      'off' }}

Same here. HA crashed, and I have an automation with a time pattern trigger.
And of course it happened right when I was heating… so much for a nice warm house this morning :smiley: :cry:

My system crashed last night. I haven’t figured out yet what it could be, but we have some time to fix this :wink:

You are putting your CPU to that much use 24/7 just to refresh an image in case you might look at it?

Browser Mod can refresh the image only when Lovelace is open (use the device sensor’s visibility attribute to trigger a script that captures your image).
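
As a rough sketch of that idea - note the entity ID and attribute values here are assumptions, so check what your Browser Mod version actually exposes for your registered browsers:

```yaml
alias: Snapshot when dashboard becomes visible (sketch)
trigger:
  # Hypothetical Browser Mod per-browser sensor; substitute your own
  # browser's entity ID and verify the visibility attribute values.
  - platform: state
    entity_id: sensor.browser_livingroom_tablet
    attribute: visibility
    to: visible
condition: []
action:
  - service: camera.snapshot
    target:
      entity_id: camera.livingroom_mainstream
    data:
      filename: /config/www/cameras/rtsp_livingroom_main.jpg
mode: single
```

That way the CPU only does snapshot work while someone is actually looking at the dashboard.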

The other one is easily converted to state based triggers too:

alias: Sensor - Boiler Running
description: Update State of Thermostats
trigger:
  - platform: state
    entity_id: binary_sensor.boiler_working
    to:
      - 'on'
      - 'off'
  - platform: homeassistant
    event: start
  - platform: state
    entity_id: 
      - switch.boiler_status_led
      - switch.boiler_status_led_2
      - switch.boiler_status_led_3
    from: 'unavailable'
condition: []
action:
  - data: {}
    target:
      entity_id:
        - switch.boiler_status_led
        - switch.boiler_status_led_2
        - switch.boiler_status_led_3
    service: >-
      switch.turn_{{ 'on' if is_state('binary_sensor.boiler_working','on') else
      'off' }}

This is getting off topic. Open another topic if you want help making your automations more efficient.


Same issue for me. Last night we moved out of daylight saving time, and at 3 AM Home Assistant stopped receiving updates from sensors and got stuck at the last value received. I checked the logs in ESPHome, and they were still detecting the right values. Also, none of my automations ran overnight. After a restart of the virtual machine, everything started to work. I’m running core-2021.10.6 and Home Assistant OS 6.5.


Same for me. My HA crashed exactly at 3 AM, and the first error message in the log is indeed linked to a time_pattern-triggered automation. Interestingly though, HA kept working (or trying to) for almost another 7 minutes before finally stopping. In that time it generated almost 600,000 various error messages in the log - 79 MB of data :slight_smile:

For me, Home Assistant continued to run (data was still captured from sensors and sent to InfluxDB) but my graphs were “flat”. The relevant error was “The recorder queue reached the maximum size” - but I have no idea why the clocks changing would cause the recorder queue to build up.

My recorder database is the default SQLite one.

System Health
version: core-2021.10.2
installation_type: Home Assistant Core
dev: false
hassio: false
docker: false
user: homeassistant
virtualenv: true
python_version: 3.9.1
os_name: Linux
os_version: 5.10.17-v8+
arch: aarch64
timezone: Europe/London

logged_in: false
can_reach_cert_server: ok
can_reach_cloud_auth: ok
can_reach_cloud: ok

FYI:
I have a notification each time HA reboots and I had like 6 notifications in a row at 3am.
(yes I have automations with the time pattern trigger)

For me, HA was still running, but there was no history after the clocks went back an hour, with this error:

The recorder queue reached the maximum size of 30000; Events are no longer being recorded

Seems fixed by a restart.


Yep, “backend” crashed for me as well.
Besides the “recorder” message, ZHA seemed to have lost it as well.

2021-10-31 02:02:26 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:04:37 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:05:47 ERROR (MainThread) [homeassistant.components.recorder] The recorder queue reached the maximum size of 30000; Events are no longer being recorded
2021-10-31 02:09:35 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:20:58 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:27:18 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:29:17 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:34:22 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:37:58 WARNING (MainThread) [homeassistant.helpers.entity] Update of sensor.plug_tuya_01_electrical_measurement is taking over 10 seconds
2021-10-31 02:38:00 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:38:16 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:38:31 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:38:46 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:39:01 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout: 
2021-10-31 02:39:50 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2021-10-31 02:40:04 ERROR (bellows.thread_0) [bellows.uart] Lost serial connection: device reports readiness to read but returned no data (device disconnected or multiple access on port?)
2021-10-31 02:40:04 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
2021-10-31 02:40:04 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2021-10-31 02:40:04 ERROR (bellows.thread_0) [bellows.uart] Lost serial connection: device reports readiness to read but returned no data (device disconnected or multiple access on port?)
2021-10-31 02:40:04 ERROR (bellows.thread_0) [bellows.uart] Lost serial connection: device reports readiness to read but returned no data (device disconnected or multiple access on port?)
2021-10-31 02:40:04 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
2021-10-31 02:40:04 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
2021-10-31 02:47:22 ERROR (MainThread) [homeassistant] Error doing job: Task was destroyed but it is pending!
2021-10-31 02:47:22 ERROR (MainThread) [homeassistant] Error doing job: Task was destroyed but it is pending!

Might be worth some testing before our western Atlantic friends do their DST jump next week…

My system crashed too :thinking:

I woke up to a crashed instance at exactly 2 AM too. I used the console to do a core restart, which brought it back to life.

I thought something had happened as a motion sensor didn’t trigger lights when I walked past.

First time anything like this has happened.

Yes it looks like Home Assistant crashed for me as well at 02:00 (UK DST applied).

Of course, logs don’t survive a reboot, so I have absolutely no idea why.

I do wish the system logs were saved and rotated… :frowning:

[edit]
HA Blue
core-2021.10.6
supervisor-2021.10.6
Home Assistant OS 6.5

Same here. And I’m 150 km away from home; I can still connect over SSH via ZeroTier One, but I didn’t take my SSH key with me.

How come Supervisor doesn’t detect the problem and restart HA?

Yes, same here. I have two HA instances in two different locations. Both crashed tonight when we switched to winter time. I have no clue why. A hard reboot fixed the first one. The second one is an hour away by car… maybe I should put one of my old 433 MHz Telldus switches in front of HA so I can reboot it from a distance via the Telldus app…

I have the same problem with HA Supervised.
However, my system rebooted itself at 06:47 CET this morning, after which all went back to normal.

The history graph problems also reported are sort of cosmetic, but the DST time problem is a critical bug. I have a temperamental ESP32-based camera that I power cycle every hour to keep it active. When I woke at 01:15 it was resetting approximately every second. If that automation had been controlling a more sensitive device, such as a fridge or freezer, the unit would almost certainly have been damaged.