Simple(?) Configuration keeps crashing - help!

After months of trying to track down why my Home Assistant server crashes every 5-7 days, I finally gave up and started over from scratch.
I downloaded the latest release for the RPi, deleted the unused entries from all my config files, removed config files no longer in use, and reinstalled MQTT, Mosquitto and File Editor.
Then I copied my configuration files back in and rebooted. I cleaned up some misc issues with the MQTT login, rebooted again, and it all ran for about 10 minutes with no errors. Then it just FROZE with ‘Reload UI’.

Attached are all my yaml files. I really don’t know why this is happening. I really can’t get it more basic than this. Hopefully someone can point out what I’ve done wrong.

Thanks.

configuration.yaml

homeassistant:
  # Name of the location where Home Assistant is running
  name: Home
  # Location required to calculate the time the sun rises and sets
  latitude: 0
  longitude: 0
  # Impacts weather/sunrise data (altitude above sea level in meters)
  elevation: 4
  # metric for Metric, imperial for Imperial
  unit_system: metric
  # Pick yours from here: http://en.wikipedia.org/wiki/List_of_tz_database_time_zones
  time_zone: Atlantic/Madeira
  # Customization file
  customize: !include customize.yaml

recorder:
  purge_interval: 2
  purge_keep_days: 7
  db_url: !secret mysql_recorder

lovelace:
  mode: yaml

# Enables configuration UI
config:

# Checks for available updates
# Note: This component will send some information about your system to
# the developers to assist with development of Home Assistant.
# For more information, please see:
# https://home-assistant.io/blog/2016/10/25/explaining-the-updater/
updater:
  # Optional, allows Home Assistant developers to focus on popular components.
  # include_used_components: true
  
# System Health
system_health:

# Discover some devices automatically
discovery:

# Allows you to issue voice commands from the frontend in enabled browsers
conversation:

# Enables support for tracking state changes over time
history:

# View all events in a logbook
logbook:

# Enables a map showing the location of tracked devices
map:

# Track the sun
sun:

# Cloud
cloud:

group: !include groups.yaml
automation: !include automations.yaml
script: !include scripts.yaml

panel_custom:
  - name: hassio-main
    sidebar_title: Configurator
    sidebar_icon: hass:settings
    js_url: /api/hassio/app/entrypoint.js
    url_path: configurator
    embed_iframe: true
    require_admin: true
    config:
      ingress: core_configurator    
    
# Sonoff Switches
switch TaskLamp:
  - platform: mqtt
    name: "Task Lamp"
    command_topic: "cmnd/sonoff-01/power"
    state_topic: "stat/sonoff-01/POWER"
    qos: 1
    payload_on: "ON"
    payload_off: "OFF"
    retain: false
    
switch TableLamp:
  - platform: mqtt
    name: "Table Lamp"
    command_topic: "cmnd/sonoff-02/power"
    state_topic: "stat/sonoff-02/POWER"
    qos: 1
    payload_on: "ON"
    payload_off: "OFF"
    retain: false
    
camera:
  - platform: foscam
    ip: 172.27.2.21
    port: 88
    username: !secret foscam_username
    password: !secret foscam_password
    name: Kitchen
  - platform: foscam
    ip: 172.27.2.22
    port: 88
    username: !secret foscam_username
    password: !secret foscam_password
    name: Front Door

sensor Livingroom:
- platform: mqtt
  name: "Temperature"
  state_topic: "tele/sonoff-02/SENSOR"
  value_template: '{{ value_json.SI7021.Temperature }}'
  unit_of_measurement: "°C"
  availability_topic: "tele/sonoff-02/LWT"
  payload_available: "Online"
  payload_not_available: "Offline"
- platform: mqtt
  name: "Humidity"
  state_topic: "tele/sonoff-02/SENSOR"
  value_template: '{{ value_json.SI7021.Humidity }}'
  unit_of_measurement: "%"
  availability_topic: "tele/sonoff-02/LWT"
  payload_available: "Online"
  payload_not_available: "Offline"
  
sensor:       
  - platform: time_date
    display_options: 
      - 'date_time'

#### JGAurora A4 3D Printer configurations
switch A4:
  - platform: mqtt
    name: "A4 3D Printer"
    command_topic: "cmnd/sonoff-a4/power"
    state_topic: "stat/sonoff-a4/POWER1"
    qos: 1
    payload_on: "ON"
    payload_off: "OFF"
    retain: false

#### JGMaker A6 3D Printer configurations
switch A6:
  - platform: mqtt
    name: "A6 3D Printer"
    command_topic: "cmnd/sonoff-a6/power"
    state_topic: "stat/sonoff-a6/POWER1"
    qos: 1
    payload_on: "ON"
    payload_off: "OFF"
    retain: false

#### JGMaker Magic 3D Printer configurations
switch Magic:
  - platform: mqtt
    name: "Magic 3D Printer"
    command_topic: "cmnd/sonoff-magic/power"
    state_topic: "stat/sonoff-magic/POWER1"
    qos: 1
    payload_on: "ON"
    payload_off: "OFF"
    retain: false

#### AnyCubic Kossel 3D Printer configurations
switch Kossel:
  - platform: mqtt
    name: "Kossel 3D Printer"
    command_topic: "cmnd/sonoff-kossel/power"
    state_topic: "stat/sonoff-kossel/POWER1"
    qos: 1
    payload_on: "ON"
    payload_off: "OFF"
    retain: false

#### Hevo 3D Printer configurations
switch Hevo:
  - platform: mqtt
    name: "Hevo 3D Printer"
    command_topic: "cmnd/sonoff-hevo/power"
    state_topic: "stat/sonoff-hevo/POWER1"
    qos: 1
    payload_on: "ON"
    payload_off: "OFF"
    retain: false

#### Artist D 3D Printer configurations
switch Artist_D:
  - platform: mqtt
    name: "Artist D 3D Printer"
    command_topic: "cmnd/sonoff-ad/power"
    state_topic: "stat/sonoff-ad/POWER1"
    qos: 1
    payload_on: "ON"
    payload_off: "OFF"
    retain: false

customize.yaml

switch.a4_3d_printer:
  icon: mdi:power

switch.a6_3d_printer:
  icon: mdi:power

switch.magic_3d_printer:
  icon: mdi:power

switch.kossel_3d_printer:
  icon: mdi:power

switch.hevo_3d_printer:
  icon: mdi:power

switch.artist_d_3d_printer:
  icon: mdi:power

ui-lovelace.yaml

title: Liànyù
views:
    ############ Home Tab ############
  - title: Home
    columns: 3
    cards:
      - type: vertical-stack
        cards:
          - type: sensor
            entity: sensor.date_time
          - type: horizontal-stack
            cards:
              - entity: switch.table_lamp
                tap_action:
                  action: toggle
                type: entity-button
                icon: mdi:lamp
              - entity: switch.task_lamp
                tap_action:
                  action: toggle
                type: entity-button
                icon: mdi:desk-lamp
      - type: vertical-stack
        cards:
          - type: horizontal-stack
            cards:
              - entity: sensor.temperature
                type: sensor
                icon: mdi:thermometer
              - entity: sensor.humidity
                type: sensor
                icon: mdi:water-percent
          - type: history-graph
            title: Environment
            refresh: 60
            refresh_interval: 60
            entities:
              - entity: sensor.temperature
                name: Temperature
              - entity: sensor.humidity
                name: Humidity
      - type: break
      - type: vertical-stack
        cards:
          - type: picture-glance
            title: Front Door
            entities: 
              - entity: switch.task_lamp
                icon: mdi:desk-lamp
            camera_image: camera.front_door
          - type: picture-entity
            entity: camera.kitchen
            camera_image: camera.kitchen
                    
    ############ 3D Printers Tab ############
  - title: 3D Printers
    cards:
      - type: entities
        title: Hevo
        show_header_toggle: false
        entities:
          - entity: switch.hevo_3d_printer
            name: '***Power!!!'
      - type: entities
        title: JGAurora A4
        show_header_toggle: false
        entities:
          - entity:  switch.a4_3d_printer
            name: '***Power!!!'
      - type: entities
        title: JGMaker A6
        show_header_toggle: false
        entities:
          - entity:  switch.a6_3d_printer
            name: '***Power!!!'
      - type: entities
        title: JGMaker Magic
        show_header_toggle: false
        entities:
          - entity:  switch.magic_3d_printer
            name: '***Power!!!'
      - type: entities
        title: AnyCubic Kossel
        show_header_toggle: false
        entities:
          - entity:  switch.kossel_3d_printer
            name: '***Power!!!'
      - type: entities
        title: Artist-D
        show_header_toggle: false
        entities:
          - entity: switch.artist_d_3d_printer
            name: '***Power!!!'

Capital letters in yaml? Didn’t think that was possible.

And sensor Livingroom: seems to have an indentation error.
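
For what it's worth, a list that starts at the same indent as its parent key is still valid YAML, which may be why nothing complained. A sketch of the same block indented to match the rest of the file, with names and topics copied from the config above and nothing else changed:

```yaml
# Same two MQTT sensors as in the post, only re-indented so the
# list items sit two spaces under the "sensor Livingroom:" key.
sensor Livingroom:
  - platform: mqtt
    name: "Temperature"
    state_topic: "tele/sonoff-02/SENSOR"
    value_template: '{{ value_json.SI7021.Temperature }}'
    unit_of_measurement: "°C"
    availability_topic: "tele/sonoff-02/LWT"
    payload_available: "Online"
    payload_not_available: "Offline"
  - platform: mqtt
    name: "Humidity"
    state_topic: "tele/sonoff-02/SENSOR"
    value_template: '{{ value_json.SI7021.Humidity }}'
    unit_of_measurement: "%"
    availability_topic: "tele/sonoff-02/LWT"
    payload_available: "Online"
    payload_not_available: "Offline"
```

Both layouts should load as the same list of two sensors, so this is a readability fix rather than a likely cause of the freezes.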

Thanks! I totally missed that, but it didn’t throw any errors and seems to be working. However, I have corrected it and restarted Home Assistant. Hope that was the issue!

Still died 1 week later. It just locks up and fails to respond over http or ssh. Only way to get it working again is to power off (unplug power) and plug it back in.

Any ideas ???

Like clockwork, it just crashed again.

purge_interval has been deprecated. The purge now defaults to every day. If you want every other day, use an automation to call recorder.purge and set auto_purge: to false.
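
A minimal sketch of that suggestion, based on the recorder block from the first post. The automation alias, the 03:00 trigger time and the even-day-of-year condition are illustrative assumptions and not something from this thread; only auto_purge and recorder.purge are the names mentioned above.

```yaml
# configuration.yaml
recorder:
  auto_purge: false          # switch off the built-in daily purge
  purge_keep_days: 7
  db_url: !secret mysql_recorder

# With "automation: !include automations.yaml" the list item below
# would go into automations.yaml instead, without the "automation:" key.
automation:
  - alias: "Recorder purge every other day"   # hypothetical name
    trigger:
      - platform: time
        at: "03:00:00"                         # assumed quiet hour
    condition:
      - condition: template
        # True on even days of the year, so roughly every second day
        # (the gap is three days once, at the year boundary).
        value_template: "{{ now().strftime('%j') | int % 2 == 0 }}"
    action:
      - service: recorder.purge
        data:
          keep_days: 7
```

The trigger/condition syntax follows the style current when this thread was written; newer releases may prefer different key names, so check it against the version you run.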

Oh. Ok. I have removed that…but…there’s my 7 days crash interval:

recorder:
  purge_interval: 2
  purge_keep_days: 7
  db_url: !secret mysql_recorder

It’s not like it’s saving a lot of data. Maybe I’ll disable purge_keep_days as well and see if it lives any longer.

Thanks!

This sounds hardware-related to me. Your configuration may put more or less load on the system, which will influence its crash frequency, but it shouldn’t be doing this at all. A new SD card might be the answer, if you’re sure it’s not overheating.

Can you ssh into the Pi, or even ping it, once it’s frozen?


Also, are you using an official (or at least confirmed working) power supply?

Does it crash every Monday at like 2am? If the answer is yes, get a new SD card. There’s a whole thread about this and the solution for everyone was SD card related.

I’m just going to go out on a limb and say that the problem is definitely SD card related. It crashes on a Monday based on your post history, you’re using a Pi, and most likely using an SD card.

Thanks everyone for the suggestions. Here are some answers:

It’s not every Monday, it’s every 7 days from the last crash or manual reboot.

It’s a new SD card. I’ve tried several. Still crashes. It’s a 32GB Class 10.

This is my second power supply. It’s a 50W unit with 4 x 5.2V 2.5A outlets and 1 USB C outlet with nothing plugged into it. There are two other Raspberry Pis (3B and 2B) plugged into the other two USB A outlets running other services, and they have never crashed. This power supply has LCD displays showing the voltage and power draw of each Pi plugged into it. Currently they are running between 0.1A and 0.2A @ 5.2V.

As for overheating, the temps are around 40°C, with a heatsink and a fan. But I’ve disabled the Pi monitor plugin as part of my shotgun testing of this problem. There’s not much left of my original configuration, and not much left to go wrong.

If this still crashes, I’ll steal the Samsung 64GB EVO Select micro XC class 3 SD card from my camera and give it a try.

You sound exactly like everyone else on that thread and you’ve posted every Monday. Just saying. Every single person says “It’s not my SD card. The card is XYZ and I have 8 Pis running with this SD.” Then a month later: “I replaced the SD and now it no longer crashes.”

I hope you’re right and the Samsung SD card fixes it. But like I said, I’ve already replaced the card several times over the many months this has been happening. The only thing I haven’t done is try the XC card from Samsung. And just as a TL;DR - it’s not every Monday, it’s every 7 days. If I manually reboot in the middle of the week, it will crash in the middle of next week.

Still…I hope it’s the SD card, that it puts an end to this saga, and that I can start putting my configuration back together the way it was. :vulcan_salute:

I’ll keep everyone posted.


So for giggles, reboot Wednesday. If it crashes next Wednesday it’ll most likely be database related, and I would recommend moving to a different database instead of SQLite.

Been there, done that. Don’t see a reason to do that again.
I’m using MariaDB.

It appears to be in the throes of imminent death - well before its usual 1 week lifespan. Sensors and cameras are timing out. The system is losing connection, etc.
Here’s the current system log and a screenshot of the usage.

21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.check] System checks complete
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
21-06-16 09:25:59 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
21-06-16 09:25:59 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
21-06-16 09:25:59 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
21-06-16 10:13:10 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:13:10 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
21-06-16 10:14:45 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.check] System checks complete
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
21-06-16 10:26:00 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
21-06-16 10:26:00 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
21-06-16 10:26:00 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
21-06-16 10:26:10 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:26:38 INFO (MainThread) [supervisor.jobs] 'Tasks._update_addons' blocked from execution, no host internet connection
21-06-16 10:26:53 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:27:19 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
21-06-16 10:27:33 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:28:24 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:29:04 WARNING (MainThread) [supervisor.jobs] 'GitRepo.pull' blocked from execution, no supervisor internet connection
21-06-16 10:29:04 WARNING (MainThread) [supervisor.jobs] 'GitRepo.pull' blocked from execution, no supervisor internet connection
21-06-16 10:29:04 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-68136' coro=<Repository.update() done, defined at /usr/src/supervisor/supervisor/store/repository.py:106> exception=StoreJobError("'GitRepo.pull' blocked from execution, no supervisor internet connection")>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/store/repository.py", line 110, in update
    await self.git.pull()
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 86, in wrapper
    raise self.on_condition(error_msg, _LOGGER.warning) from None
supervisor.exceptions.StoreJobError: 'GitRepo.pull' blocked from execution, no supervisor internet connection
21-06-16 10:29:04 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-68138' coro=<Repository.update() done, defined at /usr/src/supervisor/supervisor/store/repository.py:106> exception=StoreJobError("'GitRepo.pull' blocked from execution, no supervisor internet connection")>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/store/repository.py", line 110, in update
    await self.git.pull()
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 86, in wrapper
    raise self.on_condition(error_msg, _LOGGER.warning) from None
supervisor.exceptions.StoreJobError: 'GitRepo.pull' blocked from execution, no supervisor internet connection
21-06-16 10:29:07 INFO (MainThread) [supervisor.jobs] 'StoreManager.update_repositories' blocked from execution, no supervisor internet connection
21-06-16 10:29:07 INFO (MainThread) [supervisor.store] Loading add-ons from store: 63 all - 0 new - 0 remove
21-06-16 10:29:17 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:31:25 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:32:13 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:32:54 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:44:12 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
21-06-16 10:55:45 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:02:52 INFO (MainThread) [supervisor.host.info] Updating local host information
21-06-16 11:02:54 INFO (MainThread) [supervisor.host.services] Updating service information
21-06-16 11:02:55 INFO (MainThread) [supervisor.host.network] Updating local network information
21-06-16 11:03:02 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
21-06-16 11:03:02 INFO (MainThread) [supervisor.host] Host information reload completed
21-06-16 11:07:08 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:07:59 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:08:50 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:09:41 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:10:32 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:26:00 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
21-06-16 11:26:00 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
21-06-16 11:26:00 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
21-06-16 11:26:11 WARNING (MainThread) [supervisor.utils.pwned] Can't fetch HIBP data: Timeout
21-06-16 11:26:22 WARNING (MainThread) [supervisor.utils.pwned] Can't fetch HIBP data: Timeout
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.check] System checks complete
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete

When it craps out, I’ll swap out the SD card for the Samsung one and see how it goes - unless someone sees the real issue from this log.

Why is the ip of your host in the internal ip address range?

What do you mean? Where? 172.27.3.4? What’s wrong with this? It’s not internet facing.

It should still have a network IP. Are all devices on your router starting with 172? Typically that’s reserved for use internal to a computer, not a network.