Simple(?) Configuration keeps crashing - help!

Like clockwork, it just crashed again.

has been deprecated. Defaults to every day. If you want every other day, use an automation to call recorder.purge and set auto_purge: to false.

Oh. Ok. I have removed that…but…there’s my 7-day crash interval:

recorder:
  purge_interval: 2
  purge_keep_days: 7
  db_url: !secret mysql_recorder

It’s not like it’s saving a lot of data. Maybe I’ll disable purge_keep_days as well and see if it lives any longer.

Thanks!

This sounds hardware-related to me. Your configuration may put more or less load on the system, which will influence its crash frequency, but it shouldn’t be doing this at all. New SD card might be the answer, if you’re sure it’s not overheating.

Can you ssh into the Pi, or even ping it, once it’s frozen?
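For anyone wanting to script that check from another machine on the LAN, here is a minimal sketch. It only tests whether a TCP port answers, which is a rough stand-in for "can I still ssh in". The function name `is_reachable` and the default port 22 are assumptions for illustration; they do not come from Home Assistant.

```python
import socket


def is_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `is_reachable("<pi-address>", 22)` while the Pi is frozen would return `False` if nothing is listening or the host is down. If it returns `True` while the Home Assistant UI is dead, the operating system is probably still alive and the problem is higher up the stack.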

Also, are you using an official (or at least confirmed working) power supply?

Does it crash every Monday at like 2am? If the answer is yes, get a new SD card. There’s a whole thread about this, and the solution for everyone was SD card related.

I’m just going to go out on a limb and say that the problem is definitely SD card related. It crashes on a Monday based on your post history, you’re using a Pi, and you’re most likely using an SD card.

Thanks everyone for the suggestions. Here are some answers:

It’s not every Monday, it’s every 7 days from the last crash or manual reboot.

It’s a new SD card. I’ve tried several. Still crashes. It’s a 32GB Class 10.

This is my second power supply. It’s a 50W unit with 4 x 5.2V 2.5A outlets and 1 USB-C outlet with nothing plugged into it. Two other Raspberry Pis (3B and 2B) are plugged into two other USB-A outlets, running other services, and they have never crashed. This power supply has LCD displays showing the voltage and power draw of each Pi plugged into it. Currently they are running between 0.1A and 0.2A @ 5.2V.

As for overheating: the temps are around 40°C, with a heatsink and a fan. But I’ve disabled the Pi monitor plugin as part of my shotgun testing of this problem. There’s not much left of my original configuration, and not much left to go wrong.

If this still crashes, I’ll steal the Samsung 64GB EVO Select microSDXC Class 3 card from my camera and give it a try.

You sound exactly like everyone else on that thread, and you’ve posted every Monday. Just saying. Every single person: “It’s not my SD card. The card is XYZ and I have 8 Pis running with this SD.” Then a month later: “I replaced the SD and now it no longer crashes.”

I hope you’re right and the Samsung SD card fixes it. But like I said, I’ve already replaced the card several times over the many months this has been happening. The only thing I haven’t done is try the XC card from Samsung. And just as a TL;DR: it’s not every Monday, it’s every 7 days. If I manually reboot in the middle of the week, it will crash in the middle of next week.

Still…I hope it’s the SD card, that this puts an end to the saga, and that I can start putting my configuration back together the way it was. :vulcan_salute:

I’ll keep everyone posted.

So for giggles, reboot on Wednesday. If it crashes next Wednesday, it’ll most likely be database related, and I would recommend moving to a different database instead of SQLite.

Been there, done that. Don’t see a reason to do that again.
I’m using MariaDB

It appears to be in the throes of imminent death - well before its usual 1 week lifespan. Sensors and cameras are timing out. The system is losing connection, etc.
Here’s the current system log and a screenshot of the usage.

21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.check] System checks complete
21-06-16 09:25:58 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
21-06-16 09:25:59 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
21-06-16 09:25:59 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
21-06-16 09:25:59 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
21-06-16 10:13:10 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:13:10 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
21-06-16 10:14:45 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.check] System checks complete
21-06-16 10:25:59 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
21-06-16 10:26:00 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
21-06-16 10:26:00 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
21-06-16 10:26:00 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
21-06-16 10:26:10 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:26:38 INFO (MainThread) [supervisor.jobs] 'Tasks._update_addons' blocked from execution, no host internet connection
21-06-16 10:26:53 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:27:19 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
21-06-16 10:27:33 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:28:24 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:29:04 WARNING (MainThread) [supervisor.jobs] 'GitRepo.pull' blocked from execution, no supervisor internet connection
21-06-16 10:29:04 WARNING (MainThread) [supervisor.jobs] 'GitRepo.pull' blocked from execution, no supervisor internet connection
21-06-16 10:29:04 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-68136' coro=<Repository.update() done, defined at /usr/src/supervisor/supervisor/store/repository.py:106> exception=StoreJobError("'GitRepo.pull' blocked from execution, no supervisor internet connection")>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/store/repository.py", line 110, in update
    await self.git.pull()
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 86, in wrapper
    raise self.on_condition(error_msg, _LOGGER.warning) from None
supervisor.exceptions.StoreJobError: 'GitRepo.pull' blocked from execution, no supervisor internet connection
21-06-16 10:29:04 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-68138' coro=<Repository.update() done, defined at /usr/src/supervisor/supervisor/store/repository.py:106> exception=StoreJobError("'GitRepo.pull' blocked from execution, no supervisor internet connection")>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/store/repository.py", line 110, in update
    await self.git.pull()
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 86, in wrapper
    raise self.on_condition(error_msg, _LOGGER.warning) from None
supervisor.exceptions.StoreJobError: 'GitRepo.pull' blocked from execution, no supervisor internet connection
21-06-16 10:29:07 INFO (MainThread) [supervisor.jobs] 'StoreManager.update_repositories' blocked from execution, no supervisor internet connection
21-06-16 10:29:07 INFO (MainThread) [supervisor.store] Loading add-ons from store: 63 all - 0 new - 0 remove
21-06-16 10:29:17 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:31:25 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:32:13 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:32:54 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 10:44:12 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
21-06-16 10:55:45 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:02:52 INFO (MainThread) [supervisor.host.info] Updating local host information
21-06-16 11:02:54 INFO (MainThread) [supervisor.host.services] Updating service information
21-06-16 11:02:55 INFO (MainThread) [supervisor.host.network] Updating local network information
21-06-16 11:03:02 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
21-06-16 11:03:02 INFO (MainThread) [supervisor.host] Host information reload completed
21-06-16 11:07:08 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:07:59 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:08:50 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:09:41 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:10:32 WARNING (MainThread) [supervisor.host.network] Can't update connectivity information: Error: Timeout was reached
21-06-16 11:26:00 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
21-06-16 11:26:00 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
21-06-16 11:26:00 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
21-06-16 11:26:11 WARNING (MainThread) [supervisor.utils.pwned] Can't fetch HIBP data: Timeout
21-06-16 11:26:22 WARNING (MainThread) [supervisor.utils.pwned] Can't fetch HIBP data: Timeout
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.check] System checks complete
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
21-06-16 11:26:22 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete

When it craps out, I’ll swap out the SD card for the Samsung one and see how it goes - unless someone sees the real issue from this log.

Why is the IP of your host in the internal IP address range?

What do you mean? Where? 172.27.3.4? What’s wrong with this? It’s not internet facing

It should still have a network IP. Are all devices on your router starting with 172? Typically that’s reserved for use internal to a computer, not a network.

I think you’re confused with 127.x.x.x addresses
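To make the 127 vs. 172 distinction concrete, here is a small sketch using Python's standard `ipaddress` module. The helper name `classify` and its labels are made up for illustration. The three networks listed are the RFC 1918 private ranges, and 127.0.0.0/8 is the loopback range.

```python
import ipaddress

# The three RFC 1918 ranges reserved for private (LAN) addressing.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def classify(text: str) -> str:
    """Label an IPv4 address as loopback, private LAN, or other."""
    addr = ipaddress.ip_address(text)
    if addr.is_loopback:  # 127.0.0.0/8, internal to the machine itself
        return "loopback"
    if any(addr in net for net in RFC1918):
        return "private LAN"
    return "other"


print(classify("172.27.3.4"))  # private LAN
print(classify("127.0.0.1"))   # loopback
```

So 172.27.3.4 falls inside 172.16.0.0/12 (172.16.x.x through 172.31.x.x), which is a perfectly ordinary LAN range, while only 127.x.x.x addresses are confined to the host itself.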

I think you’re right about the 127 / 172 confusion, but the error messages still look like some sort of network meltdown. What’s your setup? Is there any chance of an IP address collision / duplication?

On the network, I have a pfSense router with a DHCP server and DNS resolver. Everything below .100 is assigned a fixed IP address by the DHCP server, keyed on MAC address, with a static ARP table entry and a local DNS name. Those above .100 are dynamic. Things like PCs, phones, etc. that come and go are put in the .100 to .250 range. pfSense is not showing any IP address conflicts, and Fing doesn’t show any either.

Well, it’s been well over an hour and it didn’t crash. Things magically returned to normal, and it’s no longer losing connections or timing out. No more warnings or errors in the system log. I didn’t touch or change anything.
Very odd.