HA stops work every Monday

For some reason, importing the public key from a USB drive to set up SSH access on port 22222 fails. I get “Unknown error, see supervisor logs”. The Supervisor log only confirms that the USB drive is recognized, but doesn’t display any error:

21-03-20 16:56:09 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.ADD usb hardware /dev/bus/usb/001/007

The core log gives a hint:

2021-03-20 18:04:11 ERROR (MainThread) [homeassistant.components.hassio.handler] Client error on os/config/sync request Cannot connect to host 172.30.32.2os:80 ssl:default [Name does not resolve]
2021-03-20 18:04:11 ERROR (MainThread) [homeassistant.components.hassio] Failed to to call os/config/sync - 

The host log only shows the ‘mounting’ of the drive:

[430112.206756] usb 1-1.1.3: new high-speed USB device number 7 using dwc_otg
[430112.341873] usb 1-1.1.3: New USB device found, idVendor=13fe, idProduct=3e00, bcdDevice= 1.00
[430112.341892] usb 1-1.1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[430112.341904] usb 1-1.1.3: Product: USB Flash Drive
[430112.341916] usb 1-1.1.3: Manufacturer: Philips
[430112.341928] usb 1-1.1.3: SerialNumber: 07B8120890C73B5B
[430112.343881] usb-storage 1-1.1.3:1.0: USB Mass Storage device detected
[430112.344558] scsi host0: usb-storage 1-1.1.3:1.0
[430112.511037] usbcore: registered new interface driver uas
[430113.493635] scsi 0:0:0:0: Direct-Access     Philips  USB Flash Drive  PMAP PQ: 0 ANSI: 0 CCS
[430114.083144] sd 0:0:0:0: [sda] 7570752 512-byte logical blocks: (3.88 GB/3.61 GiB)
[430114.083609] sd 0:0:0:0: [sda] Write Protect is off
[430114.083627] sd 0:0:0:0: [sda] Mode Sense: 23 00 00 00
[430114.084058] sd 0:0:0:0: [sda] No Caching mode page found
[430114.084250] sd 0:0:0:0: [sda] Assuming drive cache: write through
[430114.140458]  sda: sda1
[430114.144340] sd 0:0:0:0: [sda] Attached SCSI removable disk

The drive is FAT-formatted and is named ‘CONFIG’. The file containing the public key is named authorized_keys and is ANSI-encoded with LF line endings only (no CR). I do not know why it fails. Any tips?
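To rule out the file itself, a small check along these lines might help. This is only a sketch: the function name `check_authorized_keys` and the list of key-type prefixes are assumptions made for illustration, not anything taken from the HA OS import code, and it assumes key lines without an options prefix.

```python
# Prefixes of common OpenSSH public key types (illustrative, not exhaustive).
KEY_TYPE_PREFIXES = (
    "ssh-rsa",
    "ssh-ed25519",
    "ecdsa-sha2-",
    "sk-ssh-ed25519@",
    "sk-ecdsa-sha2-",
)


def check_authorized_keys(data: bytes) -> list:
    """Return a list of human-readable problems found in the raw file content."""
    problems = []
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte order mark")
    if b"\r" in data:
        problems.append("file contains CR characters")
    if any(byte > 0x7F for byte in data):
        problems.append("file contains non-ASCII bytes")

    text = data.decode("ascii", errors="replace")
    key_lines = 0
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are allowed
        key_lines += 1
        fields = stripped.split()
        # Assumption: each key line starts with the key type, followed by base64 data.
        if len(fields) < 2 or not fields[0].startswith(KEY_TYPE_PREFIXES):
            problems.append(
                f"line {number} does not look like '<type> <base64> [comment]'"
            )
    if key_lines == 0:
        problems.append("file contains no key lines")
    return problems
```

Read the file in binary mode, for example `check_authorized_keys(open(path, "rb").read())`, where `path` is wherever the CONFIG drive is mounted on your PC. An empty list means none of these particular problems were found; it does not prove the import will succeed.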

Hi all, my Pi went bananas again last night. I had hooked up a monitor to my Pi, and it showed error messages from 01:52 onwards.

I have added my findings to the GitHub entry, as the person involved there (Michael) suggested.
In short, the following error messages seem to be the interesting ones:
[#####.#####] mmc0: card never left busy state
[#####.#####] mmc0: error -110 whilst initialising SD card
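
If you can capture the console or host log as text, a filter like the one below can pull out the SD-card related lines. It is a rough sketch: `find_mmc_errors` and the keyword list are made up for this example, and the timestamp pattern also accepts the masked `#####.#####` form used above. For what it is worth, -110 is the Linux errno for ETIMEDOUT (a timeout).

```python
import re

# "[  123.456] mmc0: some message" -> timestamp, device, message
LINE = re.compile(r"^\[\s*(?P<stamp>[\d.#]+)\]\s+(?P<device>mmc\d+): (?P<message>.+)$")

# Substrings treated as suspicious here (an assumption, not an official list).
SUSPICIOUS = ("never left busy state", "error -")


def find_mmc_errors(log_text):
    """Return (timestamp, device, message) for each suspicious mmc log line."""
    hits = []
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match and any(word in match["message"] for word in SUSPICIOUS):
            hits.append((match["stamp"], match["device"], match["message"]))
    return hits
```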

Looking up these messages on Google turns up remarks about bad SD cards.
But I am not entirely convinced:

  1. The error only happens on Sundays at around 02:00
  2. Rebooting at any other time (hard or the proper way) does not cause this to happen

One remark did trigger a new line of thought though: my card might be a fake 32GB card.
In that case the card tells the system it is 32GB, but in reality it is only 16GB.
My guess is that at 02:00 the Pi runs some health check that includes the state of the SD card, tries to access the card beyond the 16GB mark, and goes into error mode.
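
The usual way to test for a fake-capacity card is to fill it with data that is unique per block and then read everything back. Below is a minimal sketch of that idea; the function names and the 1 MiB block size are my own choices, and dedicated tools such as f3 or H2testw do this far more thoroughly. Note that a read-back right after writing may be served from the OS cache, so for a real test the card should be re-inserted (or remounted) between writing and verifying.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 1024 * 1024  # 1 MiB per block (arbitrary choice for this sketch)


def block_pattern(index, size=BLOCK_SIZE):
    """Deterministic content that depends on the block index, so data that
    wrapped around and overwrote an earlier block is detectable."""
    seed = hashlib.sha256(index.to_bytes(8, "big")).digest()
    return (seed * (size // len(seed) + 1))[:size]


def write_blocks(path, count, size=BLOCK_SIZE):
    """Write `count` index-specific blocks to `path` and flush them to disk."""
    with open(path, "wb") as f:
        for index in range(count):
            f.write(block_pattern(index, size))
        f.flush()
        os.fsync(f.fileno())


def first_bad_block(path, count, size=BLOCK_SIZE):
    """Return the index of the first block that does not read back as written,
    or None if all `count` blocks match."""
    with open(path, "rb") as f:
        for index in range(count):
            if f.read(size) != block_pattern(index, size):
                return index
    return None


if __name__ == "__main__":
    # Demo against a temporary file. For a real card, point `path` at a file on
    # the mounted card and pick `count` so that the file fills the free space.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe.bin")
        write_blocks(path, count=8, size=4096)
        print(first_bad_block(path, count=8, size=4096))
```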

So, I am going to buy a new 32GB card, copy my HASS.IO setup to it, and see what happens. If no error occurs, I will check the old card to see if my assumption is correct.

Link to the GitHub entry:
HA stops work every Monday · Issue #47928 · home-assistant/core

I rebuilt my instance onto a new SD card last week - and also deleted the database after recovering from backup.
This morning - all still working for the first time in weeks (might be months by now)
Not sure which, if either, of the above changes made the difference. This is starting to look like an SD card issue as mentioned by Harmpert above.

@Wilber keep your fingers crossed! Next weekend will be the test!

Question:
How did you get your configuration onto the new card?

Yes, fair point! My database does seem to be growing quite quickly these days.

I just used the standard backup/recover method.
I took a snapshot before shutting down and downloaded it to my PC. Then I created the new disk/image of HA, installed Samba, uploaded the snapshot to HA, and recovered from that.

You do not even have to install Samba. During onboarding you can already choose to select and upload a snapshot from your computer. See this post.
I remember it didn’t work flawlessly in my case, but in the end it worked. (I can’t fully remember what went wrong or how I solved it. I think the progress window didn’t show, but I just waited.)

@Wilber @plevuus Thanks for the tip!

@Plevuus, although I am going to try the card route, I am still not convinced that the problem we are facing is necessarily a card issue. I have seen more people with instability problems that were only temporarily solved by using a new card. Having said that, the card might be part of the problem.

@Harmpert, you can use the free DiskInternals Linux-Reader to read the contents of your Home Assistant SD card on a Windows PC.
C’T magazine has a nice SD-card test on their site (in Dutch).

@Wilber
How did your system behave last night? Still no problems?

I installed a new SD card and did not have a crash last night.
On the other hand, my configuration is not entirely back to normal yet…

Still up and running fine this morning!

I can understand how a rebuild would fix a crashing problem, but it’s very strange that several of us had the same issue at around the same time.

@Wilber

Good to hear!
Indeed, you do have a point. It is still a good idea to identify which process runs on Sunday night and triggers a (seemingly) problematic SD card to act up.

Hi, here the same problem.

I started with HA in Nov. 2020, and from the beginning (I think) my RPi4 has crashed every Monday during the night, even though I’m not running any automation at that time.
My HA crashed at 3h04 last night (summer time GMT+1, Brussels). Some previous crashes happened at (winter time) +/- 2h00, 1h30, 2h30.
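
To compare crash times across the daylight-saving change, it can help to convert them to UTC. The sketch below does that for the four times quoted above. Assumptions: “GMT+1” is read as the Brussels standard-time offset, so winter time is UTC+1 (CET) and summer time is UTC+2 (CEST); the winter times are approximate, as stated; and the calendar date is a placeholder, because the actual crash dates were not given.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for Brussels: CET in winter, CEST in summer.
CET = timezone(timedelta(hours=1), "CET")
CEST = timezone(timedelta(hours=2), "CEST")

# Local crash times from the post; the winter ones are approximate.
REPORTED = [("03:04", CEST), ("02:00", CET), ("01:30", CET), ("02:30", CET)]

PLACEHOLDER_DATE = "2021-01-04"  # hypothetical; only the time of day matters here


def to_utc(hhmm, zone):
    """Convert a local HH:MM in the given fixed-offset zone to HH:MM in UTC."""
    local = datetime.strptime(f"{PLACEHOLDER_DATE} {hhmm}", "%Y-%m-%d %H:%M")
    return local.replace(tzinfo=zone).astimezone(timezone.utc).strftime("%H:%M")


for hhmm, zone in REPORTED:
    print(f"{hhmm} {zone.tzname(None)} -> {to_utc(hhmm, zone)} UTC")
```

Under these assumptions all four crashes fall between roughly 00:30 and 01:30 UTC, which would fit a job scheduled in UTC rather than local time, although four data points do not prove that.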

I always see the same behavior:

  • No automation ran this morning.
  • I can still log on from my PC to my Pi. In the overview menu, I can even manually switch some lights.
  • I cannot open any menu other than “overview”. I receive the error message “Unable to load the panel source: /api/hassio/app/entrypoint.js” when trying to open other menus, such as the logging menu, the Node-RED menu, or the supervisor menu.
  • It is not possible to connect to my RPi with Samba.

I always have to power-cycle my RPi.

After the power cycle, I took some screenshots showing that memory use, disk use, and processor temperature are all well within acceptable limits.

Environment: Raspberry PI model4 – 64bit - 4GB Ram – 64GB SSD
HA Software versions: Core 2021.3.3 / Supervisor 2021.03.6 / Host: 5.12 /
Add-ons: File editor: 5.2.0, Samba share V 9.3.1, Terminal&SSH 9.1.0, Zwave JS 0.1.16, Log Viewer 0.9.1, Nodered 8.2.1, DSS VOip Notifier 3.5.6

@hdehaseleer, you write 64GB SSD, but I assume you mean a 64GB SD card, right? What is the brand and model of the SD card, and what is its speed class?
Please have a look at the linked GitHub issue. Multiple people got rid of the issue by upgrading their SD cards (actually, everyone who tried).

@Plevuus: Thanks for your prompt response.

Yes, I’m using an SD card (not an SSD): Kingston Canvas Select Plus / Class 10, UHS-I, U1, A1, 100MB/s.

I didn’t believe that a hardware problem could occur every Monday morning at 3h00 (and only on Monday).
But indeed, perhaps the OS is doing some cleanup on Monday at +/- 3h00 that the SD card cannot keep up with.
I’ll buy a new one and give it a try.

Indeed, changing the SD card solved the problem. This is the second Monday in a row that my RPi didn’t crash, after several months of crashes.

I now have a Samsung Evo Plus MicroSDXC 64GB UHS Class I | Class 10 | U3.
So it is a U3 card that writes at 60MB/s and reads at 100MB/s.

Before, I had a Kingston Canvas Select Plus microSDXC 64GB UHS Class I | Class 10 | U1. So that was a U1 card.

For information: I bought two identical SD cards when I bought my RPi model 4, so I first tried the second, identical Kingston card. But this second SD card had the same “Monday crash” problem.
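
For reference, the UHS speed classes only guarantee a minimum sequential write speed: 10MB/s for U1 and 30MB/s for U3. If you want a rough idea of what a card actually sustains, a timing test like the sketch below can give a ballpark figure. The function name and the default sizes are arbitrary choices of mine, and a simple test like this says nothing about random-write performance, which may matter more for a database-heavy system.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=64, chunk_mb=4):
    """Write about `total_mb` MiB of random data into a temporary file in
    `directory`, force it to disk with fsync, and return the rate in MiB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = max(1, total_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # include the time needed to reach the card
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (chunks * chunk_mb) / elapsed
```

Call it with a directory that lives on the card you want to test, for example `sequential_write_mb_per_s("/path/on/the/card")` (a placeholder path), and run it a few times, since single runs vary.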

I’m using HassOS on an RPi4 2GB with an original RPi4 USB-C adapter. My router is a Fritzbox 7590.
Mine stops every Monday around 4am.

@Antoine_Evertze, do try a new SD card with proper capabilities. Somehow a process that runs on Sunday night causes an issue with some SD cards. As you can see in all the cases listed here and on GitHub, the problem does not occur with a proper SD card. Mine has been running for two months now without any issue.

As I said, I replaced my SD card with a Samsung Evo and this solved the problem. I changed nothing else. It’s already my 4th Monday without a crash.

Somebody on the development team should look at the (housekeeping?) application that runs every Monday morning around 02h. It cannot be hard to correct this problem, for example by limiting the write speed or the CPU utilization of this application. This would reduce the frustration of a lot of people and improve the reputation of HA.

Who can forward this problem to the development team?

Anybody. see

It’s been a while since the initial problem. In my case the problem was solved with the July 2021 update.
But for the last two weeks the problem has been back. I already tried a different SD card (Evo). It didn’t help.
Any suggestions?

Today it happened again.