Home Assistant OS 10 update has broken my Pi 4b 4GB

Did you go back to 9.5?
I tried to update to 10.1 today and it didn’t work with a RPi4 + SSD

Yesterday I had some “fun” myself as well, upgrading my pi4-4gb+sd+ssd from 9.5 to 10.1. After the normal update failed I didn’t give up easy. I tried updating the eeprom, ditching the sdcard, and booting a fresh install of 10.1 from ssd. That worked fine from scratch, but as soon as I restore my snapshot/backup it has the same issue many others have described. It works for a second, then goes away, then randomly can be pinged but not accessed… until hard reset, then rinse and repeat.

I have been through assorted update/upgrade issues over the years with HA. I fell back to 9.5 and will wait and see if this gets fixed later on… maybe 10.2 or whatever. At least it’s not like when ESPHome OTA updates all fail and you have to physically get at every device and connect USB for recovery. The problem I am having now, though, is that I select “Skip” for 10.1 and the update notification keeps popping back up. It’s like it really, really wants me to brick it again, lol.

You may be able to pull the drive, hook it up to another box running Linux, and see if the backups can be recovered. It’s best to habitually download those backups and put them somewhere safe for this exact situation. They can take forever to complete, and I am sometimes asleep before they’re ready to download. Making a habit of checking the backups page whenever I’m logged in helps me not forget to download them.
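
If you do end up pulling the drive, here is a rough sketch of copying the backup archives off it. It assumes the HAOS data partition is already mounted on the Linux box and that the backups sit as .tar files under supervisor/backup on that partition. Both paths in the usage comment are made up, so adjust them to your setup.

```shell
#!/bin/sh
# Copy every backup archive from a mounted HAOS data partition to a safe place.
# $1: mount point of the data partition (hypothetical, e.g. /mnt/hassos-data)
# $2: destination directory (hypothetical, e.g. "$HOME/ha-backups")
copy_backups() {
    src="$1/supervisor/backup"   # assumed backup location on the data partition
    dest="$2"
    if [ ! -d "$src" ]; then
        echo "no backup directory at $src" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    count=0
    for f in "$src"/*.tar; do
        [ -e "$f" ] || continue  # the glob matched nothing
        cp -p "$f" "$dest"/ || return 1
        count=$((count + 1))
    done
    echo "copied $count backup(s) to $dest"
}

# Usage (hypothetical paths):
#   copy_backups /mnt/hassos-data "$HOME/ha-backups"
```

The function only reads from the source drive, so it should be safe to run against a disk you are not sure about.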

Hi there,

I just want to add to this. I have a:
Pi 4b 8GB
USB Boot to ORICO SSD Portable External 128GB Mini M.2 NVME

I updated from HA OS 9.5 to 10.0 the day it was released, and it has been a nightmare since. I read that some people with a similar NVMe SSD Pi 4 hardware configuration were not even able to boot after they updated.

Luckily mine did boot; it just kept crashing every 5 hours or so. I connected a display over HDMI and saw that it was the SQUASHFS becoming read-only, plus journald errors.



I have since changed the power supply from a 20 W, 4 A unit to a MacBook USB-C charger and updated to HA OS 10.1, which brought some stability improvement. But it still crashed, now about every other day.

Today I rolled back to HA OS 9.5:
ha os update --version 9.5
ha core update --version=2023.1.7

and it’s currently migrating my DB back:

Database is about to upgrade from schema version: 41 to: 30

so it’s still very busy. It remains to be seen whether I get back my old, regular uptime of a month or more without crashes. I really hope so.

This is not OKAY!

I suspect it has something to do with the following ‘features’, from release notes:

  • zswap instead of swap in zram is used. This should allow to use Home Assistant OS on systems with lower amounts of RAM with the trade-off of slightly higher storage wear.

For reference, these are the release notes of HA OS 10.1, so something was really not okay with the 10.0 Release, and it’s quite annoying



Hi there,
I have a similar problem on my Odroid XU4 booting from eMMC.
After the update to HAOS 10 it no longer boots and I get no connection to display via HDMI…
Currently trying to recover my 9.5 backup.

FWIW, those who had issues with booting from a usb attached ssd (not nvme) after upgrading from 9.5 to 10.0 or 10.1…

It may be helpful to post some info about your adapter and drive here:

The devs are working on a potential fix, and it seems some info from users who had a failure may help.


Hi,

I just wanted to update on this issue. I have had no crashes since I downgraded to HA OS 9.5. The current uptime counts from 9 May 2023 at 18:17, which was a normal host reboot. I also updated to core 2023.5.2 again, and that does not seem to cause any problems.

I tried again recently to update. Some things I discovered:

Add-ons that utilize CIFS are what is making my system continuously reboot the host, like a boot loop. Examples are Radarr, Sonarr, Transmission, Immich, Nextcloud, etc.

From what I can tell, it seems like smbv1/CIFS is deprecated entirely and causes the system to crash when trying to mount.

If you have add-ons using SMBv1/CIFS and you can plug directly into the host, use the CLI, even if you only get 20-30 seconds before the system crashes and reboots. Try to disable the add-ons utilizing CIFS. The system should then boot, or at least not crash, for long enough to bring up the main HA container and the web UI, at which point you can turn off start on boot for those add-ons.
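
Since you may only have a few seconds at the console, it can help to stop everything in one go. Here is a minimal sketch, assuming the `ha addons stop <slug>` command of the `ha` CLI. The slugs in the usage comment are placeholders, not real ones, so look up the actual slugs on your own system first.

```shell
#!/bin/sh
# HA_CMD can be overridden for a dry run; by default it is the `ha` CLI.
HA_CMD="${HA_CMD:-ha}"

# Stop every add-on whose slug is passed as an argument.
# Keeps going after a failure so one bad slug does not block the rest.
stop_addons() {
    failed=0
    for slug in "$@"; do
        if "$HA_CMD" addons stop "$slug"; then
            echo "stopped $slug"
        else
            echo "could not stop $slug" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage (placeholder slugs):
#   stop_addons slug_of_radarr slug_of_sonarr
```

This only stops the add-ons for the current boot. Turning off start on boot is still done from the web UI, as described above.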

I need to submit a bug in the alexbelgium repo to update the SMB/CIFS method used in the containers.

Since I’ve disabled add ons that utilize CIFS, HAOS 10 and 10.1 run fine.


My add-ons actually try all SMB versions starting from 3 and only use SMBv1 if it is the protocol used by the user. I could disable SMBv1 completely, but some routers such as the FRITZ!Box only use SMBv1. So it seems to be more on the user side here… On my own system I use the CIFS method and have never had issues. I use the Samba NAS add-on, which publishes all disks over SMB.

Hi, actually I’ve adapted my code: it used to test SMB 3 first, then switch to SMB 1. I’ve now implemented that it first tests SMB 3 (any version), then 3.0 specifically, 2.1, 2.0, and 1.0 only if nothing else works, with a big warning saying this is not recommended. It will be rolled out to the different add-ons according to their next update schedules. Let me know if you want me to push updates earlier for some specific add-ons.
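
For anyone curious, that fallback order can be sketched roughly like this. It is not the actual add-on code, just an illustration that assumes the standard `vers=` option of `mount -t cifs`. The function name, the `MOUNT_CMD` override and the share in the usage comment are all made up.

```shell
#!/bin/sh
# MOUNT_CMD can be overridden for a dry run; by default it is `mount`.
MOUNT_CMD="${MOUNT_CMD:-mount}"

# Try each SMB dialect in the order described above and stop at the first
# one that mounts. Prints the version that worked on stdout.
# $1: share (e.g. //server/share)  $2: mount point  $3: extra options (optional)
mount_with_fallback() {
    share="$1"
    target="$2"
    extra_opts="$3"
    for vers in 3 3.0 2.1 2.0 1.0; do
        opts="vers=$vers"
        if [ -n "$extra_opts" ]; then
            opts="$opts,$extra_opts"
        fi
        if "$MOUNT_CMD" -t cifs -o "$opts" "$share" "$target" 2>/dev/null; then
            if [ "$vers" = "1.0" ]; then
                echo "WARNING: mounted with SMBv1, which is not recommended" >&2
            fi
            echo "$vers"
            return 0
        fi
    done
    echo "ERROR: no SMB version worked for $share" >&2
    return 1
}

# Usage (hypothetical share and credentials):
#   mount_with_fallback //nas/media /mnt/media "username=guest"
```

Putting 1.0 last means a server that speaks anything newer never gets mounted with SMBv1, while SMBv1-only devices still work, with the warning.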


Hi,
I have this same issue. Upgrading to any OS 10 version caused the disk to go read-only after a few hours; reverting to 9.5 corrects the issue.
I am running straight HASSIO OS 9.5 on an RPi 4 8GB booting from a 128GB USB drive (no SD card).

Are you putting any notes on each add on about the change? I need it for radarr, sonarr, and nzbget.

This is implemented as a template, so if the add-on was updated since that date (which it was for all of those), then it has the new SMB code.

I have removed the only CIFS add-on I use, i.e. Samba, but I still have the same issue, though it now only occurs after a few days.

To be honest, USB is an unreliable bus for a storage medium to begin with. If you think nothing can go wrong when doing stuff on storage connected via USB in the middle of an update, then you’re naive.

Who knows, maybe in the middle of the update the USB controller decides to crap out, which is nothing new. After a reset it can get back to working again, but something definitely gets corrupted between the moment the USB controller is reset and the moment it has recovered.

My problem seems to be fixed by blacklisting my USB drive controller.
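
For anyone wondering what that can look like: on a Raspberry Pi this is often done with the kernel’s usb-storage.quirks parameter, which makes the kernel skip the UAS driver for one specific adapter. That is an assumption about what is meant here, not a confirmed detail. The ID 152d:0578 below is only a placeholder; take the real vendor:product ID of your adapter from lsusb. A small sketch that builds the parameter:

```shell
#!/bin/sh
# Build the kernel parameter that tells usb-storage to ignore UAS (flag "u")
# for one adapter. Argument: the vendor:product ID as printed by lsusb.
uas_quirk_param() {
    id="$1"
    # Accept only the 4-hex-digit:4-hex-digit form that lsusb prints.
    case "$id" in
        [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]:[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) ;;
        *)
            echo "expected an ID like 152d:0578" >&2
            return 1
            ;;
    esac
    echo "usb-storage.quirks=${id}:u"
}

# Placeholder ID, replace it with your adapter's.
uas_quirk_param "152d:0578"
```

As far as I know, the resulting string is appended to the single line in cmdline.txt on the boot partition, separated from the other parameters by a space, and takes effect after a reboot.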