tl;dr - only a small fraction of users are affected by regressions. The fail-safe update system makes updating Home Assistant OS generally a very safe operation.
Last week we released Home Assistant Operating System 15.0, a larger release that was several weeks in the making. It brought major updates to some of the building blocks our OS stands on, especially the Linux kernel and the U-Boot bootloader, and these inevitably brought a couple of regressions that affected some of our users. A few hours after the release, issues with this latest update were posted on various channels in our community, at which point some of you might have started to believe something was seriously wrong with this release. We immediately started to investigate these reports to find out how serious the problems were, and in the end distilled several root causes from them. At this point we have a good overview of the issues, and we can safely say that for most users the 15.0 update is just fine! If you are still worried, here is an overview of the issues in OS 15.0.
What didn’t work?
Let’s take a look at what issues the release had that caused all the commotion. Although the following list may look scary at first, the bugs affected only a subset of the users. While some people started to report problems, the majority couldn’t see anything unusual even when running similar hardware. The reasons for this are explained below, as well as the individual issues we found.
Raspberry Pi 5 stuck on reboot or boots back to the old version
A mixed bag of reports, and probably the most visible one, came from users of Raspberry Pi 5. We had already received some reports of intermittent update failures after previous releases, which eventually went away after several retries. We saw the same here, but with a few more cases. Eventually, some of you posted boot logs, in which we noticed that the problems were caused by missing firmware support for new features added to the Linux kernel. This was a known problem that was later fixed in a development branch of the Raspberry Pi kernel, and an update of the bootloader reportedly resolved all these issues. Unfortunately, that is a bit tricky in practice, as Home Assistant OS doesn’t have the tools to update or even check the version of the bootloader. However, the upcoming version contains backported patches that should cope even with the old bootloader. On top of that, there was one more bug on the Home Assistant OS side with Raspberry Pi 5 reboots, fixed only in 15.0, which may have caused a lockup in the bootloader when you restarted the Pi from software without performing an update.
The known issues have been fixed, but the recommendation to update the bootloader stands, as there were some errors in the bootloader that we can’t influence. We also got in touch with the Raspberry Pi folks to understand if these breakages are something we should be worried about, and although it was rather a bug on their side, we decided that we should take measures on Home Assistant OS as well. More about that later.
AMD-based thin clients (e.g. HP t620) stuck after update
These were also among the first reports to come in, which shows how popular these repurposed devices from corporate offices are. Immediately after the update, the device became unavailable, requiring a few reboots or manual boot slot selection to get back to the working state.
The cause of this bug was a change in the Linux amdgpu driver. With the old Linux version this had no effect, because support for the integrated GPU in these AMD SoCs wasn’t enabled. Changes introduced in the new kernel, however, made the driver fail badly if the options enabling that support were missing. Once we got logs from some of you, the root cause was quickly identified and reported as a bug to the driver maintainers. The latest builds now have the options for supporting this hardware enabled, and future Linux versions should fail gracefully if the amdgpu driver is compiled without those (currently “experimental”) options.
Generic x86-64 with certain SATA disks boots back to the previous version
There was another regression coming from upstream Linux kernel changes. The common denominator is an x86 machine running from a SATA drive (note that some M.2 drives are also using the SATA interface) failing to come back after the upgrade or falling back to the old version after being restarted.
Since version 6.11, Linux uses a more efficient power-saving policy for SATA devices. However, not all devices that report support for it work well when it’s in effect. The fix is to add libata.force=nolpm to the kernel command line options, which can be done either temporarily through the GRUB menu or permanently by adding it to the cmdline.txt file in the boot partition. For a proper fix, information about the exact hardware needs to be gathered, so the workaround can be hardcoded in the Linux kernel. However, so far these reports were mostly related to drives from no-name vendors that are usually included out of the box in cheap NUC-like mini PCs. When using hardware from reputable manufacturers there’s a much smaller chance of breakage.
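As a rough illustration of the cmdline.txt edit described above, here is a minimal Python sketch that appends libata.force=nolpm to a kernel command line only if it is not already present. It assumes cmdline.txt holds all parameters on a single space-separated line. The function name add_kernel_param and the sample parameters are made up for this example, and where the boot partition is mounted on your machine is left out on purpose.

```python
def add_kernel_param(cmdline: str, param: str = "libata.force=nolpm") -> str:
    """Return the kernel command line with `param` appended exactly once."""
    # Assumption: all parameters sit on one line, separated by whitespace.
    params = cmdline.split()
    if param not in params:
        params.append(param)
    return " ".join(params)


if __name__ == "__main__":
    # Hypothetical file content; the real cmdline.txt on your system will differ.
    original = "console=tty1 rootwait\n"
    print(add_kernel_param(original))
```

Running the function a second time on its own output changes nothing, so it is safe to apply repeatedly. Keep a copy of the original file before writing any modified content back.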
ODROID-N2 fails to boot (from certain eMMC modules)
This one was not as common but maybe the most worrisome. As we updated to the latest U-Boot bootloader version, some of the patches dropped out of the list of patches that should be applied. One of these is a workaround for eMMC initialization errors that is needed to make eMMC reading in U-Boot stable on some eMMC modules, including those that were shipped with the original Home Assistant Blue. Once we identified the mistake, we pulled back the release for this board so it won’t be offered to more users. The issue only appeared intermittently, so after several reboots the devices usually came back. Downgrading to the previous version was then required, as swapping the boot slots doesn’t replace the bootloader in this case due to technical limitations.
Should you skip 15.0?
Unless you are using one of the above-mentioned hardware combinations, the answer is no. At the time of writing, about one third of all Home Assistant OS installs are already running 15.0. With every update, there’s a chance that something will go wrong in the jungle of different configurations, and even a dozen people out of hundreds of thousands can give you a feeling that the issue is serious. However, as you could see, almost all of the bugs reported in OS 15 affected only some specific devices and configurations, meaning the majority of users didn’t experience anything unusual.
Don’t you test your releases?
Yes, we got this question a couple of times (not only) after this release. The short answer is: yes, we do. The longer answer is a bit more complicated. We obviously do test all the changes before they are committed to the development branch. These changes then land in the daily builds that you get if you have configured your update channel to the development one. Not many users have it enabled, which is a good thing, but it’s still being regularly tested on most of the boards and configurations that Home Assistant OS is built for. However, with the modularity of various platforms, there can be edge cases that we can’t test. For example, the popular Raspberry Pis can be running from a plethora of SD cards, USB drives or NVMe disks. The situation is even worse with the generic x86-64 target - people are using anything from 15-year-old PCs to setups with the latest top-of-the-line processors. This is something that we can’t test, and we can only rely on the Linux kernel working properly on these. Even then, some more exotic configurations need some tweaks first before our OS can run on them. On the other hand, simple platforms with lower modularity can be tested more thoroughly, so if you want a plug-and-play experience, there’s a strong argument for using well-constrained platforms, like the Home Assistant Green.
Moreover, approximately 1% of users usually update to beta (release candidate) before a stable release is made. Even though this doesn’t seem like a lot, there are currently over 3000 users who are helping all of us to make sure the stable update won’t cause any trouble. Ideally we would love to motivate more people to participate in beta testing, as we can focus better on fixing edge cases at that stage than when hundreds of thousands of people update and start running into sometimes unrelated issues.
How can you help?
First of all, if an OS update doesn’t go as planned, don’t panic. Most of the time you will be able to get exactly the same system as before the update thanks to the A/B boot mechanism. If an issue occurs, check our issue tracker first to see if it has already been reported. If it has, be patient and try to be helpful with gathering more details about the bug. Usually, posting “me too” to the issue isn’t helpful. It creates unnecessary noise for the people tracking the issue and doesn’t provide any new information. Also keep in mind that most of the bugs (with a few rare exceptions) are specific to different builds - if a Raspberry Pi fails to boot, it’s pointless to report problems with your Proxmox VM to the same issue. The bug report template asks you to pull out some information from your installation that is necessary for better identification of the bug.
Also, the general recommendation of making regular backups applies. The new backup system makes it much easier to create off-site backups using some of the backup integrations.
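To give a feel for why an A/B boot mechanism lets you fall back to the previous version, here is a deliberately simplified Python model. It is not the actual Home Assistant OS or bootloader implementation: the Slot class, the function names, and the limit of three boot attempts are all assumptions made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    attempts_left: int  # boot attempts remaining before the slot is skipped


def choose_slot(slots):
    """Pick the first slot, in priority order, that still has attempts left."""
    for slot in slots:
        if slot.attempts_left > 0:
            # The attempt is consumed now and restored only after a good boot.
            slot.attempts_left -= 1
            return slot
    return None


def mark_good(slot, attempts=3):
    """Called once the system is up: restore the slot's attempt counter."""
    slot.attempts_left = attempts


def simulate(new_slot_boots: bool):
    """Return the name of the slot the toy system ends up running from."""
    # The freshly updated slot B is tried first; slot A holds the old version.
    slots = [Slot("B (new)", 3), Slot("A (old)", 3)]
    while True:
        slot = choose_slot(slots)
        if slot is None:
            return None  # no bootable slot left
        # In this toy model the old slot always boots.
        boots = new_slot_boots if slot.name.startswith("B") else True
        if boots:
            mark_good(slot)
            return slot.name


if __name__ == "__main__":
    print(simulate(new_slot_boots=True))
    print(simulate(new_slot_boots=False))
```

In this model a healthy update keeps running from the new slot, while an update that never boots uses up its attempts and the system returns to the old slot, which matches the "boots back to the old version" behavior some of the reports describe.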
What’s next?
We are releasing 15.1.rc1 today, which addresses all the major issues that you reported for 15.0. You can help test this pre-release without going all-in and joining the beta channel by updating to it directly using ha os update --version 15.1.rc1. Create a GitHub issue if anything still doesn’t work for you.
For future releases, we’ll focus on making troubleshooting easier, as we realize that Linux knowledge shouldn’t be a must if you want to run Home Assistant OS. Raspberry Pi models that use the EEPROM bootloader will get native Home Assistant OS tooling for updating it, so you won’t need to fiddle with spare SD cards or boot Raspberry Pi OS to get the latest updates anymore. And of course, we’ll be investing more effort into automated testing to minimize breakages, but as already said, in many cases your patience and cooperation are very important too.