The current approach of HAOS works, but it is quite lacking in resiliency. Regardless of the install style (on a RasPi/SBC, on an x86 board, in a VM), the user needs to go through multiple manual steps, and overall it’s cumbersome to even consider a reinstall.
This could be simplified to a great extent by following in the footsteps of OpenWrt and Android. Both operating systems separate an immutable base OS from the user data.
Specifically, I’d like to advocate for adding two distinct features from these two systems to the way HAOS is structured:
- OverlayFS/UnionFS (or any similar overlay/union/layer file system implementation) separating the OS file system and the overlaid user data, akin to how OpenWrt works
- Multi-image handling for the OS file system, similar to Android’s “dynamic partition layout”, where the OS file system is mounted from an image stored on a separate partition
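As a rough illustration of the first point, the sketch below builds the two mount commands that would layer a writable data directory over a read-only SquashFS image using OverlayFS. The paths and the function name are made up for the example and are not taken from HAOS; the commands are only constructed here, not executed.

```python
from typing import List


def overlay_mount_commands(image: str, data_dir: str, target: str) -> List[List[str]]:
    """Return mount commands for a read-only OS image with a writable overlay."""
    lower = "/run/haos/lower"    # hypothetical mount point for the OS image
    upper = f"{data_dir}/upper"  # user changes end up in this directory
    work = f"{data_dir}/work"    # OverlayFS scratch dir, same file system as upper
    overlay_opts = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return [
        # 1. Loop-mount the immutable OS image read-only.
        ["mount", "-t", "squashfs", "-o", "loop,ro", image, lower],
        # 2. Stack the writable user data on top of it.
        ["mount", "-t", "overlay", "overlay", "-o", overlay_opts, target],
    ]


if __name__ == "__main__":
    for cmd in overlay_mount_commands("/images/haos.squashfs", "/mnt/data", "/sysroot"):
        print(" ".join(cmd))
```

With this layout, wiping the upper directory would amount to a factory reset, while the OS image itself stays untouched.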
To implement this, I have thought out a few key points:
- The system image downloaded for HAOS would contain a single partition with one OS image (e.g. an EXT4 or SquashFS blob), instead of the current root partition
- During boot, the kernel would mount this partition and look for a configuration file. If the config file is not found, the initial setup wizard is launched by mounting the latest available OS image (putting the semantic version in the file name makes it easy to pick). If the config file is found, it is used to locate both the OS image and the overlay partition, mount them, and boot into a working system
- The install wizard would begin with selecting the overlay partition - a basic partition manager would need to be written for this purpose to allow the user to format a disk right in the wizard. If the selected partition contains an existing HA install, the rest of the process could be skipped and a new config file is generated (kinda like how a backup would work)
- If an empty partition is selected, the install wizard continues as it would right now, writing the partition information into the config file
- System updates become a bit simpler, as all the Supervisor needs to do is download the new rootFS, put it on the aforementioned holder partition, and mark it in the config as the new image to use
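To make the "latest available OS image" step more concrete, here is a minimal sketch of picking the newest image by the version embedded in its file name. It assumes a naming pattern like `haos-x86_64_7.1.0.0.squashfs`; the regular expression and function names are my own assumptions, and versions are compared as tuples of integers rather than with a full semantic-versioning parser.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed naming scheme: haos-<arch>_<dotted version>.squashfs
IMAGE_NAME = re.compile(r"^haos-(?P<arch>.+)_(?P<version>\d+(?:\.\d+)*)\.squashfs$")


def parse_version(filename: str) -> Optional[Tuple[int, ...]]:
    """Extract the dotted version as a tuple of ints, or None if the name does not match."""
    match = IMAGE_NAME.match(filename)
    if match is None:
        return None
    return tuple(int(part) for part in match.group("version").split("."))


def latest_image(filenames: Iterable[str]) -> Optional[str]:
    """Return the file name with the highest version, ignoring unrelated files."""
    candidates = []
    for name in filenames:
        version = parse_version(name)
        if version is not None:
            candidates.append((version, name))
    if not candidates:
        return None
    return max(candidates)[1]


if __name__ == "__main__":
    print(latest_image([
        "haos-x86_64_7.1.0.0.squashfs",
        "haos-x86_64_7.1.1.0.squashfs",
        "boot.log",
    ]))
```

Comparing integer tuples avoids the classic string-sort trap where `7.10` would sort before `7.2`.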
The format of this config file could simply be the standard INI file format:
[haos]
data_partition = {GUID_OF_PARTITION},fs=btrfs,otherparams=1
system_image = haos-x86_64_7.1.0.0.squashfs
update_image = haos-x86_64_7.1.1.0.squashfs
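A file like this could be read with very little code. The sketch below uses Python's standard `configparser` purely as an illustration; the `BootConfig` class and the way `data_partition` is split into a partition ID plus `key=value` options are my assumptions about how the format might be interpreted.

```python
import configparser
from dataclasses import dataclass
from typing import Dict, Optional

SAMPLE = """\
[haos]
data_partition = {GUID_OF_PARTITION},fs=btrfs,otherparams=1
system_image = haos-x86_64_7.1.0.0.squashfs
update_image = haos-x86_64_7.1.1.0.squashfs
"""


@dataclass
class BootConfig:
    data_partition: str             # partition identifier (first comma-separated field)
    mount_options: Dict[str, str]   # remaining key=value fields
    system_image: str
    update_image: Optional[str] = None


def parse_boot_config(text: str) -> BootConfig:
    # Interpolation is disabled so that '%' in values is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["haos"]

    partition, *raw_options = [p.strip() for p in section["data_partition"].split(",")]
    options: Dict[str, str] = {}
    for item in raw_options:
        key, _, value = item.partition("=")
        options[key] = value

    return BootConfig(
        data_partition=partition,
        mount_options=options,
        system_image=section["system_image"],
        update_image=section.get("update_image"),  # None when no update is pending
    )


if __name__ == "__main__":
    print(parse_boot_config(SAMPLE))
```

In a real boot path this parsing would more likely happen in an initramfs script than in Python, but the structure of the data would be the same.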
In this scenario, the `update_image` variable, if present, would be used on boot. If HAOS boots successfully and manages to initialise the system, the `update_image` variable is removed, and the value of `system_image` is replaced with the `7.1.1.0` image version. If HAOS fails to start up for any reason, the boot log is saved on this partition, the `update_image` variable is removed, and the system is rebooted into the previously working state.
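The try-then-commit-or-roll-back behaviour described here can be sketched as two small functions operating on the config values. The function names are hypothetical, and saving the boot log and triggering the reboot are left out.

```python
from typing import Dict


def image_to_boot(config: Dict[str, str]) -> str:
    """A pending update takes priority; otherwise boot the known-good image."""
    return config.get("update_image") or config["system_image"]


def finish_boot(config: Dict[str, str], booted_ok: bool) -> Dict[str, str]:
    """Return the config to write back once the boot attempt has been judged."""
    updated = dict(config)
    pending = updated.pop("update_image", None)  # removed in both outcomes
    if pending and booted_ok:
        updated["system_image"] = pending        # promote the update
    return updated                               # on failure, system_image is unchanged


if __name__ == "__main__":
    config = {
        "system_image": "haos-x86_64_7.1.0.0.squashfs",
        "update_image": "haos-x86_64_7.1.1.0.squashfs",
    }
    print(image_to_boot(config))
    print(finish_boot(config, booted_ok=True))
    print(finish_boot(config, booted_ok=False))
```

Because `update_image` is dropped in both outcomes, a broken update is tried at most once before the system falls back to the last working image.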
This configuration could be further extended to e.g. include the default IP address to use, and other default configuration options that would be used either on first launch, or in case the data partition cannot be accessed. In fact I’d support writing the bare minimum (i.e. no integrations, just the answers for the initial wizard) configuration into this file as well, to support a sort of “safe mode” boot option for when the data partition cannot be accessed for any reason.