[HAOS] Add support for overlay/union FS

The current approach of HAOS, while it works, is quite lacking in resiliency. For all intents and purposes, the user needs to do multiple manual steps, regardless of their install style (on a RasPi/SBC, on an x86 board, in a VM), and overall it’s cumbersome to even consider a reinstall.

This could be simplified to a great extent by following in the footsteps of OpenWrt and Android. Both operating systems separate an immutable base OS from the user data.

Specifically, I’d like to advocate for two distinct features of these two to be added to how HAOS is structured:

  • OverlayFS/UnionFS (or any similar overlay/union/layer file system implementation) separating the OS file system and the overlaid user data, akin to how OpenWrt works
  • Multi-image handling for the OS file system, similar to Android’s “dynamic partition layout”, where the OS file system is mounted from an image stored on a separate partition
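To make the first point concrete, here is a minimal sketch of how a read-only OS tree and a writable data layer could be combined with the Linux kernel's OverlayFS. The `overlay_mount_command` helper and all four paths are hypothetical, picked only for illustration; nothing here reflects how HAOS actually lays out its mounts.

```python
import shlex


def overlay_mount_command(lower, upper, work, target):
    """Build a mount(8) invocation for a Linux OverlayFS mount.

    lower  - read-only OS tree (e.g. a mounted SquashFS image)
    upper  - writable directory on the data partition holding user changes
    work   - empty scratch directory on the same file system as `upper`
    target - where the merged view appears
    All paths are hypothetical and only illustrate the idea.
    """
    options = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, target]


cmd = overlay_mount_command(
    "/mnt/os", "/mnt/data/upper", "/mnt/data/work", "/mnt/root"
)
# Only print the command; actually running it would need root privileges.
print(shlex.join(cmd))
```

Anything written under the merged mount point lands in the upper directory, so the OS image itself stays untouched and can be swapped out independently of the user data.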

To implement this, I have thought out a few key points:

  • The system image downloaded for HAOS would contain a single partition with one OS image (e.g. an EXT4 or SquashFS blob), instead of the current root partition
  • During boot, the kernel would mount this partition, and look for a configuration file. If this config file is not found, the initial setup wizard is launched by mounting the latest available OS image (semantic versioning in file name solves this). If the config file is found, it is used to find both the OS image as well as the overlay partition, mount them, and boot into a working system
  • The install wizard would begin with selecting the overlay partition - a basic partition manager would need to be written for this purpose to allow the user to format a disk right in the wizard. If the selected partition contains an existing HA install, the rest of the process could be skipped and a new config file is generated (kinda like how a backup would work)
  • If an empty partition is selected, the install wizard continues as it would right now, writing the partition information onto the config file
  • System updates - these become a bit simpler, as all the Supervisor needs to do is download the new rootFS, put it on the aforementioned holder partition, and mark it in the config as the new image to use
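The "latest available OS image" lookup from the boot step above could be sketched roughly like this. It assumes image names follow the `haos-<arch>_<version>.squashfs` pattern used in the config example further down; the regular expression and the function names are my own assumptions, not an existing HAOS convention.

```python
import re

# Assumed naming scheme: haos-<arch>_<dotted numeric version>.squashfs
IMAGE_RE = re.compile(
    r"^haos-(?P<arch>.+)_(?P<version>\d+(?:\.\d+)*)\.squashfs$"
)


def parse_version(name):
    """Return the version as a tuple of ints, or None if the name does not match."""
    match = IMAGE_RE.match(name)
    if match is None:
        return None
    return tuple(int(part) for part in match.group("version").split("."))


def latest_image(names):
    """Pick the file name with the highest version, ignoring unrelated files."""
    candidates = [(parse_version(name), name) for name in names]
    candidates = [(version, name) for version, name in candidates if version is not None]
    if not candidates:
        return None
    return max(candidates)[1]
```

Comparing versions as integer tuples rather than strings matters here: as plain text, `7.10.0.0` would sort before `7.2.0.0`.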

The format of this config file could simply be that of a standard ini file format:

[haos]
data_partition = {GUID_OF_PARTITION},fs=btrfs,otherparams=1
system_image = haos-x86_64_7.1.0.0.squashfs
update_image = haos-x86_64_7.1.1.0.squashfs

In this scenario, the update_image variable, if present, would be used on boot. If HAOS boots successfully and manages to initialise the system, the update_image variable is removed, and system_image's value is replaced with the 7.1.1.0 image version. If HAOS fails to start up for any reason, that boot log is saved on this partition, the update_image variable is removed, and the system is rebooted into the previously working state.
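The selection and rollback rules described above could be sketched with Python's standard `configparser` like this. The function names are hypothetical, the format of `data_partition` (an identifier followed by comma-separated `key=value` pairs) is only my reading of the example, and saving the boot log and rebooting are left out.

```python
import configparser

SAMPLE = """\
[haos]
data_partition = {GUID_OF_PARTITION},fs=btrfs,otherparams=1
system_image = haos-x86_64_7.1.0.0.squashfs
update_image = haos-x86_64_7.1.1.0.squashfs
"""


def load(text):
    """Parse the ini text; interpolation is disabled so values are taken literally."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def parse_data_partition(value):
    """Split 'ID,key=value,...' into the partition id and a dict of mount parameters."""
    partition_id, *params = value.split(",")
    return partition_id, dict(param.split("=", 1) for param in params)


def image_to_boot(parser):
    """A pending update_image takes priority over the known-good system_image."""
    section = parser["haos"]
    return section.get("update_image") or section["system_image"]


def finish_boot(parser, success):
    """Commit the pending update on success, or simply drop it on failure."""
    if not parser.has_option("haos", "update_image"):
        return
    pending = parser.get("haos", "update_image")
    parser.remove_option("haos", "update_image")
    if success:
        parser.set("haos", "system_image", pending)
```

Either way the update_image entry is gone after the first attempt, so a broken image can never cause a boot loop: the next boot always falls back to whatever system_image says.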

This configuration could be further extended to e.g. include the default IP address to use, and other default configuration options that would be used either on first launch, or in case the data partition cannot be accessed. In fact I’d support writing the bare minimum (i.e. no integrations, just the answers for the initial wizard) configuration into this file as well, to support a sort of “safe mode” boot option for when the data partition cannot be accessed for any reason.

Before even considering the merits of the proposal, there is a major push-back I see: you need a bespoke kernel per device, as OpenWrt and Android actually have.

Supporting dozens of kernels is probably not in the scope of Home Assistant.
Actually, it’s my opinion that HAOS is not in the scope of HA either. I understand they want to ease entry for non-techies, but that’s a “side job” that probably takes too much of their limited dev resources.

Uhm, actually, apart from architecture-specific builds (so, in this case, an x86, an x86_64, an armhf and an arm64), you don’t need a “specific kernel per device”, since we’re not talking about embedded devices on the level of a smartphone or router, with limited space. The most limited unit here would be the Raspberry Pi (or similar ARM-based SBCs), but even those will use more or less generic kernels.

Besides, the kernel isn’t even the important part here! As long as it has the appropriate drivers for HAOS (so, device drivers for e.g. ZigBee/Z-Wave/WiFi/BLE dongles, etc.; and more specifically, file system drivers for the union/overlay/merge FS, plus the FS driver for w/e file system one would use for the data partition), things should work just fine.

And I would argue that HAOS is not just in the scope of HA, but HA actively depends on it. Sure, you can run HA in a Docker container, but that won’t allow you to use a lot of integrations, which, incidentally, also require Docker. Sure, one could build an approach where the main HA container gets the Docker socket and manages other containers akin to e.g. Portainer… But the current Supervisor approach is built with the “full system control in my hands” approach, and a rewrite would be quite costly with very little benefit, if any at all. So HAOS (and the Supervised install option) is needed for a fully fledged install, especially with the recent official hardware support releases…

Point is, if there is no existing distro including all of these in its kernel, you need to build your own, hence bespoke. There is a reason why they focus on Debian 11 as their base: they know exactly what is in the kernel.

???
You’re confused, here. HAOS / Supervised only brings Addons, which are pieces of generally available software (e.g. mosquitto, mariadb, nginx) that are bundled by HA (once again, shouldn’t be their focus) for the ease of less tech-savvy users.

No integration depends specifically on any addon (that I can think of), and every single piece of software bundled as an addon can be installed in any number of other ways.

HA is a single docker image or pip install of core/frontend. All the rest is “ecosystem”, or stuff made for people not able to manage a Linux system.

Debian has already been ditched for HAOS in favour of Alpine… And Alpine is specifically made to be easily recompiled to fit any purpose, and be (mostly) kernel independent, since it’s meant as a container distro (i.e. something you’d use as a base of a Docker container instead of a fully fledged Ubuntu or Debian).

Okay, small correction - not using HAOS/Supervised will not let you use a number of integrations easily. HA is still meant to be something for the end user, not just tinkerers. So if something depends on running software in a separate container (instead of letting the user set up said software on a separate host as they wish, and connecting to it via an integration), that needs to be, in fact, part of the HA install. Which you can’t do without Add-ons.

But you’re still arguing that focus is unjustly on the ecosystem, not the core. Which in my opinion is stupid, simply because the core of HA is nothing more than an event bus combined with actionables (everything within the core of HA either provides events to the bus, or reacts to events from the bus). The ecosystem is what makes HA actually usable, and in fact I’d be in full support of ripping out all the current integrations from the core of HA, and hosting them in separate repos similar to add-ons (combined with an easily cached “device discovery” pattern that would allow newly discovered devices to be recommended - e.g. a Nanoleaf integration could have an mDNS “device discovery” pattern stored, and when that pattern is recognised during an event, HA could recommend the user to install and configure that integration, instead of carrying it all within). But we’re veering off-topic here.

My recommendation is still for HAOS, and you’re still complaining that HAOS exists at all. Which, in my opinion, has no place in this thread, and if you’re actually annoyed by the existence of HAOS, you should open your own thread and discuss it there.

You’re super confused :wink:
HAOS is Debian 11, which runs docker. The HA docker image (exactly the same for all kinds of installations) uses Alpine…
In case you don’t know, a docker container uses the host kernel; that’s what differentiates it from a VM, basically.

As I said, nothing in HA requires having an addon running “instead of letting the user set up said software on a separate host as they wish”

Your opinion. Fair enough.
Fact is still you don’t need addons for anything.

Totally agree on that one.
That’s what HACS is, basically, and that needs to be extended to all non-essential integrations.

I see architecture/adr/0019-GPIO.md (home-assistant/architecture on GitHub, at commit 780ea1b180263fce7b08805c019dd580f3db6fe2) as a eureka moment, and I expect to see more like this in the future.

Actually, we’re both wrong - Buildroot is now being used for HAOS, not Debian. But I was still correct about Debian being dropped as a base. And nonetheless, it still allows for incredibly configurable setups for the OS, including support for overlay/union/merge file system mounting.

True, nothing in the core of HA requires it. On the other hand, a usable HA install does require addons. Just like how a Linux install does not need GNU tools, but to make it usable… You kinda do need to rely on them.

Oh, indeed, I stand corrected.
So they already build their own kernel, which makes the proposal viable indeed.

Nope, really not, sorry to insist. If you’re capable of installing your own MQTT broker, mariadb, or whatever external tool you might need, you really, really do not need an addon to do it for you.

Really, with cherry on top.
As the roughly 25% of users using neither HAOS nor Supervised could testify…