Behind the scenes of the "move data disk" feature

I have the following questions, which remain unanswered even after 40 minutes of searching (this is also unanswered so far, so I’m copying those interesting questions here too):

What is it about?

This feature:

What are the open questions?

  1. What data exactly will remain on the SD card (and therefore what will be outsourced) once the move data disk feature has been applied?
    Docker containers (addons), which parts of the OS, what about the Supervisor etc. - as precisely as possible please.
    Assumption: I imagine HA OS data will stay on the SD card, everything else will be moved to the external data disk.
  2. Can this be undone, i.e. can data from an external data disk be moved back in (everything back on the internal SD card)?
    Or is this feature a one-way street?
  3. What happens when I restore a full backup: will I need to have an external data disk attached for a successful restore, or would it be possible to restore everything to the internal SD card only?
    (partly related to question #2)
    (not sure about HAOS Data Disk how to restore)

What’s my current setup (background of these questions)?

I’ve been running HA OS on a Pi 4 with 8 GB of memory on a microSDXC Class 10 A2 SD card (SanDisk Extreme Pro) for over 3 years.

  • Quite stable and reliable
  • Only very few incidents which might possibly be related to storage (like Sudden HA Core restart after 72 days of uptime - looking for the reason, never found the root cause).
  • No issues during/after HA OS updates, unlike many (definitely a lot of) users who run SSD boot (see Issues · home-assistant/operating-system · GitHub)
  • I tried to migrate to SSD boot several (okay, exactly two) times and all those attempts failed, supposedly because of a power issue (an old SSD drawing more power than one Pi 4 USB 3 port can provide). Today I’m kind of lucky I don’t run an SSD boot setup - because of all those update nightmares listed above.
  • Of course performance is a thing. The current bottleneck is definitely I/O, which in turn drives up the CPU load: the CPU has to wait quite a lot for the disk in heavy load situations (backups, reading a lot of LTS/statistics data etc.).
  • Of course I create backups regularly and store them also remotely (a backup on a corrupt storage disk is not worth much)
  • Recently my system gained a significant performance boost - a boost I still can not explain where it comes from - it still is a complete mystery:

What’s my target / motivation?

So I’m hoping to
a) preserve that stability of the “everything on the internal SD card” setup I’m used to for years while at the same time
b) improving performance (and storage capacity)

Of course adding another component basically does not add to target a) - what can fail, can fail. The NVMe itself, the controller of the NVMe-to-USB-C enclosure, the power supply of the enclosure/SSD etc. etc.

Therefore I plan to first connect the external SSD and run some kind of benchmark, covering both performance and reliability (let it read/write data for a few hours and see if there are any hiccups), to estimate whether there are any potential issues before migrating the data.
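A read/write soak test like the one described could be sketched roughly as below. This is only a minimal sketch, not a proper benchmark: the function names `soak_once` and `soak_loop` are made up for illustration, the directory is assumed to be a mount point on the disk under test, and reads may be served from the page cache rather than the disk itself.

```shell
# Sketch: write random data, copy it on the same disk, compare checksums.
# soak_once DIR [SIZE_MB] prints "ok" or "MISMATCH".
soak_once() {
    dir=$1
    size_mb=${2:-64}
    src="$dir/soak-src.bin"
    dst="$dir/soak-copy.bin"
    # Write SIZE_MB of random data to the disk under test.
    dd if=/dev/urandom of="$src" bs=1M count="$size_mb" 2>/dev/null || return 2
    # Copying on the same disk forces a read followed by another write.
    cp "$src" "$dst" || return 2
    sync
    a=$(sha256sum < "$src")
    b=$(sha256sum < "$dst")
    rm -f "$src" "$dst"
    if [ "$a" = "$b" ]; then
        echo ok
    else
        echo MISMATCH
        return 1
    fi
}

# soak_loop DIR SIZE_MB ROUNDS repeats the check and stops at the first failure.
soak_loop() {
    i=1
    while [ "$i" -le "$3" ]; do
        soak_once "$1" "$2" >/dev/null || { echo "failed in round $i"; return 1; }
        i=$((i + 1))
    done
    echo "all $3 rounds passed"
}
```

Something like `soak_loop /mnt/testdisk 512 200` (the mount point is a placeholder) would then run for a while, and any "failed in round" message or errors in the kernel log would be a hint to look closer at the enclosure or power supply.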

I really need answers to those 3 questions before I’ll consider using this feature at all. That’s mainly because the current setup runs very stable and reliable, something many users say is not possible (maybe prejudices from the past, or facts based on poor quality SD cards).

Hi,
These threads might be of use:

From my greybeard sysadmin understanding:

  • What data exactly will remain on the SD card
    Unless you start a fresh HAOS install to a specific disk (which seems to be your existing config…), only the user data partitions will be copied onto the new “data disk”. All HAOS, HASS, and code will remain as before.
    For a worked example see this post with partition data.
  • Can this be undone
    At the basic level, disconnect the “data disk” and reboot - the old data is still there, so without the newer version, it reverts.
    Remember HAOS is an appliance, so to undo properly: back up, wipe, reinstall, restore.
  • What happens once I would restore a full backup
    HAOS seems to work at arm’s length from HASS - HAOS sets up which physical partitions are mounted, and HASS restore restores multiple tar archives to the file system, regardless of the actual mount points.

Have I tested this with references to the HASS admin scripts in GitHub? Nope!

Basically, these are educated guesses from what mount looks like from HASS userland without developer shell access to the host HAOS.

I’d create a test rig, and do the things you are interested in to confirm what you need to know. A RPI3b+, two uSD, and one USB reader would be enough for this.

If this helps, :heart: this post!

Have you tried this? When I did it, the CLI just said waiting for hassos-data

Would be useful if this was the case though, maybe a way of backing up to this old data disk and in case of failure of the new one, revert to it for redundancy.

I ran into the same situation.
I actually don’t remember which keys I pressed (like Ctrl + Alt + F2, Ctrl + F1 and so on) but after some of them I got a normal login prompt. I entered ‘root’ as the login and it let me in without any password.

Then I ran ls /dev/disk/by-label/ and saw a ‘hassos-data-old’ disk there. But HA requires a ‘hassos-data’ disk, so I renamed ‘hassos-data-old’ to ‘hassos-data’ with the following command:

e2label /dev/disk/by-label/hassos-data-old hassos-data

After a reboot the system was running as it did before moving the data disk!
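For anyone wanting to check first whether this relabel applies to their system, the logic can be sketched as a small dry-run helper. This is only a sketch: `suggest_relabel` is a made-up name, it only looks at the label directory and prints the command from this post instead of running it, and the optional directory argument exists purely so it can be tried without touching real disks.

```shell
# Sketch: decide whether the relabel described above applies.
# The label directory defaults to /dev/disk/by-label.
suggest_relabel() {
    label_dir=${1:-/dev/disk/by-label}
    if [ -e "$label_dir/hassos-data" ]; then
        # A data partition with the expected label is already visible.
        echo "hassos-data already present, nothing to do"
        return 0
    fi
    if [ -e "$label_dir/hassos-data-old" ]; then
        # Print rather than run: the actual relabel is left to the operator.
        echo "e2label $label_dir/hassos-data-old hassos-data"
        return 0
    fi
    echo "no hassos-data or hassos-data-old label found"
    return 1
}
```

Calling `suggest_relabel` with no argument on the host console only inspects the labels; nothing is changed until the printed command is run by hand.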

So the move data disk feature did not work for you? Would you provide some details on how you ran into this situation? I think it is for exactly those cases that they leave the data on the internal storage, probably completely untouched apart from changing the label.

But this way you did me/us a real favor: it seems like you answered questions 1 and 2 of

So it’s really “just” the label of the partition that HA OS uses to mount the data disk… quite handy. I would have expected it to use a UUID or something similarly specific.

Anyway, wondering if it’s therefore possible (aiming at answering initial question 3) to copy the data partition from an external disk back to internal storage. Probably with some steps like

  1. make sure the target partition/storage (internal) is the same size as or larger than the source (external)
  2. copy the data over with some dd magic
  3. rename the partition as shown in the previous post
  4. restart HA OS

Does that make sense? Unfortunately I (still) don’t have a spare/testing system to answer this on my own.
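The four steps could be sketched as a dry run like the one below. This is untested against a real HA OS install: `fits` and `plan_copy_back` are made-up names, the device paths are placeholders, and the commands are only printed, never executed. It assumes the copy is done from a separate Linux machine with both partitions unmounted. One detail worth noting: dd copies the filesystem label along with the data, so afterwards source and target would both carry the same label unless the source is relabeled or unplugged.

```shell
# Sketch of steps 1-4 as a dry run: nothing here writes to a disk.

# fits SRC_BYTES DST_BYTES succeeds if the target is at least as large.
fits() {
    [ "$2" -ge "$1" ]
}

# plan_copy_back SRC_DEV DST_DEV SRC_BYTES DST_BYTES
# On a real system the sizes could come from: blockdev --getsize64 DEV
plan_copy_back() {
    if ! fits "$3" "$4"; then
        echo "target $2 is smaller than source $1, aborting"
        return 1
    fi
    # Step 2: raw copy of the whole partition, label included.
    echo "dd if=$1 of=$2 bs=4M conv=fsync"
    # Step 3: make sure only the target carries the label HA OS looks for.
    echo "e2label $1 hassos-data-old"
    echo "e2label $2 hassos-data"
    # Step 4: restart.
    echo "reboot"
}
```

For example, `plan_copy_back /dev/sdX1 /dev/mmcblk0pN "$(blockdev --getsize64 /dev/sdX1)" "$(blockdev --getsize64 /dev/mmcblk0pN)"` would print the plan for review, with both device names being placeholders for the real partitions. If the target is larger than the source, the copied filesystem would still need to be grown to use the extra space.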

Sorry, my bad.

“The situation” in my case was the same as in @Morphy’s last reply.

I successfully moved the data disk to a new virtual disk in my ProxMox environment. Then I realized that I could just increase the size of the “old” data disk, so I actually didn’t need a new one.
I did the same as @Morphy: detached the “new data disk” and started the machine. I got the same message, “waiting for hassos-data”. That’s what I meant.

I asked Google how to undo move data disk and it led me to this page. So after I found the solution I decided to post it here, in case someone else searches for the same solution.

And IMO, yes, HA uses disk labels to resolve some of its disks (I saw at least three disks with labels prefixed “hassos”).

Anyway, wondering if it’s therefore possible (aiming at answering initial question 3) to copy the data partition from an external disk back to internal storage.

I see no reason why your solution wouldn’t work :upside_down_face:

Still could not test this so far. But I saw there were some changes in the Supervisor regarding discovery of external disks in 2024.4 releases aiming at improving things.

Adding to this thread - I’m a recent HA Green user, and it worked fine for a couple of weeks. Yesterday I connected 750GB of storage to it and used “Move data disk”. It showed the correct name of the device… but it never booted back up. I connected the disk to my PC and I do see the ext4 partition called hassos-data, so I guess this part worked. Disconnecting the disk and power cycling the HA Green didn’t start it up properly (it responds to pings though).

I just took the hard drive out of my HA mini PC, put it in my Linux box, renamed the drive, and put it back… works perfectly :wink:

I get “zsh: command not found: e2label” when I run that command in Terminal. Do I have to run it directly on the NUC that is running HAOS?

Giving my two cents here.
I run HAOS on a RPi 4 with a 32GB SD card. Backups go to a NAS, so they don’t take space on the data disk. Still, after 2 years of running and a few dozen devices accumulating data, I was running out of space, so I plugged in an external 500GB drive and moved the data disk. Unfortunately, with the additional disk attached I couldn’t get other USB devices to work (namely a Google Coral TPU, because of lack of power and no powered USB hub at hand). So I started looking at how I could roll it back, and found posts stating it’s not possible. However, I came up with a simple solution which worked perfectly for me:

  • got a bigger SD card (SanDisk 256GB)
  • flashed HAOS
  • made a full backup of my current HA
  • replaced the SD
  • removed the external drive
  • restored the backup on the new installation on the new SD

and everything worked seamlessly.
Now I have plenty of space on data disk, backups still on NAS, Coral perfectly working again.

I guess the same approach can be taken in a virtualized environment.

In my case, I’m running a Dell Optiplex 3050 Micro with an internal 128GB SATA SSD that was starting to run low on space because of backups. I attached a pretty slow 5400 RPM USB drive and moved the data disk. That was a mistake. My boot time is now forever … many minutes.

So, I’ve installed an internal M.2 PCIe SSD and will be moving the data disk to that.

One problem that I encountered is that the network interface name changed upon boot after installing the M.2 stick. The original interface was enp1s0 but it changed to enp2s0, so HA would not come up.

Some research shows the internal NIC uses PCI. I suppose a way to fix this would be to hook a USB gigabit NIC to the box, but I didn’t want to complicate it, and it would have changed the interface name anyway.

From the CLI I was able to use the network update command to add the original IP address (just the IPv4 address, not the GW, etc.) that was on enp1s0 onto enp2s0, and then issued core restart. HA then came up.
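For reference, the two commands could look roughly like what the helper below prints. This is a sketch rather than a record of what was actually typed: `build_network_fix` is a made-up name, the address is a placeholder, and it assumes the `ha` CLI's `network update` subcommand with its `--ipv4-method` and `--ipv4-address` options (at the HA OS console prompt the leading `ha` is left out).

```shell
# Sketch: print the commands for putting the old IPv4 address on the
# renamed interface. Nothing is executed here.
build_network_fix() {
    iface=$1   # new interface name, e.g. enp2s0
    addr=$2    # old address in CIDR notation (placeholder value in the example)
    echo "ha network update $iface --ipv4-method static --ipv4-address $addr"
    echo "ha core restart"
}
```

`build_network_fix enp2s0 192.168.1.50/24` prints both lines for copying to the console; the gateway and DNS servers can be set afterwards from the UI, as described in this post.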

Once back in HA I was able to finish setting the gateway and DNS addresses, and confirmed that my NC remote connection and the cloud-type integrations worked as well.

I’m in this same situation. I have a spare older NVMe drive I could use, but I’d have to connect it via USB (for now) and was thinking the “Move Data Disk” button would:

  1. Copy everything over including the OS and configs.
  2. Update my bootloader of my Pi.
  3. Tell me to remove the microSD card when it completes and restarts.

There’s no documentation for this feature in Home Assistant itself, so it’s very vague. All I want is to have more data tracked for longer.

My understanding is “Move Data Disk” WON’T copy over a boot loader, but with the prospect of upgrades from CM4 to CM5 modules, a new install of HAOS 14 will ignore eMMC and install both OS and data on SSD to make upgrades easier (e.g. just move the SSD over).

The change in HA OS 14:

My Yellow has the OS images on eMMC, and user data (move data disk) on the NVMe SSD. This suggests I’ll need to back up, swap CM4 → CM5, reinstall a new HAOS to the SSD, then restore (which means you can save £10 on a CM5 Lite with no eMMC).

And don’t use the CM4 bolts, and apparently the USB-2 ports can’t be used on a Yellow for a CM5, so you need to install via the USB-C serial console or perhaps using a NVMe carrier on another machine.

Hello, I have an additional question.

How long should the process take?

I ask because the text in HA said 20 minutes, but it’s been 80 minutes and I’m still waiting for HA to restart.

Thanks!

I ended up making a backup and putting that on an NVMe drive imaged with Home Assistant.

The trick was writing the USB bootloader to a microSD and running that first. I waited some time and took it out, then it booted from the NVMe drive over USB.

That image is available in the Raspberry Pi imager app.

That way you probably aren’t running hybrid mode but booting completely from your NVMe disk, right? That’s not what the move data disk feature is made for - and there are good reasons not to use an SSD as the boot device, considering the issues many board users hit on almost every HA OS update.

@rwelsh09 so how long did the process actually take? What is the size of your original/source disk (storage used)?

Note the initial HA OS install process has changed to use ONLY an NVMe drive for both the OS and HA data if both eMMC and NVMe exist - this is apparently so that upgrading from a CM4 to a CM5 just means moving the drive across.

How big is your long-term statistics database and /media contents?

It depends…

What are the reasons? Saying “there are good reasons” is why I used an NVMe SSD as my boot drive. I couldn’t see anything wrong with it. I have Windows on those same drives.

Reasons to go NVMe:

  1. Make upgrading from a Raspberry Pi 4 to 5 much easier as @FloatingBoater noted.
  2. Faster boot times. With the number of updates Home Assistant releases, it happens more often than you think.
  3. More reliable storage. While my 32GB microSD card was designed for longevity even when it’s been written to a bunch in security cameras, it’s not gonna be as good as an NVMe drive many times its size that wasn’t used much.
  4. The ability to store stats much longer. It’s only 1.5 weeks by default, but I’d like at least 3 months. That’d also make it easier to debug issues like “when was this device last online?”.