[HOW-TO] Recover the files from a Proxmox server after a hardware failure

Hello everyone,

My HA setup and most of the components of my home network (firewall, wifi, and so on…) were installed on a NUC clone with an i7, 32 GB of RAM and a 256 GB M.2 SATA SSD. Everything ran under Proxmox using VMs or LXC containers. This was very efficient to manage and orchestrate, and I had a very good setup in the end.

However, that is now the past: a couple of days ago, the server died! Apparently it’s a motherboard problem and I sent it back for repair. This is where the fun starts: no internet at home because no firewall or network routing, no HA and automations anymore, … And as I have to wait at least a month to get a replacement computer back at home, I started to figure out how to recover the data and put a temporary setup in place.

This is a summary of my journey, written as a how-to to help anyone who faces the same trouble.

  1. Restart the local network

The first step for me was to put my internet provider’s box back into router mode (it was bridged to the server) and restart the wifi from it.

I configured the wifi with the same SSID and the same password as my VM access point was using. This way all my devices should reconnect automatically without having to change the password or reconfigure them.

  2. Find an adapter for the NUC hard disk

The disk is an M.2 SATA (not PCIe/NVMe). As I already had a SATA to USB adapter, I bought a second adapter for M.2 SATA disks from Amazon. With this new adapter, I’m now able to plug my disk into my main computer over USB as an external drive.

  3. Connect it to the computer to access the Proxmox filesystem

First of all, I’m using Windows as it’s my “work” computer and I can’t add a Linux dual boot to it. This is fine for me as I had a dedicated Linux VM on the server that I worked with most of the time through SPICE.
However, Windows is a big problem here: I can’t easily mount the disk on my computer. As I already use WSL2 for work, I started looking at it to recover my data, but at the moment it’s not possible to do it this way.

I ended up installing VirtualBox and adding a Linux VM with the following configuration. I installed Manjaro XFCE, as I have been using Arch Linux for more than 10 years and Manjaro is very quick to install, but feel free to choose the distribution you prefer.

Be sure to have the VirtualBox Extension Pack installed and USB 3.0 enabled for the VM. Then plug the drive into a USB port of your computer and add a USB passthrough of the drive to the VM. The partitions appear in the file manager of the Linux VM. Perfect, we can start from here!
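
If you prefer the command line to the GUI, VBoxManage can create the same USB filter. This is only a sketch: the VM name “Manjaro”, the filter name and the vendor/product IDs are placeholders you need to replace with your own values.

# List the USB devices seen by the host to find your adapter's vendor and product IDs
VBoxManage list usbhost

# Attach a USB filter to the VM (VM name, filter name and IDs below are placeholders)
VBoxManage usbfilter add 0 --target "Manjaro" --name "M2 SATA adapter" --vendorid 1234 --productid 5678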

The 8.6 GB partition is the EFI partition and the 62 GB one holds the Proxmox system. However, my Proxmox install was done with the standard settings, using LVM-thin to store the VM disks. Even if this is very convenient for snapshots and thin provisioning, it makes it harder to get the data out of Proxmox… Here is how to do it.
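
To double-check which disk and partitions you are looking at before going further, lsblk gives a quick overview (the USB drive may show up under a different device name on your system):

# List the block devices with their sizes and filesystems
# The Proxmox disk should appear as an external drive, e.g. /dev/sdb, with its partitions
lsblk -f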

Open a terminal, then start by updating your packages and installing lvm2.

sudo pacman -Syu
sudo pacman -S lvm2

Then you can have a look at the LVM volumes available on your system by running the following in the terminal:

# Scan the system to search for an LVM volume group
sudo vgscan

# Activate the volume group named "pve"
sudo vgchange -ay pve

# List its logical volumes
sudo lvs
sudo lvdisplay

  4. Mount the HA VM disk (using LVM) and save your data

As the volume group was activated in the previous step with the vgchange command, we should now find /dev/pve in the device list.

As a side note, I installed HA on top of a clean Debian VM using the Supervised install method. So inside the disk that we are going to mount, we should find a standard Linux partition scheme; this tutorial is therefore generic and should work with any VM LVM volume.

# List the virtual disks devices and find the one of your HA VM
ls -l /dev/pve

In my Proxmox setup, HA had the VM number 200. So the corresponding device node is /dev/dm-10 (keep it for later).
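
If you are not sure which /dev/dm-N is yours, the /dev/pve entries are just symlinks to the real device nodes, so you can resolve the one of your HA VM directly (the VM number 200 is of course specific to my setup):

# Resolve the symlink to get the real device node (for me: /dev/dm-10)
readlink -f /dev/pve/vm-200-disk-0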

As it’s a VM filesystem, this LVM volume has its own partition table inside and you will not be able to mount it directly. Use cfdisk to look at what is inside and find the partition name.
In my case it is /dev/pve/vm-200-disk-0p1, as reported by cfdisk.

sudo cfdisk /dev/pve/vm-200-disk-0

Then, kpartx from the multipath-tools package will create the device mapper entries corresponding to the partitions seen with cfdisk.

sudo pacman -S multipath-tools
sudo kpartx -a /dev/dm-10 # Use the device node associated with the LVM volume

# To verify everything is ok, have a look at the entries created
ls -l /dev/mapper/

Then you can mount the virtual disk partition read-only in a newly created folder. I chose to keep it read-only to avoid making any wrong manipulation inside the disk, as my goal is to put the disk back in place as soon as I get my NUC back, so that everything starts exactly as it was just before the crash.

sudo mkdir /mnt/ha-recover
sudo mount -o ro /dev/dm-14 /mnt/ha-recover # dm-14 is the partition device created by kpartx in my case

You can now access the folders you need to back up all your configuration. If you made a backup recently, you can even grab it directly instead of saving all the config files and restoring them later on a clean install.
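
For reference, on a Supervised install the Home Assistant data normally lives under /usr/share/hassio inside the VM. The commands below are only a sketch: adjust them to where you mounted the partition, and the destination path is a placeholder.

# Copy the Home Assistant configuration (configuration.yaml, automations, ...)
sudo cp -a /mnt/ha-recover/usr/share/hassio/homeassistant /path/to/destination/

# Copy the backups made from the HA interface, if you have some
sudo cp -a /mnt/ha-recover/usr/share/hassio/backup /path/to/destination/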

The next step is to share a folder between your VirtualBox VM and your computer, to finally extract everything from the M.2 SSD and restore it on another machine. To do so, you can follow this great resource from @kanga_who here on GitHub; a quick sketch of the commands is also shown below.
In my case, I put my old Raspberry Pi 3 back in place, waiting for my server to come back freshly repaired!
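
For reference, here is roughly what the shared folder setup looks like. The share name “recover”, the VM name and the Windows path are placeholders, and mounting it inside the guest requires the VirtualBox Guest Additions to be installed.

# On the Windows host, with the VM powered off: attach a shared folder to the VM
VBoxManage sharedfolder add "Manjaro" --name recover --hostpath "C:\recover" --automount

# Inside the Linux VM: mount it manually if the automount does not show up
sudo mkdir -p /mnt/recover
sudo mount -t vboxsf recover /mnt/recover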

At the end, when you have finished extracting everything, don’t forget to unmount the drive, remove the device mapper entries and deactivate the LVM volume group, so you can disconnect the drive from the computer without any damage.

sudo umount /dev/dm-14 # Unmount the internal partition of the LVM volume
sudo kpartx -d /dev/dm-10 # Remove the device mapper entries associated with the LVM volume
sudo vgchange -an pve # Deactivate the LVM volume group named 'pve'

That’s all! As it took me a couple of hours to make it work, I’m putting it here in the hope it can help someone else in the same situation!

Have a great week-end :grinning:


Thank you for this. Worked a treat.

My NUC died and my backups were stuck at July last year (it appears I had an issue with them after July that I was unaware of).

Owe you a beer.


Backups, backups, backups. On remote storage. Every day!
Back up your VMs, but also back up files, for instance the Home Assistant backups.

Plus, I’ve configured a mini Proxmox cluster @home: 2 physical nodes and one RPi which acts as a witness.
All my VMs are backed up on the remote host and also synchronized. In case of a Proxmox node failure, all my VMs start on the other node in a few seconds.
You don’t need an expensive server for the second node, a small Pentium NUC works just fine. It will cost around 200€/$ but can save you days. :slight_smile:
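
If you want to replicate that kind of witness setup, Proxmox handles it with a QDevice. A rough sketch of the documented procedure (the Pi’s IP address is a placeholder):

# On the Raspberry Pi (witness): install the QDevice network daemon
sudo apt install corosync-qnetd

# On each Proxmox node: install the QDevice client
apt install corosync-qdevice

# On one Proxmox node: register the Pi as an external vote
pvecm qdevice setup 192.168.1.50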