Backup/Snapshots no longer decryptable?

Is it just me?
I just realized, to my horror, that for some time now I have not been able to decrypt my backups manually. A backup from January 27th 2019 still works. The next backup, from February 2nd, no longer works. In between I had to “wipe&restore” from backup, so this might have something to do with it.
I always keep HASSIO up to date, so I currently run 0.89 (since this morning; 0.88.2 before that).
To reproduce, I:
1: Created a manual encrypted backup through the GUI.
2: Downloaded the backup to a Windows 10 desktop.
3: Untarred it with 7-Zip.
4: Tried to gunzip one of the contained .tar.gz files with 7-Zip, entering the encryption key. Error: “Can not open, not an archive”

The same steps are successful on the older backup I mentioned before.

So can anyone confirm this observation?
If not, do you have any idea how the “wipe&restore” might have caused this behavior or how to fix it?

[UPDATE 20190308] It seems the dates were wrong. The backup from January 27th is already broken. The previous backup from January 6th however is working.

I just checked mine from last night and I can manually decrypt ok

I’m using something called Universal Extractor (v1.6.1). It’s old but I’ve been using it for many, many years.

Thanks for your feedback.
Since I can extract and decrypt the older backup without problems I don’t think it’s a problem with the tool used.

I frequently extract the odd file from a backup and never seen any issue - but I don’t encrypt them with a password either.

ah… hang on, neither do I.
Maybe I misunderstood. I took encryption to just mean the compressing side of things. My mistake.

So no one uses the builtin encryption with backups?

Could someone possibly just try it once to verify?

According to https://github.com/home-assistant/hassio/blob/dev/hassio/utils/tar.py
HomeAssistant’s snapshot is a tar archive that consists of SecureTar archives.

SecureTar is a HomeAssistant abstraction.
It looks like a normal tar.gz archive but it isn’t, which is why you can’t just uncompress it.
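A quick way to tell a plain gzip from something else (such as an encrypted SecureTar) is to look at the first two bytes of the file: every gzip stream starts with the magic bytes 1f 8b. A minimal sketch, assuming you have already unpacked the outer snapshot; `looks_like_gzip` is a made-up helper name, and because an encrypted file starts with random bytes, a "not gzip" result only suggests encryption rather than proving it:

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with Path(path).open("rb") as handle:
        return handle.read(2) == GZIP_MAGIC


if __name__ == "__main__":
    import sys

    # Usage: python check_gzip.py file1.tar.gz file2.tar.gz ...
    for name in sys.argv[1:]:
        kind = "plain gzip" if looks_like_gzip(name) else "not gzip (possibly encrypted)"
        print(f"{name}: {kind}")
```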

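The first 16 bytes are a salt.
Then comes AES-128-CBC encrypted data (the data is a tar.gz archive).
Key = derived from your password (the decrypt script later in this thread hashes the password with SHA-256 100 times and keeps the first 16 bytes).
The IV is calculated from the salt and key by the following code: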

def _generate_iv(key: bytes, salt: bytes) -> bytes:
    """Generate an iv from data."""
    temp_iv = key + salt
    for _ in range(100):
        temp_iv = hashlib.sha256(temp_iv).digest()
    return temp_iv[:16]

I can’t decrypt it with command-line tools because the IV calculation is too complex for them. Maybe we can just use Python to decrypt it.
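To make the layout described above concrete, here is a stdlib-only sketch that reads the 16-byte salt from the start of an inner file and derives the AES key and IV. The key step (100 rounds of SHA-256 over the password, truncated to 16 bytes) is taken from the decrypt script posted later in this thread; `read_salt`, `derive_key_and_iv` and `_stretch` are illustrative names. The actual AES decryption is not shown and still needs a library such as the cryptography package.

```python
import hashlib


def _stretch(data, rounds=100):
    """Hash data with SHA-256 `rounds` times and keep the first 16 bytes."""
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data[:16]


def read_salt(path):
    """Read the 16-byte salt stored at the start of a SecureTar file."""
    with open(path, "rb") as handle:
        salt = handle.read(16)
    if len(salt) != 16:
        raise ValueError("file is too short to contain a salt")
    return salt


def derive_key_and_iv(password, salt):
    """Return (key, iv) for AES-128-CBC from a password string and the salt."""
    key = _stretch(password.encode())
    iv = _stretch(key + salt)
    return key, iv
```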

Do you think this changed with a release around January this year? Because I have no problems decrypting backups from before January 2019.

As long as I can actually restore from an encrypted backup this would be ok. Will give it a try. I would much prefer to have access to the files in the backup though.

Hey! Any updates? Is there any progress on decrypting? :slight_smile:

I have to admit, I didn’t have the nerve to test it and just changed my backup to unencrypted.
But since you insisted I just tested it now :wink:

Yes, I was able to restore the encrypted backup. I am just not able to decrypt it manually with 7-zip.
Sometimes I wish I could access single files from my backup.

You can if you don’t encrypt them.

For those who are still looking for a solution: I have created a simple Python script to decrypt a secured snapshot. Of course it would be nicer to have the option to extract a snapshot with the ‘hassio’ tool, but this works. You will need to have python3 and the cryptography package installed.

#!/usr/bin/env python3

import sys
import getopt
import hashlib
import tarfile
import glob
import os
import shutil

from pathlib import Path

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import (
    Cipher,
    algorithms,
    modes,
)

def _password_to_key(password):
    password = password.encode()
    for _ in range(100):
        password = hashlib.sha256(password).digest()
    return password[:16]

def _generate_iv(key, salt):
    temp_iv = key + salt
    for _ in range(100):
        temp_iv = hashlib.sha256(temp_iv).digest()
    return temp_iv[:16]

class SecureTarFile:
    def __init__(self, filename, password):
        self._file = None
        self._name = Path(filename)

        self._tar = None
        self._tar_mode = "r|gz"

        self._aes = None
        self._key = _password_to_key(password)

        self._decrypt = None

    def __enter__(self):
        self._file = self._name.open("rb")

        cbc_rand = self._file.read(16)

        self._aes = Cipher(
            algorithms.AES(self._key),
            modes.CBC(_generate_iv(self._key, cbc_rand)),
            backend=default_backend(),
        )

        self._decrypt = self._aes.decryptor()

        self._tar = tarfile.open(fileobj=self, mode=self._tar_mode)
        return self._tar

    def __exit__(self, exc_type, exc_value, traceback):
        if self._tar:
            self._tar.close()
        if self._file:
            self._file.close()

    def read(self, size = 0):
        return self._decrypt.update(self._file.read(size))

    @property
    def path(self):
        return self._name

    @property
    def size(self):
        if not self._name.is_file():
            return 0
        return round(self._name.stat().st_size / 1_048_576, 2)  # calc mbyte

def _extract_tar(filename):
    _dirname = '.'.join(filename.split('.')[:-1])

    # Remove any previous extraction of the same snapshot.
    try:
        shutil.rmtree(_dirname)
    except FileNotFoundError:
        pass

    print(f'Extracting {filename}...')
    with tarfile.open(name=filename, mode="r") as _tar:
        _tar.extractall(path=_dirname)

    return _dirname

def _extract_secure_tar(filename, password):
    _dirname = '.'.join(filename.split('.')[:-2])
    print(f'Extracting secure tar {filename.split("/")[-1]}...')
    try:
        with SecureTarFile(filename, password) as _tar:
            _tar.extractall(path=_dirname)
    except tarfile.ReadError:
        print("Unable to extract SecureTar - maybe your password is wrong or the tar is not password encrypted?")
        sys.exit(5)

    return _dirname

def print_usage():
    print(f'{sys.argv[0]} -i <inputfile> -p <password>')

def main():
    _inputfile = None
    _password=None

    try:
        opts, args = getopt.getopt(sys.argv[1:],"hi:p:")
    except getopt.GetoptError:
        print_usage()
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print_usage()
            sys.exit()
        elif opt == "-i":
            _inputfile = arg
        elif opt == "-p":
            _password = arg

    if not _inputfile:
        print("Missing inputfile")
        print_usage()
        sys.exit(3)

    if not _password:
        print("Missing password")
        print_usage()
        sys.exit(4)

    _dirname = _extract_tar(_inputfile)
    for _secure_tar in glob.glob(f'{_dirname}/*.tar.gz'):
        _extract_secure_tar(_secure_tar, _password)
        os.remove(_secure_tar)

    print("Done")

if __name__ == "__main__":
    main()

Thanks @Taapie !
This deserves to be pinned up somewhere - or better yet, included in the core toolset. After much digging around to figure out what this “diy crypto” format was that they’d introduced, I was about to start writing my own script until I found yours.

Appreciate your taking the time to write it, make it human-friendly (usage message, getopts etc.) and share it with us.

For anyone with their brain in neutral like I had for a moment: this script operates on the outer archive, not the individual .tar.gz files you get when you naively untar.

Thank you so much for this! My password was not correct (typo), but this script helped me find it by brute-forcing it. Thanks again!

I need some help. How did you use the script to brute force yours?

EDIT: Never mind, I got it working after much hacking on Windows. I was able to confirm that the password I thought should be right for the snapshot is in fact right. Now I need to figure out why it’s not restoring.
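On the brute-force question: the posts above do not say how it was done, so the following is only one possible approach. It generates near-miss variants of a password you believe is right (one character dropped, one letter's case flipped, or two adjacent characters swapped) and feeds them to a check function that you supply. `typo_variants`, `find_password` and `check` are hypothetical names, not part of the script.

```python
def typo_variants(password):
    """Yield the password itself, then unique single-typo variants of it."""
    seen = set()

    def is_new(candidate):
        # Remember each candidate so duplicates are only tried once.
        if candidate in seen:
            return False
        seen.add(candidate)
        return True

    if is_new(password):
        yield password
    for i in range(len(password)):
        # One character dropped.
        dropped = password[:i] + password[i + 1:]
        if is_new(dropped):
            yield dropped
        # One letter with its case flipped.
        flipped = password[:i] + password[i].swapcase() + password[i + 1:]
        if is_new(flipped):
            yield flipped
        # Two adjacent characters swapped.
        if i + 1 < len(password):
            swapped = password[:i] + password[i + 1] + password[i] + password[i + 2:]
            if is_new(swapped):
                yield swapped


def find_password(candidates, check):
    """Return the first candidate for which check(candidate) is true, else None."""
    for candidate in candidates:
        if check(candidate):
            return candidate
    return None
```

Note that the script's _extract_secure_tar calls sys.exit on a wrong password, so a check function would have to open SecureTarFile itself and catch tarfile.ReadError, as the script does, instead of calling that helper.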

1 Like

Thank you for the script.

For all those struggling with passwords containing special characters like ` or $ when running the script from bash: put the password in single quotes after -p, so […] -p 'p`a$$word'. Inside single quotes bash does not interpret those characters. :wink:
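Another way around shell quoting is to not pass the password on the command line at all. A sketch, assuming you are willing to modify the script: when -p is omitted, prompt with the standard library's getpass, which reads the password without echoing it, so the shell never gets to interpret it. `resolve_password` is a hypothetical helper, not part of the original script.

```python
import getpass


def resolve_password(cli_value, prompt=getpass.getpass):
    """Return the -p value if given, otherwise ask for the password interactively."""
    if cli_value:
        return cli_value
    return prompt("Snapshot password: ")
```

In main(), a line like `_password = resolve_password(_password)` would go before the "Missing password" check.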

Pretty neat indeed.

However it failed in my case.

Maybe something has changed since the script was written in 2019?

I systematically got an “Unable to extract SecureTar - Maybe your password is wrong or the tar is not password encrypted?” message, even on freshly created backups.

Scratch that: I’m an idiot and was using the wrong password. :blush:
The script is great and useful but I’ll avoid encryption in the future still.

This doesn’t work for me. Has something changed since 2019 or am I using the script in the wrong way or for the wrong purpose?

I’ve downloaded a tar archive backed up to Google Drive by the Google Drive Backup add-on. That archive unpacks without issues. Inside it is a number of tar.gz archives, which I assume are encrypted, since tar can’t deal with them and file thinks they’re just “data”. However, the script can’t decrypt them:

%  ./decrypt.py -i homeassistant.tar.gz -p redacted
Extracting homeassistant.tar.gz...
Traceback (most recent call last):
  File "../decrypt.py", line 145, in <module>
    main()
  File "../decrypt.py", line 137, in main
    _dirname = _extract_tar(_inputfile)
  File "../decrypt.py", line 89, in _extract_tar
    _tar  = tarfile.open(name=filename, mode="r")
  File "/Users/ehn/.pyenv/versions/3.8.3/lib/python3.8/tarfile.py", line 1604, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully

You need to extract the tar in this gz before running it through this program.