Fix corrupted / malformed SQLite database (home-assistant_v2.db)

Good work!
For me it didn’t work with the “sqlite3 home-assistant_v2.db “.dump” | sqlite home-assistant_v2.db_fixed” command. Ended with an empty file.
What worked for me was “sqlite3 home-assistant_v2.db “.recover” | sqlite home-assistant_v2.db_fixed”
(result: almost same size as original and no errors when copied back to HA)

4 Likes

Thanks for this, will give it a go. Part of my problem is that I do have long-term values, especially from utility_meter values (and the underlying sensors) that I do not want to lose.

I see in my case the problem is in the events table – is there any way to only recreate the events table and keep the rest?

Delete all from the events table? or drop table and recreate? https://www.home-assistant.io/docs/backend/database/

1 Like

This worked perfectly – thanks! My recovered DB ended up about a third of the original, but
a) it works, and,
b) the state history is there, and so are my utility_meter values.

Thanks!!

That´s not necessarily because of loosing data, but possibly due to the former version not been repacked for some time. Purging only limits growth, repacking throws away old data according to the purge_keep_days setting (Recorder - Home Assistant).

Always great to come back to a topic started by oneself.

Had a foreign key constraint error resulting in some kind of “self protection” where the database copied itself to a home-assistant_v2.db.corrupt.2022-02-03T20:59:16.490886+00:00 file and recreated the home-assistant_v2.db from scratch - almost empty. EVERYTHING was gone, entities history of course and the Energy Dashboard was completely empty too - of course. What a shocking moment.

OK, I had two options:

  1. Short way: try to fix the foreign key constraint errors end restore consistency ==> potentially only loose some hours of data
  2. Long way: restore database from latest backup (nightly) ==> loose about 19 hours of data.

Long story short:

  • Option 2 is still the hard way. Step 3 of Fix corrupted / malformed SQLite database (home-assistant_v2.db) would probably work as it did for many many others already, but (fortunately) I didn’t need to go that path up to now as there’s always been another “corrupted/inconsistent, but not malformed” database file.

  • Finally HA devs are actively working on the core issue the SQLite database gets malformed during backup (because it’s a physical/file based backup during operation (HA still running and writing data to the db) - see 1.2 on How To Corrupt An SQLite Database File) which - still, in February 2022 - renders that file pretty useless in the first place (or for every normal user without fix knowledge).
    → Here we go, that’s where I put my bets for a few weeks meanwhile, closely watching every activity: https://github.com/home-assistant/core/pull/60874
    Needless to say a trustworthy backup mechanism which conserves a consistent database IS A MUST meanwhile (always was for history data addicted like me) because we now all have the energy dashboard and long-term statistics. All those great 2021 improvements/features rely on the database, so it’s not too early to finally make that groundwork imho. I think that should fit in 2022s year of “streamlining experiences”:

  • Ah yes, by the way: Option 1 was the path I went. Fast lane. Fixed the foreign key constraint issues (only 3 rows and I exactly knew why/what happened) in the corrupted database, shutdown HA, replaced the new/empty one by the fixed one, started up HA, and hello history, I missed you so much. Energy dashboard only calculates once a hour so it took until the next full hour to create the next hour graph.

To sum up:

3 and a 1/2 hours of unplanned and unnecessary restoration tasks and a few lessons learned later… I really hope not too many others will experience that “database malformed/corrupted, hell no what to do!” situation, won’t land here and won’t read this.

By the way, for me there are only two possible ways to go in terms of database (because as my HA grew I really really REALLY need to make sure my database is safe and backed up reliably - I did mention it, did I?):

  1. Either https://github.com/home-assistant/core/pull/60874 will finally make its way into a productive release so I can reestablish trust in homeasisstant container backups including the SQLite database.
  2. Or some experts at Migrating home assistant database from sqlite to mariadb - which is not yet a perfect but still the best migration guide out there! - will finally fix the last little bugs (energy dashboard related) for migrating from SQLite to MySQL so I can finally switch to another level of database management. Including a professional online editing (phpMyAdmin addon with logical backups/DB dumps instead of very simple SQLite web addon which doesn’t even provide a query history etc.), more control, more options, improved backup strategys etc…

Until that

  • I will do a ha core stop; cp /config/home-assistant_v2.db /backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).db; ha core start and copy that “working cause created offline” backup away from the HA server, just in case. I plan to do that at least once a month, probably before every major (monthly) HA update
  • I will also have a look into these logical backups
    • classical dump using sqlite3 /config/home-assistant_v2.db .dump > /backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).sql)
    • backup function sqlite3 /config/home-assistant_v2.db ".backup '/backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).db'"

Maybe I can automate one of these backups to run it regularly to reduce HA downtime by running a physical backup. Will have a look at what I will get and if consistency is fine. But if it would be that easy I’m sure HA dev team would have implemented it as backup subtask for homeassistant container years ago. Sounds promising anyway:

It's different from regularly file copying, because it takes care of any users currently working on the database. There are proper locks set on the database, so the backup is done exclusively.
3 Likes

Now it got me.

Was away for 3 days an when I came back I releazied the history just ends yesterday (2022-02-10).

Checking the config folder I found the home-assistant_v2.db.corrupt.2022-02-10..... and did run a

sqlite3 ./home-assistant_v2.db ".recover" | sqlite3 ./home-assistant_v2.db.fix

on the corrupted db. The process it self looked successful and the db was shrinking from 4.4GB to 3.8GB.

When starting ha core with the new (fixed) db file it just got detected as corrupted again :frowning:

So I bite the bitter apple and restored a db file from my last backup a week ago. First I was pleased as I could browse the energy statistics again while ha core was starting up (with the db from the backup) but as time ha was completely started (indicated on the web front down left with a message) the database from the backup was again detected as corrupted and the energy statistics I just browsed seconds ago are gone :bowing_man:


2022-02-11 23:14:45 WARNING (Recorder) [homeassistant.components.recorder.migration] Database is about to upgrade. Schema version: 23
2022-02-11 23:14:51 ERROR (Recorder) [homeassistant.components.recorder.util] Error executing query: (sqlite3.DatabaseError) database disk image is malformed
[SQL: SELECT statistics_short_term.id AS statistics_short_term_id, statistics_short_term.created AS statistics_short_term_created, statistics_short_term.start AS statistics_short_term_start, statistics_short_term.mean AS statistics_short_term_mean, statistics_short_term.min AS statistics_short_term_min, statistics_short_term.max AS statistics_short_term_max, statistics_short_term.last_reset AS statistics_short_term_last_reset, statistics_short_term.state AS statistics_short_term_state, statistics_short_term.sum AS statistics_short_term_sum, statistics_short_term.metadata_id AS statistics_short_term_metadata_id 
FROM statistics_short_term LEFT OUTER JOIN (SELECT statistics_short_term.start AS start, statistics_short_term.metadata_id AS metadata_id, 1 AS is_duplicate 
FROM statistics_short_term GROUP BY statistics_short_term.metadata_id, statistics_short_term.start 
HAVING count(*) > ?) AS anon_1 ON anon_1.metadata_id = statistics_short_term.metadata_id AND anon_1.start = statistics_short_term.start 
WHERE anon_1.is_duplicate = ? ORDER BY statistics_short_term.metadata_id, statistics_short_term.start, statistics_short_term.id DESC
 LIMIT ? OFFSET ?]
[parameters: (1, 1, 998, 0)]
(Background on this error at: https://sqlalche.me/e/14/4xp6)

Any idea what’s happening here? :eyes:

I feel you. We know that DB files from the „backup“ are useless. They are working on it, but it might take months they finally fix this - after years. Short: forget the database backups.

I don’t know what else you can try. If the recover process did not work for you there might be no other option.

Maybe… run some offline consistency checks after the .recover, (quick check, full check and foreign key check). Perhaps you’re lucky and output tells you what exactly is wrong so you can dig into it trying to fix it.

By the way: looks like your database is trying to upgrade. Did you update HA meanwhile? So are you trying to use a database from a former HA release with a newer HA? I would try to have HA Core and DB at the same date.

I did this but I now also tried to restore backups (including full backups) with same ha & db versions. Unlucky me all database get recognized as corrupt. Latest log spitted out this:

2022-02-20 01:45:29 ERROR (Recorder) [homeassistant.components.recorder.util] The database at //config/home-assistant_v2.db is corrupt or malformed
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/recorder/util.py", line 223, in validate_sqlite_database
run_checks_on_open_db(dbpath, conn.cursor())
File "/usr/src/homeassistant/homeassistant/components/recorder/util.py", line 234, in run_checks_on_open_db
sanity_check_passed = basic_sanity_check(cursor)
File "/usr/src/homeassistant/homeassistant/components/recorder/util.py", line 212, in basic_sanity_check
cursor.execute(f"SELECT * FROM {table} LIMIT 1;") # nosec # not injection
sqlite3.OperationalError: Could not decode to UTF-8 column 'created' with text '2022-01-30!04:2:19.�0553'
2022-02-20 01:45:29 ERROR (Recorder) [homeassistant.components.recorder.util] The system will rename the corrupt database file //config/home-assistant_v2.db to //config/home-assistant_v2.db.corrupt.2022-02-20T01:45:29.591454+00:00 in order to allow startup to proceed

It’s a shame that the backups I took aren’t of any value because they are all with corrupt databases :cry:

Could you drop me 3 lines how exactly to do this? I’m on a linux host but not familiar with any database stuff. Only thing I did till no is to have the sqlite3 recover thing running for a couple of times.

I’m super disappointing that the ha database got silently corrupted and even the backups not working. I hope I have a chance to restore my data (even if that means I will have a 3-4 weeks loss now).

Actually I was just planing to get some offers about solar power based on the ha statistics (whole house + dedicated appliances) … and now they are just gone :persevere:

Open with sqlite3 /config/home-assistant_v2.db, then:

  • PRAGMA quick_check;
  • PRAGMA integrity_check;
  • PRAGMA foreign_key_check;

Depending on if and what you get, the real work just begins. Note that the integrity_check can take a long time (as it does a full consistency check) depending on the size of your database.

1 Like

I’m executed this yesterday and it’s still running…

$ sqlite3 home-assistant_v2.db 
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> PRAGMA quick_check
   ...> 

The db is 4.4GB and the system isn’t to stressed at all:

image

Is this the normal behavior? Is it possible to get some verbose output by any chance?

Without the ; at the end of the line it waits forever.

2 Likes

I had a similar experience. But without doing any db restoration.
Basically, I tried the crude way of simply making multiple backups and importing them into a new HAOS VM until one was not corrupted. For me, iteration #3 seemed to work. The history was there and went back months. All was working.

Yesterday (days later), I noticed the history only goes back a couple of days (the day of the migration) and there is a corrupted database file.

1 Like

What difference a semicolon can make :laughing:

Turns out the quick check is really quick actually:

sqlite> PRAGMA quick_check;
*** in database main ***
Page 729453: free space corruption

next: integrity_check;

actually very quick too:

sqlite> PRAGMA integrity_check;
*** in database main ***
Page 729453: free space corruption
row 733 missing from index ix_events_event_type_time_fired
row 733 missing from index ix_events_context_id
row 733 missing from index ix_events_time_fired
database disk image is malformed

and the last check:

PRAGMA foreign_key_check;
states|152065729|events|1

Any ideas what this all means? Is it totally broken or is their a chance to get this db back to a working state?

Are you sure the restore was complete while you were browsing the history? My history is always available while the restore takes place but once it’s finished it will mark the (just imported) db as corrupt and start with a fresh one.
You can look ad the suffix of the corrupt db to see the exact timestamp.

Not completely sure.
On a fresh HAOS VM, instead of making a new account, I choose to import the backup file. And when it was done doing that and the dashboard loaded, the graphs etc were populated with past data. So I assumed to import succeeded.

But its possible it was still working on it in the background and declared the database corrupt later. I don’t know how it works internally.

I also battled with these issues for a while.

Interestingly, #60874 was merged in December 2021, but as far as I can tell, it only appears in pre-release 2021.12.0b5 and not in any release changelogs since. I probably have missed it somewhere.

Still, I’ve encountered this issue in March and April 2022 so it seems to be still not fixed entirely. Until then, I’ve written a bash script that creates a backup every night via a conjob. So far, it seems to work fine and I wanted to share it with you guys.

The following script

  • Stops HA
  • Copies the database from the docker environment to the local filesystem
  • Performs an integrity check
  • Dumps, recreates and copies the database back to HA if the integrity check returns issues
  • Starts HA

For now, this seems to to the trick and I’ve not seen any corrupted databases for a few days now.

#!/bin/bash
BASEDIR="/opt/ha-backups/"
SOURCE="homeassistant:/config/home-assistant_v2.db"
TARGET="$BASEDIR$(date +%F\_%H%M%S)_home-assistant_v2.db"

echo "$(date +%F\_%H%M%S) Starting Backup" >> $BASEDIR"log.txt"
echo -n "$(date +%F\_%H%M%S) Shutting down HA: " >> $BASEDIR"log.txt"
ha core stop >> $BASEDIR"log.txt"

echo "$(date +%F\_%H%M%S) Copying database file" >> $BASEDIR"log.txt"
docker cp $SOURCE $TARGET

echo -n "$(date +%F\_%H%M%S) Checking for issues..." >> $BASEDIR"log.txt"
RESULT=$(sqlite3 $TARGET "pragma integrity_check")

if [ "$RESULT" != "ok" ]; then
  echo "issues were found. Fixing..." >> $BASEDIR"log.txt"
  echo
  sqlite3 $TARGET -cmd ".output tmp.sql" ".dump"
  rm $TARGET

  sqlite3 $TARGET -cmd ".read tmp.sql" ".quit"
  rm tmp.sql

  echo
  echo "$(date +%F\_%H%M%S) The backup was dumped and rebuilt. Now copying fixed backup back to HA." >> $BASEDIR"log.txt"
  docker cp $TARGET $SOURCE
else
  echo " no issues were found :)" >> $BASEDIR"log.txt"
fi

echo -n "$(date +%F\_%H%M%S) Starting HA: " >> $BASEDIR"log.txt"
ha core start >> $BASEDIR"log.txt"

My crontab for this looks like this:

# HomeAssistant Backup
0 4 * * * /opt/ha-backups/backup-and-fix

I hope this scripts are useful for other people as well until these issues are finally solved.

4 Likes

Today I had to restore a Home Assistant (partial) backup - due to an irrepealable ShellyForHASS (shelly custom integration) issue. First time to restore a backup ever (since running HA productively). Well:

Database is fine after restore, passed all integrity checks, all data there. :white_check_mark:

So it seems like these mentioned fixes (How to make a full backup and restore with all the data as well? - #2 by tom_l) have been implemented/rolled out meanwhile:

The only negative side effect was the missing “config” SAMBA share:
It was available but not accessable, no content. Not via SAMBA clients, not for any addon.
Solution:

  1. Restart SAMBA share addon:
    → now working for SAMBA clients and other addons, but: not for the host itself (HA OS) - see [*1] below. Therefore step 2:
  2. Restart Host / HA OS
    (maybe there’s also a magic remount command or sth to avoid a host reboot, did not investigate further)
    → Now also the host is able to access /config again.

That is certainly not a perfect “restore experience” but related to the database pretty smooth. Makes me a lot more confident to restore backups in the future.

Anyway, that SAMBA thing is something probably worth reporting.
Where to? To that addon repository and the HA OS repository? What do you experts think.

[*1]

Maybe, or coincidentally there were no writes durring you backup creation. It deserves confirmation from someone who is confident about this improvement. I’m not aware it was mentioned in changelogs released during last half a year.