Fix corrupted / malformed SQLite database (home-assistant_v2.db)

delessebe · January 5, 2021, 7:41am

Good work!
For me it didn’t work with the “sqlite3 home-assistant_v2.db “.dump” | sqlite home-assistant_v2.db_fixed” command. Ended with an empty file.
What worked for me was “sqlite3 home-assistant_v2.db “.recover” | sqlite home-assistant_v2.db_fixed”
(result: almost same size as original and no errors when copied back to HA)

ebendl · January 18, 2021, 7:23am

Thanks for this, will give it a go. Part of my problem is that I do have long-term values, especially from utility_meter values (and the underlying sensors) that I do not want to lose.

I see in my case the problem is in the events table – is there any way to only recreate the events table and keep the rest?

delessebe · January 18, 2021, 8:16am

Delete all from the events table? or drop table and recreate? https://www.home-assistant.io/docs/backend/database/

ebendl · January 18, 2021, 8:51pm

This worked perfectly – thanks! My recovered DB ended up about a third of the original, but
a) it works, and,
b) the state history is there, and so are my utility_meter values.

Thanks!!

e-raser · January 29, 2021, 6:44pm

That´s not necessarily because of loosing data, but possibly due to the former version not been repacked for some time. Purging only limits growth, repacking throws away old data according to the purge_keep_days setting (Recorder - Home Assistant).

e-raser · February 4, 2022, 12:27am

Always great to come back to a topic started by oneself.

Had a foreign key constraint error resulting in some kind of “self protection” where the database copied itself to a home-assistant_v2.db.corrupt.2022-02-03T20:59:16.490886+00:00 file and recreated the home-assistant_v2.db from scratch - almost empty. EVERYTHING was gone, entities history of course and the Energy Dashboard was completely empty too - of course. What a shocking moment.

OK, I had two options:

Short way: try to fix the foreign key constraint errors end restore consistency ==> potentially only loose some hours of data
Long way: restore database from latest backup (nightly) ==> loose about 19 hours of data.

Long story short:

Option 2 is still the hard way. Step 3 of Fix corrupted / malformed SQLite database (home-assistant_v2.db) would probably work as it did for many many others already, but (fortunately) I didn’t need to go that path up to now as there’s always been another “corrupted/inconsistent, but not malformed” database file.
Finally HA devs are actively working on the core issue the SQLite database gets malformed during backup (because it’s a physical/file based backup during operation (HA still running and writing data to the db) - see 1.2 on How To Corrupt An SQLite Database File) which - still, in February 2022 - renders that file pretty useless in the first place (or for every normal user without fix knowledge).
→ Here we go, that’s where I put my bets for a few weeks meanwhile, closely watching every activity: https://github.com/home-assistant/core/pull/60874
Needless to say a trustworthy backup mechanism which conserves a consistent database IS A MUST meanwhile (always was for history data addicted like me) because we now all have the energy dashboard and long-term statistics. All those great 2021 improvements/features rely on the database, so it’s not too early to finally make that groundwork imho. I think that should fit in 2022s year of “streamlining experiences”:
- Streamlining Experiences - Home Assistant
- 2022.2: Let's start streamlining! - Home Assistant
Ah yes, by the way: Option 1 was the path I went. Fast lane. Fixed the foreign key constraint issues (only 3 rows and I exactly knew why/what happened) in the corrupted database, shutdown HA, replaced the new/empty one by the fixed one, started up HA, and hello history, I missed you so much. Energy dashboard only calculates once a hour so it took until the next full hour to create the next hour graph.

To sum up:

3 and a 1/2 hours of unplanned and unnecessary restoration tasks and a few lessons learned later… I really hope not too many others will experience that “database malformed/corrupted, hell no what to do!” situation, won’t land here and won’t read this.

By the way, for me there are only two possible ways to go in terms of database (because as my HA grew I really really REALLY need to make sure my database is safe and backed up reliably - I did mention it, did I?):

Either https://github.com/home-assistant/core/pull/60874 will finally make its way into a productive release so I can reestablish trust in homeasisstant container backups including the SQLite database.
Or some experts at Migrating home assistant database from sqlite to mariadb - which is not yet a perfect but still the best migration guide out there! - will finally fix the last little bugs (energy dashboard related) for migrating from SQLite to MySQL so I can finally switch to another level of database management. Including a professional online editing (phpMyAdmin addon with logical backups/DB dumps instead of very simple SQLite web addon which doesn’t even provide a query history etc.), more control, more options, improved backup strategys etc…

Until that

I will do a ha core stop; cp /config/home-assistant_v2.db /backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).db; ha core start and copy that “working cause created offline” backup away from the HA server, just in case. I plan to do that at least once a month, probably before every major (monthly) HA update
I will also have a look into these logical backups
- classical dump using sqlite3 /config/home-assistant_v2.db .dump > /backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).sql)
- backup function sqlite3 /config/home-assistant_v2.db ".backup '/backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).db'"

Maybe I can automate one of these backups to run it regularly to reduce HA downtime by running a physical backup. Will have a look at what I will get and if consistency is fine. But if it would be that easy I’m sure HA dev team would have implemented it as backup subtask for homeassistant container years ago. Sounds promising anyway:

It's different from regularly file copying, because it takes care of any users currently working on the database. There are proper locks set on the database, so the backup is done exclusively.

orange-assistant · February 11, 2022, 11:24pm

Now it got me.

Was away for 3 days an when I came back I releazied the history just ends yesterday (2022-02-10).

Checking the config folder I found the home-assistant_v2.db.corrupt.2022-02-10..... and did run a

sqlite3 ./home-assistant_v2.db ".recover" | sqlite3 ./home-assistant_v2.db.fix

on the corrupted db. The process it self looked successful and the db was shrinking from 4.4GB to 3.8GB.

When starting ha core with the new (fixed) db file it just got detected as corrupted again

So I bite the bitter apple and restored a db file from my last backup a week ago. First I was pleased as I could browse the energy statistics again while ha core was starting up (with the db from the backup) but as time ha was completely started (indicated on the web front down left with a message) the database from the backup was again detected as corrupted and the energy statistics I just browsed seconds ago are gone


2022-02-11 23:14:45 WARNING (Recorder) [homeassistant.components.recorder.migration] Database is about to upgrade. Schema version: 23
2022-02-11 23:14:51 ERROR (Recorder) [homeassistant.components.recorder.util] Error executing query: (sqlite3.DatabaseError) database disk image is malformed
[SQL: SELECT statistics_short_term.id AS statistics_short_term_id, statistics_short_term.created AS statistics_short_term_created, statistics_short_term.start AS statistics_short_term_start, statistics_short_term.mean AS statistics_short_term_mean, statistics_short_term.min AS statistics_short_term_min, statistics_short_term.max AS statistics_short_term_max, statistics_short_term.last_reset AS statistics_short_term_last_reset, statistics_short_term.state AS statistics_short_term_state, statistics_short_term.sum AS statistics_short_term_sum, statistics_short_term.metadata_id AS statistics_short_term_metadata_id 
FROM statistics_short_term LEFT OUTER JOIN (SELECT statistics_short_term.start AS start, statistics_short_term.metadata_id AS metadata_id, 1 AS is_duplicate 
FROM statistics_short_term GROUP BY statistics_short_term.metadata_id, statistics_short_term.start 
HAVING count(*) > ?) AS anon_1 ON anon_1.metadata_id = statistics_short_term.metadata_id AND anon_1.start = statistics_short_term.start 
WHERE anon_1.is_duplicate = ? ORDER BY statistics_short_term.metadata_id, statistics_short_term.start, statistics_short_term.id DESC
 LIMIT ? OFFSET ?]
[parameters: (1, 1, 998, 0)]
(Background on this error at: https://sqlalche.me/e/14/4xp6)

Any idea what’s happening here?

e-raser · February 11, 2022, 11:51pm

I feel you. We know that DB files from the „backup“ are useless. They are working on it, but it might take months they finally fix this - after years. Short: forget the database backups.

I don’t know what else you can try. If the recover process did not work for you there might be no other option.

Maybe… run some offline consistency checks after the .recover, (quick check, full check and foreign key check). Perhaps you’re lucky and output tells you what exactly is wrong so you can dig into it trying to fix it.

full integrity check (might take a while): Pragma statements supported by SQLite
see foreign key check on the same page

By the way: looks like your database is trying to upgrade. Did you update HA meanwhile? So are you trying to use a database from a former HA release with a newer HA? I would try to have HA Core and DB at the same date.

orange-assistant · February 20, 2022, 1:33am

I did this but I now also tried to restore backups (including full backups) with same ha & db versions. Unlucky me all database get recognized as corrupt. Latest log spitted out this:

2022-02-20 01:45:29 ERROR (Recorder) [homeassistant.components.recorder.util] The database at //config/home-assistant_v2.db is corrupt or malformed
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/recorder/util.py", line 223, in validate_sqlite_database
run_checks_on_open_db(dbpath, conn.cursor())
File "/usr/src/homeassistant/homeassistant/components/recorder/util.py", line 234, in run_checks_on_open_db
sanity_check_passed = basic_sanity_check(cursor)
File "/usr/src/homeassistant/homeassistant/components/recorder/util.py", line 212, in basic_sanity_check
cursor.execute(f"SELECT * FROM {table} LIMIT 1;") # nosec # not injection
sqlite3.OperationalError: Could not decode to UTF-8 column 'created' with text '2022-01-30!04:2:19.�0553'
2022-02-20 01:45:29 ERROR (Recorder) [homeassistant.components.recorder.util] The system will rename the corrupt database file //config/home-assistant_v2.db to //config/home-assistant_v2.db.corrupt.2022-02-20T01:45:29.591454+00:00 in order to allow startup to proceed

It’s a shame that the backups I took aren’t of any value because they are all with corrupt databases

orange-assistant · March 3, 2022, 1:45pm

Could you drop me 3 lines how exactly to do this? I’m on a linux host but not familiar with any database stuff. Only thing I did till no is to have the sqlite3 recover thing running for a couple of times.

I’m super disappointing that the ha database got silently corrupted and even the backups not working. I hope I have a chance to restore my data (even if that means I will have a 3-4 weeks loss now).

Actually I was just planing to get some offers about solar power based on the ha statistics (whole house + dedicated appliances) … and now they are just gone

e-raser · March 3, 2022, 8:05pm

Open with sqlite3 /config/home-assistant_v2.db, then:

PRAGMA quick_check;
PRAGMA integrity_check;
PRAGMA foreign_key_check;

Depending on if and what you get, the real work just begins. Note that the integrity_check can take a long time (as it does a full consistency check) depending on the size of your database.

orange-assistant · March 5, 2022, 10:07am

I’m executed this yesterday and it’s still running…

$ sqlite3 home-assistant_v2.db 
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> PRAGMA quick_check
   ...>

The db is 4.4GB and the system isn’t to stressed at all:

Is this the normal behavior? Is it possible to get some verbose output by any chance?

e-raser · March 5, 2022, 11:28am

Without the ; at the end of the line it waits forever.

Istria1704 · March 5, 2022, 12:02pm

I had a similar experience. But without doing any db restoration.
Basically, I tried the crude way of simply making multiple backups and importing them into a new HAOS VM until one was not corrupted. For me, iteration #3 seemed to work. The history was there and went back months. All was working.

Yesterday (days later), I noticed the history only goes back a couple of days (the day of the migration) and there is a corrupted database file.

orange-assistant · March 5, 2022, 12:10pm

What difference a semicolon can make

Turns out the quick check is really quick actually:

sqlite> PRAGMA quick_check;
*** in database main ***
Page 729453: free space corruption

next: integrity_check;

actually very quick too:

sqlite> PRAGMA integrity_check;
*** in database main ***
Page 729453: free space corruption
row 733 missing from index ix_events_event_type_time_fired
row 733 missing from index ix_events_context_id
row 733 missing from index ix_events_time_fired
database disk image is malformed

and the last check:

PRAGMA foreign_key_check;
states|152065729|events|1

Any ideas what this all means? Is it totally broken or is their a chance to get this db back to a working state?

orange-assistant · March 5, 2022, 12:14pm

Are you sure the restore was complete while you were browsing the history? My history is always available while the restore takes place but once it’s finished it will mark the (just imported) db as corrupt and start with a fresh one.
You can look ad the suffix of the corrupt db to see the exact timestamp.

Istria1704 · March 7, 2022, 3:39pm

Not completely sure.
On a fresh HAOS VM, instead of making a new account, I choose to import the backup file. And when it was done doing that and the dashboard loaded, the graphs etc were populated with past data. So I assumed to import succeeded.

But its possible it was still working on it in the background and declared the database corrupt later. I don’t know how it works internally.

Blogshot · May 11, 2022, 11:18am

I also battled with these issues for a while.

Interestingly, #60874 was merged in December 2021, but as far as I can tell, it only appears in pre-release 2021.12.0b5 and not in any release changelogs since. I probably have missed it somewhere.

Still, I’ve encountered this issue in March and April 2022 so it seems to be still not fixed entirely. Until then, I’ve written a bash script that creates a backup every night via a conjob. So far, it seems to work fine and I wanted to share it with you guys.

The following script

Stops HA
Copies the database from the docker environment to the local filesystem
Performs an integrity check
Dumps, recreates and copies the database back to HA if the integrity check returns issues
Starts HA

For now, this seems to to the trick and I’ve not seen any corrupted databases for a few days now.

#!/bin/bash
BASEDIR="/opt/ha-backups/"
SOURCE="homeassistant:/config/home-assistant_v2.db"
TARGET="$BASEDIR$(date +%F\_%H%M%S)_home-assistant_v2.db"

echo "$(date +%F\_%H%M%S) Starting Backup" >> $BASEDIR"log.txt"
echo -n "$(date +%F\_%H%M%S) Shutting down HA: " >> $BASEDIR"log.txt"
ha core stop >> $BASEDIR"log.txt"

echo "$(date +%F\_%H%M%S) Copying database file" >> $BASEDIR"log.txt"
docker cp $SOURCE $TARGET

echo -n "$(date +%F\_%H%M%S) Checking for issues..." >> $BASEDIR"log.txt"
RESULT=$(sqlite3 $TARGET "pragma integrity_check")

if [ "$RESULT" != "ok" ]; then
  echo "issues were found. Fixing..." >> $BASEDIR"log.txt"
  echo
  sqlite3 $TARGET -cmd ".output tmp.sql" ".dump"
  rm $TARGET

  sqlite3 $TARGET -cmd ".read tmp.sql" ".quit"
  rm tmp.sql

  echo
  echo "$(date +%F\_%H%M%S) The backup was dumped and rebuilt. Now copying fixed backup back to HA." >> $BASEDIR"log.txt"
  docker cp $TARGET $SOURCE
else
  echo " no issues were found :)" >> $BASEDIR"log.txt"
fi

echo -n "$(date +%F\_%H%M%S) Starting HA: " >> $BASEDIR"log.txt"
ha core start >> $BASEDIR"log.txt"

My crontab for this looks like this:

# HomeAssistant Backup
0 4 * * * /opt/ha-backups/backup-and-fix

I hope this scripts are useful for other people as well until these issues are finally solved.

e-raser · July 20, 2022, 12:16am

Today I had to restore a Home Assistant (partial) backup - due to an irrepealable ShellyForHASS (shelly custom integration) issue. First time to restore a backup ever (since running HA productively). Well:

Database is fine after restore, passed all integrity checks, all data there.

So it seems like these mentioned fixes (How to make a full backup and restore with all the data as well? - #2 by tom_l) have been implemented/rolled out meanwhile:

The only negative side effect was the missing “config” SAMBA share:
It was available but not accessable, no content. Not via SAMBA clients, not for any addon.
Solution:

Restart SAMBA share addon:
→ now working for SAMBA clients and other addons, but: not for the host itself (HA OS) - see [*1] below. Therefore step 2:
Restart Host / HA OS
(maybe there’s also a magic remount command or sth to avoid a host reboot, did not investigate further)
→ Now also the host is able to access /config again.

That is certainly not a perfect “restore experience” but related to the database pretty smooth. Makes me a lot more confident to restore backups in the future.

Anyway, that SAMBA thing is something probably worth reporting.
Where to? To that addon repository and the HA OS repository? What do you experts think.

[*1]

maxym · July 21, 2022, 6:24pm

Maybe, or coincidentally there were no writes durring you backup creation. It deserves confirmation from someone who is confident about this improvement. I’m not aware it was mentioned in changelogs released during last half a year.