Fix corrupted / malformed SQLite database (home-assistant_v2.db)

Always great to come back to a topic started by oneself.

Had a foreign key constraint error resulting in some kind of “self protection” where the database copied itself to a home-assistant_v2.db.corrupt.2022-02-03T20:59:16.490886+00:00 file and recreated the home-assistant_v2.db from scratch - almost empty. EVERYTHING was gone, entities history of course and the Energy Dashboard was completely empty too - of course. What a shocking moment.

OK, I had two options:

  1. Short way: try to fix the foreign key constraint errors end restore consistency ==> potentially only loose some hours of data
  2. Long way: restore database from latest backup (nightly) ==> loose about 19 hours of data.

Long story short:

  • Option 2 is still the hard way. Step 3 of Fix corrupted / malformed SQLite database (home-assistant_v2.db) would probably work as it did for many many others already, but (fortunately) I didn’t need to go that path up to now as there’s always been another “corrupted/inconsistent, but not malformed” database file.

  • Finally HA devs are actively working on the core issue the SQLite database gets malformed during backup (because it’s a physical/file based backup during operation (HA still running and writing data to the db) - see 1.2 on How To Corrupt An SQLite Database File) which - still, in February 2022 - renders that file pretty useless in the first place (or for every normal user without fix knowledge).
    → Here we go, that’s where I put my bets for a few weeks meanwhile, closely watching every activity: https://github.com/home-assistant/core/pull/60874
    Needless to say a trustworthy backup mechanism which conserves a consistent database IS A MUST meanwhile (always was for history data addicted like me) because we now all have the energy dashboard and long-term statistics. All those great 2021 improvements/features rely on the database, so it’s not too early to finally make that groundwork imho. I think that should fit in 2022s year of “streamlining experiences”:

  • Ah yes, by the way: Option 1 was the path I went. Fast lane. Fixed the foreign key constraint issues (only 3 rows and I exactly knew why/what happened) in the corrupted database, shutdown HA, replaced the new/empty one by the fixed one, started up HA, and hello history, I missed you so much. Energy dashboard only calculates once a hour so it took until the next full hour to create the next hour graph.

To sum up:

3 and a 1/2 hours of unplanned and unnecessary restoration tasks and a few lessons learned later… I really hope not too many others will experience that “database malformed/corrupted, hell no what to do!” situation, won’t land here and won’t read this.

By the way, for me there are only two possible ways to go in terms of database (because as my HA grew I really really REALLY need to make sure my database is safe and backed up reliably - I did mention it, did I?):

  1. Either https://github.com/home-assistant/core/pull/60874 will finally make its way into a productive release so I can reestablish trust in homeasisstant container backups including the SQLite database.
  2. Or some experts at Migrating home assistant database from sqlite to mariadb - which is not yet a perfect but still the best migration guide out there! - will finally fix the last little bugs (energy dashboard related) for migrating from SQLite to MySQL so I can finally switch to another level of database management. Including a professional online editing (phpMyAdmin addon with logical backups/DB dumps instead of very simple SQLite web addon which doesn’t even provide a query history etc.), more control, more options, improved backup strategys etc…

Until that

  • I will do a ha core stop; cp /config/home-assistant_v2.db /backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).db; ha core start and copy that “working cause created offline” backup away from the HA server, just in case. I plan to do that at least once a month, probably before every major (monthly) HA update
  • I will also have a look into these logical backups
    • classical dump using sqlite3 /config/home-assistant_v2.db .dump > /backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).sql)
    • backup function sqlite3 /config/home-assistant_v2.db ".backup '/backup/home-assistant_v2_BU_$(date +%F\_%H%M%S).db'"

Maybe I can automate one of these backups to run it regularly to reduce HA downtime by running a physical backup. Will have a look at what I will get and if consistency is fine. But if it would be that easy I’m sure HA dev team would have implemented it as backup subtask for homeassistant container years ago. Sounds promising anyway:

It's different from regularly file copying, because it takes care of any users currently working on the database. There are proper locks set on the database, so the backup is done exclusively.
4 Likes