Are HA Backups useless? šŸš® (sqlite db always corrupted)

I use the mariadb add on and the the Google drive back up add on. I stop the DB prior to back up and this letā€™s you get a backup that is not corrupt.

2 Likes

Do not store anything you care about in SQLite. Ever.

2 Likes

This is mostly due to an open transaction while the backup was taken.
In most cases this can be repaired, by dumping and reimporting the database. Plenty of posts here how to do this. I never experienced any problems with this but there might be some inconsistencies.
Apart from that I remember reading here that the team is working on making the database backup more reliable.

1 Like

Thatā€™s not fair. When used properly, Sqlite is perfectly robust and can be used for mission-critical data. Sure, other options can be more powerful ā€” if you want to be DBA. Otherwise, youā€™re adding complexity and probably not actually getting much real benefit (especially if youā€™re running it on the same system). At best, youā€™re moving the backup problem to a new server.

The best way to safely back up sqlite databases Iā€™m aware of ā€” but let me state right away that Iā€™m definitely not a DBA! ā€” is to run VACUUM INTO before you do backups, and back up that file instead of the main one. (Or, disk space being cheapā€¦ in addition to it.)

Hmm; in fact, this has me considering moving my home-assistant_v2.db to an in-memory database, and using vacuum-into to dump that periodically.

1 Like

You mimd sharing a link? Linke described above I was already runing sqlite3 .recover many times and at the very moment pragma checks are running already for hours

I tried to restore all the backups on two different machines and 4 different disks in total. Result is always the same and resulting in a ā€œdiscardā€ db once restore is finish despite showing proper history data while the restore takes place.

No one uses SQLite for mission critical data. That would be insane.

Either way, I would assume that most people who have dealt with HA backups, or HA in general, have seen DB corruption. While it might be able to be fixed, for the most folks the data just doesnā€™t really matter that much. I assume thatā€™s why the devs havenā€™t gone to something more robust.

Personally, Iā€™ve moved it all to MariaDB.

2 Likes

Thatā€™s quite the confident assertion, but you are wrong, both in whether it happens and whether it is ā€œsaneā€. Iā€™m not going to escalate on the name calling ā€” I think itā€™s inappropriate.

But I will say: for every case where Iā€™ve seen someone using Sqlite where a different database server would be more appropriate, Iā€™ve seen a hundred cases where someone has installed mysql, mariadb, or postgresql with minimal configuration and zero administrative attention or care, and those are at least as troubling. (Especially but not only on the topic at hand ā€” reliable backups.)

Apologies if you took that as a personal attack. It wasnā€™t.

No normal IT person is going to run mission critical data on SQLite. If they do, they wonā€™t be in IT for long. If you disagree, then we will just have to leave it there. Iā€™ve no need to convince you and itā€™s probably not that constructive for this thread anyway.

My suggestion for OP is still to install MariaDB.

I didnā€™t take it as a personal attack, but I donā€™t think itā€™s constructive to call people ā€œinsaneā€. And your other assertions continue to just be factually wrong.

To stay to the topic, if you install MariaDB, you will still need to do something about backups other than just back up or snapshot the filesystem where the on-disk database files live, or you will have similar problems.

At no point did I call anyone insane. :wink:

Thankā€™s but the dump didnā€™t help. It results in a .db file with 0 bytes :persevere:

If you run sqlite3 ./home-assistant_v2.db and then type .dump at the interactive prompt, what do you get?

What about .recover instead?

SQLite is the most widely used database engine around, and it is often and correctly used for mission critical data. I think youā€™d be amazed how many products and devices are running a SQLite backend. For example, SQLite is built into the Android SDK so almost all data storage in Android is in SQLite DBs.

The real problem is that HA absolutely abuses SQLite and every other database engine by storing every state change as VARCHAR(255) with an attributes field as TEXT (seriously?!) containing a JSON string of all state attributes. This is why HA databases bloat to multiple gigabytes when storing 25MB worth of data. Until this is fixed, the database will always be a ticking time bomb that the user must actively manage. Combine this with SD cards and the problem is amplified 100x.

The HA backup (I am assuming here, as I donā€™t use Supervised/HASIO/HAS OS, etc.) makes useless backups of the SQLite DB because it doesnā€™t take a snapshot. It just copies the file. Leaving transactions open (does HA even use transactions? I donā€™t think so), commits open, and half written data.

I fully agree with your statement though with a minor edit: No one should use the Home Assistant database to store data you actually want. That would be insane. :smiley:

8 Likes

A console full of

INSERT INTO events VALUES(...

and the last 5 lines:

CREATE INDEX ix_statistics_short_term_metadata_id ON statistics_short_term (metadata_id);
CREATE INDEX ix_statistics_short_term_start ON statistics_short_term (start);
CREATE UNIQUE INDEX ix_statistics_statistic_id_start ON statistics (metadata_id, start);
CREATE UNIQUE INDEX ix_statistics_short_term_statistic_id_start ON statistics_short_term (metadata_id, start);
ROLLBACK; -- due to errors

I was running the .recover before (non interactive/verbose) couple of times already and it actually spits out a new db file which is smaller than the original but still discard by ha once restore is complete.

I even did try to recover a recovered db (and did one more recover on that), the db file getā€™s smaller and smaller but ha still doesnā€™t want thatā€¦

When running .recover in the interactive prompt the last 5 lines are:

CREATE INDEX ix_statistics_short_term_start ON statistics_short_term (start);
CREATE UNIQUE INDEX ix_statistics_statistic_id_start ON statistics (metadata_id, start);
CREATE UNIQUE INDEX ix_statistics_short_term_statistic_id_start ON statistics_short_term (metadata_id, start);
PRAGMA writable_schema = off;
COMMIT;

I also did some PRAGMA checks but donā€™t have an idea what the output means:

SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> PRAGMA quick_check;
*** in database main ***
Page 729453: free space corruption
sqlite> PRAGMA integrity_check;
*** in database main ***
Page 729453: free space corruption
row 733 missing from index ix_events_event_type_time_fired
row 733 missing from index ix_events_context_id
row 733 missing from index ix_events_time_fired
database disk image is malformed
sqlite> PRAGMA foreign_key_check;
states|152065729|events|1

Itā€™s also the reason why history accesses are so unbelievably slow. The entire recorder component will have to be rewritten from scratch at some point.

1 Like

+1 for swapping to the Maria DB. I also donā€™t include the DB in my backups, since itā€™s around 5.1GB. Since swapping to the Maria DB I havenā€™t had any issues at all.

Except development is too busy creating a media player for some reason. :wink:

(I hope anyone doesnā€™t take that as more than it was meant to beā€¦slight ribbing. I appreciate everything the HA developers do with the exception of the database/recorder component).

1 Like

Open source development isnā€™t interchangeable ā€” people work on what theyā€™re interested in. I get that youā€™re joking, butā€¦ if people want this better, the best way to do it is to start actually making it better.

Thatā€™s easier said than done. The recorder is a very central part of HA and has inter-dependencies into several other parts. Changing it would be a very significant amount of work and requires indepth knowledge of the HA internals. Add to this a very ā€˜closed-inner-circleā€™ BDFL approach of the core developers, where it is extremely difficult to get anything approved into core that would change a fundamental part of HA as an outsider, regardless of how good it is technically. There are so many hoops they make you jump through, and thereā€™s no guarantee they wonā€™t just throw it out at the end and you wasted all your time.

Realistically, something like a completely new recorder component would be impossible to get approved into core unless one of the main devs does it.

4 Likes