Lots of problems after restoring snapshot. What to do?

You could at least give the link to the original post cross-posted here. I didn't see it.
TBH, arguing about cross-posting and other formalities instead of supporting the issue reporter doesn't help HA development.

Also, there are mods who can merge duplicate posts if they decide to. I don't understand why you are adding an unnecessary side topic to the discussion.

You also have to consider that the history, while not important to you, might be crucial for others. In the end, that is what the history is designed for.

But the mere fact that the history doesn't survive a snapshot/restore is an indication that something is wrong. I suppose it might be a consequence of SQLite not being treated like a transactional RDBMS here, but if there is no way to solve that, it should at least be stressed clearly in the docs.


I offered help towards a solution in my post. Read it again. Maybe get someone to help you if you have trouble with the English language.

Why would it be crucial? It is only storing the states…

Can't those states be important?
For me they are important when tracing back issues or building automations. If I lose the history, I lose not only data but also the exact sample, which might be hard to reproduce.
I assume there are other people who might have different needs.

Please explain, because none of that makes much sense.

I could, but I have a feeling it's futile. These are MY data and it's pretty normal that I don't want to lose them. It doesn't matter what they contain.

Just imagine you are losing files from your PC. Would you expect someone to argue about the importance of the lost content, or would you rather get your computer fixed?

I store the data I want to keep long term in InfluxDB, so it does not matter if I sometimes have to delete my HA database.
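For anyone curious, a minimal sketch of what that could look like in configuration.yaml; the host, credentials, and entity IDs are placeholders, and the options may differ for your InfluxDB version, so check the InfluxDB integration docs:

```yaml
# configuration.yaml: forward selected states to InfluxDB 1.x
# (host, credentials and entity IDs below are placeholders)
influxdb:
  host: 192.168.1.10
  port: 8086
  database: home_assistant
  username: homeassistant
  password: !secret influxdb_password
  include:
    entities:
      - sensor.total_energy
      - sensor.living_room_temperature
```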

That works for you.
Keep in mind that some people are looking for a reliable tool, not something that "must be turned off and on to work" (regardless of whether it's free, open source, community driven, or any other dev model, in case you would like to mention that).

If the default database is considered unstable and unreliable, it should at least be stated in the documentation.
From a software development point of view, it's wrong when something fails, even if some users are OK with that.

Merry Christmas :slight_smile:


If it is so important to you, have you done anything about it?

You could have done this six times over by now instead of moaning here (which does nothing):


If you set up MariaDB in a separate container or on a separate system, data corruption is less likely. I've been running with an external MariaDB on my Synology for ~4 years and I've never lost my database when loading snapshots.
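Pointing the recorder at such an external server is a one-line change; a sketch, with host, credentials, and database name as placeholders:

```yaml
# configuration.yaml: recorder using an external MariaDB/MySQL server
recorder:
  db_url: mysql://homeassistant:YOUR_PASSWORD@192.168.1.20:3306/homeassistant?charset=utf8mb4
```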

That is, as you said, because such an external DB is not part of the backup created by the Supervisor, isn't it?

As a rule of thumb, no database should be backed up by copying files between filesystems while the DB is live. That usually leads to data inconsistency and therefore a damaged database.

HA devs ignored this fact by adding the DB files to a backup that is created while the system is up and running.
It should be resolved somehow (I don't know SQLite in detail, but other RDBMSes usually provide consistent backup mechanisms), or it should be communicated clearly that SQLite is not reliable for this use case (or maybe SQLite should not be backed up at all).
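For the record, SQLite does provide a consistent backup mechanism: its online backup API, which the sqlite3 command-line shell exposes as the .backup command. A minimal sketch to illustrate what a consistent copy would look like compared to a raw file copy (file names are placeholders):

```bash
# copies the database page by page, honouring SQLite's locking,
# so the result is consistent even while the DB is in use
sqlite3 home-assistant_v2.db ".backup 'home-assistant_v2.backup.db'"
```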

Anyway, SQLite was chosen by the devs as the out-of-the-box solution. That gives general users the impression that this part of the system and its related features are tested and as reliable as expected.

BTW, another root cause might lie in the use of an ORM, which maps data to objects. With inconsistent data, it likely ends with the inability to build complete objects, resulting in general failures. And I'm not even touching on the performance issues of ORMs.


Even though this is old, I would second it, especially with the new long-term statistics, energy dashboard, etc. If I always lose the whole history, especially in these areas, we might as well remove "month" and "year" from the selection.


:wink:

Referring to this: is there a feature request or GitHub issue (or whatever) to vote for, or to keep track of improvements (I would call it bug fixing) to this DB backup/restore process?

As stated previously in this thread, move your database outside of the configuration folder or exclude it from your backups. If your database integrity is important to you, move to the MariaDB addon and you'll avoid this issue altogether.
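In case it helps anyone: "moving it outside the configuration folder" is done via the recorder's db_url. A sketch, assuming a path under /media (any writable location outside /config should work):

```yaml
# configuration.yaml: keep the SQLite recorder DB outside /config so the
# backup does not copy it while it is being written to
recorder:
  db_url: sqlite:////media/recorder/home-assistant_v2.db
```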

Indeed there is. Finally. Now that I've decided to switch to MySQL (which brings "other" types of issues :smile:), I recently found

…which should at least make the SQLite-based recorder DB more robust (of course not in all cases, like power loss etc., but at least making it possible to safely restore a DB backup! Unbelievable, backups that can actually be restored, let's call it 2022 :stuck_out_tongue:)

Wait, putting it outside the config directory means… it's not part of any backup (default HA container backup), right? Not an improvement.

Regarding switching to MySQL: there are other things to take care of, right?
E.g.:

  1. When using the official MariaDB addon, it needs to be running before the HA container is ready, right? Thinking about HAOS starting up, what about timing issues etc.? It adds another layer of complexity (avoiding that complexity is the biggest advantage of the recommended/default SQLite database).

  2. What does a proper MySQL-based backup look like: using phpMyAdmin (addon) for a scheduled logical backup (a simple dump of all tables; see the mysqldump sketch after this list)? Or is there the same issue when running a container-based, and therefore physical, backup? I read about workarounds like "when doing a full backup, stop the MySQL addon for a moment, back up the container, and start the addon again afterwards".

  3. There still is no official, or at least reliably working, migration guide from SQLite to MySQL. The best community guide available (Migrating home assistant database from sqlite to mariadb - #68 by jr3us) struggles with issues related to the Energy Dashboard. It would be an easy five-minute finger exercise for an HA expert to look at why and finally produce a rock-solid migration guide, I guess. Currently we do some trial-and-error black-box testing there. :frowning:
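Regarding point 2, the usual logical backup is a mysqldump rather than phpMyAdmin. A hedged sketch, with host, user, and database name as assumptions (from inside HA the MariaDB addon is usually reachable under its own hostname, so adjust the -h value):

```bash
# --single-transaction gives a consistent dump of InnoDB tables
# without blocking Home Assistant's writes;
# -p prompts for the password, the last argument is the database name
mysqldump --single-transaction -h 127.0.0.1 -u homeassistant -p homeassistant > ha_recorder_dump.sql

# and to restore it later:
# mysql -h 127.0.0.1 -u homeassistant -p homeassistant < ha_recorder_dump.sql
```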

Yes, that's correct; however, you can swap to the MariaDB addon and back that up.

Supervisor already handles this…

Yes, because contrary to popular belief, you can't write to the database and back it up at the same time. That's why you get corruption with the current database while it's backing up. But no one stops Home Assistant before performing a backup, so you end up with this thread. Sure, Home Assistant could possibly stop all events from going to the database during a backup, but then everyone would be mad about the missing data during backups. It's a no-win situation, and the best option is to plan for it by handling it yourself.

Home Assistant is not a database management system; it happens to use a database to store the states. You're going to have to get your hands dirty if you want to migrate to a new database. The likelihood of any of these tools existing is extremely low. Companies that specialize in databases offer migrations as a service; they themselves do not have built-in tools that do them. They have software engineers migrate the tables with scripts. Home Assistant will most likely never have any of these tools. In fact, HA keeps restricting database types to reduce the overhead of dealing with what we currently have.


The longer you wait to solve this issue, the more upset you will become. Fix it yourself by doing one of the following:

  1. Try out the MariaDB addon. If you really want backups, build a backup schedule for it separately from HA by stopping the container, performing the container backup, and starting it back up. You'll lose data during backups.

  2. Move your database out of the config folder and back it up with a copy/paste whenever Home Assistant is off.

  3. Move your database to another server, like a NAS that has a built-in backup/restore system. This is probably the best option, as you won't lose data during HA backups and HA will continue to write to the database while they run. The backups of your drives are handled through images with the NAS's backup/restore system. I'm sure there could be corruption, but I would assume the NAS handles this properly. I've never had issues, FWIW.

No, unfortunately not. The background of my question yesterday was that I still see this on most restores. Yesterday I played around with it and yes, it is still (and, here, always) happening.

I didn't get this. These are workarounds (which then bring other effort), not a solution to the problem. Of course, more or less every bug or problem in the world can be mitigated by workarounds, but that doesn't change anything about the behaviour. :wink:

And no, unfortunately I'm technically not able to contribute a PR, and therefore I only asked whether there is a feature request or GitHub issue to vote for (I didn't find one via search) so that this might be improved/solved in the future.

But there have been solutions for this for years, without stopping anything: changelogs, caching, snapshots, etc. At least I thought that if I use the Supervisor backup, I get a backup/point-in-time snapshot that can be restored 1:1, since this is "out of the box" and I use the out-of-the-box DB, etc. A backup which is known to be impossible to restore 1:1 is not what I expected, especially not when one of the new main topics is long-term energy measurements etc., which are all gone with this approach.

Nevertheless, one question.

From your explanations and experience, does this include the 1:1 case from above, i.e. will the Supervisor backup back up MariaDB including the point-in-time state of the DB, with the ability to restore it 1:1 completely? If yes, why

If no, where is the benefit of this approach?

Everything self-built via workarounds will/would have other disadvantages compared to the expected 1:1 total backup, e.g. different jobs and different follow-up processes (moving the backup to secure places) that have to be aligned. Otherwise, when you restore later on, you will have data in the database that should not be there, or data that is missing.

To be clear: my data log is not so relevant that I couldn't live with a loss. But if there is a focus on long-term data and a backup is in place, it should work as a backup, with the possibility to restore it, including the data log.


This is not a bug. I already told you about the technical limitations. You're welcome to sit here and argue, but this will not change. The simple fact is: you run a corruption risk every time you back up that file while it contains a partial write.

Please show me these solutions that involve users using Home Assistant's built-in SQLite database. I can tell you the solution; however, you'd rather ignore my advice and argue with me that it's a "workaround".

I have no clue what you’re even asking here. None of this makes sense.

It backs up the MariaDB container, which contains the database. The container locks the database during a backup, which isn't possible with the provided default SQLite database. The SQLite database is a flat file that is essentially copy/pasted into your zip file, which leads to potential corruption. For extra safety, you can shut down the container during your backup if you like. Otherwise you can keep it running and trust the database lock.

I’ve already listed the benefits and you’ve chosen to argue/ignore them. I’m not sure how else I can make this clear:

You're less likely to run into data loss when using MariaDB, the official addon built and maintained by Home Assistant.

You have written

I only asked why I have to do it, when you now write (and I thought I had understood this before)

That was all I did not understand and still do not understand: why another backup schedule for it, separate from HA, when the Supervisor backup is doing this as well?

Where did I say that you needed another backup schedule? I said you can choose to make your own backup schedule with a shutdown of the container. You can't shut down HA while running a backup, hence you'd need a separate schedule that shuts down just the MariaDB container.
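A sketch of what such a separate schedule could look like as an automation, assuming the hassio.addon_stop / hassio.backup_full / hassio.addon_start services are available on your install and that the official addon's slug is core_mariadb (check both in Developer Tools > Services and on the addon page):

```yaml
# automations.yaml: nightly full backup with the MariaDB addon stopped first
- alias: "Nightly backup with MariaDB stopped"
  trigger:
    - platform: time
      at: "03:00:00"
  action:
    - service: hassio.addon_stop
      data:
        addon: core_mariadb
    # assumption: the backup service call blocks until the backup is finished;
    # add a wait/delay here if it returns immediately on your version
    - service: hassio.backup_full
      data:
        name: "nightly-mariadb-safe"
    - service: hassio.addon_start
      data:
        addon: core_mariadb
```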