Merge old database with new (statistics)

Hi everyone,

I recently had a corrupted database which I managed to recover.
However, in the meantime the statistics etc. move on and the old and new database are out of sync.
This is because when Home Assistant recreated the database after it was corrupted all statistic entities got new ID’s.

Is there an existing script or something that can merge the old longterm statistics with the new one?

Basically the following would need to be done:

  • Create a new database with all the regular entities and history of the current active (new) database.
  • First add all longterm statistics from the recovered (old) database.
  • Record by records add the long term statistics from the active (new) database.
  • It will nee to create entities on the fly if they do not exist in the old database and also needs to parse the record ID’s to the old database values.

Any ideas/suggestions are very welcome!

For anyone interested in such a feature:

I’ve created a PHP based script which you can use in any local or remote server you want.

What is does is basically the following:

  1. Copy the new database (non-corrupted)
  2. Empty the statistics tables (long and short term)
  3. Copy the old statistics
  4. Link the entities meta ID’s in a separate table to the old and new ID’s are known.
  5. Append all new entities metadata that didn’t exist in the old (corrupted) database.
  6. Create a new sum base value based on the last old and first new values. All new sum values will be calculated with this value.
  7. Iterate over X number of records for both the short and long term statistics. In case it finds a summable entity it will recalculate the sum total based on the older base value.

Requirements:

  • A (local) PHP server to run this script on.
  • Old and new .db files needs to be in the root of this script in order to be found (might change this later)
  • PDO SQLite needs to be installed: PHP: SQLite (PDO) - Manual
  • I’ve added ini_set('precision', 16); for improved float calculations. If your system doesn’t allow ini_set you’ll have to modify the PHP config yourself.

I might enhance this script in the future to support:

  • Date selection for when to switch to the new DB (in case it’s not a currupted DB issue but something else).
  • Support for other SQL engines

Repo:

3 Likes

@jory
Hi Jory, this is EXACTLY the tool that I need.
Unfortunately i’m unable to get it working. I get the following errors:

init_stats SQL Exec: SQLSTATE[HY000]: General error: 1 no such table: db_old.statistics INSERT INTO main.statistics SELECT * FROM db_old.statistics

init_stats SQL Exec: SQLSTATE[HY000]: General error: 14 unable to open database: home-assistant_v2_old.db ATTACH home-assistant_v2_old.db as db_old false

Maybe you can help me in the right direction?

This looks like the old db file isn’t loading correctly. Maybe file access limitations?
Feel free to open an issue in GH to further check discuss this if you can’t find the problem.

Thanks, I recently had the same problem, so this script will be of great use for me!

However, I am not very familiar with php and web stuff, could you please provide some example on how to run the script?

Can it be run directly from command line in Linux/WSL or do I need to install a web server?

Any help appreciated!

Ok, I managed to get it to work in the end, thanks for a great script!

Here is some steps that I did:

  • Install Apache2 web server on my raspberry (running Raspberry Pi OS/Linux)
  • Put the files from Github in the www folder (/var/www/html) in my case.
  • Put the database files that are going to be merged in the same folder
  • Be sure to enable php (I did sudo apt install libapache2-mod-php)
  • Be sure to install PDO SQLite (sudo apt install php-sqlite3)
  • Navigate to the IP-address of the device where the web server is running (port 80, not the home assistant port), and the index.php will help you :slight_smile:
  • Let the script run for a while (may take pretty long time). If the script does not produce any output on the web page (may take a while), check the apache error logs in /var/log/apache2/error.log.

When completed, a new database file is created.
Stop home assistant and move the new file to your home assistant config directory and name it home-assistant_v2.db. (It may be a good idea to rename and keep the old database in case something goes wrong)

I am not sure if these are all the steps needed, but hopefully it is useful to someone and may point in the right direction.

Hi @maral
You’ve already figured it out indeed :slight_smile: This is the way to use this script; with a webserver.
Thanks for sharing!
Cheers, Jory

Same issue as @Butei here, did u find a way to fix it?

SQL Exec: SQLSTATE[HY000]: General error: 1 no such table: db_old.statistics

It does seem to load the Db file as it does load all entities from that old db if u select it. :confused:
I manually opened the Db and did it does exists and integrity check passes.

Are you on Windows? → make sure that you gave the database files all the rights (everyone full access)

@jory Thank you for this. You’re a real life-saver (or energy data saver to be more correct). Due to a power failure my energy data was all gone, but now it’s back. Only strange thing is that my solar energy seems to be a factor 1000 lower that it should have been. Guess it somehow converted kWh to Wh. For consumed energy this doesn’t seem to be the case.

Ah, never mind. It’s more than what I had, and consumption is more what I’m interested in in the first place.

Hi @lpwevers

It could be that your device registered under a new unit since HA fully initialized as new.
You could of course first run a script that converts these stats :slight_smile:

Cheers, Jory

Thanks. I’ll see what I can do. Should be a simple update with everything before the crash and multiply by 1000. (I think the merge change kWh to just Wh). Should be not to hard to fix.

I don’t know if I was using the tool incorrectly but the data was incorrect/invalid when I used it and it took a very long time to complete with a 4GB database…

So I created my own tool which is cross-platform and works with other databases (uses .net ormlite) not just sqlite. It worked to merge my old database with the new one that HA created after it marked the old one as ‘corrupted’. The tool will recalculate and copy statistics from the new db to the old db, and optionally copy it all back to the new db so both databases are the same. It only works on the statistics tables right now though. Hopefully it is useful to someone.

2 Likes

Thanks for sharing! Looks like a nice alternative for users that aren’t able to setup a PHP server.

Wow CrypticKlippo,

This is amazing!! My home-assistant_v2.db was corrupted, newly created, and my historical energydata was corrupted then.

You’re tool copied all the data from new.db to old.db. I’ve stopped the HA container, renamed old.db to home-assistant_v2.db and started the HA container again, and voila, it works.

Many thanks!!

Hi Jory, first of all thank you for your effort for making this and thank you @maral for making it easier to set it up for people like me ( without any knowledge of PHP and databases ) unfortunately I didn’t manage to get it to work and hopefully someone can help me out. i followed al the intructions and after i clicked run i get the error:

“SQL Exec: SQLSTATE[23000]: Integrity constraint violation: 19 UNIQUE constraint failed: statistics.metadata_id, statistics.start_ts”

@jory I used your tool and it still works.
The only thing I noticed is that all data from the corrupt db is now considered “long-term statistics”, but that’s not a big problem.

I used Docker to use your tool and created a fork of your repo for it.
If anyone wants to use it here is the fork:

2 Likes

Thanks @arough ! Docker support is great.
I’ve made a PR and will test it locally as well. Feel to add commits or docs descriptions if needed before merge.

@arough Thanks for your contributions! I’ve merged it into the main repo. :+1:

I’m trying this in Windows 11 but I can’t get the databases merged.
Which databases should be selected where, is the following correct?
DB OLD = corrupt database
DB NEW = current database
DB CONTINUE = empty database