Migrating the Home Assistant database from SQLite to MariaDB… ONLY if you are VERY familiar with database administration.

The database file is 1.8 GB and I have 3 GB of RAM.
The system runs under Synology VMM in a VM.

I could expand the memory there by a few GB, but then Home Assistant no longer starts. What do I have to do under Home Assistant OS to expand the memory?

Thanks a lot

Maybe just increasing swap would do it.
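On a generic Linux box that would be something like the sketch below; whether Home Assistant OS lets you add a swap file on the host like this is an assumption, so check the HAOS docs first:

# create and enable a 2 GB swap file (generic Linux; may not apply 1:1 to HAOS)
fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile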

@eddiewebb

There is a bug with the cost calculation. I did some extended tests and came to the conclusion that the old data stored for gas or energy doesn't work when you migrate: the cost tracker, no matter whether it's a fixed price or an entity, produces a very high output.

The way to solve that is to create a new gas or energy entity that will collect the data from now on, and import that into the energy dashboard; then it will be OK. This has nothing to do with the cost tracker entity.

So if you want to see the old data, just select the old entity/device that contains the old data and you can look back, of course…

There is no way to fix it, neither in the database nor manually through the GUI of HASS…

I do see some people having this issue, so that's why I'm reporting it.

The rest all works great: all entity data gets migrated correctly, without any data loss… so this migration works as expected!! except that it MAY cause the cost problem, but that only affects the cost tracker entity…

Regards


Hmm, that seems sufficient, but I agree with the swap recommendation. You can also try adding the batch flag (see above) to break the work into chunks. Also worth checking your provider's docs to see whether they have a process killer for VMs that exceed some memory/CPU burst limits. (I know I could never run Composer on DreamHost for that reason.)
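For reference, a sketch of what a chunked run could look like; I'm assuming the batch flag referred to above is sqlite3-to-mysql's --chunk option, so check sqlite3mysql --help on your installed version. The host, user, and database names below are placeholders:

# --chunk limits how many records are read/written per batch,
# which keeps memory usage lower on small hosts
sqlite3mysql \
--sqlite-file /config/home-assistant_v2.db \
--mysql-database homeassistant \
--mysql-host core-mariadb \
--mysql-user homeassistant \
--mysql-password 'YOURPASSWORD' \
--chunk 10000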

Thanks for your work and reporting.

I unfortunately don't have any cost data, but I'm hoping someone who does, and who has deeper knowledge of the schema, can identify the cause. I have done dozens of database migrations in my career; encoding and timestamps can be problematic in transfer, but otherwise data is data.

The library should use the encoding and collation required by HA, so that shouldn't be an issue. Timestamp problems would make things inaccurate or shifted, but not exponentially wrong.

So either something is not being transferred correctly (a bug in the sqlite3-to-mysql library), or HA has some bit of information that is either not stored in the DB (perhaps .storage/core.config_entries), or stored in the DB but without a proper relationship/constraint defined for the migration to respect.

I'm hoping someone who understands the schema and the energy dashboard a bit better, and who has the data to test with, can shed some light.

There is indeed a small difference after the conversion.

The cost device is affected in the statistics_short_term table;
see my post above.

The rest is all fine. Whoever uses a cost entity should be aware that it gets messed up, so you need to start using a new entity instead to fix that problem.

Anyway, +++++ for @eddiewebb: besides this little issue it works flawlessly! I've been running my second system for a week now :slight_smile:

Can we highlight that little (but very important) piece of information in the post here?

I was hoping to see a solution that makes it just work, without the need to fix things after the conversion.
Maybe someone with knowledge of the database schema and the energy dashboard has a "tweak" for the conversion process itself.

There is no need to fix anything; it's just that some of us have a problem with the cost entity, that's it.
It's a Home Assistant issue and not a migration issue, as the way last_reset works in that particular entity is really complicated.

Besides that, I have been testing it for some days now with no problems at all; you can convert safely.
Always make backups.

ALSO IMPORTANT: DURING THE CONVERSION, DEPENDING ON HOW BIG home-assistant_v2.db IS, THERE WILL BE NO DATA RECORDED. IF YOU DUPLICATE THE DB AND KEEP THE OLD ONE RUNNING, YOU CAN MANUALLY INSERT THE MISSING DATA AFTERWARDS, BUT ONLY IF YOU KNOW PHPMYADMIN!

I'll tell you a way that went totally fine for me, with tons of data in my database from almost 2 years.

So what I did:

  1. Use WinSCP and go into the Home Assistant config folder; there you'll see the home-assistant_v2.db file.
    Just right-click it and press Duplicate, and call the copy home-assistant_v3.db (or copy it in the terminal, see the sketch below).
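If you prefer to make the copy in the terminal instead of WinSCP, a simple copy does the same thing (assuming the default /config path):

# copy the live DB to a snapshot that the converter will read from
cp /config/home-assistant_v2.db /config/home-assistant_v3.db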

Do all of this in the terminal:

# Install prerequisites (Python 3 and pip on the Alpine-based shell)
apk add python3
apk add py3-pip
# Install the conversion tool
pip3 install sqlite3-to-mysql

Then run this; be sure you change the password, username, host, and DB file:

# convert the duplicated SQLite DB into the MariaDB add-on's database
sqlite3mysql \
--sqlite-file /config/home-assistant_v3.db \
--mysql-database homeassistant \
--mysql-host core-mariadb \
--mysql-user homeassistant \
--mysql-password 'YOURPASSWORD' \
--ignore-duplicate-keys

I changed this:

--sqlite-file /config/home-assistant_v2.db \  

to this:

 --sqlite-file /config/home-assistant_v3.db \

and did not use (that was my choice):

 --ignore-duplicate-keys

Always make a backup!!!
Just let it migrate. When it's ready, change the YAML so the recorder uses the new MariaDB URL, restart Home Assistant, and you're done. That's it :slight_smile:
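For reference, a minimal configuration.yaml sketch for pointing the recorder at the MariaDB add-on; the user, password, host, and database name are the ones from the command above, so adjust them to your setup:

recorder:
  db_url: mysql://homeassistant:YOURPASSWORD@core-mariadb/homeassistant?charset=utf8mb4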

For those of us who are using the cost entity: somehow there is a bug or mechanism that triggers last_reset and messes up the cost data.

The workaround for the cost entity is this:

  1. Create a new total cost entity and just import the old data; it doesn't matter if the old one was a static/fixed price or a price entity. Import it into the new total cost entity and it should work as expected. Just choose a day before, not the same day.

Using this code:
The IDs can be found with phpMyAdmin.
Be sure to change MYNEWID and OLDID and the date 2022-12-01.
thanks to @vingerha
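A small sketch of how to look those IDs up; the LIKE pattern is only an example, so search for your actual cost entity:

-- find the metadata_id values of the old and the new cost entity
SELECT id, statistic_id FROM statistics_meta WHERE statistic_id LIKE '%cost%';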

Run the code below; be sure you select the correct database and always make backups!

insert into statistics (created, start, state, sum, metadata_id) select created, start, 0, sum, MYNEWID from statistics where metadata_id = OLDID and start < '2022-12-01';

When the next energy update happens in the energy dashboard, it could show a negative value for that day. No worries, as the next day it calculates correctly, and you can manually correct the value of that day… all the old data is also visible as normal in the energy dashboard.

But remember, this is not a fix, just a temporary workaround until we know how last_reset works for that particular entity…

Again, all credits to @eddiewebb for making this conversion possible.


I do have one question regarding the new method.

Is the Home Assistant daemon stopped while the conversion is running? I haven't seen that 100% defined.

If it doesn't matter, please add that information to the step in your instructions that I linked to, and reference it at the beginning of your instructions. Or, if it needs to be shut down, that should be one of the first steps of the migration.

Also, make a note at the beginning about memory requirements and workarounds when dealing with a large Home Assistant DB.

My commentary: It seems like leaving HA running while doing the conversion would leave some holes in the data once migrated.

Regards


Just use my method: duplicate your home-assistant_v2.db, so the copy you convert no longer receives new data. You can also do it while recording, but that doesn't change anything; I did it on 3 different systems in 3 different ways, all with the same result.

Yes, there will be holes for the time it takes to convert; that's normal, so you will lose some data during the conversion, depending on how big your database is.
For me it took more than 2 hours, so yes, 2 hours of no EXTRA data… but I did duplicate the home-assistant_v2.db database, so if I want I can manually import the data I missed during the conversion; I just didn't care.

As for energy (for me at least: solar power, gas, or energy), it gets added anyway since it's a sum entity, but sensor data is not recorded…

Hello there, I think I found one of your big problems. I had the same problem: my energy monitor was going crazy. I used the Python script and transferred the DB with a second Home Assistant in VirtualBox. When everything was done, I uploaded the DB with MySQL Workbench. I was going crazy because the states table was correct and the data was plausible, but the statistics had the same issue. I set up a new database without any tables and let Home Assistant create them, to understand what the difference could be. I noticed that the structure was wrong: the uploaded database used FLOAT as the data type, while the original (created by Home Assistant) used DOUBLE, for example. I compared all table structures and corrected them, and now the new incoming data looks much better :wink:
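If you want to make the same correction in SQL, a sketch of the idea; exactly which columns the script created as FLOAT on your system is an assumption here, so compare against a schema freshly created by Home Assistant first:

-- example only: change columns the script created as FLOAT back to DOUBLE
ALTER TABLE statistics
  MODIFY `mean` DOUBLE,
  MODIFY `min` DOUBLE,
  MODIFY `max` DOUBLE,
  MODIFY `state` DOUBLE,
  MODIFY `sum` DOUBLE;
ALTER TABLE statistics_short_term
  MODIFY `mean` DOUBLE,
  MODIFY `min` DOUBLE,
  MODIFY `max` DOUBLE,
  MODIFY `state` DOUBLE,
  MODIFY `sum` DOUBLE;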


I am sorry for my bad english :wink:


AMAZING!!!

I checked the database which was created by the script, and you can see that the script was the problem. :smiley:


Don't use this structure :wink:


Is "the script" referring to this migration how-to, and if so, will it get updated?

Yes, I used this migration how-to. You can use it too, but before you change the recorder URL on your live system you'll have to check the data types and correct them. I made some screenshots for you with my "correct" structures.

Yes, the prices mentioned are completely off. I don’t use them, so I didn’t notice.

Edit: Just read about the incorrect database structure. @eddiewebb, any suggestions on what (if anything) we need to correct manually? It runs so far, but indeed only prices are affected.

For the prices, just use a total cost entity instead of a static price or an entity containing the price; then you won't have to correct anything.
I did some new tests and it worked as expected, so the problem has nothing to do with float values or not.

It's the implemented last_reset that makes the costs reset if you use a fixed price to track cost, or an entity that contains a price. Instead of that, just start using the total cost entity.

But many HASS users don't even use a cost entity at all, so you can just ignore my posts :stuck_out_tongue:

So basically you won’t have to correct anything if you don’t use a cost entity.

Regards

I did get an error this morning (haven't checked if I got more, but saw these after a restart):

Looks like these relate to the manual fixes mentioned above?

Unhandled database error while processing task CommitTask(): (MySQLdb.DataError) (1406, "Data too long for column 'event_type' at row 1") [SQL: INSERT INTO events (event_type, event_data, origin, origin_idx, time_fired, context_id, context_user_id, context_parent_id, data_id) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)] [parameters: ('recorder_5min_statistics_generated', None, None, 0, datetime.datetime(2022, 12, 13, 9, 30, 10, 480968, tzinfo=datetime.timezone.utc), '01GM5D3XNGB91RANM0XT1CGHRQ', None, None, None)] (Background on this error at: https://sqlalche.me/e/14/9h9h)
Unhandled database error while processing task CommitTask(): (MySQLdb.DataError) (1406, "Data too long for column 'event_type' at row 1") [SQL: INSERT INTO events (event_type, event_data, origin, origin_idx, time_fired, context_id, context_user_id, context_parent_id, data_id) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)] [parameters: ('recorder_5min_statistics_generated', None, None, 0, datetime.datetime(2022, 12, 13, 9, 35, 10, 427349, tzinfo=datetime.timezone.utc), '01GM5DD2JVKS40GFT2K9RQ38TE', None, None, None)] (Background on this error at: https://sqlalche.me/e/14/9h9h)
Unhandled database error while processing task CommitTask(): (MySQLdb.DataError) (1406, "Data too long for column 'event_type' at row 1") [SQL: INSERT INTO events (event_type, event_data, origin, origin_idx, time_fired, context_id, context_user_id, context_parent_id, data_id) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)] [parameters: ('recorder_5min_statistics_generated', None, None, 0, datetime.datetime(2022, 12, 13, 9, 40, 10, 465530, tzinfo=datetime.timezone.utc), '01GM5DP7K13H400BZ997D4AXE3', None, None, None)] (Background on this error at: https://sqlalche.me/e/14/9h9h)
Unhandled database error while processing task CommitTask(): (MySQLdb.DataError) (1406, "Data too long for column 'event_type' at row 1") [SQL: INSERT INTO events (event_type, event_data, origin, origin_idx, time_fired, context_id, context_user_id, context_parent_id, data_id) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)] [parameters: ('recorder_5min_statistics_generated', None, None, 0, datetime.datetime(2022, 12, 13, 9, 45, 10, 492551, tzinfo=datetime.timezone.utc), '01GM5DZCJWRZ033YA03ZYEAKFW', None, None, None)] (Background on this error at: https://sqlalche.me/e/14/9h9h)
Unhandled database error while processing task CommitTask(): (MySQLdb.DataError) (1406, "Data too long for column 'event_type' at row 1") [SQL: INSERT INTO events (event_type, event_data, origin, origin_idx, time_fired, context_id, context_user_id, context_parent_id, data_id) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)] [parameters: ('recorder_5min_statistics_generated', None, None, 0, datetime.datetime(2022, 12, 13, 9, 50, 10, 441131, tzinfo=datetime.timezone.utc), '01GM5E8HG9SPAQHDXGM0GNG8XY', None, None, None)] (Background on this error at: https://sqlalche.me/e/14/9h9h)

So I just changed the allowed length of the 'event_type' column from 32 to 64. That should resolve it, if I understand the error message properly.
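If you'd rather make that change in SQL than in phpMyAdmin, a sketch; 64 is what I understand a freshly created Home Assistant schema uses, so verify on your side first:

-- widen event_type so names like 'recorder_5min_statistics_generated' fit
ALTER TABLE events MODIFY event_type VARCHAR(64);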

You sir, are a hero!

Worked flawlessly. Just a minor edit suggestion: I was searching for how to stop recording. Can you add the steps 'Developer Tools' > 'Services' > search for 'Recorder: Disable' > press 'Call Service'?


Hi,

I have a 30 GB SQLite database and I'm trying to convert it to MariaDB on a 2 GB RPi 4. All attempts with "sqlite3mysql" result in "Killed" while it is transferring the events table.

Most likely it is a lack of memory, since the database is large.

Is it possible to run this script on an external server or another PC?
My idea would be to copy the SQLite DB to another machine, convert it to MariaDB there, and then import the new database into the RPi.
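Something like this is what I have in mind; host, port, and credentials below are placeholders, and I'm assuming the MariaDB port on the Pi can be reached from the other machine (and that the tool's --mysql-port option works this way, check sqlite3mysql --help):

# on the other machine, after copying home-assistant_v2.db over
pip3 install sqlite3-to-mysql
sqlite3mysql \
--sqlite-file ./home-assistant_v2.db \
--mysql-database homeassistant \
--mysql-host 192.168.1.50 \
--mysql-port 3306 \
--mysql-user homeassistant \
--mysql-password 'YOURPASSWORD'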

Thank you and happy new year!