I think I might have a db older than May 2024 (mine was created at least a year ago). But I doubt you mean creation date. Could you clarify please?
# sqlite3 home-assistant_v2.db
sqlite> select * from schema_changes;
1|33|2023-03-05 03:12:00.672507
2|34|2023-03-08 12:03:43.173219
3|35|2023-03-08 12:03:43.223264
4|36|2023-04-05 19:53:47.382529
5|37|2023-04-05 19:53:47.405016
6|38|2023-04-05 19:53:48.134406
7|39|2023-04-05 19:53:48.224035
8|40|2023-04-05 19:53:48.270669
9|41|2023-04-05 19:53:48.278902
10|42|2023-11-01 17:59:36.802397
Make sure the last row is 43 or higher. In the above example, it’s only 42.
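If you only want the latest version rather than the whole history, a query like this should work (assuming the version column is named schema_version, as in the standard recorder schema):
sqlite> SELECT MAX(schema_version) FROM schema_changes;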
Gotcha, my DB schema was at 44:
sqlite> select * from schema_changes;
...
30|42|2023-11-03 00:08:36.442936
31|43|2024-04-05 14:10:17.169585
32|44|2024-08-08 07:54:57.273545
FYI: I just finished executing the statements on a 25GB DB. They took around 90 minutes to complete, mostly pegging one core at 100% for long stretches. After uploading it back to HAOS, HA was able to purge successfully again.
The INSERT should be done in batches of 10,000 or 100,000 rows. The DROP TABLE should be a rename, with the old table dropped after the conversion is complete. How can I go about helping with this? I do database work but don’t understand Git and such. Any pointers would be great.
The above is a quick solution for an offline migration. It is only a rough approximation of the 12-step schema-change procedure from the SQLite documentation, and does not exactly match the implementation HA uses internally.
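For reference, a minimal sketch of that rebuild pattern; the column list here is a placeholder (pull the real definition with SELECT sql FROM sqlite_master WHERE name='states';), and HA’s internal rebuild does more than this:
sqlite> BEGIN TRANSACTION;
sqlite> CREATE TABLE states_new (state_id INTEGER PRIMARY KEY, state TEXT, last_updated_ts FLOAT);
sqlite> INSERT INTO states_new (state_id, state, last_updated_ts) SELECT state_id, state, last_updated_ts FROM states;
sqlite> DROP TABLE states;
sqlite> ALTER TABLE states_new RENAME TO states;
sqlite> COMMIT;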
I appreciate your help, but some suggestions don’t apply to this use case:
- Batching would not make sense in this case, as there are no row-level locks in SQLite, and doing it in batches would make it slower in most cases. That might make sense if this were MariaDB with InnoDB, but not here. Additionally, SQLite shows the INSERT INTO new_X SELECT ... FROM X syntax in its docs.
- Dropping the temp table does not make sense either, as you would lose all the data: it has been renamed to become the new states table.
2024.8.1 is shipping very soon. Don’t forget to free up disk space before updating.
https://www.home-assistant.io/integrations/recorder/#disk-space-requirements
My DB is currently 8.9GB. I managed to free up 9.4GB (one day I’d like to understand where all of my 32GB of space allocated to the HA VM actually went…), but a manual purge after upgrading to 2024.8.1 still failed due to insufficient disk space:
Error executing query
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
self.dialect.do_execute(
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
cursor.execute(statement, parameters)
sqlite3.OperationalError: database or disk is full
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/helpers/recorder.py", line 101, in session_scope
yield session
File "/usr/src/homeassistant/homeassistant/components/recorder/migration.py", line 2515, in rebuild_sqlite_table
session.execute(text(f"DROP TABLE {orig_name}"))
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2351, in execute
return self._execute_internal(
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2245, in _execute_internal
result = conn.execute(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1418, in execute
return meth(
^^^^^
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/sql/elements.py", line 515, in _execute_on_connection
return connection._execute_clauseelement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1640, in _execute_clauseelement
ret = self._execute_context(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
return self._exec_single_context(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1986, in _exec_single_context
self._handle_dbapi_exception(
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2353, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
self.dialect.do_execute(
File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database or disk is full
[SQL: DROP TABLE states]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
-rw-r--r-- 1 root root 8989638656 Aug 10 19:41 home-assistant_v2.db
-rw-r--r-- 1 root root 13860864 Aug 10 19:41 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 7124443136 Aug 10 19:41 home-assistant_v2.db-wal
However, I still have 2+ GB of free space:
5f2e27d3aa9c:/data/homeassistant# df -h
Filesystem Size Used Available Use% Mounted on
overlay 34.7G 30.6G 2.6G 92% /
To be clear, the shell is running in the hassio_supervisor container, connected via the Portainer add-on.
After a reboot, I’m back to 9.3GB of available space:
-rw-r--r-- 1 root root 8989978624 Aug 10 19:46 home-assistant_v2.db
-rw-r--r-- 1 root root 32768 Aug 10 19:47 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 8272 Aug 10 19:47 home-assistant_v2.db-wal
Filesystem Size Used Available Use% Mounted on
overlay 34.7G 24.0G 9.3G 72% /
Your database is contained in the 3 files you listed above, 16,127,942,656 bytes combined, which means you could need as much as 15.02 GiB free.
There is likely some duplication in the WAL which will go away after a repack, so you probably don’t need exactly that much, but 9.4GB isn’t likely to be enough.
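If you are instead working on a copy of the database offline, roughly the equivalent in the sqlite3 shell is a WAL checkpoint followed by a VACUUM; note that VACUUM temporarily needs free space on the order of the database size while it runs:
sqlite> PRAGMA wal_checkpoint(TRUNCATE);
sqlite> VACUUM;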
I’ve edited my post - the 15GB combined size was a result of the purge attempt, after a reboot I’m back to the DB being (barely…) smaller than my available disk space.
If you have any pointers as to how to free up the rest of the 24GB I genuinely cannot find, I’ll be grateful. I have a different Linux system I can use to purge the DB offline but I honestly don’t know how to extract/import the DB from my HA instance (VM on Proxmox) without it actually running…
If the migration has completed successfully and you have enough free disk space, repack the database with the recorder.purge action (with repack turned on) to reduce its size. This will merge the WAL file into the main DB file and clean up some of the duplication.
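A minimal sketch of calling that action from Developer Tools in YAML mode, using the post-2024.8 action: syntax (omit keep_days to keep your configured retention):
action: recorder.purge
data:
  repack: true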
Cleaning up your disk usage beyond the database is outside this thread’s scope, and it would be better to start a new thread for that.
Will the migration try to run automatically after a reboot (once on 2024.8.1) or do I need to initiate it with the recorder.purge service action? Because that’s what I did and how I ended up with the error message.
EDIT: I see I’m (almost) out of space again so I assume it starts automatically.
Is there any way to tell if this is working? I ran a purge with repack enabled, and gave my VM an extra 40 GB thinking I would be okay since the DB is only 19 GB, but it created a WAL file that is larger than the DB, and my storage is almost maxed out again. Did I not give it enough?
Or should I reboot, give it more storage, and try again?
Edit: I left this going for about 2 hours, and now I’m down to 73% storage free. So I guess I cut it close, but it worked? The WAL file is still present; I’m guessing it will disappear on reboot?
Edit edit: it disappeared.
Final update from me: back to normal. I changed my purge_keep_days back to the default by commenting it out of my YAML file (it was set to 45 days). Now that 24 hours have ticked by and an auto purge has run, I am successfully back to normal: from 26 GB down to 4 GB.
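For anyone looking for that option, a rough sketch of the relevant configuration.yaml section; the 45-day value is what is being commented out here, and HA’s documented default when the option is omitted is 10 days:
recorder:
  # purge_keep_days: 45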
Enabling debug logging for recorder.migration will give you logs detailing the schema upgrade/migration process:
logger:
  logs:
    homeassistant.components.recorder.migration: debug
On my system, the backup directory in homeassistant took up 20GB; you might want to check it.
Another method is to run du -sh * in the root directory to see how much space each folder takes, and drill down from there.
This 2024.8.* core update ruined my entire weekend.
My issue must be #2 as outlined in the OP’s edit. I wiped my system, started clean, and did a million things to try to resolve these stupid issues.
RPi 4, Power over Ethernet, external SSD with 500GB of space, running HAOS, thousands of entities, local video feeds over RTSP… and yes, a lot of custom integrations. It ran solid for well over a year without issue. Then, since upgrading, my host just randomly started becoming unresponsive multiple times per day, and the only way to get it back was pulling the Ethernet cord to reboot.
Insanely frustrating. It seemed like as long as I didn’t touch anything in the UI, the host would stay alive for quite a while (at least a few hours). But once I started moving around the UI, pressing a button on one of my dashboards, etc., it was seconds or minutes before I got the stupid “Connection lost” toast and the whole process started over again. I went full “burn it down” mode and deleted my database files and stored images, filtered a ton of entities out of the recorder… anything I could think of to reduce load on the system and stop it from hanging. Nothing.
I ended up using the terminal to roll back to 2024.7.1 after fighting with it all weekend, and whaddya know — zero hangs.
Works completely fine now, though I had to go and fix the “Action” vs “Service” syntax across a bunch of automations.
I’m going to give it a lot of time before I open up this core update can of worms again. Maybe I’ll get forced into getting a proper NUC to handle the poor optimization.
Ugh. What a waste of a weekend.
2024.8 caused one automation to “go nuts” for me. Its trigger is timer.cancelled, and I got an error that it had been run 4096 times at once by this trigger (then the system crashed). This very automation had run perfectly for over 2 years. Now I’ve had to change the trigger…
Very strange…
This 2024.8 upgrade is such a shitshow. HELP
My home-assistant_v2.db is 7.1GB due to this bug. Today I started the upgrade and it seems I ran out of space at some point. I was looking for a way to re-trigger the upgrade process, but there doesn’t seem to be one.
So I went back and restored my backup. Lost ten hours of data.
I saw the PSA on disk space and generously added 20GB, more than twice as much as should be needed. Now I have been running the upgrade again for over 5 hours, and home-assistant_v2.db-wal keeps growing out of bounds. It’s already at 10GB.
I’m getting errors about the system load being too high. The logs are filled with timeouts, connection errors, etc.
I’m already deleting media, but I’m afraid I’ll run out of space when there’s nothing more to delete. And I can’t add storage without a reboot.
Where can I see the status/progress of this database upgrade?
How do I know it has finished successfully?
I don’t use the internal backup at all; I run regular backups of the entire VM, plus snapshots before any major version upgrade. In the end I had to free up 14GB of disk space for a 9GB DB to successfully get back to the ~3.5GB it was prior to the bug introduced in 2024.7.0.
To clarify:
Doing the inserts in batches with commits would save on lock space. I agree it would slow down smaller databases, but it would actually speed up larger ones because it processes smaller commits. A more sophisticated approach would be to count the rows and, based on a threshold, do it in one pass or in batches; see the sketch below.
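A rough sketch of the batching pattern being described, assuming a numeric state_id key to range over (the table names, column name, and 100,000-row batch size are placeholders):
sqlite> BEGIN;
sqlite> INSERT INTO states_new SELECT * FROM states WHERE state_id > 0 AND state_id <= 100000;
sqlite> COMMIT;
sqlite> -- repeat with the next id range until MAX(state_id) is covered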
I should have been clearer: I would not drop the states table until the end. By renaming it, I would keep it until I was certain the new table was complete. It is just a pattern that I follow.
I have always thought I could not help with Home Assistant development. That is because I work mostly in back ends with tools like SQL. If my skills can be of any help, I am happy to jump in.