PSA: 2024.7 recorder problems

Hope this is correct: Recorder still locking up even after update to 2024.8.0 · Issue #123348 · home-assistant/core · GitHub

For everyone who ran out of disk space during the original table rebuild, there is another fix coming.


Hello, I'm still having an issue with the db as well, but I'm unsure if it's the same problem.
I have deactivated the automatic recorder purge; when I do a manual purge with repack activated, I end up with a corrupted database and Home Assistant comes up with a clean db. Using SQLite here, current DB size ~13.5GB. Before deactivating the automatic purge it was ~6-8GB.

Is this the issue discussed in this thread, or is the issue here only that the recorder got stuck?

2024.8.1 will fix the issue where the rebuild wasn't retried if it could not complete the first time.

Please be sure to check your disk space before updating.

Thanks, now that I come to think of it, I was indeed initially confronted with a full disk after upgrading.

Is there a way to kick this rebuild off manually in its current state to test the procedure?

Here are the SQL statements for an OFFLINE migration. Be sure to make a backup first. This code is only safe to run with schema 43 or 44. A sketch of how to run it against an offline copy follows the statements.

- Do not run it on older databases (from before May 2024).
- Do not run it on a live system, it may destroy your data.
- Do not run this on anything other than SQLite.

BEGIN;
PRAGMA foreign_keys=OFF;
COMMIT;
BEGIN;
-- Create a replacement states table with the schema 43/44 column layout
CREATE TABLE states_temp_1720542205 (
	state_id INTEGER NOT NULL, 
	entity_id CHAR(0), 
	state VARCHAR(255), 
	attributes CHAR(0), 
	event_id SMALLINT, 
	last_changed CHAR(0), 
	last_changed_ts FLOAT, 
	last_reported_ts FLOAT, 
	last_updated CHAR(0), 
	last_updated_ts FLOAT, 
	old_state_id INTEGER, 
	attributes_id INTEGER, 
	context_id CHAR(0), 
	context_user_id CHAR(0), 
	context_parent_id CHAR(0), 
	origin_idx SMALLINT, 
	context_id_bin BLOB, 
	context_user_id_bin BLOB, 
	context_parent_id_bin BLOB, 
	metadata_id INTEGER, 
	PRIMARY KEY (state_id), 
	FOREIGN KEY(old_state_id) REFERENCES states (state_id), 
	FOREIGN KEY(attributes_id) REFERENCES state_attributes (attributes_id), 
	FOREIGN KEY(metadata_id) REFERENCES states_meta (metadata_id)
);
-- Copy every row from the old states table into the replacement table in a single statement
INSERT INTO states_temp_1720542205 SELECT state_id,entity_id,state,attributes,event_id,last_changed,last_changed_ts,last_reported_ts,last_updated,last_updated_ts,old_state_id,attributes_id,context_id,context_user_id,context_parent_id,origin_idx,context_id_bin,context_user_id_bin,context_parent_id_bin,metadata_id FROM states;
-- Swap the rebuilt table into place, then recreate the indexes
DROP TABLE states;
ALTER TABLE states_temp_1720542205 RENAME TO states;
CREATE INDEX ix_states_metadata_id_last_updated_ts ON states (metadata_id, last_updated_ts);
CREATE INDEX ix_states_last_updated_ts ON states (last_updated_ts);
CREATE INDEX ix_states_context_id_bin ON states (context_id_bin);
CREATE INDEX ix_states_attributes_id ON states (attributes_id);
CREATE INDEX ix_states_old_state_id ON states (old_state_id);
-- Check for broken foreign key references before committing
PRAGMA foreign_key_check;
COMMIT;
BEGIN;
PRAGMA foreign_keys=ON;
COMMIT;
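
If you want to try this against a copy of the database on another machine, a minimal sketch with the standard sqlite3 CLI might look like the following (the rebuild_states.sql filename is just an example; only ever run it on a copy, with Home Assistant stopped):

# Work on a copy, never on the live file
cp home-assistant_v2.db home-assistant_v2.db.work

# Save the statements above into rebuild_states.sql, then run them against the copy
sqlite3 home-assistant_v2.db.work < rebuild_states.sql

Once it finishes and the foreign_key_check step reports nothing, the copy can replace the original database while HA is stopped.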

I think I might have a db older than May 2024 (mine was created at least a year ago). But I doubt you mean creation date. Could you clarify please?


# sqlite home-assistant_v2.db
sqlite> select * from schema_changes;
1|33|2023-03-05 03:12:00.672507
2|34|2023-03-08 12:03:43.173219
3|35|2023-03-08 12:03:43.223264
4|36|2023-04-05 19:53:47.382529
5|37|2023-04-05 19:53:47.405016
6|38|2023-04-05 19:53:48.134406
7|39|2023-04-05 19:53:48.224035
8|40|2023-04-05 19:53:48.270669
9|41|2023-04-05 19:53:48.278902
10|42|2023-11-01 17:59:36.802397

Make sure the schema version in the last row is 43 or higher. In the above example, it's only 42.
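
If you only want the latest version, a one-liner like this should also work (assuming the schema_changes columns are change_id, schema_version, changed, which matches the output above):

sqlite3 home-assistant_v2.db "SELECT MAX(schema_version) FROM schema_changes;"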

gotcha, my db schema was at 44:

sqlite> select * from schema_changes;
...
30|42|2023-11-03 00:08:36.442936
31|43|2024-04-05 14:10:17.169585
32|44|2024-08-08 07:54:57.273545

FYI: I just finished executing the statements on a 25GB db. They took around 90 minutes to complete, mostly pegging one core at 100% for long stretches. After uploading the db back to HAOS, HA was able to purge successfully again.


The INSERT should be done in batches of 10,000 or 100,000 rows. The DROP TABLE should instead be a rename, with the old table dropped after the conversion is complete. How can I go about helping with this? I do database work but do not understand Git and such. Any pointers would be great.

The above is a quick solution for an offline migration. It is only a rough approximation of the 12-step SQLite table-rebuild procedure, and does not exactly match the implementation HA uses internally.

I appreciate your help, but those suggestions apply to a different use case than this one:

  • Batching would not make sense here: SQLite has no row-level locks, and doing the copy in batches would make it slower in most cases. That might make sense if this were MariaDB with InnoDB, but not in this case. Additionally, SQLite shows the INSERT INTO new_X SELECT ... FROM X syntax in its docs.
  • Dropping the temp table does not make sense either: you would lose all of the data, since that table has been renamed to become the new states table.

2024.8.1 is shipping very soon. Don’t forget to free up disk space before updating.

https://www.home-assistant.io/integrations/recorder/#disk-space-requirements
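
As a rough check from a shell with access to the config directory (paths differ per install type; /config is just the usual HAOS example):

# Database plus its -wal/-shm sidecar files
du -ch /config/home-assistant_v2.db*

# Free space on the filesystem holding them
df -h /config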

My DB is currently 8.9GB and I managed to free up 9.4GB (one day I’d like to understand where all of my 32GB of space allocated to the HA VM actually went…), but a manual purge after upgrading to 2024.8.1 still failed due to insufficient disk space:

Error executing query
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: database or disk is full

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/recorder.py", line 101, in session_scope
    yield session
  File "/usr/src/homeassistant/homeassistant/components/recorder/migration.py", line 2515, in rebuild_sqlite_table
    session.execute(text(f"DROP TABLE {orig_name}"))
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2351, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2245, in _execute_internal
    result = conn.execute(
             ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1418, in execute
    return meth(
           ^^^^^
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/sql/elements.py", line 515, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1640, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1986, in _exec_single_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2353, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database or disk is full
[SQL: DROP TABLE states]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
-rw-r--r--    1 root     root     8989638656 Aug 10 19:41 home-assistant_v2.db
-rw-r--r--    1 root     root      13860864 Aug 10 19:41 home-assistant_v2.db-shm
-rw-r--r--    1 root     root     7124443136 Aug 10 19:41 home-assistant_v2.db-wal

However, I still have 2+ GB of free space:

5f2e27d3aa9c:/data/homeassistant# df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                  34.7G     30.6G      2.6G  92% /

To be clear, the shell is running in the hassio_supervisor container, connected via the Portainer add-on.

After a reboot, I'm back to 9.3GB of available space:

-rw-r--r--    1 root     root     8989978624 Aug 10 19:46 home-assistant_v2.db
-rw-r--r--    1 root     root         32768 Aug 10 19:47 home-assistant_v2.db-shm
-rw-r--r--    1 root     root          8272 Aug 10 19:47 home-assistant_v2.db-wal

Filesystem                Size      Used Available Use% Mounted on
overlay                  34.7G     24.0G      9.3G  72% /

Your database is contained in the 3 files you listed above: 16127942656 bytes combined.

That means you could need as much as 15.02GiB free.

There is likely some duplication in the WAL which will go away after a repack, so you probably don’t need exactly that much, but 9.4GB isn’t likely to be enough.
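
The 15.02GiB figure is simply the three file sizes summed and converted to GiB, for example:

awk 'BEGIN { printf "%.2f GiB\n", (8989638656 + 13860864 + 7124443136) / 1024^3 }'
# 15.02 GiB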

I’ve edited my post - the 15GB combined size was a result of the purge attempt; after a reboot I’m back to the DB being (barely…) smaller than my available disk space.

If you have any pointers as to how to free up the rest of the 24GB I genuinely cannot find, I’ll be grateful. I have a different Linux system I can use to purge the DB offline but I honestly don’t know how to extract/import the DB from my HA instance (VM on Proxmox) without it actually running…

If the migration has completed successfully and you have enough free disk space, repack the database with the recorder.purge action (with repack turned on) to reduce its size. This will merge the WAL file back into the main DB file and clean up some of the duplication.
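
For reference, a purge-with-repack call from Developer Tools → Actions in YAML mode looks roughly like this (keep_days is optional and the value is only illustrative; older releases use service: instead of action: as the top-level key):

action: recorder.purge
data:
  keep_days: 10
  repack: true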

Cleaning up your disk usage beyond the database is outside this thread’s scope, and it would be better to start a new thread for that.

Will the migration try to run automatically after a reboot (once on 2024.8.1) or do I need to initiate it with the recorder.purge service action? Because that’s what I did and how I ended up with the error message.

EDIT: I see I’m (almost) out of space again so I assume it starts automatically.

Is there any way to tell if this is working? I ran a purge with repack enabled and added 40 GB to my VM, thinking I would be okay since the DB is only 19 GB, but it created a WAL file that is larger than the DB, and my storage is almost maxed out again. Did I not give it enough?


Or should I reboot, give more storage, and try again?

Edit: I left this going for about 2 hours, and now I'm down to 73% storage free. So I guess I cut it close, but it worked? The WAL file is still present. I'm guessing it will disappear on reboot?

Edit edit: it disappeared.


Final update from me. I changed my purge_keep_days back to the default by commenting it out of my .yaml file; it was set to 45 days. Now that 24 hours has ticked by and an auto purge has run, I am successfully back to normal - from 26 GB down to 4 GB.
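
For anyone following along, that setting lives under recorder: in configuration.yaml; the default is 10 days, so commenting the line out is effectively the same as:

recorder:
  purge_keep_days: 10  # default; the poster previously had 45 here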


Enabling debug logging for recorder.migration will give you logs detailing the schema upgrade/migration process:

logger:
  logs:
    homeassistant.components.recorder.migration: debug