I’ve been running my Home Assistant database on PostgreSQL for a while due to issues with the Recorder’s SQLite database. However, since the Recorder has seen improvements (like long-term statistics), I decided to migrate back to SQLite. Additionally, my PostgreSQL instance ran on a separate server, which added complexity, and I’d like the option to turn that server off occasionally without impacting my Home Assistant setup.
I couldn’t find a straightforward guide on how to migrate data back to SQLite, so I decided to try my luck and figure it out myself. I’m sharing my process here to help others who might be in a similar situation. This is a rough guide, but it should provide a good starting point!
Steps I Followed:
- Clean up data in PostgreSQL: Reduce the amount of data to migrate; less data means a faster migration. This is the only step that actually deletes anything, so make backups as needed (see the backup sketch after this list).
- Export data from PostgreSQL to CSV: Ensure the column order matches the SQLite schema and apply the necessary transformations (such as datetime and boolean conversions) during export.
- Prepare a fresh home-assistant_v2.db database: Start with a clean slate for importing the data.
- Import the CSV files into SQLite: Use the SQLite command-line tool to import each CSV file into the corresponding table.
- Fix null fields: CSV files can’t represent null values, so empty strings need to be replaced with nulls after the import.
- Finish up and enjoy: Verify everything is working correctly with your newly migrated SQLite database.
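Since the cleanup step is the destructive one, I recommend taking a full dump of the PostgreSQL database first. A minimal sketch, using the same connection details as the export script further down; adjust to your setup:
# Full backup in PostgreSQL's custom format; individual tables can be
# restored later with pg_restore if anything goes wrong.
pg_dump -h postgres.home -U homeassistant -d homeassistant -Fc \
  -f homeassistant-$(date +%F).dump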
Detailed Process
My PostgreSQL database was quite large, around 80 GB. I had set my Recorder retention to 365 days, which kept a lot of data around. To reduce this, I started by cleaning up the states table, setting the Recorder to purge data older than 10 days. Despite several days of waiting (and manual invocations of the purge service), I still had hundreds of millions of rows, so I needed another approach.
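For reference, this is the purge call I kept invoking (Developer Tools → Services), assuming nothing beyond the standard Recorder integration:
# Service call: purge Recorder data older than 10 days
service: recorder.purge
data:
  keep_days: 10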
Eventually, I decided to rename my states table to states_old and create a new states table with the same structure. Here’s why:
- Figure out what to delete: Depending on your data needs, decide how much you can safely delete.
- Rename the table: Since I only needed to keep about 12% of my data, it was faster to insert the rows I wanted into a new table than to delete rows from the old one.
- Protip, adjust sequence numbers: If, like me, you get sidetracked and need to keep running on PostgreSQL for a while, set the sequence for new rows higher than your current maximum state_id. This ensures Home Assistant continues functioning without key conflicts (see the sketch after this list).
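Here’s a minimal sketch of that table swap in PostgreSQL. The 10-day cutoff, the use of last_updated_ts for filtering, and the sequence name states_state_id_seq are assumptions from my setup; verify them against your own schema before running anything:
-- Move the bloated table out of the way and recreate an empty copy
-- with the same columns, defaults, and indexes.
ALTER TABLE states RENAME TO states_old;
CREATE TABLE states (LIKE states_old INCLUDING ALL);
-- Copy over only the rows worth keeping (here: the last 10 days).
INSERT INTO states
SELECT * FROM states_old
WHERE last_updated_ts > EXTRACT(EPOCH FROM now() - INTERVAL '10 days');
-- If Home Assistant keeps writing to PostgreSQL for a while, bump the
-- sequence past the highest existing state_id to avoid key conflicts.
SELECT setval('states_state_id_seq', (SELECT MAX(state_id) FROM states_old));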
My Script
Below is the script I used. It’s more of a series of notes than a script you can run directly. You can copy-paste these commands to handle the export from PostgreSQL to CSV, making sure to apply the necessary data transformations. Note that the column layout should match your SQLite database.
After exporting, get a fresh home-assistant_v2.db from Home Assistant. I did this by reconfiguring my Recorder to use the built-in database, restarting Home Assistant, and waiting for it to create a new database file. Then, I stopped Home Assistant again to perform the import.
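For reference, pointing the Recorder back at the built-in database just means removing (or commenting out) the db_url option; the connection string below is my old PostgreSQL one:
# configuration.yaml
recorder:
  # With db_url commented out, Recorder falls back to SQLite and creates
  # home-assistant_v2.db in the config directory on the next start.
  # db_url: postgresql://homeassistant:homeassistant@postgres.home/homeassistant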
Export Data from PostgreSQL
Adjust credentials and other details as necessary.
# Set PostgreSQL credentials and server details.
# PGPASSWORD must be exported so psql can authenticate without prompting.
PGHOST="postgres.home"
PGDATABASE="homeassistant"
PGUSER="homeassistant"
export PGPASSWORD="homeassistant"
# Export data to CSV files with column ordering, date formatting, and boolean conversion
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT data_id, hash, shared_data FROM event_data) TO 'event_data.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT event_type_id, event_type FROM event_types) TO 'event_types.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT attributes_id, hash, shared_attrs FROM state_attributes) TO 'state_attributes.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT metadata_id, entity_id FROM states_meta) TO 'states_meta.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT id, statistic_id, source, unit_of_measurement, CASE WHEN has_mean THEN 1 ELSE 0 END as has_mean, CASE WHEN has_sum THEN 1 ELSE 0 END as has_sum, name FROM statistics_meta) TO 'statistics_meta.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT run_id, (start AT TIME ZONE 'UTC')::timestamp as start, (\"end\" AT TIME ZONE 'UTC')::timestamp as \"end\", CASE WHEN closed_incorrect THEN 1 ELSE 0 END as closed_incorrect, (created AT TIME ZONE 'UTC')::timestamp as created FROM recorder_runs) TO 'recorder_runs.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT migration_id, version FROM migration_changes) TO 'migration_changes.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT change_id, schema_version, (changed AT TIME ZONE 'UTC')::timestamp as changed FROM schema_changes) TO 'schema_changes.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT run_id, (start AT TIME ZONE 'UTC')::timestamp as start FROM statistics_runs) TO 'statistics_runs.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT id, (created AT TIME ZONE 'UTC')::timestamp as created, created_ts, metadata_id, (start AT TIME ZONE 'UTC')::timestamp as start, start_ts, mean, min, max, (last_reset AT TIME ZONE 'UTC')::timestamp as last_reset, last_reset_ts, state, sum FROM statistics) TO 'statistics.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT id, (created AT TIME ZONE 'UTC')::timestamp as created, created_ts, metadata_id, (start AT TIME ZONE 'UTC')::timestamp as start, start_ts, mean, min, max, (last_reset AT TIME ZONE 'UTC')::timestamp as last_reset, last_reset_ts, state, sum FROM statistics_short_term) TO 'statistics_short_term.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT event_id, event_type, event_data, origin, origin_idx, (time_fired AT TIME ZONE 'UTC')::timestamp as time_fired, time_fired_ts, context_id, context_user_id, context_parent_id, data_id, context_id_bin, context_user_id_bin, context_parent_id_bin, event_type_id FROM events) TO 'events.csv' CSV HEADER"
psql -h $PGHOST -U $PGUSER -d $PGDATABASE -c "\COPY (SELECT state_id, entity_id, state, attributes, event_id, (last_changed AT TIME ZONE 'UTC')::timestamp as last_changed, last_changed_ts, last_reported_ts, (last_updated AT TIME ZONE 'UTC')::timestamp as last_updated, last_updated_ts, old_state_id, attributes_id, context_id, context_user_id, context_parent_id, origin_idx, context_id_bin, context_user_id_bin, context_parent_id_bin, metadata_id FROM states) TO 'states.csv' CSV HEADER"
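Before moving on, it’s worth sanity-checking that every export completed. Here’s a rough sketch that compares each table’s row count in PostgreSQL against its CSV line count (minus the header); the counts will only be approximate if any field contains embedded newlines:
# Compare PostgreSQL row counts against exported CSV line counts
for t in event_data event_types state_attributes states_meta statistics_meta \
         recorder_runs migration_changes schema_changes statistics_runs \
         statistics statistics_short_term events states; do
  db_rows=$(psql -h $PGHOST -U $PGUSER -d $PGDATABASE -t -A -c "SELECT COUNT(*) FROM $t")
  csv_rows=$(( $(wc -l < "$t.csv") - 1 ))  # subtract the header line
  echo "$t: postgres=$db_rows csv=$csv_rows"
done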
Prepare and Import into SQLite
- Acquire the home-assistant_v2.db: Get a fresh database file by reconfiguring Home Assistant to use SQLite, restarting it, and then stopping it once the new database is created.
- Open the SQLite database using the SQLite CLI tool:
sqlite3 home-assistant_v2.db
- Clean out existing data that Home Assistant may have saved before shutting down:
DELETE FROM main.event_data;
DELETE FROM main.event_types;
DELETE FROM main.events;
DELETE FROM main.statistics_meta;
DELETE FROM main.statistics_runs;
DELETE FROM main.statistics;
DELETE FROM main.states;
DELETE FROM main.states_meta;
DELETE FROM main.statistics_short_term;
DELETE FROM main.state_attributes;
DELETE FROM main.recorder_runs;
DELETE FROM main.migration_changes;
- Import data from the CSV files exported from PostgreSQL. Since the exports include a header row, use --skip 1 (available in sqlite3 3.32 and newer); on older versions, strip the header line from each CSV before importing:
.mode csv
.import --skip 1 event_data.csv event_data
.import --skip 1 event_types.csv event_types
.import --skip 1 state_attributes.csv state_attributes
.import --skip 1 states_meta.csv states_meta
.import --skip 1 statistics_meta.csv statistics_meta
.import --skip 1 recorder_runs.csv recorder_runs
.import --skip 1 migration_changes.csv migration_changes
.import --skip 1 statistics_runs.csv statistics_runs
.import --skip 1 statistics_short_term.csv statistics_short_term
.import --skip 1 statistics.csv statistics
.import --skip 1 events.csv events
.import --skip 1 states.csv states
- Replace empty strings with NULL for all nullable fields:
-- Replace empty strings with NULL in 'event_data'
UPDATE event_data SET
hash = NULLIF(hash, ''),
shared_data = NULLIF(shared_data, '');
-- Replace empty strings with NULL in 'event_types'
UPDATE event_types SET
event_type = NULLIF(event_type, '');
-- Replace empty strings with NULL in 'state_attributes'
UPDATE state_attributes SET
hash = NULLIF(hash, ''),
shared_attrs = NULLIF(shared_attrs, '');
-- Replace empty strings with NULL in 'states_meta'
UPDATE states_meta SET
entity_id = NULLIF(entity_id, '');
-- Replace empty strings with NULL in 'statistics_meta'
UPDATE statistics_meta SET
statistic_id = NULLIF(statistic_id, ''),
source = NULLIF(source, ''),
unit_of_measurement = NULLIF(unit_of_measurement, ''),
name = NULLIF(name, '');
-- Replace empty strings with NULL in 'recorder_runs'
-- ("end" is a reserved word in SQLite, so it must be quoted)
UPDATE recorder_runs SET
start = NULLIF(start, ''),
"end" = NULLIF("end", '');
-- 'migration_changes' has no nullable fields
-- Replace empty strings with NULL in 'schema_changes'
UPDATE schema_changes SET
schema_version = NULLIF(schema_version, '');
-- 'statistics_runs' has no nullable fields
-- Replace empty strings with NULL in 'events'
UPDATE events SET
event_type = NULLIF(event_type, ''),
event_data = NULLIF(event_data, ''),
origin = NULLIF(origin, ''),
context_id = NULLIF(context_id, ''),
context_user_id = NULLIF(context_user_id, ''),
context_parent_id = NULLIF(context_parent_id, ''),
context_id_bin = NULLIF(context_id_bin, ''),
context_user_id_bin = NULLIF(context_user_id_bin, ''),
context_parent_id_bin = NULLIF(context_parent_id_bin, '');
-- Replace empty strings with NULL in 'statistics'
UPDATE statistics SET
created = NULLIF(created, ''),
start = NULLIF(start, ''),
mean = NULLIF(mean, ''),
min = NULLIF(min, ''),
max = NULLIF(max, ''),
last_reset = NULLIF(last_reset, ''),
state = NULLIF(state, ''),
sum = NULLIF(sum, '');
-- Replace empty strings with NULL in 'statistics_short_term'
UPDATE statistics_short_term SET
created = NULLIF(created, ''),
start = NULLIF(start, ''),
mean = NULLIF(mean, ''),
min = NULLIF(min, ''),
max = NULLIF(max, ''),
last_reset = NULLIF(last_reset, ''),
state = NULLIF(state, ''),
sum = NULLIF(sum, '');
-- Replace empty strings with NULL in 'states'
UPDATE states SET
entity_id = NULLIF(entity_id, ''),
state = NULLIF(state, ''),
attributes = NULLIF(attributes, ''),
last_changed = NULLIF(last_changed, ''),
last_updated = NULLIF(last_updated, ''),
context_id = NULLIF(context_id, ''),
context_user_id = NULLIF(context_user_id, ''),
context_parent_id = NULLIF(context_parent_id, ''),
context_id_bin = NULLIF(context_id_bin, ''),
context_user_id_bin = NULLIF(context_user_id_bin, ''),
context_parent_id_bin = NULLIF(context_parent_id_bin, '');
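Before handing the file back to Home Assistant, I’d run a few final checks from the same sqlite3 session. The row-count queries are just examples; compare whichever tables matter to you against the CSV counts from earlier:
-- Sanity checks before starting Home Assistant again
PRAGMA integrity_check;        -- should print: ok
SELECT COUNT(*) FROM states;
SELECT COUNT(*) FROM statistics;
VACUUM;                        -- reclaim space after the bulk delete/import
ANALYZE;                       -- refresh the query planner's statistics
Once everything looks sane, start Home Assistant and keep an eye on the logs for Recorder errors.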
Conclusion
By following these steps, you should be able to successfully migrate your Home Assistant database from PostgreSQL back to SQLite. The process involves careful handling of data types and null values, but it’s definitely doable. Good luck!