Running with Very Large Databases

I hasten to add that I’m not talking about 10 GB. I’m not even talking about 50GB. I’m currently sitting in the vicinity of 80GB.

(And I should say that I’ve already pruned heavy contributors and too-frequent updaters from the entities I’m storing. I think it’s just because I have closing-in-on 3,000 entities. Of those, the biggest contributor takes up around 3.5% of the space and only six of them more than 1%, so I think the gains from adjusting my recorder settings are pretty much used up.)

I currently am using a remote MySQL 8.2 container to host the recorded database (I have HA running on my cluster with its config directory mounted over NFS from the NAS, so SQLite isn’t an option and avoiding tying the DB to any given cluster node is preferable).

But this is starting to creak. While HA performs well automation-wise and in executing actions, fetching data to display on the dashboard, particularly for any history graphs or other data-over-time types of display has become very slow indeed.

I’m looking for advice on how to optimize this. Would Postgres be better at handling a database of this size, or this type of read? Is there anything else I’ve overlooked that would improve performance in this sort of scenario? Anyone else out there with a really big database and advice to offer?

All thoughts much appreciated!

You wouldn’t want my advice, but I would delete it. I delete mine when it gets to a gig. like once a year or so. I would rather delete everything over like 3 months old, but just tossing it is the next best thing.

1 Like

You’re stretching HA and your database beyond its capabilities. It’s one thing to keep a large database to do your own queries; it’s another to expect HA to deal with a ginormous database, when it isn’t clear that either component was meant to work with one another for that purpose.

Keep your database of archived data if you want; I do that. But keep it separate from HA’s production database.

2 Likes

+1 on what both @Sir_Goodenough and @d921 have said. 80GB? That’s intense! I would also take action on what information you actually want in your DB and adjust your recorder configuration. For example, I’ve got plenty of devices that have signal strength entities (WiFi and Zigbee). I like having the entities for potential future troubleshooting, but why would I care that my bathroom light was -66 dBm 2 weeks ago? Nah… That information is filtered out of my recorder.

2 Likes

I have 3600+ entities in my system and I have ~10GB of MariaDB holding whole my history for 30 days. So something wrong with your setup.
What is you data retention, after how many days do you purge database?
You mentioned nothing about exclusions and inclusion of entities in recorder. I excluded by defauls everything from being recorded and only explicitly added these entities for which I want to display some charts (and there is plenty of them :slight_smile: ). Otherwise you do not need history, only current states. Additional reason for inclusion is some troubleshooting or debugging of entity history, but this is temporary only.
Finally after finetuning of databvase you do noot need to delete it, it will eventually shrink over time as unused history is purged.

1 Like

I’m probably going to end up doing that by default if I end up switching DB servers, so… :slight_smile:

(Shame about the [relatively small] long-term-statistics data. I wonder if there’s any way to fish that out of the old database…)

I should probably have been clear up front that, well, yes, I know. :grin:

By what information we’ve got from the HA team and elsewhere, my installation could be fairly described as a heckin’ chonker along most metrics. (And the forbidden k8s install, too!) So I haven’t got here by accident, as it were, but because I’m the kind of home-automation enthusiast who enjoys pushing the limits, and am looking to share ideas with my fellow limit-pushers.

(I do appreciate your response; just wanted to clarify the angle I’m coming from.)

I do that, too - I rake off data points I want for long-term analysis into a separate InfluxDB. The recorder database itself shouldn’t have anything older than a fortnight.

My recorder database purge_keep_days is set to 14 days (a little longer than the default, but not excessively so, I would have thought), and I have an automation set up to purge and repack the database once per week to keep its size down.

I did, I think? I’m still using default-include as so far I’ve wanted to keep the drill-down-into-history option available, but I’ve always excluded unnecessary entities (like time and date entities, and randomness entities, etc.) from the recorder. I’ve also excluded entities which either have a very large state/attributes and/or which are updated excessively frequently, and have checked this with database queries; I’ve got no outliers left that are taking up unusually large amounts of database space.

It’s a good thought, though, and if I can’t database-tune my way out of this I may resort to excluding all the diagnostic entities except when needed.

(Sure would be nice if we could exclude all diagnostic entities in one line…)

Actually it is very easy (quoting official recorder documentation). What is recorded:

  1. Only includes (mirekmal: only includes listed in configuration.yaml)
  • Entity listed in entities include: include
  • Otherwise, entity matches domain include: include
  • Otherwise, entity matches glob include: include
  • Otherwise: exclude

This is actually what I do, no excludes and includes only and it works perfectly fine, only entities explicitly listed are stored in the database :slight_smile:

I run with a 26 gb database on mariadb and it works fine. I have it on a fast disk and give it 4gb of RAM running on an AMD middle of the road CPU. Memory and disk speed are the key levers. What are you running with?

# [GUIDE] Fix & Shrink Massive HA Databases (>50GB) – The “Table-Swap” Method

### The Problem

When a Home Assistant database explodes (e.g., to 60GB) due to millions of orphaned entries, standard PURGE or DELETE commands fail. They take hours, lock your I/O, and often lead to further corruption. This usually happens when the states table is flooded with “zombie” rows that lack a valid entity association.

### Step 0: The “NULL-Bloat” Diagnostic

Run this command on your PC to see if your database is filled with garbage rows:

```cmd
sqlite3.exe “home-assistant_v2.db” “SELECT quote(entity_id), COUNT() FROM states GROUP BY entity_id ORDER BY COUNT() DESC LIMIT 5;”

```

If you see a result like NULL | 213,892,148, your database is 80% garbage. Deleting these via standard SQL will crash your NUC. You must use the Table-Swap method.

### The Solution: The “Table-Swap”

Instead of deleting garbage, we “extract the gold” by streaming valid rows into a fresh database. This method uses sequential I/O, which is exponentially faster than random-access deletion.

### Step 1: Preparation & Recovery

1. Stop HA Core: Run ha core stop to prevent file locks.
2. Move to PC: Transfer the .db to a computer with a fast NVMe SSD and sqlite3.exe.
3. Structural Recovery: If your DB is corrupted, create a stable base first:
sqlite3.exe home-assistant_v2.db .recover > HA_FIXED.db

### Step 2: Identify Your Unique Schema

SQLite requires an exact match for column counts during a mass import. Check your source with:
sqlite3.exe HA_FIXED.db ".schema states" ".schema statistics"

Count the column names inside the parentheses (...):

| Feature in Output | Column Count | Type |
| — | — | — |
| entity_id_low_id present | 11 | Modern (v4) |
| event_id present, but no old_state_id | 15 | Hybrid (v2/v3) |
| old_state_id AND origin_idx present | 20 | Legacy (v1) |
| mean_weight present in statistics | 14 | Legacy Stats |
| mean_weight missing in statistics | 10 | Modern Stats |

### Step 3: Create the “Custom-Fit” Target (HA_NEW.db)

Create a new file HA_NEW.db. You must define tables manually to include PRIMARY KEY AUTOINCREMENT, otherwise the Recorder will fail to write new data later.

Choose the block that matches your column count from Step 2:

#### Option A: Modern (11 Columns States / 10 Columns Stats)

```sql
sqlite3.exe HA_NEW.db “CREATE TABLE states (state_id INTEGER PRIMARY KEY AUTOINCREMENT, entity_id_low_id INTEGER, state VARCHAR(255), attributes_id INTEGER, last_changed_ts FLOAT, last_reported_ts FLOAT, last_updated_ts FLOAT, old_state_id INTEGER, attributes CHAR(0), context_id_bin BLOB, metadata_id INTEGER); CREATE TABLE statistics (id INTEGER PRIMARY KEY AUTOINCREMENT, created_ts FLOAT, start_ts FLOAT, mean FLOAT, min FLOAT, max FLOAT, last_reset_ts FLOAT, state FLOAT, sum FLOAT, metadata_id INTEGER);”

```

#### Option B: Hybrid (15 Columns States / 10 Columns Stats)

```sql
sqlite3.exe HA_NEW.db “CREATE TABLE states (state_id INTEGER PRIMARY KEY AUTOINCREMENT, entity_id VARCHAR(255), state VARCHAR(255), attributes TEXT, event_id INTEGER, last_changed DATETIME, last_changed_ts FLOAT, last_reported_ts FLOAT, last_updated DATETIME, last_updated_ts FLOAT, old_state_id INTEGER, attributes_id INTEGER, context_id VARCHAR(36), context_user_id VARCHAR(36), context_parent_id VARCHAR(36)); CREATE TABLE statistics (id INTEGER PRIMARY KEY AUTOINCREMENT, created_ts FLOAT, start_ts FLOAT, mean FLOAT, min FLOAT, max FLOAT, last_reset_ts FLOAT, state FLOAT, sum FLOAT, metadata_id INTEGER);”

```

#### Option C: Legacy (20 Columns States / 14 Columns Stats)

```sql
sqlite3.exe HA_NEW.db “CREATE TABLE states (state_id INTEGER PRIMARY KEY AUTOINCREMENT, entity_id CHAR(0), state VARCHAR(255), attributes CHAR(0), event_id SMALLINT, last_changed CHAR(0), last_changed_ts FLOAT, last_reported_ts FLOAT, last_updated CHAR(0), last_updated_ts FLOAT, old_state_id INTEGER, attributes_id INTEGER, context_id CHAR(0), context_user_id CHAR(0), context_parent_id CHAR(0), origin_idx SMALLINT, context_id_bin BLOB, context_user_id_bin BLOB, context_parent_id_bin BLOB, metadata_id INTEGER); CREATE TABLE statistics (id INTEGER PRIMARY KEY AUTOINCREMENT, created DATETIME, start DATETIME, mean FLOAT, min FLOAT, max FLOAT, last_reset DATETIME, state FLOAT, sum FLOAT, metadata_id INTEGER, created_ts FLOAT, start_ts FLOAT, last_reset_ts FLOAT, mean_weight FLOAT);”

```

### Step 4: High-Speed Data Migration

Use ATTACH and INSERT OR IGNORE to skip corrupted duplicates and leave the NULL rows behind.

```cmd
sqlite3.exe “HA_NEW.db” “ATTACH ‘HA_FIXED.db’ AS alt; BEGIN; INSERT OR IGNORE INTO states_meta SELECT * FROM alt.states_meta; INSERT OR IGNORE INTO states SELECT * FROM alt.states; INSERT OR IGNORE INTO events SELECT * FROM alt.events; INSERT OR IGNORE INTO statistics SELECT * FROM alt.statistics; INSERT OR IGNORE INTO statistics_meta SELECT * FROM alt.statistics_meta; INSERT OR IGNORE INTO statistics_short_term SELECT * FROM alt.statistics_short_term; COMMIT;”

```

### Step 5: Finalizing (Indexing & Vacuum)

Without indexes, the DB is useless. This step takes the longest (30-60 mins for 90M rows).

```cmd
sqlite3.exe “HA_NEW.db” “CREATE INDEX IF NOT EXISTS ix_states_last_updated_ts ON states (last_updated_ts); CREATE INDEX IF NOT EXISTS ix_states_metadata_id_last_updated_ts ON states (metadata_id, last_updated_ts); ANALYZE; VACUUM;”

```

### Step 6: Move Back to NUC

1. Transfer: Move HA_NEW.db to /config/home-assistant_v2.db.
2. Permissions: Run chmod 644 home-assistant_v2.db.
3. Start: Run ha core start.

### What to Expect: Auto-Migration

Upon startup, HA will detect the legacy structure and perform final optimizations. Do not interrupt!
Typical Logs:

```text
WARNING (Recorder) Adding index ix_states_context_id to table states
WARNING (Recorder) Data migration step ‘state_context_id_as_binary’ completed
WARNING (Recorder) Data migration step ‘entity_id_post_migration’ completed

```

On a clean file, this takes minutes. On the old 60GB file, it would have caused a system hang.

### Results

* Original: 59.5 GB (Fragmented/Legacy with 213M NULLs)
* Final: 17.2 GB (Cleaned/Modernized)
* Time: ~15 min Swap + ~45 min Indexing.

Conclusion: If your DB is over 20GB and hanging: stop deleting. Start Swapping.

[GUIDE] Proactive Maintenance & Shrinking for Massive HA Databases (50GB+)

The Scenario

Your Home Assistant database (home-assistant_v2.db) has reached 50GB-100GB. Even without errors, backups are slow and SSD wear is high. Large files (90M+ rows) need a structural refresh to keep the Recorder performant and the system stable.

The Goal: Cleanly migrate to a fresh file using your exact original schema via Windows, and implement a long-term storage strategy using InfluxDB.


Phase 1: Preparation (The Windows Setup)

Do not perform this on your NUC/Raspberry Pi. Use a fast Windows PC with an NVMe SSD.

  1. Stop HA: Run ha core stop.
  2. Download Tools: Get the sqlite-tools-win-x64-*.zip from the official SQLite Download Page. Extract sqlite3.exe into your working folder.
  3. Transfer Data: Move your home-assistant_v2.db to the same folder on your PC.
  4. Extract Schema DNA: Run this in your Windows terminal to capture your specific structure:
    sqlite3.exe home-assistant_v2.db ".schema --nosys" | findstr /V "CREATE INDEX" > my_tables.sql
    sqlite3.exe home-assistant_v2.db ".schema" | findstr "CREATE INDEX" > my_indexes.sql

Phase 2: Creation & Data Migration (The Table-Swap)

We stream valid rows into a new file to avoid the overhead of a massive VACUUM.

  1. Initialize Target & Stream Data:
    Run sqlite3.exe HA_NEW.db to open the console, then paste this block (Oneliner):
.read my_tables.sql
ATTACH 'home-assistant_v2.db' AS old; BEGIN; INSERT INTO states SELECT * FROM old.states; INSERT INTO states_meta SELECT * FROM old.states_meta; INSERT INTO statistics SELECT * FROM old.statistics; INSERT INTO statistics_meta SELECT * FROM old.statistics_meta; INSERT INTO statistics_short_term SELECT * FROM old.statistics_short_term; INSERT INTO events SELECT * FROM old.events; COMMIT; DETACH old;


Phase 3: Rebuild the Engine (Indexes & Stats)

Without indexes, the database is a dead weight. Rebuild them using your specific exported schema.

  1. Restore Indexes:
    While still in the sqlite3.exe HA_NEW.db console, run this:
.read my_indexes.sql

(Note: For 90M+ rows, this takes 30-60 minutes on an NVMe. Do not interrupt.)
2. Performance Injection (ANALYZE):
This is mandatory. It populates sqlite_stat1, telling the Query Planner which index to use:

ANALYZE;
.exit


Phase 4: Long-Term Scaling (The Final Fix)

To prevent future bloat, split your data storage.

  1. Set Retention: In configuration.yaml, set recorder.purge_keep_days: 7.
  2. Offload History: Use InfluxDB (e.g., on a NAS like QNAP TS-464) for “forever” storage.
  3. Visualize: Use Grafana to query the NAS, keeping the heavy load off your HA instance.

Results

  • Starting Point: 60 GB file, slow backups.
  • Outcome: 23.5 GB file, perfectly indexed, optimized Query Planner, and a scalable 5-year strategy.
1 Like

Running stable, since