I’m trying to do a couple things
- keep state history for a small set of entities for 1 year
- purge everything else in ~2 weeks (but I DO want to keep 2 weeks for (almost) everything as I want to be able to troubleshoot automations etc.)
- keep my DB size from getting out of control (2 weeks of data is ~1GB, don’t want that to go to ~25…)
I am using MariaDB on a different server, I’ve excluded a number of entities from being recorder altogether:
recorder: purge_keep_days: 365 db_url: !secret db_url commit_interval: 20 exclude: domains: - updater - media_player - scene - zone - weather - camera entity_globs: - binary_sensor.*_tamper - binary_sensor.*_battery_low - binary_sensor.*_update_available - binary_sensor.w10* - sensor.w10* - sensor.*_battery - sensor.*_linkquality - sensor.pi_hole* - sensor.samsung_* - sensor.memory_* - sensor.*_raw - sensor.nas_* - sensor.*_update_state
To define the entities I want to preserve I have created a new table called “protected_entities” that contains only an entity_id column:
DESCRIBE protected_entities; entity_id varchar(255) NO PRI NULL SELECT * FROM `protected_entities` LIMIT 1; sensor.external_temperature
Now my very rudimentary approach to this so far has been to use cron to simply delete everything from states that’s older than 14 days and is not included in protected_entities (MariaDB syntax):
DELETE FROM states WHERE entity_id NOT IN (SELECT entity_id FROM protected_entities) AND last_updated < (NOW() - INTERVAL 14 DAY);
For events I would do the same, delete events older than 14 days excluding those related to protected entities:
DELETE FROM events LEFT JOIN states on events.event_id = states.event_id WHERE states.entity_id NOT IN (select entity_id from protected_entities) AND events.time_fired < (NOW() - INTERVAL 14 DAY);
And then optimize the tables.
Now… I understand that I’m not exactly being nice to the DB schema since I don’t care about old_state_id and don’t “clean up” after myself. I’ve tried to translate the source code for purge.py into SQL statements but that’s where my very basic SQL skills very quickly end:
Question is then - am I setting myself up for a disaster in the long term or is this heavy handed but ultimately not destructive?
Obviously the dream here would be to have this feature built directly into the recorder so that you can keep two “levels” of history…