I’m starting to experience issues with my 37GB mariadb database. Last database version bump took really long, but space is also a problem.
Digging around I noticed that more than half of my data is from deleted sensors, still kept in the database. They may be re-added with the same entity id later and then I could see their data, I understand that, but somehow it would be nice to get rid of them if I wanted to.
At /config/entities I can see all ids that I currently have (enabled/disabled). I can also export them with a script and delete everything else from the states/states_meta/states_attributes tables manually, and I will probably do that today. But still, this should be handled by the frontend. Just give us a button, or list orphaned entities and let me delete them.
To clarify, these are entities that were removed with the device. Old esphome devices mostly, which I test, experiment with. Their sensors submitted values that are still within the purge period, but the device is no longer present, and never will be. I keep 1 year data. When home assistant removes a device and its entities, it is only deleted from the local recorder, not from the long term database.
If anyone needs it, I ended up marking the unused entites in states_meta in a new column and run these:
SET FOREIGN_KEY_CHECKS=0;
DELETE FROM states WHERE metadata_id IN (SELECT metadata_id FROM states_meta WHERE orphaned = 1);
DELETE FROM state_attributes WHERE attributes_id NOT IN (SELECT DISTINCT attributes_id FROM states);
DELETE FROM states_meta WHERE orphaned = 1;
SET FOREIGN_KEY_CHECKS=1;
There you will find that for each of your deleted entities you will have a FIX ISSUE button / link. When you click it there will be a message about state data no longer being available and do you want to delete the statistics.
That’s exaclty the problem! Deleted entities that are only present in the long term database are not there. I had over 100,000 in the database, but only about 2000 listed there.
Yes, it took about an hour to run those sql commands. Esphome, ble device tracker, libre hardware monitor, and the ping integration recording every roundtrip as a different attribute (this was fixed yesterday) were the main culprits.
I mosly need the large purge time to see the yearly power consumption.
I’m on the other end of the spectrum. I keep my purge_days to a minimum and exclude any entities I don’t absolutely need, keeping my Recorder database as lean as possible. And I don’t use the Recorder database for LTS at all. Every single one of those tables is wasted space to me.
I’d argue that I need the purge option for LTS even more than someone who is already managing a large database. At one point the LTS data were taking up over 80% of my database. No, that’s not a typo. 80% wasted space. Voted.
Based on what I understood from another topic, it could be possible that LTS-data occupy more space than History:
Purge interval is a few days.
Only a few entities are included to Recorder.
There were lots of sensors with LTS stored in DB; then these sensors were excluded from Recorder - but these LTS-data are still there; and there is an “issue” in “Dev tools - Statistics» saying “these entities were excluded from Recorder” with no option to delete old data. (Update: this PR made possible to delete LTS for not-recorded entities)
So, if you do not need LTS - you have to disable “LTS-ability” by setting “state_class: none” for recorded sensors; and for not recorded sensors - see a PR above, a possibility to manually delete an old LTS was added.
I have not tested changes added by that PR, to be honest. It was said in this PR “This just replaces one issue with another (entity no longer recorded → entity not recorded), but at least it cleans up the DB.” - i.e. you can delete old LTS but you will still see a related issue.
But here the OP said that many DELETED entities still have LTS-data in DB, so I wonder how it can happen.
Don’t worry, I also run InfluxDB. And I run it on a Windows Server (not ESP).
I think I am mixing up the LTS with the mariadb database, by long term I mean the latter, not the averaged values beyond the purge interval. Anyhow, no matter how short it is, those orphaned values will not go away.
Maybe I don’t understand how the frontend works, but in that PR I can’t see anything that would delete records from a database. Is it maybe _deletedStatistics? statistics_* tables are not where the bulk of the data is.