Though I’ve been fighting with DB size it was still large. Yes, I had configured recorder and logbook sections as suggested, excluding things that appear in history or logs, but still, it was growing large which in turn increased reboot time and I suspect was also cause why my installation could become irresponsible after some time or after unlucky hard reboot (doing it sometimes).
But…In fact, it was super easy to identify what makes it grow large, just by inspecting DB itself by doing the following:
enter the following commands in the shell: .header on .mode column .width 50, 10, SELECT entity_id, COUNT(*) as count FROM states GROUP BY entity_id ORDER BY count DESC LIMIT 20;
All records are ordered by count, so you will easily find out what is bloating your DB.
BTW I had all the same records as in this example, so added them into the exclude list (I don’t know what is good for me from sun/solar/cpu history so I’ve just wiped it out from recordings). FYI here is part of my config to keep DB small
NOTE: even though I already had by that moment sun in exclude.domains but records sun.sun were still there in DB. I had cleaned up DB after applying thin config so it’s no doubt about it. So I had to add sun.sun into entities exclude list too
Is there anyway to pull that directly into HA for display? My understanding of the SQL sensor is that it pulls a single value, not a table or list of data.
Not sure this will work?
- platform: sql
db_url: !secret mariadb_connection_url
scan_interval: 1800
queries:
- name: Top Entities Database
query: 'SELECT entity_id, COUNT(*) as count FROM homeassistant.states GROUP BY entity_id ORDER BY count DESC LIMIT 20;'
Sorry for necroposting, but since a while these commands don’t work anymore. The docs is outdated too. I believe Home Assistant has changed the DB scheme.
I have found the updated commands, but I can’t remember where, I think was some discussion on Github:
SELECT states_meta.entity_id as Entity, count(states.metadata_id) as Count FROM states INNER JOIN states_meta ON states.metadata_id = states_meta.metadata_id GROUP BY states.metadata_id ORDER BY count(states.metadata_id) DESC LIMIT 10;
Just FYI, this is the query that works on the current version of HA 2025.01
SELECT m.entity_id, COUNT(*) as count FROM states AS S
INNER JOIN states_meta AS M ON M.metadata_id = s.metadata_id
GROUP BY m.entity_id ORDER BY count DESC LIMIT 20;
I you know which entities are taking up a lot of storage, you can run the “Recorder: Purge entities” from the actions from the Actions tab in the Developer Tools