That helped, thanks bdraco!
it trimmed 1/3 of the size. I’m at 44GB now.
I think it won’t be enough, I have 30GB free now so I’ll need to find a more aggressive way. Is there anything I can do to signal to the DB to mark more as trashed, and then repeat the trick of vacuum in a 2nd computer?
If you are looking for a quick fix, you could stop, download again, delete the contents of the states table, vaccum again, and put it back up, than restart. That will allow you to keep the stats and you’ll only loose shorter term history.
Oh, that’s exactly what I need. I’m considering moving to MySQL to avoid this requirement of having twice the space available for the purge step. (Purge is vacuum, right?)
Any database engine you pick should have 125% of the database size available for temp tables, rebuilds, etc to be safe. MySQL will not save you from that, and its usually not worth the price of admission unless you have a specific need as you are signing up to be the DBA for your instance.
Here is the states query updated to 2023.4.x for SQLite.
The attributes and entity_id are deduplicated so they only are stored once now which makes the attributes bytes calculation misleading.
SELECT
COUNT(state_id) AS cnt,
COUNT(state_id) * 100 / (
SELECT
COUNT(state_id)
FROM
states
) AS cnt_pct,
SUM(
LENGTH(state_attributes.shared_attrs)
) AS bytes,
SUM(
LENGTH(state_attributes.shared_attrs)
) * 100 / (
SELECT
SUM(
LENGTH(state_attributes.shared_attrs)
)
FROM
states
) AS bytes_pct,
states_meta.entity_id
FROM
states
LEFT JOIN state_attributes ON (states.attributes_id=state_attributes.attributes_id)
LEFT JOIN states_meta ON (states.metadata_id=states_meta.metadata_id)
GROUP BY
states.metadata_id
ORDER BY
cnt DESC
SELECT
COUNT(*) AS cnt,
COUNT(*) * 100 / (SELECT COUNT(*) FROM states) AS cnt_pct,
states_meta.entity_id
FROM states
LEFT JOIN states_meta ON (states.metadata_id=states_meta.metadata_id)
GROUP BY entity_id
ORDER BY cnt DESC;
Interesting, it didn’t work for me. I use the default SQLite database and the SQLite DB Browser. I get this when I enter the SQL from @crazyasyou’s post, above:
Execution finished with errors.
Result: ambiguous column name: entity_id
At line 1:
However, it appears to work when I specify the table name for entity_id in the GROUP BY statement:
SELECT
COUNT(*) AS cnt,
COUNT(*) * 100 / (SELECT COUNT(*) FROM states) AS cnt_pct,
states_meta.entity_id
FROM states
LEFT JOIN states_meta ON (states.metadata_id=states_meta.metadata_id)
GROUP BY states_meta.entity_id
ORDER BY cnt DESC;
I haven’t really had time to look into how accurate the results are. So far the new database schema, along with my aggressive excludes in configuration.yaml, have kept my database size reasonable.
Hello, I was wondering what’s the use case for recording events. I already excluded all frequent events from the recorder, but maybe I should exclude them all? Is there an UI or an API that can query the recorded events and show them or do something about them? Or is the only way to access them a manual sql query on the backend db? Is the history tab able to show recorded events? Thanks!
Hello, thanks a lot for these info…
I installed SQLite Web, and succeed to view the size of each table (first SQL query)…
But whe I type the second one (Viewing events usage), (and same for states usage), I only get this message :
No, you’re not. When I wrote this guide, those queries were valid for that specific Home Assistant version. Over time, the database evolved, and thus the queries on my initial post aren’t working anymore. Unfortunately, I haven’t had time to revisit this post and update the queries. Please see the other messages on this topic for newer queries, contributed by other people.
I have modified the SQL queries from the original post so they work on homeassistant 2023.4.6 and later.
Modified query for states:
SELECT
COUNT(*) AS cnt,
COUNT(*) * 100 / (SELECT COUNT(*) FROM states) AS cnt_pct,
states_meta.entity_id
FROM states
INNER JOIN states_meta ON states.metadata_id=states_meta.metadata_id
GROUP BY states_meta.entity_id
ORDER BY cnt DESC
Modified query for Events:
SELECT
COUNT(*) as cnt,
COUNT(*) * 100 / (SELECT COUNT(*) FROM events) AS cnt_pct,
event_types.event_type
FROM events
INNER JOIN event_types ON events.event_type_id = event_types.event_type_id
GROUP BY event_types.event_type
ORDER BY cnt DESC