How to reduce your database size and extend the life of your SD card

The suggested approach is difficult to sustain as one can get overwhelmed by the number of entities to classify.

Instead of making a decision for each and every entity_id, why not look at the ‘worst offenders’ and exclude those, only. By identifying the most frequently updated entities we can significantly reduce I/O without having to micro-manage the list of exclusions.

To do so, you’d need SQL access to the database, and then use SQL queries similar to the following:

For the states table:

SELECT 
	entity_id,
    COUNT(*) AS cnt
FROM states
GROUP BY
	entity_id
ORDER BY
	COUNT(*) DESC

The events table seems to take quite a bit longer with:

SELECT 
	JSON_VALUE(event_data, '$.entity_id') AS entity_id,
	COUNT(*) AS cnt
FROM events
WHERE
	JSON_VALUE(event_data, '$.entity_id') IS NOT NULL
GROUP BY
	JSON_VALUE(event_data, '$.entity_id')
ORDER BY
	COUNT(*) DESC
5 Likes