Since upgrading to 2024.4.x, my MariaDB database (via the official addon) has roughly doubled in size. I’m pretty sure that the only change is to upgrade HA core at the beginning of the month and an upgrade of the MariaDB addon a couple of days ago.
I don’t remember seeing anything in the 2024.4 release notes about the database or recorder. Did I miss something?
SELECT table_schema "database", Round(Sum(data_length + index_length) / 1048576, 2) "value" FROM information_schema.tables WHERE table_schema="homeassistant" GROUP BY table_schema
Possible there are some sensors that are updating rapidly and filling database up.
Look at the states table and see what’s in there and see if there are a lot of updates for a specific meta data id. Alternatively if you’re good at sql, select the last month of data group it by meta data id and count the records
Without a previous baseline, it’s pinpoint the issue. The biggest table is the state_attributes (via phpmyadmin addon), but it’s impossible for me to know if anything has changed there.
Using the following query, doesn’t really highlight anything unexpected. But it would be helpful to link the size of the attributes to each row, which I don’t know how to do.
SELECT states.metadata_id, COUNT(*) as Count, states_meta.entity_id
FROM states
INNER JOIN states_meta ON states.metadata_id=states_meta.metadata_id
GROUP BY states.metadata_id
ORDER BY Count DESC;
I wonder if something has changed in an integration that now means that some of the attribute updates that were identical between state updates for multiple entities are now different for each, and so need to be stored separately.
Here’s one someone posted which has helped me. It doesn’t sort by size, but by number of records, but it does display the size:
SELECT
COUNT(state_id) AS cnt,
COUNT(state_id) * 100 / (
SELECT
COUNT(state_id)
FROM
states
) AS cnt_pct,
SUM(
LENGTH(state_attributes.shared_attrs)
) AS bytes,
SUM(
LENGTH(state_attributes.shared_attrs)
) * 100 / (
SELECT
SUM(
LENGTH(state_attributes.shared_attrs)
)
FROM
states
JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id
) AS bytes_pct,
states_meta.entity_id
FROM
states
LEFT JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id
LEFT JOIN states_meta ON states.metadata_id = states_meta.metadata_id
GROUP BY
states.metadata_id, states_meta.entity_id
ORDER BY
cnt DESC;
Actually I wonder whether it coincides with when I enabled “last seen” on zigbee2mqtt. I’ll investigate some more, which means that every message from every zigbee sensor now has a unique attributes JSON.