MariaDB growing too big... Can't get it smaller

Yeah I agree with that. I use nodered, and it behaves similarly, requiring workarounds to prevent timers reset by a reboot from causing issues. I guess if ‘persistent timers’ were added to HA, nodered may not benefit. OTOH, seems the way persistence is done in HA requires a recorder entry in the db… that could be inefficient for just timer persistence, where only a ‘timer started’ timestamp is required. Actually, I thought “state before reboot” should have been a thing for all entities regardless if they have a recorder entry or not… not huge data (just latest timestamp of a known value) and it would make reboots work better despite how recorder is configured. As it is, I think the only way for HA to know pre-reboot state is to have a recorder entry… and for example storing timer stamps for 7days seems just stupid.

See, now that is something I haven’t heard before that would totally help explain this. Something like that should go in the docs for sure!

FYI, I now submitted a pull request to achieve the above: Purge recorder database at night by amelchio · Pull Request #33646 · home-assistant/core · GitHub

2 Likes

For those looking to find out exactly what events are filling up their MariaDB, I made a video on how to do that here using the phpMyAdmin add-on:

9 Likes

Just to clarify… once we run the query, we can delete the line and it only removes the data stored - not the entity? Sorry, I’m not too familiar with database commands.

The query does just that. It does not edit the database at all. It’s only for finding out what is filling up the database.

Just in case anyone stumbles across this, MariaDB works in a very different way now days with more tables etc. Working code to give the same results as the video is:

select sim.entity_id, count(*)
from states as sid
join states_meta as sim on sid.metadata_id = sim.metadata_id
group by sim.entity_id order by count(*) desc
2 Likes

Anyone ever figure this out?

Mine is up to 50GB…the “states” table in MariaDB is 50GB by itself, and it looks in a browser like most of it is “NULL” values. Something must be amiss?

Already tried a purge last night to 10 days.

I’m wondering if there is some stale/corrupt data that needs some kind of manual cleanup?

Wow, you have more than 200 million database lines. I have a 7 day retention period and about 300 thousand lines.

You can check 2 things:
Use the sql statement from townsmcp one post above yours, there you can check how many lines are there for every entity. For 7 days an entity should not have more than about 50.000 lines (for me the most updated entity has 42.000 lines in 7 days).

Second thing is to look if there are database lines older than your retention period.
With the following sql query:

SELECT * FROM `states` order by last_updated_ts asc;

It will sort the entitíes by the date of the last_update. In the colum last_updated_ts there is a number like 1702696320.794098, this one you can convert to a date, for example with unix time stamp converter (https://www.unixtimestamp.com). If the date is older than your retention period setting there is something wrong.

Just to clarify, your recorder settings look like this and auto_purge is true?

recorder:
  db_url: !secret mariadb
  purge_keep_days: 7
  auto_purge: true
  commit_interval: 30
...

@mmiller7 when i first started using MariaDB my DB size got about 1GB bigger each day. After some digging on the forum i found all my sensors record every change in data, so for some sensors this is every second. For most my sensors this is overkill.

Therefore i change my mariadb config so it only commits data every 30sec (like the example from @madface and i reduced the sensors to only the sensors i want to keep data from.

Here’s my code;

Recorder configuration
recorder:
  db_url: !secret mariadb_url
  purge_keep_days: 14
  commit_interval: 30
  include: #Include entities
    entities:
      # Slimme Meter
      - sensor.electricity_meter_power_consumption
      - sensor.electricity_meter_power_production
      - sensor.electricity_meter_active_tariff
      - sensor.electricity_meter_energy_consumption_tarif_1
      - sensor.electricity_meter_energy_consumption_tarif_2
      - sensor.electricity_meter_energy_production_tarif_1
      - sensor.electricity_meter_energy_production_tarif_2
      - sensor.gas_meter_gas_consumption
      # Zonneplan
      - sensor.zonneplan_last_measured_value
      - sensor.zonneplan_yield_today
      - sensor.zonneplan_yield_total
      # Diverse
      - sensor.mariadb_database_size
      - sensor.influxdb_database_size
      - sensor.disk_use
      - binary_sensor.ping_homeserver
      - binary_sensor.backups_stale
1 Like

Thanks - I didn’t realize that was an option, I’ll have to consider that.

1 Like

14 days is a lot? I keep history for 3 years and want to set it up for 5 years now. :smiley: