How to keep your recorder database size under control

illuzn · January 18, 2024, 4:02am

@denilsonsa I suggest the following changes to bring it into line with the current practice and Home Assistant long term statistics - disabling the recorder for entities also disables long term statistics for them. Since your guide is the most popular guide on doing this, it makes sense to update this rather than create a new post.

Additions in underline.

Introduction

This is yet another guide on:

How to view the size of your database.

How to figure out which entities are using too much space.

And how to filter them out.

How to deal with entities which you want to keep long term statistics for but report too often

Motivation
…[add at the end of this section]
Finally, long term statistics relies on the recorder. Entities which are not recorded also will not have long term statistics gathered for them - this is not explicitly stated in the documentation but is implied by “Statistics are … summarized every hour” (presumably, from recorder data). This is often a problem for energy sensors/ solar inverters which often report data every second (generating around 85,000 entries a day) but for which you probably want to keep statistics for the Energy panel.

Filtering out entities
…[add at the end of this section]
Remember, filter entities which you do not want to keep data for at all (both history and statistics). For example, the energy dashboard relies upon statistics so be mindful of the consequences of your actions.

More aggressive configuration/ Keeping Statistics but Purging Recorder
…[add at the end of this section]
For entities which you want to keep statistics but:
* do not need history data
* report far too often taking up valuable storage
You can set up an automation to purge those entities daily. Put the following in a new automation (feel free to change the name/ description/ trigger time to something sensible):

automation:
  alias: Cleanup Database
  description: >-
    Cleans up sensors that log too much but want data of, then optimizes
    the database. Use recorder excludes for sensors you do not care about.
  trigger:
    - platform: time
      at: "03:22:00"
  condition: []
  action:
    - alias: Purge Spacehogging Entities
      service: recorder.purge_entities
      target:
        entity_id:
           - sensor.spacehogging_entity_1
           - sensor.spacehogging_entity_2
      data:
        keep_days: 1
  # This next step isn't strictly necessary. Delete if you prefer to miminise disk hits.
    - alias: Repack DB
      service: recorder.purge
      data:
        repack: true
        apply_filter_true

or
Note: It is untested what effect occurs to long term statistics if you change keep_days: 0 - this might cause a gap in your statistics data if long term statistics has not logged that data yet. If you have a lot of these spacehogging entities you might want to change keep_days:0 and use a time_pattern trigger instead:

  trigger:
    - platform: time_pattern
      # Runs every 6 hours
      hour: /6

or

Resources
…[add at the end of this section]
[1] Long Term Statistics Docs: Home Assistant Statistics