Recorder Retention Period By Entity

@tobixen I’m also interested in your script :grinning:

Another rather simple way to reduce the database size somewhat would be to drop all columns that are no longer used once they’ve been replaced: null columns still use up some storage in each row (*). For example, in my database 8 of the 15 columns in the states table are never populated: entity_id, attributes, event_id, last_changed, last_updated, context_id, context_user_id and context_parent_id. Dropping these unused columns from my states table (with 5.2 million rows in it) frees up almost 40MB of data, and most likely even more disk space because a smaller row size can increase page fill efficiency. And states is not the only table that has columns that are always null.

(*) Storing a null value requires 1 byte per column per row according to this: https://stackoverflow.com/questions/14934488/is-disk-space-consumed-when-storing-null-data
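
The "almost 40MB" figure can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: it takes the 1 byte per null column per row from the linked answer as given, and the function name is made up for illustration.

```python
# Rough estimate of the space taken by always-null columns.
# Assumption (from the post above): each null costs 1 byte per column per row.

def null_overhead_bytes(rows: int, unused_columns: int, bytes_per_null: int = 1) -> int:
    """Return the bytes spent on null markers for columns that are never populated."""
    return rows * unused_columns * bytes_per_null


saved = null_overhead_bytes(rows=5_200_000, unused_columns=8)
print(f"{saved / 1_000_000:.1f} MB")  # 41.6 MB
print(f"{saved / 2**20:.1f} MiB")     # 39.7 MiB
```

This does not account for any gain from better page fill, so the real saving on disk could differ.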

Completely agree. I need to store data for central heating over the winter. I don’t care much about the rest of the data; 10 days or even less is probably enough.

Please let’s get this retention period per entity done.

Thanks

Yes, I would love to see this in the future HA version!

I guess this would be the easiest way to set it up:

I think that I would like to see days retention implemented.

My personal preference would be to be able to bulk apply recorder strategies to entities.

Temperature strategy…

max_days: 10
precision: 1
rounding: 'half'

I implement this already with template sensors for precision and rounding, and with a combination of labels and scripts that purge entities at midnight.
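
To make the idea of a reusable strategy a bit more concrete, here is a minimal sketch of what such a strategy could do with a reading. Everything in it is hypothetical: `RecorderStrategy` is not a Home Assistant class, and interpreting `rounding: 'half'` as round-half-up is my assumption, not something stated in the post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class RecorderStrategy:
    """Hypothetical per-entity strategy: how long to keep data and how to round it."""
    max_days: int
    precision: int

    def round_value(self, value: float) -> float:
        # precision=1 gives a quantum of Decimal('0.1'); 'half' is assumed to mean half-up.
        quantum = Decimal(1).scaleb(-self.precision)
        return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))

    def is_expired(self, recorded_at: datetime, now: datetime) -> bool:
        # A record is due for purging once it is older than max_days.
        return now - recorded_at > timedelta(days=self.max_days)


TEMPERATURE = RecorderStrategy(max_days=10, precision=1)

now = datetime(2024, 1, 20, tzinfo=timezone.utc)
print(TEMPERATURE.round_value(21.25))                        # 21.3
print(TEMPERATURE.is_expired(now - timedelta(days=11), now)) # True
```

The same strategy object could then be attached to many entities at once, which is the "bulk apply" part of the request.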

So I know I’m late to the party, but my backups are becoming a problem as my MariaDB is full of long term stats. The actual DB is 370MB, but of that 300MB is long term stats…my HA is connected to a super slow Internet connection so backups take ages to complete.

Most of the LT stats data is junk - for example I have an insane number of temperature and humidity readings, specifically 25,000 records for each sensor. That’s 3 years of HA data but I only need a week or a month at most. 99% of the data is unwanted junk… The recorder config is set to 2 day history but obviously that changes nothing in the LT space.
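
Those numbers are roughly what you would expect if each sensor gets one long-term statistics row per hour, which is an assumption on my part since the post does not state the interval. A quick sketch of the arithmetic, with a made-up helper name:

```python
# Assumption: one long-term statistics row per sensor per hour.
HOURS_PER_DAY = 24


def hourly_rows(days: int) -> int:
    """Rows accumulated per sensor over the given number of days."""
    return days * HOURS_PER_DAY


three_years = hourly_rows(3 * 365)
one_month = hourly_rows(30)

print(three_years)                            # 26280
print(one_month)                              # 720
print(f"{1 - one_month / three_years:.1%}")   # 97.3%
```

So keeping only a month would discard about 97% of the rows per sensor, and keeping only a week would discard even more, which is in the same ballpark as the 99% quoted above.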

Technically all I want is my energy stats…

I really need a way to shed this dead load…is there a way to do this?

As an aside, and worse still, all this data also exists in InfluxDB (so I can get it if I need it), but here too there is no way to purge entities I no longer want or even no longer have…everything just grows and grows…it’s completely out of control!

Help?

CP.

Where integrations define a state class (state_class) you don’t want, you can override it in your customize.yaml by setting it to an empty string. You’ll then get a “fix issue” notice under the statistics tab. One of the options will be to remove the data. Where you have defined a state class for sensors yourself, remove it and fix the resulting statistics issue as mentioned.

recorder:
  include:

The above will exclude anything you don’t explicitly specify.

Besides, there are great topics on handling the Recorder and DB (including purging).

So I already have a recorder.yaml for these settings and I have specified a long list of items to exclude. For example:

  db_url: !secret mariadb_url
  db_max_retries: 10
  db_retry_wait: 3
  purge_keep_days: 2
  auto_purge: true
  auto_repack: true
  commit_interval: 5
  exclude:
    domains:
      - camera
      - group
      - device_tracker
      - media_player
      - input_text
      - input_number
      - input_boolean
      - input_select
      - weblink
      - updater
      - sun
      - timer
      - weather
      - person
      - automation
    entity_globs:
      - binary_sensor.*_battery_low
      - binary_sensor.*_charger_type
      - binary_sensor.*_charging
      - binary_sensor.*_led_indication
      - binary_sensor.*_mqtt_room
      - binary_sensor.*_overheating
      - binary_sensor.*_overpowering
      - binary_sensor.*_tamper
      - binary_sensor.*_update_available
      - binary_sensor.*occupancy*
      - binary_sensor.*vibration*
      - binary_sensor.dryer*
      - binary_sensor.espresense_*
      - binary_sensor.home_alarm*
      - binary_sensor.motion_exterior*
      - binary_sensor.opnsense*
      - binary_sensor.washing*
      - binary_sensor.weatherflow_is*
      - binary_sensor.zigbee_router*
      - climate.*inverter*
      - number.*_occupancy_timeout
      - number.inverter_fan*
      - select.*_power_on_behavior
      - sensor.date*
      - sensor.time*
      - sensor.*_battery*
      - sensor.*_ble
      - sensor.*_current
      - sensor.*_illuminance
      - sensor.*_linkquality
      - sensor.*_lux
      - sensor.*_price
      - sensor.*_sensitivity
      - sensor.*_timeout
      - sensor.*_update_state
      - sensor.*_voltage
      - sensor.*device_temperature
      - sensor.*extractor*energy*
      - sensor.*wifi_rssi
      - sensor.*last_seen*
      - sensor.*lights*energy*
      - sensor.*power*
      - sensor.*rail*energy*
      - sensor.*rx*
      - sensor.*motion*temperature
      - sensor.*motion*humidity

...and so on...
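
With a glob list this long it can be hard to tell what actually gets caught. A small sketch for checking candidate entity IDs against a few of the globs above, using Python's standard `fnmatch` module. The entity IDs are invented for illustration, and Home Assistant's own matching may differ in details, so treat this as an approximation.

```python
from fnmatch import fnmatchcase

# A few globs copied from the exclude list above.
EXCLUDE_GLOBS = [
    "sensor.*_battery*",
    "sensor.*power*",
    "sensor.*lights*energy*",
]


def is_excluded(entity_id: str, globs: list[str]) -> bool:
    """Return True if any glob matches the entity ID (case-sensitive)."""
    return any(fnmatchcase(entity_id, glob) for glob in globs)


# Hypothetical entity IDs, not taken from the thread.
for entity_id in (
    "sensor.hall_motion_battery",
    "sensor.grid_energy_total",
    "sensor.powerwall_energy_today",
):
    print(entity_id, is_excluded(entity_id, EXCLUDE_GLOBS))
```

Note that a broad glob such as `sensor.*power*` would also match an energy sensor that merely has "power" in its name, like the third example, which may matter if energy stats are the one thing you want to keep.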

However, it still merrily generates long term statistics…I think I need more elaboration from @parautenbach as to exactly what he means. Override it how? Where?

It’s not necessarily the case for example that I never want temperature statistics, but I certainly don’t want long term stats for all of them. So, excluding a whole class isn’t necessarily the answer.

I have half a mind to ditch some integrations and then re-add them but I don’t know if that would clear the data…and to be fair it isn’t really a sustainable approach. :frowning:

homeassistant:
  customize:
    sensor.foo:
      state_class: ""

So I added this to my customise.yaml:

sensor.dining_h_t_humidity:
  state_class: ""
sensor.dining_h_t_temperature:
  state_class: ""

Then I restarted…no repairs or errors…it seemed to do nothing. :frowning:

Seriously, you’re posting in a Feature Request for retention periods by entity, and this is meant for the native SQLite DB.
And for some reason you have both MariaDB and InfluxDB.

Please open a separate topic for your “issues”.

Besides, you might be better off just using “include” as I mentioned, as you basically say yourself.

However, you should still not post your custom integration “issues” in a FR for native components.

True. My bad. Forgot this was a FR… :frowning:

I have half a mind to ditch some integrations and then re-add them but I don’t know if that would clear the data…

Removing entities does not delete entity history/statistics data (unless you do it manually)
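
For the recorded state history, one manual route is the `recorder.purge_entities` action, which can be called through the REST API. The sketch below only builds the request and does not send it. The base URL and token are placeholders, and the `keep_days` field is an assumption to verify against the Recorder documentation for your version. As far as I know this targets recorded states, so I would not rely on it to remove long-term statistics.

```python
import json
from urllib.request import Request


def build_purge_request(base_url: str, token: str, entity_ids: list[str], keep_days: int = 0) -> Request:
    """Build (but do not send) a POST request for the recorder.purge_entities action."""
    payload = {"entity_id": entity_ids, "keep_days": keep_days}  # keep_days: assumed field
    return Request(
        url=f"{base_url.rstrip('/')}/api/services/recorder/purge_entities",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with your own instance URL and long-lived access token.
request = build_purge_request(
    "http://homeassistant.local:8123",
    "YOUR_LONG_LIVED_TOKEN",
    ["sensor.dining_h_t_humidity"],
    keep_days=10,
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`, ideally after trying it on a backup first.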

This seems like a useful feature. There are many sensors that I want to retain for longer periods, like battery status, leak status, temperature, humidity.

I would also suggest the ability to toggle this at the device level, or even integration level. I have 1370 entities currently, and it would be a lot of work enabling them one by one. But of course this depends on the UI - if it allows filters & multiple selection, then it may not be as much of a problem.

Good point. If the UI could group, filter and/or sort them it would make things easier. Of course none of that can happen until the underlying database structure changes to support retention periods per entity.

In the meantime, a lot of people have found it’s easier to use include rather than exclude in their configuration for Recorder. If I knew then what I know now, I might have gone that route when setting up HA. Admittedly it’s still a very blunt instrument, and something like this FR would still be extremely helpful.

A very useful use case for this functionality is device_tracker/person entities: it would be possible to see a track on a map for a particular period in the past. Of course, this requires the period to be defined by “start” and “end” values rather than by “hours_to_show”. Some custom map cards (like this great card) already implement this way of defining a period.

I’d love to have retention period by entity, and also the ability to downsample it after that.

Moreover, it would be great to be able to configure include, exclude and retention periods using labels.
For example, define a label “label for excluded entities”, associate this label with every entity you want to exclude, and just configure it in the recorder configuration:

recorder:
  exclude:
    label: 'label for excluded entities'
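
The `label` key above is a proposal, not an existing recorder option as far as this thread goes. Just to illustrate the intended behaviour, here is a hypothetical sketch of how label-based exclusion could be resolved. The function, the label mapping and the entity IDs are all made up.

```python
def entities_to_record(entity_labels: dict[str, set[str]], excluded_label: str) -> list[str]:
    """Return the entity IDs that do not carry the excluded label, sorted for stable output."""
    return sorted(
        entity_id
        for entity_id, labels in entity_labels.items()
        if excluded_label not in labels
    )


# Hypothetical entities and labels.
ENTITY_LABELS = {
    "sensor.boiler_temperature": {"heating"},
    "sensor.hall_linkquality": {"label for excluded entities"},
    "sensor.grid_energy_total": set(),
}

print(entities_to_record(ENTITY_LABELS, "label for excluded entities"))
# ['sensor.boiler_temperature', 'sensor.grid_energy_total']
```

The same lookup could return a retention period per label instead of a yes/no answer, which would cover the retention part of the suggestion.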

WTH - Option in recorder yaml config to specify different keep_days by domain/entity
