@tobixen I’m also interested in your script
Another fairly simple way to reduce the database size somewhat would be to drop all columns that are no longer used once they’ve been replaced: null columns still use up some storage in each row (*). For example, in the states table 8 of the 15 columns are never populated in my database: entity_id, attributes, event_id, last_changed, last_updated, context_id, context_user_id and context_parent_id. Dropping these unused columns from my states table (with 5.2 million rows in it) frees up almost 40MB of data, and most likely even more disk space because a smaller row size can increase page fill efficiency. And states is not the only table that has columns that are always null.
(*) Storing a null value requires 1 byte per column per row according to this: (https://stackoverflow.com/questions/14934488/is-disk-space-consumed-when-storing-null-data).
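As a rough sanity check of that figure, assuming exactly 1 byte per null column per row as the linked answer states (the real per-row overhead depends on the database engine and row format):

```latex
\[
5.2 \times 10^{6}\ \text{rows} \times 8\ \text{columns} \times 1\ \text{byte}
= 41.6 \times 10^{6}\ \text{bytes} \approx 39.7\ \text{MiB}
\]
```

That is consistent with the “almost 40MB” mentioned above.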
Completely agree. I need to store data for central heating over the winter. I don’t care much about the rest of the data; 10 days or even less is probably enough.
Please let’s get this retention period per entity done.
Thanks
Yes, I would love to see this in the future HA version!
I guess this would be the easiest way to set up:
I think that I would like to see days retention implemented.
My personal preference would be to be able to bulk apply recorder strategies to entities.
Temperature strategy…
  max_days: 10
  precision: 1
  rounding: 'half'
I implement this already with template sensors for precision and rounding, and with a combination of labels and scripts that purge entities at midnight.
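A minimal sketch of what such a workaround could look like, not necessarily the exact setup described above. The entity names are hypothetical, and it assumes a Home Assistant version where `recorder.purge_entities` accepts `keep_days`. The template filter `round(1, 'half')` rounds to the nearest 0.5. Note that purging this way only trims recorder states, not long-term statistics.

```yaml
# Sketch only: sensor.example_temperature is a placeholder entity.
template:
  - sensor:
      - name: "Example temperature rounded"
        unit_of_measurement: "°C"
        device_class: temperature
        state_class: measurement
        # Only report a value while the source sensor is numeric.
        availability: "{{ states('sensor.example_temperature') | is_number }}"
        # precision 1 with method 'half' rounds to the nearest 0.5
        state: "{{ states('sensor.example_temperature') | float(0) | round(1, 'half') }}"

automation:
  - alias: "Nightly purge of short-retention sensors"
    trigger:
      - platform: time
        at: "00:00:00"
    action:
      - service: recorder.purge_entities
        data:
          entity_id:
            - sensor.example_temperature_rounded
          # Keep the last 10 days of states for the listed entities.
          keep_days: 10
```

The global `purge_keep_days` would then need to be at least as long as the longest retention you want, with the nightly purge trimming everything else down.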
So I know I’m late to the party, but my backups are becoming a problem as my MariaDB is full of long-term stats. The actual DB is 370MB, but 300MB of that is long-term stats… My HA is on a super slow Internet connection, so backups take ages to complete.
Most of the LT stats data is junk - for example I have an insane number of temperature and humidity readings, specifically 25,000 records for each sensor. That’s 3 years of HA data but I only need a week or a month at most. 99% of the data is unwanted junk… The recorder config is set to 2 day history but obviously that changes nothing in the LT space.
Technically all I want is my energy stats…
I really need a way to shed this dead load…is there a way to do this?
As an aside, and worse still, all this data also exists in InfluxDB (so I can get it if I need it), but here too there is no way to purge entities I no longer want or even no longer have…everything just grows and grows…it’s completely out of control!
Help?
CP.
Where integrations define a state class (state_class) you don’t want, you can override it in your customize.yaml by setting it to an empty string. You’ll then get a “fix issue” notice under the stats tab. One of the options will be to remove the data. Where you have defined a state class for sensors yourself, remove it and fix the resulting stats issue as mentioned.
recorder:
  include:
The above will exclude anything you have not specified.
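To make that concrete, a minimal sketch of an include-only setup might look like the following. The domain, glob and entity names are hypothetical placeholders; with only `include` configured, anything that does not match is not recorded.

```yaml
# Sketch only: adjust names to your own entities.
recorder:
  purge_keep_days: 10
  include:
    domains:
      - climate
    entity_globs:
      - sensor.heating_*
    entities:
      - sensor.example_energy_total
```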
Besides, there are great topics on handling the Recorder and DB (including purging).
So I already have a recorder.yaml for these settings and I have specified a long list of items to exclude. For example:
db_url: !secret mariadb_url
db_max_retries: 10
db_retry_wait: 3
purge_keep_days: 2
auto_purge: true
auto_repack: true
commit_interval: 5
exclude:
  domains:
    - camera
    - group
    - device_tracker
    - media_player
    - input_text
    - input_number
    - input_boolean
    - input_number
    - input_select
    - weblink
    - updater
    - sun
    - timer
    - weather
    - person
    - automation
  entity_globs:
    - binary_sensor.*_battery_low
    - binary_sensor.*_charger_type
    - binary_sensor.*_charging
    - binary_sensor.*_led_indication
    - binary_sensor.*_mqtt_room
    - binary_sensor.*_overheating
    - binary_sensor.*_overpowering
    - binary_sensor.*_tamper
    - binary_sensor.*_update_available
    - binary_sensor.*occupancy*
    - binary_sensor.*vibration*
    - binary_sensor.dryer*
    - binary_sensor.espresense_*
    - binary_sensor.home_alarm*
    - binary_sensor.motion_exterior*
    - binary_sensor.opnsense*
    - binary_sensor.washing*
    - binary_sensor.weatherflow_is*
    - binary_sensor.zigbee_router*
    - climate.*inverter*
    - number.*_occupancy_timeout
    - number.inverter_fan*
    - select.*_power_on_behavior
    - sensor.date*
    - sensor.time*
    - sensor.*_battery*
    - sensor.*_ble
    - sensor.*_current
    - sensor.*_illuminance
    - sensor.*_linkquality
    - sensor.*_lux
    - sensor.*_price
    - sensor.*_sensitivity
    - sensor.*_timeout
    - sensor.*_update_state
    - sensor.*_voltage
    - sensor.*device_temperature
    - sensor.*extractor*energy*
    - sensor.*wifi_rssi
    - sensor.*last_seen*
    - sensor.*lights*energy*
    - sensor.*power*
    - sensor.*rail*energy*
    - sensor.*rx*
    - sensor.*motion*temperature
    - sensor.*motion*humidity
...and so on...
However, it still merrily generates long term statistics…I think I need more elaboration from @parautenbach as to exactly what he means. Override it how? Where?
It’s not necessarily the case for example that I never want temperature statistics, but I certainly don’t want long term stats for all of them. So, excluding a whole class isn’t necessarily the answer.
I have half a mind to ditch some integrations and then re-add them but I don’t know if that would clear the data…and to be fair it isn’t really a sustainable approach.
homeassistant:
  customize:
    sensor.foo:
      state_class: ""
So I added this to my customize.yaml:
sensor.dining_h_t_humidity:
  state_class: ""
sensor.dining_h_t_temperature:
  state_class: ""
Then I restarted…no repairs or errors…it seemed to do nothing.
Seriously, you are posting in a Feature Request for retention periods by entity, and this is meant for the native SQLite DB.
And for some reason you have both MariaDB AND InfluxDB.
Please open a separate topic for your “issues”.
Besides, you might be better off just using “include”, as I mentioned, since you basically say
However, you should still not post your custom integration “issues” in an FR for native components.
True. My bad. Forgot this was a FR…
I have half a mind to ditch some integrations and then re-add them but I don’t know if that would clear the data…
Removing entities does not delete entity history/statistics data (unless you do it manually)
This seems like a useful feature. There are many sensors that I want to retain for longer periods, like battery status, leak status, temperature, humidity.
I would also suggest the ability to toggle this at the device level, or even integration level. I have 1370 entities currently, and it would be a lot of work enabling them one by one. But of course this depends on the UI - if it allows filters & multiple selection, then it may not be as much of a problem.
Good point. If the UI could group, filter and/or sort them it would make things easier. Of course none of that can happen until the underlying database structure changes to support retention periods per entity.
In the meantime, a lot of people have found it’s easier to use include rather than exclude in their configuration for Recorder. If I knew then what I know now, I might have gone that route when setting up HA. Admittedly it’s still a very blunt instrument, and something like this FR would still be extremely helpful.
A very useful use case for this functionality is device_tracker/person entities: it would be possible to see a track on a map for a particular period in the past. Of course, this only works if the period is defined by “start” and “end” values rather than by “hours_to_show”. Some custom map cards (like this great card) already implement this way of defining a period.
I’d love to have retention period by entity, and also the ability to downsample it after that.
Moreover, it would be great to be able to configure include, exclude and retention periods using labels.
For example, define a label “label for excluded entities”, associate this label with every entity you want to exclude, and just add this to the recorder configuration:
recorder:
  exclude:
    label: 'label for excluded entities'