Questions about recorder, statistics & influxDB

hi
i am using HA now for a few months
and i am tiptoeing in and around most of the beginner’s mishaps
started with an apt-get install on my Raspberry Pi 3, now it’s a docker image on its own Raspberry Pi 4
the sqlite db soon grew to 800 MB
i configured recorder to keep 3 days, and it was great
i copied an example from the forum and adjusted it
that worked well until the db file unexpectedly started growing again, beyond 200 MB
i analyzed the contents and found statistics tables, but those were not documented on the home assistant sql pages
i ran delete statements manually on the statistics tables and “reclaimed” more than 100 MB
can i turn these statistics off?
or could it be that i installed something via hacs? i had it installed for some time but removed it again.

and to see more data than just 3 days i added influxDB & grafana

but do i understand this right:
influxDB is not a replacement for the sqlite db
if i want to see graphs with data beyond sqlite’s capabilities (3 days in my case) i need to go the grafana way and re-export the graphs from there to home assistant as pictures?

and my final question
if i have a stable number of sensors etc, shouldn’t the disk space used by the db be somewhat stable instead of constantly growing?


seriously? no replies at all?

I don’t have experience with Grafana or InfluxDB. The InfluxDB integration page does say that it doesn’t replace the recorder database, but runs in parallel to it.
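As a rough sketch of what running both in parallel looks like in configuration.yaml: the recorder keeps feeding the built-in history, while the influxdb integration copies state changes to the external database. This assumes an InfluxDB 1.x server; the host name, database name, user and the choice of domains are placeholders, not taken from the setup described in this thread.

```yaml
# configuration.yaml (sketch; all values are placeholders)
recorder:
  purge_keep_days: 3          # short-term history stays in the SQLite database

influxdb:
  host: influxdb.local        # hypothetical host name
  port: 8086
  database: home_assistant    # hypothetical database name
  username: homeassistant     # hypothetical user
  password: !secret influxdb_password
  include:
    domains:
      - sensor                # only forward sensor states to InfluxDB
```

Neither block knows about the other, so purging the recorder does not remove anything that was already written to InfluxDB.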

However, you can exclude a lot of entities in the recorder integration. I personally didn’t set purge_keep_days, so it is just the default of 10 days, which is fine for me. You didn’t accidentally disable auto_purge for the recorder integration, did you?

This example is from the Recorder integration page.

recorder:
  include:
    domains:
      - alarm_control_panel
      - light
    entity_globs:
      - binary_sensor.*_occupancy
  exclude:
    entities:
      - light.kitchen_light
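The purge options mentioned earlier sit at the same level as the filters. A minimal sketch with the defaults written out explicitly (leaving both lines out should behave the same):

```yaml
recorder:
  auto_purge: true        # default: old data is purged automatically every night
  purge_keep_days: 10     # default retention in days
  exclude:
    entities:
      - light.kitchen_light
```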

Those are the long-term statistics, but they don’t take up much space, since they are designed to save only the mean, maximum and minimum value per hour. That’s 24 * 3 = 72 values per entity per day, instead of every state change that the recorder keeps. Many entities won’t be added to your long-term statistics at all, because an entity needs certain properties to count as suitable for them.
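To put a number on that, here is a back-of-the-envelope estimate based on the three hourly values. The count of 50 sensors is just an assumed example, not a figure from this thread:

```latex
\begin{align*}
\text{values per entity per day}       &= 24 \times 3 = 72 \\
\text{values per entity per year}      &= 72 \times 365 = 26\,280 \\
\text{values for 50 entities per year} &= 26\,280 \times 50 = 1\,314\,000
\end{align*}
```

So the statistics do add up over time, but in proportion to the number of entities rather than to how often they report.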

You can see which entities have statistics saved at https://yourHomeAssistantURL/developer-tools/statistics

More info on this can be found in the developer documentation on sensors, in the section on long-term statistics.


thank you for your answer

recorder is properly configured for a 3 day retention and purges every night
and i exclude pretty much everything where i just want to know the current value
there are plenty of good examples for this available in this forum
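For readers following along, a setup like the one described could look roughly like this. The domains, the glob and the entity id are made-up examples, not the actual list used here:

```yaml
recorder:
  purge_keep_days: 3            # keep three days of history
  exclude:
    domains:
      - automation              # example domains, adjust to taste
      - update
    entity_globs:
      - sensor.*_uptime         # hypothetical naming pattern
    entities:
      - sensor.last_boot        # hypothetical entity id
```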

and regarding the statistics:
i have seen the developer-tools, nearly all my entities are compatible
but if i click on one, i just get the “regular” graph from lovelace
then i found the lovelace-card statistics-diagram
let’s say it did not excite me, and i don’t have a use case for it; i am more concerned about why the statistics are auto-enabled

and i read (skimmed) the page about the long term statistics…
a sensor with a suitable state_class is automatically opted in to statistics
most of my sensors are remote esphome-nodes or mqtt
which means i could probably change their state_class to opt-out of the statistics
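On the ESPHome side, opting out would be a per-sensor setting in the node’s YAML. A sketch, using a wifi_signal sensor as a stand-in for whatever the node actually measures; setting state_class to an empty string removes the sensor’s default state class:

```yaml
# ESPHome node config (sketch)
sensor:
  - platform: wifi_signal
    name: "Living Room WiFi Signal"   # hypothetical name
    update_interval: 60s
    state_class: ""                   # no state class, so not picked up for long-term statistics
```

The node has to be recompiled and updated for the change to take effect.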

i am still wondering:
is there no other way to disable the statistics?
the recorder can be configured so detailed
but the statistics are just auto-opt-in based on state_class?
which i cannot override in home assistant, only in the esphome nodes themselves?

really?

Yeah, that’s a good question. I don’t know about that. At least you have the option to change the state_class, but I can imagine it’s not very convenient to have to reflash all your nodes.

By the way, did you check whether excluding an entity from the recorder also excludes it from the long-term statistics? Although I can imagine that you would want the 3 days’ worth of data for some entities, but not the long-term stats.

Maybe some alternatives for changing the state_class:

There is also a property ‘internal’ for ESPHome sensors; see the ESPHome Sensor Component documentation. The description for this is:

Mark this component as internal. Internal components will not be exposed to the frontend (like Home Assistant). Only specifying an id without a name will implicitly set this to true.

or disabled_by_default

If true, then this entity should not be added to any client’s frontend, (usually Home Assistant) without the user manually enabling it (via the Home Assistant UI). Requires Home Assistant 2021.9 or newer. Defaults to false.

Maybe that also gives a desired outcome.
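A sketch of how those two options would look in a node config; the platforms, id and name are only illustrative:

```yaml
# ESPHome node config (sketch)
sensor:
  - platform: wifi_signal
    id: wifi_rssi_internal          # hypothetical id
    internal: true                  # never exposed to Home Assistant
    update_interval: 60s

  - platform: uptime
    name: "Node Uptime"             # hypothetical name
    disabled_by_default: true       # exposed, but stays disabled until enabled in the HA UI
```

Keep in mind that an internal sensor doesn’t show up in Home Assistant at all, so you lose the current value too, not just the statistics.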

I am not sure if the statistics for your nodes cause the database size to be so large. My knowledge about this is limited.

If you don’t find any other way within Home Assistant to exclude entities from long-term statistics, then maybe you could consider submitting an issue on the Home Assistant GitHub?