Statistics: how often are data collected? (wrong docs?)

Assume a sensor with “state_class: measurement” is updated every 30 seconds.

According to this:
statistics of hourly min, max and average sensor readings is updated every 5 minutes

This would mean:

  1. Every 5 minutes HA analyses gathered data and calculates minimal, maximal & average values for the last hour.
  2. Then these 3 calculated values are stored in DB and are kept forever.

Then these calculated (“statistical”) data may be displayed by a statistics-graph card with the required precision (5minutes, hour, etc.).

But - here it is specified differently:
Each hour it will take a snapshot of supported entities and track different things about the entity state. ... Because they are summarized every hour, they only create 24 entries per day.

So, which is correct:

  • statistics is calculated every 5 minutes;
  • statistics is calculated every hour?

Assume that there are only 24 entries per day.
Then how are “5minutes” data available for the statistics graph card?

I guess that these words are wrong:
Each hour it will take a snapshot of supported entities and track different things about the entity state. ... Because they are summarized every hour, they only create 24 entries per day.

What is the real picture, then?

I think (but do not know for sure) it is both.

For one of my statistics graphs that has months of data I can only see back 7 days of 5 minute data. Hourly data goes back the full record length. So does daily and monthly.

That means that:
– at least these data are collected every 5 minutes;
– these data are “purged” so that only the last 7 days are visible - similarly to “conventional non-statistical” data, which are kept for the last xxx days (according to the Recorder’s settings, 10 days by default).
And this does not match:

Now I am a bit confused.
What I believed is:

  1. Assume there are statistics for some sensor which are calculated every 5 minutes (i.e. “version 1” from my 1st post).
  2. The “statistics-graph” card, when used with “period = 5minutes” → shows all kept statistical data for this sensor w/o any processing - from the beginning (i.e. “all statistics”), not “last 7 days”.
  3. When used with “period = hour” → only “hourly” data from the gathered statistics are used - and here I do not know whether this is “just let’s take every 12th (60/5) reading” or “let’s recalculate min/max/avg for every hour based on the 5-minute readings” - and again from the beginning.

I.e. the “statistics graph” card may either show all gathered stats (period=5minutes) or recalculated (period=hour,…). So, statistics is always gathered every 5 minutes based on the actual readings within last hour.
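To make the two possible readings of “period = hour” concrete, here is a minimal Python sketch. It is only an illustration of the question, not how HA actually does it: the row layout `(min, max, mean)` and both function names are made up for this example, and the plain average assumes all twelve 5-minute windows are equally long.

```python
from statistics import fmean

# Twelve hypothetical 5-minute rows for one hour, each as (min, max, mean).
rows = [(20.0 + i, 21.0 + i, 20.5 + i) for i in range(12)]


def every_12th(rows):
    """Reading A: just take the last 5-minute row of the hour."""
    return rows[-1]


def recalculated(rows):
    """Reading B: recalculate min/max/avg over all twelve rows."""
    lowest = min(r[0] for r in rows)
    highest = max(r[1] for r in rows)
    # Windows are assumed equally long, so the mean of the means is used.
    average = fmean(r[2] for r in rows)
    return (lowest, highest, average)


print(every_12th(rows))    # only reflects the last 5 minutes of the hour
print(recalculated(rows))  # reflects the whole hour
```

With a rising sensor value the two readings give clearly different hourly min and mean values, which is why it matters which one is used.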

Btw, here it is also written about “every hour”:


which does not match with this:

and does not match with the possibility to select “period = 5minutes”:
[image]

What I think now:

  1. Statistics is re-calculated every 5 minutes.
  2. These “5 minutes” results are stored in DB (let’s call it “short-term statistics”).
  3. Every hour these values (i.e. calculated at this moment) are stored in a “long-term” part of DB.
  4. The “short-term statistics” persist only within the “purge interval” - i.e. only the last 10 days (by default) may be displayed in a “statistics-graph” card.
  5. The “long-term statistics” persist forever - i.e. the whole “hourly” history may be displayed in a “statistics-graph” card.
  6. If bigger periods (day, week, …) are selected for the “statistics-graph” card - then consolidated (i.e. recalculated) data based on that “hourly” data are displayed.
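The row counts implied by this picture can be checked with a small Python sketch. It is an assumption-laden toy model, not HA code: `purge_short_term`, `long_term_rows` and `KEEP_DAYS` are hypothetical names, the rows are represented only by their start timestamps, and whether the real purge boundary is inclusive is not something I know.

```python
from datetime import datetime, timedelta

KEEP_DAYS = 10  # assumed retention, taken from the default mentioned in this thread


def purge_short_term(starts, now, keep_days=KEEP_DAYS):
    """Keep only the 5-minute rows that start within the last keep_days days."""
    cutoff = now - timedelta(days=keep_days)
    return [s for s in starts if s >= cutoff]


def long_term_rows(days_of_history):
    """Hourly rows are never purged in this model: 24 per day."""
    return days_of_history * 24


# Simulate 30 days of history: 288 five-minute rows per day.
first = datetime(2024, 1, 1)
now = first + timedelta(days=30)
starts = [first + timedelta(minutes=5 * i) for i in range(30 * 288)]

kept = purge_short_term(starts, now)
print(len(kept))           # 5-minute rows still available to the card
print(long_term_rows(30))  # hourly rows available to the card
```

In this model the card can show 5-minute data only for the last 10 days (2880 rows), while all 720 hourly rows for the 30 days remain available.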

These docs (one, two) are still unclear on this point. That is why there is confusion about “why 5minutes statistics is only displayed for the last 10 days”.

I had the same question. I looked at my database, and there is in fact a table called statistics_short_term which holds 5-minute-interval data going back as far as my recorder purge_keep_days setting (14 days in my case). There is a separate table called statistics which holds the hourly data going back to the date and hour I installed my first measurement sensor in Home Assistant. I did not see any other relevant statistics tables, so I suspect you are also correct about HA re-calculating the hourly data to come up with daily, monthly, yearly info.
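For anyone who wants to repeat this kind of check, here is a rough Python sketch of the idea. It runs against an in-memory stand-in database, not a real recorder database: only the two table names come from the post above, while the columns (`start_ts`, `mean`, `min`, `max`) and the `summarize` helper are assumptions for the demo, so inspect the real schema before adapting it.

```python
import sqlite3

# In-memory stand-in with simplified, assumed columns (not the real schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE statistics_short_term (start_ts REAL, mean REAL, min REAL, max REAL);
    CREATE TABLE statistics (start_ts REAL, mean REAL, min REAL, max REAL);
""")

# Two hours of made-up data: 24 five-minute rows and 2 hourly rows.
conn.executemany(
    "INSERT INTO statistics_short_term VALUES (?, ?, ?, ?)",
    [(300.0 * i, 20.0, 19.0, 21.0) for i in range(24)],
)
conn.executemany(
    "INSERT INTO statistics VALUES (?, ?, ?, ?)",
    [(3600.0 * i, 20.0, 19.0, 21.0) for i in range(2)],
)


def summarize(conn, table):
    """Return (row count, earliest start, latest start) for one table."""
    # The table name is interpolated into SQL, so allow only the two known names.
    if table not in ("statistics", "statistics_short_term"):
        raise ValueError(f"unexpected table: {table}")
    query = f"SELECT COUNT(*), MIN(start_ts), MAX(start_ts) FROM {table}"
    return conn.execute(query).fetchone()


print(summarize(conn, "statistics_short_term"))
print(summarize(conn, "statistics"))
```

Comparing the earliest start in each table is what shows how far back the 5-minute data and the hourly data actually go.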


A recent PR added some good corrections to the docs; both short- and long-term statistics are now mentioned.