Problem with hourly averages and long-term statistics

Hi,

I’m planning to use Grafana and InfluxDB to visualise statistics for my heat pump over the course of a year. I found a pretty good guide that I should be able to follow. One of its recommendations is to review carefully which data actually needs to be captured. That is a good approach in general, so I took the opportunity to do so.

But in this context, I discovered something weird:
I have a sensor that captures the COP for the heat pump once a day. This works as expected every night at 23:55. Configuration:

- trigger:
    - platform: time
      at: "23:55:00"
  sensor:
    - name: "COP WW täglich"
      state: >
        {% set heat = states('sensor.erzeugte_warmemenge_ww') | float(0) %}
        {% set energy = states('sensor.energie_verbrauch_ww') | float(0) %}
        {{ (heat / energy) | round(2, default=0) }}
      availability: >
        {{states('sensor.erzeugte_warmemenge_ww')|is_number and states('sensor.energie_verbrauch_ww')|is_number}}
      state_class: measurement

However, when I look at the history for that sensor, it actually has more data points than were originally captured. This is because the long-term statistics table is inflated by hourly average values, which are pointless for such a sensor. Here is the data in this table:

Sensor Value Date
sensor.cop_ww_taglich 4,7799844145 2024-09-25T21:00:00.000Z
sensor.cop_ww_taglich 4,89 2024-09-25T22:00:00.000Z
sensor.cop_ww_taglich 4,89 2024-09-25T23:00:00.000Z
sensor.cop_ww_taglich 4,89416018554167 2024-09-26T21:00:00.000Z
sensor.cop_ww_taglich 4,94 2024-09-26T22:00:00.000Z
sensor.cop_ww_taglich 4,94 2024-09-26T23:00:00.000Z
sensor.cop_ww_taglich 4,94582424200556 2024-09-27T21:00:00.000Z
sensor.cop_ww_taglich 5,01 2024-09-27T21:55:00.467Z
sensor.cop_ww_taglich 4,75 2024-09-28T21:55:00.466Z
sensor.cop_ww_taglich 4,94 2024-09-29T21:55:00.467Z
sensor.cop_ww_taglich 4,71 2024-09-30T21:55:00.466Z
sensor.cop_ww_taglich 5,15 2024-10-01T21:55:00.467Z
sensor.cop_ww_taglich 4,6 2024-10-02T21:55:00.466Z
sensor.cop_ww_taglich 5,01 2024-10-03T21:55:00.466Z
sensor.cop_ww_taglich 4,86 2024-10-04T21:55:00.466Z
sensor.cop_ww_taglich 4,82 2024-10-05T21:55:00.467Z

So instead of reducing the stored data, the conversion to long-term statistics inflates it.

Does anyone have an idea how to avoid this?
Some ideas from my end:

  • Ignore it, since InfluxDB does not depend on the long-term statistics anyway
  • Change the state_class of the sensor from measurement to ???

I have never used Grafana, so my question may be silly.
Where did you get this table from? From Grafana?
Is it historical data?
If you think this table contains LTS, then why do almost all days have one reading while some days have more (and even those are not hourly)?

This is how HA works, and it can cause problems for things like daily maximums from long-term statistics (for example). If a sensor reads 200 at 9am and then 100 at 3pm, HA assumes it stayed at 200 for the whole period between 9am and 3pm. If those two readings fell on separate days, the maximum for the second day would still be 200, because the previous day's value is carried over.
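The same carry-forward behaviour seems to explain the odd hourly values in the table above. As a rough check, assume the hourly value is a time-weighted mean and that the update on 2024-09-26 landed at about 21:55:00.47 UTC, like the later rows (that timestamp is not in the table, so it is an assumption). The sensor would then have held 4.89 for about 3300.47 s of the 21:00 hour and 4.94 for the remaining 299.53 s:

$$
\bar{x} = \frac{3300.47 \cdot 4.89 + 299.53 \cdot 4.94}{3600} \approx 4.89416
$$

That agrees with the stored value of 4,89416018554167 for that hour. The following hours contain only the new value, so their mean is simply 4,94.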

If your numbers only ever increase over time, you could use a state class of total_increasing, which would avoid a lot of these problems. However, it sounds like you don’t need long-term statistics anyway, so why not just omit state_class so that they are not stored?
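As a minimal sketch of that last suggestion, this is the configuration from the first post with only the state_class line removed. Everything else is copied unchanged. I have not tested it against this setup, and as far as I know statistics that were already recorded are not removed automatically.

```yaml
# Sketch only: the original trigger-based template sensor without state_class,
# so Home Assistant should no longer compile long-term statistics for it.
- trigger:
    - platform: time
      at: "23:55:00"
  sensor:
    - name: "COP WW täglich"
      state: >
        {% set heat = states('sensor.erzeugte_warmemenge_ww') | float(0) %}
        {% set energy = states('sensor.energie_verbrauch_ww') | float(0) %}
        {{ (heat / energy) | round(2, default=0) }}
      availability: >
        {{states('sensor.erzeugte_warmemenge_ww')|is_number and states('sensor.energie_verbrauch_ww')|is_number}}
      # state_class intentionally omitted: no state class, no long-term statistics
```

The daily state change at 23:55 is still recorded as a normal state, so an export that works from state changes rather than from long-term statistics should be unaffected.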