Odd behaviour of Riemann integral over templates, different values for direct/indirect sensors

I have a working installation of HA which, among other things, monitors my solar installation (it counts PV generation, battery charge/discharge and so on). That works as expected; the values I get match those of the PV manufacturer's native tools/apps (Fronius, Solarweb).

I wanted to set up a second installation as a test instance, so I started rebuilding some things there from scratch, including the PV part mentioned above.

One day I noticed that the new system calculates different (sum) values, but only for battery charge/discharge, not for grid load/consumption or PV generation (I use Riemann sums for all of them, in both installations).

After some digging, I found that on the new system the Riemann integration has some jumps in its values. I dug further and it seems that the difference between the two systems is that

  • the original system uses integration over the “normal/direct” sensor
  • the second system uses integration over a template, which simply returns the value of the “normal/direct” sensor (so it is a kind of wrapper)

The values of both sensors (direct and templated) are identical: I downloaded both as CSV and did a diff, and the values match (the templated timestamps are ~ microseconds later, as one would expect).

But the Riemann integrals over these two sensors sum up to different values; the templated one has some jumps which I cannot explain.

After these findings I made a template sensor on the first system and added an integral over that one. The result is the same there: the templated integration value also differs and has these jumps.

Oddly enough, the other sensors (PV generation, grid consumption) are configured the same way on both systems and show no glitches. I really have no idea what the reason could be.

* "SolarNet Entladeleistung" is the "direct" sensor
* "wrapper_Hausspeicher_entladeleistung..." is the template version of it
* "test_integral...direkt" is the Riemann sum of the 1st
* "test_integral...wrapper" is the Riemann sum of the 2nd

At first glance, it sounds a lot like you’re using the default method of trapezoidal for the integral helper instead of left sum, which is what you want/need.

I thought of that, but both integrals use the same method, so why is the summation result different (and why, in particular, do the glitches occur at these specific locations and not all the way through)?

(And as I mentioned, the other PV sensors like grid in/out or PV generation behave identically without these glitches; I use trapezoid there throughout, for the direct sensors and also for the wrapped versions of them.)

Trapezoid will give very wrong results for solar after the sensor has been 0 for a long time. The higher the first measurement after a long period of no change in the sensor value, the worse the effect.
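To illustrate that effect, here is a minimal Python sketch. It is not the integral helper's actual code; the sample values and the function names `left_sum_wh` and `trapezoid_wh` are made up for this example. It assumes the integral only sees a new sample when the source value changes, so a long period at 0 W shows up as one long interval.

```python
# Samples as (seconds since start, power in W). Values are hypothetical.
samples = [
    (0, 0.0),        # sensor sits at 0 W
    (36000, 400.0),  # first change arrives 10 hours later
    (36010, 400.0),  # next reading 10 seconds after that
]

def left_sum_wh(points):
    """Left Riemann sum: each interval is weighted with the value at its start."""
    total = 0.0
    for (t0, p0), (t1, _p1) in zip(points, points[1:]):
        total += p0 * (t1 - t0)
    return total / 3600.0  # W*s -> Wh

def trapezoid_wh(points):
    """Trapezoidal rule: each interval is weighted with the mean of both ends."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        total += (p0 + p1) / 2.0 * (t1 - t0)
    return total / 3600.0  # W*s -> Wh

left = left_sum_wh(samples)
trap = trapezoid_wh(samples)
print(f"left sum:  {left:.2f} Wh")
print(f"trapezoid: {trap:.2f} Wh")
```

With these made-up numbers the trapezoid treats the 10 idle hours as a ramp from 0 W to 400 W and books roughly 2000 Wh for them, while the left sum books nothing for that interval.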

Your template sensor is made in the helpers section. Those do not have an availability template. That means differences may occur when the source sensor is unavailable.

Well, the CSVs for both sensors (direct, templated) are identical. If trapezoid generated errors, shouldn't that be the case for both integrals, so that both summed values are identical (maybe wrong in principle, but identical)?

I am not sure whether I can see unavailability in the CSV, but from what I can tell the lines match 1:1 (apart from the ~ microseconds difference in timestamps).

Nevertheless I will give it a try with the left sum.

The wrong values that trapezoid introduces could differ a lot if unavailability causes differences in the stored values when it is dark. If that is the case, left Riemann should eliminate that effect and the values should come significantly closer. Slightly varying delays in updating the template can cause tiny differences too, but I would not expect them to be significant.

Unless you add a max time interval. If you add a max time interval, long non-changing results will slowly progress to zero over the max time interval. Well, likely not slowly, it will probably just jump to zero.

I understood that, but this is not what I meant. I see the point that trapezoid might result in wrong overall values (and that left might be better for a precise result), but that is not my point.

I was wondering: if both sensors share identical data, and both use trapezoid, why are the two summations not equal? I don’t see it.

And what I was also asking myself: I see the possibility of the values diverging from each other, but why does it happen three times (there are three jumps in the graph)? I mean, if the direct sensor is unavailable, then the templated sensor will be too? And if the templated sensor were unavailable, why would it then sum up to a higher value compared to the direct sensor? (Again, the CSV exports of both sensors are identical.)

Are there any jobs running within HA at 8am (for instance), that might “clean up” data somewhere in the database?

Your wrapper isn’t accounting for invalid states where your original is. This is a common problem in HA with template entities. Based on your template, I would expect errors in your logs with each jump complaining about non-numeric values.

Will check for this and also look for unavailability. But why is the templated integral then higher in value?

Hmm, not that much in there?

There can be a number of reasons. Typically it’s because the integral adds the change in value. If your value is zero and goes to a number, that change is added and will be reflected in the calculation.

I know what you meant, but you fail to grasp what I am saying. I’m explaining what differences in both sensors there are, that lead to differences in what trapezoid calculates.

A bad drawing, but this is how trapezoid causes significant differences: an error might prevent a value from being stored in the recorder, even if that value is 0.

Yes, I know, but aren’t the values in the sensors' CSV exports the values the integral is calculated over, or does the recorder/integral use other/extra data?

(because both CSV exports are identical in number of entries and values, and still the result is different)

Well, the code doing the calculating is also the same. So the difference is in the data if the settings are the same. You are ignoring the time axis in your comparison. The data point I took as an example might be at a different time. It is just an example of how the differences occur.
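As a rough sketch of how one data point at a different time changes the result, the following Python example feeds two hypothetical series to the same trapezoid code. The numbers are invented and not taken from the sensors in this thread; the only difference between the series is that one 0 W sample just before the jump is missing from the second one.

```python
def trapezoid_wh(points):
    """Trapezoidal rule over (seconds, watts) samples, result in Wh."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        total += (p0 + p1) / 2.0 * (t1 - t0)
    return total / 3600.0

def left_wh(points):
    """Left Riemann sum over (seconds, watts) samples, result in Wh."""
    total = 0.0
    for (t0, p0), (t1, _p1) in zip(points, points[1:]):
        total += p0 * (t1 - t0)
    return total / 3600.0

# Hypothetical data: 0 W for 10 hours, then discharging at about 400 W.
with_zero = [(0, 0.0), (35990, 0.0), (36000, 400.0), (36010, 380.0)]
# Same readings, but the 0 W sample shortly before the jump never arrived.
without_zero = [(0, 0.0), (36000, 400.0), (36010, 380.0)]

jump = trapezoid_wh(without_zero) - trapezoid_wh(with_zero)
print(f"trapezoid difference: {jump:.1f} Wh")
print(f"left sum difference:  {left_wh(without_zero) - left_wh(with_zero):.1f} Wh")
```

In this sketch the series without the 0 W sample ends up about 2 kWh higher under trapezoid, which would show up as a single jump in the integral, while the left sum is the same for both series.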

I know two important things:

  1. Trapezoid Riemann is bad when applied to almost all HA data, even more so for solar data.
  2. Template sensors missing availability templates are bad news for integration sensors and utility meters. You get errors, errors mess with the code.

If both sensors provide wrong answers there’s no point in comparing them.

This is why I am so insistent here:

Sensor 1:

sensor.solarnet_entladeleistung,397.6117,2025-05-23T22:00:00.000Z
sensor.solarnet_entladeleistung,372.081,2025-05-23T22:00:04.171Z
sensor.solarnet_entladeleistung,414.6036,2025-05-23T22:00:14.171Z
sensor.solarnet_entladeleistung,398.8657,2025-05-23T22:00:24.169Z
sensor.solarnet_entladeleistung,356.0881,2025-05-23T22:00:34.177Z
sensor.solarnet_entladeleistung,443.5077,2025-05-23T22:00:44.206Z
sensor.solarnet_entladeleistung,382.0541,2025-05-23T22:00:54.178Z

Sensor 2:

sensor.wrapper_hausspeicher_entladeleistung_aktuell,397.6117,2025-05-23T22:00:00.000Z
sensor.wrapper_hausspeicher_entladeleistung_aktuell,372.081,2025-05-23T22:00:04.172Z
sensor.wrapper_hausspeicher_entladeleistung_aktuell,414.6036,2025-05-23T22:00:14.172Z
sensor.wrapper_hausspeicher_entladeleistung_aktuell,398.8657,2025-05-23T22:00:24.170Z
sensor.wrapper_hausspeicher_entladeleistung_aktuell,356.0881,2025-05-23T22:00:34.178Z
sensor.wrapper_hausspeicher_entladeleistung_aktuell,443.5077,2025-05-23T22:00:44.207Z

And so on for both; they are completely equal in time and values. This is why I can’t see it.

I’m out, this is pointless. Do you even know whether the differences in the Riemann sum stem from the period you show? The likeliest place for your difference to occur is not where measurements come in quickly. Look at places where the increase of the two integrals differs over the same time period. Focus on periods with a long time between measurements.

Please output all the data for both sensors, not just a snippet covering 45 seconds.
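For going through a full export, a small script can help. The Python sketch below assumes the CSV lines look like the ones posted above (`entity_id,state,timestamp` with a trailing `Z`); the function names are made up for this example. It lists the longest gaps between samples and any states that are not numbers, which are the places suggested above as worth a look. The three sample rows are copied from the Sensor 1 listing.

```python
import csv
import io
from datetime import datetime

def load_rows(text):
    """Parse 'entity_id,state,timestamp' lines into (timestamp, state) tuples."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) != 3:
            continue  # skip blank or malformed lines
        try:
            when = datetime.strptime(record[2], "%Y-%m-%dT%H:%M:%S.%fZ")
        except ValueError:
            continue  # skip a header line or an unparsable timestamp
        rows.append((when, record[1]))
    return rows

def non_numeric(rows):
    """Return the samples whose state is not a number (e.g. 'unavailable')."""
    found = []
    for when, state in rows:
        try:
            float(state)
        except ValueError:
            found.append((when, state))
    return found

def largest_gaps(rows, count=3):
    """Return the longest intervals between consecutive samples, longest first."""
    gaps = []
    for (t0, _s0), (t1, _s1) in zip(rows, rows[1:]):
        gaps.append(((t1 - t0).total_seconds(), t0, t1))
    return sorted(gaps, reverse=True)[:count]

sample = """sensor.solarnet_entladeleistung,397.6117,2025-05-23T22:00:00.000Z
sensor.solarnet_entladeleistung,372.081,2025-05-23T22:00:04.171Z
sensor.solarnet_entladeleistung,414.6036,2025-05-23T22:00:14.171Z
"""

rows = load_rows(sample)
for seconds, start, end in largest_gaps(rows):
    print(f"{seconds:8.1f} s gap from {start} to {end}")
print("non-numeric states:", non_numeric(rows))
```

Running this on the complete export of both sensors and comparing the output around the times of the three jumps should show whether the two series really are identical there.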