The type of Riemann sum describes exactly how to estimate (or invent) values that aren’t present in the data. My point was that the trapezoidal method cannot be “fixed”, because that would stop it from being the trapezoidal method. Trapezoid *is* the data-invention method, if you want to call it that. It works this way by definition.
The only way the trapezoidal method will work right on Home Assistant data is when there are extra (real) data points in the database to prove the value hasn’t changed. There’s no other way. If those data points aren’t there, the trapezoidal method is designed to interpolate from the last known data point.
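To make that concrete, here is a minimal sketch (plain Python, not Home Assistant code, with made-up numbers) of both sums over a sparse recording. A 50 W light burns from t=0 and is switched off at t=4; only the change points are recorded, so there is no extra point to prove the value stayed at 50 W:

```python
# Recorded samples: (hour, watts). The light really stayed at 50 W the
# whole time and was switched off at t=4, but only the changes were stored.
samples = [(0.0, 50.0), (4.0, 0.0)]

def left_sum(points):
    # Left Riemann sum: hold each value until the next recorded sample.
    return sum(v0 * (t1 - t0) for (t0, v0), (t1, _) in zip(points, points[1:]))

def trapezoid(points):
    # Trapezoidal sum: linearly interpolate between consecutive samples.
    return sum((v0 + v1) / 2 * (t1 - t0) for (t0, v0), (t1, v1) in zip(points, points[1:]))

print(left_sum(samples))   # 200.0 Wh -> matches reality
print(trapezoid(samples))  # 100.0 Wh -> invents a linear ramp down to 0
```

The trapezoid can only get the right answer here if a second 50 W sample shortly before the switch-off exists in the database.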
For solar power the trapezoidal method will work reasonably well, because the first value is close to 0 and solar inverter integrations are also close to polling behavior. But the fact remains that, as far as I know, the recorder has recently been changed to ignore the extra data points. That will hurt solar data too.
I am also convinced that a left Riemann sum won’t be far off on solar data. Sure, sometimes it will underestimate a bit (on the rise), but then it will overestimate a bit (on the fall). Because the graph mostly shows a typical bell curve, the deviation will be very small. Both methods will be off a bit depending on the polling frequency. If the solar inverter reports kWh directly (it should), then use that; it will be far more accurate.
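That cancellation can even be exact in a toy case: with evenly spaced polls, the difference between the two sums telescopes to (last − first) · Δt / 2, so a curve that starts and ends at 0 gives identical totals. A rough sketch with made-up numbers (not real inverter data):

```python
# Evenly spaced polls (every 2 h) of a made-up solar-like curve in watts
# that rises and falls back to 0.
samples = [(0.0, 0.0), (2.0, 30.0), (4.0, 100.0), (6.0, 60.0), (8.0, 0.0)]

def left_sum(points):
    return sum(v0 * (t1 - t0) for (t0, v0), (t1, _) in zip(points, points[1:]))

def trapezoid(points):
    return sum((v0 + v1) / 2 * (t1 - t0) for (t0, v0), (t1, v1) in zip(points, points[1:]))

# With uniform spacing the difference telescopes to (v_last - v_first) * dt / 2,
# which is 0 here because the curve starts and ends at 0 W.
print(left_sum(samples), trapezoid(samples))  # 380.0 380.0
```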
But because there is a very long period of 0, the first watt coming in may be counted for half of all the hours it was dark if there is only one 0 in the database. And most inverters have a cutoff: they won’t start or stop at a fraction of a watt, but at some multiple. That is a far more significant error than the perceived gain in accuracy of the trapezoidal method on solar data that fluctuates during the day.
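A rough sketch of that overnight error, with an assumed 20 W inverter cutoff and a single 0 recorded at sunset (illustrative numbers only):

```python
# One 0 W at sunset (t=0), then nothing until the inverter wakes up at
# its assumed 20 W cutoff at sunrise (t=10). Reality: 0 Wh overnight.
night = [(0.0, 0.0), (10.0, 20.0)]

def left_sum(points):
    return sum(v0 * (t1 - t0) for (t0, v0), (t1, _) in zip(points, points[1:]))

def trapezoid(points):
    return sum((v0 + v1) / 2 * (t1 - t0) for (t0, v0), (t1, v1) in zip(points, points[1:]))

print(left_sum(night))   # 0.0 Wh   -> correct, it was dark
print(trapezoid(night))  # 100.0 Wh -> phantom energy from half the night
```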
If you have integral sensors for both methods, you can see how far the trapezoid is off during the night by the peak at the start of solar production. You can also compare it with a left sum if you subtract that error peak when comparing. Then you’ll see how much they differ during actual production over a day.
I used the seemingly binary light behavior because it best illustrates how the errors occur when the value does not change. The error will be smaller on fluctuating data, as I mentioned.
There’s only one way I know of that people use to get more data points into the recorder, and that is to introduce very slight fluctuations in the data using a timer. That won’t make the data more accurate, though. It could help the derivative sensor, which is no longer usable for sensors with long constant periods, for the same reason I described here. There is a bug report for that, because the derivative is at times not 0 while the value is constant. That illustrates that ignoring data points is a real problem.
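To illustrate with a hypothetical sketch (plain Python, not the actual derivative sensor code): if the repeated samples during a constant stretch are dropped, the slope between the surviving points is non-zero across the whole gap:

```python
# The value was really constant at 10 from t=0 to t=8, then jumped to 50.
# If the repeated 10s are not recorded, only two points remain:
recorded = [(0.0, 10.0), (8.0, 50.0)]  # (hour, value)

def slopes(points):
    # Derivative estimated between consecutive recorded samples.
    return [(v1 - v0) / (t1 - t0) for (t0, v0), (t1, v1) in zip(points, points[1:])]

print(slopes(recorded))  # [5.0] -> non-zero over the whole constant
                         # period, instead of 0 until the jump at t=8
```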