Fix bad data in energy tab

It's the exact sensor:

As you can see from the picture, I changed the integration, and from now on I can probably correct the issues, as this integration offers an energy sensor as well.

My guess is that I cannot correct it because the old sensor was the integration sensor, which cannot be corrected:

  - platform: integration          # Riemann sum integration helper
    source: sensor.sun2000_10ktl_m1_sun2000_12_bt2150240752  # power sensor to integrate
    name: pv_energy
    unit_prefix: k                 # report in kWh
    round: 2

The bad thing is that I can't delete the old sensor from the Energy Dashboard, because I would lose all the previous data. So I just have to keep in mind that there is this 300 kWh difference. One thing I can probably do is change the pv_energy sensor to some static number and correct the bad data in the DB.

Have you changed the minutes as well? Because every time you show this picture, you only show the whole hour.

The data will be stored somewhere during the hour, not just at the start or end of the hour.
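That in-between data lives in the statistics_short_term table, which holds the 5-minute samples that later get rolled up into the hourly statistics table. You can inspect them with something like this (the metadata_id and time window are placeholders; on older schemas the column is start instead of start_ts):

select datetime(start_ts, 'unixepoch') as t, state, sum
from statistics_short_term
where metadata_id = 15   -- your sensor's id from statistics_meta
  and start_ts >= strftime('%s', '2023-11-06 10:00:00')
  and start_ts <  strftime('%s', '2023-11-06 11:00:00')
order by start_ts;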

So I commented out the integration sensor in configuration.yaml, corrected the DB, switched to the new PV integration, and it looks like it's solved.

And yes, I changed the minutes as well.

If it shows a spike there on the Energy tab, that means a huge change was recorded in the long-term statistics. It should be visible through the Developer Tools as well.
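If you would rather spot the spike in the database than in Developer Tools, a window function query like this finds the biggest hour-to-hour jumps in the long-term statistics (SQLite 3.25+; the metadata_id is a placeholder):

select id,
  datetime(start_ts, 'unixepoch') as hour,
  sum,
  sum - lag(sum) over (order by start_ts) as jump
from statistics
where metadata_id = 15   -- your sensor's id from statistics_meta
order by jump desc
limit 5;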

By the way, did you also set the state of the sensor to the correct value in the Developer Tools before you started all the modifications?

My SUM column is ruined (maybe I did it myself by using the query incorrectly).

But now I need to fix it:

How do I rebuild the SUM column based on the differences in the STATE column?
Any query to do this?

Thanks to this topic I managed to edit/recalculate all my sum values by updating the SQLite statistics tables. Now my reports finally make sense.

However, as soon as the hourly integration runs on my Electricity Meter Reader digitizer value, it adds the huge offset again in the 'sum' (not in the state; that one is correct, see below).
Even if I turn HA off at that time. Is this value stored somewhere? If so, where do I change it?

### DIGITIZER METER READERS ###
- device_class: "energy"
  state_class: "total_increasing"   # statistics treat any drop in value as a meter reset
  name: Electricity Meter Reader
  state_topic: "digitizer/elecmeter/main/value"
  unit_of_measurement: "kWh"



If your sensor still has the bad value when you look at it, you will also need to set the correct value with:

@ORi I had the problem… You can just edit the sensor value in the Developer Tools → Statistics.

You search for the sensor you want to update and then click on the far right on “sum adjustment” (not sure that's the exact wording, as it's translated from my French version, “ajuster la somme”)… Then you look at the date and time where your sensor goes “bananas” and adjust the value to the correct number… Done!

Normally all the statistics where this sensor is involved are automatically updated…


Thanks for the suggestion @browetd. That's actually not the problem here, as you can see from the screenshot below:

In fact the sensor value is completely correct, and the 'adjust sum' value is also correct at that time. My best guess is that the integration that calculates the energy consumption over the entire hour somehow stores the incorrect sum somewhere.

EDIT1: I'll experiment with correcting the statistics_short_term table and provide feedback if it works.

EDIT2: it works. Updating the 'statistics_short_term' table in the same way (SET sum = sum - $offset) resulted in the values being correct, and the integrations now calculate the energy consumption correctly in the Energy Dashboard. No need to stop HA if it's done this way.
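For reference, the correction described above boils down to something like this; the offset, metadata_id and cutoff timestamp are placeholders you have to fill in from your own data (older schemas use start instead of start_ts):

-- long-term hourly statistics
update statistics
set sum = sum - 300.0          -- your offset
where metadata_id = 15         -- your sensor's id
  and start_ts >= 1699228800;  -- epoch of the first bad row

-- the 5-minute short-term table must match, otherwise the next
-- hourly rollup reintroduces the offset (this was the EDIT2 insight)
update statistics_short_term
set sum = sum - 300.0
where metadata_id = 15
  and start_ts >= 1699228800;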

Hopefully this helps anyone in the future as well.

My smart meter had some hiccups and missed readings of the total consumption.
This led to a messed-up energy dashboard.

Is there a way to smooth out the missing readings, filling the gaps with a calculated mean?

Hi! I built a query (MySQL/MariaDB based) that is useful for fixing the whole data set from the first row onward.

My first row was at 11.00, so it starts from a positive value; you have to customize this dataset to adapt it to your situation:


@original_total := 11.25, -- starting value (first row); set whatever you want here
@first_state := 11.25,    -- use the same value as the previous row, or a higher one
@prev_state := 0,
@first_id := 1492,        -- id of your first row, with sum and state equal to 11.25 in this example
@meta_id := 116,          -- metadata id

You can adapt it for statistics and statistics_short_term; just change the table name and fix the variables! :smiley:
Hope it will be useful!

UPDATE
    statistics AS s
INNER JOIN (
    SELECT
        myid,
        cumulative_sum AS mysum
    FROM (
        SELECT
            t.id AS myid,
            t.start,
            t.state,
            t.sum,
            IF(
                t.id = @first_id,
                @running_total := @original_total,
                IF(
                    (t.state - @prev_state) < 0,
                    @running_total := @running_total + t.state,
                    @running_total := @running_total + (t.state - @prev_state)
                )
            ) AS cumulative_sum,
            @diff := (t.state - @prev_state) AS diff_state,
            @prev_state := t.state AS prev_state,
            IF(
                t.id = @first_id,
                @risultato := @first_state,
                @risultato := (t.state - @diff)
            ) AS risultato
        FROM (
            SELECT * FROM statistics
        ) t
        JOIN (
            SELECT
                @original_total := 11.25,
                @first_state := 11.25,
                @prev_state := 0,
                @first_id := 1492,
                @meta_id := 116,
                @diff := 0
        ) r
        WHERE
            metadata_id = @meta_id AND id >= @first_id
        ORDER BY
            t.id
    ) d
) m ON s.id = m.myid
SET
    s.sum = m.mysum
WHERE
    metadata_id = 116;
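Since this rewrites the whole sum history in place, it can be worth running it inside a transaction and eyeballing the first rows before committing (a sketch using the example's metadata_id 116 and first id 1492):

START TRANSACTION;

-- run the big UPDATE above, then inspect:
SELECT id, start, state, sum
FROM statistics
WHERE metadata_id = 116 AND id >= 1492
ORDER BY id
LIMIT 20;

-- COMMIT; if the sums look right, or ROLLBACK; to undo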

I’ve just updated the full sum using:

update statistics
set sum = new_sum
from (
  select id,
    -- running total of the per-row increments = the rebuilt sum
    sum(increment) over (
      partition by metadata_id
      order by created_ts asc
    ) as new_sum
  from (
      select id,
        metadata_id,
        created_ts,
        -- increment = state minus the previous state (0 for the first row)
        state - coalesce(
          lag(state) over (
            partition by metadata_id
            order by created_ts asc
          ),
          state
        ) as increment
      from statistics
      where metadata_id = 15
      order by created_ts asc
  ) as new_statistics
) as update_statistics
where statistics.id = update_statistics.id
;

Note: my metadata_id = 15; see select id, statistic_id from statistics_meta;

I’m having a similar issue, BUT the sums seem to be correct. In my case it’s very easy to find since it’s from the first day I re-connected my Sense monitor, with 157.6 kWh used in the first hour.

As far as I understand, the energy dashboard data comes from the statistics table. But the energy usage chart does not align with what I see in the statistics table. The only thing that I think might be part of the problem is that the state does not align with the sum, but I don't know if it should or not.

Any idea what I need to correct?

Looks like my chart is fine except for the total cost for the day. It's set as a static price of 0.41.


No idea why there's a difference. Where should I update this in the DB so that the cost for that day is 4.7?

EDIT: so I found this in the statistics table for the date 2023-11-06:

state              | sum              | metadata_id
225.413899999905   | 9266.67585000579 | 1
225.508199999905   | 9266.77015000579 | 1
225.631199999905   | 9266.89315000579 | 1
225.696799999904   | 9266.95875000579 | 1
226.016599999905   | 9267.27855000579 | 1
226.332299999905   | 9267.59425000579 | 1
0.0983999999999105 | 9494.04135000569 | 1
0.262399999999761  | 9494.2053500057  | 1
0.770799999999671  | 9494.7137500057  | 1
1.37350000000015   | 9495.31645000569 | 1
1.95570000000017   | 9495.89865000569 | 1
2.4354000000002    | 9496.37835000569 | 1
2.5625             | 9496.5054500057  | 1
2.69779999999997   | 9496.6407500057  | 1

I imagine it has to do with the state column? Why would it reset?

Hi all,
I entered the price per kWh in Cent/kWh instead of €/kWh, and therefore the price in the energy dashboard is too high… (I made the mistake on 06.02.2024 and it lasted until today). Now I am trying to find the values in the database and divide them by 100…

First I checked the metadata_id, which seems to be 128:

Then I tried the following:

but it didn't find any entry in the database.
Without the last query line I get only one entry?!

What am I doing wrong?

EDIT:
when I try the following command, I get the same result with only one entry:

select id, created, state, sum from statistics
WHERE metadata_id = (SELECT metadata_id FROM states_meta where entity_id = "sensor.energieverbrauch_total_kwh_cost")

In the Developer Tools of HA I can change the values without problems, but it would take a long time to change all of them manually… Could it be somehow related to the usage of InfluxDB? As far as I know they should run in parallel?
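Two things worth checking here, both assumptions on my part based on the schema used earlier in this thread: long-term statistics rows reference statistics_meta (its id column), not states_meta, so the lookup should go through statistics_meta; and since sum is cumulative, only the part accumulated after 06.02.2024 should be divided by 100, not the whole column. A sketch:

-- 1. find the metadata_id via statistics_meta, not states_meta
select id from statistics_meta
where statistic_id = 'sensor.energieverbrauch_total_kwh_cost';

-- 2. note the last correct sum before 06.02.2024 (123.45 below is a
--    placeholder for that value), then rescale everything after it
update statistics
set state = state / 100,
    sum = 123.45 + (sum - 123.45) / 100
where metadata_id = 128
  and start_ts >= 1707177600;  -- 2024-02-06 00:00 UTC as epoch seconds

-- 3. apply the same update to statistics_short_term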

Hi. I have a similar issue with my HomeWizard water meter. I have a data entry on 13 Feb 24 of 67,000 L consumption in the energy dashboard. I installed the SQLite Web add-on but have no idea how to locate the faulty entry.
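One way to locate it (a sketch, assuming the recent schema with epoch start_ts columns; the id 42 is a placeholder you get from the first query):

-- find the water sensor's metadata id
select id, statistic_id from statistics_meta
where statistic_id like '%water%';

-- list the hourly rows around 13 Feb 2024 and look for the 67,000 L jump
select id, datetime(start_ts, 'unixepoch') as hour, state, sum
from statistics
where metadata_id = 42
  and start_ts between strftime('%s', '2024-02-13')
                   and strftime('%s', '2024-02-14')
order by start_ts;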

Isn't this resolved automatically nowadays? If you look under Repairs, maybe you can see something there?

Hi,

Thanks for the reply. Nothing was in my Repairs. The issue that caused the spike was a faulty power supply for my water meter. I exchanged it and it now runs again. I also switched a couple of days ago to MariaDB on my NAS. Unfortunately I couldn't find a solution to transfer the current database from the SSD of my Pi to the Synology NAS, so I lost a lot of data. Thanks anyway. The database is still on my SSD, so if someone has a solution for that it would be great.

The best way to fix this is “manually” with sqlite3. The database file we're looking for is home-assistant_v2.db. Make a backup of that file first, before continuing.
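The sqlite3 CLI itself can make that backup copy; run something like this from the directory that holds the database, ideally with HA stopped (the backup file name is just an example):

sqlite3 home-assistant_v2.db ".backup 'home-assistant_v2.backup.db'"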

First you have to “limit” your queries to the sensors whose wrong values you want to delete (in my case I deleted the erroneous values from the temperature sensors), so you need to find the ids of those sensors in the 'statistics_meta' table: the 'id' column there is what the statistics tables reference as 'metadata_id'. So:

In sqlite3, we can look up a name in that table like this:

select * from statistics_meta where statistic_id like '%{name_to_look_for}%';

An example for my sensors having “temperature” in their name would be like this:

select * from statistics_meta where statistic_id like '%temperature%';

That will give me a list with the ids of those sensors; note them (the 'id' column). Next I'll search for wrong values in the statistics table, using those noted ids, like this (imagining that my sensors' metadata_id values are 6, 10 and 20):

select * from statistics where metadata_id in(6,10,20) and min=0;

(Above, I'm searching for values with 0 in the 'min' column, which I know are wrong… typically when the battery of those sensors fails, they will output 0 as min and max.) Adjust that value to what you're looking for.

Also, do the same search for max=0:

select * from statistics where metadata_id in(6,10,20) and max=0;

Check both queries and look for what you want… If everything is OK, then we will delete those values in 'statistics':

delete from statistics where metadata_id in(6,10,20) and min=0;
delete from statistics where metadata_id in(6,10,20) and max=0;

Now do the same thing in the 'statistics_short_term' table:

delete from statistics_short_term where metadata_id in(6,10,20) and min=0;
delete from statistics_short_term where metadata_id in(6,10,20) and max=0;

Now those values with min=0 or max=0 are deleted. If you look at your graphs, you can see there are no spikes anymore. You can also adjust the min and max parameters to adapt to your case. In my case, those 0 values in 'min' and 'max' represent errors in the readings, and therefore deleting them gives me correct stats.

Hope that helps someone. Cheers.