Recalculate statistics table sum to fix wrong energy data

Did you solve this meanwhile?

I’ve tryed to change hte value but sum values after this value was not recalculated.

If you are on SQLite then it might be a bit complicated. Here are example SQLs that need manual adjustments - entity_id, date range and start value for recalculation in the last query:

-- database: c:\tmp\home-assistant_v2.db

--backup
CREATE TABLE IF NOT EXISTS statistics_backup AS 
SELECT * FROM statistics;

-- Restore from backup
DROP TABLE IF EXISTS statistics;
CREATE TABLE IF NOT EXISTS statistics AS
SELECT * FROM statistics_backup;

-- Use this query to see results
-- Select date range is 1 hour before the update range
-- In the first row you will see the initial 'sum' value and can use it instead of 1000 in the step 2.
SELECT id, state, sum, datetime(start_ts, 'unixepoch') as readable_time
FROM "statistics"
WHERE 
metadata_id = 172
AND datetime(start_ts, 'unixepoch') >= '2024-09-30 22:00:00'
AND datetime(start_ts, 'unixepoch') <= '2024-10-17 12:59:59'
ORDER BY start_ts;

UPDATE statistics
SET sum = 0
WHERE
metadata_id = 172
AND datetime(start_ts, 'unixepoch') >= '2024-10-01 00:00:00'
AND datetime(start_ts, 'unixepoch') <= '2024-10-17 12:59:59';


-- Step 1: Calculate sum deltas
WITH ordered_statistics AS (
    SELECT id, state, start_ts,
           LAG(state, 1, 0) OVER (ORDER BY id) AS prev_state,
           LAG(sum, 1, 0) OVER (ORDER BY id) AS prev_sum
    FROM statistics
    WHERE metadata_id = 172
    AND datetime(start_ts, 'unixepoch') >= '2024-10-01 00:00:00'
    AND datetime(start_ts, 'unixepoch') <= '2024-10-19 23:59:59'
)
UPDATE statistics
SET sum = ROUND(
    (SELECT CASE
        -- If current state is lower than previous (reset occurred)
        WHEN ordered_statistics.state < ordered_statistics.prev_state THEN ordered_statistics.prev_sum + ordered_statistics.state
        -- Normal cumulative sum calculation
        ELSE ordered_statistics.prev_sum + (ordered_statistics.state - ordered_statistics.prev_state)
    END
    FROM ordered_statistics
    WHERE ordered_statistics.id = statistics.id), 3)
WHERE metadata_id = 172
AND datetime(start_ts, 'unixepoch') >= '2024-10-01 00:00:00'
AND datetime(start_ts, 'unixepoch') <= '2024-10-19 23:59:59';


-- Step 2: Convert sum deltas into incremental value
WITH ordered_statistics AS (
    SELECT id, state, created_ts,
           LAG(state, 1, 2230.060000000319) OVER (ORDER BY id) AS prev_state,
           LAG(sum, 1, 2230.060000000319) OVER (ORDER BY id) AS prev_sum -- Start from the sum of the first row outside the range
    FROM statistics
    WHERE metadata_id = 172
    AND datetime(created_ts, 'unixepoch') >= '2024-10-01 00:00:00'
    AND datetime(created_ts, 'unixepoch') <= '2024-10-19 23:59:59'
),
deltas AS (
    SELECT id, 
           CASE
               -- If the current state is less than the previous state, it indicates a reset
               WHEN state < prev_state THEN state
               -- Otherwise, calculate the delta as the difference between the current and previous state
               ELSE state - prev_state
           END AS state_delta,
           prev_sum
    FROM ordered_statistics
),
cumulative AS (
    SELECT id, 
           -- Only add the hardcoded value to the first record in the range
           CASE 
               WHEN id = (SELECT MIN(id) FROM ordered_statistics) THEN state_delta + prev_sum
               ELSE SUM(state_delta) OVER (ORDER BY id) + 2230.060000000319
           END AS cumulative_sum
    FROM deltas
)
UPDATE statistics
SET sum = ROUND(cumulative.cumulative_sum, 3) -- Update the sum column with the calculated cumulative sum
FROM cumulative
WHERE statistics.id = cumulative.id
  AND metadata_id = 172
  AND datetime(created_ts, 'unixepoch') >= '2024-10-01 00:00:00'
  AND datetime(created_ts, 'unixepoch') <= '2024-10-19 23:59:59';

In his SQLs following values are hardcoded:

Replace this hardcoded valus in the SQLs:
2230.060000000319 - is sum value that should be used as a start
2024-10-01 00:00:00 - start date time
2024-10-19 23:59:59 - end date time, can be in future
172 - metadate_id of your sensor.

I know as I also have one incident of the earlier past where I experienced this.

I was hoping the SQL hacking „on the open heart“ wouldn’t be necessary anymore. Unfortunately it seems like it is though.

No idea what the dev tools section statistics fix does. It probably just ignores the sum column…for whatever reason -.-

1 Like

No did not manage to get the peak out. In the dbase I removed all but it kept on coming back. I just have to live with it.

I think there should be an addon / feature to recalc the statistics sums after the outliers have been fixed in the statistics. Something that automates this hassle of finding a weird sensor reporting and then manually fixing the sums on different time windows (weekly, monthly etc.)

Guess what… I have not been able to fix all these outliers and they might just happen once in a while … unfortunately .