Well this is a bit of a hacky solution, but it will hold me over for now. Hopefully this will help someone else!
In your configuration.yaml, add this:
shell_command:
  install_sqlite3: apk add --no-cache sqlite
  run_sqlite_query: sqlite3 /config/home-assistant_v2.db < /config/FixStatisticsSummation.sql
Save, then restart HA. Important note: I am running HA in a Docker container that does not ship with sqlite installed, so every time I update the container I need to reinstall sqlite before the query will run.
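Since the sqlite install is lost on every container update, one option (a sketch of my own, not part of the original setup) is an automation that reinstalls it whenever HA starts:

```yaml
# Hypothetical automations.yaml entry: reinstall sqlite on every HA start,
# so a container update doesn't silently break the query
- alias: Reinstall sqlite3 on startup
  trigger:
    - platform: homeassistant
      event: start
  action:
    - service: shell_command.install_sqlite3
```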
Go to Developer Tools, then Actions. Search for "Shell" and you should see "Shell Command: install_sqlite3"; click it, then click "Perform Action".
Now you have access to running sqlite commands from HA.
Create a file called “FixStatisticsSummation.sql” in the same directory as your configuration.yaml. This is the contents:
-- Attach a new database file
ATTACH DATABASE 'FixStatisticsSummation.db' AS fix_db;
-- Begin a transaction so the COMMIT at the end has a match
-- (the original script had COMMIT with no BEGIN, which errors)
BEGIN TRANSACTION;
-- Create table in the new database if it doesn't exist
CREATE TABLE IF NOT EXISTS fix_db.fix_statistics_counts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_ts INTEGER NOT NULL,
    updated_table TEXT NOT NULL,
    updated_rows INTEGER NOT NULL
);
-- Process for statistics table
WITH MetadataIDs AS (
    SELECT id AS metadata_id
    FROM statistics_meta
    WHERE has_sum = 1
),
StateDeltas AS (
    SELECT
        s.id,
        s.metadata_id,
        s.created_ts,
        s.start_ts,
        s.last_reset_ts,
        s.state,
        CASE
            WHEN s.state >= LAG(s.state) OVER (PARTITION BY s.metadata_id ORDER BY s.start_ts)
                THEN s.state - LAG(s.state) OVER (PARTITION BY s.metadata_id ORDER BY s.start_ts)
            ELSE s.state
        END AS power_used
    FROM statistics s
    JOIN MetadataIDs m
        ON s.metadata_id = m.metadata_id
),
CumulativeSum AS (
    SELECT
        id,
        metadata_id,
        SUM(power_used) OVER (PARTITION BY metadata_id ORDER BY start_ts) AS cumulative_sum
    FROM StateDeltas
),
MismatchedSums AS (
    SELECT
        s.id,
        c.cumulative_sum
    FROM statistics s
    JOIN CumulativeSum c
        ON s.id = c.id
    WHERE s.sum != c.cumulative_sum OR s.sum IS NULL
)
UPDATE statistics
SET sum = (
    SELECT cumulative_sum
    FROM MismatchedSums
    WHERE MismatchedSums.id = statistics.id
)
WHERE id IN (SELECT id FROM MismatchedSums);
-- Log the update operation
INSERT INTO fix_db.fix_statistics_counts (created_ts, updated_table, updated_rows)
SELECT
    strftime('%s', 'now'),   -- Current timestamp (in seconds)
    'statistics',            -- Updated table name
    changes()                -- Number of rows updated
WHERE TRUE;                  -- Always log
-- WHERE changes() > 0;      -- Only log if rows were updated
-- Process for statistics_short_term table
WITH MetadataIDs AS (
    SELECT id AS metadata_id
    FROM statistics_meta
    WHERE has_sum = 1
),
StateDeltas AS (
    SELECT
        s.id,
        s.metadata_id,
        s.created_ts,
        s.start_ts,
        s.last_reset_ts,
        s.state,
        CASE
            WHEN s.state >= LAG(s.state) OVER (PARTITION BY s.metadata_id ORDER BY s.start_ts)
                THEN s.state - LAG(s.state) OVER (PARTITION BY s.metadata_id ORDER BY s.start_ts)
            ELSE s.state
        END AS power_used
    FROM statistics_short_term s
    JOIN MetadataIDs m
        ON s.metadata_id = m.metadata_id
),
CumulativeSum AS (
    SELECT
        id,
        metadata_id,
        SUM(power_used) OVER (PARTITION BY metadata_id ORDER BY start_ts) AS cumulative_sum
    FROM StateDeltas
),
MismatchedSums AS (
    SELECT
        s.id,
        c.cumulative_sum
    FROM statistics_short_term s
    JOIN CumulativeSum c
        ON s.id = c.id
    WHERE s.sum != c.cumulative_sum OR s.sum IS NULL
)
UPDATE statistics_short_term
SET sum = (
    SELECT cumulative_sum
    FROM MismatchedSums
    WHERE MismatchedSums.id = statistics_short_term.id
)
WHERE id IN (SELECT id FROM MismatchedSums);
-- Log the update operation
INSERT INTO fix_db.fix_statistics_counts (created_ts, updated_table, updated_rows)
SELECT
    strftime('%s', 'now'),   -- Current timestamp (in seconds)
    'statistics_short_term', -- Updated table name
    changes()                -- Number of rows updated
WHERE TRUE;                  -- Always log
-- WHERE changes() > 0;      -- Only log if rows were updated
-- Commit the transaction
COMMIT;
-- Detach the new database file
DETACH DATABASE fix_db;
This script recomputes the sum column in the statistics and statistics_short_term tables, and creates/updates a tracking database that logs, every time it runs, how many rows it had to fix in each table.
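To check what the script has been logging, you can point the same sqlite3 CLI at the tracking database. This query (my own; the LIMIT and column aliases are arbitrary) reads the tracking table created above:

```sql
-- Show the most recent fix runs, newest first
SELECT
    datetime(created_ts, 'unixepoch') AS run_time,
    updated_table,
    updated_rows
FROM fix_statistics_counts
ORDER BY created_ts DESC
LIMIT 20;
```

Run it with something like `sqlite3 /config/FixStatisticsSummation.db < inspect.sql` (the .sql file name is up to you).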
Now create an automation to run this every hour, 5 minutes past the hour:
- Go to Automations
- Click Create Automation
- Click Create New Automation
- Click Add Trigger
- Select Time Pattern
- Enter 5 in the Minutes field
- Click Add Action
- Select “Shell Command ‘shell_command.run_sqlite_query’”
- Click Save
- Give it a name like “FixStatisticsSummation”
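If you manage automations in YAML instead of the UI, the steps above correspond roughly to this sketch (the alias and mode are my choices, not something the UI requires):

```yaml
# Hypothetical automations.yaml equivalent of the UI steps above
- alias: FixStatisticsSummation
  trigger:
    - platform: time_pattern
      minutes: "5"
  action:
    - service: shell_command.run_sqlite_query
  mode: single
```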
You can now go back to Automations, click the ellipsis on the new automation, and run it manually. Once it runs, you should see a new .db file in the same folder as configuration.yaml. Open it in a database viewer and you should see two rows in its only table, one for statistics and one for statistics_short_term; each row has an epoch timestamp for when it ran and the number of rows it had to fix in that table.
If you had an issue and run this twice back to back, you should get two rows (one per table) with some number of fixed rows, then two more rows with 0 fixes. That tells you it is working. Every hour at 5 minutes past, you should get two new entries showing how many rows it had to fix.
What I have found using this to scrub my database is that even if you are not seeing huge power spikes like in my first post in this thread, HA is accumulating summation errors all the time, and I have no idea why. This will fix them as they occur so you don't end up with useless statistics.