Home-assistant_v2.db corrupt - How to recover long Term Statistics

Hi,

Sorry for the late reply - I somehow got not notified about your post!

at the step

insert or ignore into statistics
select c.id, c.created, c.created_ts, m.new_id as metadata_id, c.start, c.start_ts, c.mean, c.min, c.max, c.last_reset, c.last_reset_ts, c.state, c.sum
from corrupt.statistics c
inner join (
	select o.id as old_id
	, n.id as new_id
	, case when row_number() over (partition by o.id) > 1 or row_number() over (partition by n.id) > 1 then 0 else 1 end as test
	from corrupt.statistics_meta o
	inner join statistics_meta n on n.statistic_id = o.statistic_id
) m on m.old_id = c.metadata_id and m.test = 1 order by id;

I get the same error again:

Execution finished with errors.
Result: no such column: c.created_ts
At line 1:
insert or ignore into statistics
select c.id, c.created, c.created_ts, m.new_id as metadata_id, c.start, c.start_ts, c.mean, c.min, c.max, c.last_reset, c.last_reset_ts, c.state, c.sum
from corrupt.statistics c
inner join (....

Do you have an idea to fix it? Do you need a sample or something?

Thanks for much!

Maybe your DB schema version is old and not compatible. You may need to make your own script to make your old tables compatible first.
Sorry, I can’t help with that.

You may try to make HA open that old DB file, I think it may be able to update the schema for you.

Thanks for your reply!

I am not good enough in sql, so I guess I’ll just leave it and live with the missing data.

Hey, this is super helpful - greatly appreciate you posting this.

I was able to import all of my historic energy data from my corrupt DB using this. One thing though, I have a gas sensor setup and the values showing in my dashboard are now astronomically high for last year “1,989,615 ft^3”. Even if I divide the 1.9million by 100 (ft^3 → CCF), that value doesn’t seem right, way too high.

My gas meter reads in CCF, I guess there is some unit conversion that went bad when I ran the query. Any idea on where to start troubleshooting?

work for me too, thanks a lot. I can’t access to my HA with ssh so i make it with gparted and scp to take the file and repair. Thanks again :wink:

This is amazing, thank you so much @monstrenyatko, my long term statistics are finally restored. Instead of using the UI tool, I just use sqlite3 command on my installation. Basically do:

sqlite3 home-assistant_v2.db

sqlite> attach "home-assistant_v2.db.corrupt.2025-02-08T03:38:56.628435+00:00" as corrupt;

Then paste your queries.

Worked for me! Thanks a lot!

I updated @monstrenyatko code to match Home Assistant 2025.09 schema (just two added fields: c.mean_type and c.mean_weight):

-- insert possibly missing statistics_meta records
insert into statistics_meta
select row_number() over () + (select max(id) from statistics_meta) as id, c.statistic_id, c.source, c.unit_of_measurement, c.has_mean, c.has_sum, c.name, c.mean_type
from corrupt.statistics_meta c
left outer join statistics_meta m on m.statistic_id = c.statistic_id
where m.id is null;

-- move existing ids, as all corrupt records are older
update statistics set id = - (id + (select max(id) from corrupt.statistics)) where id > 0;
update statistics set id = - id where id < 0;

-- import statistics from corrupt db
insert or ignore into statistics
select c.id, c.created, c.created_ts, m.new_id as metadata_id, c.start, c.start_ts, c.mean, c.mean_weight, c.min, c.max, c.last_reset, c.last_reset_ts, c.state, c.sum
from corrupt.statistics c
inner join (
	select o.id as old_id
	, n.id as new_id
	, case when row_number() over (partition by o.id) > 1 or row_number() over (partition by n.id) > 1 then 0 else 1 end as test
	from corrupt.statistics_meta o
	inner join statistics_meta n on n.statistic_id = o.statistic_id
) m on m.old_id = c.metadata_id and m.test = 1 order by id;

-- update sum values causing mega negative values for day of corruption in Energy Dashboard
update statistics
set sum = sum + a.add_sum
from (
	select s.id, a.statistic_id, sum(a.lag_sum) as add_sum
	from statistics s
	inner join (
		select *
		from (
			select m.statistic_id, s.*, lag(s.sum) over (partition by s.metadata_id order by s.start_ts) as lag_sum
			from statistics s
			inner join statistics_meta m on m.id = s.metadata_id
			where s.sum is not null
		) t
		where t.sum < t.lag_sum
	) a on a.metadata_id = s.metadata_id and s.start_ts >= a.start_ts
	group by s.id, a.statistic_id
) a 
where a.id = statistics.id;

-- rebuild statistics_short_term
delete from statistics_short_term;
insert into statistics_short_term
select row_number() over () as id, c.created, c.created_ts, c.metadata_id, c.start, c.start_ts, c.mean, c.mean_weight, c.min, c.max, c.last_reset, c.last_reset_ts, c.state, c.sum
from statistics c
where date(c.start_ts, 'unixepoch') = DATE('now')
order by c.start_ts, c.id;

I had to stop HA, save the current database elsewhere, and start HA with the corrupted database in order to have its schema updated to the latest version. Then I stopped HA, saved the updated corrupted database, open the current database (home-assistant_v2.db) file and then attach the updated corrupted database (home-assistant_v2.db.corrupt) as "corrupt" (it asks for the name when you select the file). Finally, execute the script above and press the "Write Changes" button. At this point, the current database (home-assistant_v2.db) file can be used by HA with all the history recovered.

Of course, I expect the script above to require new changes as HA evolves over time and the schema changes.

Is this schema still the same in 2025.12?
I just realized Home Assistant dumped my old db as corrupted and started from a fresh one 1 week ago… Now I have to recover the old one and merge both into a new one.