How to import historical energy data (Toon and others)

Update 01-2024

The script has been rewritten and is now setup to be more generic, robust and easier to use. The import functionality has been split from getting the data from the energy provider. Hopefully together we can add support for additional energy providers by providing sample dataexports so that data preparation scripts/how-to’s can be created.


For Home Assistant users that would like to import historical energy data from Toon (Eneco - Dutch) or other energy providers into Home Assistant I created a SQL script that does that for you.
I looked into several options and decided in the end to write a SQL script that imports data directly into the Home Assistant database. I tried to document the steps as good as possible but it still requires some technical knowledge and should therefore not be used by novice users.

The script has been successfully tested on several installations and can handle also cases where there was a meter switch (reset of measurements) or that the Toon was replaced (import multiple export files).

The script can (easily) be adapted for use with other providers. The Toon data provides two fields (field1: Unix timestamp, field2: value of the sensor). Depending on the file the data is hourly or daily.

Latest data still correct after import (short_term_statistics work)

Data of 2019 (imported from Toon using hourly interval data - statistics work)

Data of 2015 (imported from Toon using day interval data - statistics work)

Repository containing the script

10 Likes

Awesome! Will this also work for people running MariaDB?

1 Like

I haven’t tested it with MariaDB only for SQLlite.
But in the end it just is SQL so the basic functionality should work.

@Lasoul: could you do me (and others) a favour and post a 1 or 2 day extract from Toon? Want to see how I should prepare the source data.

Bit of background: have been running Emoncms for a decade. Emoncms can give me average consumption per hour (so in W), an ever increasing kwH sensor (which I can export in hourly values) and kWh per day value. Additional challenge is that I need to split the files into high vs low tariff to match the structure of Homeassistant (luckily in NL high vs low is quite binary as you might know :-))

@fversteegen sure, I uploaded some sample files for all the sensors (electricity, solar, gas).

The file format is pretty simple. The CSV file contains two fields:

  1. Unix epoch time stamp

  2. Sensor value at that time stamp.
    This value has to be divided by 1000 to get kWh or m3.

1534312800,2459
1534316400,2585
1534320000,2939
1534323600,3656
1534327200,4684
1534330800,6058
1534334400,7612
1534338000,8468

Would it be possible to expose the SQL scripts themselves? That should allow it to be used against any type of DB

Sure, I extracted the SQL from the DBBrowser project and added it to the Github repository.

2 Likes

Many thanks, this helps loads :slight_smile:

Thanks for that, unfortunately I fall into that latter category of user and know enough that I should NOT attempt this as I’m likely to irreversibly screw it up, especially as I am wanting to import a different set of energy data to this. In my case energy consumption data for individual devices.

But keeping an eye out for any way to help people like me import historical data. For now there seems to be no option at all.

Sir @Lasoul, Thank you so much for this awesome Project!

Finally, i was able to migrate/merge all my Energy-Statistics from previous Energymeters to their new Entities. A long Time i was looking for a Solution. Your Script was a promising, and finally working starting Point :+1:

Of Course, it took me several Evenings to get through, adopt it, get used to it. But it was really great structured and well documented, so it was difficult, but yet never got lost.

For all others wanting to do this, two personal Learnings:

  • Always. Always test it first on a second machine, never use your productive System in first Place :smiley:
  • Don’t try to be smarter than the Instructions, never skip a Point. If it says e.g. ‘Stop Recorder’ - do it :smiley:

Yes, it is not a Task for two-three Hours. But at least, as HomeAssistant doesn’t assist you in any Way if your entities change for whatever reason, this is a Solution.

Now, my next Task is to adopt the Script to Statistics, where there is no Sum, but Mean/Min/Max Values. But i will focus on that some time later… :wink:

Thanks again! :+1:

1 Like

Thank you!
It is good to read that it helped you in getting historical energy data into Home Assistant.

Do you have any valuable additions that can be shared in the repo for the benefit of other users?

This is great. I can’t wait to try this out with the CSV from ESB Ireland.

I get data for every 30 mins. It is kW active import. Hopefully I can make it work.

It seems like that your sample file contains the usage per 30 min. To use this with my script you should convert it to sensor “measurements” and the timestamp should be converted to unix epoch time.
This can be done by importing the data and process it with a sql script and let the script create the needed table.

1684468800   2.046000   // 19-05-2023 06:00
1684467000   1.829000   // 19-05-2023 05:30
1684465200   1.538000   // 19-05-2023 05:00
1684463400   1.238000   // 19-05-2023 04:30
1684461600   0.937000   // 19-05-2023 04:00
1684459800   0.712000   // 19-05-2023 03:30
1684458000   0.488000   // 19-05-2023 03:00
1684456200   0.272000   // 19-05-2023 02:30
1684454400   0.000000   // 19-05-2023 02:00
1 Like

Thanks for the reply! I am a little stuck at this part of the README:

Lookup in the “statistics_meta” table the ID’s of the sensors (Browse Data → Table: statistics_meta; You can use “filter” to find the id of the sensor)

I am not seeing any sensor.electricity* entries in the meta table. Note that I am only importing electricity and have no gas.

Note that I have enabled the “consumption” part of the Energy dashboard and it still doesnt help.
Schema version is 41 as expected.

The names provided in the README are the default names that my energy provider “Toon” integration uses for the energy sensors. Other suppliers use different names for their sensors. You can lookup the right names in the setup of the Energy dashboard (Settings → Dashboards → Energy).

Below the screenshot of my setup with the sensornames that are configured for the “Electricity grid”.

Ah ok I see thank you. There is no integration for my provider, so that might be a blocker for me then! My provider (ESB Ireland) will be enabling the P1 port late 2024, so maybe then I can try and backfill data from the CSVs.

What you could do is create a “template energy sensor” for that. The “template energy sensor” would be empty at first and you would add the statistics data with the script.

1 Like

ok great. I added the following template sensor:


template:
  - sensor:
      - name: ESB Networks Consumption Total
        state: "1"
        unit_of_measurement: kWh
        device_class: energy
        state_class: total_increasing

and have selected that in the Energy Dashboard config page:

Then I shut down HA and opened the DB, and I am seeing some more things now, so this is good.

Ok I think I have it working well now. Took a lot of pairing down of the sqbpro script but I got there in the end.

<?xml version="1.0" encoding="UTF-8"?><sqlb_project><db path="/projects/esb_smart_meter_stats/homeassistant/config/home-assistant_v2.db" readonly="0" foreign_keys="1" case_sensitive_like="0" temp_store="0" wal_autocheckpoint="1000" synchronous="2"/><attached/><window><main_tabs open="structure browser pragmas query" current="3"/></window><tab_structure><column_width id="0" width="406"/><column_width id="1" width="0"/><column_width id="2" width="100"/><column_width id="3" width="3448"/><column_width id="4" width="0"/><expanded_item id="0" parent="1"/><expanded_item id="1" parent="1"/><expanded_item id="2" parent="1"/><expanded_item id="3" parent="1"/></tab_structure><tab_browse><current_table name="4,15:mainstatistics_meta"/><default_encoding codec=""/><browse_table_settings><table schema="main" name="event_data" show_row_id="0" encoding="" plot_x_axis="" unlock_view_pk="_rowid_"><sort/><column_widths><column index="1" value="47"/><column index="2" value="76"/><column index="3" value="300"/></column_widths><filter_values/><conditional_formats/><row_id_formats/><display_formats/><hidden_columns/><plot_y_axes/><global_filter/></table><table schema="main" name="statistics_meta" show_row_id="0" encoding="" plot_x_axis="" unlock_view_pk="_rowid_"><sort/><column_widths><column index="1" value="23"/><column index="2" value="235"/><column index="3" value="53"/><column index="4" value="122"/><column index="5" value="61"/><column index="6" value="54"/><column index="7" value="39"/></column_widths><filter_values/><conditional_formats/><row_id_formats/><display_formats/><hidden_columns/><plot_y_axes/><global_filter/></table></browse_table_settings></tab_browse><tab_sql><sql name="Import Toon data">/* 
Import Toon files

How to:
NT = normal tarriff
LT = Low rate tarriff

*/

/* Remove the temporary tables if they exist */
DROP TABLE IF EXISTS NT_ORIG_NEW;
DROP TABLE IF EXISTS LT_ORIG_NEW;
DROP TABLE IF EXISTS NT_PROD_NEW;
DROP TABLE IF EXISTS LT_PROD_NEW;
DROP TABLE IF EXISTS SOLAR_NEW;
DROP TABLE IF EXISTS GAS_NEW;


/* Create temp tables that can hold the difference between the measurements and create a new sum */
CREATE TABLE &quot;NT_ORIG_NEW&quot; (
	&quot;ts&quot;		INTEGER,
	&quot;value&quot;		FLOAT,
	&quot;diff&quot;		FLOAT,
	&quot;old_sum&quot;	FLOAT,
	&quot;new_sum&quot;	FLOAT
);


/* Insert the hourly records from Toon - max 5 years */
INSERT INTO NT_ORIG_NEW (ts, value)
SELECT field1, round(field2 / 1000.0, 3)
FROM elec_quantity_esb;

  
/* Remove any overlapping records from Toon which are already in Home Assistant */
DELETE FROM NT_ORIG_NEW
WHERE
ts &gt;= (SELECT MIN(start_ts) FROM statistics WHERE metadata_id = 2); /* Change */

/* Insert the data from Home Assistant so that we can adjust the records with the new calculated sum */
INSERT INTO NT_ORIG_NEW (ts, value, old_sum)
SELECT start_ts, state, sum
FROM statistics
WHERE metadata_id = 2; /* Change */


/* 
Calculate the difference from the previous record in the table 
  - For the Toon values calculate the diff from the previous record from the imported values (use value column / old_sum column is empty)
  - For the Home Assistant values calculate the diff from the previous record from the existing sum column (use old_sum column / old_sum column is not empty)
*/
WITH CTE_DIFF_NT_ORIG_VALUE AS (
	SELECT ts, round(value - (lag(value, 1, 0) OVER (ORDER BY ts)), 3) AS diff
	FROM NT_ORIG_NEW
	ORDER BY ts
)
UPDATE NT_ORIG_NEW
SET diff = CTE_DIFF_NT_ORIG_VALUE.diff
FROM CTE_DIFF_NT_ORIG_VALUE
WHERE
  NT_ORIG_NEW.ts = CTE_DIFF_NT_ORIG_VALUE.ts AND
  NT_ORIG_NEW.old_sum IS NULL;

WITH CTE_DIFF_NT_ORIG_SUM AS (
	SELECT ts, old_sum - (lag(old_sum, 1, 0) OVER (ORDER BY ts)) AS diff
	FROM NT_ORIG_NEW
	ORDER BY ts
)
UPDATE NT_ORIG_NEW
SET diff = CTE_DIFF_NT_ORIG_SUM.diff
FROM CTE_DIFF_NT_ORIG_SUM
WHERE
  NT_ORIG_NEW.ts = CTE_DIFF_NT_ORIG_SUM.ts AND
  NT_ORIG_NEW.old_sum IS NOT NULL;
  

  
/* Cleanup possible wrong values:
        - Remove the first record if no diff could be determined (Toon data)
        - Diff is null  =&gt; The point where Toon data goes over to Home Assistant data 
		- Diff &lt; 0		=&gt; Probably new meter installed (measurement should be low)
		- Diff &gt; 1000	=&gt; Incorrect value 
   First handle the first two cases and then correct to 0 when incorrect value
*/
DELETE FROM NT_ORIG_NEW
WHERE
ROWID IN (
  SELECT ROWID FROM NT_ORIG_NEW
  WHERE
    old_sum IS NULL
  ORDER BY ts
  LIMIT 1
);

UPDATE NT_ORIG_NEW
SET diff = round(old_sum, 3)
WHERE (diff IS NULL);

UPDATE NT_ORIG_NEW
SET diff = round(value, 3)
WHERE (diff &lt; 0.0) AND (value &lt; 10);

UPDATE NT_ORIG_NEW
SET diff = 0
WHERE (diff &lt; 0.0) OR (diff &gt; 1000.0);



/* Calculate the new sum
   It is calculated by calculating the sum until the record that is currently processed
*/
WITH CTE_SUM_NT_ORIG AS (
    SELECT ts, SUM(diff) OVER (ORDER BY ts) AS new_sum
    FROM NT_ORIG_NEW
)
UPDATE NT_ORIG_NEW
SET new_sum = round(CTE_SUM_NT_ORIG.new_sum, 3)
FROM CTE_SUM_NT_ORIG
WHERE
  NT_ORIG_NEW.ts = CTE_SUM_NT_ORIG.ts;

  
/* Copy the new information to the statistics table
id			=&gt; primary key and automatically filled with ROWID
sum			=&gt; calculated new_sum value
metadata_id	=&gt; the fixed metadata id of this statistics (see top)
created_ts	=&gt; set to the timestamp of the statistic
start_ts	=&gt; timestamp of the statistic
The sum is updated in case the record is already in Home Assistant

&quot;where true&quot; is needed to remove parsing ambiguity
*/
INSERT INTO statistics (state, sum, metadata_id, created_ts, start_ts)
SELECT new_sum, new_sum, 2, ts, ts FROM NT_ORIG_NEW WHERE true /* Change */
ON CONFLICT DO UPDATE SET sum = excluded.sum;


/* Also update the short term statistics. 
We calculate the delta with which the sum was changed and add that to the current measurements
*/
UPDATE statistics_short_term 
SET sum = sum + (
  SELECT (new_sum - sum) as correction_factor
  FROM
    statistics_short_term as SST, NT_ORIG_NEW AS SN
  WHERE
    SST.metadata_id = 2 AND /* Change */
    SST.start_ts = SN.ts
  ORDER BY state DESC
  LIMIT 1
)
WHERE
  metadata_id = 2; /* Change */

  
/* Remove the temporary tables */
DROP TABLE IF EXISTS NT_ORIG_NEW;
DROP TABLE IF EXISTS LT_ORIG_NEW;
DROP TABLE IF EXISTS NT_PROD_NEW;
DROP TABLE IF EXISTS LT_PROD_NEW;
DROP TABLE IF EXISTS SOLAR_NEW;
DROP TABLE IF EXISTS GAS_NEW;
    
DROP TABLE IF EXISTS elec_quantity_esb;</sql><current_tab id="0"/></tab_sql></sqlb_project>

1 Like

Great, that looks good!
Can you share the conversion script that you wrote for the ESB data so that we can share it in the repository?