Custom Integration to import long term statistics from a file like csv or tsv

Thanks for creating this integration. Two questions:

  1. What date time format is being used for import? I’m not aware of a standard that formats the date with periods. Is there some easy option to make pandas datetime64 output that format?

  2. What are people using for exporting long term statistics in order to fix them up and re-import?

Hi @TomMarg,

did you manage to do the import? Maybe you can share here what went wrong so that others can benefit.

Great that it finally worked!

The time format is “26.01.2024 00:00” (TT.MM.JJJJ hh:mm), as you probably found in the examples (used in some areas of europe, at least in mine :wink:). As of today, you have to convert it on your own with whatever tool. I thought already about making this more flexible, but did not find the time yet. Feel free add this feature on your own.

Since a few versions, you can export from the standard history graph to csv. I was also already thinking about adding this option to the integration.

I am exporting it from a web page at the grid/energy company. I imported it into Excel and did all the fixes there before importing it to HA.

So overall I was successful to import using a tsv file with this format. My failure: I used the wrong sensor name.

statistic_id unit start sum
sensor.stromzaehler_verbrauch kWh 11.12.2023 10:00 2540.0

However there is still one behaviour that I could not solve. When I import data the very next hour starts with a value of minus counter:

Historic data is accurate. Data following that hour as well but I do not know how to solve that, yet.

in one of my trials I had similar issue. it seems it is related to ‘sum’ column

@klaj mentions this in his readme

Initially I just left 0 in all rows in column ‘sum’, then tried to copy the ‘state’ value to the ‘sum’. in both cases I had similar spikes in the junction of imported data and newer data.

I then did a calculation:

  • first I looked up in the database what is the first original 'state; value in the HA ‘statistics’ table. (used phpmyadmin (I am using MariaDB))
  • I then used a google sheets formula to substract the state value from that first state. take a look at the pic.
  • it resulted in an increasing negative ‘sum’

by the way I also used googlesheets to format the date from the one HA exported (used ‘Replace-all’ to cut off the seconds and unnecessary characters).

oh and one more thing. the statistic_ID is different from the one that I exported from HA. in other words I truly merged two different entities history :grinning:

That fixed it.

Next job will be an export of all the data to InfluxDB. Current data is written by the sensor. :slight_smile:

Hello Klaus, all,

I finally got to try your integration. It took me a while to setup a second home assistant, with a copy of my database to safely mess with the measurements.
Thanks for this integration!

I got old data from my domoticz, going back to 2017. Especially the energy readings would be nice to transfer, but also some temperature measurements.

I set up a libreoffice calc sheet to transfer data to the right format, and for the electricity measurements it worked. But with the gas consumption measurement, which for all I can see, I did the same way the integration gives the following error:
(I couldn’t copy the line, so I made a screenshot.

From what I gather , it complains about the statistic_id being invalid. But I never use a colon.

What I am a bit worried and unsure about is the unit. It is cubic meter, and I copied the string from home assistant, but the superscript 3 can cause some trouble.
These are the first 2 lines from the file (badly copied, admitted)
|statistic_id|unit|start|state|sum|
|sensor.dsmr_consumption_gas_delivered |m³ |03.12.2017 00:00|31.782|-14420.305|

Any idea what goes wrong here?

Regards,

Bert

does your sensor.dsmr_consumption_gas_delivered entity exist in your system…that might be your issue as this Integration will only import to existing entities, it doesn’t create them.
(if you don’t have the sensor you want to import to already in your system just create it using a template that never updates and you can import against that)

Hi Gav_in,

Barring a typo, yes, the entity exists. It is part of the Digital Smart Meter Reader(dsmr) integration. That is the strange thing. And the electricity history input went without a hitch, so I thought I had it figured out.

Again, I’m a bit nervous about the m3 going right. typing in kWh is easy, but with the superscript “3” in m3, you never know between file encodings. But then the error is indicating something else of course.
In short, I’m at a loss.

Bert

Can you please try if it works without the space between …delivered and the ‘|’?

If the content of the file above is not correct (I saw in your screenshot that you are using a tab as delimiter, above is a pipe), and you do not have a wrong “space”, please attach the real file (the first two lines are enough, if the error occurs with the 2 rows only as well).

Klaus

Hi Klause,

Thanks for the catch. There was indeed a space character after “delivered”.
When I removed that, the import went OK. Thanks again for the great integration!
I still hope it can be made more userfriendly. E.g. if Homassistant gets a facility to recalculate the sum value, so we don’t have to struggle with that.
Also, is the unit column really needed? Or can you get that from the entity and fill it in during the import?

Anyways, still very happy and importing some more data from domoticz.

Kind regards,

Bert

Fine that it is working now :slight_smile: . The sum column is on my list already, added “take the unit from the existing entity”. Lets see if/when I have time … in the meantime, here is the list of possible improvements I have collected so far:

  • Fix bug with seconds in timestamp (only support minutes, and add tests for some “crazy” timestamp)
  • Calculate sum automatically, based on the oldest existing value
  • Allow different dateTime-formats
  • Take the unit from existing entity
  • Export data

@all: Feel free to implement, I’d be happy to review and merge :wink:

Hi Klaus,

Yes, I finished migrating my data. Your integration was very nicely timed for me. And appart from the typo with the space, on my side, it went smoothly. The integration worked flawlessly.

In the process I created an libreoffice calc sheet to support the migration of my Domoticz data to Home Assistant. Maybe this can help someone that is trying the same thing. GitHub - brjhaverkamp/Domoticz2HomeAssistant: Migrate data from Domoticz to Home Assistant
It is my first ever github repository, so bare with me :slight_smile:

Feedback is welcome, and I will try to include suggested improvements. Although this type of thing is something that you use only once, so I probably never use it myself again. But it can be a starting point for someone.

@Klaus, thanks again for your work. From my point, to add to the wish list is of a utility to upload your entire domoticz db, select the domoticz device, select the Home Assistant Entity, and some other data and migrate fully automatic. But I know this is a very elaborate request. So take it with the apppropriate amount of grains of salt.

Kind regards,

Bert

I’d like to ask for some clarification regarding the timestamp (start column) before I do my first import.

I exported data for the entire month of May for an existing sensor in Home Assistant, with the objective of importing that into a newly created template sensor (als in HA).

The timestamp in the exported data as seen in the exported data looks like this in the first couple of rows:

entity_id,state,last_changed
sensor.sensor_name,7043.864,2024-04-30T22:00:00.000Z
sensor.sensor_name,7043.877,2024-04-30T23:00:00.000Z
sensor.sensor_name,7043.89,2024-05-01T00:00:00.000Z
sensor.sensor_name,7043.904,2024-05-01T01:00:00.000Z
...

Technically these timestamps are correct, they’re noted in UTC and since I live in The Netherlands, UTC+1 and in Daylight Savings Time, the first ‘human’ hour in May starts on April 30 at 22:00 because we currently live 2 hours ahead of UTC.

I can easily turn that timestamp as Home Assistant exports it into the format required for this integration, using some search & replace work and a formula in a spreadsheet.

However, this sentence in the readme on Github is a little confusing to me:

Keep in mind that the times are local times of the HA server.

Does the integration automatically take care of the time zone offset of 2 hours (in my case) by specifying the timezone identifier in the import screen?

Or do I first have to alter the source data in the csv file so that the timestamp of the very first row show as 01.05.2024 00:00 rather than 30.04.2024 22:00?

my understanding is that it’s this route you need to go…

the line you quoted I think is telling you that the Service here is using your local HA time to import and any manipulation to the database time (UTC) is done by the Service/HA rather then by you…
(just to say that when I imported my data I was in UTC timezone so didn’t need to make any adjustments or considerations here so might be off in my thinking)

one of the nice things about this Service is that a reimport of the same data overwrites the existing database entries so you can fairly easily correct/adjust if you need to

Did you try it, did it work?

To be honest, I am not sure anymore what I meant with “Keep in mind that the times are local times of the HA server.” You have to specify the timezone anyhow when importing the data, so if you specify there the timezone where your HA server is running, and the times are in local time of this timezone, everything should be fine.

I haven’t gotten around to trying it yet, I’m planning to update my Home Assistant version this weekend, and to take advantage of that reboot to also get this integration installed. That way I can reduce how often I need to restart it, since that also interrupts any running automations and sensor processing.

Once I know how the timestamps react to timezone settings, I’ll report back.

Just to be sure, when the instructions are to add ‘import_statistics’ to the configuration.yaml file, do you literally mean to simply add a new line somewhere (perhaps near the top) of the file, that looks like:

import_statistics:

with nothing before, after, or under it?