Custom Integration to import long term statistics from a file like csv or tsv

Hi @TTr0n
sorry, I do not understand your question. The timestamp must be available, it must be an exact hour (no minutes, no seconds), and it must be formatted in a way that Python/pandas understands. If you use the timestamps as in the example files, you are on the safe side.

Your example shows 07.01.2024 23:00, but my “Timestamp” is 1710601200.
Do I have to convert this to 16.03.2024 16:00?

Yes, you have to.
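
For example, a Unix timestamp column can be converted with pandas along these lines (just a sketch; the file name, separator and the column name "start" are assumptions — "start" is what the integration uses internally — and I assume the values are seconds and your local timezone is Europe/Berlin, so adjust as needed):

import pandas as pd

# Sketch only: file name, separator, and column name are assumptions.
df = pd.read_csv("my_statistics.csv", sep="\t")

start = pd.to_datetime(df["start"], unit="s", utc=True)    # 1710601200 -> 2024-03-16 15:00 UTC
start = start.dt.tz_convert("Europe/Berlin")                # -> 2024-03-16 16:00 local time
df["start"] = start.dt.strftime("%d.%m.%Y %H:%M")           # -> "16.03.2024 16:00"

df.to_csv("my_statistics_converted.csv", sep="\t", index=False)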

Thx for the answer.
For this I need hour-based data, but what do I do if I have some hours without any data? Do I really have to add every single hour from 0-23, i.e. fill in the missing hours manually?

And second question:
Your integration puts the data into the statistics table, and there I need a sum value, because it's a counter value with state/sum.
What's the best practice in my case? Delete the last 4 days in the statistics table and regenerate that data from statistics_short_term combined with the old data from 2023 until 13.03.? Or does it not matter, and my old data can also start at zero?

Regarding the first question, look at Custom Integration to import long term statistics from a file like csv or tsv - #30 by kaandorp.

Regarding the second question, I would keep the existing data, adjust the sum value accordingly (just because typically you want to keep your data). There is also some discussion above.

As I am just testing anyhow:

When you do not set values every hour, the standard history graph will look like it does here on the right side:

whereas on the left side there is data for every hour. So, it's up to you.
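
If you do want data for every hour, you can fill the gaps with pandas before importing. A rough sketch (file name, separator and the column names are assumptions, adjust them to your file):

import pandas as pd

# Sketch: insert a row for every missing hour and carry the last values forward.
df = pd.read_csv("my_statistics.csv", sep="\t")
df["start"] = pd.to_datetime(df["start"], format="%d.%m.%Y %H:%M")

df = df.set_index("start").sort_index().asfreq("h")   # adds empty rows for the missing hours
df = df.ffill()                                       # repeats the last known values for those hours
df = df.reset_index()

df["start"] = df["start"].dt.strftime("%d.%m.%Y %H:%M")
df.to_csv("my_statistics_filled.csv", sep="\t", index=False)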

I just released version 1.0.0, not marked as beta/pre-release anymore.

The main changes are that there are now many more checks in the integration, so if something is wrong, you get an error in the GUI instead of in the recorder logs.

And importing external statistics works now (not sure if this is needed at all …)

And the readme is much better now.

I had some trouble with HACS when updating the integration - first it showed me version 1.0.0 (in the HACS update list), then I clicked on cancel, and then my integration was no longer shown in this list. However, I removed the integration and added it again. HACS installed 1.0.0, but it still shows an old readme. Probably not important …


Hi Klaus, I tried to use your integration. The first 2 times it worked, no problems … now I get unknown errors (after the update) and, shortly before, this:

websocket_api script: Error executing script. Error for call_service at pos 1: path /config/Bezugszaehler_test.csv does not exist.
Error calling service import_statistics.import_from_file. Unknown error


Logger: homeassistant.helpers.script.websocket_api_script
Source: helpers/script.py:485
First occurred: 22:24:19 (2 occurrences)
Last logged: 22:26:35

websocket_api script: Error executing script. Unexpected error for call_service at pos 1: unconverted data remains: :00
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 485, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 723, in _async_call_service_step
    response_data = await self._async_run_long_action(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 685, in _async_run_long_action
    return long_task.result()
           ^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2319, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2363, in _execute_service
    return await self._hass.async_add_executor_job(target, service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/import_statistics/__init__.py", line 42, in handle_import_from_file
    stats = _prepare_data_to_import(file_path, call)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/import_statistics/__init__.py", line 83, in _prepare_data_to_import
    stats = _handle_dataframe(df, timezone_identifier)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/import_statistics/__init__.py", line 165, in _handle_dataframe
    new_stat = _get_sum_stat(row, timezone)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/import_statistics/__init__.py", line 234, in _get_sum_stat
    "start": datetime.strptime(row["start"], "%d.%m.%Y %H:%M").replace(tzinfo=timezone),
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/_strptime.py", line 554, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/_strptime.py", line 336, in _strptime
    raise ValueError("unconverted data remains: %s" %
ValueError: unconverted data remains: :00

Sorry
Never mind, my fault … wrong time format 00:00:00 instead of 00:00

Just released, and immediately afterwards the first bug … I do check for a timestamp with seconds, but not in all places … I will fix this in the next version. At least the documentation is fine :wink:
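
In the meantime, you can sanity-check a file yourself before calling the service, e.g. with a few lines of pandas (a sketch; the "start" column name is the one from the traceback above, the file name and separator are made up):

import pandas as pd

# Sketch: fail early if any timestamp does not match "%d.%m.%Y %H:%M" exactly.
df = pd.read_csv("my_statistics.csv", sep="\t")

try:
    pd.to_datetime(df["start"], format="%d.%m.%Y %H:%M")
    print("All timestamps look fine.")
except ValueError as err:
    print(f"Bad timestamp in the 'start' column: {err}")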


Hi @klaj, thank you very much for your integration. It allows me to import all my old electricity usage.
The only issue I'm having is that I have to process the whole history every time I want to import anything new, in order to calculate the correct sum.
Would it be possible for the integration to auto-calculate the sum based on the entries already in HASS?
For example:
My power company gives me a CSV for each day with the kWh used every hour (so that would be the state). If I want to import the last day, I need to calculate the sum based on what I already imported. If I set the sum to 0, I get a negative usage.

I hope my explanation makes sense.
Cheers,
Liviu

Hi @liviu,

I do not understand completely. For further imports, you should not need the whole history, only the newest not-yet-imported values. Let's say you have imported up to Mar 1st, 23:00, and the sum is 1000 at this timestamp.

Then you just have to add the hourly kWh to 1000.
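
With pandas that is basically one line on top of the new file (a sketch; I assume a "state" column holding the hourly kWh, as you described, and that the last imported sum was 1000):

import pandas as pd

# Sketch: continue the sum from the last value already imported into HA.
last_sum = 1000.0                      # the sum at the last imported hour (assumption)

new = pd.read_csv("new_day.csv")       # hypothetical file with "start" and "state" (kWh per hour)
new["sum"] = last_sum + new["state"].cumsum()
new.to_csv("new_day_with_sum.csv", index=False)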

Does that make sense?

Of course I could also do that in the integration, but the main use case I have in mind is importing old values from before you started collecting data in HA, and then it's only a one-time action. Let's see if/when I have time. Maybe somebody else wants to do this? :wink:

Regards,
Klaus

I'll try to come up with an easier use case for this scenario.

Let's say that every hour I want to import the amount of electricity used in the last hour and add it to a “total increasing” sensor.

Every time I import it, I need to know the last sum and add the new value/state to it, as the energy dashboard always shows the usage as the difference between this and the previous sum.

I can save the previous sum in a script and always use it to calculate the new one, but I thought it might be easier for the integration to get the previous sum from the HASS database and calculate the new sum based on the new state.

Cheers
Liviu

In that case, wouldn't it be easier to create a sensor and somehow set its value each hour?

Hi Everyone,

Glad I found this integration :slight_smile: Like some others, I have switched integrations; in my case I switched from Fusion Solar to Huawei Solar to report the inverter generation. This happened around New Year. I kept the old Fusion Solar entity in the Energy dashboard config to be able to “rewind” to the previous year and analyze the history. But one day I made a mistake and deleted the Fusion Solar integration, which in turn removed the old entity … :man_facepalming: And to make matters worse, I only noticed this after several days, which meant I would create a gap in all the other statistics if I just restored a backup … So I just cursed a bit … and went on.

But when I stumbled upon this integration, I thought I would try to add my old entity's history to my new entity's history: I restored a January backup, updated HA to 2024.3.3 (to get the history download button), and saved last year's history of the old entity to a .csv file.

Then I reformatted the data in the file to match this integration's requirements. It took me several tries until I got it right, but eventually I have the statistics restored!

Thank You @klaj :+1:

Thanks for creating this integration. Two questions:

  1. What date/time format is being used for the import? I'm not aware of a standard that formats the date with periods. Is there some easy option to make pandas datetime64 output that format?

  2. What are people using for exporting long term statistics in order to fix them up and re-import?

Hi @TomMarg,

did you manage to do the import? Maybe you can share here what went wrong so that others can benefit.

Great that it finally worked!

The time format is “26.01.2024 00:00” (DD.MM.YYYY hh:mm), as you probably found in the examples (it is used in some parts of Europe, at least in mine :wink:). As of today, you have to convert it on your own with whatever tool you like. I already thought about making this more flexible, but did not find the time yet. Feel free to add this feature on your own.
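
If your data already sits in a pandas datetime64 column, dt.strftime does the conversion (a small self-contained sketch; the column name "start" and the sample values are made up):

import pandas as pd

# Sketch: format a datetime64 column as "26.01.2024 00:00" with dt.strftime.
df = pd.DataFrame({"start": pd.to_datetime(["2024-01-26 00:00", "2024-01-26 01:00"])})
df["start"] = df["start"].dt.strftime("%d.%m.%Y %H:%M")
print(df)   # -> 26.01.2024 00:00, 26.01.2024 01:00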

For a few versions now, you can export from the standard history graph to CSV. I was also thinking about adding this option to the integration.

I am exporting it from a web page at the grid/energy company. I imported it into Excel and did all the fixes there before importing it to HA.