Custom Integration to import long term statistics from a file like csv or tsv

The most probable cause of the history_stats disappearing after a while, even when they’ve been visible for a short while, is that somehow Home Assistant doesn’t allow/recognize long term statistics in its database that it hasn’t created itself using the history_stats platform. And then it simply removes what it considers to be invalid data. I don’t think this integration is to blame for the disappearance.

However, there is also the possibility that long term statistics from the history_stats platform are somehow stored in a different way than those for regular template sensors. And thus that upon importing from a csv file they need to be stored differently?

Perhaps someone with more intimate knowledge of how history_stats are stored could comment on this?

In an attempt to answer my own question, I queried the database directly, and I looked at the long term statistics of 2 sensors. One is a regular template sensor that records the total amount of energy consumed by a device, the other comes from the history_stats platform and it records the time the device was running on a day by day basis.

There is one row per hour, as is to be expected. Both sensors have a timestamp in the start_ts column (and some other housekeeping columns), but the way the actual stats are stored is different.

The sensor for total amount of energy stores the data in the state and sum columns, but the running-time sensor has the data stored in the mean, min, and max columns.

I can also see that the data I tried to import for the history_stats sensor actually did end up in the database table (which is called “statistics”). However, since I labeled the columns in the CSV file as ‘state’ and ‘sum’, that’s also where they ended up. So the table has null for mean, min, and max, and some values in state and sum.

It seems that Home Assistant ends up deciding to ignore these state and sum values for a history_stats based sensor, since those values aren’t supposed to be there to begin with. So even though the data is in the database, in the end it doesn’t get displayed. Which makes sense, it wasn’t supposed to be there in the first place.

So in conclusion, rather than importing the history_stats data in the state and sum columns, I should have used a mean, min, and max column.

I need to figure out now how to structure those, but more importantly how to get my hands on them. When I get a graph of this sensor that comes from history_stats in the history screen of Home Assistant, and then use the Download feature, it gives me the data as ‘state’ (which might be a bug in Home Assistant actually).

How about importing a log file as json? Would it be big job to add json support to this custom integration?
Background: I have build using python some basic home automations at home. I have been using JSON as logging and now I am trying to start using home assistant. It would be nice to add couple of years worth of log’s from my home in to home assistant.

Great you found that out!

As I said, I have no idea about history_stats sensor, but is it possible to set the state_class for them?

For “normal” sensors, I set the state_class for total_increasing for counters (state, sum) and to measurement for analog values (mean, min, max).

Its also interesting that there is no error when historical values are written which do not fit to the sensor …

Did you try to convert your JSON data to CSV? There are online converters as well, and I suppose you anyhow have to format your data to fit to the specification of the integration.

Apart from that, i am using pandas read_csv. There is also a method to read JSON in pandas, so it should not be that big deal … do you have programming skills? Then you could try it on your own, starting point would be here.

I’m trying to import the last 3 months of electricity from my meter (I have the kwh values ​​of each hour in a csv file). I would like to point out that it cannot be integrated into HA, so I have to create a sensor template from scratch (total_increasing with model state equal to 0).

What steps should I take to avoid incorrects spikes? I’ve made several attempts, but I always get these spikes. See image (today).

Please check Custom Integration to import long term statistics from a file like csv or tsv - #63 by TomMarg, does it solve the issue?

Yes. Thanks.

I was successfully able to import using Spook. I used the following format for the service call:

service: recorder.import_statistics
data:
  has_mean: false
  has_sum: true
  statistic_id: sensor.test_spook_import_energy
  source: recorder
  unit_of_measurement: kWh
  stats:
    - start: "2024-07-02 00:00:00-05:00"
      state: 1
      sum: 1
    - start: "2024-07-02 01:00:00-05:00"
      state: 1
      sum: 2
    - start: "2024-07-02 02:00:00-05:00"
      state: 2
      sum: 4

I downloaded my intervals from Xcel Energy’s website. Opened in Excel and created a simple concat function to generate the individual hourly starts, states, and sums. Pasted into the Statistics field (or you can just copy the full YAML above).

I was capped at roughly 2,000 entries, but I think it varies depending on the total length of the individual entries.

Do NOT attempt without doing a full backup prior to using it though. Run the import on a test sensor first, make sure it works the way you want, then do an import on the sensor.

I did have an issue with the final hour on the import. Usually it was some negative value. Easy to fix a single entry though. Just look at the outliers in the edit stats window to see if there are any after importing.

Hi Klaus.

I try to use your extension to import “Energy.Total” data logged from a Tasmota plug from a csv to HA.
I have problems installing your extension.
I have HA running in a docker container in Ubuntu 22.04.

I did the manual install, creating the custom_components and homeassistant-statistics folder.
My configuration.yaml I manually edited that it looks like this:

# Loads default set of integrations. Do not remove.
default_config:

import_statistics:

# Load frontend themes from the themes folder
frontend:
  themes: !include_dir_merge_named themes

automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml

However, when I go to developer tools in HA and I let check my config, there is an error:
Component error: import_statistics - No module named 'custom_components.import_statistics'

Can you help me what I am doing wrong?

1 Like

I do not think you can export information for statistic. History display only one value for a sensor like

sensor.temperature_entree,20.450629604166668,2024-06-01T12:00:00.000Z
sensor.temperature_entree,20.5,2024-06-01T13:00:00.000Z
sensor.temperature_entree,20.5,2024-06-01T14:00:00.000Z

therefore I have no idea to output existing statistical data ?

I did not manage to fix the manual installation and still don’t know what was wrong.

But I moved to the installation method with HACS and now it works. :slight_smile:

Thank you for your work!

OK, this is probably the mean value only. Then I also do not know how to export min and max, sorry.

I might have missed this one, but if i export my history via the energy dashboard page (which is csv) Would that allow me to upload the data back in (as it seems to be gone at the moment, but i have an old backup which still has it)

I’m hoping someone can help me understand a few things. I had emporia tracking kwh import/exported, and integrating into the energy dashboard. That stopped working (due to utility), and i’m trying to get the same info with greenbuttondata. I exported the data for a period, and massaged it in Python, where i have the correct datetime format with a 60 minute period, sensor name, and kwh value (depending whether it’s consumption or generation).

Looking at existing generation/generation sensors from emporia integration, the displays seem to be a running sum (ie 5000khw from the beginning).

Can i make periodic imports of this data work with only 60 min readings, and not the sum?

Any tips around importing older stats into the existing energy dashboard?
How would you consider the sums?

If I just do the sums for the imported data, I realise some of the entries go a bit crazy :frowning:

I never look at the csv-data exported from the engergy dashboard, but I would say yes, that should work. For sure you will need to reformat the data.

Counters (and the energy dashboard probably only has counters) are much more complex than “normal” sensors. Make sure to have a look at the readme, search for the bullet point “To understand what sum and state means”.

No, using only the 60min readings is not possible. You will have to calculate the sum out of them, and then do an import.

Counters are much more complex than “normal” sensors. Make sure to have a look at the readme, search for the bullet point “To understand what sum and state means”.

Ah, and afaik the energy dasboard calculates values from the input data, so you have to import the input data, not the calculated values (could also be interesting for @riemers)

Does that help?

I am working on this point " You have to align the imported values with the first current value in your database, otherwise there will be a spike, as the difference between e.g. to energy values at 00:00 and 01:00 is the used energy for the hour starting at 00:00" so that the import does that automatically, but that takes more time …

Thanks for the help @klaj

Yea saw the statement in the readme and followed it actually but still ended up with some weird spikes in the graph I had to manually fix HA.

I was trying to import data from last year so hence why I was confused because it would mean then I have to basically reimport everything to make sure the sums are good otherwise I have to start with a negative value?

Wondering if there is a simple way of recalculating the sums when we do an import