Backfilling energy statistics for dashboard

I’m wondering if there would be a way to back fill the stats (similar to influx db filling) for the sensors I have for the energy dashboard. I keep a year of raw sensor data in mysql, so technically converting that into statistics for the energy dash would be feasible. Any one ideas ?

4 Likes

There is no facility for performing this. Home Assistant records data in real time.

Wouldn’t it be possible to directly edit it in the sqlite database? Have been looking into that, but am far from a database expert…

Sure you could write a python script to edit the recorder database.

any idea where we can see the code that does the current logic in HA? i’m no developer, but seeing the functions and inserts, might give some idea

I’m looking into this also.

I’ve been putting my meter data into an influx db to generate overviews. So i’ve got almost 1,5 years of history i would like to add to the energy graphs.

1 Like

thx… having a look at db too now. Peaking around one sensor and seeing the statistics table data compared to states. aggregating per sensor per hour should be SQLable

though… the running total column starts at 0 on august 4th… so backfilling would also require updating current records.

1 Like

It would be interesting to know if backfilling could be done by using negative numbers of the sum.

not sure what you mean, but making period before the 0 , a negative value, would not create proper reports per time period probably.

Would be a good idea to make it possible to import data in some way for exemple for data like this.

Any luck with this?

Just starting to look into this and was hoping someone already had a solution. This is interesting detail and I guess somewhat of what I would expect.

To avoid having to rewrite current records, which would require some downtime, one thought is to create a new dummy sensor that contains the historical data. I believe there is support for more than one sensor in some of the categories like solar and consumption.

Hopefully at some point the Home Assistant functionality will expand in this area to support their goals on energy. I’ve got historical data for solar production from my inverters, electric utility provided data for net metering (both flows from and to the grid for 15 minutes intervals) for over two years.

I also wonder if they’re working on something for storing this time series data more efficiently. I’ve got to imagine the statistics table is going to get pretty big over time.

2 Likes

having cooked the ideas a bit longer. I think filling a v2 table for stats would be best approach. then when all is good, rename normal stats table with _org then rename v2 table to normal stats.
Building the data is not that easy for me. finding the right 2 values each hour. subtract values, perist and then next hour, I do have not the developer capabilities for. This can be done in either stored procedure/plsql or in script outside db (python or whatever someone has tricks on).
This way you could run from start date recorder data and have real short down time when renaming table and make multiple attempts if there is some fuck up.

Even safer is for development is to just take a copy of the DB and work on that until you are confident with your procedure. I copy it from the RPi 4 that is it running on to my windows laptop, that way there isn’t any way you’ll block home assistant from writing to the DB.

Hopefully the home assistant devs will come up with something so that many of us won’t have to try to kludge something together.

3 Likes

IT seems that simply adding new rows to the statistics table doesn’t work, the ids of the rows should be in the same order as the start dates, so adding old start dates with high ids causes problems. I ended up doing this:

  1. Pick a sensor which already has stats going back in history, but you don’t care about it (for me, temp reported by a motion sensor somewhere in the house was a good choice).
  2. This sensor already has rows in the stats table, so simply update the state column.
  3. Change the metadata id for the affected rows to that of the power meter you would like to backfill data for.
  4. Fix the sums, for a total increasing sensor, update statistics set sum = state - where metadata_id=xxx

So I had a few years of stored energy data in CSV files that I wanted to import, so I wrote a small Go program to do this, at GitHub - aamcrae/ha-backfill: Utility to backfill Home Assistant energy data

It took some trial and error to get it right, since it required rewriting both the statistics and statistics_short_term table records for the relevant sensors. Trying to selectively insert old records into an existing database was just too tricky because of the summing etc.
I found that the records in both tables must match in terms of values and sums, otherwise the historical data gets quite out of whack.

If you didn’t have CSV data that is up to the minute, it may be useful to extract the current records and merge them with historical records before deleting and rewriting the records in the tables.

I found that once you have the data, it’s easy to create an SQL script that can be applied, to minimise the downtime. I found it helped to stop the container, copy the database to /tmp (a fast disk), apply the script, then copy the database back. Of course, it’s easy to make a copy prior to the change, so that if you find something has gone wrong, you can just revert to the save database.

So now I have historical data going back several years:

5 Likes

Hi,

First off thanks for the utility. I have a few questions.

  1. My data is in 30min intervals, if I am reading your notes I need to have this every 5mins, would just splitting it to every 5 mins with entries between the 30mins to just be 0 be sufficient?
  2. Would this work with MariaDB or SQLLite only?
  3. Can I do the import one type at a time? example just the grid import then do the others?

I tried the script you have done, and I see the en titties in the DB but my energy dashboard is still empty

Somewhat related @frenck has a new custom integration of “things/features, that will never end up in Home Assistant itself”, as part of that there is a service recorder.import_statistics:

The description in the README.md is only “Advanced service to directly inject historical statistics data into the recorder long-term stats database.”

Without installing it to see the dialog from Developer tools → Services, the services.yaml gives some clues about parameters and how to use it:

recorder_import_statistics:
  name: Import statistics 👻
  description: >-
    Import long-term statistics.
  fields:
    statistics_id:
      name: Statistics ID
      description: The statistics ID (entity ID) to import for.
      required: true
      selector:
        entity:
    name:
      name: Name
      description: The name of the statistics.
      required: false
      selector:
        text:
    source:
      name: Source
      description: The source of the statistics data.
      required: true
      selector:
        text:
    unit_of_measurement:
      name: Unit of measurement
      description: The unit of measurement of the statistics.
      required: false
      selector:
        text:
    has_mean:
      name: Has a mean
      description: If the statistics has a mean value.
      required: true
      selector:
        boolean:
    has_sum:
      name: Has a sum
      description: If the statistics has a sum value.
      required: true
      selector:
        boolean:
    stats:
      name: Statistics
      description: >-
        A list of mappings/dictionaries with statistics to import.
        The dictionaries must contain a "start" key with a datetime string
        other valid options are "mean", "sum", "min", "max", "last_reset", and
        "state". All of those are optional and either an integer or a float,
        except for "last_reset" which is a datetime string.
      required: true
      selector:
        object:

The code is here:

The code looks very straightforward/minimal and uses the recorder integration so it should be database agnostic.

There is this community thread for the Spook integration, but so far no clarifying details usage of import statistics:

1 Like