How to store historical utility billing data (actual and estimated)

I am trying to figure out a way to store historical utility bill data in home assistant. I have the bill estimating logic set up and working very well, but I need to store historical data to compare to the actual bills.

It would also be great to be able to store, view and use the historical bill data. For the moment, I have 4 bills per utility being saved and referenced.

  • current month’s bill (estimated)
  • previous month’s bill (estimated)
  • 2 months previous bill (estimated)
  • actual latest bill data from retailer (manually entered)

I am wondering if there is something simple I can do with available core or add-on stuff to store the billing data or if I need to look into creating something new.

Here’s a screenshot of my billing dashboard to get a feel of the data. The last bill template is how I would like to store the data, so the line items match to my actual bill.

Home Assistant is great at running a state-model of entities from various integrations, and these can be used to easily create dashboards and automations based on state values and change events. What HA is not designed to do is store ‘user-data’ ad hoc. This means that reporting, summarizing or analysing the (long-term) historic entity-state is not a simple task.

I run HA for monitoring of my Solar PV system, and also pull in my energy use and billing consumption data from my UK utility company. I use historical utility billing data in several different ways, including composing historic billing information.

There are a couple of ways to approach getting at historical data. Since HA does not keep user-data, other than permitting input_entities which are auto-restored after restarts, any historical data older than the default 10-day window has to either come from the long-term statistics, or has to be stored independently to HA.

Long-term statistics
These are hourly samples of entity state values, saved together with appropriate statistical values (sum, average, min, max etc) and this data lasts forever. The downside is that not all entities are saved, no attribute data is saved, and you need to perform SQL enquiries on the long-term stats table directly so as to get data back. It does have the advantage of not requiring any other data storage.

HA has an SQL integration, which adds an entity based on an SQL query. As long as you can write the necessary query to return just one value, it works. Here I am computing the total (electrical) energy cost for the past week, taken from long-term data.

It is just a case of adding the SQL (sensor) integration and adding the SQL required.

SELECT SUM(state) AS total FROM
(SELECT id, state, udate FROM
    (SELECT metadata_id AS id, state, date(created_ts, "unixepoch") AS udate, time(created_ts, "unixepoch") AS utime
    FROM "statistics"
    WHERE metadata_id = 115 AND unixepoch()-created_ts < 86400 * 7 AND utime > "20" AND utime < "23"
    ORDER BY udate, utime DESC)
GROUP BY udate)

This is pulled from the HA statistics table, which uses ‘metadata_id’ as a reference for the entity, and this entity to metadata_id mapping can easily be read from the statistics_meta table by inspection.

The SQLite Web add-on makes access to the standard HA database very easy (just don’t edit the tables or contents!)

If you are happy to experiment, you may well find that you can create a new SQL entity value for each field (historic value) you want, each being taken from the long-term history of the relevant ‘bill estimate’ sensor you already have.

Own database
Since HA is not intended for user data storage, you have to go off piste for this one, although it is actually easier to do than should be admitted to.

Using the main HA database by adding additional tables is clearly risky, so I went down the route of purchasing a second HA machine (can’t break my main machine), setting up MariaDB (can’t break SQLite), and using Node-RED to collate a set of entity values each hour, which is pulled from the main machine to the second machine, and then stored in the DB every 24 hours. I now have 580+ days worth of hourly records for Solar PV, battery, grid and load energy use, together with battery SOC, solar PV forecast, weather, electricity energy tariff (agile) costs and utility energy consumption. This still requires SQL to extract and build the information I want, but with Node-RED I can generate tables of data and use it as I wish.

MariaDB comes as an HA add-on and can be run additionally and entirely separately to the standard SQLite setup. The phpMyAdmin add-on works like SQLite Web but for MariaDB, so I created and maintain the database tables manually using that.

Node-RED, although not as popular as templating, is easy enough to use with the mysql node set. Since MariaDB is server based rather than direct file access (SQLite) off-machine access is also possible.

I hope this helps!

That’s great advice. I’ve been thinking of using mqtt, storing the data in a sqlite db to start (can use sqlalchemy and db drivers for other db set ups) and using a Hass action to enter/retrieve data. I haven’t learned JS, so I can’t make a nice little add-on to display this data nicely (was thinking of using an image of my bill as a background to display the data on, just programmatically adding line items for display)

I think just having access to read/write data will suffice for a start. You’re right that it’s hard to work with historical data like this in Hass, having access to the billing data (estimated and actual) is crucial for dialing in energy usage and should, imo, be a core integration. Adding supplemental data (like weather forecasts, tarrifs, PV, etc) is critical as well (I’ve recently been tracking weather temp Delta, RH and pressure to see how much of an effect it has on heating/cooling energy)

I haven’t worked with custom actions yet, but I do have experience making mqtt add-ons for Hass. So hopefully this idea has some fruit to bear soon.

I am also working on an mqtt add-on that pulls data from my provincial energy operator to keep an eye on the grid, let me know if load shedding is required and when grid alerts are broadcast.

I haven’t used node red yet, so am unsure of how to use it to display or categorize the data. I’m still in the thinking about/planning phase of this project and am looking for ideas like you have supplied. Eventually I would like to share the project with the community so they can also store energy data.