Improved support for long-term historic data

Currently, historic values are only available if they were recorded from entities by HA itself in real time.
Especially with the recent additions for energy support, there is a growing need to include existing historic data from other sources (e.g. historic smart meter values) and to fix wrong historic values.

Many forum topics show this need, along with the recurring recommendation to clean up or delete the database to fix errors that come up every now and then (which is absolutely not acceptable if you want a reliable and safe way to store long-term historic data, especially for energy consumption but maybe other data as well), e.g.:

It would be great if HA provided a better way to handle long-term historic data (e.g. attaching and/or importing data from data sources with historic values; the possibility to fix wrong or missing historic data) to make HA much more useful as a complete smart home integration solution.

Here’s a list of (my personal) requirements and expectations to start a discussion about what support for long-term historic data could look like in Home Assistant:

  • Never ever lose historic data (unless explicitly removed)
    • keep data when removing/renaming entities or re-integrating them which may cause them to get new unique IDs
  • Pull/push/import historic data from different data sources
    • e.g. entity values, local systems, cloud services, CSV files, form to manually add data, …
  • Freeze data of certain time ranges to mark them as confirmed and non-mutable
    • e.g. after comparing the data with the real energy consumption of the energy provider invoice
  • Merge data from different sources to the same metric
    • e.g. replacing a power meter results in a new entity that may start counting from a different offset; still, it’s the same logical power consumption
  • Fix wrong data
    • Misconfiguration or device misbehavior often causes wrong historic data to be recorded. It should be possible to fix this data (in case the correct values are known) or remove it
  • Provide a way to integrate traditional (non-smart) consumption data
    • e.g. manual meter readings once a year
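
To make the "merge data from different sources" requirement a bit more concrete, here is a minimal sketch of what it could mean for a replaced meter. This is not existing Home Assistant functionality: `Reading` and `merge_meter_series` are hypothetical names, and the sketch assumes cumulative kWh readings, sorted by time, with no consumption between the last reading of the old meter and the first reading of the new one.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Reading:
    """One cumulative meter reading (hypothetical structure, not an HA type)."""
    timestamp: datetime
    total_kwh: float


def merge_meter_series(old: list[Reading], new: list[Reading]) -> list[Reading]:
    """Append `new` to `old`, shifting `new` so the running total continues.

    Assumes both lists are sorted by timestamp, that `new` starts after
    `old` ends, and that nothing was consumed in between.
    """
    if not old or not new:
        return list(old) + list(new)
    # Offset that makes the first new reading equal the last old reading.
    offset = old[-1].total_kwh - new[0].total_kwh
    shifted = [Reading(r.timestamp, r.total_kwh + offset) for r in new]
    return list(old) + shifted


old_meter = [
    Reading(datetime(2022, 1, 1), 12000.0),
    Reading(datetime(2022, 1, 2), 12010.0),
]
new_meter = [  # replacement meter starts counting from zero again
    Reading(datetime(2022, 1, 3), 0.0),
    Reading(datetime(2022, 1, 4), 8.0),
]
merged = merge_meter_series(old_meter, new_meter)
print([r.total_kwh for r in merged])  # [12000.0, 12010.0, 12010.0, 12018.0]
```

A real implementation would also have to handle overlapping time ranges and consumption that happened while no meter was reporting; the sketch ignores both.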

Any additional suggestions?


Great collection. Only a little to add:

A real-life example of how difficult (and, for beginners, quite dangerous) it is to fix the energy dashboard (statistics and statistics_short_term data):

One thought: maybe expanding the purge_keep_days option could be a starting point.
Another thought: you currently need a workaround for the renaming/re-adding scenario to re-link data with an entity_id, as the data is still there (in the recorder DB). This requires hacking the recorder database and/or the core entity registry, which is of course somewhat dangerous and nothing for normal users.


Thanks, @e-raser, I’ve added it to the list.

Great suggestion. I bring solar and energy use data into HA, but sometimes the inverter goes offline or data is not received in HA for some reason. I would like HA Energy to have a feature similar to the PVOutput Bulk Loader (see "Bulk Loading" in the PVOutput documentation), where you can upload missing readings from a CSV file of data.
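
As a rough illustration of the kind of gap filling I mean, here is a minimal sketch. It is neither a Home Assistant nor a PVOutput API: the function name `fill_missing_from_csv` and the CSV layout (a `timestamp,energy_kwh` header with ISO 8601 timestamps) are assumptions for the example. Readings from the file are only added for timestamps that have no value yet, so already recorded data is never overwritten.

```python
import csv
import io
from datetime import datetime


def fill_missing_from_csv(
    existing: dict[datetime, float], csv_text: str
) -> dict[datetime, float]:
    """Return a sorted copy of `existing` with CSV readings added for
    timestamps that are missing. Existing values are left untouched.

    Assumed CSV layout (hypothetical): header `timestamp,energy_kwh`.
    """
    merged = dict(existing)
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.fromisoformat(row["timestamp"])
        # setdefault only inserts when the timestamp is not present yet.
        merged.setdefault(ts, float(row["energy_kwh"]))
    return dict(sorted(merged.items()))


recorded = {
    datetime(2022, 6, 1, 10): 1.2,
    datetime(2022, 6, 1, 12): 1.5,  # 11:00 is missing (inverter offline)
}
upload = """timestamp,energy_kwh
2022-06-01T10:00:00,9.9
2022-06-01T11:00:00,1.4
"""
result = fill_missing_from_csv(recorded, upload)
for ts, kwh in result.items():
    print(ts.isoformat(), kwh)
```

In this example the 10:00 value from the file is ignored because a reading already exists, and only the missing 11:00 reading is added.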


This would be a great addition to the power section. I’m not sure how other utility companies provide their data, but my local company provides all the usage data as CSV and via an API, delayed by about 24 hours. Being able to import this data would mean that anyone with a utility company that uses a similar setup could view their power usage without any extra hardware. Although it’s not as granular as something like a per-outlet meter, it sure is a lot cheaper and still gives good data.

Hopefully this gets a little more attention because the power section as of now is a bit lacking imo.

My gas company provides hourly consumption data, but the data only becomes available at 13:00 and the newest data point is from 07:00. So the data shown in the energy panel is 8h - 32h old, which makes the energy panel almost useless without historic data import.