Support for multiple recorder components

It would be great to have multiple recorder components:
one for short-term storage, to be used by hass itself,
and a second one for long-term storage, maybe even offsite.

It would be nice to be able to configure which of the recorders hass should use to fetch data.
Per-recorder purge / exclude configuration would also be really nice.

There is an old thread about this here

Wow. Looks like there isn’t a big demand for this (or did I miss anything in the meantime?). Strange, because in my opinion, too, it is crucial to have a short-term history with all data and a long-term history with selected data. @devs: Can we maybe get a feeling for whether this is hard to change or not? Thanks!

I also think this would be useful, for example for long-term storage of weather data, energy data …

If the recorder interface is properly designed, I think this shouldn’t be too hard to implement.

I’m quite surprised only 12 people (including me) voted for that feature!
I just switched from OpenHab and this is actually one of the extremely rare things that were handled better there.

Yes, this or something similar would be very good.

For most things, I only care about a couple of hours of data for on-the-spot troubleshooting (wait, what just happened?).
For some things, I want a couple of days of data to review/analyze regularly.
For a few things, as needed, I want a few weeks of data (for longer-term troubleshooting).

Not everyone wants an additional influxdb for long-term storage, so this feature is very important for a lot of users.

I’d also be very keen on this.

I just had a thought.

I might create another empty HASS database on my MSSQL server, and set up a nightly agent job to copy data that I’m interested in across.

My Grafana instance will then pick that up for reporting purposes.

I could possibly also do it trigger-based, but I don’t need real time, and filtering inside the triggers would potentially slow things down a bit.
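
A nightly copy job like the one described above could look roughly like this sketch. It uses Python’s built-in sqlite3 module as a stand-in for MSSQL, and the single `states` table with `entity_id`, `state` and `last_updated` columns is a simplified assumption, not the real recorder schema. The function name `copy_selected` is hypothetical too.

```python
import sqlite3

# Simplified, assumed schema; the real recorder tables look different.
SCHEMA = """
CREATE TABLE IF NOT EXISTS states (
    entity_id TEXT NOT NULL,
    state TEXT,
    last_updated REAL NOT NULL,
    UNIQUE (entity_id, last_updated)
)
"""


def copy_selected(source, archive, entity_ids, since):
    """Copy rows for the chosen entities newer than `since` into the archive.

    Returns the number of rows that were newly inserted.
    """
    archive.execute(SCHEMA)
    if not entity_ids:
        return 0
    placeholders = ",".join("?" for _ in entity_ids)
    rows = source.execute(
        "SELECT entity_id, state, last_updated FROM states "
        f"WHERE entity_id IN ({placeholders}) AND last_updated > ?",
        (*entity_ids, since),
    ).fetchall()
    before = archive.total_changes
    # INSERT OR IGNORE makes the job safe to re-run: duplicates are skipped.
    archive.executemany(
        "INSERT OR IGNORE INTO states (entity_id, state, last_updated) "
        "VALUES (?, ?, ?)",
        rows,
    )
    archive.commit()
    return archive.total_changes - before
```

Because of the unique constraint on `(entity_id, last_updated)`, running the job twice over the same window should not duplicate rows, which matters if the agent job is ever retried.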

It’s pretty easy to spin up an influxdb instance in a container (either as part of a Home Assistant Supervised add-on, or just a plain docker container). Influxdb is better optimized for large volumes of time-series data than the recorder database seems to be.

What would be nice is a way to abstract out the storage with a layer that could combine and produce a common view of the historical data that’s stored in multiple databases. Or to be able to use influxdb to feed information into lovelace UI elements that display time-series data, like all the various graphing cards. Of course, you’d also want to be able to dynamically select the interval for the graphing cards… and now you’re most of the way to grafana. Maybe we just need a grafana lovelace card?
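
As a rough illustration of such a “common view” layer, here is a minimal Python sketch that merges rows from several stores into one time-ordered stream. The row shape `(timestamp, entity_id, state)`, the function name `combined_history` and the sample data are all assumptions for illustration; nothing here is an existing Home Assistant API.

```python
import heapq
from operator import itemgetter


def combined_history(*sources):
    """Merge several time-ordered row iterables into one time-ordered stream.

    Each row is assumed to be a (timestamp, entity_id, state) tuple, and each
    source must already be sorted by timestamp.
    """
    return heapq.merge(*sources, key=itemgetter(0))


# Hypothetical data: recent rows from the recorder, older ones from an archive.
recent = [(300, "sensor.temp", "21.5"), (400, "sensor.temp", "22.0")]
archived = [(100, "sensor.temp", "20.0"), (200, "sensor.temp", "20.5")]

history = list(combined_history(archived, recent))
```

`heapq.merge` is lazy, so the merged stream can be consumed row by row without loading both stores into memory first.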

I had the same idea when I first saw the recorder component. Why is it not possible to configure a duration of days, or even minutes, per sensor?

For example, raw sensor data that is only used in a template can then be deleted. A duration of 1 (one!) minute is more than enough for these sensors.

Another group of sensors is totals, which should be kept for a long time and probably filtered… but that’s another story, and the full use case is described here:

This type of thing keeps coming up. This thread is from 2018!

I think Recorder desperately needs a significant re-think. My own proposal, which aims to improve things without a major change, is here:

Hi All,

I’ve quite limited Python experience, but I’ve spent some time adding a parameter to recorder which lets you specify entities to keep when it’s running a purge. Using this, you could set your purge duration to, say, 7 days, and still keep things like temperatures in the database forever.

Please take a look at this commit, which contains the changes: Implement Entities to Keep in Recorder · KiLLeRRaT/home-assistant@4db4d66 · GitHub

Here is an example of the configuration.yaml file using this new param:

recorder:
  db_url: "mssql+pyodbc://User:Pass@ip/homeassistant_dev?charset=utf8;DRIVER={FreeTDS};Port=1433;autocommit=True;"
  purge_keep_days: 1
  keep_entities:
    - person.albert_gouws
    - sun.sun
  auto_purge: false

Note that auto_purge is false above; I’ve just been using the purge service to test this.

I call the purge service from the dev tools using the following:

service: recorder.purge
data:
  keep_entities: 
    - person.albert_gouws
    - sun.sun
  apply_filter: true

Someone smart should be able to extend this by introducing a level under my keep_entities, so that you can specify an entity and a duration for that entity. It should then be quite easy to update the SQL queries I’ve got in place there to take into account the duration you’d want to keep it for.
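
To make the idea concrete, here is a rough Python sketch of a purge with a per-entity duration. It is not taken from the commit linked above: the `states` table with `entity_id` and `last_updated` (Unix seconds) columns is a simplified assumption, and the `purge` function and its `keep_days_by_entity` mapping are hypothetical names.

```python
import sqlite3


def purge(conn, now, default_keep_days, keep_days_by_entity):
    """Delete rows older than each entity's retention period.

    `keep_days_by_entity` maps entity_id -> days to keep; None means keep
    forever. Entities not listed fall back to `default_keep_days`.
    Timestamps are assumed to be Unix seconds.
    """
    day = 86400
    deleted = 0
    # Entities with their own retention period.
    for entity_id, keep_days in keep_days_by_entity.items():
        if keep_days is None:
            continue  # keep forever
        cur = conn.execute(
            "DELETE FROM states WHERE entity_id = ? AND last_updated < ?",
            (entity_id, now - keep_days * day),
        )
        deleted += cur.rowcount
    # Everything not listed uses the default retention period.
    overridden = list(keep_days_by_entity)
    placeholders = ",".join("?" for _ in overridden)
    cur = conn.execute(
        f"DELETE FROM states WHERE entity_id NOT IN ({placeholders}) "
        "AND last_updated < ?",
        (*overridden, now - default_keep_days * day),
    )
    deleted += cur.rowcount
    conn.commit()
    return deleted
```

With something like `purge(conn, now, 1, {"sun.sun": None, "sensor.temp": 30})`, `sun.sun` would be kept forever, `sensor.temp` for 30 days, and everything else for the default of 1 day.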

Anyone keen to look at my code and basically write snippets of what I need to update to support this? Or if you’re keen, just do it and create a PR into mine, and we can then submit it as a PR together into the Core project.

Cheers!

Nice! This is a step in the right direction, thank you!

And that would be the next step.

I envision a way to specify a “keep” duration for each entity. Anything not specified would still take the default, but it’s obvious that events and state changes each have different retention requirements.

The final step would be to add the ability to specify the retention period for each entity in the GUI, both at the time it’s created and later, maybe as a list on a dedicated screen just for adjusting retention periods of all events and state changes.

@KiLLeRRat did you make a pull request for your changes? It would be nice to have this officially.

Hi @MarH,

I have the fork, and I think I’ve done enough to get it working, but I haven’t tested it thoroughly.

Since then, I went down the route of keeping only 30 days in MSSQL, with the rest going into influxDB. It seems to be working quite well for me, so I won’t be spending any more time on the recorder component in the near future.

You can find a link to the commit in one of my earlier posts :)

Regards,
Albert

Voted. But before a 2nd recorder option is added, the first one needs to be rewritten to use the database as a relational database with native datatypes, not just a dumping ground pseudo flat file full of blobs of JSON.

I’d love this to be added.

Re-reading this, it’s striking how little love Recorder has gotten from the Dev community since this FR was first opened four years ago.

In fact, not only has more demand been put on Recorder with the advent of the “Statistics” and related tables, but that table is designed to hold data permanently, with no purge option or ability to opt out of populating it. I’m a big believer in keeping HA simple and efficient enough to run on a RPi with an SD card. Yet we still see things done with Recorder which make that impractical for beginners.

I’ve done DBA work in my career. Like @KilleRRat, I’ve got a solution which works for me. I just hate to see HA getting more and more unfriendly to beginners and non-experts.

Yeah. I’ve been using my server as my DB for a while now. The HASS DB could only keep a few days’ worth of data before filling up.

But now it’s slow as shit, taking a minute to load a week’s data. It’d be great if it could keep 2 weeks of data on the SD card and then keep archival data on a slower networked server.