Recorder Retention Period By Entity

I’d like to be able to set a Recorder keep_days value for each entity.

Background: New and non-technical users are steered toward running HA on an RPi with an SD card. Yet, if they take the defaults, every event and state change is recorded in the SQL database on the SD card, leading to a bloated database file and shortened SD card life.

Yes, I know there are lots of ways to change these defaults, including using other databases and database locations. But for most beginners, it’s a steep learning curve and this isn’t going to be high on their list - until something fails.

Not all new users have experience with database management. Forcing them to install software, download the database, run SQL commands, figure out which events and state changes they want to include or exclude, and create the appropriate config lines is a lot to ask.

I can do all that, but even so I’m frustrated by the fact that I can specify only one retention period. There are things I want 24 hours of log data for, and others I want to keep for 7 days.

The whole recorder process seems very primitive compared to the rest of HA.

To me, the best solution would be to allow the (new) user to establish a keep_days for each entity. This should be easy to do when the entity is first defined, and wherever its properties are displayed for edit.
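
For illustration only, something like this is what I have in mind (the per-entity keep_days below is the feature being requested; it is not an existing Recorder option):

recorder:
  purge_keep_days: 7                 # existing global default
  entities:                          # hypothetical per-entity overrides
    sensor.outdoor_temperature:
      keep_days: 365
    binary_sensor.hallway_motion:
      keep_days: 1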

I see this has come up before, but didn’t get much traction. Anyone else?

Yep. I’ve got a similar use case…

Most things, I only care about a couple of hours of data for on-the-spot troubleshooting (wait, what just happened?)
Some things, I want a couple of days of data to review/analyze regularly
A few things, as needed, I want a few weeks of data for (longer-term troubleshooting)

4 Likes

For me, this is one of the most important features, and it should be implemented as soon as possible.

6 Likes

This is exactly the feature I’m missing.
I would need to be able to choose which entities’ data goes into long-term storage and, for example, keep everything else for only 24 hours.
If it were built directly into the system, the setup would be easy even for beginners like me who do not want to work with multiple databases and do not use the advanced capabilities of other databases such as InfluxDB.
Right now the size of the database grows unstoppably. Because I want to store the history of the temperature and of the amount of water in the well long-term, I have to store everything long-term.

3 Likes

If this ever gets implemented, let me list a few ideas and questions for consideration. This is almost a brainstorm (i.e. it needs more thought and refinement).

  1. Limiting to “x days” or “y entries”. In other words, old entries would be deleted if they are older than x days, and/or if there are already more than y entries (a configuration sketch follows this list).
    • For most (or all) entities, having a minimum number of entries means there will always be a little bit of history. For entities with frequent updates that means at least a couple of minutes of history, while for mostly stable entities it gives a better overview of how they change.
    • Having a maximum number of entries means that no single entity can spam the database and consume a disproportionate amount of the history.
    • Having a time limit is more tangible than a number of entries. It also helps to make sure that at least a certain amount of time is always visible in the history.
  2. Discarding attributes instead of deleting entries.
    • Whenever a state is saved in the database, all the attributes and some extra metadata are saved together with the state itself. Some of these attributes can be quite large (a very long string), and many are not relevant or useful in the history (even more so because there is almost no UI for viewing or using old attributes).
    • Thus, instead of deleting an entry, it can be stripped down to the bare minimum (entity id, datetime, value, and possibly the user/action that caused the state change), and that by itself can save a lot of space.
  3. Reducing the time-resolution instead of deleting old entries.
  4. This advanced history management logic could be built by the community before being integrated into HA Core.
    • Someone could write an integration (or an add-on, or a script) that would do this advanced history manipulation. This would allow for quick iteration and experimentation before it gets mature enough to be included in the main HA project.
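
As a very rough sketch of how ideas 1 and 2 could look in configuration (every key below except purge_keep_days is hypothetical and does not exist in the recorder integration today):

recorder:
  purge_keep_days: 10                # existing global default
  retention:                         # hypothetical per-entity rules
    sensor.power_meter:
      keep_days: 2                   # idea 1: time limit
      max_entries: 10000             # idea 1: entry-count cap
      min_entries: 100               # idea 1: always keep a little history
    sensor.well_water_level:
      keep_days: 365
      strip_attributes_after: 7      # idea 2: keep the state, drop attributes
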
6 Likes

Yes, we need to do some brainstorming! The more minds working on this the better.

Great idea! This alone may have avoided all the DST crashes we had recently, which (from what little I know) seemed related to massive unintentional spamming of the database.

Another great idea! This brings in the overall database design, which is embarrassingly poor for such a great product as HA. Dumping the whole long string of attributes into every record just because one of them changed is pretty sad. It’s almost as bad as including the unit of measure with every state change value. But I digress.

Nice. A bit more complex, but establishing a different precision for recording vs. real-time sampling makes perfect sense to me.

Your point 4 goes for all of this. Brainstorm, discuss, build consensus and model the ideas before baking them into the finished product. Not the way things are always done around here, but I’m a strong advocate for this sort of process.

1 Like

TL;DR: I acknowledge that specialized tools are better suited for more serious big data analysis, and that HA shouldn’t try to reinvent those tools. And I share my thoughts on the current database design, while also mentioning this is becoming off-topic.


Let me add that there are already tools specialized in long-term storage of large amounts of metrics (InfluxDB, Graphite), aiming for both a manageable size on disk and fast queries. And although those tools already provide some graphing UI, there is also Grafana, which can connect to many different kinds of storage (including MySQL/MariaDB) and provides a very powerful graphing interface. Whatever HA provides (either now or in the future) will likely be inferior to those specialized tools; and that’s fine, because that’s not the main objective of HA, and because we are trading the advanced features of such tools for simple storage inside HA that is seamlessly integrated with Lovelace.

So, for advanced users who really want the full power of big data in their home, those tools are better suited. Having them as add-ons makes them very convenient to install and use, which is already a great step. (Would be better if their storage path was configurable, though: #13, #21, #120, #179)

That said, this feature request is about improving the out-of-the-box Home Assistant metric recording, while keeping it lean and simple. Maybe the best solution would be to embed one of the existing tools/libraries (such as whisper) into HA, instead of writing lots of new code in HA; but that risks becoming bloat.


You’ve been complaining about the database design, but not providing concrete examples of why it’s bad, nor suggestions for how to address the issues. (And I think that’s off-topic for this thread anyway, so just provide links.)

Sure, the database is not totally normalized, but it is simple to understand and simple to use. And it runs even on low-powered devices. (I have no idea whether heavier use of joins would hurt performance, because they would require many seeks.) And it evolved organically into what it is today. And any change requires a database migration on 0.1M installations, so schema changes have to be done carefully. So, while not perfect, it works. Maybe using Cassandra or HBase would be more efficient than any relational database, but SQLite has the advantage of being extremely lightweight and embeddable into other projects.

So, given all the constraints, it’s not obvious to me how bad the current design is, nor is it obvious what a better solution would be. But I digress, as this discussion seems off-topic. Heck, my entire comment here seems off-topic.

1 Like

I feel exactly like you guys. I agree we are going off-topic, though; is there a better place for us to discuss the database/data retention policy of HA? I have a few ideas myself, having written my own home automation software from scratch.

Disk/storage space is cheap nowadays (even on SD cards), and I believe keeping more data never hurt anyone (and could provide a good insight or two). For example, I configure almost all analog values to be written to the database only when they change by a certain amount (and force a write every hour if they haven’t changed enough to be written). This catches all sudden changes beautifully, and wastes almost no disk space if the value stays stable for long periods of time. My second strategy is to use two databases in Derby: the normal one and the archive. The normal one keeps the last 3 months of data; the archive gets everything that is deleted from the normal one. While using the software it is really transparent, you don’t even know it is there; normal “recent days” queries are almost instantaneous, and if you ask specifically for data from the last 6 months it takes a few seconds to load (I have to query both databases), which is (in my opinion) expected by the user.

I believe the standard integrated database in HA could be much improved without asking much (anything?) of the normal user (recognizing that the specialized tools @denilsonsa and @CaptTom mentioned should implement all my suggestions and much more). A brainstorm on this would be great!

P.S.: I’m in the process of migrating to HA to stop reinventing the wheel, to have access to the integrations HA offers, and possibly to contribute development to the community, instead of developing for myself and a few friends.

P.P.S.: Since my software is written in Java, I selected Derby/JavaDB as the embedded database, and it has worked wonders for me. I have it running in several 15+ year-old installations, with 30GB databases and thousands of entities, never deleting anything. I also have a crude replication feature running with success.

3 Likes

As the OP, I’d say any discussion of improving the HA database architecture and functionality is not too far off topic from this request.

I’d also agree that a higher-level discussion elsewhere about the database would be welcome. I’m not really sure how such a discussion would even get started.

The feature I requested wouldn’t fix all the database woes of HA. But it would offer large returns for very little effort, without impacting the overall architecture. I think that could be implemented a lot sooner than a major re-write of Recorder, although I’d certainly support that, too.

3 Likes

I would propose approaching this from the other side: add a keep_days option to the recorder.purge_entities service (as is already done for the recorder.purge service). It would allow users to control the amount of stored data for each device/entity simply through their automations.
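
For illustration, an automation using the proposed option might look like this (keep_days on recorder.purge_entities is the suggestion here, not an existing parameter):

automation:
  - alias: "Trim motion sensor history nightly"
    trigger:
      - platform: time
        at: "03:00:00"
    action:
      - service: recorder.purge_entities
        target:
          entity_id: binary_sensor.hallway_motion   # placeholder entity
        data:
          keep_days: 1                              # proposed option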

9 Likes

I really like this idea. It sounds like it’s simple to implement and won’t interfere with the default recorder functions.
In fact, this was the first spot where I looked for a feature like this.
Currently I don’t really want to dig too deep into something specialized like InfluxDB; I just want to be able to compare a few values to values from last month / last year. So I set purge_keep_days to 400, which is of course totally overkill for most events.
If there were some parameter like @rogovst suggested, I could purge some gigs of junk using a simple automation.

4 Likes

I agree, purge_keep_days is really needed for individual entities. If implementing that is a painful process, please for now add a purge_keep_days option to the recorder.purge_entities service, as mentioned by @rogovst, so we can improvise purging with automations.

1 Like

Yes please! This would be a great addition…

This is a good idea, but I think it could be hard to implement. Something that might be easier is the ability to set a different retention period for short-term statistics (currently they are purged together with “states”).
It would be nice if we could set something like 14 days of “states” retention, a few months (or even a year) of “5-minute” statistics, and of course keep the one-hour statistics indefinitely.
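
Something along these lines (purge_keep_days is the only key below that exists today; the statistics keys are made up for illustration):

recorder:
  purge_keep_days: 14                  # raw states and events
  statistics_5min_keep_days: 365       # hypothetical: short-term statistics
  statistics_hourly_keep_days: 0       # hypothetical: 0 = keep forever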

This should be implemented in one form or another. A 32 GB card isn’t enough to last a year.

I would also like to be able to do this. Some things I only need for a few days, some for a month, and some permanently. Some could even live just in memory, since it doesn’t matter if they are lost on a reboot, though it would be nice to keep them long-term if no reboot happens…

It would also be good to have statistics optimization based on age, going from a few-second resolution to one-hour averages (or similar) after a set period of time. For example, temperature/energy monitoring that is currently used for live control of devices needs high resolution on the day itself, but that high-resolution data does not need to be kept in the database.

It’s also a shame we cannot use multiple databases, or else I would be offloading all my long-term data to another server.

I am currently using MariaDB (though the latest versions after 2022.5.5 keep breaking it with failed migrations) and InfluxDB.

1 Like

If there is a workaround that is generally consistent with recorder:, it is probably an automation that calls a group of recorder.purge_entities services, where each call defines a domain, an entity_glob, or an entity_id, and where you would set up different trigger intervals for certain domains/globs/entities as required (a sketch follows).
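
For example, something like this with today’s services (entity names and globs are placeholders; note that without a keep_days parameter each call wipes all history for the matched entities, so the trigger interval effectively becomes the retention period):

automation:
  - alias: "Purge chatty media_player history daily"
    trigger:
      - platform: time
        at: "04:00:00"
    action:
      - service: recorder.purge_entities
        data:
          domains:
            - media_player
  - alias: "Purge weather sensors weekly"
    trigger:
      - platform: time
        at: "04:15:00"
    condition:
      - condition: time
        weekday:
          - sun
    action:
      - service: recorder.purge_entities
        data:
          entity_globs:
            - sensor.weather_*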

So why not just extend recorder with another filter class (i.e. include, exclude, and purge)? Yes, the purge section would need some mechanism for intervals, but couldn’t that be as simple as a -XX for days? With that, you could leave the existing global purge default in place, and this addition would be backward compatible with the data in the system; it would just be a bit heavy on the first execution.

But yea, it seems very natural to just have a structure like:

purge:
  domains:
    - media_player -10
  entity_globs:
    - sensor.weather_* -30

Yes, there is the potential to see an entity duplicated (probably always the case so long as groups of entities can be created), but domains/globs already enumerate entities, so why not just deduplicate and process using the least conservative retention value of the duplicates?

Anyway, this seems much more natural to me: since it is already the structure we know/use for recorder, it would permit copy/paste with just the addition of -XX per entry, it isn’t a breaking change, it can be applied retroactively, and it is probably not too much of a stretch to repurpose the existing parsing code for recorder:

1 Like

Good ideas; they would certainly work. At this point any improvement would be welcome, so I’d be open to other suggestions as well.

This FR has now (just barely) made it into the top 30 by vote count. I suspect we’d have a lot more votes if people really understood how Recorder works, and the impact purge and exclude settings can have.

I’m encouraged by all the improvements we’ve been seeing in Recorder over the past two releases. Perhaps addressing some of the ideas suggested here will be next?

My assumption is that most people fall either into the ‘big data’ world, with an external server on real hardware where a 10GB SQL database is nothing, OR into the ‘purge everything every week or day’ camp (energy data is retained anyway, so no need, etc.). But a huge amount of functionality is lost without a controlled purge by sensor or type, which would let people keep big-data concepts without the cost/complexity of an external database (which itself needs maintenance anyway; with larger datasets you benefit from indexing specific fields, not over-indexing, and from query analysis - some queries that seem efficient really are not; I have seen nightmare query logic in $10M new installs of SAP, for example).

Can someone tell me: does repack work as a service call in some list of commands, where maybe weekly I execute purge_entities for 10-20 entities I don’t need to persist (but benefit from at least some evaluation now and then), and then call a purge for, say, 10,000 days with the repack option? I just haven’t seen any database size reduction using that route, even where I included a very talkative sensor that reports basically every second and grew the database by 100 MB/day… purge_entities on that removed it from history, but repack did nothing to recover space (on MariaDB)…

Best to all-
Rob

I can only say it works for me, on SQLite. I haven’t gone this route, but I see no reason you couldn’t set up an automation to purge specific entities on a regular schedule. Or even multiple automations on differing schedules.
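
As a rough sketch of such an automation (entity names are placeholders; keep_days and repack are existing options of the recorder.purge service):

automation:
  - alias: "Weekly recorder cleanup"
    trigger:
      - platform: time
        at: "03:30:00"
    condition:
      - condition: time
        weekday:
          - mon
    action:
      - service: recorder.purge_entities
        target:
          entity_id:
            - sensor.chatty_sensor_1    # placeholder
            - sensor.chatty_sensor_2    # placeholder
      - service: recorder.purge
        data:
          keep_days: 10000              # effectively "don't delete by age"
          repack: true                  # reclaim disk space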

That said, I think the vision for HA is that it should work for a relatively novice user, out of the box. Doing some database analysis and setting up automations probably aren’t among the first things they should be forced to learn.

As it is, saving every state change and event to an SQLite database on (as recommended) an SD card in an RPi is setting them up for failure. The default purge time, 10 days, is way too long for some entities and nowhere near enough for others. It seems to me that this decision should be made when the entity is created.

Which is all this FR asks.