Disable long-term statistics?!

I’m not using this feature and it’s just increasing the DB size.

Is it possible to disable it?!

thanks

1 Like

You can’t, and LTS uses very little space.

It depends on how long you’ve had your instance. In my case, LTS is 3 times larger than my 32-day “stats” table (measured in kilobytes). The oldest items in LTS are from October last year. So in 11 months, my LTS has grown to almost 3 times the size of the stats table. Just imagine what will happen in a couple of years.
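
For anyone who wants to make the same comparison, here is a rough sketch against the default SQLite database (table names assume the current recorder schema):

-- Rough per-table row counts in home-assistant_v2.db:
-- states = recorder history, statistics = LTS,
-- statistics_short_term = the rolling short-term statistics.
SELECT 'states' AS tbl, COUNT(*) AS row_count FROM states
UNION ALL
SELECT 'statistics', COUNT(*) FROM statistics
UNION ALL
SELECT 'statistics_short_term', COUNT(*) FROM statistics_short_term;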

There should definitely be a way to disable LTS on a per-entity basis. Also, there should be a per-entity setting for how long to keep the LTS: in some cases 3 months is fine, in some cases 1 year, and in some cases you don’t need it at all. It is definitely not necessary to keep 10-year-old data there, yet the current default, which cannot be changed, is to keep everything forever.

Excluding an entity entirely from the recorder stops it from ending up in LTS, but in most cases I have, you WANT it in the recorder with a couple of days of history, just not in LTS.

For reference, my LTS table contains several million rows.

3 Likes

Remove the state_class (or set it to none) for the sensors you don’t want to collect LTS for.
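
For example via customize in configuration.yaml. The entity name is a placeholder, and this assumes customize can override state_class, which is the commonly suggested workaround (existing LTS rows remain until you delete them):

homeassistant:
  customize:
    sensor.example_power:
      # assumption: overriding state_class to none stops new LTS rows
      state_class: none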

What do I set it to if I want 1 year, but no more? That’s right, not possible.

Setting the state class to none is simply a workaround for the shortcomings of the statistics configuration/implementation. There should be a way to set retention policies per entity, just as there should be for the recorder. Sometimes you want something for 24 hours, sometimes 1 month, sometimes 1 year. Right now it is binary: either not at all or indefinitely (or purge_keep_days for the recorder). There are workarounds in some cases, but they are inconvenient, inconsistent, and unintuitive.
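
For comparison, this is roughly everything the recorder lets you configure today; purge_keep_days and exclude are real recorder options, while the entity name is a placeholder:

recorder:
  purge_keep_days: 7          # one global retention period for all short-term history
  exclude:
    entities:
      - sensor.example_noisy  # all-or-nothing: no recorder history AND no LTS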

4 Likes

How big is your database?

I don’t remember exactly. I installed the SQLite Web add-on and ran quite a few cleanup queries. I do remember, however, that after decimating the database size (by differentiating how long to keep the data in the states and statistics tables respectively), everything in HA ran much smoother, I saved some disk space, backups sped up, etc. Having those settings in configuration.yaml would have done the job, but due to the shortcomings mentioned above I am now looking into running some custom queries on some kind of schedule.

My point was that LTS does take up considerable space, and there is no way to reduce it other than removing it completely. So it is either indefinitely or nothing.

3 Likes

No, it really doesn’t.

Oh, it really does. After less than one year it takes up 3x of everything else combined. And as there is no cap, it will take up 6x everything else combined after two years, and so on. Other tables like the states table are regularly purged and will not grow indefinitely. But this one will.

And cleaning a few million rows of data out of the database made my instance a lot quicker, with a few other benefits too, as described earlier. You cannot say no to that. You might have a feeling that it does not or should not affect you or other people, but it did affect me.

You seem to have a strong belief that it should not be configurable, yet others besides me are asking for this too. What is your argument for not having it configurable? (“It is not necessary” is not an argument, as I just explained that it is. Perhaps it is not necessary for you, but it may be for others.)

And a solution would be something like DELETE FROM statistics WHERE created < 'some date' AND metadata_id = (the id of the sensor in question), run once per week or so. It is not difficult to implement. If I were more familiar with Python I would do this properly myself and submit a pull request, but for now I will try to find a way to schedule my SQL statements.
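
Spelled out against the default SQLite schema, that could look roughly like this. A sketch only: the sensor name is a placeholder, newer schema versions use the epoch-float columns created_ts/start_ts instead of created, and you should back up the database and ideally stop HA first to avoid locking a live file:

-- Hypothetical per-entity LTS retention: delete statistics rows older
-- than one year for a single sensor.
DELETE FROM statistics
WHERE created < datetime('now', '-1 year')
  AND metadata_id = (
        SELECT id FROM statistics_meta
        WHERE statistic_id = 'sensor.example_power'
      );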

4 Likes

You still have not said how big your database is. Three times nothing is still nothing.

I’m not against it being configurable, I just do not see an urgent need.

This is my database size since the last efficiency improvements (which resulted in a reduction from 2.25 GB to 1.25 GB):

Not exactly skyrocketing in size, especially when you consider I have been adding new entities over that time.

Anyhow, judging by your metric name, you are not using the default SQLite database but have migrated to a better-performing “real” database, which typically solves a lot of the performance issues SQLite has. Perhaps that would have solved my performance issues too, making this less urgent. MariaDB is much better at handling bigger datasets. Perhaps migrating makes sense for me too, if my Raspberry Pi 4 has enough memory for it alongside Home Assistant. Buying new hardware to solve a software issue isn’t an option for me at the moment :slight_smile:
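
For reference, pointing the recorder at MariaDB is a one-line change once the database and user exist (host and credentials below are placeholders):

recorder:
  db_url: mysql://hass_user:password@127.0.0.1:3306/homeassistant?charset=utf8mb4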

Create a sensor to keep track of it.

sensor:
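  # note: the monitored file's folder must be listed under allowlist_external_dirs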
  - platform: filesize
    file_paths:
      - /config/home-assistant_v2.db

I did, but only after the optimization. I should have done it before, of course. In fact, not remembering the size is what prompted me to set up such a sensor in the first place. I guess I might have an old backup lying around that I could restore to check the db file size, but as I don’t think it matters I will not bother. Whether it is 1 GB or 20 GB, there should still be a configurable per-entity retention period.

This topic should be rewritten and moved to the feature request section.

In summary: apart from firing DELETE FROM statements at the recorder database, there currently is no option.

Collecting everything first and deleting it afterwards is not very smart. Having a config option up front (or at least an easy-to-use cleanup option like a recorder service) certainly is.
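
A sketch of what scheduling that cleanup could look like today, assuming the default SQLite database and that the sqlite3 command-line tool is available where shell commands run (file names and the alias are placeholders):

# /config/purge_lts.sql would contain the DELETE statement quoted earlier.
# Beware: writing to a live SQLite database can contend with the recorder.
shell_command:
  purge_lts: "sqlite3 /config/home-assistant_v2.db '.read /config/purge_lts.sql'"

automation:
  - alias: "Weekly LTS cleanup"
    trigger:
      - platform: time
        at: "03:00:00"
    condition:
      - condition: time
        weekday: sun
    action:
      - service: shell_command.purge_lts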

Here is a summary also containing a link to a good how-to for removing LTS from the database:

2 Likes

Just stumbled on this topic.

While 1-2 GB for a DB doesn’t seem like much, it always depends on the kind of hardware it runs on. HA is more often than not run on something like a Raspberry Pi and rather rarely on supercomputers or Core i9 servers :wink: Thus, computational resources are limited.
This leads to another issue with growing databases: backups run into timeout issues. My backup is currently 1.5 GB in size (incl. the compressed DB backup). Since the backup takes a couple of hours, there appears to be an internal timeout (which cannot be configured either), and this leads to unnecessary “Backup Failed” notifications even though the backup still completes successfully. The backup takes so long because there are tons of entities whose long-term history I really don’t care about (just the short-term). However, there are other entities where I do.

So I agree with @erik3 that having something configurable for long-term statistics/history would be nice :slight_smile:

2 Likes