WTH can’t I easily set an entity to be saved in long term statistics?

MiguelAngelLV · November 30, 2024, 9:19pm

Users should be able to easily choose whether or not to save the data of each entity.

For now, the only way is to use customization and change the state class. This should be simpler and more intuitive, and it should be changeable from the entity configuration, like the name and icon.

ArveVM · December 1, 2024, 8:55am

From entities means messing with each integration?

I’ve proposed a coordinated override dashboard to both add and remove LTS.

(Edit: my proposal to both add and remove was deemed a duplicate, so I removed the link.)

MiguelAngelLV · December 1, 2024, 9:08am

My proposal is to be able to change whether or not LTS is saved from the dashboard, like other data (name, icon, id…).

The entity settings should have a switch to save or not save LTS.

Not by integration, by entity. I may want to save the temperature of my bedroom but not the temperature of my fridge.

ofirm · December 1, 2024, 12:16pm

Big +1. I couldn’t wrap my head around it. Today it is implicit and derived, but it should be explicit and easy to manage (both the defaults and specific overrides).

terba · December 1, 2024, 1:11pm

I have hundreds of lines with state_class: None in my customize.yaml. Basically I disable LTS for all my entities. It’s a joy to maintain. Please make LTS configurable (to turn off completely for example). I need all the data, even Recorder autopurge is turned off. I purge what is not needed for longer periods via automation.
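For anyone maintaining a list like this by hand, here is a minimal sketch of how the entries could be generated instead. The entity IDs are made up, the helper name `customize_entries` is hypothetical, and the `state_class: None` value is simply copied from the post above rather than checked against what Home Assistant accepts.

```python
def customize_entries(entity_ids, state_class="None"):
    """Return customize.yaml-style text overriding state_class per entity.

    Assumption: each entity gets a two-line block, as described in the post.
    """
    lines = []
    # Sort and de-duplicate so the generated file is stable between runs.
    for entity_id in sorted(set(entity_ids)):
        lines.append(f"{entity_id}:")
        lines.append(f"  state_class: {state_class}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical entity IDs, for illustration only.
    print(customize_entries(["sensor.fridge_temperature", "sensor.bedroom_humidity"]))
```

The output would still have to be pasted into customize.yaml and kept in sync manually, so this only reduces the typing, not the maintenance burden.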

balloob · December 1, 2024, 1:17pm

If an entity does not have a state class but should, it’s a bug and should be fixed. That way we don’t have to ask all users to work around the bug.

Can anyone explain why you wouldn’t want things to be stored in LTS? It takes up minimal space (24 data points per day)

petro · December 1, 2024, 1:26pm

There have been reports from users that removing LTS dropped their database sizes by upwards of 70%. These are people who have recorder trimmed to only a subset of entities and have no need for long term statistics. They would like the ability to turn off long term stats. There are multiple threads on the forums trying to find out ways to turn this off. Most use customize.yaml to set the state_class to none to turn off LTS. Some people give up on that route because it can’t be edited from the UI.

The example above (70% drop) was a guy who had a 1 GB database with roughly 4 years of LTS that was removed, dropping his DB to roughly 250 MB. There are others with a smaller percentage drop. The point is, old databases could benefit from having the ability to disable LTS, similar to recorder settings. FYI he had to manually delete his LTS information because it’s not automatically removed.

terba · December 1, 2024, 3:00pm

Can anyone explain why you wouldn’t want things to be stored in LTS? It takes up minimal space (24 data points per day)

Here is my list:

I store most of the data forever in the recorder database. My DB is around 8 years old and is only 13 GB in PostgreSQL, but it was once 30 GB, so I made some automations to purge what is not badly needed after a year or a month. Why would I “double” that in LTS?
The more-info charts are annoying if based on statistics. I don’t need min/max/mean, I hate that the current value is not shown in the graph, sometimes the graphs are chaos, sometimes there is no graph displayed at all etc. I need the good old boxy line.
Data is the most valuable in my system. I don’t want to compress it with loss. I need it in detail. I use the data to check my ideas, make decisions, optimize things in the house and so on.
From a comment by me on a GitHub issue about LTS: I don’t get the concept either. You spend hundreds or thousands of dollars on home automation devices and the core component of the system runs on a $35 computer with SD card storage? And Home Assistant engineers degrade their fabulous software to support this scenario and make life harder for those who use this software as it should be used (low-power but powerful HW, redundancy, UPS). If you have a lot of data where long-term statistics should help, then you have a lot of devices for a lot of money, most probably the system is important to you, so you won’t be running Home Assistant on a Raspberry Pi and an SD card, ergo you don’t need long-term statistics as the hardware is able to handle the data easily. If you have two light bulbs and a LED strip using a rPi, then you don’t need LTS either as there is not much data.

MiguelAngelLV · December 1, 2024, 4:31pm

Input helpers don’t have a state class, and that is not a bug. Most people don’t need to save the helper data, but other people do.

If I create a template sensor from a physical sensor, I don’t need to save the physical sensor, only the template.

The best part of Home Assistant is the control over your own data. Why are the developers the ones who decide what data must be saved forever and what not?

ArveVM · December 1, 2024, 9:15pm

Eventually it will become an issue for everyone? As a concept, why is there an opportunity to exempt entities from recording or to do extra purging of states, and not the same for LTS?
24 data points per day quickly add up to millions of records when someone brilliant develops a system where I can add thousands of entities and some/most of them create LTS.
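To put rough numbers on that, here is a small back-of-the-envelope sketch. It assumes exactly one LTS row per entity per hour (the "24 data points per day" figure quoted in this thread) and ignores everything else about how the rows are stored; the function name is made up for illustration.

```python
HOURS_PER_DAY = 24  # one LTS row per entity per hour, per the thread


def lts_rows_per_year(entity_count, days=365):
    """Estimate how many LTS rows a number of entities adds in a year."""
    return entity_count * HOURS_PER_DAY * days


if __name__ == "__main__":
    for count in (1, 100, 1000):
        print(count, "entities:", lts_rows_per_year(count), "rows per year")
```

Under that assumption a single entity adds 8,760 rows a year, so a thousand entities with a state class add about 8.76 million.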

See my stats: +5 million rows this year in LTS, 4.6 GB SQLite, +5 million next year, etc.

(And I know the recorder purge is useful for removing noisy entities, I have used it myself.)

So for the smallest of installations it can be useful to skip LTS to save on storage,
and for the “larger” ones like mine an overview of what is creating LTS, with the option to de-select some or many entities, would be useful. I’d like something between state class and LTS, for example a way in the list of entities to override the default that comes when adding an entity with a relevant state class.

ofirm · December 3, 2024, 10:23am

I really don’t get it.
All the examples of why LTS is powerful are “if we keep all statistics forever, after two decades it will consume 100GB”.
Who cares? A 500GB SSD is less than $50. What problem are we trying to solve here? Why not store all data without any aggregation forever? Or if useful, keep raw data alongside daily aggregations if it accelerates reports spanning many weeks or months. Why are we suggesting adding many more options to endlessly micromanage every 50KB of disk space?
If there is a performance issue (fetching data), there are many database-level solutions (like proper indexing or transparent partitioning by year or year-month), so let’s improve it for everyone.
If there is a fear that HA will run out of space, add a default alert when space is less than 10%… Maybe add a parameter limiting the max size of recorded data (ex: 20GB or 200GB etc) and purge historical data (one month at a time) when it is reached (add an alert to catch it way before, so people could increase it, or selectively purge data).
If the default database is problematic (ex: unstable) when data grows beyond a few GBs (not implying it is), then have a simple way to upgrade it to something better, including all the automations needed to keep the alternative running well.
ANYWAY - just having a default of not storing telemetry for more than 10 days is absurd. At least raise the default to 100 days so everyone could visualize and compare the last several months of data without having to learn all about HA internals, and make it easy to control.
Just my two cents, sorry for the rant.

ArveVM · December 3, 2024, 11:45am

Rant away, I appreciate that you came up with one good suggestion. Why don’t you add this as a new WTH? :)

“If there is a fear that HA will run out of space, add a default alert when space is less than 10%”

(In other posts the DB guy stated that it is a requirement to have 125% of the DB size available for DB actions, so perhaps that should be the trigger for the warning.)
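The two triggers mentioned here could be checked together along these lines. This is only a sketch: the function names and reason labels are invented, and "125% of the DB size available" is read as "free space should be at least 1.25 times the database file size", which is my interpretation of the post and not a documented Home Assistant rule.

```python
import os
import shutil


def low_space_reasons(free_bytes, total_bytes, db_bytes,
                      min_free_fraction=0.10, db_headroom=1.25):
    """Return the list of triggers (possibly empty) that would warrant a warning."""
    reasons = []
    # Trigger 1: less than 10% of the disk is free.
    if free_bytes < total_bytes * min_free_fraction:
        reasons.append("low_free_fraction")
    # Trigger 2 (assumed reading): free space is below 125% of the DB size.
    if free_bytes < db_bytes * db_headroom:
        reasons.append("db_headroom")
    return reasons


def check_paths(disk_path, db_path):
    """Apply the check to a real disk and database file (paths are up to the caller)."""
    usage = shutil.disk_usage(disk_path)
    return low_space_reasons(usage.free, usage.total, os.path.getsize(db_path))
```

With a large database the second trigger fires long before the first: on a 100 GB disk with a 45 GB database, 50 GB free is fine by the 10% rule but already below the 125% headroom.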

ArveVM · December 3, 2024, 11:53am

Rant aside… on the WHY, here is my opinion:
HA is a state-engine, so it is not necessary to save any history to view states and run automations. The Recorder integration, which saves states etc., is optional, so there are use cases for no history at all.
Not all use cases are big installs, and I appreciate that the devs are keeping things so that small installs are an option.

By default HA is installed with Recorder set up, saving “raw data” as states/state attributes for 10 days (purged after 10 days by default, but easily configurable).

In addition the Recorder-int creates Statistics, which are “aggregations”:

Short term statistics (STS) are “5-minute aggregates of data from the states table”
Long term statistics (LTS) are “Hourly aggregates of the data from the statistics_short_term table” (source: Long- and short-term statistics | Home Assistant Data Science Portal)
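As a rough illustration of that two-stage idea (and not of how the Recorder actually computes its values), the sketch below buckets raw samples into 5-minute min/max/mean rows and then rolls those rows up into hourly ones. The function names, the plain unweighted mean, and the dict layout are all simplifying assumptions.

```python
from collections import defaultdict
from statistics import fmean


def aggregate(samples, bucket_seconds=300):
    """Group (timestamp_seconds, value) pairs into fixed buckets of min/max/mean."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Key each sample by the start of the bucket it falls into.
        buckets[ts - ts % bucket_seconds].append(value)
    return {
        start: {"min": min(vals), "max": max(vals), "mean": fmean(vals)}
        for start, vals in sorted(buckets.items())
    }


def rollup(short_term, bucket_seconds=3600):
    """Combine short-term rows into coarser rows (unweighted mean of means)."""
    buckets = defaultdict(list)
    for start, row in short_term.items():
        buckets[start - start % bucket_seconds].append(row)
    return {
        start: {
            "min": min(r["min"] for r in rows),
            "max": max(r["max"] for r in rows),
            "mean": fmean([r["mean"] for r in rows]),
        }
        for start, rows in sorted(buckets.items())
    }
```

Calling `rollup(aggregate(samples))` on a day of readings leaves at most 24 rows per entity, which is where the detail that some posters want to keep gets lost.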

The Recorder integration is heavily configurable, so for states and STS you can set how many days/years to store, include/exclude per type or per entity, and you can purge selected entities at different intervals etc.
But LTS is hardly configurable. It is decided by the state_class on the entity, so if an integration has a defined state_class for an entity you have to change it manually per entity (or in customize.yaml). That is way off the level of configurability you have for states/STS, and it is why the OP created the WTH (I guess).

I started with HA three years ago, and have seen lots of good improvements to Recorder in that time, so I guess the devs can thank themselves when we ask for some of the functionality for states/STS to be applied to LTS as well.

PS: have a look at my graph in the other post. The number of states has been reduced significantly in two rounds of Recorder configuration where I removed noisy entities and excluded some other trivial stuff, so the option to tune what is stored is greatly appreciated. And the red line is on a steady growth, worthy of this WTH.

terba · December 3, 2024, 12:21pm

HA is a state-engine

That is the main problem here. That’s why I’m suggesting a redesign and rewrite of this project. Nearly all of us are using HA as home automation software, not a state-engine. In my opinion all the main and never solved decade-long bugs are coming from this statement, from this design decision.

The first and most serious problem is the restarting of HA. If it doesn’t store data by itself it cannot restart correctly. Just see the 2nd most voted WTH here.

petro · December 3, 2024, 12:53pm

Every point you’ve made throughout these WTHs are design decisions, not bugs. They may be seen as bugs in your eyes, however they were actively decided on by the main development team. Restoring states is not held back because HA uses a state machine. There’s a restore state already built into entities that could easily be expanded on, and having a persistent last_change can easily be managed. A PR was made in the past and the owner of HA shot it down. I’m not saying I agree with these policies, however they are not bugs and more importantly the state machine is not the cause of the limitation. These calls for redesigns are completely unfounded. There are multiple systems in HA, and the state machine is a small part of it. I urge you to look over the code base before you cry for a full redesign of a system.

For others seeing this comment, this is a reply to terba only. My comments are unrelated to this topic. I will likely break this topic in the future into a separate thread as it is off-topic.

terba · December 3, 2024, 1:25pm

You said these are design decisions. I said these are design decisions which create bugs in the user’s view. I said I recommend a redesign to change these design decisions to solve these problems, which are bugs in the user’s view. You said I’m crying for a redesign.

If it’s not a bug that the software shows totally wrong data to the user (window was closed when HA restarted), then you are right. If this is a bug, then I’m right.

I’ll stop here.

petro · December 3, 2024, 1:27pm

My point is not about them being bugs, it’s about you claiming in multiple threads that HA needs to be rewritten to address these bugs.

Bugs or not, HA does not need a full rewrite to address any concerns you’ve listed in all these WTHs you’ve posted on.

ChrisWarwick · December 4, 2024, 6:47pm

I would really like much more granular control over what goes into the database (LTS or otherwise) - and if info is being added to the database, what its (individual) retention policy should be.

The current method of include/exclude is too much of a hammer without going into silly levels of detail - with entries for each and every state. It would be much better imho to be able to look at an entity and set its retention attributes as required. Should it be recorded or not? How frequently? How long should the information be retained?

Gonioul · December 15, 2024, 10:50pm

My Z2M devices’ batteries have long term statistics, but not my Z-Wave JS ones?