Improved handling of repeated sensor values (which are presently ignored)

mekaneck · November 7, 2023, 4:53pm

Background:

Home Assistant currently ignores sensor updates when the reported value is the same as the previous value. This saves space in the database, since there is presumably no value in storing redundant data. Some integrations (e.g. MQTT) have a force_update option which overrides this behavior, normally with the stated goal of having “meaningful graphs”.

Problem:

Many issues are caused by the choice to ignore repeated values:

Sensors that look at a defined history window or max age or sample size (statistics, history stats, filter) will have incorrect values calculated and/or can change to unknown when their source sensors are reporting normally
Derivative sensors can never result in a value of zero
Riemann sum integrals will yield incorrect results when trapezoidal or right is chosen for method
State triggers (whether in automations or trigger-based template sensors) will not trigger when a new but duplicate value is read. Often triggering is not desired, but if it is, there is no workaround.
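
To make the Riemann sum item in the list above concrete, here is a minimal sketch in plain Python. It is not Home Assistant code: `drop_repeats`, `trapezoidal`, and the power readings are hypothetical, and `drop_repeats` only mimics the "ignore a repeated value" behavior described in the background.

```python
def drop_repeats(samples):
    """Keep a sample only when its value differs from the last kept value."""
    kept = []
    for t, value in samples:
        if not kept or kept[-1][1] != value:
            kept.append((t, value))
    return kept


def trapezoidal(samples):
    """Trapezoidal Riemann sum over (time, value) pairs."""
    total = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        total += (v0 + v1) / 2 * (t1 - t0)
    return total


# Hypothetical power readings as (hours, kW); the 10 kW reading repeats at t=2.
power = [(0, 0.0), (1, 10.0), (2, 10.0), (3, 0.0)]

energy_all = trapezoidal(power)                    # every report is used
energy_deduped = trapezoidal(drop_repeats(power))  # the repeat at t=2 is ignored
print(energy_all, energy_deduped)  # 20.0 15.0
```

With the repeat dropped, the trapezoid between t=1 and t=3 slopes straight from 10 kW down to 0 kW, so the hour spent at a constant 10 kW is undercounted.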

Problem Examples

Example 1: A derivative sensor whose source is a temperature sensor. The last three temperature readings are 5, 10, and 10 °C, each reported one minute apart. The user would expect a derivative of zero, since there was no change between the last two readings. The current implementation results in a derivative of 5 °C/minute because the last reading is ignored.
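
The arithmetic of Example 1 can be sketched in a few lines of plain Python. This is an illustration only, not the derivative integration's code; `drop_repeats` and `latest_derivative` are hypothetical helpers, and the timestamps are made up.

```python
from datetime import datetime, timedelta


def drop_repeats(samples):
    """Keep a sample only when its value differs from the last kept value."""
    kept = []
    for ts, value in samples:
        if not kept or kept[-1][1] != value:
            kept.append((ts, value))
    return kept


def latest_derivative(samples):
    """Rate of change per minute between the two most recent samples."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    minutes = (t1 - t0).total_seconds() / 60
    return (v1 - v0) / minutes


start = datetime(2023, 11, 7, 12, 0)
readings = [(start + timedelta(minutes=i), v) for i, v in enumerate([5, 10, 10])]

expected = latest_derivative(readings)                # uses all three readings
observed = latest_derivative(drop_repeats(readings))  # third reading ignored
print(expected, observed)  # 0.0 5.0
```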

Example 2: An average sensor which averages a water meter’s reading over the past two hours. Assume no water has been used for several hours, and therefore the water meter sensor reports the same meter reading of 15,850 gallons every two minutes. The expected average would be 15,850 gallons. The current implementation results in “unknown” because there is no data in the past 2 hours and therefore no data to calculate an average.
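
A small sketch of Example 2, again in plain Python with hypothetical names (`window_average` is not a Home Assistant function, and `None` stands in for the “unknown” state):

```python
from datetime import datetime, timedelta
from statistics import fmean


def window_average(samples, now, window):
    """Average the values stamped within (now - window, now]; None means 'unknown'."""
    recent = [v for ts, v in samples if now - window < ts <= now]
    return fmean(recent) if recent else None


now = datetime(2023, 11, 7, 18, 0)
window = timedelta(hours=2)

# What the meter reported: 15,850 gallons every two minutes for about six hours.
reported = [(now - timedelta(minutes=2 * i), 15850) for i in range(180)]
# What remains if repeats are ignored: only the oldest, first-seen reading.
stored = [reported[-1]]

avg_reported = window_average(reported, now, window)
avg_stored = window_average(stored, now, window)
print(avg_reported, avg_stored)  # 15850.0 None
```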

Example 3: A sensor which takes a rain gauge sensor’s reading to calculate total rainfall. (This could be done with a trigger-based template sensor, or with a utility_meter using the delta_values setting.) If the rain gauge reports two readings of 10 mm rainfall, the template sensor / utility_meter will calculate a total rainfall of 10 mm, instead of the correct value of 20 mm.

Proposed Solution

Rather than having each of the many affected integrations trying to address this problem in unique ways, my proposal is to have a single common way of dealing with this issue:

Add Expected update interval as an option to devices and entities.
Default Expected update interval to infinite so that this is not a breaking change
Implement it similarly to areas, where it can be assigned to a device or to individual entities.
Update the developer documentation so that integration developers know how to utilize this parameter (see explanation below).

Explanation:

This change would do nothing upon implementation, but integrations and visualizations would be able to read this parameter and utilize it. It would not affect when the entity is marked unavailable. This setting would only be used as an input to other integrations which consume the data of the entity in question. The logic for graphs and integrations would be: If there is a gap in data points larger than expected_update_interval then assume a repeated data point exists every expected_update_interval. Sensors which refer to other source entities would be updated upon every update of that source sensor (just as they are today) and also every time the expected_update_interval passes without a source update. Since this value defaults to infinite, it makes no difference to the current handling of updates.
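
One way a consumer could apply the gap rule described above is sketched below. This is an assumption-laden illustration, not an existing Home Assistant API: `fill_gaps` and its `until` parameter are hypothetical, and `None` is used to represent the infinite default.

```python
from datetime import datetime, timedelta


def fill_gaps(samples, expected_update_interval, until):
    """Insert assumed repeats wherever a gap exceeds the expected interval.

    samples is a time-ordered list of (timestamp, value) pairs; until is the
    moment up to which trailing repeats are assumed (typically "now").
    An interval of None stands for the infinite default: nothing is added.
    """
    if expected_update_interval is None:
        return list(samples)
    filled = []
    for i, (ts, value) in enumerate(samples):
        filled.append((ts, value))
        next_ts = samples[i + 1][0] if i + 1 < len(samples) else until
        assumed = ts + expected_update_interval
        while assumed < next_ts:
            filled.append((assumed, value))  # assumed repeat, never stored
            assumed += expected_update_interval
    return filled


start = datetime(2023, 11, 7, 12, 0)
minute = timedelta(minutes=1)
stored = [(start, 5), (start + minute, 10)]  # the repeated 10 was never stored

filled = fill_gaps(stored, minute, until=start + 3 * minute)
print([v for _, v in filled])  # [5, 10, 10]
```

Because the repeats are computed on read, nothing extra is written to the database, which matches the benefit claimed for this proposal.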

Before/After Visualization, using ‘derivative’ as an example:

Caveats & Issues

Since timing is never perfect, a sensor which is supposed to push an update every 2 minutes may in fact push an update after 2 minutes and 1 second. There may need to be a hard-coded heuristic “extra delay” before the logic assumes the first data point is missing. Something like expected_update_interval + min(30 sec, max(expected_update_interval * 0.25, 1 sec)). My thinking is that if the sensor is supposed to push every 24 hours, then a 30 sec buffer seems reasonable. If the sensor is supposed to push every 1s, then a 1s buffer seems reasonable.
The “last updated” attribute of an entity should ideally reflect the logic using expected_update_interval; however, this would waste database storage space, since the value would be stored every time it changed. Suggest leaving “last_updated” unchanged.
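
The “extra delay” heuristic from the first caveat can be sketched as follows. It reads the heuristic as a buffer added on top of the interval, capped at 30 seconds and never below 1 second, which is the reading that matches the 24-hour and 1-second examples. The function names are hypothetical.

```python
from datetime import timedelta


def grace_period(expected_update_interval):
    """Extra delay: a quarter of the interval, clamped to the range 1 s .. 30 s."""
    quarter = expected_update_interval * 0.25
    return min(timedelta(seconds=30), max(quarter, timedelta(seconds=1)))


def missing_after(expected_update_interval):
    """Time without an update after which a repeated data point is assumed."""
    return expected_update_interval + grace_period(expected_update_interval)


print(grace_period(timedelta(hours=24)))    # 0:00:30
print(grace_period(timedelta(seconds=1)))   # 0:00:01
print(missing_after(timedelta(minutes=2)))  # 0:02:30
```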

Benefits

Does not increase the size of the database
Enables a common, logical way of handling this known issue for all integrations
Is not a breaking change

Related Issues and Requests that this would address:

boheme61 · November 7, 2023, 8:06pm

Or implement this behavior at our local Pub, so Bill doesn’t reflect every time we buy/drink 1 beer

jumper · November 8, 2023, 12:42am

Thank you for a very well thought out and documented problem, and proposed solution.

In short, sequential, but identical readings are often legitimate NEW readings, NOT to be ignored as HA does now.

Ignoring these data is clearly a problem in need of a solution. My poster child is a rain gauge that often reports the same (incremental) rainfall value in successive 10-minute intervals, e.g., 0.01, 0.02, 0.02, 0.01, 0.01

HA reports for this series, 0.01 + 0.02 + 0.01 = 0.04, instead of 0.01 + 0.02 + 0.02 + 0.01 + 0.01 = 0.07
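
The two totals above can be reproduced with a short sketch. It is plain Python, not Home Assistant code; `total_from_increments` is a hypothetical helper, and the readings are expressed in integer hundredths to keep the sums exact.

```python
def total_from_increments(increments, ignore_repeats):
    """Sum incremental readings, optionally skipping a reading equal to the previous one."""
    total = 0
    previous = None
    for reading in increments:
        if ignore_repeats and reading == previous:
            continue  # treated as "no new state", so it is never added
        total += reading
        previous = reading
    return total


# The series 0.01, 0.02, 0.02, 0.01, 0.01 expressed in hundredths.
increments = [1, 2, 2, 1, 1]

with_repeats_ignored = total_from_increments(increments, ignore_repeats=True)
with_all_readings = total_from_increments(increments, ignore_repeats=False)
print(with_repeats_ignored / 100, with_all_readings / 100)  # 0.04 0.07
```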

As a workaround, I use a template sensor that adds a random, not significant, positive or negative lambda value to each reading to (largely) make each reading unique and thereby bypass the problem. This forces every 10-minute reading into the database.

HA can be better. Let’s figure this one out!

boheme61 · November 8, 2023, 11:56am

True, “issues” like this have been rising and will continue to rise, due to the ever-increasing number of features, integrations, etc. The size of the DB/state/event machine is named as the “cause”. I have seen other issues as well (forecasts, attributes, etc.).
My initial thought, still “valid” in my opinion, is: let people choose, either as a mandatory step or as an option, which entities they want or need in their DB. I believe most people are “aware” of what’s important for them.
And if they aren’t, they also have to be aware that everything comes with a price.
If people want it all, they should take this into account when dimensioning their solution, or be prepared for a MEM/DISK/CPU expansion over time.
The current architecture is: record ALL states, state changes, and events by default. This leaves (some) people unaware of the consequences. For example, some deliberately or “accidentally” create a template, or set an entity to “update” every second instead of every minute or every 10 minutes (do the math). Some are not even aware of how much is actually stored/recorded, maybe because they have no use for this data.

So (to make a long story short): let people choose what they need or want to store/record, initially during setup of a device/entity/integration, or whenever. This makes people “aware” of what they are “dealing with”.
Storing/recording ALL by default makes people unaware of how things work, and of what the consequences of their behavior/actions will be.

PS: Yes, I have only “include” in my Recorder settings (so far), so no arguments here. I’m referring to the fact that many people have NO idea how much or what is actually stored/recorded, and most likely have only a “weak” idea of what a DB (plus recorder) is or does. It just sounds like a good/practical thing (until it causes problems).

jasonmp85 · November 8, 2023, 9:23pm

I didn’t even read this and voted for it, since any amount of thought put into this will be better than the current implementation.

mekaneck · November 10, 2023, 3:04am

I’ve been thinking about this some more, and I think a more intuitive solution is just to have an option for each entity to allow storage of repeated values.

The downside compared to my original suggestion is that the selected entity will take up extra database storage space when the option is enabled (in the original proposal there was no effect on the original sensor).

The benefit is that integration authors won’t have to do anything to their code. And it might be easier for users to understand what the option does.

Thoughts? And is it allowable to modify the FR after some people have voted on it?

My main goal for this FR is to get the people that control the direction of HA to agree there is a problem and to choose a strategic direction. I’m not asking anyone to implement anything. What I want is for maintainers to agree in principle on what content they’d like to see in a PR.

boheme61 · November 10, 2023, 4:02pm

Sounds like maybe an “easier” way to implement this in the current code, as they have already, for some reason, found the current “solution” to be an “effective” way to reduce DB usage/size (not that I know!). As for the slight increase in DB size and recorder usage this will cause: in my opinion it will still be an important “improvement” for people who definitely have issues with the current “model”. They will have to take it into account and choose, the same as they choose the retention/purge period of their DB or the update intervals of various sensors.

Anyhow, I hope your FR will get, or has already got, attention.

tom_l · November 11, 2023, 5:51am

It’s not just about reducing the database size. Home Assistant is a state machine. If the state does not change nothing happens.

What you are asking for would require rewriting the way the core of home assistant works.*

* I’m not a developer, but that is my layman’s understanding.

boheme61 · November 11, 2023, 8:31am

True, and I also have no idea how to "solve" or "get around" the "issue", nor do I know which state-engine model HA uses.
The same goes for how various integrations/sensors are built, using "push" or "pull". I've seen some "adding" e.g. 0.001 to "reflect a change" (i.e. alternating +0.001 and -0.001 to "level out" the result over time). Another possible workaround could be a "time sensor" where the state is the "time" (e.g. hour/min, or every 10 seconds) and an attribute is the "value", e.g. temperature.

I know, the last example sounds "weird" :) , and both examples are for the integrations to "solve", or for the user, e.g. through a template sensor.

I'm also not sure how e.g. InfluxDB handles the entity (recorder) inputs (it's been a while since I "tried" Influx). Maybe it "adds" a value identical to the previous one in the graphs when there is a gap in time (or maybe that was Grafana :) ).

However, combining timestamp and value makes a state unique, the same as adding/subtracting an insignificant value, and this is up to the developer/user of e.g. an integration or template sensor.

PS: Tom, is there an “issue” with the new forum? With the above text I kept getting a message regarding “code formatting”, and when I clicked fix or post-anyway, nothing happened, so I put it in “formatted mode”.
Ahh, now I think I see what/where: the part that is “greyed” out, maybe?

Seems like it “converts” the apostrophe in "i’ve" … (nope, that’s not what causes it)

mekaneck · November 13, 2023, 4:52am

I’m clinging to the hope that since some integrations have the force_update: option that it isn’t a complete tear-up to add that functionality to all entities. But you may be right.

Regardless, dev feedback is what I think is truly needed here. To be frank, I’d be happy with no code changes at all if we could just get the devs to agree this causes problems in certain scenarios and then to also agree on the preferred way for integrations to deal with it, and for that recommendation to be added to the developer documentation.

My end goal is for a common solution. Forcing that solution on each integration is ok as long as each integration handles it the same way.

ThisWayToo · November 13, 2023, 12:55pm

Thinking about it, I like the option setting to force the retention of all duplicate values, including all zero values. In certain use cases every zero value may be entirely valid. Rain gauges come to mind.

The various workarounds, whether forcing small non-zero values or adding attributes, add as much or more data to the database as simply allowing all the duplicate, repeating, and entirely valid zero values to be recorded.

The key downside of the timestamp attribute workaround is that it increases the latency between when the values are captured and when they are presented, perhaps by a factor of 2X. Depending on the nature of the data and the desired quickness of response, that could be an issue.

boheme61 · November 13, 2023, 1:59pm

Thing is, a state engine “fires” upon changes (no change, nothing to record), and an “event” which provides “no change/same value” doesn’t “trigger”, as I understand it. So there is nothing to “force”, unless you trigger/record “time + attr_value”.

It won’t tell you it’s raining if it’s not, the same as a “meter” won’t move if there is no “usage/speed/etc.”

mekaneck · November 13, 2023, 3:45pm

I think this is where some dev knowledge would be useful. In general, yes a state machine won’t do anything when the state doesn’t change. However, specifically for HA:

HA, as programmed today, takes an action to evaluate incoming sensor data when it comes in. It doesn’t know if the sensor data is different or not until it looks at it, so therefore HA must do something upon every sensor update. The state machine doesn’t change, but the program as a whole does take action.
Once HA determines that the sensor state is unchanged from the current state, no further action is taken. State machine is not updated and therefore no integrations that take action upon state changes are triggered.

However, one possible option is something like the following:

1. Sensor data comes in, and HA evaluates it (no different from what happens in the current code)
2. If sensor data state matches the current state, then HA changes the last_updated attribute for that entity to now()
3. Integrations that care about this could monitor for changes of the last_updated attribute instead of monitoring for state changes

If desired, step 2 could be taken only on entities that have some option selected to store repeated values.
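
The three steps above, including the per-entity opt-in, can be sketched with a toy state machine. This is purely illustrative and is not how Home Assistant's core is written: `ToyState`, `ToyStateMachine`, the `store_repeats` flag, and the `repeat_reported` event name are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ToyState:
    value: str
    last_changed: datetime
    last_updated: datetime


@dataclass
class ToyStateMachine:
    states: dict = field(default_factory=dict)
    listeners: list = field(default_factory=list)

    def report(self, entity_id, value, now, store_repeats=False):
        # Step 1: every incoming report is evaluated against the current state.
        current = self.states.get(entity_id)
        if current is None or current.value != value:
            self.states[entity_id] = ToyState(value, now, now)
            self._notify(entity_id, "state_changed")
        elif store_repeats:
            # Step 2: same value, so only last_updated moves forward.
            current.last_updated = now
            self._notify(entity_id, "repeat_reported")

    def _notify(self, entity_id, kind):
        for listener in self.listeners:
            listener(entity_id, kind, self.states[entity_id])


events = []
machine = ToyStateMachine()
# Step 3: a consumer that listens for repeats as well as changes.
machine.listeners.append(lambda entity_id, kind, state: events.append(kind))

t0 = datetime(2023, 11, 13, 15, 0)
machine.report("sensor.rain", "10", t0, store_repeats=True)
machine.report("sensor.rain", "10", t0 + timedelta(minutes=10), store_repeats=True)
rain = machine.states["sensor.rain"]
print(events)  # ['state_changed', 'repeat_reported']
```

With `store_repeats` left at `False`, the second report produces no event and `last_updated` stays at `t0`, which mirrors the behavior described for today's code.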

boheme61 · November 14, 2023, 1:57am

Sensor data comes in to/through the integration, or the integration requests it (e.g. every hour), evaluates it, and exposes it to HA. HA handles the “exposed” data.
With the same data/value/state there is no state change, no state trigger, and nothing to record. I doubt you can make the Recorder add or manipulate a state change: last_updated is the timestamp of when the “state value” changed, and “last_updated_ts” is not “recorded” on its own but applied to the “state change” when the state is changed by the sensor/integration/user.
Integrations (or a user-templated entity) monitoring “last_updated_ts”?

You just described “time (state) + attr_value”, didn’t you?

If I or an integration “get or pull” data every hour, I can store it with this timestamp as the state (timestamp + value == unique). So the integration or user has to “manipulate” the sensor value (state) in some way, or use now() as the “state” (for the state change to occur) and the sensor value as an attribute (or combine the two).

“Last updated” is a timestamp applied to a state change, so one way or the other the logic of the state machine would have to be “redesigned”, if I understand it right (on the other hand, I don’t know which type of state engine is in use here). I can’t see how to apply “last_updated” to something that has not changed and so has not triggered the state engine to store it.

Below you see a Wi-Fi signal level (note the state:id vs Old_state_id, and “Last Updated TS”).

Recorded only due to a state change (I’m not sure whether this is caused by the sensor/integration or HA; I believe it’s not “changed” by the state engine or Recorder).

This is the router CPU temperature.

Same update interval, and the state value changes (up and down) just for the sake of it. This is not caused by the DB, and is unlikely to be caused by the state engine or Recorder, unless the devs decided which values should be tampered with and by how much. I would say it’s the integration/device (I could be wrong); I know it’s not template sensors.

Below is a simple example of a state engine (a kind of Door-Open / Door-Closed example).

So a “force” is only possible with a “state” change; if no “transition” occurs, there is nothing to record.
For sensors like temperature, humidity, etc., you have a “state” and basically nothing of use in the attributes to define “unique/changed”. If the state value doesn’t change, the data provided has to be tampered with (if you use that as the state).

Above was Door-Open / Door-Closed as an example; below is a light. Note the Off state at the bottom and its timestamp: it had been off for a while (no state changes for days or many hours, and therefore no last_updated_ts) until I clicked on/off/on/off/on/off (apparently too fast, as it “missed” a state). But for days/hours: no state changes, no last_updated_ts.

So for you or an integration to have a “last_updated” to monitor, you have to use a “timestamp” as the “state” if the sensor value doesn’t change.
last_updated_ts == now() at the moment the state change occurred.

Anyway, maybe there is some “feature” in some state-engine models, or maybe in the Recorder (though I still think it’s up to the integration or device, or a user template if those two don’t deliver).
E.g. I have temperature sensors which deliver 6 digits of detail. No problems with default update intervals (as the value changes, and is “rounded”).
Hmmm, could it be that the sensors/integrations this FR is based upon are simply not “smart”, unlike e.g. the CPU temperature above, with 3 digits swinging up and down by half a degree?
So I guess it’s about “exposing” a change to HA and showing the “result” with fewer digits (rounded to the digits you want to see in your UI), or using a time-template sensor where the “state” changes every ms/min/hour or every 10 min and has the now() value as an attribute.
The same goes for when a state is “missing” or reported late: this has to be handled in the “view/graph/card”.
Btw.
If you have 15,850 gallons of water (supposedly reported every 2 minutes) and the integration/device or a connection glitch misses one state report, you won’t have an average of 15,850 gallons, and you can’t have a state engine report/store an “assumption” when no data is delivered. It will at best be a zero. However, I doubt you would know, because your average would probably be based upon 60 measurement points, unless you base it upon the actual number of “valid” timestamps recorded. A “count” template could solve that.
Likewise, a time template can record a state_attr every 2 min, regardless of whether it has the same attr_value.

The same goes for the "Custom:sensor:energy_data" sensor, which reports the price in the UI:

- hour: '2023-11-14T12:00:00+01:00'
  price: 1.063
- hour: '2023-11-14T13:00:00+01:00'
  price: 1.063

So there are ways “around”

mekaneck · March 21, 2024, 12:27pm

There is a change coming which will address this FR:

mpx · April 17, 2024, 12:57pm

How can I use State.last_reported to regenerate missing points in graphs?

Jon123 · May 14, 2024, 3:04pm

So, last_reported is now available, however the original problems haven’t (yet?) been remedied.

The specific issue I’m seeing still is:

mekaneck · May 15, 2024, 2:24am

Yeah, the change that was implemented simply makes it possible for integrations to utilize that feature to overcome the repeated values problem. But each integration needs to be updated, and that has not happened for any integration that I’m aware of.

mekaneck · August 19, 2024, 11:29am

For those following along still: 2024.7 included the update of the Riemann sum integral integration. One update was to add a trigger from the state_reported event, which solves the repeated values problem (PR here). Another relevant improvement was the addition of the max_sub_interval option which will force a time-based update to the integral calculation if the source isn’t updated in that amount of time (PR here).

Hopefully we can get similar improvements to other integrations (like derivative) soon. Or if anyone is handy with Python, take those PRs as a reference to make similar changes to the integrations of your choice.

tzzy · September 10, 2024, 10:17pm

To me this is a fundamental problem. If the data is not updated just because it hasn’t changed, how can I know whether it didn’t change or whether the sensor lost connection? Additionally, in Grafana some of my graphs are useless, because e.g. cloud coverage hasn’t changed for 10+ hours so in my default past 6h view there is no data and the panel shows it as an error.

Also, I don’t understand why the integrations or the helpers (of which the Riemann integral sensor is one as I understand it) need to solve this. In my head the architecture would look like integration provides data → core decides whether to push it as new data → helpers/recorder/… receive it. So that the core could be patched to solve this problem.

I haven’t worked with the code yet nor delved into the architecture, so let me know if I am missing something.

Edit: I think I misunderstood how the new last_reported column is supposed to work. However, in my case, it is always null for the most current row (which is the one that matters most).