Threshold helper update rate (DB size/state rows)

rct · October 23, 2022, 7:01pm

While looking into what is contributing to the size of the recorder DB, I noticed that a Threshold binary_sensor I created (in the UI) is responsible for a row in the states table every 10-20 seconds even though the the actual value of the threshold binary_sensor hasn’t changed in 10+ days.

The entity the threshold sensor is based on is a power consumption sensor so it updates fairly frequently. I wanted to alert/take action if power consumption is above or below the threshold for a period of time.

There is roughly an equivalent number of rows in the states table for both the base power sensor, and the binary threshold sensor. The main change (purpose?) of the new states for the threshold sensor is recording the last_updated date as the base power entity’s last update.

So even though the state of the threshold sensor hasn’t changed, there is a new state for it, because it has been updated to the same value. Is that correct?

If so, I can somewhat understand that, but it certainly seems less than optimal. All of those extra state rows aren’t providing much value – unless I’m trying to debug and figure out the last time the threshold was evaluated.

Note that there really aren’t any clues for the user in the UI – history doesn’t show any changes other than when Home Assistant was restarted.

I suppose there are some ways I might be able to use trigger based sensors or even an automation with triggers to replace this binary sensor and reduce the DB size. But it would require coming up with a way to create an entity whose last update time isn’t tied to the actual power sensor update.

I’m looking for suggestions, but also wondering if this seems reasonable or should be considered a bug or feature request against the threshold helper.

tom_l · October 24, 2022, 10:46am

Yes that seems like what is happening. Nick will know if this can be optimised.

@bdraco Hi Nick, sorry for the ping but is this reasonable or is it a possible further DB optimisation?

bdraco · October 24, 2022, 1:51pm

While out knowing the exact configuration, and a quick look at the code, it look like the attributes are the cause.

Quoting Entity | Home Assistant Developer Docs

DANGER: Entities that generate a significant amount of state changes can quickly increase the size of the database when the extra_state_attributes also change frequently. Minimize the number of extra_state_attributes for these entities by removing non-critical attributes or creating additional sensor entities.

rct · October 24, 2022, 5:00pm

Thank you for that clue. Now that I’ve looked at the Threshold helper entity in developer tools, I see that threshold is recording the value of the sensor it is based on as the attribute sensor_value.

So it is worse than I thought – for each threshold sensor that gets defined, there will be 2 rows (states and state_attributes) written to recorder DB every time the sensor it is based on updates.

The attributes stored winds up being a long string (251 bytes in this case):

'{"entity_id":"sensor.ice_maker_current_consumption","hysteresis":0.0,"lower":1.5,"position":"above","sensor_value":135.4,"type":"lower","upper":null,"device_class":"problem","icon":"mdi:snowflake-thermometer","friendly_name":"Ice Maker Low Power Use"}'

I don’t see anything relevant I can change about the threshold sensor configuration either through the UI or YAML. This is the config from the UI

      {
        "entry_id": "ecf137fb9c1c3a7c91859b810a8e2d43",
        "version": 1,
        "domain": "threshold",
        "title": "Ice Maker Low Power Use",
        "data": {},
        "options": {
          "lower": 1.5,
          "upper": null,
          "name": "Ice Maker Low Power Use",
          "entity_id": "sensor.ice_maker_current_consumption",
          "hysteresis": 0.0
        },
        "pref_disable_new_entities": false,
        "pref_disable_polling": false,
        "source": "user",
        "unique_id": null,
        "disabled_by": null
      },

bdraco · October 24, 2022, 6:34pm

Having the sensor_value seems like debug information that probably shouldn’t be in the attributes since it is already available in the state machine, and already recorded in the database.

I think it would be reasonable to remove it.

rct · October 24, 2022, 6:49pm

@bdraco - Thank you for your help – I’ve opened an issue against threshold in GitHub.

github.com/home-assistant/core

Threshold state/attr updates when underlying entity changes, but threshold unchanged

opened 08:57PM - 24 Oct 22 UTC

rct

integration: tplink integration: threshold

### The problem The threshold integration appears to generate a state update …and a shared attribute update each time the source entity the threshold is based on changes. _This occurs even if the threshold entity itself does not change_. When the source entity changes, it's value is saved as the attribute `sensor_value` for the threshold entity. If the threshold entity is based on an entity that changes frequently, like power consumption or temperature, this can generate a lot of rows in the recorder DB. While this might be useful for debugging the threshold sensor, the data for the source entity is presumably already stored in recorder. So adding a threshold sensor multiplies the number of updates to recorder. From a UX perspective, in can be surprising that a threshold helper might contribute to DB growth even if the threshold entity rarely changes. Related community thread and comment from bdraco: https://community.home-assistant.io/t/threshold-helper-update-rate-db-size-state-rows/479192/5?u=rct > Having the sensor_value seems like debug information that probably shouldn’t be in the attributes since it is already available in the state machine, and already recorded in the database. > > I think it would be reasonable to remove it. ### What version of Home Assistant Core has the issue? 2022.10.4 ### What was the last working version of Home Assistant Core? _No response_ ### What type of installation are you running? Home Assistant Supervised ### Integration causing the issue Threshold ### Link to integration documentation on our website https://www.home-assistant.io/integrations/threshold/ ### Diagnostics information _No response_ ### Example YAML snippet ```json The threshold sensor was configured via the UI. The JSON from `core.config_entries`: { "entry_id": "ecf137fb9c1c3a7c91859b810a8e2d43", "version": 1, "domain": "threshold", "title": "Ice Maker Low Power Use", "data": {}, "options": { "lower": 1.5, "upper": null, "name": "Ice Maker Low Power Use", "entity_id": "sensor.ice_maker_current_consumption", "hysteresis": 0.0 }, "pref_disable_new_entities": false, "pref_disable_polling": false, "source": "user", "unique_id": null, "disabled_by": null }, ``` ### Anything in the logs that might be useful for us? _No response_ ### Additional information The use case for the threshold sensor is to monitor power usage, via a TPLink HS110 (built in [TP Link Kasa Smart integration](https://www.home-assistant.io/integrations/tplink/)), and alert/take action if the power usage is below the expected range (eg the device has been powered off or has malfunctioned.) The source entity for the threshold sensor is `sensor.devicename_current_consumption` from the TP-Link Kasa Smart integration. That entity is updating every 20 seconds. The threshold sensor itself has not changed state during the history of recorder DB, other than for Home Assistant restarts. The `states` table rows for the threshold sensor: ![image](https://user-images.githubusercontent.com/38770/197626246-4fd2569c-441c-4f57-a47d-e64c8d8041f6.png) The `shared_attr` data written to the `state_attributes` table: ``` '{"entity_id":"sensor.ice_maker_current_consumption","hysteresis":0.0,"lower":1.5,"position":"above","sensor_value":135.4,"type":"lower","upper":null,"device_class":"problem","icon":"mdi:snowflake-thermometer","friendly_name":"Ice Maker Low Power Use"}' ``` (Note that `sensor_value` is the only attribute that is changing if the threshold hasn't been crossed.

frits1980 · October 28, 2024, 8:27pm

Why was this issue never solved?

github.com/home-assistant/core

Threshold state/attr updates when underlying entity changes, but threshold unchanged

opened 08:57PM - 24 Oct 22 UTC

closed 05:07PM - 24 Dec 23 UTC

rct

integration: threshold stale

### The problem The threshold integration appears to generate a state update …and a shared attribute update each time the source entity the threshold is based on changes. _This occurs even if the threshold entity itself does not change_. When the source entity changes, it's value is saved as the attribute `sensor_value` for the threshold entity. If the threshold entity is based on an entity that changes frequently, like power consumption or temperature, this can generate a lot of rows in the recorder DB. While this might be useful for debugging the threshold sensor, the data for the source entity is presumably already stored in recorder. So adding a threshold sensor multiplies the number of updates to recorder. From a UX perspective, in can be surprising that a threshold helper might contribute to DB growth even if the threshold entity rarely changes. Related community thread and comment from bdraco: https://community.home-assistant.io/t/threshold-helper-update-rate-db-size-state-rows/479192/5?u=rct > Having the sensor_value seems like debug information that probably shouldn’t be in the attributes since it is already available in the state machine, and already recorded in the database. > > I think it would be reasonable to remove it. ### What version of Home Assistant Core has the issue? 2022.10.4 ### What was the last working version of Home Assistant Core? _No response_ ### What type of installation are you running? Home Assistant Supervised ### Integration causing the issue Threshold ### Link to integration documentation on our website https://www.home-assistant.io/integrations/threshold/ ### Diagnostics information _No response_ ### Example YAML snippet ```json The threshold sensor was configured via the UI. The JSON from `core.config_entries`: { "entry_id": "ecf137fb9c1c3a7c91859b810a8e2d43", "version": 1, "domain": "threshold", "title": "Ice Maker Low Power Use", "data": {}, "options": { "lower": 1.5, "upper": null, "name": "Ice Maker Low Power Use", "entity_id": "sensor.ice_maker_current_consumption", "hysteresis": 0.0 }, "pref_disable_new_entities": false, "pref_disable_polling": false, "source": "user", "unique_id": null, "disabled_by": null }, ``` ### Anything in the logs that might be useful for us? _No response_ ### Additional information The use case for the threshold sensor is to monitor power usage, via a TPLink HS110 (built in [TP Link Kasa Smart integration](https://www.home-assistant.io/integrations/tplink/)), and alert/take action if the power usage is below the expected range (eg the device has been powered off or has malfunctioned.) The source entity for the threshold sensor is `sensor.devicename_current_consumption` from the TP-Link Kasa Smart integration. That entity is updating every 20 seconds. The threshold sensor itself has not changed state during the history of recorder DB, other than for Home Assistant restarts. The `states` table rows for the threshold sensor: ![image](https://user-images.githubusercontent.com/38770/197626246-4fd2569c-441c-4f57-a47d-e64c8d8041f6.png) The `shared_attr` data written to the `state_attributes` table: ``` '{"entity_id":"sensor.ice_maker_current_consumption","hysteresis":0.0,"lower":1.5,"position":"above","sensor_value":135.4,"type":"lower","upper":null,"device_class":"problem","icon":"mdi:snowflake-thermometer","friendly_name":"Ice Maker Low Power Use"}' ``` (Note that `sensor_value` is the only attribute that is changing if the threshold hasn't been crossed.

tom_l · October 28, 2024, 9:43pm

Because no one could be bothered preventing the issue going stale.

Similar issues have been resolved, e.g. Avoid recording additional light attributes by bdraco · Pull Request #121776 · home-assistant/core · GitHub

If you want this fixed open a new issue or ask one of the github admins to re-open it.

frits1980 · October 28, 2024, 10:50pm

@bdraco would you be so kind and reopen this issue? Thank you in advance.

rct · October 31, 2024, 2:25pm

I find it somewhat discouraging that some of the built in integrations like threshold and history stats have some long standing issues. As users gain more experience they just implement their own template sensors. Fixes things for them, but doesn’t help others avoid the same problem.

The original code owner just suggested working around this by preventing recorder from persisting the entity. Not helpful if the derived entity was intended to provide meaningful data by filtering some noisy sensor.

I feel like there should be a process for deprecating internal integrations that are no longer maintained. If threshold was removed then people would just go directly into searching for posts on how to implement their own using templates.

I think this one boils down to most users won’t be aware of the potential database impact. By the time one knows enough about Home Assistant to understand how threshold is impacting the database, they have the sophistication to implement a workaround.

EDIT:

@frits1980 - Have you dug into this on a somewhat recent version? At some point (last 2 years?) there was some DB schema changes that I believe were intended to avoid the storage of redundant attribute data. Just guessing but I think it would still generate lots of state updates, but not have as much of a size impact if a shared attribute row was being used.

Also before anyone says “just submit a PR”, I think fixing/optimizing this integration needs a decent understanding home assistant internals so this isn’t a good Hacktoberfest/first PR.

rct · October 31, 2024, 2:47pm

@tom_l before you posted this helpful comment did you happen to notice the issue was open from October 24, 2022 until December 24, 2023?

I clearly slacked off since I didn’t notice it was marked stale during the 7 days between Dec 17th and Dec 24th after keeping the issue open for 14 months.

frits1980 · October 31, 2024, 5:57pm

For me personally it’s not so much about the DB impact. I will never have DB issues. But it is more about the fact that I cannot build an automation triggering on state change. Because even if the state is the same at it was before any update it “changes” to the same value.

My workaround is not a template sensor because I hate the templating language. It’s not anybody should have to learn in the world we live in (imho).
I created a boolean helper and an automation that changes it like the threshold sensor should.

Nevertheless I agree with the point of not solving issues in the core integrations. The sensor now is clearly a deviation to the standard so it should be solved to behave the same as other sensor helpers.

rct · October 31, 2024, 6:08pm

The input helper is an interesting solution and I had forgotten that there are really two effects of this problem.

If you prefer not to code logic in Home Assistant templates, have you looked at PyScript?

There is a lot of capability there though the document needs a few updates.

frits1980 · October 31, 2024, 7:50pm

Thank you. But I’m not a programmer, nor coder, not scripter. Everything that looks like code I try to avoid. Not because I probably cannot do it, but because it would give me to much of a headache. I’m a GUI guy, or actually I became one

tom_l · October 31, 2024, 8:45pm

triggers:
  - trigger: state
    entity_id: sensor.foobar
    to:  # null "to" ignores attribute changes. Use "to: ~" if using the UI

frits1980 · November 1, 2024, 7:49am

Nice. But I don’t really understand why this behaviour is different then other sensors to begin with. But that is not a question for you of course. Thank you for pointing this out to me!

tom_l · November 1, 2024, 8:07am

Can you explain what you men by that?

The trigger with a null “to state” works the same for all entities, not just sensors.

frits1980 · November 1, 2024, 9:33am

Ok, well with other sensors it also works when giving nothing as value instead of ~.
So all my other automations are like that.

But I guess I should change all automation to use ~ instead. To be shure it keeps working.

tom_l · November 1, 2024, 9:41am

No there’s no need.

You can use nothing when writing your automation in YAML. The tilde (~) means the same thing (null) and is required when using the UI. Or at least that is what I thought (I’ve never used the UI for automations). Are you saying you can use nothing when writing automation is the UI?

frits1980 · November 1, 2024, 10:03am

Yes it is possible to use nothing as value in the UI. And I only use the UI these days.