Block writing problematic data to the database - using SQL TRIGGER on HA database

e-raser · January 25, 2023, 10:41pm

Motivation:

Thanks to AVMs change in latest Fritz!Box devices firmware (7.50 release), the behavior for data provided using the AVM FRITZ!Box Tools - Home Assistant integration - precisely: sensor.fritzbox_gigabytes_received and sensor.fritzbox_gigabytes_sent changed:

github.com/home-assistant/core

Problem with sensor.fritz_box_gb_received

opened 09:46AM - 05 Dec 22 UTC

closed 01:00PM - 10 Dec 22 UTC

Leopoldpaul

problem in device integration: fritz

### The problem Since the latest Fritz Os 7.50 the sensor.fritz_box_gb_received… and sensor.fritz_box_gb_sent doesnt reset every day anymore. They grow until i restart the FritzBox. Mentioned the potetial Problem a while ago: https://github.com/home-assistant/core/issues/73328 ### What version of Home Assistant Core has the issue? 2022.11.5 ### What was the last working version of Home Assistant Core? _No response_ ### What type of installation are you running? Home Assistant OS ### Integration causing the issue AVM FRITZ!Box Tools ### Link to integration documentation on our website https://www.home-assistant.io/integrations/fritz/ ### Diagnostics information _No response_ ### Example YAML snippet _No response_ ### Anything in the logs that might be useful for us? _No response_ ### Additional information _No response_

Thanks to that change and a quite unreliable behavior of that input sensors, utility meters (new sensors) based on those two input sensors are completely screwed.

Problem:

If input sensors data is set to 0.0 for a second/few seconds and switches back to the former value of e. g. 1000, that rise of 1000 is added to all utility meter sensor.
Affects per utility meter sensor state column in states table as well as state and sum tables in statistics_short_term and statistics table.
In the end, it renders all those sensors data (LTS - long-term statistics) completely useless.
Cleaning up all that mess takes me AT LEAST 3 to 4 hours PER INCIDENT - which could happen once a day During the last days I spent roughly 17 hours in total already on this issue. I am totally frustrated, especially as this could happen (and is happening at the moment) every day.

Possible solutions / workarounds:

The manufacturer (AVM) "...assumes no liability or warranty for any of the functions mentioned or for the correctness of the associated documentation. AVM reserves the right to change or omit functions in whole or in part at any time and without prior notice."
I need to block the source (Fritz! integration) to write 0.0 to the states table for two entities (sensor.fritzbox_gigabytes_received and sensor.fritzbox_gigabytes_sent).
I was thinking of “maybe can the integration do something here” as all information exists. Unfortunately, no response yet. - Update: see:

github.com/home-assistant/core

Unreliable sensor data (data usage) since firmware 7.50 - filter/block that data

opened 10:54PM - 25 Jan 23 UTC

bcutter

integration: fritz

### The problem See - https://github.com/home-assistant/core/issues/83280 for …starting point - https://github.com/home-assistant/core/issues/83280#issuecomment-1399597517 for follow-up issue and - https://community.home-assistant.io/t/block-writing-problematic-data-to-the-database-using-sql-trigger-on-ha-database/525021 for a short summary on this issue Affected entites (possibly renamed): - `sensor.fritzbox_gigabytes_received` - `sensor.fritzbox_gigabytes_sent` ### What version of Home Assistant Core has the issue? core-2022.12.9 ### What was the last working version of Home Assistant Core? _No response_ ### What type of installation are you running? Home Assistant OS ### Integration causing the issue AVM FRITZ!Box Tools ### Link to integration documentation on our website https://www.home-assistant.io/integrations/fritz ### Diagnostics information _No response_ ### Example YAML snippet _No response_ ### Anything in the logs that might be useful for us? _No response_ ### Additional information 1. Unfortunately there was no response from integration maintainers (yet) on ideas how to fix this on the integration side @ https://github.com/home-assistant/core/issues/83280#issuecomment-1399597517 2. The other issue has been silently locked and limited to collaborators https://github.com/home-assistant/core/issues/83280#event-8358044112 (which is fine as no development progress was made there) so as promised this is the follow-up issue on that one. 3. One very specific and highly advanced option to "solve" (work around) this issue is noted at https://community.home-assistant.io/t/block-writing-problematic-data-to-the-database-using-sql-trigger-on-ha-database/525021 - currently not more than just an idea. Hierarchy of fixing: 1. Manufacturer: they wont as stated in https://github.com/home-assistant/core/issues/83280#issuecomment-1400000182 2. Integration: that's why this issue has been created 3. Individually: Block write attempts using SQL TRIGGER As 1 is out and everything after 2 is quite complicated and risky, in my opinion the integration is the perfect place to solve this - for all affected users (which are going to increase with continued rollout of the 7.50 release firmware).

So what can affected users do on their own?
For this I thought of using a SQL TRIGGER (I use the default HA database - SQLite 3.40.1 with schema_version 30 on HA Core 2022.12) to block these write atttempts.

What I currently have

DELIMITER $$

CREATE TRIGGER fritzbox_gigabytes_no_zero
BEFORE INSERT ON "states"
FOR EACH ROW
BEGIN
  IF (NEW.entity_id = "sensor.fritzbox_gigabytes_received" OR NEW.entity_id = "sensor.fritzbox_gigabytes_sent")
    IF NEW.state = 0.0 THEN
      SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Updates with value 0.0 are not allowed in this case - WORKAROUND for BUG 1.31';
    END IF;
  END IF;
END;

$$

DELIMITER ;

Questions

General: Can SQL TRIGGERS be used safely in the HA-DB (SQLite)?
General: Are they persistent there? Or are they at risk of being changed or deleted during HA Core updates (DB schema upgrades)?
Will it have a notable impact on performance (or is the engine applying the filters efficiently so all other INSERTs are not getting slowed down)?
Specifically in regards to the current trigger definition above:
Does the specifically considered one work (syntactically)?
In case you have a completely other approach/idea (not taking care of the data is not an option), please let me know as well.
General: where does that error message text pop up? Will it be logged in the HA log (so I could discover and count „Jep, I hit the issue and successfully avoided it again“?

For now, I feel like (and hope) the SQL TRIGGER could be a smart “solution” for this. At least for buying time. Time urgently needed currently

Regarding 4: I would test this as good as possible, likely by adding the trigger with trigger type UPDATE, use one row for testing purposes and try to write 0.0 to it. As this is a “try & error” I would be very thankful for experts having a look at this.

michaelblight · January 26, 2023, 2:50am

I don’t have this device, so don’t have the issue. However, personally I would rename the problematic entities (eg. stick “_source” on the end) and then create new template entities with the old ids that copy and filter out the unwanted data from the source entities. Then if the integration is fixed, delete the template entities and rename the source ones back - in theory you should not lose data this way.

In my mind, this approach is working with HA, not against it. But I’ve always hated sql triggers because they’re hidden away and forgotten about, so I might be biased.

e-raser · January 26, 2023, 3:10pm

Oh dear. That’s a too obvious solution/workaround - much smarter approach to buy some time / survive the time until an actual persistent fix/solution exists (working for everyone out of the box without the need for individuals to take care of it). Especially as it is a no-brainer, default „feature“ of HA. I like

Only (small) downsides of templated „copy entities“:

Easy to forget/oversee as they are usually not assigned to the device their source sensors are (so not to be found using integration / devices / entities free)
Redundant/needless amount of duplicated data.

I‘ll try to figure out what the ‘value_template‘ needs to look like to ignore/filter out all 0.0 values.

BUT there are similar cases:

e. g. It's time to fix Aqara battery bug - General Support - deCONZ Community where -100.0 degree values indicate low batteries of famous Aqara multi/temperature sensors integrated with deCONZ - also renders all statistics useless so same impact but happening less often)
In those cases it would be pure overkill working with templated copy entities (maaaaaany sensors!).

Therefore…

…I am still highly interested in getting answers to my original questions on the SQL TRIGGER. Please everyone, feel free. Besides to template entities, there’s really absolutely NOTHING in this community/on this forum on SQL TRIGGERS with HA (yet).

Well, they are counted…

…and can be listed using

SELECT * FROM sqlite_master WHERE type = 'trigger';

Of course they are not that much on the spot, but there are options to not forget them that easily

e-raser · January 26, 2023, 10:22pm

When trying to follow this, I came across one totally unexpected issue:

template sensor A which copies source sensor A (but filters unwanted states/values) has its own unique_id: (following the modern way of creating template sensors but I’m sure the legacy method had this too)
So probably because of this, after performing the swap by
- renaming template sensor A to source sensor A’s entity_id and
- renaming source sensor A to *_source
  when now showing the data of those sensors, there’s a confusion:

source sensor A in history view (which uses data from states table):

source sensor A in entity view (which uses data from statistics tables for a few HA Core releases):

template sensor A in history view (which uses data from states table):

template sensor A in entity view (which uses data from statistics tables for a few HA Core releases):

So it’s a quite confusing mix:

source sensor A STATE view is equal to template sensor A STATISTICS view
source sensor A STATISTICS view is equal to template sensor A STATE view
(just visually match the corresponding views/screenshots above)

Reason:

New entity get’s a new metadata_id and a new/own set of statistics data in both statistics tables.

Impact:

So if I understand this correctly, this means:

for the time being while using the template sensors, I simply won’t have the history for them

What I don’t know - and that keeps me frightening - is:

What happens once I would switch back (delete (or first rename) template sensors, only use source sensors again)?

What data/history/view will I loose?

==> Update:
Just tested it, reverted all back.
Former source (now again primary) sensor has all data in all views (states/history view + statistics/entity view) → important, nice
Former template (now “*_test”) sensor has statistics/entity view data, but misses the states/history view data from the timeslot acting as original sensor (because: during that time, no data was recorded for the entity_id “*_test”).
==> In short: that should be fine, nothing essential is lost for the primary sensor once switching back.
==> Only downside is for the time using the template sensors: the statistics/entity view is simply incomplete as it can’t show the history before the sensor’s creation.

By the way: what my template sensor(s) look like

I could not test (and hopefully never will) if the state (former value_template) definition is actually correct. But I think it should filter the unwanted 0.0 values. Looking good to you @michaelblight ?

click to see template sensor definition

template:
  - sensor:
      - name: FRITZ!Box GB empfangen (Copy, BUG 1.31)
        # entity_id:
        # Zunächst (Test)                       : sensor.fritzbox_gigabytes_received_test
        # Final (ursprünglich des Quell-Sensors): sensor.fritzbox_gigabytes_received
        unique_id: abcabcabc-custom
        state_class: total_increasing
        unit_of_measurement: GB
        icon: mdi:download
        state: >-
          {% if states('sensor.fritzbox_gigabytes_received_source') != '0.0' %}
            {{ states('sensor.fritzbox_gigabytes_received_source') }}
          {% endif %}
        # Variante davor/alternativ:
        #state: >-
        #  {% if is_state('sensor.fritzbox_gigabytes_received_source', '0.0') %}
        #  {% else %}
        #    {{ states('sensor.fritzbox_gigabytes_received_source') }}
        #  {% endif %}
        availability: "{{ states('sensor.fritzbox_gigabytes_received_source') > '0.0' }}"

      - name: FRITZ!Box GB gesendet (Copy, BUG 1.31)
        # entity_id:
        # Zunächst (Test)                       : sensor.fritzbox_gigabytes_sent_test
        # Final (ursprünglich des Quell-Sensors): sensor.fritzbox_gigabytes_sent
        unique_id: xyzxyzxyz-custom
        state_class: total_increasing
        unit_of_measurement: GB
        icon: mdi:upload
        state: >-
          {% if states('sensor.fritzbox_gigabytes_sent_source') != '0.0' %}
            {{ states('sensor.fritzbox_gigabytes_sent_source') }}
          {% endif %}
        # Variante davor/alternativ:
        #state: >-
        #  {% if is_state('sensor.fritzbox_gigabytes_sent_source', '0.0') %}
        #  {% else %}
        #    {{ states('sensor.fritzbox_gigabytes_sent_source') }}
        #  {% endif %}
        availability: "{{ states('sensor.fritzbox_gigabytes_sent_source') > '0.0' }}"

Summary on the “template sensor option” to deal with the problematic data

To be honest, taking that observation into account I’d still really prefer to just block the integration writing the rubbish 0.0 values to the database at all. By using SQL TRIGGERS. Cause not having shit in the database at all is a million times better than having to deal with the shit once it is in there.

In other words: copying and struggling with an additional amount of (template) sensors and their different data in their different tables (1x states, 2x statistics) shown in different ways and places in the UI is much more complicated - and definitely HAS SIDE EFFECTS - than expected.

Not sure how to proceed from here. Please assist on the initial SQL TRIGGER questions if anyone is smart enough for it.

By the way I’m starting to get a bit angry about the whole situation, facing all the issues and technical challenges coming up trying to work around the core issue (having a damn 0.0 in one single row in the database), while looking at Unreliable sensor data (data usage) since firmware 7.50 - filter/block that data · Issue #86661 · home-assistant/core · GitHub seeing there’s no solution (or even first response).

==> Latest status (see also “Update” section above):
Using the template sensor option now. Even in my opinion using the SQL TRIGGER option would be the better option. It’s just that there are few unanswered questions on using SQL TRIGGERs on the HA database, see end of original post.

Appendix:

First inspiration of filtering unwanted data (funnily, in that case also 0.0 needed to be blocked): Need MQTT sensor template that filter state=0