Ultrasonic sensor - Remove extreme values

Hello,

I am using an ultrasonic sensor in our rainwater tank.
The water level changes very slowly but I sometimes have measurement errors that I cannot explain.

I use a median of 30 values to try to limit the effect of these errors. Would it be possible to simply remove the extreme values from this sample?

Here is my YAML code:

sensor:
 - platform: ultrasonic
   trigger_pin: GPIO25
   echo_pin: GPIO26
   name: cuve_distance
   update_interval: 60s
   accuracy_decimals: 3
   unit_of_measurement: m
   pulse_time: 50us
   filters:
    - filter_out: nan
    - median:
       window_size: 30
       send_every: 10
       send_first_at: 10

Can you help me?

Thank you :slight_smile:

If the extreme values are always the same, you can use the filter_out filter: https://esphome.io/components/sensor/index.html#filter-out

Unfortunately, ESPHome does not have an outlier filter, but Home Assistant does: https://www.home-assistant.io/integrations/filter/#outlier
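As a rough sketch of what the Home Assistant side could look like, using the outlier filter from the filter integration linked above. The entity ID `sensor.cuve_distance`, the sensor name, and the `window_size` and `radius` values are assumptions for illustration, not tested against this tank:

```yaml
# Home Assistant configuration.yaml (sketch only)
sensor:
  - platform: filter
    name: "Cuve distance filtered"    # hypothetical name
    entity_id: sensor.cuve_distance   # assumed entity ID of the ESPHome sensor
    filters:
      - filter: outlier
        window_size: 5   # number of previous states used to compute the median
        radius: 0.05     # band half-width around that median, in metres (illustrative)
```

A reading that falls outside the band around the median of the previous states is replaced by that median; readings inside the band pass through unchanged.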

I would say median is a perfectly valid and good way to handle them.

For single spikes you should only need a window of 3, for double spikes a window of 5.

A window of ~10 should be enough to both handle outliers and provide a good centrepoint if your sensor measurements have some variability. This is a reason why a median filter is one of my favourites.
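To make the window sizes concrete, here is a minimal sketch of a smaller median filter for the `filters:` section of the ultrasonic sensor. The window size and the readings in the comment are illustrative assumptions, not values recommended for this particular tank:

```yaml
filters:
  - filter_out: nan
  - median:
      window_size: 5     # tolerates up to two bad readings in any five
      send_every: 5      # publish one value per five measurements
      send_first_at: 5
# Illustration with made-up readings: a window of [0.56, 0.22, 0.56, 0.56, 0.22]
# sorts to [0.22, 0.22, 0.56, 0.56, 0.56], so the published median is 0.56.
```

The median only holds as long as the bad readings stay in the minority of the window; if they reach half of it, a larger window or another approach is needed.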

You can also construct a lambda filter (if x > Y then return Z, else return x).
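A minimal sketch of such a lambda filter in ESPHome follows. Note that this variant discards implausible readings instead of substituting a fixed value Z, and the bounds of 0.10 m and 1.65 m are assumptions about the tank geometry that you would need to adjust:

```yaml
filters:
  - filter_out: nan
  - lambda: |-
      // Assumed plausible range for this tank, in metres (hypothetical bounds).
      if (x < 0.10 || x > 1.65) return {};  // returning {} drops the reading
      return x;
  - median:
      window_size: 5
```

Placing the lambda before the median means the median only ever sees readings that passed the range check. This only helps when the bad readings fall outside the physically possible range.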

Thank you for your answer.

I still have a problem which, in my opinion, is related to the processing with the quantile filter.

Look at the graph I get:
[image]

The trend looks good to me, but the sharp drops are surprising. Could they be due to the rounding function I use?
My idea was precisely to provide more reliable values for the quantiles.

Here is my current code:

sensor:
 - platform: ultrasonic
   trigger_pin: GPIO25
   echo_pin: GPIO26
   name: cuve_distance
   update_interval: 60s
   accuracy_decimals: 3
   unit_of_measurement: m
   pulse_time: 50us
   filters:
    - filter_out: nan
    - lambda: return round(x*10000)/10000;
    - quantile:
        window_size: 30
        send_every: 10
        send_first_at: 10
        quantile: .8
   #- median:
   #    window_size: 30
   #    send_every: 10
   #    send_first_at: 10

   # - calibrate_linear:
   #    - 0.10 -> 1.65
   #    - 1.65 -> 0.0
   on_value:
    then:
     - sensor.template.publish:
        id: cuve_litre
        state: !lambda 'return 4677*(0.680625*acos((0.825-(1.65-x))/0.825)-(0.825-(1.65-x))*sqrt(0.680625-(0.825-(1.65-x))*(0.825-(1.65-x))));'
        #state: !lambda 'return 10*round(4677*(0.680625*acos((0.825-(1.65-x))/0.825)-(0.825-(1.65-x))*sqrt(0.680625-(0.825-(1.65-x))*(0.825-(1.65-x))))/10);'
     - sensor.template.publish:
        id: cuve_pourcent
        state: !lambda 'return (4677*(0.680625*acos((0.825-(1.65-x))/0.825)-(0.825-(1.65-x))*sqrt(0.680625-(0.825-(1.65-x))*(0.825-(1.65-x)))))/100;'
     - sensor.template.publish:
        id: cuve_hauteur_eau
        state: !lambda 'return (1.65-x);'

Is there a reason you’re interested in the 80th percentile?

Personally I can’t see how that would produce a good central measure.

It’s probably choppy by nature of the measure - outliers jump around and you’re basically picking up the right hand tail with that.

Is the median not working as you wish? I can’t see why it would fail under normal outlier scenarios. It’s a robust measure.

Maybe make some copy sensors and show some raw plus treated plots so we can see what’s happening.
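One way to set this up is with ESPHome's `copy` sensor platform, so the raw and the filtered series are published side by side. This is only a sketch: the `id` and both names are hypothetical, and the median settings are illustrative:

```yaml
sensor:
  - platform: ultrasonic
    trigger_pin: GPIO25
    echo_pin: GPIO26
    id: cuve_distance_raw         # hypothetical ID
    name: cuve_distance_raw       # unfiltered readings, for plotting
    update_interval: 60s
    unit_of_measurement: m
    accuracy_decimals: 3

  - platform: copy
    source_id: cuve_distance_raw
    name: cuve_distance_median    # hypothetical name
    filters:
      - median:
          window_size: 10
          send_every: 1           # publish on every reading so both plots line up
```

Plotting both entities on the same history graph should show whether the sharp drops come from the sensor itself or from the filtering.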

Look at these measurements:

As you can see, when it’s wrong, it is very wrong :wink:

I have solved my issue with Node-RED; ESPHome is just used for the measurement.

This node (node-red-contrib-cistern) is perfect for me and seems to have been created for my module (the HC-SR04).

Thank you for your help :slight_smile:

If that is all you are using NR for, it’s a bit like cracking a walnut with a sledgehammer. Home Assistant can do this natively:

:rofl: You’re right!

No, I use NR for several other things with HA.

Well that’s ok then. It would have been a rather large overhead just for the outlier filter.

I’m still surprised by the number of outliers:

[15:05:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.22 m
[15:05:15][D][sensor:127]: 'cuve_distance': Sending state 0.22415 m with 5 decimals of accuracy
[15:06:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:06:15][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy
[15:07:37][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.22 m
[15:07:37][D][sensor:127]: 'cuve_distance': Sending state 0.21952 m with 5 decimals of accuracy
[15:08:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:08:15][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy
[15:09:13][D][sensor:127]: 'cuve_wifi': Sending state 20.00000 % with 0 decimals of accuracy
[15:09:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:09:15][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy
[15:10:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.22 m
[15:10:15][D][sensor:127]: 'cuve_distance': Sending state 0.22398 m with 5 decimals of accuracy
[15:11:16][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:11:16][D][sensor:127]: 'cuve_distance': Sending state 0.56012 m with 5 decimals of accuracy
[15:12:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.22 m
[15:12:15][D][sensor:127]: 'cuve_distance': Sending state 0.22415 m with 5 decimals of accuracy
[15:13:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:13:15][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy
[15:14:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:14:15][D][sensor:127]: 'cuve_distance': Sending state 0.55978 m with 5 decimals of accuracy
[15:15:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.22 m
[15:15:15][D][sensor:127]: 'cuve_distance': Sending state 0.22398 m with 5 decimals of accuracy
[15:16:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:16:16][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy
[15:17:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.22 m
[15:17:15][D][sensor:127]: 'cuve_distance': Sending state 0.22415 m with 5 decimals of accuracy
[15:18:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:18:15][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy
[15:19:11][D][sensor:127]: 'cuve_wifi': Sending state 24.00000 % with 0 decimals of accuracy
[15:19:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:19:15][D][sensor:127]: 'cuve_distance': Sending state 0.55978 m with 5 decimals of accuracy
[15:20:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:20:15][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy
[15:21:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:21:15][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy
[15:22:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:22:15][D][sensor:127]: 'cuve_distance': Sending state 0.56029 m with 5 decimals of accuracy
[15:23:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:23:15][D][sensor:127]: 'cuve_distance': Sending state 0.55978 m with 5 decimals of accuracy
[15:24:15][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:24:15][D][sensor:127]: 'cuve_distance': Sending state 0.55978 m with 5 decimals of accuracy
[15:25:16][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:25:16][D][sensor:127]: 'cuve_distance': Sending state 0.55978 m with 5 decimals of accuracy
[15:26:16][D][ultrasonic.sensor:040]: 'cuve_distance' - Got distance: 0.56 m
[15:26:16][D][sensor:127]: 'cuve_distance': Sending state 0.55995 m with 5 decimals of accuracy

For the moment I do not know whether I should take measurements more often or increase the size of the sample.

Currently I take one measurement per minute and use a sample of 10 measurements.

In the end I have one value recorded every 10 minutes.

There’s only so many times I can suggest “just use a median”.

Sounds like you’ve found a solution you prefer though anyway.