What IS the temperature outside? How to ignore spurious values?

I’m using the outside temperature to estimate heat losses from my house as part of my effort to monitor heat pump performance. I’ve been getting some odd results and dived into the temperature history looking for anomalies. It turns out the 2 sensors I’m using, which are built into the heat pump and the MVHR, periodically go haywire, giving spurious zero readings or shooting up by 15C or so (see graph below). Is there some technique to ignore the wrong’uns?
It turns out I’ve got a 3rd sensor (which came as part of an Ecowitt wind monitor I bought recently), so with 3 sensors I must surely be able to get a solid number I can rely on. At different times either the HP sensor or the Ecowitt may get some solar gain, which I’d like to exclude if possible.
The spikes in these lines appear to be real numbers rather than “unknown” etc. I was wondering whether using a 10 minute moving average might eliminate them & then maybe I could discard the outlier of the 3 readings to produce an average? What do you think?

I take the lowest of my two sensors as the “true” temperature to avoid sun loading (one is east-facing, one west), although zero readings are a problem particularly when the temperature could be zero.

I wish the DS18B20 sensors I use via ESPHome could return a Kelvin reading, as I could then reasonably ignore zero values as I don’t live in Scotland.

I’d do moving average and outlier removal. Probably outlier removal first, then moving average. Seems like you’d be able to do an outlier removal of around 3-4 degrees and it would remove all those spikes.
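To illustrate why the order matters, here is a rough Python sketch. It is not the Filter integration's actual code: `remove_outliers` and `moving_average` are hypothetical helper names, the readings are made up, and the moving average is count-based rather than time-based to keep things simple. The outlier step replaces any reading more than `radius` away from the median of the last few raw readings with that median.

```python
from collections import deque
from statistics import median


def remove_outliers(readings, window_size=5, radius=3.0):
    """Replace readings further than `radius` from the median of the
    previous `window_size` raw readings with that median."""
    history = deque(maxlen=window_size)
    cleaned = []
    for value in readings:
        accepted = value
        if history:
            centre = median(history)
            if abs(value - centre) > radius:
                accepted = centre  # spike: fall back to the recent median
        history.append(value)  # keep raw values so real changes are followed
        cleaned.append(accepted)
    return cleaned


def moving_average(readings, window_size=3):
    """Simple moving average over the last `window_size` readings."""
    window = deque(maxlen=window_size)
    averaged = []
    for value in readings:
        window.append(value)
        averaged.append(sum(window) / len(window))
    return averaged


# Made-up series with one spurious zero and one spurious high spike.
raw = [8.0, 8.1, 0.0, 8.2, 23.5, 8.3, 8.4]
cleaned = remove_outliers(raw)
smoothed = moving_average(cleaned)
print(cleaned)
print(smoothed)
```

If the moving average ran first, the zero and the 23.5 would be smeared into their neighbours instead of being dropped, which is why outlier removal probably belongs ahead of the averaging.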

EDIT: Filter integration does that btw.

As ever, you guys are so helpful. I didn’t know about the filter integration, so I’ll play with that. It only takes a single sensor though, so I’d need a different way of discarding whichever one of the 3 current readings might be out of line. Maybe average the 2 lowest? Any thoughts?

Play with this in the template editor:

{% set t1, t2, t3 = (8, 0, 9) %}
{% set diffs = [(t1-t2)|abs,(t2-t3)|abs,(t1-t3)|abs] %}
{% set avs = [(t1+t2)/2, (t2+t3)/2, (t1+t3)/2] %}
{{ avs[diffs.index(diffs|min)] }}

t1, t2 and t3 are your sensor readings.

diffs is a list of the differences between the three possible pairings (with abs making sure each is always positive).

avs is the average of the corresponding pairs.

The final line returns the average of the two readings that are closest to each other.

There are probably better ways to do this, but it’s what came to my mind first.
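For anyone who wants to try the same logic outside the template editor, this is a small Python sketch of it. The function name `closest_pair_average` is made up for illustration; the test values are the ones from the template plus one with a high spike.

```python
from itertools import combinations


def closest_pair_average(readings):
    """Average the two readings that are closest to each other."""
    a, b = min(combinations(readings, 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2


print(closest_pair_average([8, 0, 9]))   # 8.5 (the spurious 0 is ignored)
print(closest_pair_average([8, 23, 9]))  # 8.5 (the spurious 23 is ignored)
```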

Thanks again for your suggestions. For completeness’ sake, this is what I’m trying now: first filter the 3 sensors, then discard the highest & average the other 2. It would be nice to go backwards & see how this would have dealt with recent problems, but I guess that’s not possible. Is it even possible to get past daily averages?

  - platform: filter
    name: "Filtered MVHR Outside Temperature"
    unique_id: filtered_mvhr_outside_temp
    entity_id: sensor.outside_temperature
    filters:
      - filter: outlier
        window_size: 5
        radius: 2.0
      - filter: time_simple_moving_average
        window_size: "00:05"
        precision: 2

times 3, then

    - name: processed_outside_temp ## discards the highest of the 3 outside temps & averages the others
      unique_id: processed_outside_temp
      unit_of_measurement: °C
      device_class: temperature
      state_class: measurement
      icon: mdi:thermometer
      state: >
        {% set ts = [states('sensor.filtered_hp_outside_temperature')|float(0), states('sensor.filtered_mvhr_outside_temperature')|float(0), states('sensor.filtered_shed_outside_temperature')|float(0)]  %}
        {% set avs = [(ts[1]+ts[2])/2, (ts[0]+ts[2])/2, (ts[0]+ts[1])/2] %}
        {{ avs[ts.index(ts|max)] }}

if you’re looking to shorten the template and potentially make it easier to manage:

   {% set ts = ['sensor.filtered_hp_outside_temperature', 'sensor.filtered_mvhr_outside_temperature', 'sensor.filtered_shed_outside_temperature'] | map('states') | map('float', 0) | list %}
   {% set avs = [(ts[1], ts[2]), (ts[0], ts[2]), (ts[0], ts[1])] | map('average') | list %}
   {{ avs[ts.index(ts|max)] }}

That will give the wrong answer if the:

My solution above should deal with any single sensor going spuriously low or high.
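To make that concrete, here is a quick Python sketch of the discard-the-highest rule on its own. The function name `discard_highest_average` and the readings are made up for illustration; the point is only how the rule behaves when the bad reading is low rather than high.

```python
def discard_highest_average(readings):
    """Drop the single highest reading and average the rest."""
    ordered = sorted(readings)
    kept = ordered[:-1]
    return sum(kept) / len(kept)


# One sensor reading high (e.g. sun on it): the high value is dropped.
print(discard_highest_average([8, 9, 12]))  # 8.5

# One sensor spuriously reporting zero: the zero is averaged in.
print(discard_highest_average([8, 0, 9]))   # 4.0
```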

Hi Troon, you are of course correct and I did test it for a while. However, when there are no spurious readings I still want to ignore the highest one (as there are 2 that occasionally read high because of the sun, never at the same time). This was my attempt at achieving that. I was hoping those filters would remove the spurious readings, but I can’t see how to test it other than letting it run.
Any other suggestions?
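One possible way to get both behaviours, sketched in Python so it can be tried against made-up readings before anything goes live: first drop any reading that sits too far from the median of the three, then average the two lowest of whatever is left. The name `robust_low_average` and the 5 degree `max_deviation` threshold are assumptions for illustration, not something from the filter integration, and the threshold would need tuning to the real sensors.

```python
from statistics import median


def robust_low_average(readings, max_deviation=5.0):
    """Drop readings far from the median, then average the two lowest survivors."""
    centre = median(readings)
    plausible = sorted(r for r in readings if abs(r - centre) <= max_deviation)
    lowest = plausible[:2]  # at most two; the median itself always survives
    return sum(lowest) / len(lowest)


print(robust_low_average([8, 9, 12]))   # 8.5 (sun-loaded 12 ignored)
print(robust_low_average([8, 0, 9]))    # 8.5 (spurious 0 ignored)
print(robust_low_average([8, 23, 9]))   # 8.5 (spurious 23 ignored)
print(robust_low_average([0, 0.5, 1]))  # 0.25 (a genuine near-zero day is kept)
```

Feeding a list of past readings through a function like this is also a cheap way to see how a rule would have coped with earlier glitches, without waiting for the next one.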

neat, I’ll try that