REST sensor: eliminate bad data at the sensor

I use a REST sensor to pull data from a near-real-time feed of lake water levels (updated every 5 minutes, I believe). Every hour I request the last 10 days of data and then read the last record (level and date). Reasonable values would be between 330 m and 350 m. Most of the time everything is great. Sometimes, though, the sensor returns 0. (I assume the provider doesn’t answer the request: I always read the last record it sends, so even if it missed a sample I’d still read the previous record.) Sometimes the provider might be down for maintenance, and I’m sure there could be other reasons they don’t answer. Within the sensor, is there a way I can test the value and say, for example, ignore any value below 300?

I use this data for a number of analytical purposes, so it would be nice if I could cleanse it right at the source (the sensor) rather than filter it later.

###  SHUSWAP LAKE LEVEL ANALYSIS
#
- platform: rest
  name: shuswap_lake_crescent_bay_depth
  scan_interval: 3600
  resource_template: "https://vps267042.vps.ovh.ca/scrapi/station/08LE070/primarylevel/?startDate={{ (now() - timedelta(days=1)).strftime('%Y-%m-%d') }}&endDate={{ (now() + timedelta(days=10)).strftime('%Y-%m-%d') }}&resultType=history&key=-XXXTtPODmHSv4lai4_I"
  value_template: >
        {{ (value_json.message.history | last).value }}
#
#
- platform: rest        
  name: shuswap_lake_crescent_bay_reading_date
  scan_interval: 3600
  resource_template: "https://vps267042.vps.ovh.ca/scrapi/station/08LE070/primarylevel/?startDate={{ (now() - timedelta(days=1)).strftime('%Y-%m-%d') }}&endDate={{ (now() + timedelta(days=10)).strftime('%Y-%m-%d') }}&resultType=history&key=-XXXTtPODmHSv4lai4_I"
  value_template: >
        {{ (value_json.message.history | last).date }}

I think you should bring all of the data into the REST sensor. Then use a template sensor to parse that data and extract the values you need.

Note that you will want to store the data in an attribute of the REST sensor rather than in its state, to avoid the 255-character limit on state values.
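As a rough sketch of that idea (not tested against this feed): the REST sensor below keeps the whole history list in an attribute, and a template sensor picks out the last record and ignores implausible readings. The entity names shuswap_lake_raw and shuswap_lake_crescent_bay_depth_clean, the "OK" placeholder state, and the 300 m cutoff are assumptions for illustration, and the JSON path assumes the history list sits under message, as in the value templates above.

```yaml
# sensors.yaml: hypothetical "raw" sensor that stores the history as an attribute
- platform: rest
  name: shuswap_lake_raw
  scan_interval: 3600
  resource_template: "https://vps267042.vps.ovh.ca/scrapi/station/08LE070/primarylevel/?startDate={{ (now() - timedelta(days=1)).strftime('%Y-%m-%d') }}&endDate={{ (now() + timedelta(days=10)).strftime('%Y-%m-%d') }}&resultType=history&key=-XXXTtPODmHSv4lai4_I"
  json_attributes_path: "$.message"
  json_attributes:
    - history
  # The state is only a short placeholder; the data lives in the attribute.
  value_template: "OK"

# configuration.yaml: hypothetical template sensor that parses the attribute
template:
  - sensor:
      - name: shuswap_lake_crescent_bay_depth_clean
        state: >
          {% set history = state_attr('sensor.shuswap_lake_raw', 'history') %}
          {% set depth = (history | last).value | float(0) if history else 0 %}
          {{ depth if depth >= 300 else this.state }}
```

A long history makes for a large attribute, so a shorter request window may be worth considering with this approach.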

For a start, as both your sensors use the same resource, you should use the rest integration. This will populate both sensors with only one call to the resource:

configuration.yaml (not sensors.yaml)

rest:
  - resource_template: "https://vps267042.vps.ovh.ca/scrapi/station/08LE070/primarylevel/?startDate={{ (now() - timedelta(days=1)).strftime('%Y-%m-%d') }}&endDate={{ (now() + timedelta(days=10)).strftime('%Y-%m-%d') }}&resultType=history&key=-XXXTtPODmHSv4lai4_I"
    scan_interval: 3600
    sensor:
      - name: shuswap_lake_crescent_bay_depth
        value_template: >
          {{ (value_json.message.history | last).value }}
      - name: shuswap_lake_crescent_bay_reading_date
        value_template: >
          {{ (value_json.message.history | last).date }}

This may fix your problem, as you are now making only one call to the resource at a time and are less likely to hit a rate limit. If it does not, you can try filtering the output like this:

        value_template: >
          {% set depth = (value_json.message.history | last).value %}
          {{ this.state if depth == 0 else depth }}

this.state is the previous value of the sensor.
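The original question asked about ignoring anything below 300 rather than only an exact 0. A minimal variant of the filter above might look like the following. The 300 cutoff comes from the question, the float(0) conversion is an added assumption in case the value arrives as a string or is missing, and it relies on this.state in the same way the filter above does.

```yaml
# configuration.yaml: sketch only, not tested against this feed
rest:
  - resource_template: "https://vps267042.vps.ovh.ca/scrapi/station/08LE070/primarylevel/?startDate={{ (now() - timedelta(days=1)).strftime('%Y-%m-%d') }}&endDate={{ (now() + timedelta(days=10)).strftime('%Y-%m-%d') }}&resultType=history&key=-XXXTtPODmHSv4lai4_I"
    scan_interval: 3600
    sensor:
      - name: shuswap_lake_crescent_bay_depth
        value_template: >
          {# float(0) turns a missing or non-numeric value into 0 #}
          {% set depth = (value_json.message.history | last).value | float(0) %}
          {# 300 is the cutoff from the question; plausible levels are 330 m to 350 m #}
          {{ depth if depth >= 300 else this.state }}
```

An upper bound could be added the same way, for example `depth >= 300 and depth <= 360`, if spuriously high readings ever turn up.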

Thanks for the ideas Tom. I’ll give the filter a try. I get 200 reads a day and am only using 48 of them so I don’t think I am hitting a rate limit. I think it’s more likely the provider isn’t responding in time (I considered changing the default time limit, but I figured the default 10 seconds was sufficient if the provider was up and available.)
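For reference, the time limit mentioned here is the timeout option, in seconds. A sketch of raising it on the combined rest configuration, where 30 is an arbitrary example value rather than a recommendation:

```yaml
# configuration.yaml: sketch only; 30 seconds is an assumed example value
rest:
  - resource_template: "https://vps267042.vps.ovh.ca/scrapi/station/08LE070/primarylevel/?startDate={{ (now() - timedelta(days=1)).strftime('%Y-%m-%d') }}&endDate={{ (now() + timedelta(days=10)).strftime('%Y-%m-%d') }}&resultType=history&key=-XXXTtPODmHSv4lai4_I"
    scan_interval: 3600
    timeout: 30  # default is 10
    sensor:
      - name: shuswap_lake_crescent_bay_depth
        value_template: >
          {{ (value_json.message.history | last).value }}
      - name: shuswap_lake_crescent_bay_reading_date
        value_template: >
          {{ (value_json.message.history | last).date }}
```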

Do I need to move this from sensors.yaml to configuration.yaml in order to use the filter example you provided?

Thanks,
Ken

It’s not the daily rate limit you might be hitting. It’s the fact that the server may be busy responding to one of your sensors when the other tries to update.

Using the rest integration, which makes one resource call to populate both sensors, would alleviate this issue.

aha - thanks for clarifying. I will look to make that change.

I also thought I might reduce the data I request from 10 days to 1 to reduce the amount of data sent. 10 days is 2,880 records it needs to retrieve. I’m only taking the last value anyway.
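A sketch of what a narrower request window could look like. In the configuration above, startDate is already one day back and endDate is 10 days ahead, so this version only pulls endDate in to one day ahead. It assumes the endpoint accepts any startDate/endDate pair and still returns the latest record, which has not been checked against the provider.

```yaml
# configuration.yaml: sketch only; the one-day endDate offset is an assumption
rest:
  - resource_template: "https://vps267042.vps.ovh.ca/scrapi/station/08LE070/primarylevel/?startDate={{ (now() - timedelta(days=1)).strftime('%Y-%m-%d') }}&endDate={{ (now() + timedelta(days=1)).strftime('%Y-%m-%d') }}&resultType=history&key=-XXXTtPODmHSv4lai4_I"
    scan_interval: 3600
    sensor:
      - name: shuswap_lake_crescent_bay_depth
        value_template: >
          {{ (value_json.message.history | last).value }}
      - name: shuswap_lake_crescent_bay_reading_date
        value_template: >
          {{ (value_json.message.history | last).date }}
```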

For the filter you provided: if the provider takes, for example, a 6-hour maintenance outage, will it still work (after 6 failed reads), or does it only cover a single read failure (i.e. by taking the previous read’s values)?

It will keep the last value until a new valid (not 0) value is received.