Can I replace REST sensors with local file data?

Can anyone steer me in the right direction please? I’m using the UKHO API and REST sensors to get the current local tidal height and the tidal events (high & low tide times & heights) for the next week. All’s well with how I do this but I’ve observed that there are some significant errors in their tidal forecast, which they accept. They have now provided me with new data and asked me to help validate it. As part of this I’d like to try substituting it into a copy of my tides dashboard. The test data consists of 2 csv files, one has nearly 53,000 pairs of <datetime><height> (every 10mins for a year), the other about 1,400 <datetime><height> pairs (1 per tide). I have two questions:

  • can I do something to these two datasets that allows me to interrogate them with my REST code (or something derived from it)?
  • failing that, what’s the best way of looking up data from them? I’ve read about SQL but it looks daunting to replicate what I’m doing with REST sensors. The FILE sensor clearly won’t work here.

This is an example of what I’m trying to replicate. The first gets the current height and what it will be in 90 mins.

  - resource_template: https://admiraltyapi.azure-api.net/uktidalapi-premium/api/V2/Stations/0536/TidalHeights?StartDateTime={{ utcnow().strftime("%Y-%m-%d %H:%M:00Z") }}&EndDateTime={{ (utcnow()+timedelta(minutes=91)).strftime("%Y-%m-%dT%H:%MZ") }}&IntervalInMinutes=10
    scan_interval: 120000
    headers:
      Ocp-Apim-Subscription-Key: my_licensed_key
      Cache-Control: no-cache

    sensor:
      - name: "UKTideNow"
        unique_id: "uktidenow"
        value_template: "{{ (value_json|first).Height|float(0) }}"
        state_class: measurement
        unit_of_measurement: m
      - name: "UKTideLater"
        unique_id: "uktidetlater"
        value_template: "{{ (value_json|last).Height|float(0) }}"
        state_class: measurement
        unit_of_measurement: m
      - name: "UKTideNowTime"
        unique_id: "uktidenowtime"
        value_template: "{{ (value_json|first).DateTime }}"
      - name: "UKTideLaterTime"
        unique_id: "uktidelatertime"
        value_template: "{{ (value_json|last).DateTime }}"

The 2nd shows how I get (one of) the events. I get a week of tidal events via this section.

  - resource: https://admiraltyapi.azure-api.net/uktidalapi/api/V1/Stations/0536/TidalEvents?duration=7
    scan_interval: 43200
    headers:
      Ocp-Apim-Subscription-Key: my_license_key

    sensor:
      - name: "UKTide 1 type"
        unique_id: "UKtide1type"
        value_template: "{{ value_json[0]['EventType']|replace('Water','') }}"
      - name: "tide1"
        unique_id: "tide1"
        value_template: "{{ as_datetime(value_json[0]['DateTime']+'+00:00') }}"
        device_class: timestamp
        icon: mdi:clock-time-twelve
      - name: "tide1h"
        unique_id: "tide1h"
        value_template: "{{ value_json[0]['Height']|float(0) }}"
        state_class: measurement
        unit_of_measurement: m
        icon: >
         {% if states('sensor.uktide_1_type')=="High" %}
           mdi:arrow-collapse-up
         {% else %}
           mdi:arrow-collapse-down
         {% endif %}

Any ideas?

If you want to do local REST queries, I believe you need a custom web application that can interpret the GET parameters (after the question mark in the URI) to scan the file and return the right information. If you’re comfortable writing code, you could probably write a short Python daemon to do this, or maybe a PHP script running in an HTTP server.
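As a sketch of that first idea, here’s a minimal Python daemon using only the standard library. Everything specific in it is an assumption for illustration: the file name NewTideHts.csv, the “DD/MM/YYYY HH:MM,height” row layout, and the UKHO-style StartDateTime/EndDateTime query parameters.

```python
# Minimal local stand-in for the UKHO TidalHeights endpoint (a sketch,
# not a tested implementation).  Assumptions: file name, comma-separated
# "DD/MM/YYYY HH:MM,height" rows, ISO-format query parameters.
import json
from datetime import datetime
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

CSV_FILE = "NewTideHts.csv"       # assumed file name
ROW_FORMAT = "%d/%m/%Y %H:%M"     # assumed timestamp layout in the file

def filter_rows(lines, start, end):
    """Return [{'DateTime': ..., 'Height': ...}] for rows in [start, end]."""
    out = []
    for line in lines:
        stamp, _, height = line.strip().partition(",")
        when = datetime.strptime(stamp, ROW_FORMAT)
        if start <= when <= end:
            out.append({"DateTime": when.isoformat(), "Height": float(height)})
    return out

class TideHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Interpret the GET parameters after the question mark,
        # mirroring the UKHO query-string names.
        qs = parse_qs(urlparse(self.path).query)
        start = datetime.fromisoformat(qs["StartDateTime"][0])
        end = datetime.fromisoformat(qs["EndDateTime"][0])
        with open(CSV_FILE) as f:
            body = json.dumps(filter_rows(f, start, end)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To run: HTTPServer(("0.0.0.0", 8000), TideHandler).serve_forever()
```

A REST sensor could then point at something like http://localhost:8000/heights?StartDateTime=...&EndDateTime=... in the same way your resource_template points at the UKHO API.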

If you include a few lines of the file we might be able to help further with the 2nd question. Is the file CSV or XML formatted? If it’s the latter, you might be able to retrieve the entire file every time, and set up a value_template to query the fields previously filtered by the GET parameters.

Thanks for the reply. Both files are in very simple CSV format, but I can convert them to whatever works, e.g. JSON.
Any suggestions where I can read up on your first suggestion?
The next few days events data looks like this:

20/08/2024 00:34	0.285
20/08/2024 06:11	7.168
20/08/2024 12:54	0.240
20/08/2024 18:25	7.626
21/08/2024 01:26	-0.062
21/08/2024 06:52	7.419
21/08/2024 13:41	-0.037
21/08/2024 19:07	7.847
22/08/2024 02:12	-0.269
22/08/2024 07:32	7.465
22/08/2024 14:23	-0.115
22/08/2024 19:48	7.877
23/08/2024 02:53	-0.279
23/08/2024 08:10	7.324
23/08/2024 15:01	0.037
23/08/2024 20:28	7.697

The first file is the same but with heights every 10 mins.
In the 2nd REST sensor, the API also returns “HighWater” or “LowWater” for each event, but this is absent in the test data set.

You could convert the file to JSON — in the right format — and do something similar to the examples on the HA Docs for REST sensor (e.g. at the bottom) but that would be a little tedious for a very large dataset.
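If you go the conversion route, it could be scripted. A rough Python sketch (the alternating high/low inference is an assumption based on your sample rows, since the test set carries no event type):

```python
# One-off converter sketch: turn tab-separated "DD/MM/YYYY HH:MM<TAB>height"
# event rows into the JSON array shape the REST sensor templates expect.
# EventType is inferred by comparing each height with its neighbour --
# an assumption that events strictly alternate high/low, as the sample shows.
import csv
import json

def events_to_json(tsv_text):
    rows = [r for r in csv.reader(tsv_text.splitlines(), delimiter="\t") if r]
    events = []
    for i, (stamp, height) in enumerate(rows):
        h = float(height)
        # Compare with the previous row (or the next one for the first row).
        neighbour = float(rows[i - 1][1]) if i else float(rows[1][1])
        events.append({
            "DateTime": stamp,
            "Height": h,
            "EventType": "High" if h > neighbour else "Low",
        })
    return json.dumps(events, indent=2)
```

The DateTime strings are kept exactly as they appear in the file; reformatting them to ISO would be a one-line change in the loop if needed.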

You might consider using the command-line sensor to execute a grep | awk command against your file — using templates for the date fields — and return the proper value. I think the file needs to be in your configuration directory for this to work. A (very rough) example:

command_line:
  - sensor:
      name: Tide height
command: "grep '{{ utcnow().strftime('%Y-%m-%d %H:%M:00Z') }}' tidefile.csv | awk '{print($2)}'"
      unit_of_measurement: m
      etc...

Edit: on review, that only works if the entry is the exact date and time; if this executes every minute you’d have to return a non-zero exit code so that no value is saved in between readings.
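One way to get that non-zero exit code is to swap the grep | awk pair for a small script. A hypothetical Python stand-in (the file path and the comma-separated row layout are assumptions):

```python
#!/usr/bin/env python3
# Sketch: print the height for an exact timestamp match and exit 0;
# exit 1 when no row matches, so the command_line sensor keeps its last
# value instead of storing an empty state in between readings.
# Assumptions: file path and "DD/MM/YYYY HH:MM,height" row layout.
import sys

def lookup(lines, stamp):
    """Return the height string for an exact timestamp match, else None."""
    for line in lines:
        when, _, height = line.strip().partition(",")
        if when == stamp:
            return height
    return None

def main():
    with open("/config/www/NewTideHts.csv") as f:
        height = lookup(f, sys.argv[1])
    if height is None:
        sys.exit(1)   # non-zero exit: sensor state is left unchanged
    print(height)

# invoked as: tide_lookup.py "20/08/2024 00:30"
```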

Thanks, that’s my project for tomorrow then! I’ll post back with the results.

I’ve made progress with getting a value from the file that lists the 10-minute heights, like this.
First I create a trigger sensor that only updates on the 10-minute intervals:

  - trigger:
    - platform: time_pattern
      minutes: 10
    sensor:
      - name: tenmins
        unique_id: tenmins
        state: "{{ utcnow().strftime('%d/%m/%Y %H:%M') }}"

Then I created this based on your suggestion:

command_line:
  - sensor:
      name: NewTideheight
      unique_id: newtideheight
      command: " grep '{{ states('sensor.tenmins') }}' /config/www/NewTideHts.csv | awk '{print($2)}' "
      value_template: '{{value.split(",")[1]}}'
      scan_interval: 600
      state_class: measurement
      unit_of_measurement: m

(Somehow the date formatting changed when I converted the whole file).
I’m not sure why awk '{print($2)}' split the output at the space rather than at the comma delimiter (whitespace is awk’s default field separator unless you pass -F,), but adding a value_template fixed it anyway.
I’d like to add the time tenmins as an attribute but can’t see how.

It took my aging brain far too long to resolve your great suggestion - for which many thanks. I’ll now look at what you suggest about using the REST sensor to parse the smaller file in JSON format.

Actually it looks like that trigger sensor isn’t updating on the 10-minute mark. It should be:

- trigger:
    - platform: time_pattern
      minutes: "/10"

EDITED to correct file location:
I’m not having any success using REST to read the JSON file though. The sensors are not created and the error message in the log says
Timeout while fetching data: http://192.1.64:8123/local/NewTidalEvents.json
I’ve tried using curl in the terminal window to test this, i.e.
curl -X http://192.1.64:8123/local/NewTidalEvents.json
but I get: Invalid method encountered: b'get /config/www/NewTidalEvents.json HTTP/1.1'
The problem seems to be in accessing a local file, but I can’t find a description of how to do this in the RESTful docs.
This is what I’ve got:
NewTidalEvents.json:

[
  {
    "DateTime": "20/08/2024 00:34",
    "Height": 0.285,
    "EventType": "Low"
  },
  {
    "DateTime": "20/08/2024 06:11",
    "Height": 7.168,
    "EventType": "High"
  },
  {
    "DateTime": "20/08/2024 12:54",
    "Height": 0.24,
    "EventType": "Low"
  },
  {
    "DateTime": "20/08/2024 18:25",
    "Height": 7.626,
    "EventType": "High"
  },
  {
    "DateTime": "21/08/2024 01:26",
    "Height": -0.062,
    "EventType": "Low"
  },
  {
    "DateTime": "21/08/2024 06:52",
    "Height": 7.419,
    "EventType": "High"
  },
  {
    "DateTime": "21/08/2024 13:41",
    "Height": -0.037,
    "EventType": "Low"
  },
  {
    "DateTime": "21/08/2024 19:07",
    "Height": 7.847,
    "EventType": "High"
  }
]

and the section from configuration.yaml

  - resource: http://192.1.64:8123/local/NewTidalEvents.json
    sensor:
      - name: "TestTide 1 type"
        unique_id: "Testtide1type"
        value_template: "{{ value_json[0]['EventType'] }}"
      - name: "Testtide1"
        unique_id: "Testtide1"
        value_template: "{{ as_datetime(value_json[0]['DateTime']+'+00:00') }}"
        device_class: timestamp
        icon: mdi:clock-time-twelve
      - name: "Testtide1h"
        unique_id: "Testtide1h"
        value_template: "{{ value_json[0]['Height']|float(0) }}"
        state_class: measurement
        unit_of_measurement: m

Looks like there might be a typo in your resource IP address.

Well spotted - I can’t believe I missed that for so long.
Thank you so much for your help.

To wrap up this thread, I hope it might be useful to some future reader to post my working solution using the command_line sensor. This gets a day’s worth of records (starting from now) from the csv file and extracts the first one where the height falls below a threshold. It also corrects for the record splitting on a comma rather than a space.

command_line:
  - sensor:
      name: NewTideNextRestrictionStart
      unique_id: NewTideNextRestrictionStart
      command: >
        grep -A144 '{{ states('sensor.tenmins') }}' /config/www/NewTideHts.csv | awk 'BEGIN {FS=","};{if ($2+3.68 < '{{ states('input_number.tide_low_restriction') }}') print($0)}' | awk 'BEGIN {FS=","};NR==1{print($1)}'
      scan_interval: 8640
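For anyone who’d rather not decode the awk, the same logic in rough Python (the 3.68 datum offset and the 144-row/24-hour window are copied from the command above; the “DD/MM/YYYY HH:MM,height” layout is as described earlier in the thread):

```python
# Sketch of the grep -A144 | awk pipeline above: starting from the row
# matching the current 10-minute stamp, scan the next 24 hours of rows
# and return the timestamp of the first whose height, corrected by the
# 3.68 datum offset, falls below the restriction threshold.
def next_restriction_start(lines, start_stamp, threshold, offset=3.68):
    rows = [line.strip().partition(",") for line in lines]
    stamps = [r[0] for r in rows]
    if start_stamp not in stamps:
        return None
    i = stamps.index(start_stamp)
    # grep -A144 keeps the matching row plus the 144 rows that follow it
    for stamp, _, height in rows[i : i + 145]:
        if float(height) + offset < threshold:
            return stamp
    return None
```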