How would you scrape a site which is basically a .csv file?


I am trying to create a sensor with a time plot of my local lake's ice thickness. The thickness is updated at irregular intervals.

The data is available here

So it is a very basic HTML page containing only plaintext values. Can I use the Scrape sensor to deal with this?

So far I have tried:

  - platform: scrape
    name: ice_thickness
    unit_of_measurement: cm
    select: ".18.03.2021;40"
    value_template: >
        {{ value.split(';')[2] }}

I guess my biggest issue is, what to use as a “select” parameter to select the whole text in the page?

Any ideas?

I’m not sure a scrape sensor can do what you need. It depends on HTML tags, and it looks like the data returned over the HTTP connection is just the raw dates and depths.

Here’s a command line sensor* that can do the job:

  - platform: command_line
    name: lake_ice_thickness
    unit_of_measurement: 'cm'
    command: "curl -s | head -1"
    value_template: '{{ value.split(";")[1] }}'

It pulls the data (curl with the -s flag, which suppresses curl's progress output), keeps only the first line (head -1), then splits that line on the semicolon and keeps the depth, discarding the date.

*Note: Requires access to the curl and head Linux utilities (I run Home Assistant in a venv on Raspbian, so I’m not sure if these are easily accessible in Docker or other installs).
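
To make the steps explicit, here is a rough Python sketch of the parsing that the command and template perform. It is only an illustration, not something Home Assistant runs: the function name `latest_thickness` and the sample text are made up, and it assumes every line has the form `date;thickness` with the newest reading first.

```python
def latest_thickness(raw_text: str) -> float:
    """Return the thickness from the first line of 'date;thickness' text."""
    # Roughly what `head -1` does: keep only the first line.
    first_line = raw_text.strip().splitlines()[0]
    # Same idea as value.split(";")[1]: drop the date, keep the depth.
    return float(first_line.split(";")[1])


# Hypothetical sample in the assumed format (not real measurements).
sample = "18.03.2021;40\n11.03.2021;38\n"
print(latest_thickness(sample))
```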

Thank you for the reply. I actually tried using curl before, and I am able to output the values. I will give your method a try as well. Would you happen to know if it is possible to plot the whole ice thickness history, not only the most recent value?

My current solution:

Sorry, off the top of my head, I don’t know how to convince Home Assistant to add past values; it only records updates as they happen. The sensor I posted above should update the next time the thickness is reported, and that value will be plotted on the graph. This also assumes that Home Assistant doesn’t purge the database before then (the HA default is to keep only 10 days of data). So, if you want to look at past updates, Home Assistant might not be the best tool for this job. I personally might use a Google Spreadsheet and some Apps Script to pull and chart the data instead. There are probably other good options as well.
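
Whatever tool ends up doing the charting, the full history should be easy to parse first. A minimal sketch in Python, assuming each line looks like `DD.MM.YYYY;thickness` (the format is a guess based on the values in this thread, and `parse_history` and the sample data are hypothetical):

```python
from datetime import datetime


def parse_history(raw_text: str) -> list[tuple[datetime, float]]:
    """Parse 'DD.MM.YYYY;thickness' lines into (date, thickness) pairs, oldest first."""
    readings = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        date_text, thickness_text = line.split(";")[:2]
        readings.append(
            (datetime.strptime(date_text, "%d.%m.%Y"), float(thickness_text))
        )
    # Sort by date so the series is ready for plotting.
    return sorted(readings)


# Hypothetical sample in the assumed format (not real measurements).
sample = "18.03.2021;40\n11.03.2021;38\n"
for day, thickness in parse_history(sample):
    print(day.date(), thickness)
```

The resulting list of pairs can then be handed to a spreadsheet export or any plotting tool.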

Thank you, I will start looking at other options. Maybe I will plot the data in Octave and then just show the plot as an image in Home Assistant.