Long shot: scraping a site into a list

Only hoping for input…
I have a website (below) that provides WM(T)S parameters, which I would like to use in a list/helper.
As this content changes over time, I would rather not copy it manually…
Does anyone have experience with this? Is it ‘just’ scraping and templating?

Services Géoplateforme de diffusion | Géoservices (ign.fr)
… as an example, the section in grey, “Données du Service d’images tuilées WMTS” (“Data of the WMTS tiled image service”)

I assume you’re looking for the data from the GET links on the page? If you request one with GET, the response will be XML, which then needs to be parsed. I use Node-RED and I don’t know how to do this in HA. It’s fairly simple in Node-RED.

For example, this link appears to return the same WMTS data shown at the bottom of the page.

https://data.geopf.fr/wmts?SERVICE=WMTS&VERSION=1.0.0&REQUEST=GetCapabilities


Hey, thanks for responding and looking into it… it is indeed that XML, and the objects I need are tagged like this:

<ows:Identifier>AREAMANAGEMENT.ZUS</ows:Identifier>

So now I only need to list them and put them in a helper… for the latter I found some code.
I put Node-RED aside over a year ago, when HA automations became more powerful (for my needs), but I will find a way… again, thanks!

Use a REST sensor to process the XML.

That’s a very long XML response though: you might have issues working with it.

Yeah… and since I need to get out all the elements that start with <ows: … I think I will have to use a Python script… or XQuery, but that is not installed in HA
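For the Python route, a minimal sketch could look like the following. It assumes the response uses the standard WMTS 1.0.0 and OWS 1.1 namespace URIs, and the function names `layer_identifiers` and `fetch_layer_identifiers` are made up for illustration. It reads only the `ows:Identifier` that is a direct child of each `Layer`, because styles and tile matrix sets carry their own `ows:Identifier` elements that would otherwise end up in the list.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Assumption: standard namespace URIs for a WMTS 1.0.0 capabilities document.
NS = {
    "wmts": "http://www.opengis.net/wmts/1.0",
    "ows": "http://www.opengis.net/ows/1.1",
}


def layer_identifiers(capabilities_xml):
    """Return the ows:Identifier text of every Layer under Contents.

    Accepts the capabilities document as str or bytes.
    """
    root = ET.fromstring(capabilities_xml)
    identifiers = []
    for layer in root.findall("wmts:Contents/wmts:Layer", NS):
        # Direct child only, so Style identifiers inside the Layer are skipped.
        ident = layer.find("ows:Identifier", NS)
        if ident is not None and ident.text:
            identifiers.append(ident.text.strip())
    return identifiers


def fetch_layer_identifiers(url):
    """Download a GetCapabilities document and list its layer identifiers."""
    with urlopen(url, timeout=30) as response:
        return layer_identifiers(response.read())
```

Calling `fetch_layer_identifiers("https://data.geopf.fr/wmts?SERVICE=WMTS&VERSION=1.0.0&REQUEST=GetCapabilities")` should then return the layer names as a plain Python list, which can be written to a file or pushed into a helper by whatever means you prefer.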

Using the RESTful integration (so under rest: in your config):

- resource: https://data.geopf.fr/wmts?SERVICE=WMTS&VERSION=1.0.0&REQUEST=GetCapabilities
  scan_interval: 3600
  sensor:
    - name: WMTS count
      value_template: "{{ value_json['Capabilities']['Contents']['Layer']|length }}"

    - name: WMTS 0
      value_template: "{{ value_json['Capabilities']['Contents']['Layer'][0]['ows:Identifier'] }}"
      availability: "{{ value_json['Capabilities']['Contents']['Layer']|length > 0 }}"

    - name: WMTS 1
      value_template: "{{ value_json['Capabilities']['Contents']['Layer'][1]['ows:Identifier'] }}"
      availability: "{{ value_json['Capabilities']['Contents']['Layer']|length > 1 }}"

    - name: WMTS 2
      value_template: "{{ value_json['Capabilities']['Contents']['Layer'][2]['ows:Identifier'] }}"
      availability: "{{ value_json['Capabilities']['Contents']['Layer']|length > 2 }}"

    - name: WMTS 9
      value_template: "{{ value_json['Capabilities']['Contents']['Layer'][9]['ows:Identifier'] }}"
      availability: "{{ value_json['Capabilities']['Contents']['Layer']|length > 9 }}"

    - name: WMTS 999
      value_template: "{{ value_json['Capabilities']['Contents']['Layer'][999]['ows:Identifier'] }}"
      availability: "{{ value_json['Capabilities']['Contents']['Layer']|length > 999 }}"

gives this:

You could use a script (template or offline) to generate the sensor configs in a loop automatically then paste into the config file.
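As a rough illustration of the offline variant, the sketch below prints one sensor entry per layer index, in the same shape as the hand-written entries above. The function name `sensor_config` and the count of 1000 in the usage note are placeholders, not anything prescribed by Home Assistant; pick a count comfortably above the value reported by the count sensor.

```python
# Path into the parsed response, as used in the hand-written sensor templates.
LAYERS = "value_json['Capabilities']['Contents']['Layer']"

# One sensor entry; doubled braces survive str.format() as Jinja's {{ and }}.
ENTRY = (
    "    - name: WMTS {i}\n"
    "      value_template: \"{{{{ {layers}[{i}]['ows:Identifier'] }}}}\"\n"
    "      availability: \"{{{{ {layers}|length > {i} }}}}\"\n"
)


def sensor_config(count):
    """Return YAML text with sensor entries for layer indices 0..count-1."""
    entries = [ENTRY.format(i=i, layers=LAYERS) for i in range(count)]
    # Each entry already ends in a newline, so this leaves a blank line between entries.
    return "\n".join(entries)
```

Running `print(sensor_config(1000))` and pasting the output under `sensor:` in the `rest:` block would replace the manual entries; indices beyond the real layer count simply show up as unavailable.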

Then you can interrogate the sensors to do things like pull out the available entities’ states:

{{ states['sensor']
   |selectattr('entity_id','contains','wmts_')
   |rejectattr('entity_id','contains','wmts_count')
   |rejectattr('state','==','unavailable')
   |map(attribute='state')
   |list }}

It’s not exactly the same list as on the website: “Besoin chaleur industriel” doesn’t appear on the site. It should give you enough to get started though.


I know you are a wizard…still beyond amazing at times
