Scraping HTML Page

dermeb · February 12, 2017, 11:40am

Hello,
I’m having a hard time getting information from a website.
I’m new to the templating business and can’t find a good explanation of how to use the split function.

What I’m trying to do is get the percentage value from the attribute style="margin-left:3%" (in this case obviously 3) inside the element with the class “crowd-level-tag crowd-level-pointer” on https://www.boulderwelt-muenchen-ost.de/
Background: It’s the estimated occupancy rate of my boulder place.

  - platform: scrape
    resource: https://www.boulderwelt-muenchen-ost.de/
    name: BW Ost
    select: ".crowd-level-pointer"
    value_template: '{{ value.split(??) }}'
    unit_of_measurement: '%'

With the selector above I get the following output:

<div class="crowd-level-tag crowd-level-pointer"> <img src="https://www.boulderwelt-muenchen-ost.de/wp-content/plugins/cxo-crowd-level//resources/img/pointer.png" style="margin-left:3%"/>

So I’m nearly there, but I can’t manage to extract the needed number from this line.
If you can point me in the right direction, it would be much appreciated!

Thanks in advance
Michael
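
On the split question: Jinja2 templates can call Python string methods, so the logic can be tried out in plain Python first. The snippet below is only a sketch. It assumes the style value has the form shown in the post (margin-left:3%), and percent_from_style is a made-up helper name, not part of Home Assistant.

```python
def percent_from_style(style_value):
    """Return the number between ':' and '%' as a string.

    Hypothetical helper; assumes input such as "margin-left:3%".
    """
    # "margin-left:3%" -> ["margin-left", "3%"], keep the part after the colon
    after_colon = style_value.split(":")[1]
    # "3%" -> ["3", ""], keep the part before the percent sign
    return after_colon.split("%")[0].strip()


print(percent_from_style("margin-left:3%"))
```

The same chain, value.split(':')[1].split('%')[0], would be the shape of the expression inside value_template, provided the sensor state actually contains the style string.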

covrig · February 12, 2017, 1:54pm

I would create a new template sensor that reads your scrape sensor. Using Jinja2 filters, you can extract and format the value you need.

This might help you:

dermeb · February 14, 2017, 8:57pm

Thank you, I will look into that!

But I stumbled upon another problem.
I only get the output above when I follow the steps of the Jupyter Notebook by hand.
In Home Assistant, the sensor state is empty.
Is it filtered somehow?

Thanks
Michael

dermeb · February 14, 2017, 9:32pm

I got it working.
I ditched the scrape sensor and used the command line sensor to read the value from a Python script instead.

chutoro · February 16, 2017, 4:33pm

Any chance you could share the code for that? I’m wading through Beautiful Soup at the moment trying to do just that.
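
The script from the command line approach was not posted in the thread, so the following is not dermeb’s code. It is a rough sketch of what such a script might look like, using only the Python standard library instead of Beautiful Soup. It assumes the page still contains an img with a margin-left style inside the .crowd-level-pointer element, as in the output quoted earlier. The names PointerStyleParser, extract_percent and fetch_page are invented for this example.

```python
import re
import sys
from html.parser import HTMLParser
from urllib.request import urlopen


class PointerStyleParser(HTMLParser):
    """Remember the style of the first <img> inside a .crowd-level-pointer element."""

    def __init__(self):
        super().__init__()
        self.pointer_tag = None  # tag name of the open pointer element, if any
        self.style = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "crowd-level-pointer" in classes:
            self.pointer_tag = tag
        elif self.pointer_tag and tag == "img" and self.style is None:
            self.style = attrs.get("style") or ""

    def handle_endtag(self, tag):
        # Simplification: the first matching end tag closes the pointer element.
        if tag == self.pointer_tag:
            self.pointer_tag = None


def extract_percent(html):
    """Return the margin-left percentage as a string, or None if not found."""
    parser = PointerStyleParser()
    parser.feed(html)
    if not parser.style:
        return None
    match = re.search(r"margin-left\s*:\s*(\d+(?:\.\d+)?)\s*%", parser.style)
    return match.group(1) if match else None


def fetch_page(url):
    """Download the page and decode it as UTF-8 (assumed encoding)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 crowd_level.py https://www.boulderwelt-muenchen-ost.de/
    value = extract_percent(fetch_page(sys.argv[1]))
    print(value if value is not None else "unknown")
```

Saved under a name such as crowd_level.py (hypothetical), the script prints a single value to standard output, which is what a command line sensor reads as its state.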