I’ve been having a hard time trying to scrape a minimalist, dynamic, constantly updated webpage. Our local school closings are posted there and change very frequently. The page is basically one large table with no CSS classes, and due to the nature of the data, the nth position of any given entry shifts with the length of the list. I have been able to pull only the first value of the list by using just a plain “tr” or “td” under “select” in Scrape. I can also find the correct value if I know the proper nth value, which defeats the purpose.
How would I search/scrape this page to see if a particular school is on this list?
@imthefrizzlefry Thanks for your reply! The list is back (there were no snow days between posts to work on this until today). How do I define indexes? I had limited success with Multiscrape getting a select_list, but ran into the 255-character limit on state. I’m not sure how to set or poll an attribute. I’m just learning templating, with more error than success right now.
Turns out I just needed to make a binary sensor for each school and it worked! Then I set up an Automation that calls a scene when the binary sensor changes from Unavailable to Off. Thanks again! I’d still like to learn about calling/setting the indexes, though, if you have a moment.
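For anyone landing here later: the config behind the per-school sensors wasn't posted, so the following is only a rough sketch of one way to get a similar result. It assumes the core Scrape integration (which creates regular sensors, not binary sensors), and the URL, sensor name, and school name are all made-up placeholders. The idea is to select the whole table and let `value_template` reduce its text to on/off, so the position in the list no longer matters and the long table text should never be stored as the state.

```yaml
# configuration.yaml (sketch, not the original poster's config)
scrape:
  - resource: https://example.com/school-closings  # hypothetical URL
    sensor:
      - name: "Example Elementary closed"  # hypothetical sensor name
        # Grab the whole table instead of one row, so nth position is irrelevant
        select: "table"
        # 'value' is the text of the selected element; the state becomes on/off
        value_template: "{{ 'on' if 'Example Elementary' in value else 'off' }}"
```

One entry like this per school would be needed, each with its own name and search string.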
The above screenshot is the add sensor dialog in the UI for the scrape integration.
I don’t know if you are setting it up in the config file, but there is a pretty nice UI that makes it easy to define a resource (i.e., the page you are connecting to) and a sensor.
Looking at the documentation for the Scrape integration, I can see there is an index option as part of the sensor map, so the YAML would look something like this:
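A minimal sketch of a sensor that uses `index`, assuming a placeholder URL and sensor name (both hypothetical, so swap in your own). The index is zero-based and picks which match of the `select` selector to use; it defaults to 0, which would explain only ever getting the first row.

```yaml
# configuration.yaml (sketch with placeholder values)
scrape:
  - resource: https://example.com/school-closings  # hypothetical URL
    sensor:
      - name: "Closings row 3"  # hypothetical sensor name
        select: "tr"
        # Zero-based: 2 means the third <tr> matched by the selector
        index: 2
```

Because the list length changes, a fixed index will still drift between schools, so this is mostly useful when the row position is stable.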