I’m very new to this and have no experience with any programming since BASIC, dBase & Excel many years ago, but I am very keen to learn!
I’m running Home Assistant 2022.11.2
Supervisor 2022.10.2
Operating System 9.3
Frontend 20221108.0 - latest
My objective is to scrape various pieces of data on local tidal, wave & wind conditions from various websites and present them in a single dashboard.
My first attempt is only partly successful: from a website with tidal info I can get the name of the locality, but I’m failing to get the variable tidal data. I either get nothing or a large text block with the tidal height buried within it. I’ve tried a few other sites and have similar problems scraping specific data (in all cases I can get some simple text but not the varying numbers, as in the example below).
My configuration includes:
- name: Locality scraper
  resource: https://www.tidetimes.org.uk/appledore-tide-times
  scan_interval: 86400
  button:
    - unique_id: Loc_refresh
      name: Refresh local info
  sensor:
    - unique_id: Loc_place
      icon: mdi:pin
      name: Place
      select: "#left-col > p:nth-child(3) > b"
    - unique_id: Loc_update
      name: Latest data
      select: "#left-col > p:nth-child(3)"
The log file returns:
2022-11-16 11:51:13.718 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Locality scraper # New run: start (re)loading data from resource
2022-11-16 11:51:13.718 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Locality scraper # Rendered resource template into: https://www.tidetimes.org.uk/appledore-tide-times
2022-11-16 11:51:13.719 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Locality scraper # Request data from https://www.tidetimes.org.uk/appledore-tide-times
2022-11-16 11:51:13.720 DEBUG (MainThread) [custom_components.multiscrape.http] Locality scraper # Executing page-request with a get to url: https://www.tidetimes.org.uk/appledore-tide-times.
2022-11-16 11:51:13.969 DEBUG (MainThread) [custom_components.multiscrape.http] Locality scraper # Response status code received: 200
2022-11-16 11:51:13.971 DEBUG (MainThread) [custom_components.multiscrape.scraper] Locality scraper # Loading the content in BeautifulSoup.
2022-11-16 11:51:14.106 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Locality scraper # Data succesfully refreshed. Sensors will now start scraping to update.
2022-11-16 11:51:14.106 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Finished fetching multiscrape data in 0.389 seconds (success: True)
2022-11-16 11:51:14.108 INFO (MainThread) [homeassistant.components.sensor] Setting up sensor.multiscrape
2022-11-16 11:51:14.109 INFO (MainThread) [homeassistant.components.sensor] Setting up sensor.multiscrape
2022-11-16 11:51:14.109 INFO (MainThread) [homeassistant.components.button] Setting up button.multiscrape
2022-11-16 11:51:14.110 DEBUG (MainThread) [custom_components.multiscrape.sensor] Locality scraper # Place # Setting up sensor
2022-11-16 11:51:14.111 DEBUG (MainThread) [custom_components.multiscrape.sensor] Locality scraper # Latest data # Setting up sensor
2022-11-16 11:51:14.114 DEBUG (MainThread) [custom_components.multiscrape.sensor] Locality scraper # Place # Start scraping to update sensor
2022-11-16 11:51:14.122 DEBUG (MainThread) [custom_components.multiscrape.scraper] Locality scraper # Place # Tag selected: <b>Appledore Tidal Predictions</b>
2022-11-16 11:51:14.122 DEBUG (MainThread) [custom_components.multiscrape.scraper] Locality scraper # Place # Selector result: Appledore Tidal Predictions
2022-11-16 11:51:14.122 DEBUG (MainThread) [custom_components.multiscrape.scraper] Locality scraper # Place # Final selector value: Appledore Tidal Predictions
2022-11-16 11:51:14.122 DEBUG (MainThread) [custom_components.multiscrape.sensor] Locality scraper # Place # Selected: Appledore Tidal Predictions
2022-11-16 11:51:14.123 DEBUG (MainThread) [custom_components.multiscrape.entity] Locality scraper # Place # Icon template rendered and set to: mdi:pin
2022-11-16 11:51:14.123 DEBUG (MainThread) [custom_components.multiscrape.entity] Locality scraper # Place # Updated sensor and attributes, now adding to HA
2022-11-16 11:51:14.124 DEBUG (MainThread) [custom_components.multiscrape.sensor] Locality scraper # Latest data # Start scraping to update sensor
2022-11-16 11:51:14.132 DEBUG (MainThread) [custom_components.multiscrape.scraper] Locality scraper # Latest data # Tag selected: <p>
<b>Appledore Tidal Predictions</b><br/>
Here are the predicted tides for Appledore. Use the calendar to change the date view.
<br/>
Right now, the water height at Appledore is approximately 3.94m.
</p>
2022-11-16 11:51:14.133 DEBUG (MainThread) [custom_components.multiscrape.scraper] Locality scraper # Latest data # Selector result:
Appledore Tidal Predictions
Here are the predicted tides for Appledore. Use the calendar to change the date view.
Right now, the water height at Appledore is approximately 3.94m.
2022-11-16 11:51:14.133 DEBUG (MainThread) [custom_components.multiscrape.scraper] Locality scraper # Latest data # Final selector value:
Appledore Tidal Predictions
Here are the predicted tides for Appledore. Use the calendar to change the date view.
Right now, the water height at Appledore is approximately 3.94m.
2022-11-16 11:51:14.133 DEBUG (MainThread) [custom_components.multiscrape.sensor] Locality scraper # Latest data # Selected:
Appledore Tidal Predictions
Here are the predicted tides for Appledore. Use the calendar to change the date view.
Right now, the water height at Appledore is approximately 3.94m.
I am trying to get just the number right at the end of this, i.e. 3.94m in this instance. In Chrome, selecting the relevant number and choosing “Copy” does not offer “selector” unless the entire block is highlighted.
Having failed to be selective in what is scraped, I have tried following advice on this forum to extract the number from the string, but I just don’t understand the terminology or how to set up a template.
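For reference, here is a minimal sketch of the kind of template that advice seems to describe. It assumes multiscrape's `value_template` option on a sensor and Home Assistant's `regex_findall_index` template filter; the `unique_id` and `name` are made up for illustration, and it is untested against the live page.

```yaml
# Sketch only: a hypothetical extra entry for the existing `sensor:` list.
- unique_id: Loc_height          # hypothetical id
  name: Water height             # hypothetical name
  # Same selector as "Latest data": the whole paragraph is scraped as text
  select: "#left-col > p:nth-child(3)"
  # Keep only the digits and dot between "approximately " and the trailing "m"
  value_template: '{{ value | regex_findall_index("approximately ([0-9.]+)m") }}'
```

The idea is that `value` holds the scraped paragraph text, and the bracketed part of the pattern is the only piece kept, so the text shown in the log should reduce to the number alone.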
I’d be very grateful if anyone is willing to help an old dinosaur evolve into a modern human (or just show me what to do!).