I’m successfully scraping tide data, but I get wrong data on days when there are only 3 tides instead of 4. The website actually includes twice the number of tides for each day in the page (one set for BST and one for GMT). So on a 3-tide day my code, which always returns 4 sets of data, just returns the next row, even though that row is the (normally hidden) GMT version of the 1st tide. There is a class on each row which determines which set of tides is valid, but I’d need a condition to decide whether to scrape the last tide of the day. To complicate matters further, a second scrape is used to return the first couple of the next day’s tides.
My multiscrape config is this:
- name: Tides scraper
  resource: https://www.tidetimes.org.uk/appledore-tide-times
  scan_interval: 183600
  button:
    - unique_id: tide_data
      name: Tides data refresh
  sensor:
    - unique_id: tides_location
      name: Tides Location
      select: "#left-col > p:nth-child(3) > b"
      value_template: '{{ value.split(" Tidal")[-2] }}'
    - unique_id: tides_now
      name: Tide height now
      select: "#left-col > p:nth-child(3)"
      value_template: '{{ value.split("approximately ")[1].split("m")[-2]|float }}'
      state_class: measurement
      unit_of_measurement: m
    - unique_id: tide1
      name: Tide1
      select_list: "#tides > table > tr:nth-child(3) "
      value_template: '{{ value.split() }}'
    - unique_id: tide2
      name: Tide2
      select_list: "#tides > table > tr:nth-child(4) "
      value_template: '{{ value.split() }}'
    - unique_id: tide3
      name: Tide3
      select_list: "#tides > table > tr:nth-child(5) "
      value_template: '{{ value.split() }}'
    - unique_id: tide4
      name: Tide4
      select_list: "#tides > table > tr:nth-child(6) "
      value_template: '{{ value.split() }}'
- name: Next Tides scraper
  resource: https://www.tidetimes.org.uk/appledore-tide-times-{{((now() + timedelta(days=1)).date()).strftime("%Y%m%d")}}
  scan_interval: 183600
  sensor:
    - unique_id: tide5
      name: Tide5
      select_list: "#tides > table > tr:nth-child(3) "
      value_template: '{{ value.split() }}'
    - unique_id: tide6
      name: Tide6
      select_list: "#tides > table > tr:nth-child(4) "
      value_template: '{{ value.split() }}'
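For anyone who wants to check what the second resource URL resolves to, here is a rough Python equivalent of the template above. It is only a sketch: `BASE_URL` and `next_day_url` are names I made up for illustration, and the date passed in is an arbitrary example rather than a real scrape.

```python
from datetime import date, timedelta

# Base address taken from the config above; the name BASE_URL is illustrative.
BASE_URL = "https://www.tidetimes.org.uk/appledore-tide-times"


def next_day_url(today):
    """Build the next day's page URL in the same YYYYMMDD form as the template."""
    tomorrow = today + timedelta(days=1)
    return f"{BASE_URL}-{tomorrow.strftime('%Y%m%d')}"


# Arbitrary example date, chosen to show the rollover into a new year.
url = next_day_url(date(2024, 12, 31))
print(url)
```

Adding a `timedelta` to the date takes care of month and year boundaries, which is the same thing the template relies on.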
And the HTML being scraped (on a 4-tide day) looks like this:
<b>Appledore Tidal Predictions</b><br/>
Here are the predicted tides for Appledore. Use the calendar to change the date view.
<br/>
Right now, the water height at Appledore is approximately 4.63m.
</p>
<div class="clr"></div>
<div class="block" id="tides">
<table>
<thead>
<tr><th colspan="3">Tide Times
<div>
BST: <img id="bstmode" onclick="doAction('/toggle-bst.php');" src="/graphics/sw1.png" title="British Summer Time on/off"/></div>
</th></tr>
</thead>
<tr class="colhead">
<td class="tal">Hi/Lo</td>
<td class="tac">Time</td>
<td class="tar">Height</td>
</tr>
<tr class="vis2">
<td class="tal">High</td>
<td class="tac"><span>04:31</span></td>
<td class="tar">6.81m</td>
</tr>
<tr class="vis2">
<td class="tal">Low</td>
<td class="tac"><span>11:25</span></td>
<td class="tar">0.46m</td>
</tr>
<tr class="vis2">
<td class="tal">High</td>
<td class="tac"><span>17:03</span></td>
<td class="tar">6.97m</td>
</tr>
<tr class="vis2">
<td class="tal">Low</td>
<td class="tac"><span>23:48</span></td>
<td class="tar">0.42m</td>
</tr>
<tr class="vis0">
<td class="tal">High</td>
<td class="tac"><span>03:31</span></td>
<td class="tar">6.81m</td>
</tr>
<tr class="vis0">
<td class="tal">Low</td>
<td class="tac"><span>10:25</span></td>
<td class="tar">0.46m</td>
</tr>
<tr class="vis0">
<td class="tal">High</td>
<td class="tac"><span>16:03</span></td>
<td class="tar">6.97m</td>
</tr>
<tr class="vis0">
<td class="tal">Low</td>
<td class="tac"><span>22:48</span></td>
<td class="tar">0.42m</td>
</tr>
<tr><td class="tac" colspan="3" style="font-size:11px; padding-top:10px;">NOT TO BE USED FOR NAVIGATION</td></tr>
The <tr class="vis2"> rows are the ones I need.
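To make the goal concrete, here is a minimal Python sketch of the filtering I am after: keep only the rows whose class is `vis2`, however many there are, instead of counting rows by position. It uses only the standard library. `VisibleTideParser`, `visible_tides` and `SAMPLE_3_TIDE` are hypothetical names, and the sample is a made-up 3-tide day built from the values above, not real output from the site. This is not how multiscrape itself works, just an illustration of the logic.

```python
from html.parser import HTMLParser


class VisibleTideParser(HTMLParser):
    """Collect the cell text of every <tr> whose class list contains row_class."""

    def __init__(self, row_class="vis2"):
        super().__init__()
        self.row_class = row_class
        self.rows = []          # one list of cell strings per matching row
        self._in_row = False    # True while inside a matching <tr>
        self._in_cell = False   # True while inside a <td> of a matching row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            classes = (dict(attrs).get("class") or "").split()
            self._in_row = self.row_class in classes
            if self._in_row:
                self.rows.append([])
        elif tag == "td" and self._in_row:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            self._in_row = False

    def handle_data(self, data):
        # Text inside nested tags such as <span> still belongs to the open cell.
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def visible_tides(html, row_class="vis2"):
    """Return [[hi_lo, time, height], ...] for the rows marked with row_class."""
    parser = VisibleTideParser(row_class)
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical 3-tide day: three visible (vis2) rows, then the hidden (vis0) copies.
SAMPLE_3_TIDE = """
<table>
<tr class="colhead"><td class="tal">Hi/Lo</td><td class="tac">Time</td><td class="tar">Height</td></tr>
<tr class="vis2"><td class="tal">High</td><td class="tac"><span>04:31</span></td><td class="tar">6.81m</td></tr>
<tr class="vis2"><td class="tal">Low</td><td class="tac"><span>11:25</span></td><td class="tar">0.46m</td></tr>
<tr class="vis2"><td class="tal">High</td><td class="tac"><span>17:03</span></td><td class="tar">6.97m</td></tr>
<tr class="vis0"><td class="tal">High</td><td class="tac"><span>03:31</span></td><td class="tar">6.81m</td></tr>
<tr class="vis0"><td class="tal">Low</td><td class="tac"><span>10:25</span></td><td class="tar">0.46m</td></tr>
<tr class="vis0"><td class="tal">High</td><td class="tac"><span>16:03</span></td><td class="tar">6.97m</td></tr>
</table>
"""

tides = visible_tides(SAMPLE_3_TIDE)
for hi_lo, time, height in tides:
    print(hi_lo, time, height)
```

On the sample this gives three rows and never touches the `vis0` copies, which is the behaviour I would like from the scraper on a 3-tide day.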