I’m trying to use the scrape sensor to make a simple internet uptime monitor.
My modem (Arris/Motorola SB6141) has a pretty simple web page that displays the uptime at the bottom. If I inspect the page, the value is stored here:
<tbody>
<tr>
<th style="background-color: rgb(115, 107, 8);"><font color="#ffffff">Cable Modem Operation</font></th>
<th style="background-color: rgb(115, 107, 8);"><font color="#ffffff">Value</font> </th></tr>
<tr>
<td>Current Time and Date</td>
<td>Apr 03 2018 07:26:40</td></tr>
<tr>
<td>System Up Time</td>
<td>0 days 11h:44m:5s</td></tr>
</tbody>
If I am understanding correctly, that should get the value of the 21st field. Maybe?
Instead I get this:
2018-04-03 08:20:45 ERROR (MainThread) [homeassistant.components.sensor] scrape: Error on device update!
Traceback (most recent call last):
  File "/usr/src/app/homeassistant/helpers/entity_platform.py", line 188, in _async_add_entity
    await entity.async_device_update(warning=False)
  File "/usr/src/app/homeassistant/helpers/entity.py", line 327, in async_device_update
    yield from self.hass.async_add_job(self.update)
  File "/usr/local/lib/python3.6/asyncio/futures.py", line 327, in __iter__
    yield self  # This tells Task to wait for completion.
  File "/usr/local/lib/python3.6/asyncio/tasks.py", line 250, in _wakeup
    future.result()
  File "/usr/local/lib/python3.6/asyncio/futures.py", line 243, in result
    raise self._exception
  File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/app/homeassistant/components/sensor/scrape.py", line 120, in update
    value = raw_data.select(self._select)[0].text
IndexError: list index out of range
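For anyone reading the traceback, the last two lines are the useful part: `raw_data.select(self._select)` came back with no matches, so `[0]` had nothing to index. Below is a minimal sketch of that failure in plain Python. It assumes nothing about the sensor config; `matches` is a plain list of strings standing in for whatever `select()` returned, and `first_text` is a made-up helper for illustration, not something in Home Assistant.

```python
def first_text(matches):
    """Return the first matched string, or None when nothing matched.

    Hypothetical helper; `matches` stands in for the list that
    select() returns in scrape.py.
    """
    if not matches:
        return None
    return matches[0]


matches = []  # zero matches, which is what the traceback implies

try:
    value = matches[0]  # same indexing pattern as scrape.py line 120
except IndexError as err:
    error_message = str(err)

guarded = first_text(matches)  # None instead of an exception

print(error_message)
print(guarded)
```

So the error does not say the value was parsed wrongly; it says the selector matched nothing at all in the HTML the sensor received.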
Sorry, I should have been clearer. That is just the section of the page that has the uptime. I didn’t want to crowd the thread with the full code, but I can post it if it helps. When I searched, it was the 21st instance of <td> out of 23.
It’s pretty hard to get scrape sensors right on the first go. My approach is to just spam them - create like 12 of them from 1-12 just to get my bearings and see how the scraper is reading the page, then I work from there.
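The same "get your bearings" idea can be tried offline before creating a pile of sensors. This is only a rough sketch using Python's standard `html.parser`, not the parser setup Home Assistant itself runs; `CellCollector` and `list_cells` are names invented here, and the sample HTML is the snippet from the first post. It lists every `td` with its position so you can see which index holds the value you want.

```python
from html.parser import HTMLParser

# Rows copied from the modem page snippet in the first post.
SAMPLE = """
<tbody>
<tr>
<th style="background-color: rgb(115, 107, 8);"><font color="#ffffff">Cable Modem Operation</font></th>
<th style="background-color: rgb(115, 107, 8);"><font color="#ffffff">Value</font> </th></tr>
<tr>
<td>Current Time and Date</td>
<td>Apr 03 2018 07:26:40</td></tr>
<tr>
<td>System Up Time</td>
<td>0 days 11h:44m:5s</td></tr>
</tbody>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> in document order (<th> is skipped)."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Only keep text that sits inside a <td>.
        if self._in_td:
            self.cells[-1] += data


def list_cells(html):
    """Return the stripped text of each <td> found in `html`."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


cells = list_cells(SAMPLE)
for index, text in enumerate(cells):
    print(index, repr(text))
```

Note that Python counts from 0, so the uptime value in this snippet is `cells[3]`, the fourth `td`. To run it against the real page, feed it the HTML as the sensor would receive it rather than what the browser inspector shows.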
I replaced x with the numbers 1 through 23, just to see if I got anything, and ended up with 23 copies of the same warning message. So now I’m at a complete loss. Here is the full page source, in case anyone can tell me why I’m an idiot.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0037)indexdata.html -->
<HTML><HEAD>
<META content="text/html; charset=windows-1252" http-equiv=Content-Type>
<META content=no-cache http-equiv=Pragma>
<META content="Wed, 30 Apr 1975 02:00:00 GMT" http-equiv=Expires>
<META content="Microsoft FrontPage 4.0" name=GENERATOR>
<script language="JavaScript" src="utility.js" type="text/javascript">
</script>
</HEAD>
<BODY aLink=#7b2939 link=#485a91 text=#000000 vLink=#7b2939 onload="onloadmainpage()">
<script language="javascript" type="text/javascript">
var infoText = 'This page provides information about the startup \
process of the Cable Modem. If there is a problem with the startup, the \
word "Failed" may appear in the Status column. Should this occur, visit \
the Help area and perform the Checkup procedures listed there. If the \
problem continues, click on the word "Failed" for more detailed \
information about the failure, or call your service provider for \
assistance.';
document.write(displayHeader("cm","cmStatus",infoText));
</script>
<CENTER>
<TABLE align=center border=1 cellPadding=8 cellSpacing=0 width="100%">
<TBODY>
<TR>
<TH><FONT color=#ffffff>Task</FONT></TH>
<TH><FONT color=#ffffff>Status</FONT> </TH></TR>
<TR>
<TD>DOCSIS Downstream Channel Acquisition</TD>
<TD>Done</TD></TR>
<TR>
<TD>DOCSIS Ranging</TD>
<TD>Done</TD></TR>
<TR>
<TD>Establish IP Connectivity using DHCP</TD>
<TD>Done</TD></TR>
<TR>
<TD>Establish Time Of Day</TD>
<TD>Done</TD></TR>
<TR>
<TD>Transfer Operational Parameters through TFTP</TD>
<TD>Done</TD></TR>
<TR>
<TD>Register Connection</TD>
<TD>Done</TD></TR>
<TR>
<TD>Cable Modem Status</TD>
<TD>Operational</TD></TR>
<TR>
<TD>Initialize Baseline Privacy</TD>
<TD>Done</TD></TR>
</TBODY>
</TABLE>
</CENTER>
<P></P>
<TABLE align=center border=1 cellPadding=8 cellSpacing=0 width="100%">
<TBODY>
<TR>
<TH><FONT color=#ffffff>Cable Modem Operation</FONT></TH>
<TH><FONT color=#ffffff>Value</FONT> </TH></TR>
<TR>
<TD>Current Time and Date</TD>
<TD>Apr 03 2018 08:59:38</TD></TR>
<TR>
<TD>System Up Time</TD>
<TD>0 days 0h:27m:48s</TD></TR>
</TBODY>
</TABLE>
<P></P>
<script language="javascript" type="text/javascript">
document.write(displayFooter("cm"));
</script>
</BODY>
It seems that Firefox shows the td tags in lower case and Chrome shows them in upper case, so I tried both, wondering if it was case sensitive, and ended up with the same results.
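On the case question, Python's standard `html.parser` lower-cases tag names before handing them to your code, so `<TD>` and `<td>` look identical to it. As far as I know the HTML parsers used for scraping behave the same way, but I have not checked exactly which one the scrape sensor uses, so treat that as an assumption. The sketch below uses the second table from the page source as pasted and looks the value up by its label instead of by position; `TagAndCellParser` and `value_after` are names made up for this example.

```python
from html.parser import HTMLParser

# Second table from the full page source, upper-case tags as pasted.
UPPER = """
<TABLE align=center border=1 cellPadding=8 cellSpacing=0 width="100%">
<TBODY>
<TR>
<TH><FONT color=#ffffff>Cable Modem Operation</FONT></TH>
<TH><FONT color=#ffffff>Value</FONT> </TH></TR>
<TR>
<TD>Current Time and Date</TD>
<TD>Apr 03 2018 08:59:38</TD></TR>
<TR>
<TD>System Up Time</TD>
<TD>0 days 0h:27m:48s</TD></TR>
</TBODY>
</TABLE>
"""


class TagAndCellParser(HTMLParser):
    """Record the tag names the parser reports and the text of each <td>."""

    def __init__(self):
        super().__init__()
        self.tags_seen = set()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        self.tags_seen.add(tag)  # names arrive already lower-cased
        if tag == "td":
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells[-1] += data


def value_after(cells, label):
    """Return the cell that follows the cell equal to `label`, else None."""
    for i, text in enumerate(cells[:-1]):
        if text == label:
            return cells[i + 1]
    return None


parser = TagAndCellParser()
parser.feed(UPPER)
parser.close()

cells = [cell.strip() for cell in parser.cells]
uptime = value_after(cells, "System Up Time")

print(sorted(parser.tags_seen))
print(uptime)
```

One more observation, offered as a guess rather than a diagnosis: in the source as pasted I count 20 `<TD>` cells, not 23. The browser inspector shows the page after scripts such as `displayHeader` and `displayFooter` have run, while a scraper only gets the raw HTML, so positions counted in the inspector may not line up with what the sensor sees.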
FYI this won’t work on newer firmware: restarting through the web UI has been disabled “in the name of security”. A smart plug would work though, since it lets you power cycle the modem.