Scrape sensors stopped working

My scrape sensors were previously working fine. I’ve just noticed I wasn’t getting updates and I have the following error in the logs. Can anyone help?

scrape: Error on device update!
Traceback (most recent call last):
  File "/usr/src/app/homeassistant/helpers/entity_platform.py", line 248, in _async_add_entity
    await entity.async_device_update(warning=False)
  File "/usr/src/app/homeassistant/helpers/entity.py", line 319, in async_device_update
    yield from self.hass.async_add_job(self.update)
  File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/app/homeassistant/components/sensor/scrape.py", line 120, in update
    value = raw_data.select(self._select)[0].text
IndexError: list index out of range

What does your scrape sensor config look like? This error seems to be the result of something else going wrong, maybe a previous exception. What lines come before this in the log?
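For what it's worth, the traceback itself pinpoints the mechanism: `scrape.py` line 120 takes element `[0]` of whatever `select()` returned, and when the CSS selector matches nothing, `select()` returns an empty list, so indexing it raises exactly this IndexError. A stdlib-only sketch of that failure mode (the `FakeSoup` class here is a stand-in for the BeautifulSoup object the component really uses):

```python
class FakeSoup:
    """Stand-in for BeautifulSoup when the selector matches nothing."""
    def select(self, selector):
        return []  # no element on the page matched the CSS selector

raw_data = FakeSoup()
try:
    # Mirrors scrape.py line 120: raw_data.select(self._select)[0].text
    value = raw_data.select("#some > .missing > span")[0].text
except IndexError as err:
    print(err)  # list index out of range
```

So a "previous exception" isn't strictly required; the selector simply not matching the fetched page is enough to produce this crash.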

This is one of them. The only other errors I have relate to warnings about custom components. I have tried removing all the custom components to see if they are the issue, but I still get the same error.

- platform: scrape
  scan_interval: 360
  name: nextcloud
  resource: https://github.com/nextcloud/server/releases
  select: "#js-repo-pjax-container > div.container.new-discussion-timeline.experiment-repo-nav > div.repository-content > div.position-relative.border-top > ul > li:nth-child(1) > div > div > h3 > a > span"

Edit: a "Setup of config is taking over 10 seconds" message immediately precedes it in the log.
The custom component entries in the logs are warnings, not errors, as well.

Not sure. The component hasn’t been updated since October and your select works fine.
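Even if the root cause is elsewhere, the crash itself could be friendlier. A guard along these lines in the component's `update()` would turn the IndexError into a clean "no match" result (a sketch only: `extract_value` is a hypothetical helper, and `EmptySoup` simulates a selector that matches nothing):

```python
class EmptySoup:
    """Stub simulating a BeautifulSoup page where nothing matches."""
    def select(self, selector):
        return []

def extract_value(raw_data, selector):
    """Return the first match's text, or None instead of raising IndexError."""
    matches = raw_data.select(selector)
    if not matches:
        return None  # the real component could log a warning here instead
    return matches[0].text

print(extract_value(EmptySoup(), "#js-repo-pjax-container span"))  # None
```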

I’m seeing the same issue in Hass.IO 0.76.2. I had to run the following to get wget to work over https:

apk add ca-certificates wget

otherwise you get the following error:

wget: error getting response: Connection reset by peer

after that (details changed to protect the innocent):

core-ssh:/config# wget https://server.domain.com/index.html
--2018-08-21 22:30:16--  https://server.domain.com/index.html
Resolving server.domain.com... 10.0.0.1
Connecting to server.domain.com|10.0.0.1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10701 (10K) [text/html]
Saving to: 'index.html'

index.html                                           100%[=====================================================================================================================>]  10.45K  --.-KB/s    in 0.003s  

2018-08-21 22:30:17 (3.22 MB/s) - 'index.html' saved [10701/10701]

config:

sensor:
  - platform: scrape
    resource: https://server.domain.com/index.html
    select: "Hello World"
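One thing worth double-checking here: `select:` takes a CSS selector, not a text string to search for, so `"Hello World"` is parsed as two tag names (a `<world>` element inside a `<hello>` element) and will never match the page's text content. Assuming the greeting sits in an element you can target, e.g. hypothetical markup like `<p id="greeting">Hello World</p>`, the config would look like:

```yaml
sensor:
  - platform: scrape
    resource: https://server.domain.com/index.html
    # select is a CSS selector for the element that contains the text,
    # not the text itself ("#greeting" is assumed markup):
    select: "#greeting"
```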

log:

core-ssh:/config# cat home-assistant.log 
2018-08-21 22:17:47 INFO (MainThread) [homeassistant.components.sensor] Setting up sensor.scrape
2018-08-21 22:17:48 DEBUG (SyncWorker_12) [homeassistant.components.sensor.scrape] 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<body>
... [redacted] ...
</body>
</html>

2018-08-21 22:17:48 ERROR (MainThread) [homeassistant.components.sensor] scrape: Error on device update!
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/homeassistant/helpers/entity_platform.py", line 248, in _async_add_entity
    await entity.async_device_update(warning=False)
  File "/usr/lib/python3.6/site-packages/homeassistant/helpers/entity.py", line 319, in async_device_update
    yield from self.hass.async_add_job(self.update)
  File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3.6/site-packages/homeassistant/components/sensor/scrape.py", line 120, in update
    value = raw_data.select(self._select)[0].text
IndexError: list index out of range

I’m running hass.io 0.76.2 on a Raspberry Pi

There is a bug filed for this, but it hasn’t been updated recently.

EDIT: I tried the above with http://tc.o to verify that it is not an https/ssl/tls issue, and the result was the same.

Update: I am still experiencing the issue on Hass.IO 0.79.2 running on a Raspberry Pi 3.
To make 100% sure this wasn’t an SSL/TLS/Certificate/Trust/Root issue, I found a basic insecure HTTP site to test (http://www.stealmylogin.com). Here’s my config:

sensor:                       
  - platform: scrape                     
    resource: http://www.stealmylogin.com                                     
    select: "dangers"
logger:                                                           
  default: critical           
  logs:                       
    homeassistant.components.sensor: debug
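The `select:` value is a CSS selector, so `"dangers"` is interpreted as a tag selector, i.e. it looks for a `<dangers>` element. No such element exists, `select()` returns an empty list, and the component reports "Unable to extract data from HTML". To match an element whose *text* contains the word, target the element itself; the page's `<title>` contains "dangers", for example:

```yaml
sensor:
  - platform: scrape
    resource: http://www.stealmylogin.com
    # "dangers" alone looks for a <dangers> tag; select the element
    # whose text you want instead:
    select: "title"
```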

and the debug output from the log (I truncated the HTML because it’s long and likely unnecessary):

2018-10-02 19:25:14 INFO (MainThread) [homeassistant.components.sensor] Setting up sensor.command_line
2018-10-02 19:25:14 INFO (MainThread) [homeassistant.components.sensor] Setting up sensor.scrape
2018-10-02 19:25:14 DEBUG (SyncWorker_9) [homeassistant.components.sensor.rest] Updating from http://www.stealmylogin.com/
2018-10-02 19:25:15 DEBUG (SyncWorker_15) [homeassistant.components.sensor.rest] Updating from http://www.stealmylogin.com/
2018-10-02 19:25:16 DEBUG (SyncWorker_15) [homeassistant.components.sensor.scrape] <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>StealMyLogin.com - exposing the dangers of insecure login forms</title>
<style type="text/css">
---------------------------------
----- truncated for brevity -----
---------------------------------
</script>
</body>
</html>

2018-10-02 19:29:55 ERROR (SyncWorker_14) [homeassistant.components.sensor.scrape] Unable to extract data from HTML

And that’s the end of it; there is no other debug output.