Scrape stopped working after update 2022.6.0 with basic authentication

I scrape from 4 different URLs. The one requiring authentication has stopped working (the other 3 work OK). The GUI shows “Unknown”.
No improvement on 2022.6.1.

  - platform: scrape
    resource: http://myurl
    name: Painters indoor
    username: myuser
    password: mypw
    authentication: basic
    select: 'body'
    index: 1     value_template: '{{ value.split(" ")[9] }}'
    unit_of_measurement: "°C"


2022-06-03 15:51:29 WARNING (SyncWorker_3) [homeassistant.components.scrape.sensor] Index 'None' not found in None
2022-06-03 15:51:29 ERROR (MainThread) [homeassistant.helpers.template] Template variable error: 'None' has no attribute 'split' when rendering '{{ value.split(" ")[9] }}'

I guess that no data is returned, possibly because of authentication?

Hass Virtual box.
OS 8.0 (8.1 install problem)

The first debugging step would be to open “http://myuser:mypw@myurl” in a browser or via curl to check that the expected result hasn’t changed.
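If a browser or curl is not handy, the same check can be sketched in Python with only the standard library. This is an illustration, not how the scrape integration itself fetches the page: the function names `build_basic_auth_request` and `fetch_page` are made up for this example, and the URL and credentials are the placeholders from the config above.

```python
import base64
import urllib.request


def build_basic_auth_request(url, username, password):
    """Return a Request carrying an HTTP Basic Authorization header."""
    # Basic auth is "username:password", base64-encoded.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_page(url, username, password, timeout=10):
    """Fetch the page and return (HTTP status, decoded HTML source)."""
    request = build_basic_auth_request(url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")


# Example call (placeholders, needs network access to the device):
# status, html = fetch_page("http://myurl", "myuser", "mypw")
# print(status)
# print(html)
```

A 401 error here would point at authentication, while a 200 with unexpected HTML would point at the selector or the template.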

Thanks koying. Yes, I have looked in the wrong direction more than once.
But yes, it works with browser login.

Ok, and what’s the actual output in the browser (the html source)?

The last line is of my interest

  .button {
    background-color: #4CAF50 ;
    border: none;
    color: white;
    padding: 15px 32px;
    text-align: center;
    text-decoration: none;
    display: inline-block;
    font-size: 40px;
    margin: 4px 4px;
    cursor: pointer;

  .button2 {background-color: #008CBA;} /* Blue */
  .button3 {background-color: #f44336;} /* Red */ 
  .button4 {background-color: #e7e7e7; color: black;} /* Gray */ 
  .button5 {background-color: #555555;} /* Black */
  .button6 {background-color: #4CAF50;} /* Green */
  .button7 {background-color: #ff9900;} /* Yellow */
  .button8 {background-color: #993399;} /* Magenta */

 .ta {
	width: 230px;
	border:1px solid #3366FF;
	border-left: 4px solid #3366FF;
	font-size: 30px;
.ta5 {
	border: 2px solid #765942;
	border-radius: 10px;
	height: 60px;
	width: 230px;
.round-button {
    display: inline-block;
    border: 2px solid #f5f5f5;
    border-radius: 50%;
    background: #008CBA;
    box-shadow: 0 0 3px gray;
 .buttonGray {background: #555555;} 
 .buttonYellow {background: #ff9900;}  
 .buttonGreen {background: #4CAF50;}  
 .buttonRed {background: #f44336;}  

  <h1>Number 10 Painters Road</h2>

  <a href="/" ><button class="round-button buttonGray">Status</button></a>
  <a href="/" ><button class="round-button buttonYellow">On</button></a>
  <a href="/" ><button class="round-button buttonGreen">Off</button></a>


<br><button class="button button6">Off!</button><br><br><font size="5"><body><br>Rain <span id='Rain-id'> 289.0</span><br>Outdoor <span id='Outdoor-id'> 16.89</span><br>Garage <span id='Garage-id'> 17.39</span><br>Garden <span id='Garden-id'> 20.51</span><br>Living1 <span id='Living1-id'> 25.64</span><br>Hall <span id='Hall-id'> 24.46</span><br>Chimney <span id='Chimney-id'> 44.22</span><br>Stove <span id='Stove-id'> 67.81</span><br><br><br></body><body06-06 13:54:42 off<br>06-06 13:50:43 off<br>06-05 14:21:41 Stop. Started: 14:21:36 OK<br>06-05 13:20:23 Stop. Started: 13:20:18 OK<br>06-05 12:44:32 Stop. Started: 12:44:26 OK<br>06-05 12:19:39 Stop. Started: 12:19:34 OK<br>06-05 09:45:56 on<br>06-05 07:42:32 off<br>06-04 17:53:08 on<br>06-04 06:29:47 off<br>06-03 23:51:59 on<br>06-03 07:36:01 off<br>06-02 21:43:49 on<br>06-02 16:37:10 System started<br></body>

What last line?
Has this ever worked as it is?

You are taking index 1 of “Body”, which is already wrong (only one body, so should be 0).
Then you are taking the 10th “word” from it, which roughly returns href="/"

It’s the line with rain and temperature values.
Yes it worked before upgrade. No change to the web page.
I receive no data at all; maybe that’s caused by the body index of 1. I’ll try changing it to 0.

Thanks, that is the problem. After changing the index to 0 I get data. It is still the wrong data, but that’s related to the split. I will consider rewriting the page to be more scrape friendly.

Again thank you for taking your time on this
/ Lars

Ah no, didn’t notice you actually have 2 “body” (strangely), so index 1 is correct for what you want, I guess, assuming there was not an HA change that made this invalid.
You actually have “index:” and “value_template:” on 2 different lines, do you? Not like what you show here…

If you wrote the page yourself: an HTML page with 2 body elements is invalid HTML.
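A quick way to confirm how many body tags a page really contains is to count the opening tags in the raw source. The sketch below uses Python’s standard `html.parser`; the class and function names are made up for this example. It counts tags as they appear in the text, which is not necessarily how a repairing parser or the scrape sensor will interpret an invalid page.

```python
from html.parser import HTMLParser


class BodyCounter(HTMLParser):
    """Count every <body> start tag seen in the raw HTML source."""

    def __init__(self):
        super().__init__()
        self.body_count = 0

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag == "body":
            self.body_count += 1


def count_body_tags(html):
    """Return the number of <body> start tags in the given HTML string."""
    parser = BodyCounter()
    parser.feed(html)
    parser.close()
    return parser.body_count


# Shortened stand-in for the page in this thread: two body elements.
sample = "<body><br>Rain <span id='Rain-id'> 289.0</span></body><body>06-06 13:54:42 off</body>"
print(count_body_tags(sample))
```

Anything other than 1 means the index used with a 'body' selector depends on how the parser copes with the invalid markup.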

Yes, index and value_template are on different lines.
And yes, two bodies.
I’ll try to poke around to see if I can catch any valuable data.

Thanks, so maybe this explains why I get data with index 0?

Yes it’s my own HTML page, I understand I need to rewrite it.

Case solved, thanks to both of you for pointing me in the right direction! I can now also use span IDs to scrape.
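To illustrate why the span IDs are easier to work with than a body index plus a split, here is a small offline sketch using Python’s standard `html.parser`. The helper names are made up for this example, and it is not the scrape sensor’s own code. In the sensor, the equivalent would presumably be a CSS ID selector such as '#Outdoor-id' in `select`, which I have not tested against this page.

```python
from html.parser import HTMLParser


class SpanTextById(HTMLParser):
    """Collect the text inside the <span> whose id matches wanted_id."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self._inside = False
        self.text = None  # stays None if the id is never found

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("id") == self.wanted_id:
            self._inside = True
            self.text = ""

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.text += data


def read_span_text(html, wanted_id):
    """Return the text of the span with the given id, or None if absent."""
    parser = SpanTextById(wanted_id)
    parser.feed(html)
    parser.close()
    return parser.text


# Fragment copied from the page output earlier in this thread.
sample = (
    "<br>Outdoor <span id='Outdoor-id'> 16.89</span>"
    "<br>Garage <span id='Garage-id'> 17.39</span>"
)
outdoor = read_span_text(sample, "Outdoor-id")
print(float(outdoor))  # float() tolerates the leading space
```

Selecting by ID does not depend on how many body elements the page has or on word positions, so the value survives layout changes.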

Again Thank you!