Scrape sensor does not work anymore, but why?

For about a month I have had a scrape sensor retrieving some values from a website. This worked fine for several weeks, but since January 4 one of the values is no longer retrieved.
As far as I can see the website has not changed at all (checked via the Wayback Machine), and I also cannot find any changes to the Scrape integration in the Home Assistant release notes since that date, so why doesn’t it work anymore?
This is my scrape sensor:

# Scrape setup
scrape:
  - resource: https://www.energievergelijk.nl/energieprijzen/stroomprijs
    sensor:
      - name: EPEX Stroomprijs Europese beurs
        unique_id: stroomprijs_epex
        unit_of_measurement: "€/kWh"
        select: "body article div div div div div div div p"
        index: 0
        value_template: "{{ ( value.split(' ')[1] ) | replace(',', '.') | float }}"
      - name: Stroomprijs energieleveranciers
        unique_id: stroomprijs_leveranciers
        unit_of_measurement: "€/kWh"
        select: "body article div div div div div div div p"
        index: 1
        value_template: "{{ ( value.split(' ')[1] ) | replace(',', '.') | float }}"

The first value is still retrieved, but the second isn’t any more since January 4:

This is the applicable HTML code:

Can somebody explain why this doesn’t work anymore, and how I can repair it?

I did some more testing, and found select settings that return both values again, but it looks like it is going wrong in the value_template.

With this code I get both values as strings:

# Scrape setup
scrape:
  - resource: https://www.energievergelijk.nl/energieprijzen/stroomprijs
    sensor:
      - name: EPEX Stroomprijs Europese beurs
        unique_id: stroomprijs_epex
        select: "div.cell.medium-12:nth-child(2) div.wrapper div.box.text-center p"
      - name: Stroomprijs energieleveranciers
        unique_id: stroomprijs_leveranciers
        select: "div.cell.medium-12:nth-child(3) div.wrapper div.box.text-center p"

This is the result in the Developer tools states:

But when I add the same value_template to both sensors, like this, only the first value is returned correctly and the second is returned as unavailable:

# Scrape setup
scrape:
  - resource: https://www.energievergelijk.nl/energieprijzen/stroomprijs
    sensor:
      - name: EPEX Stroomprijs Europese beurs
        unique_id: stroomprijs_epex
        select: "div.cell.medium-12:nth-child(2) div.wrapper div.box.text-center p"
        value_template: "{{ ( value.split(' ')[1] ) | replace(',', '.') | float }}"
      - name: Stroomprijs energieleveranciers
        unique_id: stroomprijs_leveranciers
        select: "div.cell.medium-12:nth-child(3) div.wrapper div.box.text-center p"
        value_template: "{{ ( value.split(' ')[1] ) | replace(',', '.') | float }}"

With this result in the Developer tools states:

The second sensor shows restored: true and supported_features: 0 in its state attributes.

I don’t understand what the difference is between those two values and why the second one is not handled correctly in the value_template.
In the Template Editor both strings are handled correctly:

Is this a bug in the Scrape integration?
Does anybody have an idea?

Look in your logs for errors.

This template will find the first number in the value, regardless of what surrounds it:

{{ value | regex_findall('[0-9,]+') | first | default('') | replace(',','.') | float }}
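For readers who want to see what that filter chain does step by step, here is a rough Python equivalent. It is only a sketch: `extract_price` is a hypothetical helper name, and it uses Python's `re.findall` to imitate the template rather than Home Assistant's own `regex_findall` filter.

```python
import re


def extract_price(text: str) -> float:
    """Return the first number in `text`, treating ',' as the decimal mark."""
    # Same pattern as the template: runs of digits and commas.
    matches = re.findall(r"[0-9,]+", text)
    # Like `first | default('')`: fall back to an empty string if nothing matched.
    first = matches[0] if matches else ""
    # float("") raises ValueError, so input without any number still fails
    # at this last step.
    return float(first.replace(",", "."))


print(extract_price("€ 0,085 /kWh"))   # one space after the euro sign
print(extract_price("€  0,315 /kWh"))  # two spaces, same result path
```

Because the pattern looks for the number itself instead of counting separators, the amount of whitespace around the number does not matter.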

I’ve just set these two sensors up via the UI (Integrations, Scrape).

For the first one, URL as the resource, nothing else needs setting on the first screen. Then:

So that’s a select of div.box.text-center p, index of 0, value_template of:

{{ (value|select('in','-,0123456789')|join).replace(',','.') }}

and unit of measurement of €/kWh.

Exactly the same for the second, but use 2 for the index value. Then:

That value_template assumes you’re going to have decimal commas only in the input, no decimal points. If you have thousand separator points, it will ignore those.
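The character filter in that value_template can be imitated in Python as a rough illustration. This is a sketch under assumptions: `keep_number_chars` and `ALLOWED` are hypothetical names, and the second sample string is made up to show the thousand-separator case.

```python
ALLOWED = "-,0123456789"


def keep_number_chars(text: str) -> str:
    """Keep only sign, comma and digit characters, then use '.' as decimal mark."""
    # Similar in spirit to `value | select('in', ALLOWED) | join`.
    kept = "".join(ch for ch in text if ch in ALLOWED)
    return kept.replace(",", ".")


print(keep_number_chars("€  0,315 /kWh"))  # 0.315
print(keep_number_chars("€ 1.234,56"))     # 1234.56
```

Spaces, the euro sign and the unit are dropped because they are not in the allowed set, and a thousand-separator point disappears for the same reason.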


Thank you Pedro, that works!

For completeness, this YAML code gives the correct values:

# Scrape setup
scrape:
  - resource: https://www.energievergelijk.nl/energieprijzen/stroomprijs
    sensor:
      - name: EPEX Stroomprijs Europese beurs
        unique_id: stroomprijs_epex
        unit_of_measurement: "€/kWh"
        select: "div.cell.medium-12:nth-child(2) div.wrapper div.box.text-center p"
        value_template: "{{ value | regex_findall('[0-9,]+') | first | default('') | replace(',','.') | float }}"
      - name: Stroomprijs energieleveranciers
        unique_id: stroomprijs_leveranciers
        unit_of_measurement: "€/kWh"
        select: "div.cell.medium-12:nth-child(3) div.wrapper div.box.text-center p"
        value_template: "{{ value | regex_findall('[0-9,]+') | first | default('') | replace(',','.') | float }}"

Indeed I didn’t think of looking in the logs (and of course I should have done that).
This is an excerpt from the logs when it didn’t work correctly:

2024-01-19 13:03:20.518 ERROR (MainThread) [homeassistant.components.sensor] Error adding entities for domain sensor with platform scrape
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1984, in forgiving_float_filter
    return float(value)
           ^^^^^^^^^^^^
ValueError: could not convert string to float: ''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 507, in async_add_entities
    await asyncio.gather(*tasks)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 752, in _async_add_entity
    await entity.add_to_platform_finish()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1281, in add_to_platform_finish
    await self.async_added_to_hass()
  File "/usr/src/homeassistant/homeassistant/components/scrape/sensor.py", line 196, in async_added_to_hass
    self._async_update_from_rest_data()
  File "/usr/src/homeassistant/homeassistant/components/scrape/sensor.py", line 204, in _async_update_from_rest_data
    value = template.async_render_with_possible_json_value(value, None)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 746, in async_render_with_possible_json_value
    return _render_with_context(self.template, compiled, **variables).strip()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 2305, in _render_with_context
    return template.render(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "", line 1, in top-level template code
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1987, in forgiving_float_filter
    raise_no_default("float", value)
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1625, in raise_no_default
    raise ValueError(
ValueError: Template error: float got invalid input '' when rendering template '{{ ( ( value.split(' ')[1] ) | replace(',', '.') ) | float }}' but no default was specified
2024-01-19 13:03:20.529 ERROR (MainThread) [homeassistant.components.sensor] Error while setting up scrape platform for sensor
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1984, in forgiving_float_filter
    return float(value)
           ^^^^^^^^^^^^
ValueError: could not convert string to float: ''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 368, in _async_setup_platform
    await asyncio.gather(*pending)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 507, in async_add_entities
    await asyncio.gather(*tasks)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 752, in _async_add_entity
    await entity.add_to_platform_finish()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1281, in add_to_platform_finish
    await self.async_added_to_hass()
  File "/usr/src/homeassistant/homeassistant/components/scrape/sensor.py", line 196, in async_added_to_hass
    self._async_update_from_rest_data()
  File "/usr/src/homeassistant/homeassistant/components/scrape/sensor.py", line 204, in _async_update_from_rest_data
    value = template.async_render_with_possible_json_value(value, None)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 746, in async_render_with_possible_json_value
    return _render_with_context(self.template, compiled, **variables).strip()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 2305, in _render_with_context
    return template.render(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "", line 1, in top-level template code
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1987, in forgiving_float_filter
    raise_no_default("float", value)
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1625, in raise_no_default
    raise ValueError(
ValueError: Template error: float got invalid input '' when rendering template '{{ ( ( value.split(' ')[1] ) | replace(',', '.') ) | float }}' but no default was specified

Unfortunately, I still cannot figure out from this log what causes the error: it only complains that the value is empty (float got invalid input '' when rendering), but why does that happen only for the second value?
The two values seem to be identical in format.

Perhaps there are two spaces and not just one, so that index 1 is an empty string?
Indeed, going to the page and copying the element out of the developer tools gives:

€  0,315

It doesn’t look like it, but there are indeed two spaces.

{% set r = "€  0,315" %}
{% set r = r.split(" ") %}
{{ r[0] }}
{{ r[1] }}
{{ r[2] }}

Outputs:

€ # 0
   # 1
0,315 # 2
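The same behaviour can be reproduced in plain Python, since the template's `value.split(' ')` is an ordinary string split. The sketch below also shows `split()` without an argument, which splits on any run of whitespace; whether that variant behaves identically inside a Home Assistant template was not tested in this thread, so treat it as an assumption.

```python
price = "€  0,315"  # two spaces between the euro sign and the number

# Splitting on a single space yields an empty string between the two spaces.
parts = price.split(" ")
print(parts)  # ['€', '', '0,315']

# parts[1] is '', and float('') raises ValueError, which matches the
# "float got invalid input ''" message in the log.
try:
    float(parts[1].replace(",", "."))
except ValueError as err:
    print("conversion failed:", err)

# split() with no argument splits on any run of whitespace instead.
robust = price.split()
print(robust)  # ['€', '0,315']
print(float(robust[1].replace(",", ".")))  # 0.315
```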

Thank you Troon!
That works as well, and it is interesting to see this Scrape set-up via the GUI. I wasn’t aware of that possibility yet.
A big advantage of this method is that one does not have to restart Home Assistant after every change to the Scrape YAML code (at least that is what I have been doing up to now).

WOW!
Nice catch :smiley:
What exactly do you mean by “copying the element out of developer tools”?

Right-click the element (in Chrome) and inspect it.

That way I was reasonably sure that I was copying the actual value, rather than what the browser had changed it to (unless you use &nbsp; in your HTML, browsers tend not to render multiple spaces in a row).
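When a copied value is in doubt, a few lines of Python can make the invisible characters visible. This is a small sketch, not part of anyone's setup in this thread: `describe_whitespace` is a hypothetical helper, and the non-breaking-space sample is an assumed extra case rather than something found on the page.

```python
import unicodedata


def describe_whitespace(text: str) -> list[str]:
    """Name every whitespace character in `text`, in order of appearance."""
    return [unicodedata.name(ch, "UNKNOWN") for ch in text if ch.isspace()]


copied = "€  0,315"         # two ordinary spaces
with_nbsp = "€\u00a00,315"  # one non-breaking space (assumed example)

print(repr(copied))
print(describe_whitespace(copied))     # ['SPACE', 'SPACE']
print(describe_whitespace(with_nbsp))  # ['NO-BREAK SPACE']
```

Seeing two entries where one was expected is a quick hint that splitting on a single space will produce an empty element.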

Thanks again!
Personally I use Firefox rather than Chrome, but that appears to have a similar method: “Copy inner HTML”.
This is how it looks for me (with a Dutch language version of Firefox):

And indeed like this I can see those double spaces as well now.
This is the result for the two values when pasted in Notepad:


However, when pasting this result in the forum post the double spaces are automatically “repaired”, so it indeed isn’t visible anymore:

€ 0,085 /kWh
€ 0,315 /kWh


That’s why you use code block formatting for code on the forum:

€  0,315 <sup>/kWh</sup>

… and why it’s a good idea to process inputs in a way that deals with this, like the value_templates petro and I suggested.
