Scrape sensor improved - scraping multiple values

Hi @danieldotnl,

I’m trying to upgrade from ‘hass-multiscrape (drogfild) pre-login’ to your latest version 4.1.3 of ‘ha-multiscrape (danieldotnl)’, and even after reading your wiki I can’t figure out how to move everything (scrape and sensors) into one new multiscrape: block.

Can you perhaps help me?

Below are my pre-login config and (part of) my sensors, followed by the form-submit example I’m trying to adapt.

  - platform: multiscrape
    name: osc scraper
    resource: https://portal.xxx.com/gateways/xxxxx
    verify_ssl: false
    prelogin:
      preloginpage: https://portal.xxx.com/u/sign_in
      preloginform: 'loginForm'
      username_field: 'user[email]'
      password_field: 'user[password]'
      username: !secret xxx_username
      password: !secret xxx_password
    scan_interval: 00:15:00 # Request every 15 min
    selectors:
      levering_dag:
        name: Levering dag
        select: "div.col-lg-4:nth-child(4) > div:nth-child(1) > div:nth-child(2) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(2) > a:nth-child(1)"
        value_template: "{{value[:-4] | replace('.', '') | float | round(0)}}"
      levering_nacht:
        name: Levering nacht
        select: "div.col-lg-4:nth-child(4) > div:nth-child(1) > div:nth-child(2) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(2) > a:nth-child(1)"
        value_template: "{{value[:-4] | replace('.', '') | float | round(0)}}"

#######################
  - platform: template
    sensors:
      osc_afnemen_dag:
        friendly_name: OSC afnemen dag
        icon_template: "mdi:thermometer"
        unit_of_measurement: "kWh"
        value_template: "{{ state_attr('sensor.osc_scraper', 'Levering dag') }}"
        attribute_templates:
          updated: >
            {{ as_timestamp(states.sensor.osc_scraper.last_updated) | timestamp_custom('%Y-%m-%d %H:%M', true) }}

  - platform: template
    sensors:
      osc_afnemen_nacht:
        friendly_name: OSC afnemen nacht 
        icon_template: "mdi:thermometer"
        unit_of_measurement: "kWh"
        value_template: "{{ state_attr('sensor.osc_scraper', 'Levering nacht') }}"
        attribute_templates:
          updated: >
            {{ as_timestamp(states.sensor.osc_scraper.last_updated) | timestamp_custom('%Y-%m-%d %H:%M', true) }}

multiscrape:
  - resource: 'https://thepagewiththedatathatyouwant.com'
    scan_interval: 30
    form_submit:
      submit_once: True
      resource: 'https://thesitewiththeform.com'
      select: ".unique-css-selector-for-the-form"
      input:
        username: [email protected]
        password: '12345678'
        extra: field
    sensor:
      - select: 'td.mydata:nth-child(1) > a:nth-child(1)'
        name: scraped-value-after-form-submit

Thanks a lot for your help! :wink:

Erik, you are a bit early. The form-submit functionality has not been released yet. I hope to release it this week or over the weekend.
Could you indicate what’s not clear in the wiki, so I can improve it? Thanks!

All right, thanks for the info. I’ll wait for the release, give it a try and let you know. :wink:

Hi…
I’m using your example from your config.

Release v5.0.0 with the form-submit functionality is out there! The options are described in the readme and the wiki.
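
For anyone migrating a pre-login config like the one at the top of this thread, here is a rough, untested sketch of how it might map onto form_submit. It assumes that 'loginForm' is the id of the login form (hence the #loginForm selector) and that the old username_field/password_field names can be used unchanged as keys under input. The unique_id values are made up for illustration, and the selectors are copied verbatim from the old config.

```yaml
multiscrape:
  - resource: https://portal.xxx.com/gateways/xxxxx
    scan_interval: 900 # seconds, i.e. every 15 minutes
    form_submit:
      submit_once: True
      resource: https://portal.xxx.com/u/sign_in
      select: "#loginForm" # assumption: 'loginForm' is the form's id
      input:
        # keys are the name attributes of the form's input fields
        'user[email]': !secret xxx_username
        'user[password]': !secret xxx_password
    sensor:
      - unique_id: osc_levering_dag # hypothetical id
        name: Levering dag
        select: "div.col-lg-4:nth-child(4) > div:nth-child(1) > div:nth-child(2) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(2) > a:nth-child(1)"
        value_template: "{{ value[:-4] | replace('.', '') | float | round(0) }}"
      - unique_id: osc_levering_nacht # hypothetical id
        name: Levering nacht
        select: "div.col-lg-4:nth-child(4) > div:nth-child(1) > div:nth-child(2) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(2) > a:nth-child(1)"
        value_template: "{{ value[:-4] | replace('.', '') | float | round(0) }}"
```

Each entry under sensor should show up as its own entity, so the separate template sensors that read attributes from sensor.osc_scraper would probably no longer be needed.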

Great sensor, I used it for this:

Hmm, with this page (Usługa ePPK Pekao TFI – serwis PPK dla pracowników),
what would be the names of the input fields in the configuration? I don’t get this idea of merging…

If I have a string ASCII: 1111111100000000 in a P tag
How can I retrieve only one character at a time from that string?

For example I only want the first 1, I tried this but it doesn’t work…

- platform: scrape
  resource: http://192.168.1.#/####/##
  name: Irrig State3
  select: "p"
  value_template: "{{ value.text[8][1] | }}"

I also tried value_template: "{{ value.text[8:1] | }}" but that doesn’t work either…

In your browser just right-click on the textboxes and inspect. You’ll see the id ‘app-input-username’ for the username and ‘app-input-password’ for the password field.

Edit: your input fields don’t have a name, see reply below.

I managed to do it with this code:

value_template: >-
      {% if (value | regex_findall_index("(\d)", index=0)) == "0" %}
        on
      {% else %}
        off
      {% endif %}
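
In case it helps someone else: the same filter can pick out any single digit by changing index. Below is a minimal sketch, assuming the paragraph text contains the digits in order (for example "ASCII: 1111111100000000"); the sensor block is just the one from my question with a different template.

```yaml
sensor:
  - platform: scrape
    resource: http://192.168.1.#/####/##
    name: Irrig State3
    select: "p"
    # index=0 is the first digit found in the text, index=1 the second, and so on;
    # index=2 therefore returns the third digit
    value_template: >-
      {{ value | regex_findall_index("(\d)", index=2) }}
```

Note that value is already the text of the selected element, so there is no .text attribute on it. Plain indexing such as value[0] counts every character, including any "ASCII: " prefix if that is part of the paragraph, which is why the regex is the safer option here.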

So it is not the name but the id of the component, right?

No, sorry, you are right: it should be the name. But the form inputs on that page don’t have a name. Instead, some JavaScript magic seems to happen, and the page then posts a JSON body like {"login":"xcvcx","passwordEncrypted":"5kMhR235O3swwWtcW0zqgw=="}

That’s not supported by the form-submit functionality :neutral_face:

Thanks anyway

Hello everybody, could you please help me get this value

from here

https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=16495FDB-97AB-11D4-BB62-00508BA24287&mapData=Idosor#mapModule

using scrape sensor@hassio? thank you!

Release v5.1.0

I released v5.1.0 which creates a service that allows you to “manually” trigger an update or schedule it at specific times with automations :partying_face:
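
As a rough sketch, an automation that triggers a scrape at fixed times could look like the one below. The service name multiscrape.trigger_my_scraper is an assumption for a multiscrape entry named "my_scraper"; check the readme or Developer Tools > Services for the actual name on your installation.

```yaml
automation:
  - alias: Scrape at fixed times
    trigger:
      - platform: time
        at:
          - "07:00:00"
          - "19:00:00"
    action:
      # assumed service name, derived from a hypothetical entry named my_scraper
      - service: multiscrape.trigger_my_scraper
```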

Hey,
You can ‘right click’ on the number > Copy > Copy Selector

That will give you the selector you need to pull the data

#_content > table > tbody > tr > td:nth-child(1) > table > tbody > tr:nth-child(2) > td:nth-child(2) > strong

hi 91JJ, thank you for your help,

I tried this in config.yaml:

but have:

is my select wrong?

You might need to remove the tbody elements from the select; check the scraping guide.
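
Browsers add tbody to tables when rendering, even if it is not in the raw HTML that the sensor downloads, so a selector copied from the inspector can fail to match. As an untested sketch, the selector posted above with the tbody steps dropped would look like this (the sensor name is hypothetical):

```yaml
sensor:
  - platform: scrape
    # quoted because the URL contains & characters
    resource: "https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=16495FDB-97AB-11D4-BB62-00508BA24287&mapData=Idosor"
    name: Vizugy value # hypothetical name
    # same selector as copied from the browser, minus the two "> tbody" steps
    select: "#_content > table > tr > td:nth-child(1) > table > tr:nth-child(2) > td:nth-child(2) > strong"
```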

hello Daniel,

that was the problem, now it works like a charm!

thank you!

Hi,
I’m trying to get some results from Markets - Bitvavo
I used ‘copy selector’ to get the selector for the value I need.
I removed the # before socket, which changed the state from unavailable to unknown.

multiscrape:
  - name: bitvavo
    resource: https://bitvavo.com/en/markets
    scan_interval: 600 # seconds
    sensor:
      - unique_id: bitvavo_btc_price
        name: bitvavo.btc_price
        icon: mdi:bitcoin
        select: "socket > div:nth-child(2) > div:nth-child(2) > div:nth-child(2) > div > table.interactive.responsive.sortable > tbody > tr:nth-child(8) > td:nth-child(4) > span"
      - unique_id: bitvavo_eth_price
        name: bitvavo.eth_price
        select: "socket > div:nth-child(2) > div:nth-child(2) > div:nth-child(2) > div > table.interactive.responsive.sortable > tbody > tr:nth-child(4) > td:nth-child(4) > span"
      - unique_id: bitvavo_ant_price
        name: bitvavo.ant_price
        select: "socket > div:nth-child(2) > div:nth-child(2) > div:nth-child(2) > div > table.interactive.responsive.sortable > tbody > tr:nth-child(42) > td:nth-child(4) > span"

Removing the > tbody doesn’t have any effect; the value remains at state ‘unknown’.
Any tips?