Issues extracting an oil price from a web page with Scrape

Hi all

I am running the latest Home Assistant version, having moved over from openHAB. I like Home Assistant very much, but I ran into some issues extracting the oil price from a web page.

I went to the Scrape GitHub page and worked through the well-explained examples there step by step, but 2 of the 3 examples no longer seem to work, because a CSS selector cannot reach content that is generated by JavaScript.
=> Here is the guide I followed: Home Assistant: Web Scraper | vd Brink Home Automations

The web link I want to extract data from is: Heizöl online bestellen | Migrol AG

And the selector is:
body > div.migrol-body.layout > main > div.shop.migrol-form > div > div:nth-child(2) > section > div:nth-child(4) > fieldset > div.shop__materials > div.shop__material.shop__material--selected.shop__material--selectable > div.shop__material-block-price.shop__material-block-col-1 > div.shop__material-price > div > div.shop__material-price-per-unit
=> basically I want to extract the oil price

But scrape returns an empty response. Could anyone please point me in the right direction? I tested the selector with https://try.jsoup.org/ and also tried Node-RED, but never got any output.
=> It looks to me as if the problem is that the data I want to extract is generated by JavaScript?

From what I have read, scrape cannot extract data that is generated by JavaScript. Did anyone else solve the problem differently?
=> Is a Python script the way to go?
=> Or Node-RED?

A REST sensor is what you want. As you suspected, the data comes in dynamically as a POST response from this URL:

https://www.migrol.ch/migrolapi/de/shop/CalculatePrice

where the request is made with this payload:

{"ProductType":"Heizoel","RequestSequenceNumber":1,"OfferId":null,"CampaignParticipantId":null,"PriceResponseId":null,"AuftraggeberId":null,"SelectedDeliveryPeriodId":-2042265318,"SelectedMaterialId":null,"WantsNewsletter":false,"MyClimate":false,"DiscountCode":null,"PaymentMode":"SingleUP","DifferentPaymentConfigurationText":null,"Positions":[{"ID":1,"Amount":3000,"WarenempfaengerId":1,"ReguliererId":null,"EquipmentId":null,"WantsFillOption":false}],"Addresses":[{"ID":1,"CustomerID":"","AddressSource":2,"Salutation":null,"FirstName":null,"LastName":null,"CompanyName":null,"Street":null,"StreetNumber":null,"AddressLine2":null,"Zipcode":8967,"City":"Widen","CountryCode":"CH","Phone":null,"EmailAddress":null,"CumulusNumber":null,"MLGuid":null}],"Equipments":[]}
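On the question of whether a Python script could do the job: the request described above can be replicated outside Home Assistant with the standard library alone. This is a minimal sketch rather than a tested client. The URL and payload are copied from this post. The function names `build_request` and `fetch_prices`, the `Content-Type` header, and the generic `User-Agent` value are assumptions about what the server accepts.

```python
import json
import urllib.request

URL = "https://www.migrol.ch/migrolapi/de/shop/CalculatePrice"

# Payload copied from the post above, written as a Python dict.
PAYLOAD = {
    "ProductType": "Heizoel",
    "RequestSequenceNumber": 1,
    "OfferId": None,
    "CampaignParticipantId": None,
    "PriceResponseId": None,
    "AuftraggeberId": None,
    "SelectedDeliveryPeriodId": -2042265318,
    "SelectedMaterialId": None,
    "WantsNewsletter": False,
    "MyClimate": False,
    "DiscountCode": None,
    "PaymentMode": "SingleUP",
    "DifferentPaymentConfigurationText": None,
    "Positions": [{
        "ID": 1, "Amount": 3000, "WarenempfaengerId": 1,
        "ReguliererId": None, "EquipmentId": None, "WantsFillOption": False,
    }],
    "Addresses": [{
        "ID": 1, "CustomerID": "", "AddressSource": 2, "Salutation": None,
        "FirstName": None, "LastName": None, "CompanyName": None,
        "Street": None, "StreetNumber": None, "AddressLine2": None,
        "Zipcode": 8967, "City": "Widen", "CountryCode": "CH",
        "Phone": None, "EmailAddress": None, "CumulusNumber": None,
        "MLGuid": None,
    }],
    "Equipments": [],
}


def build_request(payload=PAYLOAD):
    """Build the POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=body,
        method="POST",
        headers={
            # Assumption: the server expects a JSON content type
            # and may reject clients without a browser-like User-Agent.
            "Content-Type": "application/json",
            "User-Agent": "Mozilla/5.0",
        },
    )


def fetch_prices(timeout=10):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_prices()` performs the live request. `build_request()` on its own only assembles it, which is handy for checking the body before sending anything.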

If you can reliably replicate that, you get a response that looks like this:

{
  "PriceResponseId": "e1ad61c5-0b2d-41b4-a97d-3dba82f2e75c",
  "Outcome": 1,
  "Prices": [
    {
      "PriceType": 2,
      "Material": "Heizoel_OekoPlus",
      "PositionID": 1,
      "Price": 100.34
    },
    {
      "PriceType": 1,
      "Material": "Heizoel_OekoPlus",
      "PositionID": 1,
      "Price": 3010.2000
    },
    {
      "PriceType": 2,
      "Material": "Heizoel_OekoPlus",
      "PositionID": null,
      "Price": 100.34
    },
    {
      "PriceType": 1,
      "Material": "Heizoel_OekoPlus",
      "PositionID": null,
      "Price": 3010.2000
    },
    {
      "PriceType": 2,
      "Material": "Heizoel_GreenLife",
      "PositionID": 1,
      "Price": 104.89
    },
    {
      "PriceType": 1,
      "Material": "Heizoel_GreenLife",
      "PositionID": 1,
      "Price": 3146.7000
    },
    {
      "PriceType": 2,
      "Material": "Heizoel_GreenLife",
      "PositionID": null,
      "Price": 104.89
    },
    {
      "PriceType": 1,
      "Material": "Heizoel_GreenLife",
      "PositionID": null,
      "Price": 3146.7000
    }
  ],
  "RequestSequenceNumber": 1,
  "LoggingId": null
}

from which you can easily pull the value you want. At a guess:

value_template: "{{ (value_json['Prices']|selectattr('PriceType','eq',2)|first)['Price'] }}"
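For anyone testing outside Home Assistant, the same selection can be sketched in Python. The function name `unit_price` and its optional `material` filter are hypothetical additions for illustration. The sample data is a trimmed copy of the response shown above. Judging only from those numbers (100.34 × 30 = 3010.20 for an order of 3000), PriceType 2 looks like a price per 100 units and PriceType 1 like the order total, but that is an inference, not something the API documents here.

```python
def unit_price(response, price_type=2, material=None):
    """Return the 'Price' of the first matching entry, or None if none match.

    Mirrors the template's selectattr('PriceType', 'eq', 2) | first.
    The material filter is a hypothetical extra, not part of the template.
    """
    for entry in response["Prices"]:
        if entry["PriceType"] != price_type:
            continue
        if material is not None and entry["Material"] != material:
            continue
        return entry["Price"]
    return None


# Trimmed copy of the response from the post (PositionID 1 entries only).
sample = {
    "Prices": [
        {"PriceType": 2, "Material": "Heizoel_OekoPlus", "PositionID": 1, "Price": 100.34},
        {"PriceType": 1, "Material": "Heizoel_OekoPlus", "PositionID": 1, "Price": 3010.2000},
        {"PriceType": 2, "Material": "Heizoel_GreenLife", "PositionID": 1, "Price": 104.89},
        {"PriceType": 1, "Material": "Heizoel_GreenLife", "PositionID": 1, "Price": 3146.7000},
    ]
}

print(unit_price(sample))                                # 100.34
print(unit_price(sample, material="Heizoel_GreenLife"))  # 104.89
```

Because the template takes the first PriceType 2 entry, it returns the Heizoel_OekoPlus price for this response. Picking a different material would need an extra filter like the one sketched here.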

UPDATE: tried it and it works. I use the RESTful integration as a rule rather than individual sensors, and with a bit of fiddling, this (in configuration.yaml, with a restart if it’s your first rest: entry):

rest:
  - resource: https://www.migrol.ch/migrolapi/de/shop/CalculatePrice
    method: POST
    payload: '{"ProductType":"Heizoel","RequestSequenceNumber":1,"OfferId":null,"CampaignParticipantId":null,"PriceResponseId":null,"AuftraggeberId":null,"SelectedDeliveryPeriodId":-2042265318,"SelectedMaterialId":null,"WantsNewsletter":false,"MyClimate":false,"DiscountCode":null,"PaymentMode":"SingleUP","DifferentPaymentConfigurationText":null,"Positions":[{"ID":1,"Amount":3000,"WarenempfaengerId":1,"ReguliererId":null,"EquipmentId":null,"WantsFillOption":false}],"Addresses":[{"ID":1,"CustomerID":"","AddressSource":2,"Salutation":null,"FirstName":null,"LastName":null,"CompanyName":null,"Street":null,"StreetNumber":null,"AddressLine2":null,"Zipcode":8967,"City":"Widen","CountryCode":"CH","Phone":null,"EmailAddress":null,"CumulusNumber":null,"MLGuid":null}],"Equipments":[]}'
    headers:
      User-Agent: "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0"
      Content-Type: "application/json"
    sensor:
      - name: Oil price
        value_template: "{{ (value_json['Prices']|selectattr('PriceType','eq',2)|first)['Price'] }}"
        unit_of_measurement: CHF

returns this:

Huh, price has gone up whilst writing this post!

Note that there seems to be a timestamp (SelectedDeliveryPeriodId) in the POST data: I don’t know if that’s important. The sensor is changing with time OK.


That is really cool! Thank you so much.

I will now try to digest and understand what you sent, but this is great.

Maybe one last update: if you want to change the scan_interval, you need to fully restart HA (as Troon stated, by the way).

# get info from webpage
rest:
  - resource: https://www.migrol.ch/migrolapi/de/shop/CalculatePrice
    scan_interval: 3600
    method: POST
    payload: '{"ProductType":"Heizoel","RequestSequenceNumber":1,"OfferId":null,"CampaignParticipantId":null,"PriceResponseId":null,"AuftraggeberId":null,"SelectedDeliveryPeriodId":-2042265318,"SelectedMaterialId":null,"WantsNewsletter":false,"MyClimate":false,"DiscountCode":null,"PaymentMode":"SingleUP","DifferentPaymentConfigurationText":null,"Positions":[{"ID":1,"Amount":3000,"WarenempfaengerId":1,"ReguliererId":null,"EquipmentId":null,"WantsFillOption":false}],"Addresses":[{"ID":1,"CustomerID":"","AddressSource":2,"Salutation":null,"FirstName":null,"LastName":null,"CompanyName":null,"Street":null,"StreetNumber":null,"AddressLine2":null,"Zipcode":8967,"City":"Widen","CountryCode":"CH","Phone":null,"EmailAddress":null,"CumulusNumber":null,"MLGuid":null}],"Equipments":[]}'
    headers:
      User-Agent: "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0"
      Content-Type: "application/json"
    sensor:
      - name: "Heizung Oil Price"
        value_template: "{{ (value_json['Prices']|selectattr('PriceType','eq',2)|first)['Price'] }}"
        unit_of_measurement: CHF