Dutch Gas Prices

Probably a change in the scraper. The 8th div was literal earlier, but now it is in the hierarchy. This is my adjusted config:

sensor:
  - platform: scrape
    name: Euro95 Advies
    resource: "https://www.unitedconsumers.com/brandstofprijzen"
    select: ".table div:nth-of-type(2) div:nth-of-type(3)"
    value_template: "{{ value |replace('€', '') |replace(',', '.') |round(3) }}"
    scan_interval: 3600 # be nice; once per hour only

  - platform: scrape
    name: Diesel Advies
    resource: "https://www.unitedconsumers.com/brandstofprijzen"
    select: ".table div:nth-of-type(3) div:nth-of-type(3)"
    value_template: "{{ value |replace('€', '') |replace(',', '.') |round(3) }}"
    scan_interval: 3600 # be nice; once per hour only

  - platform: scrape
    name: LPG Advies
    resource: "https://www.unitedconsumers.com/brandstofprijzen"
    select: ".table div:nth-of-type(4) div:nth-of-type(3)"
    value_template: "{{ value |replace('€', '') |replace(',', '.') |round(3) }}"
    scan_interval: 3600 # be nice; once per hour only
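
For anyone wondering what the value_template in these sensors does to the scraped text, here is a rough Python equivalent. It is only a sketch: the sample string "€ 1,989" is an assumption about what the page returns, and `clean_price` is a made-up helper name.

```python
def clean_price(raw: str) -> float:
    """Roughly mirror the template: drop '€', swap ',' for '.', round to 3 decimals."""
    cleaned = raw.replace("€", "").replace(",", ".")
    # float() ignores surrounding whitespace, so " 1.989" parses fine
    return round(float(cleaned), 3)


# Assumed sample of the scraped text
print(clean_price("€ 1,989"))  # 1.989
```

The comma has to become a dot before the value can be treated as a number, which is why the template chains the two replace filters before rounding.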

Thanks that did the trick :slight_smile:

My solution to the prices was a python3 script:

from bs4 import BeautifulSoup
import requests
import json


###Take the station name from the url on "https://www.brandstof-zoeker.nl/"
###Example: "https://www.brandstof-zoeker.nl/station/avia-h-i-ambacht-hendrik-ido-ambacht-771/"
###Multiple stations supported (comma separated)
stationList = ['avia-h-i-ambacht-hendrik-ido-ambacht-771']

###Named "prices" so the built-in list type is not shadowed
prices = {}

for station in stationList:
    ###The first word of the station name becomes the key in the output
    stationName = station.split('-',1)[0]

    ###load website with request
    result = requests.get("https://www.brandstof-zoeker.nl/station/{0}/".format(station))
    ###Get content from request website
    content = result.content

    ###Load content in bs4
    soup = BeautifulSoup(content, "html.parser")

    ###Put body in variable
    soupBody = soup.body

    ###try if available, catches errors
    try:
        diesel = soupBody.find_all('dd')[0]
    except IndexError:
        diesel = 'null'

    ###try if available, catches errors
    try:
        euro95 = soupBody.find_all('dd')[1]
    except IndexError:
        euro95 = 'null'


    ###If error change both values to not available, prevents euro95 and diesel pricing mixup
    if euro95 == 'null' or diesel == 'null':
        dieselClean = "Not available"
        euro95Clean = "Not available"
    else:
        ###Strips everything except price
        euro95Clean = str(euro95).split('\n',1)[-1].split('\n', 1)[0].split(' ', 1)[-1].split(' ', 1)[0]
        dieselClean = str(diesel).split('\n',1)[-1].split('\n', 1)[0].split(' ', 1)[-1].split(' ', 1)[0]

    ###adds the station to a dict that is printed as json so homeassistant templates can be used.
    prices[stationName] = {"euro95": euro95Clean, "diesel": dieselClean}

###"list" is the attribute name the command_line sensor reads
encapsulatedList = {"list": prices}

###Print the result without overwriting the json module name
print(json.dumps(encapsulatedList))

This script outputs a JSON object with the prices. Next I added a sensor in hass to get the output as a sensor:

- platform: command_line
  name: 'gasPrice'
  command: "python /config/scripts/gasprice/gasPrices.py"
  scan_interval: 60
  json_attributes: ['list']

To avoid multiple requests to the site for every gas station, I created a template to extract the data from the command line sensor into separate entities. I did this with the following sensors:

- platform: template
  sensors:
    shell_euro95:
      unit_of_measurement: '€'
      value_template: >-
        {{- state_attr('sensor.gasPrice', 'list').shell.euro95 -}}
    berkman_euro95:
      unit_of_measurement: '€'
      value_template: >-
        {{- state_attr('sensor.gasPrice', 'list').berkman.euro95 -}}
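
To make the data flow a bit more concrete, here is a small Python sketch of the same lookup the templates do. The station names and prices in `script_output` are made-up sample data, and `station_price` is just an illustrative helper, not something Home Assistant provides.

```python
import json

# Assumed sample of what the script prints (names and prices are made up)
script_output = (
    '{"list": {"shell": {"euro95": "1.989", "diesel": "1.799"}, '
    '"berkman": {"euro95": "1.949", "diesel": "1.759"}}}'
)

# json_attributes: ['list'] keeps the "list" key as an attribute of the sensor
attributes = {"list": json.loads(script_output)["list"]}


def station_price(attrs: dict, station: str, fuel: str) -> str:
    """Walk attribute -> station -> fuel, like state_attr(...).shell.euro95."""
    return attrs["list"][station][fuel]


print(station_price(attributes, "shell", "euro95"))    # 1.989
print(station_price(attributes, "berkman", "diesel"))  # 1.759
```

Because the key is only the first word of the station name, two stations of the same brand would end up under the same key, so the last one in the list wins.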

Works perfect! Too bad the Tamoil close by doesn’t give any information via www.brandstof-zoeker.nl.
But even for a n00b like me, I got it working. Thanks!

Tinq has recently updated their site, so the example in this topic doesn’t work anymore.

I am now playing with BeautifulSoup to get the diesel price from my local fueling station, but I have too little experience to get there. Is there anyone able to help?

What I have so far is:

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.tinq.nl/tankstations/nootdorp-kerkweg")
soup = BeautifulSoup(page.content, 'html.parser')
soupBody = soup.body

diesel = soupBody.select("div.field.field--name-field-station-prices.field--type-entity-reference.field--label-above > div.field__items > div:nth-child(1) > div > div.field.field--name-field-prices-price-pump.field--type-float.field--label-hidden.field__item")

print(diesel)

this is the output:

[<div class="field field--name-field-prices-price-pump field--type-float field--label-hidden field__item" content="1.349">€ 1.34<sup>9</sup><span>EUR/L</span></div>]

And that’s where I’m stuck at the moment. I see the correct price in the content attribute, so in my opinion that would be the easiest place to get the price from, but how do I get it from there?

edit: sometimes asking for help is what gets you to the solution yourself :slight_smile:

Looking at the example from @legoracers I found the following solution:

dieselstripped = str(diesel).split('=',2)[-1].split('>',1)[0].split('"')[1]
print(dieselstripped)

or (even shorter)

dieselstripped = str(diesel).split('"')[3]

Maybe not the best solution, but it works for now!
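
As an alternative to splitting the string, BeautifulSoup can read the attribute directly, which should keep working if the order of the attributes ever changes. A minimal sketch, with the element from the output above pasted in as a string so it runs offline (the shortened selector is my own assumption and may need adjusting on the real page):

```python
from bs4 import BeautifulSoup

# The element printed above, pasted in as a string so the sketch runs offline
html = (
    '<div class="field field--name-field-prices-price-pump field--type-float '
    'field--label-hidden field__item" content="1.349">'
    '€ 1.34<sup>9</sup><span>EUR/L</span></div>'
)

soup = BeautifulSoup(html, "html.parser")

# select() returns a list of matching tags; take the first one
matches = soup.select("div.field--name-field-prices-price-pump")
diesel_tag = matches[0]

# Attributes can be read dict-style; .get() returns None if the attribute is missing
diesel_price = diesel_tag.get("content")
print(diesel_price)  # 1.349
```

The value is still a string here, so wrap it in float() if you need a number.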

You can have a look at the fuel package in my configuration repo for Tinq using the scrape sensor.

BTW, Lukoil also changed their website recently. I fixed that as well.

Hero! I just noticed my Tinq sensors were broken for a long time and finally decided to check their website. That has changed quite a bit :slight_smile:

Your code works excellent. Direct link for others (as it includes more brands): https://github.com/metbril/home-assistant-config/blob/00d68bae7c6ec525ea28c712773645acd42b9a7d/packages/fuel.yaml

As I needed quite a few vendors, I decided to go another way and build an OCR API on top of directlease.nl. You can try it here:
https://brandstof-api.sanwil.net/docs#/default/api_brandstof_prijs_api_v1_brandstof_prijzen__png__get
Just paste in a png filename such as “7025.png” (I didn’t see the numbers switching over time).

Direct endpoint would be “https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/7025.png” for example.

Let me know what you think. Later on I can make it run faster (it’s on a slow server now). For now it supplies the gasoline and diesel prices. ocr_station is just an indication of the first OCR line in the image, so don’t use it.


Whoa, you’re taking this to a whole different level this way!

I just need the diesel and benzine prices of 2 different TinQ stations, so the scrape is fine for me now.

I was just playing with the scrape function, but I was not able to get the data from the Esso and Shell sites, so I will look at your solution. I tried a few stations and it seems to work fine, not slow at all. As a copy-paste programmer I have to look at how to get the data into HA.

@proton999, you could use the rest sensor of HA:

  - platform: rest
    resource: https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/XXXX.png
    name: Give The Station A Name
    value_template: '{{ value_json.benzine_prijs }}'
    unit_of_measurement: "€"
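
If you want to see what the value_template is doing before putting it in HA, this is roughly the same step in Python. It is only a sketch: the response body below is a made-up sample, and only the benzine_prijs key comes from the sensor config above; `extract_benzine_price` is a hypothetical helper name.

```python
import json

# Hypothetical response body; only the benzine_prijs key is taken from the thread
sample_body = '{"benzine_prijs": 1.989}'


def extract_benzine_price(body: str):
    """Return benzine_prijs from the JSON body, or None if it is missing or null."""
    data = json.loads(body)
    return data.get("benzine_prijs")


print(extract_benzine_price(sample_body))  # 1.989
```

Using .get() means a missing or null price comes back as None instead of raising an error, which is easier to handle than a crash.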

Still improving the code, so if there is a station which is not working or providing wrong prices let me know.

Thanks for the code, as soon as I find some time (probably this weekend) I’ll do some testing and let you know.

It’s working fine, only the last digit is hard for your OCR software to catch: sometimes it’s missing or wrong and sometimes it’s spot on. But I’m already happy with it. I don’t know what the refresh rate of the directlease.nl site or of your site is, so the one hour refresh interval I use is probably too much.
Below are my results and code (I still have to make the card a bit nicer).

Fuel

homeassistant:
  customize_glob:
    sensor.euro95*:
      unit_of_measurement: "€/L"

sensor:
  - platform: rest
    resource: https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/4895.png
    name: Shell Express Dorp
    value_template: '{{ value_json.benzine_prijs }}'
    unit_of_measurement: "€/L"
    scan_interval: 3600 # once per hour only

  - platform: rest
    resource: https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/4916.png
    name: Esso Randweg Noord
    value_template: '{{ value_json.benzine_prijs }}'
    unit_of_measurement: "€/L"
    scan_interval: 3600 # once per hour only

  - platform: rest
    resource: https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/4929.png
    name: Tango Smulders
    value_template: '{{ value_json.benzine_prijs }}'
    unit_of_measurement: "€/L"
    scan_interval: 3600 # once per hour only

  - platform: rest
    resource: https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/4966.png
    name: Tinq Praxis
    value_template: '{{ value_json.benzine_prijs }}'
    unit_of_measurement: "€/L"  
    scan_interval: 3600 # once per hour only

  - platform: rest
    resource: https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/5002.png
    name: Gulf Leto
    value_template: '{{ value_json.benzine_prijs }}'
    unit_of_measurement: "€/L" 
    scan_interval: 3600 # once per hour only
    
  - platform: scrape
    name: Euro95 Advies
    resource: "https://www.unitedconsumers.com/brandstofprijzen"
    select: ".table div:nth-of-type(2) div:nth-of-type(3)"
    value_template: "{{ value |replace('€', '') |replace(',', '.') |round(3) }}"
    scan_interval: 3600 # once per hour only

Great. I improved the code again so it will use a cache of 3600 seconds. I don’t know how good or up to date the directlease prices are, but from what I see the stations seem to be updated several times a day.

Good that you use a scan_interval; it helps with performance on the server side.


Very nice API!
Works like a charm, except that for some fuel stations the OCR’ed data is incorrect or null.
For example 7016.png.

Are you willing to share or open-source your API?
It would be cool to host it myself.

I’ve tried your config for the scrape, but I’m getting an error:

Error loading /config/configuration.yaml: 'utf-8' codec can't decode byte 0x80 in position 6022: invalid start byte

I think there must be something wrong in your configuration? Can you post your code?

Yes, I will definitely publish the code on GitHub! Hopefully I get some time this or next week to make it a version 1.0 and publish it.

I’ve only edited the following in my sensor.yaml:

  - platform: scrape
    name: Euro95 Advies
    resource: "https://www.unitedconsumers.com/brandstofprijzen"
    select: ".table div:nth-of-type(2) div:nth-of-type(3)"
    value_template: "{{ value |replace('€', '') |replace(',', '.') |round(3) }}"
    scan_interval: 3600 # be nice; once per hour only

  - platform: scrape
    name: Diesel Advies
    resource: "https://www.unitedconsumers.com/brandstofprijzen"
    select: ".table div:nth-of-type(3) div:nth-of-type(3)"
    value_template: "{{ value |replace('€', '') |replace(',', '.') |round(3) }}"
    scan_interval: 3600 # be nice; once per hour only

  - platform: scrape
    name: LPG Advies
    resource: "https://www.unitedconsumers.com/brandstofprijzen"
    select: ".table div:nth-of-type(4) div:nth-of-type(3)"
    value_template: "{{ value |replace('€', '') |replace(',', '.') |round(3) }}"
    scan_interval: 3600 # be nice; once per hour only
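
One possible cause of the 'utf-8' codec error above: the byte 0x80 is exactly what a literal € turns into when a file is saved as Windows-1252 instead of UTF-8, and this config has a € in every value_template. Whether that is what happened here is a guess, but the Python sketch below shows the effect (`decodes_as_utf8` is just an illustrative helper):

```python
# One line similar to the config above, containing a literal euro sign
line = "value_template: \"{{ value |replace('€', '') }}\""

utf8_bytes = line.encode("utf-8")
cp1252_bytes = line.encode("cp1252")

# In Windows-1252 the euro sign is the single byte 0x80
print("€".encode("cp1252"))  # b'\x80'
# In UTF-8 it takes three bytes
print("€".encode("utf-8"))   # b'\xe2\x82\xac'


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes are valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(utf8_bytes))    # True
print(decodes_as_utf8(cp1252_bytes))  # False
```

If this is the cause, saving the YAML file again as UTF-8 in your editor should make the error go away.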

@sanderdw very interesting API and thanks for sharing. I added some stations but found that for some of them the prices are not what is published. For example:

2963
Is shown as;

Code:

- platform: rest
  resource: https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/2963.png
  name: BP Heerbaan Breda
  value_template: '{{ value_json.benzine_prijs }}'
  unit_of_measurement: "€/L"
  scan_interval: 3600 # once per hour only

And
3095
Is shown as;


Code;

- platform: rest
  resource: https://brandstof-api.sanwil.net/api/v1/brandstof_prijzen/3095.png
  name: Berkman Beneluxweg Oosterhout
  value_template: '{{ value_json.benzine_prijs }}'
  unit_of_measurement: "€/L"
  scan_interval: 3600 # once per hour only

Any idea why this happens?