Home Assistant Community

Scrape information from a car sales website

#1

How can I scrape information from this example website link using the scrape sensor?

Can someone show me an example of how to get the text and a picture from the website at the link?

#2

We need more information and an example of what you already tried and what errors are produced.

#3

Sorry @bosborne.

For a start, I am trying to get the number of results.

I tried it like this, but it is not working:

  - platform: scrape
    resource: https://www.avto.net/Ads/results.asp?znamka=Alfa%20Romeo&model=GT&modelID=&tip=katerikoli%20tip&znamka2=&model2=&tip2=katerikoli%20tip&znamka3=&model3=&tip3=katerikoli%20tip&cenaMin=0&cenaMax=999999&letnikMin=0&letnikMax=2090&bencin=0&starost2=999&oblika=0&ccmMin=0&ccmMax=99999&mocMin=&mocMax=&kmMin=0&kmMax=9999999&kwMin=0&kwMax=999&motortakt=&motorvalji=&lokacija=0&sirina=&dolzina=&dolzinaMIN=&dolzinaMAX=&nosilnostMIN=&nosilnostMAX=&lezisc=&presek=&premer=&col=&vijakov=&EToznaka=&vozilo=&airbag=&barva=&barvaint=&EQ1=1000000000&EQ2=1000000000&EQ3=1000000000&EQ4=100000000&EQ5=1000000000&EQ6=1000000000&EQ7=1110100120&EQ8=1010000001&EQ9=100000000&KAT=1010000000&PIA=&PIAzero=&PSLO=&akcija=&paketgarancije=&broker=&prikazkategorije=&kategorija=&zaloga=&arhiv=&presort=&tipsort=&stran=
    name: ALFA GT
    select: '.ResultsAdTopLeft Left'

#4

I have not used the scrape sensor myself. Let me see if I can figure it out.

#5

Does this give you the results you want? I replaced the space in the last line with a period (.).

  - platform: scrape
    resource: https://www.avto.net/Ads/results.asp?znamka=Alfa%20Romeo&model=GT&modelID=&tip=katerikoli%20tip&znamka2=&model2=&tip2=katerikoli%20tip&znamka3=&model3=&tip3=katerikoli%20tip&cenaMin=0&cenaMax=999999&letnikMin=0&letnikMax=2090&bencin=0&starost2=999&oblika=0&ccmMin=0&ccmMax=99999&mocMin=&mocMax=&kmMin=0&kmMax=9999999&kwMin=0&kwMax=999&motortakt=&motorvalji=&lokacija=0&sirina=&dolzina=&dolzinaMIN=&dolzinaMAX=&nosilnostMIN=&nosilnostMAX=&lezisc=&presek=&premer=&col=&vijakov=&EToznaka=&vozilo=&airbag=&barva=&barvaint=&EQ1=1000000000&EQ2=1000000000&EQ3=1000000000&EQ4=100000000&EQ5=1000000000&EQ6=1000000000&EQ7=1110100120&EQ8=1010000001&EQ9=100000000&KAT=1010000000&PIA=&PIAzero=&PSLO=&akcija=&paketgarancije=&broker=&prikazkategorije=&kategorija=&zaloga=&arhiv=&presort=&tipsort=&stran=
    name: ALFA GT
    select: '.ResultsAdTopLeft.Left'

The sensor returns: Zadetki 1 - 15 od skupno 15 (Slovenian for "Results 1 - 15 of 15 in total").
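
For context on why that one character matters: in CSS selector syntax a space is the descendant combinator, while classes chained with dots must all sit on the same element. The sketch below contrasts the two forms. It assumes the page marks up the result counter with something like class="ResultsAdTopLeft Left", which is inferred from the selector that worked and not checked against the page source.

```yaml
    # Matches nothing here: the space means "a <Left> tag nested
    # inside an element with class ResultsAdTopLeft"
    # select: '.ResultsAdTopLeft Left'

    # Matches a single element that carries both classes,
    # ResultsAdTopLeft and Left
    select: '.ResultsAdTopLeft.Left'
```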

#6

Thanks @bosborne.

Can I have two results in one scrape sensor?

I tried it like this, but it is not working:

  - platform: scrape
    resource: https://www.avto.net/Ads/results.asp?znamka=Alfa%20Romeo&model=GT&modelID=&tip=katerikoli%20tip&znamka2=&model2=&tip2=katerikoli%20tip&znamka3=&model3=&tip3=katerikoli%20tip&cenaMin=0&cenaMax=999999&letnikMin=0&letnikMax=2090&bencin=0&starost2=999&oblika=0&ccmMin=0&ccmMax=99999&mocMin=&mocMax=&kmMin=0&kmMax=9999999&kwMin=0&kwMax=999&motortakt=&motorvalji=&lokacija=0&sirina=&dolzina=&dolzinaMIN=&dolzinaMAX=&nosilnostMIN=&nosilnostMAX=&lezisc=&presek=&premer=&col=&vijakov=&EToznaka=&vozilo=&airbag=&barva=&barvaint=&EQ1=1000000000&EQ2=1000000000&EQ3=1000000000&EQ4=100000000&EQ5=1000000000&EQ6=1000000000&EQ7=1110100120&EQ8=1010000001&EQ9=100000000&KAT=1010000000&PIA=&PIAzero=&PSLO=&akcija=&paketgarancije=&broker=&prikazkategorije=&kategorija=&zaloga=&arhiv=&presort=&tipsort=&stran=
    name: position_000
    select: '.ResultsAdDataTop .ResultsAdPrice'
    index: 0

#7

I do not know. That was the first time I used that sensor. You may need two sensors and then combine the data.
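
A rough sketch of the two-sensor idea: each scrape sensor picks one value, and a template sensor joins them. The entity IDs sensor.alfa_gt and sensor.position_000 assume Home Assistant derives them from the names used earlier in the thread, alfa_gt_summary is a made-up name, and the resource URL is shortened for readability, so substitute the full search URL from the posts above. This is untested against the site and uses the same platform-style configuration as the rest of the thread.

```yaml
  # Sensor 1: the result counter (selector from post #5).
  # URL shortened for illustration; use the full search URL.
  - platform: scrape
    resource: https://www.avto.net/Ads/results.asp?znamka=Alfa%20Romeo&model=GT
    name: ALFA GT
    select: '.ResultsAdTopLeft.Left'

  # Sensor 2: the price of the first ad (selector from post #6).
  - platform: scrape
    resource: https://www.avto.net/Ads/results.asp?znamka=Alfa%20Romeo&model=GT
    name: position_000
    select: '.ResultsAdDataTop .ResultsAdPrice'
    index: 0

  # Combine both states into one sensor.
  # alfa_gt_summary is a hypothetical name.
  - platform: template
    sensors:
      alfa_gt_summary:
        friendly_name: ALFA GT summary
        value_template: "{{ states('sensor.alfa_gt') }} / {{ states('sensor.position_000') }}"
```

Keeping one value per scrape sensor also means each one can be checked on its own before the values are combined.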

#8

Here is an example which shows two values:

  - platform: scrape
    resource: http://www.bom.gov.au/vic/forecasts/melbourne.shtml
    name: Melbourne Forecast Summary
    select: ".main .forecast p"
    value_template: '{{ value | truncate(255) }}'
    # Request every hour
    scan_interval: 3600
    headers:
      User-Agent: Mozilla/5.0

Why does my sensor show unknown?

#9

What have you tried? I see the example uses double quotes and yours has single quotes, for example.

#10

I already tried that, but the result is still the same. Any ideas?

#11

I get this error:

Unable to extract data from HTML

#12

How can I remove some text from the sensor using a value template?

I have this sensor:

  - platform: scrape
    resource: https://www.avto.net/Ads/results.asp?znamka=Alfa%20Romeo&model=GT&modelID=&tip=katerikoli%20tip&znamka2=&model2=&tip2=katerikoli%20tip&znamka3=&model3=&tip3=katerikoli%20tip&cenaMin=0&cenaMax=999999&letnikMin=0&letnikMax=2090&bencin=0&starost2=999&oblika=0&ccmMin=0&ccmMax=99999&mocMin=&mocMax=&kmMin=0&kmMax=9999999&kwMin=0&kwMax=999&motortakt=&motorvalji=&lokacija=0&sirina=&dolzina=&dolzinaMIN=&dolzinaMAX=&nosilnostMIN=&nosilnostMAX=&lezisc=&presek=&premer=&col=&vijakov=&EToznaka=&vozilo=&airbag=&barva=&barvaint=&EQ1=1000000000&EQ2=1000000000&EQ3=1000000000&EQ4=100000000&EQ5=1000000000&EQ6=1000000000&EQ7=1110100120&EQ8=1010000001&EQ9=100000000&KAT=1010000000&PIA=&PIAzero=&PSLO=&akcija=&paketgarancije=&broker=&prikazkategorije=&kategorija=&zaloga=&arhiv=&presort=&tipsort=&stran=
    name: position_000
    select: '.ResultsAd'
    index: 0
    value_template: '{{ value | truncate(255) }}'

I get this output:

Alfa Romeo GT 2.0 JTS Distinctive Letnik 1.registracije:2004 215000 kmbencinski motor, 1970 ccm, 122 kW / 166 KMročni menjalnik (5 pr.) Ogled oglasa Parkiraj v moj.avto.net
 1.600 

How can I remove the text "Ogled oglasa Parkiraj v moj.avto.net" from the sensor using value_template?
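
One approach that might work, as an untested sketch: Jinja2's replace filter can swap a substring for an empty string, and trim strips whitespace from both ends. The two phrases are replaced separately on the assumption that they come from separate page elements, so the whitespace between them may not be a single space.

```yaml
    # Drop the two link labels, strip whitespace at the ends,
    # then cap the state at 255 characters as before
    value_template: "{{ value | replace('Ogled oglasa', '') | replace('Parkiraj v moj.avto.net', '') | trim | truncate(255) }}"
```

Whitespace left in the middle of the string, for example before the price, is not touched by trim and would need further handling.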