Bin / Waste Collection

Could someone help me with a scrape? I am still learning this and getting a bit stuck.
I used Node-RED with an http request node and an html node selecting h3, and I get the following, but I'm not able to get the dates.

https://www.braintree.gov.uk/info/200198/recycling_information_and_advice/1164/route_10_collection_dates

Any help anyone could give would be great.

Hi

I live in Wakefield too. I don't suppose you would share your work, please?

Thanks

Martyn

I am on Ubuntu 18.04 Server and have Node-RED running successfully.

Does anyone know how to install pup? It won't let me install it with sudo apt-get install pup.

I live in Wakefield and have seen one user on here who also scrapes the council website, but if someone could give me some guidance, that would be appreciated.

Thanks in advance

Martyn

Wow, what a wealth of information here. I've managed to get so far but haven't been able to extract any useful information on dates etc. yet. I was wondering if someone could help point me in the right direction.

My council site is https://apps.castlepoint.gov.uk/cpapps/index.cfm?fa=wastecalendar

Using Chrome's F12 developer tools I've found that my road ID is 2757, and this URL takes us there, but I haven't got any further: https://apps.castlepoint.gov.uk/cpapps/index.cfm?roadID=2757&fa=wastecalendar.displayDetails

Would be grateful for some advice.

That’s as far as I got. The scraping is the big bit.

What OS are you using, and have you tried what the original poster did to install pup?

Here are scrape sensors to return the dates for the pink and grey days:

First "pink" day on the page (you don't need an index for the first one; it returns "10"):

- platform: scrape
  name: First Pink
  resource: https://apps.castlepoint.gov.uk/cpapps/index.cfm?roadID=2757&fa=wastecalendar.displayDetails
  select: ".pink"

Second “pink” day on page (indexes start at 0, returns “24”):

- platform: scrape
  name: Second Pink
  resource: https://apps.castlepoint.gov.uk/cpapps/index.cfm?roadID=2757&fa=wastecalendar.displayDetails
  select: ".pink"
  index: 1

And you can do the same for the grey days, but the class is .normal in this case (returns “3”).

- platform: scrape
  name: First Grey
  resource: https://apps.castlepoint.gov.uk/cpapps/index.cfm?roadID=2757&fa=wastecalendar.displayDetails
  select: ".normal"

The month header can be grabbed with the following (the months are the second and third "h2" elements on the page; this returns "June 2019"):

- platform: scrape
  name: First Month
  resource: https://apps.castlepoint.gov.uk/cpapps/index.cfm?roadID=2757&fa=wastecalendar.displayDetails
  select: "h2"
  index: 1

".pink" is the CSS class used for the pink dates, ".normal" for the grey, and "h2" is used for the month headers. Then you just use index to select which occurrence of the element you want; index numbers start at zero and go up from there.
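
If you also want the header for the second month shown (the third "h2" on the page, so index 2), the same pattern should work; the sensor name here is just an example:

- platform: scrape
  name: Second Month
  resource: https://apps.castlepoint.gov.uk/cpapps/index.cfm?roadID=2757&fa=wastecalendar.displayDetails
  select: "h2"
  index: 2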

Hi

https://www.wakefield.gov.uk/site/Where-I-Live-Results?uprn=63121996

This is not my exact address, just a random one, but how would I scrape it, please?

There isn't a calendar on the page, just text.

Thanks in advance

Martyn

So here are instructions on how to accomplish this with a scrape sensor, but I will say that I was unable to get a working sensor for it, as the Wakefield site takes far too long to load and times out.

  1. In Chrome, right-click the element that you want to get and select "Inspect".
  2. In the panel that just opened, right-click the highlighted element.
  3. Select "Copy", then "Copy selector".
  4. Paste the copied selector into the select: field of a scrape sensor.

For the Wakefield site, doing this on the date under "Last collection" for "Household waste" gives us:

#ctl00_PlaceHolderMain_Waste_output > div:nth-child(2) > div:nth-child(2) > div:nth-child(2)

Your sensor would look like this:

- platform: scrape
  resource: https://www.wakefield.gov.uk/site/Where-I-Live-Results?uprn=63121996
  select: "#ctl00_PlaceHolderMain_Waste_output > div:nth-child(2) > div:nth-child(2) > div:nth-child(2)"
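
If the scraped text comes back with extra whitespace or wording around the date, you could also give the sensor a name and tidy the value with a value_template. I haven't been able to test this against the Wakefield page (see the timeout note above), so treat it as a sketch:

- platform: scrape
  name: Household Waste Last Collection
  resource: https://www.wakefield.gov.uk/site/Where-I-Live-Results?uprn=63121996
  select: "#ctl00_PlaceHolderMain_Waste_output > div:nth-child(2) > div:nth-child(2) > div:nth-child(2)"
  value_template: "{{ value | trim }}"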

That's amazing; it looks so easy, but I just couldn't work it out. Thank you, mayker, for your time on this.

Is there a way to create and test scrapes without writing them in YAML and restarting Hass.io each time?

You could scrape the data, set it to a variable in the template editor, and play with the template until you get the correct data; then make the sensor and restart. I do that all the time to check I am trying the right options.
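
As a rough illustration (the sensor name is borrowed from the Castle Point examples above and is purely illustrative), you might paste something like this into the Developer Tools template editor to pull the number out of whatever the scrape returned:

{% set raw = states('sensor.first_pink') %}
{{ raw | regex_findall_index('\d+') }}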

Unfortunately not. There are browser plugins to help find the correct selectors, though. Search for "Google Chrome CSS selector extension".

Thanks David and Robbrad, I will look into both methods.

Thank you for explaining; that's possibly why I wasn't getting anything, but I will keep trying lol

Martyn

Hi,

Sorry for the delay in replying. If you use Node-RED, the following flow will scrape the information and split it into Garden, Recycling, and General waste. Those are then sent via MQTT (which I "sense" in HA for display).

You will need to change the http request URL to the one you obtain when you enter your postcode, but it appears to be working for me.

[{"id":"4737b518.c5bcec","type":"inject","z":"3e0d8381.5dd29c","name":"","topic":"","payload":"","payloadType":"date","repeat":"7200","crontab":"","once":false,"onceDelay":0.1,"x":210,"y":100,"wires":[["686629f7.10a8d8"]]},
{"id":"240a841d.08d0dc","type":"debug","z":"3e0d8381.5dd29c","name":"","active":false,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","x":570,"y":240,"wires":[]},
{"id":"686629f7.10a8d8","type":"http request","z":"3e0d8381.5dd29c","name":"","method":"GET","ret":"txt","url":"","tls":"","x":350,"y":200,"wires":[["240a841d.08d0dc","f489f1db.0789d8","9109ca9c.5945c"]]},
{"id":"f489f1db.0789d8","type":"html","z":"3e0d8381.5dd29c","name":"","property":"payload","outproperty":"payload","tag":"table[class=\"mb10 wilWasteContent gardenFutureData\"]","ret":"html","as":"single","x":390,"y":280,"wires":[["8836ed72.79a688"]]},
{"id":"8836ed72.79a688","type":"debug","z":"3e0d8381.5dd29c","name":"","active":false,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","x":630,"y":340,"wires":[]},
{"id":"9109ca9c.5945c","type":"html","z":"3e0d8381.5dd29c","name":"","property":"payload","outproperty":"payload","tag":"div[class=\"mb10 ind-waste-wrapper\"]","ret":"html","as":"single","x":370,"y":460,"wires":[["c19b6f46.cd128","e7255980.dc6a08","91d1edf5.05ad3","ae205d9f.5ab878"]]},
{"id":"c19b6f46.cd128","type":"debug","z":"3e0d8381.5dd29c","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","x":710,"y":500,"wires":[]},
{"id":"e7255980.dc6a08","type":"html","z":"3e0d8381.5dd29c","name":"","property":"payload[2]","outproperty":"payload","tag":"div","ret":"html","as":"single","x":810,"y":440,"wires":[["b8414891.1b716","ad2e3ccd.e71bf"]]},
{"id":"b8414891.1b716","type":"debug","z":"3e0d8381.5dd29c","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload[6]","x":1000,"y":440,"wires":[]},
{"id":"91d1edf5.05ad3","type":"html","z":"3e0d8381.5dd29c","name":"","property":"payload[1]","outproperty":"payload","tag":"div","ret":"html","as":"single","x":810,"y":380,"wires":[["4503744c.71f94c","54055bd3.a99f34"]]},
{"id":"4503744c.71f94c","type":"debug","z":"3e0d8381.5dd29c","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload[6]","x":1000,"y":380,"wires":[]},
{"id":"ae205d9f.5ab878","type":"html","z":"3e0d8381.5dd29c","name":"","property":"payload[0]","outproperty":"payload","tag":"div","ret":"html","as":"single","x":810,"y":320,"wires":[["26ffccf.40081b4","1bb9846e.98a20c"]]},
{"id":"26ffccf.40081b4","type":"debug","z":"3e0d8381.5dd29c","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload[6]","x":1000,"y":320,"wires":[]},
{"id":"87378101.fd7128","type":"mqtt out","z":"3e0d8381.5dd29c","name":"","topic":"recycling","qos":"2","retain":"true","broker":"db107b84.66e6","x":1180,"y":420,"wires":[]},
{"id":"1cfdde38.c152e2","type":"mqtt out","z":"3e0d8381.5dd29c","name":"","topic":"garden","qos":"2","retain":"true","broker":"db107b84.66e6","x":1180,"y":480,"wires":[]},
{"id":"1c5197ad.e595e","type":"mqtt out","z":"3e0d8381.5dd29c","name":"","topic":"waste","qos":"2","retain":"true","broker":"db107b84.66e6","x":1170,"y":360,"wires":[]},
{"id":"1bb9846e.98a20c","type":"function","z":"3e0d8381.5dd29c","name":"","func":"msg.payload = msg.payload[6];\nreturn msg;","outputs":1,"noerr":0,"x":1010,"y":360,"wires":[["1c5197ad.e595e"]]},
{"id":"54055bd3.a99f34","type":"function","z":"3e0d8381.5dd29c","name":"","func":"msg.payload = msg.payload[6];\nreturn msg;","outputs":1,"noerr":0,"x":1010,"y":420,"wires":[["87378101.fd7128"]]},
{"id":"ad2e3ccd.e71bf","type":"function","z":"3e0d8381.5dd29c","name":"","func":"msg.payload = msg.payload[6];\nreturn msg;","outputs":1,"noerr":0,"x":1010,"y":480,"wires":[["1cfdde38.c152e2"]]},
{"id":"db107b84.66e6","type":"mqtt-broker","z":"","name":"","broker":"localhost","port":"1883","clientid":"","usetls":false,"compatmode":true,"keepalive":"60","cleansession":true,"birthTopic":"","birthQos":"0","birthPayload":"","closeTopic":"","closeQos":"0","closePayload":"","willTopic":"","willQos":"0","willPayload":""}]
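
On the Home Assistant side, MQTT sensors subscribed to those three topics would look something like this (the names are just placeholders; the topics match the flow above):

- platform: mqtt
  name: General Waste Collection
  state_topic: "waste"
- platform: mqtt
  name: Recycling Collection
  state_topic: "recycling"
- platform: mqtt
  name: Garden Waste Collection
  state_topic: "garden"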

I've run 'go get github.com/ericchiang/pup' on my Ubuntu distribution and it has installed the 'pup' binary in the /go/bin folder. Is there anything else I need to do? It keeps coming up with:

Command 'pup' not found, but there are 17 similar ones.

Does running /go/bin/pup work?

/go/bin/pup didn't, but /root/go/bin/pup seems to do something; there's just a cursor waiting for something to be entered.

This is all I get until I press Ctrl+C to stop.

Sorry to be so vague, but I've searched everywhere I could think of and couldn't find any information, hence posting here.
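
For what it's worth, that blinking cursor usually means pup is waiting for HTML on stdin, so a page needs to be piped into it. As a very rough sketch only (the pup path, URL, and selector are borrowed from earlier posts, and it assumes pup is reachable from wherever Home Assistant runs), a command_line sensor could pipe curl into it:

- platform: command_line
  name: First Pink Day
  command: "curl -s 'https://apps.castlepoint.gov.uk/cpapps/index.cfm?roadID=2757&fa=wastecalendar.displayDetails' | /root/go/bin/pup '.pink text{}' | head -n 1"
  scan_interval: 86400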

Node-RED may be the solution for some, but I could not copy that code into my Node-RED. I got a couple of errors with the quotes not being correct and then with mb10.

Not sure if it's just me, but I could not import it via the clipboard to even have a look.

I just tried copying it in myself and can confirm I couldn't import it into my work machine's Node-RED.

I'll have a look into it when I get home tonight.

Sorry about that.