Creating a Home Assistant Web Scraper Sensor

debsahu · December 2, 2018, 1:09pm

Here is a demonstration of using lxml to scrape a website, extract the HTML data you need, and pass that data to Home Assistant as a sensor.
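To give a rough idea of the approach, here is a minimal sketch. It is not the code from the repository. The sample page, the element ids, and the helper names `extract_value` and `build_state` are all made up for illustration. A real script would fetch the page over HTTP and then hand the result to Home Assistant in whatever way suits your setup.

```python
from lxml import html

# Hypothetical page content; a real script would download this over HTTP.
SAMPLE_PAGE = """
<html><body>
  <div id="status">
    <span class="label">Temperature</span>
    <span class="value">21.5</span>
  </div>
</body></html>
"""


def extract_value(page_source: str) -> float:
    """Parse the HTML and pull one number out of it with an XPath query."""
    tree = html.fromstring(page_source)
    # The id and class names are assumptions about the (made-up) page layout.
    matches = tree.xpath('//div[@id="status"]/span[@class="value"]/text()')
    if not matches:
        raise ValueError("value element not found in page")
    return float(matches[0].strip())


def build_state(value: float) -> dict:
    """Wrap the scraped value as a state plus attributes (illustrative shape)."""
    return {"state": value, "attributes": {"unit_of_measurement": "°C"}}


if __name__ == "__main__":
    print(build_state(extract_value(SAMPLE_PAGE)))
```

Keeping the parsing in its own function makes it easy to test against a saved copy of the page before pointing it at the live site.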

Source Code: https://github.com/debsahu/lxmlWebScraper

Video: https://www.youtube.com/watch?v=KUYVLubFplM

nickrout · December 2, 2018, 8:48pm

As Home Assistant already has a web scraper, maybe you could explain how this one differs and what the pros and cons are compared to the built-in one.

debsahu · December 2, 2018, 9:06pm

There are many full-blown scrapers: Scrapy, Beautiful Soup, lxml, etc. The one included in HA is designed just for simple web pages. The example I used here scrapes a simple HTML page, but lxml is capable of much more.
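As one small illustration of what "much more" can mean, XPath lets you select an element by the text of its neighbour, which is handy for tables that have no ids or classes. This is only a sketch: the table contents and the `reading_for` helper are invented for the example.

```python
from lxml import html

# Hypothetical table with no ids or classes to hook onto.
SAMPLE_TABLE = """
<table>
  <tr><td>Temperature</td><td>21.5 C</td></tr>
  <tr><td> Humidity </td><td>48 %</td></tr>
</table>
"""


def reading_for(page_source, label):
    """Return the text of the cell that follows the cell containing `label`."""
    tree = html.fromstring(page_source)
    # normalize-space() ignores padding around the label text;
    # $label is an XPath variable filled in from the keyword argument.
    cells = tree.xpath(
        "//td[normalize-space()=$label]/following-sibling::td[1]/text()",
        label=label,
    )
    return cells[0].strip() if cells else None


if __name__ == "__main__":
    print(reading_for(SAMPLE_TABLE, "Humidity"))
```

Passing the label as an XPath variable instead of formatting it into the query string avoids quoting problems if the label contains quotes.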

nickrout · December 2, 2018, 9:12pm

Thank you, so lxml is the point of differentiation. Useful to know. When I get a spare few hours to learn something new, I’ll look at lxml. Cheers, and good to see contributions coming in all the time.

gsksim · December 25, 2018, 1:03am

Hi,

How do I install lxml in hassio?

debsahu · December 25, 2018, 6:45am

I’m not familiar with Docker. I gave up on hassio and use regular Hassbian on my end.