I know there are scraping tools like Scrape or Multiscrape, but they only work on very simple websites, either with basic auth or without any authentication. Nowadays most sites have much more elaborate auth processes, which makes those tools useless most of the time.
Is there any way I can create a Selenium-like script that runs in a headless browser to scrape some data from websites?
I modified the script to use the source (a gas distributor) and create a JSON to be pushed to HA long-term statistics (that part is not visible in the above link, though).
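To give an idea of what "create a JSON for long-term statistics" can look like, here is a minimal sketch, not the actual script. It assumes the payload shape accepted by the `recorder.import_statistics` action; the function name `build_statistics_payload`, the statistic id `sensor.gas_consumption` and the unit are made up for illustration, and the readings are assumed to be timezone-aware cumulative meter values.

```python
import json
from datetime import datetime, timezone


def build_statistics_payload(readings, statistic_id="sensor.gas_consumption", unit="m³"):
    """Turn (datetime, meter_value) pairs into a statistics-style dict.

    Hypothetical helper: `readings` must contain timezone-aware datetimes
    and cumulative meter values.
    """
    ordered = sorted(readings)
    baseline = ordered[0][1] if ordered else 0.0
    stats = []
    for ts, value in ordered:
        # Long-term statistics are hourly, so snap to the top of the hour (UTC).
        start = ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
        stats.append({
            "start": start.isoformat(),
            "state": value,            # raw meter reading
            "sum": value - baseline,   # growth since the first reading
        })
    return {
        "statistic_id": statistic_id,  # assumed id, adapt to your own sensor
        "source": "recorder",
        "name": "Gas consumption",
        "has_mean": False,
        "has_sum": True,
        "unit_of_measurement": unit,
        "stats": stats,
    }


def to_json(payload):
    # Keep non-ASCII characters such as "m³" readable in the output.
    return json.dumps(payload, ensure_ascii=False, indent=2)
```

The resulting dict could then be passed as the data of the action call, or dumped with `to_json` for inspection first. How the sum should be carried over between runs depends on your setup, so treat the baseline handling here as a placeholder.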
The solution requires some pre-installed components that are not available by default in HA, so it uses a container.
In the end the Chromium approach proved unstable across multiple architectures, so I returned to session requests (later releases). Starting a Chromium browser in a pre-defined container was not that easy, especially since I needed to get it to work as an add-on (more complexity). But then it was straightforward, and the meters2ha above also has a section that leans on outsourced captcha solutions (really neat).
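For reference, the "session requests" idea boils down to something like the sketch below. It is only an illustration under assumptions: the function name `fetch_meter_data`, the form field names `username`/`password`, and the idea that the provider returns JSON after a plain form login are all hypothetical, and real providers often need extra steps (CSRF tokens, redirects, captcha).

```python
def fetch_meter_data(session, login_url, data_url, username, password):
    """Log in with a form POST, then fetch data using the same session.

    `session` is expected to behave like `requests.Session()`: it keeps the
    cookies set by the login response and sends them with the next request.
    """
    # Step 1: submit the login form (field names are assumptions).
    login = session.post(
        login_url,
        data={"username": username, "password": password},
        timeout=30,
    )
    login.raise_for_status()

    # Step 2: reuse the authenticated session to get the actual data.
    response = session.get(data_url, timeout=30)
    response.raise_for_status()
    return response.json()
```

With the `requests` library installed, this would be called as `fetch_meter_data(requests.Session(), login_url, data_url, user, password)`, with the two URLs taken from your provider's site. Passing the session in as a parameter also makes it easy to try the function with a fake session before pointing it at a real site.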
EDIT: this topic may be too wide… if you want, you can send me a PM (or Discord)
EDIT+: reminiscing, my examples only focus on getting access to a provider, so not really scraping… may not apply to you