I think I already tried that, but it didn’t work.
I found another way to do what I wanted with Node-RED, but thanks anyway.
Hi @wigster, thanks for writing this up!
Since the release of browserless v2, it seems the debugger is no longer available in the browserless-chrome addon.
I want to write my first scripts based on yours but I have no idea where to place them in the HA file system. Any advice?
Alas I upgraded only yesterday and have hit the same problem. From the documentation it seems as if the debugger should be there, so maybe it’s a misconfiguration of the container?
The hosted demo server at chrome.browserless.io still runs 1.61, so maybe it’s a bit early to try to migrate? It seems like the project is in transition without taking hostages. In any case, you could try to prototype there.
But some minor changes do seem to be necessary in the form of the function call, see
browserless/MIGRATION-2.0.md at main · browserless/browserless (github.com)
Thank you for this guide, it’s helped me solve a problem that’s vexed me for some time!
I had problems calling the /function endpoint using JSON on the command line (it always caused an error in the add-on about no matching WebSocket route handler), but the endpoint also supports receiving the JS code as a binary file upload. This approach avoids the need to minify the code and escape quotation marks, so I think it’s a bit easier. The curl command line in this case looks like
curl --data-binary @/config/myfunction.js -H 'Content-Type: application/javascript' http://localhost:3000/function
I used the debugger on the demo server, and it seems to be generating the 2.0-style browserless functions. I prototyped there and ran the resulting function in the add-on with no issues.
That’s great to know. Can you share the yaml configuration that allows the binary JS file to be used by multiscrape?
I am still trying to figure out how best to do this, but I do not think you can through multiscrape.
Right now, I have a shell script containing a curl command like the one above from @atlflyer. I call it from the automation and send the output to a file in the /config/www folder, which is accessible as a public URL. I then call the multiscrape service from the automation to update that set of sensors, with multiscrape pointed at the file sitting in /config/www.
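A minimal sketch of what such a wrapper script could look like, assuming it takes the JS file and the output path as its two arguments. This is not the script from the guide: the function name run_scraper and the BROWSERLESS_URL variable are made up for illustration, and the curl options simply mirror the command posted above.

```shell
#!/bin/bash
# Hypothetical wrapper: post a JS file to browserless and save the response.
# Usage: browserless_scraper.sh <function.js> <output.html>

# Assumed default; change it to wherever your browserless instance listens.
BROWSERLESS_URL="${BROWSERLESS_URL:-http://localhost:3000}"

run_scraper() {
  js_file="$1"
  out_file="$2"
  if [ ! -f "$js_file" ]; then
    echo "function file not found: $js_file" >&2
    return 1
  fi
  # Upload the JS source as-is, so no minifying or quote escaping is needed.
  # --fail makes curl exit non-zero on an HTTP error status.
  curl --silent --show-error --fail \
    --data-binary "@$js_file" \
    -H 'Content-Type: application/javascript' \
    --output "$out_file" \
    "$BROWSERLESS_URL/function"
}

# Only run when both arguments are supplied.
if [ "$#" -eq 2 ]; then
  run_scraper "$1" "$2"
fi
```

Called as browserless_scraper.sh /config/myfunction.js /config/www/my_output.html, it would leave the result where multiscrape can read it. Setting BROWSERLESS_URL before the call lets the same script target a browserless instance on another machine.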
Once I am happy with the results, I will update the guide.
Have a look at the guide – I’ve updated it for how I have it running on my machine now with browserless v2. There might be a nicer solution though – suggestions welcome.
Thank you for the excellent writeup!
I’m trying to use this method for scraping, but I can’t get the shell command to work. I’ve created a “scripts” folder and put a “browserless_scraper.sh” in the folder, and have this in my configuration.yaml:
shell_command:
  browserless_scraper: "./scripts/browserless_scraper.sh {{function}} {{output}}"
I can find the shell command in home assistant, but when I try to run it, I always get this error logged: FileNotFoundError: [Errno 2] No such file or directory: ‘./scripts/browserless_scraper.sh’
I’m using HomeAssistant on an intel nuc, and I’m guessing I need to specify the path to the folder containing the script somehow differently, but I can’t figure out how.
If I open a terminal through the HA UI, I can see the file, and run it. I’m not all that familiar with the docker containers and what file structure is accessible from where
I have a standard HAOS installation running on an arm machine. In my case you can attach to the docker container of HA by sshing into the machine and sending:
docker exec -it homeassistant /bin/bash
This opens a bash shell in the container, and I believe the environment, paths etc. should be the same as what HA sees. In my case, the /config directory is the default path, but maybe that’s not the same for everyone.
You could try:
shell_command:
  browserless_scraper: "/config/scripts/browserless_scraper.sh {{function}} {{output}}"
So maybe this is the problem: I’ve created the scripts directory inside /config. Is that what you have?
If I ssh into my Home Assistant I come to the “Welcome to the Home Assistant command line”, and the prompt starts with [core-ssh ~]$. I can’t run the docker command, as it says “command not found”. This is what I see in terms of directories, and I have the script under config/scripts
Still, I get this when trying to run it:
FileNotFoundError: [Errno 2] No such file or directory: ‘/config/scripts/browserless_scraper.sh’
I had the same issue.
Adding the word ‘bash’ to the line:
browserless_scraper: "bash ./scripts/browserless_scraper.sh {{function}} {{output}}"
Fixed it for me.
It did! Thank you very much for the help!
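It isn’t established above why the bash prefix helps. One plausible explanation, offered as an assumption only: a FileNotFoundError for a script that clearly exists can come from its first line rather than from the path, for example a shebang that names an interpreter missing from the container, or Windows line endings that append a carriage return to the interpreter name. Running the script through bash sidesteps the shebang altogether. The helper below (diagnose_script is a made-up name) is a rough sketch for checking those cases.

```shell
# Hypothetical helper: report whether a script's shebang line could explain
# a "No such file or directory" error when the script is executed directly.
diagnose_script() {
  script="$1"
  if [ ! -f "$script" ]; then
    echo "missing"
    return 0
  fi
  first_line=$(head -n 1 "$script")
  case "$first_line" in
    '#!'*) ;;
    *) echo "no-shebang"; return 0 ;;
  esac
  # Windows line endings leave a carriage return at the end of the shebang,
  # so the kernel looks for an interpreter whose name ends in "\r".
  cr=$(printf '\r')
  case "$first_line" in
    *"$cr") echo "crlf-shebang"; return 0 ;;
  esac
  # Extract the interpreter path (the first word after "#!").
  interpreter=$(printf '%s\n' "$first_line" | sed -e 's/^#![[:space:]]*//' -e 's/[[:space:]].*$//')
  if [ ! -e "$interpreter" ]; then
    echo "interpreter-missing"
    return 0
  fi
  echo "ok"
}
```

For example, diagnose_script /config/scripts/browserless_scraper.sh. The interpreter check is only meaningful when run in the same container as Home Assistant itself, since a separate SSH container may have different binaries installed.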
I wrote some code and tested it at https://chrome.browserless.io/.
Then I copied the code to a file js_scrapers/warema_zip_runter.js.
Studio Code does not like the file:
export default async ({ page }: { page: Page }) => { …
“Type annotations can only be used in typescript files”
So I changed the filename to warema_zip_runter.ts
Now a different error appears in Studio Code: “Cannot find name ‘Page’”
Then I tried the service: shell_command.browserless_scraper,
I get an error in my_output.html: “missing ) after argument list”
If I change the export call to: export default async ( page ) => { … and try again another error appears: “cannot find page.goto” in my_output.html…
the installation with http://homeassistant.local:3000 is working.
This is my code warema_zip_runter.ts:
// Full TypeScript support for both puppeteer and the DOM
// https://chrome.browserless.io/
export default async ({ page }: { page: Page }) => {
  const optionAll = {
    wa_E5: "Alle runter", // all down
    wa_E6: "Zip runter",  // zip down
    wa_E7: "Dach runter", // roof down
    wa_E8: "Zip hoch",    // zip up
    wa_E9: "Alle hoch"    // all up
  }
  const option = "wa_E6"
  const startTime = Date.now();
  await page.goto('http://webcontrol-warema.t9lewrtkt6fgp5wt.myfritz.net:80', { timeout: 60000 });
  const endTime = Date.now();
  const durationInSeconds = (endTime - startTime) / 1000; // convert milliseconds to seconds
  console.log(`Duration for webcontroll-warema: ${durationInSeconds} seconds`);
  // delay helper
  function delay(time) {
    return new Promise(function(resolve) {
      setTimeout(resolve, time)
    });
  }
  console.log('waiting 10 s for Warema to load');
  await delay(10000)
  console.log(`start: ${ optionAll[option] }`)
  // Set the value of the select element
  await page.evaluate(`document.getElementById("KanalSelectBox").getElementsByTagName("select")[0].value = "${option}"`);
  // Fire a "change" event to update the drop-down menu
  await page.evaluate("document.getElementById('KanalSelectBox').getElementsByTagName('select')[0].dispatchEvent(new Event('change'))");
  // Triggering a click only works for scenes; otherwise the slider would have to be moved
  await page.evaluate("document.getElementById('btn-start').click()");
  // Logs show up in the browser's devtools
  console.log(`finished ${option}: ${ optionAll[option] }`)
  return `finished ${option}: ${ optionAll[option] }`
};
Any help welcome!
I made two mistakes:
- I had installed only Browserless Chrome and forgot the Repository Updater from alex belgium.
- The function has to be plain JavaScript; it works once my JS file uses “export default async ({ page }) => {…}”.
It is working!
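The second fix comes down to sending plain JavaScript: the “: { page: Page }” part is a TypeScript annotation, and per the post above the add-on accepted the function once it was dropped. As a rough sketch, the snippet below writes a minimal plain-JS function file and includes a small check for a leftover annotation. The file path, the example.com URL and the has_type_annotation helper are placeholders made up for illustration; page.goto and page.title are regular puppeteer calls.

```shell
# Hypothetical location for the example; the thread keeps such files under /config.
JS_FILE="${JS_FILE:-${TMPDIR:-/tmp}/example_scraper.js}"

# Write a plain-JavaScript function: "({ page })" with no ": { page: Page }".
cat > "$JS_FILE" <<'EOF'
export default async ({ page }) => {
  // example.com is a placeholder URL
  await page.goto('https://example.com', { timeout: 60000 });
  return await page.title();
};
EOF

# Succeeds (exit status 0) if the file still contains the TypeScript annotation.
has_type_annotation() {
  grep -q ': *{ *page *: *Page *}' "$1"
}
```

Running has_type_annotation on a file before sending it to the /function endpoint gives a quick hint that code copied from the online debugger still needs the annotation removed.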
The browserless add-on is stuck on 2.2.05 and has no option to update to the latest 2.8.0. Is this a Home Assistant bug? And there is no manual installation process for add-ons.
I tried the “Repository Updater” but I just get an error:
CRITICAL: API request was denied despite using an API token. Missing scopes? Expired token? Invalid token?
What does this do? Why isn’t the repository updated automatically?
Do you have an ARM architecture CPU? Apparently browserless does not release ARM Docker images often, so the ARM image gets updated very rarely. I don’t know whether that’s by omission or whether there is a fundamental problem with their ARM implementation.
My HA is running on an ARM chip so I have the same problem. I have ended up running the browserless Docker container on another (Intel) machine and having HA make requests to there.
The changelog for the Browserless Chrome add-on now says not to update and to “switch to chromium” without any further explanation of what that might mean. Does anyone know what’s going on?
I asked if Alex belgium could look at the ARM image.
You can update at your own risk.