Delay scrape to wait for page to load

Saturnus · June 11, 2019, 12:30pm

While the page is loading, the value I want to scrape is shown as n/a, and of course that is all scrape grabs before it moves on.

Is it somehow possible to delay the scraping to wait for a page to complete its inline loading?

lolouk44 · June 11, 2019, 12:39pm

What gets loaded? If the page loads first and then JavaScript (or another script) loads the data, you won’t be able to get it, I’m afraid. It’s one of the component’s limitations…

Saturnus · June 11, 2019, 12:52pm

I am not very familiar with it, but I think it is AJAX. It takes about 5 seconds. As expected, scrape does not wait for ‘n/a’ to change to a number.

lolouk44 · June 11, 2019, 1:02pm

Check this page to see if the method applies to what you’re trying to do:
http://toddhayton.com/2015/03/11/scraping-ajax-pages-with-python/

Saturnus · June 11, 2019, 3:15pm

Thanks, that might be something for the future.

On a more general note, some weeks ago I tested scrape and it seemed to do well on a static page. Just now I tried several more pages and got nothing at all. What an absolutely horrible component it is.

lolouk44 · June 11, 2019, 3:53pm

It takes a bit of learning to get the data out, but once you get the idea it’s pretty cool.
Also, depending on what you want, you may be better off installing BeautifulSoup and running your own script, then sending the data to HA via MQTT (that’s what I did)…
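
A rough sketch of what such a script could look like. This is not my actual script: to keep it dependency-free it uses the standard library's `html.parser` as a stand-in for BeautifulSoup, and the names `IdTextExtractor`, `extract_by_id`, `build_mqtt_message` and the topic `home/scrape/energy_today` are all made up for illustration. The actual publish step (for example with an MQTT client library) is left out.

```python
import json
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class IdTextExtractor(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0        # > 0 while inside the target element
        self.found = False    # True once the target element was seen
        self.done = False     # True once the target element was closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1
            self.found = True

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_by_id(html, element_id):
    """Return the stripped text of the element with this id, or None."""
    parser = IdTextExtractor(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip() if parser.found else None


def build_mqtt_message(value, topic="home/scrape/energy_today"):
    """Return a (topic, JSON payload) pair; the topic is a hypothetical example."""
    return topic, json.dumps({"value": value})
```

With the page's HTML in a string `html`, `extract_by_id(html, "today")` returns the element's text, and the result of `build_mqtt_message(...)` can be handed to whatever MQTT client you use. Note that this still only sees the HTML the server sends, so it does not by itself solve the JavaScript problem.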

oliverdog · August 17, 2020, 2:28am

I have another example that needs a delay before scraping!

I am trying to scrape energy production from the APsystems EMA website:
https://www.apsystemsema.com/ema/intoDemoUser.action?id=0b28481b73ec0fd90173f4ee8ab51ffd&locale=pt_BR

The code works fine:

  - platform: scrape
    resource: https://www.apsystemsema.com/ema/intoDemoUser.action?id=0b28481b73ec0fd90173f4ee8ab51ffd&locale=pt_BR
    name: Energia Hoje
    select: "#today"
    unit_of_measurement: "kWh"
    scan_interval: 10

But it gives me “0 kWh” because the value takes 500 ms to load!

Did you have any success, @Saturnus?

Saturnus · August 20, 2020, 2:44pm

No, there is no support for a delay with the built-in stuff.
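
Outside the built-in integration, a delay could be approximated in your own script with a retry loop. A minimal sketch, assuming you already have some function that reads the current value: `poll_until_ready`, `read_value` and the placeholder list are hypothetical names, not part of Home Assistant. Retrying only helps if each read sees fresh data (for example a rendered page from a headless browser, or the endpoint the page's AJAX call uses); re-fetching the same static HTML will keep returning the placeholder.

```python
import time


def poll_until_ready(read_value, placeholders=("n/a", "0", ""),
                     attempts=5, delay=1.0, sleep=time.sleep):
    """Call read_value() until it returns something that is not a placeholder.

    Returns the stripped value, or None if every attempt gave a placeholder.
    Treating "0" as a placeholder is an assumption: a genuine zero reading
    (e.g. no production at night) would also be reported as None.
    """
    for attempt in range(attempts):
        value = read_value()
        if value is not None and value.strip().lower() not in placeholders:
            return value.strip()
        # Wait before the next try, but not after the last one.
        if attempt < attempts - 1:
            sleep(delay)
    return None
```

The `sleep` parameter exists only so the loop can be tested without actually waiting; in normal use leave it at the default.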

manjotsc · May 22, 2021, 10:06am

Any update on a delay function?