2022.12: Scrape and scan interval

Ugh… why scrape, why scrape.

Looking at twenty-odd scrape sensors to re-add :unamused:

Scan interval seems to be missing from Scrape’s UI config; also can’t find it in .storage/core.config_entries

The only timespan one can set is “timeout”.

Were you able to set it?

Yep, no “scan interval”.
For now I’m using it as timeout.

timeout integer (optional, default: 10)
Defines the maximum time to wait for data from the endpoint.

Should work the same way.

Ps. It would be quite helpful, in Scrape’s UI, if the name of the sensor came up, instead of the resource-url…

Ps2. Hmmm… the startup time went from 60.1s to 5.9s after moving Scrape out of YAML. :face_with_raised_eyebrow:

Timeout is substantially different imo.

Every scan_interval seconds a request is made to the resource, waiting timeout seconds for a response.
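To make the difference concrete, here is a minimal YAML sketch with both options side by side. It assumes the `scrape:` YAML format; the resource URL, sensor name, and CSS selector are placeholders I made up, so check the option names against the Scrape documentation for your version.

```yaml
# Sketch only: the URL, name, and selector below are hypothetical.
scrape:
  - resource: https://example.com/status
    # How often a request is made (seconds between polls).
    scan_interval: 3600
    # How long each individual request may wait for a response (seconds).
    timeout: 10
    sensor:
      - name: Example status
        select: ".status"
```

With this, one request goes out every hour, and each request gives up after 10 seconds. Raising `timeout` alone would not reduce the number of requests.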

So even if you set timeout to 86400 seconds (a day), there will still be many more requests than one per day (how many depends on the default scan_interval).

That is because the HA standard does not allow setting the scan interval. It is being removed from all integrations. A workaround is to disable automatic updates and call the homeassistant.update_entity service at your own frequency from an automation.

I would be OK with moving Scrape to the GUI, but why is it not being imported automatically, as it was for other integrations that moved to the GUI? Will that come at a later stage?

Thank you, this is good to know.
This could lead to some requests running into rate limiting if left on the default global polling…

That is already the case for some cloud integrations whose APIs have limits… I’m not really sure why this has been removed; maybe the HA core team could remind us or explain?

I only had 4 sensors and the migration was simple. I removed them from the YAML first to avoid creating duplicate entity IDs with a _2 suffix. Worked very well. There is actually one added value: if you are scraping more than one sensor from the same resource (web page), you can now group them under one integration entry, so I’m assuming the page will be fetched only once for all sensors. Quite elegant, I must say (although I think this was possible in YAML as well).

homeassistant.update_entity is configurable per entity, so rate limiting is in the hands of the user.

homeassistant.update_entity forces an entity update, but I’d have to entirely disable automatic updates for each entity first. How is that done? I could not find anything in the docs, just desperate forum posts :slight_smile:

https://community.home-assistant.io/t/2022-12-it-does-matter/499441/81

Yes, how do you disable auto update?

Disable polling in the system options for that entity. (I haven’t updated to 2022.12 just yet, so I don’t have the new UI-based scrape integration, but the setting should be the same as for any other UI-configured polled entity.)

  1. Click the three-dot menu in the scrape integration card for that entity, on the Integrations page of your Home Assistant Settings, and open the system options

  2. Turn off “Enable polling for updates”

Congratulations! :slight_smile: Now you’re responsible for updating this entity (for example, with an automation calling the homeassistant.update_entity service). I like the flexibility to change how often I update an entity like this; I can base it off other conditions (e.g. poll a radio station website frequently if I’m actively listening to it so I can get song title and artist info, otherwise, don’t bother polling at all).
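A rough sketch of such an automation, in the trigger/service syntax current as of 2022.12. The entity IDs `sensor.radio_now_playing` and `media_player.kitchen_radio` are made up for illustration; substitute your own.

```yaml
# Sketch only: both entity IDs below are hypothetical.
automation:
  - alias: "Poll scrape sensor while the radio is playing"
    trigger:
      # Fire every 5 minutes (minutes divisible by 5).
      - platform: time_pattern
        minutes: "/5"
    condition:
      # Only poll while the player is actually playing.
      - condition: state
        entity_id: media_player.kitchen_radio
        state: "playing"
    action:
      # Force a single update of the scrape sensor.
      - service: homeassistant.update_entity
        target:
          entity_id: sensor.radio_now_playing
```

Drop the condition block if you just want a fixed interval, and change the `minutes` pattern to whatever rate the remote site tolerates.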

Both UI and YAML setup are still supported, and YAML provides additional configuration possibilities.
See Scrape - Home Assistant

Yes, I have read that. But will it be automatically moved to UI at some point in time like it has been done for many integrations in the past?

What will happen with those additional possibilities when yaml is not supported anymore? Or maybe there are no plans to drop the yaml support?

Thank you very much, works like a charm! A little bit more cumbersome than the scan_interval approach, but I really like the full flexibility a lot!

Or if a deprecated functionality (like SCAN_INTERVAL) is never being ported to the UI?

Scan interval is removed in the UI because the intention is for the user to create an automation that forces updates using whatever trigger they want.

Thanks.

Do you happen to have a time pattern that runs every hour (or 3600 seconds) after the previous run, so not at a fixed time on the hour? Basically it would run when HA is started, and then exactly 60 minutes after that, and so on.

hours: "*"
minutes: 0
seconds: 0

This will run every hour at a fixed time
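For the start-relative schedule asked about above, a time pattern alone will not do it, since it always matches wall-clock values. One possible approach, offered as an untested sketch rather than a recommended pattern, is to trigger on Home Assistant start and loop with a one-hour delay. The entity ID `sensor.my_scrape_sensor` is a placeholder.

```yaml
# Sketch only: sensor.my_scrape_sensor is a hypothetical entity ID.
automation:
  - alias: "Update scrape sensor hourly, counted from HA start"
    trigger:
      # Fires once when Home Assistant has started.
      - platform: homeassistant
        event: start
    action:
      - repeat:
          # Loop indefinitely; the delay below paces each pass.
          while:
            - condition: template
              value_template: "{{ true }}"
          sequence:
            - service: homeassistant.update_entity
              target:
                entity_id: sensor.my_scrape_sensor
            - delay:
                hours: 1
```

Note that the loop stops if the automation is reloaded or turned off, and it only starts again at the next Home Assistant start. The interval will also drift slightly by however long each update takes.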