This thread shows that errno -3 is related to a temporary failure in name resolution. I have no idea what is causing that.
I think it would be useful to have a config option in multiscrape that doesn’t change the state to unknown when the value can’t be updated, but simply keeps the old value.
That’s already possible, check the on-error functionality.
That might lead me to the issue.
I have a custom DNS within Pi-hole and that might be causing some issues.
I imagine it might be querying my custom DNS and, when the name isn’t found there, giving the error instead of falling back to the secondary DNS. It would therefore only give proper values when the name resolution is cached (i.e. because I opened HA and the iframe loaded the value).
I’ll dig more into it.
I’m also going to check out on_error, since I’d also like that behaviour.
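In case it helps with the digging: a minimal Python sketch for checking whether a hostname resolves and, if not, whether the failure is the temporary kind (EAI_AGAIN, which is -3 on Linux/glibc). The name check_resolution and the injectable resolver parameter are my own, purely for illustration; this is not how multiscrape performs its lookups.

```python
import socket


def check_resolution(host, resolver=socket.getaddrinfo):
    """Return (ok, detail) for a DNS lookup of `host`.

    `resolver` is a hypothetical hook so the function can be tried
    without a network; by default it is the real socket.getaddrinfo.
    """
    try:
        infos = resolver(host, 443)
    except socket.gaierror as err:
        # EAI_AGAIN is the "Temporary failure in name resolution" case.
        kind = "temporary" if err.errno == socket.EAI_AGAIN else "permanent"
        return False, f"{kind} failure: {err}"
    # Each entry is (family, type, proto, canonname, sockaddr).
    addresses = sorted({info[4][0] for info in infos})
    return True, ", ".join(addresses)
```

Running check_resolution("example.com") from the machine that hosts HA, once with Pi-hole as the resolver and once with the secondary DNS, should show whether the fallback is really the problem.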
Here is the relevant portion of the main HA log with debugging enabled:
2023-06-16 17:04:48.852 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Scraper_noname_1 # New run: start (re)loading data from resource
2023-06-16 17:04:48.852 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Scraper_noname_1 # Deleting logging files from previous run
2023-06-16 17:04:50.600 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Scraper_noname_1 # Rendered resource template into: https://tempestwx.com/map/45085/41.3807/-85.0447/17
2023-06-16 17:04:50.600 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Scraper_noname_1 # Request data from https://tempestwx.com/map/45085/41.3807/-85.0447/17
2023-06-16 17:04:50.600 DEBUG (MainThread) [custom_components.multiscrape.http] Scraper_noname_1 # Executing page-request with a get to url: https://tempestwx.com/map/45085/41.3807/-85.0447/17.
2023-06-16 17:04:50.857 DEBUG (MainThread) [custom_components.multiscrape.http] Scraper_noname_1 # request_headers written to file: page_request_headers.txt
2023-06-16 17:04:51.186 DEBUG (MainThread) [custom_components.multiscrape.http] Scraper_noname_1 # request_body written to file: page_request_body.txt
2023-06-16 17:04:53.829 DEBUG (MainThread) [custom_components.multiscrape.http] Scraper_noname_1 # Response status code received: 200
2023-06-16 17:04:53.905 DEBUG (MainThread) [custom_components.multiscrape.http] Scraper_noname_1 # response_headers written to file: page_response_headers.txt
2023-06-16 17:04:53.923 DEBUG (MainThread) [custom_components.multiscrape.http] Scraper_noname_1 # response_body written to file: page_response_body.txt
2023-06-16 17:04:53.923 DEBUG (MainThread) [custom_components.multiscrape.scraper] Scraper_noname_1 # Loading the content in BeautifulSoup.
2023-06-16 17:04:53.958 DEBUG (MainThread) [custom_components.multiscrape.scraper] Scraper_noname_1 # page_soup written to file: page_soup.txt
2023-06-16 17:04:53.958 DEBUG (MainThread) [custom_components.multiscrape.coordinator] Scraper_noname_1 # Data succesfully refreshed. Sensors will now start scraping to update.
2023-06-16 17:05:10.014 DEBUG (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_1 # MS Tespest WX Rain Today 1 # Setting up sensor
2023-06-16 17:05:10.029 DEBUG (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_1 # MS Tespest WX Rain Yesterday 1 # Setting up sensor
2023-06-16 17:05:10.056 DEBUG (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_1 # MS Tespest WX Rain Today 1 # Start scraping to update sensor
2023-06-16 17:05:10.058 DEBUG (MainThread) [custom_components.multiscrape.scraper] Scraper_noname_1 # MS Tespest WX Rain Today 1 # Tag selected: None
2023-06-16 17:05:10.058 ERROR (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_1 # MS Tespest WX Rain Today 1 # Unable to scrape data: Could not find a tag for given selector
Consider using debug logging and log_response for further investigation.
2023-06-16 17:05:10.062 DEBUG (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_1 # MS Tespest WX Rain Today 1 # On-error, set value to None
2023-06-16 17:05:10.062 DEBUG (MainThread) [custom_components.multiscrape.entity] Scraper_noname_1 # MS Tespest WX Rain Today 1 # Updated sensor and attributes, now adding to HA
2023-06-16 17:05:10.107 DEBUG (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_1 # MS Tespest WX Rain Yesterday 1 # Start scraping to update sensor
2023-06-16 17:05:10.108 DEBUG (MainThread) [custom_components.multiscrape.scraper] Scraper_noname_1 # MS Tespest WX Rain Yesterday 1 # Tag selected: None
2023-06-16 17:05:10.108 ERROR (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_1 # MS Tespest WX Rain Yesterday 1 # Unable to scrape data: Could not find a tag for given selector
Consider using debug logging and log_response for further investigation.
2023-06-16 17:05:10.111 DEBUG (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_1 # MS Tespest WX Rain Yesterday 1 # On-error, set value to None
2023-06-16 17:05:10.111 DEBUG (MainThread) [custom_components.multiscrape.entity] Scraper_noname_1 # MS Tespest WX Rain Yesterday 1 # Updated sensor and attributes, now adding to HA
I obviously also have the log_response data but there’s a lot here already and I don’t want to data dump it all at once so tell me what else I need to post and I will.
Is there a way to make value_template return an array? I’ve been playing with different ways to do it, but it seems like it’s always returning a string:
Here is my current config:
multiscrape:
  # Debug with: https://try.jsoup.org/
  - name: IQAir Scraper
    resource: https://www.iqair.com/usa/colorado/colorado-springs
    scan_interval: 600
    log_response: true
    sensor:
      - unique_id: ms_iq_air_0
        name: IQAir Now
        select_list: ".pollutant-level-wrapper b"
        value_template: "{{value.split(',')[4] | int }}"
        select: "forecast"
        attributes:
          - name: Main Pollutant
            select: ".aqi-overview-detail__main-pollution-table td:last-child"
          - name: Array_test
            select: ".pollutant-level-wrapper b"
            value_template: "{{[1,2,3,4]}}"
          - name: Raw Data
            select_list: ".pollutant-level-wrapper b"
            value_template: |
              {%- set values = value.split(',') -%}
              {%- for x in values %}
              - {{ x | int}}
              {%- endfor -%}
          - name: History
            select_list: ".pollutant-level-wrapper b"
            value_template: "{{value.split(',')[0:4] | reverse | list | map('int') | list }}"
          - name: Forecast
            select_list: ".pollutant-level-wrapper b"
            value_template: "{{value.split(',')[5:] | map('int') | list}}"
But no matter how I swing it, I get a string back when I try to read a value:
-- DATA
{{ states.sensor.ms_iq_air_0.attributes.array_test[2]}}
{{ states.sensor.ms_iq_air_0.attributes.history}}
{{ states.sensor.ms_iq_air_0.attributes.history[0]}}
{{ states.sensor.ms_iq_air_0.attributes.history | list }}
{{ states.sensor.ms_iq_air_0.attributes.history.split(',') }}
--
If the answer is no I can happily live with it, but I figure there should be a way to get an array of ints…
Running the is string test on all of these always returns true.
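For reference, this is what I expect the templates to produce, written as plain Python instead of Jinja. The sample value is made up, and I’m assuming select_list hands the template a comma-joined string (which is what the split(',') in my config relies on). The last lines show the symptom: once the list has been turned into text, indexing gives a character rather than an int.

```python
# Made-up sample of what the template receives in `value`.
value = "10,20,30,40,50,60,70"

parts = value.split(",")
now = int(parts[4])                                # the main state
history = [int(p) for p in reversed(parts[0:4])]   # newest first
forecast = [int(p) for p in parts[5:]]

# What seems to happen to the attribute: the list ends up as text.
history_text = str(history)

first_from_list = history[0]       # an int
first_from_text = history_text[0]  # just the opening bracket
```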
Use the Beautiful Soup capture (a .txt file in a multiscrape directory in your config directory). I recently had a case where it helped me. This is the HTML that is getting parsed. You might be surprised, but what your browser shows isn’t always identical to what will be parsed here. See the integration’s docs for more info.
Thanks for the replies…I’ve been out of town for a few days…
Right, I’m trying to just “borrow” some data from someone else’s weather station so I don’t have a key/token.
I tried that but I can’t seem to find the data field I’m looking for in that capture. I guess I’m not sure what I’m looking for to find something in a sidebar but not in the main page.
Then I think your luck is out, because that capture is literally what the parser sees. I had a recent case like that too. The issue will be a dynamic call that executes after page load, so the content isn’t actually there until later.
You’ll need to check in the browser dev tools which calls are being made during page load and see if you can use one of those to call directly.
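A quick way to confirm this is to search the saved capture for a value you can see in the browser. Here is a rough Python sketch using only the standard library; the class and function names are made up for illustration, and it ignores anything inside script or style tags so that a number buried in JavaScript doesn’t count as a hit.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def value_in_static_html(html, needle):
    """True if `needle` appears in the text of the HTML as delivered."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return needle in " ".join(collector.chunks)
```

Feed it the contents of the capture file (page_soup.txt in the log above) and the reading shown in the browser. If it returns False, the value is being filled in by a later request and no selector will find it.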
I was specifically looking at attributes, which also appear to be strings… I can work around it, but I’m just a tad confused as to how to make an actual attribute which is an array of “stuff”.
There are a few ways. You can store an actual array in an attribute (but not the main state), or you could serialise the data as a string yourself and put it as the main state (making it a CSV list, for example, or even as JSON), but you then need to parse it every time you use it.
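To illustrate the serialising option, here is the round trip in plain Python with made-up readings. In a Home Assistant template the parsing side would be split(',') with an int cast for CSV, or the from_json filter for JSON. Keep in mind that a state is limited to 255 characters, so this only works for short lists.

```python
import json

readings = [12, 8, 15, 21]  # made-up values

# Option A: CSV string as the state; every consumer splits and casts.
csv_state = ",".join(str(r) for r in readings)
from_csv = [int(part) for part in csv_state.split(",")]

# Option B: JSON string as the state; the types survive the round trip.
json_state = json.dumps(readings)
from_json = json.loads(json_state)
```

JSON is the safer choice if the items could ever contain commas or be something other than plain integers.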
Hi there, I’ve just started with multiscrape and I have the following being returned:
Tag selected: <span class="jsx-3024714417 figure">ESE<!-- --> <!-- -->0<!-- -->km/h</span>
Selector result: ESE 0km/h
Final selector value: ESE 0km/h
Selected: ESE 0km/h
Could someone help me understand value_template? I would like to get just the “0” as wind_speed, and then add “ESE” as an attribute, e.g. “Wind Direction” or something. Thanks!
EDIT: I just need to work at it a little bit more really
Get wind speed:
value_template: "{{ value.split(' ')[1] | replace('km/h', '') }}"
then using attributes:
value_template: "{{ value.split(' ')[0] }}"
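For anyone who wants to check the logic outside HA, the two templates above do the same thing as this small Python sketch. parse_wind is a name I made up, and the int cast is an extra step that the templates don’t have (they return the speed as text).

```python
def parse_wind(value):
    """Split a string like 'ESE 0km/h' into (direction, speed_kmh)."""
    direction, speed_with_unit = value.split(" ")
    speed_kmh = int(speed_with_unit.replace("km/h", ""))
    return direction, speed_kmh


direction, wind_speed = parse_wind("ESE 0km/h")
```

Note that this assumes exactly one space between the direction and the speed, which matches the selector result shown above.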
select: "#__next > div.jsx-1060093929.grid-container.weather-forecast.for.tauranga > div.jsx-1060093929.grid-width-12.grid-wrapper.section-wrapper-home > div.jsx-66893679.forecasts-top-block-wrapper.grid-width-12 > div > div > div.jsx-66893679.forecast-summaries.grid-container.grid-width-12 > div.jsx-66893679.current-conditions.grid-item.grid-width-4 > div.jsx-3619355618.conditions-slider > div > div > div > div.slick-slide.slick-active.slick-current > div > div > div > div.jsx-3024714417.bottom-section.grid-item.grid-width-4.grid-container-3 > div.jsx-3024714417.details.grid-item.grid-width-2 > span:nth-child(5)"
Possibly, but you’ll need to do some more work. You only need to ensure that each part of the selector is unique at its level in the HTML document hierarchy. In other words, if an element contains only one child, say a div containing a single div, you don’t need to include classes, IDs, etc. to identify that child, since it is already uniquely identifiable.
Hi everybody,
I am trying to automate water usage reporting through the website of my local water provider (it should be done every month or every second month).
I managed to get the water meter readings into HA using wmbus and a dedicated wireless meter extension + ES+CC1101 module ( zibous/ha-watermeter ).
Now that I have the current reading as an entity value, I would like to set up reporting it on the website, which requires a logon (one form submit) and then submitting a form with the given value (second form submit).
The first issue I have is how to use an entity’s value in the multiscrape config (to put it in a form input field). I tried a few forms but can’t find a valid one (if it is possible at all?); it’s this line:
The second unknown for now is whether it is possible to submit the reporting form after the initial logon form, but before trying this I would like to solve the first issue…