I’m trying to use the multiscrape custom integration to pull in weather data from my weather station that hosts a local webpage with the info I want (no login or anything). Seems to be about as basic of a webpage as you can get, yet I’m not having any luck. Here’s a screenshot of the page with a value I’m trying to grab highlighted:
When I copy selector for that value, I get:
body > table > tbody > tr:nth-child(16) > td:nth-child(2) > input
And here’s my resulting YAML:
multiscrape:
  - name: HA scraper
    resource: http://192.168.1.81/livedata.html
    scan_interval: 30
    sensor:
      - unique_id: multiscrape_test_temperature
        name: Multiscrape Test Temperature
        select: "body > table > tbody > tr:nth-child(16) > td:nth-child(2) > input"
I’ve tried with the above, as well as removing ‘tbody’ and ultimately even ‘body’, ‘table’ and ‘input’ with no luck; the sensor is always ‘unavailable’ (I will mention that with one or more of the many variations I tried, it would periodically NOT be unavailable, but also didn’t have a state). Is there something super-obvious I’m missing (I’m sure there is)? Obviously I can’t actually share the link since it’s a local webpage, but if more info is needed just ask.
I don’t have advice on the web scrape but there is an ambient weather integration.
Yeah, I used that years ago but it’s both cloud-based and also obviously requires that you’re sending your station data to them, which I’m not. I’ve come across other potential local options in the past, but they’re all for newer/different weather stations than what I have and thus don’t work.
Unfortunately my weather station isn’t supported, it’s too old.
WS-0900-IP
I think that was one of (or THE) other local option(s) that I checked out in the past.
Tried using the free ChatGPT?
I tried to use it to get a scrape of my monthly internet data usage from my router, but apparently there’s an issue with the way the selector is formatted that prevented it from working. Still, I was impressed that, after I pasted the page source and clearly explained what I was trying to solve, ChatGPT came up with YAML that I believe would have worked. I also used it to code the sensors that get my monthly data usage over SNMP, and if you give it any errors that are generated, it searches for the reason and offers corrected code.
From the multiscrape integration web page:
If you don’t manage to scrape the value you are looking for, please enable debug logging and log_response. This will provide you with a lot of information for continued investigation. log_response will write all responses to files. If the value you want to scrape is not in the files with the output from BeautifulSoup (*-soup.txt), Multiscrape will not be able to scrape it. Most likely it is retrieved in the background by javascript. Your best chance in this case, is to investigate the network traffic in the developer tools of your browser, and try to find a json response containing the value you are looking for.
Sponsor me here, and I’ll try to assist you with your multiscrape configuration within 1-2 days. The support funds will go towards family time, making up for the hours I spend on Home Assistant.
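For reference, enabling debug logging and log_response might look roughly like the following. This is a sketch only: the `logger` block is the standard Home Assistant logging configuration, `custom_components.multiscrape` is assumed to be the logger name for the integration, and the multiscrape entry is the one from the first post with `log_response` added.

```yaml
# Sketch: turn on debug logging for multiscrape (logger name is an assumption).
logger:
  default: warning
  logs:
    custom_components.multiscrape: debug

multiscrape:
  - name: HA scraper
    resource: http://192.168.1.81/livedata.html
    scan_interval: 30
    # Write each response to files, including the *-soup.txt output
    # that the selector is actually matched against.
    log_response: true
    sensor:
      - unique_id: multiscrape_test_temperature
        name: Multiscrape Test Temperature
        select: "body > table > tbody > tr:nth-child(16) > td:nth-child(2) > input"
```

Once the value has been located in the logged files, log_response can be switched off again so that a response is not written to disk every 30 seconds.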
Good point, I had planned to do that but didn’t get a chance to yesterday.
I get the following error: Unable to scrape data: Could not find a tag for given selector
The log file includes the data I’m after, which would seem to confirm that it should be possible.
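Since the data is in the logged response but the selector finds no tag, two things may be worth checking. They are sketched below under assumptions, not as a confirmed fix. First, browsers often add `tbody` to a copied selector even when the raw HTML has none, so the sketch drops it. Second, the selector ends in `input`, and an input element carries its reading in the `value` attribute instead of as text, so the sketch assumes multiscrape's `attribute` option can read it. The row and column indices are carried over unchanged from the first post and may need adjusting against the *-soup.txt file.

```yaml
# Sketch only: untested against this weather station's page.
multiscrape:
  - name: HA scraper
    resource: http://192.168.1.81/livedata.html
    scan_interval: 30
    sensor:
      - unique_id: multiscrape_test_temperature
        name: Multiscrape Test Temperature
        # Assumption: the raw HTML has no <tbody>; the browser added it.
        select: "table > tr:nth-child(16) > td:nth-child(2) > input"
        # Assumption: the reading is stored in the input's value attribute.
        attribute: value
```

If the `input` tags in the soup file have a `name` or `id`, selecting on that instead of `nth-child` positions would probably be less fragile.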