I posted this a while ago: I have tried to use the select option, but in this case I have no clue which word to look for, as the word I am trying to use appears several times. Maybe somebody could help me get these values out for use in HA. I need the three figures marked in yellow in the screenshot.
The weblink for the site is http://77.119.243.51:86
I am afraid it will not be very easy, as there are no distinctive selectors to target. You could select all td elements and pick one by index, as in the example from the scrape integration:
# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: http://77.119.243.51:86
    name: Temperature Cellar
    select: "td"
    index: 60  # experiment with this number to find the correct value
    unit_of_measurement: "C"
Once you find the correct TD element it should be easy to clone this code and get the next value.
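To find the right index without restarting HA for every guess, you can list the cells offline. Below is a minimal Python sketch using only the standard library. SAMPLE_HTML is a made-up stand-in for the real page (its markup is not shown in this thread), and CellCollector and list_cells are hypothetical helper names. For simple tables without nesting, the printed positions should correspond to the index option, but treat that as an assumption and verify it against the sensor.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the real page; save the actual page source
# to a file and read that instead.
SAMPLE_HTML = """
<table>
  <tr><td>Temperature</td><td>21,5 °C</td></tr>
  <tr><td>Humidity</td><td>48,0 %</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> in document order (no nested tables)."""

    def __init__(self):
        super().__init__()
        self.cells = []        # finished cell texts
        self._in_cell = False  # True while inside a <td>
        self._parts = []       # text fragments of the current cell

    def _flush(self):
        # Close the current cell, if any, and store its normalized text.
        if self._in_cell:
            self.cells.append(" ".join("".join(self._parts).split()))
            self._in_cell = False
            self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._flush()  # tolerate a missing </td>
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "tr", "table"):
            self._flush()

    def handle_data(self, data):
        if self._in_cell:
            self._parts.append(data)


def list_cells(html):
    """Return the text of all <td> cells found in the given HTML string."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    parser._flush()  # store a cell left open at the end of the input
    return parser.cells


# Print each cell with its position so the wanted value can be spotted.
for index, text in enumerate(list_cells(SAMPLE_HTML)):
    print(index, repr(text))
```

Once the position of one value is known, the same sensor block can be copied with only name and index changed for the next value.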
Hi Alekxsy,
Thanks for your answer. I am not sure what you mean by cloning the code to get the next value. I am not an expert in this topic, so I fear it will be too complicated for me to get working.
So far I have added the example you gave me to my config, I have restarted HA but don’t see a sensor/entity called Temperature Cellar yet, so maybe I am doing something wrong.
In theory this should work, but I am getting an error in the logs… something to do with headers. Possibly related to the port being different…
sensor:
  - platform: scrape
    resource: "http://77.119.243.51:86"
    name: Temperature Cellar
    select: "td"
    index: 7  # experiment with this number to find the correct value
    value_template: '{{ ((value.split(" ")[0]) | replace (",", ".")) }}'
    unit_of_measurement: "°C"
  - platform: scrape
    resource: "http://77.119.243.51:86"
    name: Humidity Cellar
    select: "td"
    index: 8  # experiment with this number to find the correct value
    value_template: '{{ ((value.split(" ")[0]) | replace (",", ".")) }}'
    unit_of_measurement: "%"
  - platform: scrape
    resource: "http://77.119.243.51:86"
    name: Air Pressure Cellar
    select: "td"
    index: 9  # experiment with this number to find the correct value
    value_template: '{{ ((value.split(" ")[0]) | replace (",", ".")) }}'
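For anyone wondering what the value_template does: it keeps the part of the cell text before the first space and swaps the decimal comma for a dot. Here is a rough Python equivalent, assuming a cell that looks like "21,5 °C" (the real cell text is not shown in this thread, and both function names are made up for illustration). It also shows one pitfall: if the text starts with a space, splitting on a single space gives an empty first token.

```python
def clean_reading(value):
    """Mirror of value.split(" ")[0] | replace(",", ".") from the template."""
    return value.split(" ")[0].replace(",", ".")


def clean_reading_tolerant(value):
    """Variant that ignores leading or repeated whitespace."""
    tokens = value.split()  # no argument: split on any run of whitespace
    return tokens[0].replace(",", ".") if tokens else ""


print(clean_reading("21,5 °C"))            # assumed cell text
print(clean_reading(" 21,5 °C"))           # leading space: empty first token
print(clean_reading_tolerant(" 21,5 °C"))  # tolerant variant still finds the number
```

If the sensor state comes out empty, using value.split() without an argument in the template may be worth a try, since it is the same Python string method.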
Hi and thanks for your response!
I have added your config to my setup, but I get the following error after restarting HA:
Logger: homeassistant.components.rest.data
Source: components/rest/data.py:69
Integration: RESTful (documentation, issues)
First occurred: 20:24:39 (12 occurrences)
Last logged: 20:28:14
Error fetching data: http://77.119.243.51:86 failed with illegal chunk header: bytearray(b'F9 \r\n')
Error fetching data: http://77.119.243.51:86 failed with
Maybe you know what that means… I have no clue, to be honest.
If someone could help that would be great. Thank you!!
Trying to get the current outdoor temp on this site. I am unable to get this to work.
Website is ambientweather.net/dashboard/kbck
Here is the select:
#root > div > div.page-container > div > div > div > div.device-device-realtime-dashboard > div > div > div.device-widget.square.temp > div.device-temp-widget.center-aligned > div > div.top > span > span.fdp-val
Not sure how to fix. Any ideas?
Looks like the data is loaded via JavaScript, see the page source:
<noscript>Sorry, Javascript must be enabled to use the Ambient Weather Dashboard.</noscript>
That means the data is not in the HTML content when the page is first loaded; it is fetched afterwards via JavaScript, so you can’t scrape it.
May I suggest either using one of the multiple weather integrations already available or creating your own, since Ambient Weather has an API.
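A quick way to check this for any page is to download the HTML as served (no browser, no JavaScript) and search it for a class name from your selector. A small Python sketch, with assumptions: SAMPLE_STATIC_HTML is a shortened, hypothetical stand-in for the real response (only the noscript line is taken from the page source), and both function names are made up.

```python
import urllib.request


def fetch_static_html(url, timeout=10):
    """Download the page as served, without running any JavaScript."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def present_in_static_html(html, marker):
    """Return True if the marker (e.g. a class name) occurs in the raw HTML."""
    return marker in html


# Hypothetical, shortened stand-in for what the server returns.
SAMPLE_STATIC_HTML = """<html><body>
<noscript>Sorry, Javascript must be enabled to use the Ambient Weather Dashboard.</noscript>
<div id="root"></div>
</body></html>"""

# "fdp-val" is the last class in the selector from the post above.
print(present_in_static_html(SAMPLE_STATIC_HTML, "fdp-val"))
```

If the class name is missing from the downloaded HTML but visible in the browser inspector, the element was most likely created by JavaScript after loading, and a plain scraper will not see it.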
Thank you. I didn’t notice that. Dang.
Hi all, I need some help with my scraping.
I’m trying to extract the values of my solar panel controller web page. It used to work but I updated multiscrape and now it doesn’t support the property index anymore.
My web page looks like this:
<html>
<head>
<meta http-equiv=pragma content=no-cache>
<meta http-equiv=expire content=now>
<title></title>
</head>
<body bgcolor=ffffff text=black><br><br>
<table align=center border=1 cellpadding=0 cellspacing=0 bordercolor=#008000 bordercolorlight=#ffffff borderdark=#808000 width=1024>
<center>
<tr bgcolor=#43CD80>
<td align=center>Inverter ID</td>
<td align=center>Current Power</td>
<td align=center>Grid Frequency</td>
<td align=center>Grid Voltage</td>
<td align=center>Temperature</td>
<td align=center>Date</td>
</tr>
</center>
<center>
<tr>
<td align=center>404000066234-A</td>
<td align=center> 62 W</td>
<td align=center> 50.0 Hz</td>
<td align=center> 233 V</td>
<td align=center> 23 <sup>o</sup>C</td>
<td align=center> 2022-06-07 11:38:22</td>
</tr>
.....
</table><br><br>
<hr></hr><center><tr><td>©2013 Altenergy Power System Inc.</td></tr></center>
</body>
</html>
I’m trying to extract the values in rows after the header. I tried body > table > tr:nth-child(2) > td:nth-child(1)
and also table > center > tr:nth-child(2) > td:nth-child(5)
but I keep getting errors like:
2022-06-10 12:54:34 DEBUG (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_0 # Panel 1 Name # Start scraping to update sensor
2022-06-10 12:54:34 DEBUG (MainThread) [custom_components.multiscrape.scraper] Scraper_noname_0 # Panel 1 Name # Tag selected: None
2022-06-10 12:54:34 ERROR (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_0 # Panel 1 Name # Unable to scrape data: Could not find a tag for given selector.
Consider using debug logging and log_response for further investigation.
2022-06-10 12:54:34 DEBUG (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_0 # Panel 1 Name # On-error, set value to None
2022-06-10 12:54:34 DEBUG (MainThread) [custom_components.multiscrape.entity] Scraper_noname_0 # Panel 1 Name # Updated sensor and attributes, now adding to HA
Any idea what I’m doing wrong?
Thanks a lot, appreciate any help.
Could you enable file logging and post the HTML page as logged?
I’m using multiscrape to try to get my Logitech mouse battery level. The app I’m using provides this XML file:
<xml>
<device_id>devxxxxxxxxx</device_id>
<device_name>PRO X Wireless</device_name>
<device_type>Mouse</device_type>
<battery_voltage>-0,00</battery_voltage>
<battery_percent>100,00</battery_percent>
<charging>False</charging>
</xml>
My current code is this: (I used the Chrome inspect copy selector to get the selector)
multiscrape:
  - resource: http://mypcIP:12321/device/dev4f7137224093c0940000
    name: PRO X Wireless
    scan_interval: 60
    sensor:
      - name: PRO X Wireless Battery level
        unique_id: pro_x_wireless_battery_level
        icon: mdi:mouse-bluetooth
        select: "folder0 > div.opened > div:nth-child(5) > span:nth-child(2)"
    log_response: true
I’m getting the error PRO X Wireless # PRO X Wireless Battery level # Unable to scrape data: Could not find a tag for given selector
Any idea what I’m doing wrong?
I can’t test it but just try:
select: battery_percent
That worked, thanks so much!
Is there a reason why the copy selector method didn’t work? I used that before and I had good results.
Because it’s XML instead of HTML.
Good to hear it works!
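To expand on that a bit: the selector copied from Chrome most likely describes the markup of the browser's own XML viewer (names like folder0 and div.opened), not the file itself, where the tag names are the only things to select on. The sketch below parses the XML from the post with Python's standard library to show this. The function name is made up, and the comma-to-dot conversion is an assumption about how you would want the number.

```python
import xml.etree.ElementTree as ET

# The XML from the post above, unchanged.
DEVICE_XML = """<xml>
<device_id>devxxxxxxxxx</device_id>
<device_name>PRO X Wireless</device_name>
<device_type>Mouse</device_type>
<battery_voltage>-0,00</battery_voltage>
<battery_percent>100,00</battery_percent>
<charging>False</charging>
</xml>"""


def battery_percent(xml_text):
    """Return the battery level as a float, reading the tag by its name."""
    root = ET.fromstring(xml_text)
    raw = root.findtext("battery_percent")
    if raw is None:
        raise ValueError("no <battery_percent> element found")
    # The file uses a decimal comma; float() needs a dot.
    return float(raw.replace(",", "."))


print(battery_percent(DEVICE_XML))
```

Note that the raw value is "100,00" with a decimal comma, so a value_template that replaces the comma with a dot may be needed if the sensor should be numeric.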
I have one more question, the scraper works perfectly, but it generates a lot of errors when my PC is off. Which makes sense of course, but is there a way to suppress errors for one “scrape” without just suppressing all Multiscrape errors?
Hello, I’ve been trying to retrieve the status of urbackup for hours. The URL is local and can be scraped, but so far I have only succeeded with the selector div > div:nth-child(1), but here I only get the alt-text for the top right button (“Toggle navigation UrBackup”).
If I copy the selector via Chrome or Firefox, I get for example #status_table > tbody > tr.even > td:nth-child(5) or #status_table_wrapper > div.dataTables_scroll > div.dataTables_scrollHead. And very reliably the message ‘unavailable’. Also all attempts with #body, #tbody, #root or other selectors after div lead to the message ‘unavailable’.
Does anyone have a tip how I could somehow get into the table?
When I look in the log, I find the following messages:
- “Unable to scrape data: Could not find a tag for given selector Consider using debug logging and log_response for further investigation.”
- “homeassistant.exceptions.InvalidStateError: Invalid state encountered for entity ID: sensor.urbackup_server. State max length is 255 characters.”
This is strange, because the ID has a normal length:
- resource: http://192.168.xx.xx:55414
  scan_interval: 3600
  sensor:
    - unique_id: urbackup-server
      name: Urbackup RP4
      select: '.div.dataTables_scrollBody > table > tbody > tr:first-child'
It seems to be generated by JavaScript, so I solved it with Python, using uroni/urbackup-server-python-web-api-wrapper on GitHub (a Python wrapper to access and control an UrBackup server).
Hi,
Can anyone help me?
I am trying to get the EPEX DAM value from this site: Paramètres d'indexation d'électricité | ENGIE
but I cannot get the value. I keep getting this error: Index ‘0’ not found in sensor.epex_dam