Scrape a text without CSS

I have to scrape a web page that has no CSS structure at all.
Here is an example:

master#0.53 kW#33.8#406.8#2752.2#114551.5#0#0#1;7;0.28;16.5;56678.6;132;1#2;7;0.25;17.3;57871.2;2097185;3#x#

I need to scrape the “0.53” value (it could also be 15.32, for example; up to 5 characters in total).

How can I use “master#” as the pattern in the select item?

Regards,
Graziano.

You mean HTML not CSS.

If that’s a text response, use a rest sensor. Is it always on the first line?

I’m sorry. The page seems to be a well-formed HTML page…

<html>
<head></head>
<body>
master#10.08kW#34.2#407.2#2752.6#114551.9#0#0#1;7;0.04;16.6;56678.8;4;1#2;7;0.05;17.5;57871.4;2097153;3#x#
</body>
</html>

I need to extract the 10.08 text.

Regards,
Graziano.

A simple regex should work if the string you need always starts with ‘master#’ and ends with ‘kW’. I created this scrape sensor with your example HTML as the input:

And I get 10.08 as the sensor value:

Here’s the important bit of the sensor:
{{value | regex_findall_index('master#(.*)kW')}}
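For anyone who wants to try the pattern outside Home Assistant first, here is a minimal Python sketch. It assumes `regex_findall_index` behaves like Python's `re.findall` followed by taking index 0; the helper name `first_match` and the stricter `TIGHT` pattern are illustrative and not part of the thread. It also shows that, with the first sample ("0.53 kW", with a space), the greedy `(.*)` keeps the trailing space.

```python
import re

# Sample strings copied from the posts above.
TEXT_SAMPLE = (
    "master#0.53 kW#33.8#406.8#2752.2#114551.5#0#0#"
    "1;7;0.28;16.5;56678.6;132;1#2;7;0.25;17.3;57871.2;2097185;3#x#"
)
HTML_SAMPLE = """<html>
<head></head>
<body>
master#10.08kW#34.2#407.2#2752.6#114551.9#0#0#1;7;0.04;16.6;56678.8;4;1#2;7;0.05;17.5;57871.4;2097153;3#x#
</body>
</html>"""

GREEDY = r"master#(.*)kW"            # pattern used in the sensor template
TIGHT = r"master#\s*([0-9.]+)\s*kW"  # hypothetical stricter variant: digits and dots only


def first_match(pattern, text):
    """Assumed equivalent of regex_findall_index(pattern) with index 0."""
    return re.findall(pattern, text)[0]


for name, sample in (("text", TEXT_SAMPLE), ("html", HTML_SAMPLE)):
    # repr() makes a trailing space in the captured value visible.
    print(name, repr(first_match(GREEDY, sample)), repr(first_match(TIGHT, sample)))
```

If the value in your file has a space before "kW", the stricter pattern (or a `| trim` in the template) avoids ending up with a state like "0.53 ".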


Once you have that working, go and slap whoever decided that octothorpe-delimited multi-line text in an HTML body was an appropriate way to share data.

Ok Gonzotek,
this morning I figured out the situation.

I have to read a simple txt file containing the string I showed above.
But for some reason HA fails to authenticate: rest returns <DOCTYPE… (only the first 12 characters of the response) and scrape returns the content, Unauthorized.

I still have to find out why I cannot authenticate…

Regards,
Graziano.

Ok, guys. I did it…


I had to use digest auth, convert . to , and do some other stuff… but I did it.
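If you want to confirm from a PC that the device really expects digest authentication before touching the HA config, a rough Python sketch using only the standard library could look like this. The URL, user name and password are placeholders, and `make_digest_opener` is just an illustrative helper name.

```python
import urllib.request


def make_digest_opener(url, username, password):
    """Build a urllib opener that answers HTTP digest auth challenges for url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None as the realm means "use these credentials for any realm at this URL".
    password_mgr.add_password(None, url, username, password)
    handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


# Placeholder values: replace them with your own device address and credentials.
URL = "http://inverter.example/data.txt"
opener = make_digest_opener(URL, "user", "secret")

# Uncomment to actually fetch the file (needs network access to the device):
# print(opener.open(URL, timeout=10).read().decode())
```

If the fetch prints the "master#…" line here but HA still gets Unauthorized, the likely difference is the `authentication:` setting (basic vs. digest) rather than the credentials.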

Tip: I ran into a problem with the hash character, and I solved it by swapping " and ’ (single quotes around the whole template, double quotes inside it) so that I could put the escape character before #.

value_template: '{{ value | regex_findall_index("master\#(.*)kW") | replace (".",",") }}'

Thank you very much.

Do you know why, if I use “unit_of_measurement: kW”, it tells me that the parameter is not valid in rest?

Regards,
Graziano.

I suspect that’s because you’re replacing the decimal point with a comma which is making it not a number, and thus conflicting with the unit_of_measurement setting. Suggest you don’t do that replacement: your locale setting should do that reformatting in your front end display.
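To illustrate the point about the comma: this is only a sketch in plain Python, not Home Assistant's actual state handling, and `as_number` is a made-up helper. It just shows that a string with a decimal comma does not parse as a number, while the same value with a decimal point does.

```python
def as_number(state):
    """Return the state as a float, or None if it does not parse as a number."""
    try:
        return float(state)
    except ValueError:
        return None


print(as_number("10.08"))  # parses as a number
print(as_number("10,08"))  # None: the decimal comma makes it a plain string
```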

No…

The system cannot restart because the configuration is not valid: Invalid config for [rest]: [unit_of_measurement] is an invalid option for [rest]. Check: rest->rest->0->unit_of_measurement.

  - authentication: digest
    resource: !secret sunwaysURL
    username: !secret sunwaysID
    password: !secret sunwaysPW
    unit_of_measurement: kwh
    scan_interval: 30
    sensor:
      - name: Sunways
        value_template: '{{ value | regex_findall_index("master\#(.*)kW") }}'

Put the unit_of_measurement in the sensor: definition, not in the top block.

  - authentication: digest
    resource: !secret sunwaysURL
    username: !secret sunwaysID
    password: !secret sunwaysPW
    scan_interval: 30
    sensor:
      - name: Sunways
        unit_of_measurement: 'kWh'
        value_template: '{{ value | regex_findall_index("master\#(.*)kW") }}'

Note “kWh” with a capital W.

Ahhhhgggrrr…

Right!!! Not in rest… but in sensor section… It works!

Thanks a lot!

Regards,
Graziano.
