Scrape Local Basic, Text-only Sensor Page

Greetings!

Apologies if this has been asked; I did try searching beforehand. The biggest difference between my scenario and other posts is that all the data is on a single page (no separate HTML tags, no different URL for each sensor on the package, no consistent delimiter between values, etc.).

I built AirGradient’s DIY air quality monitoring sensor package and used Jeff Geerling’s customized firmware to have it report the output regularly to a web page. I have Prometheus running on Docker on a different machine and it is able to grab the output from the sensor without issue, which I then turned into a Grafana dashboard (also running under Docker on that same machine).

My question is: what is the easiest way to grab the output from the sensors on a regular basis and display them on my default Lovelace dashboard? Do I somehow use the “Scrape” sensor type? Do I somehow import from Prometheus or Grafana?

Sensor URL: http://192.168.xx.xx:9926/
Sensor URL contents:

# HELP pm02 Particulate Matter PM2.5 value
# TYPE pm02 gauge
pm02{id="",mac="xx:xx:xx:xx:xx:xx"}0
# HELP rco2 CO2 value, in ppm
# TYPE rco2 gauge
rco2{id="",mac="xx:xx:xx:xx:xx:xx"}806
# HELP atmp Temperature, in degrees Celsius
# TYPE atmp gauge
atmp{id="",mac="xx:xx:xx:xx:xx:xx"}23.10
# HELP rhum Relative humidtily, in percent
# TYPE rhum gauge
rhum{id="",mac="xx:xx:xx:xx:xx:xx"}42

I’d be looking to pull:
pm02 as “PM2.5”
rco2 as “CO2 Saturation”
atmp as “Temperature”
rhum as “Relative Humidity”

Thanks in advance!

EDIT: If the best route is to use SCRAPE, I’d be looking to pull new values every 30 seconds or so, as I’m currently doing in Prometheus. Would that potentially cause performance issues on my Raspberry Pi 4?

Scrape seems like overkill for this.
I’d try to use a RESTful sensor:

sensor:
  - platform: rest
    name: pm02
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    value_template: "{{ value | regex_findall_index('pm02{[^}]*}\-?\d+\.?\d*') | regex_replace('^[^}]*}') }}"

You’ll need to create a separate rest sensor for each of the values. Alternatively, you could use a single rest sensor with no value_template and parse the data later with template sensors. That would reduce the number of HTTP requests made. I’d also recommend that you exclude the rest sensor from the recorder if you decide to use it without a value_template.
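
A rough sketch of another way to get down to a single request, assuming a Home Assistant version that supports the top-level `rest:` integration: several sensors share one resource, and each one applies its own value_template to the same response. This is a different mechanism from template sensors reading a raw rest sensor. It also avoids storing the whole page as an entity state, which is limited to 255 characters and which the page shown above would exceed. The sensor names are placeholders.

```yaml
# Sketch only: one HTTP request every 30 seconds, shared by all four sensors.
# Assumes the top-level `rest:` integration is available in your release.
rest:
  - resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    sensor:
      # Each template finds "<metric>{...}<number>" and then strips
      # everything up to and including the closing brace.
      - name: pm02
        value_template: '{{ value | regex_findall_index("pm02{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
      - name: rco2
        value_template: '{{ value | regex_findall_index("rco2{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
      - name: atmp
        value_template: '{{ value | regex_findall_index("atmp{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
      - name: rhum
        value_template: '{{ value | regex_findall_index("rhum{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
```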

Polling every 30 seconds will definitely not be a problem on a Raspberry Pi 4 with a RESTful sensor.

Thanks for the suggestion!

The regular expressions are a little over my head, but it looks like there’s a bug in this string. I’ll keep playing with it and see if I can figure it out.

It seemed to work fine in the template editor in developer tools:

{% set a = '# HELP pm02 Particulate Matter PM2.5 value
# TYPE pm02 gauge
pm02{id="",mac="xx:xx:xx:xx:xx:xx"}0
# HELP rco2 CO2 value, in ppm
# TYPE rco2 gauge
rco2{id="",mac="xx:xx:xx:xx:xx:xx"}806
# HELP atmp Temperature, in degrees Celsius
# TYPE atmp gauge
atmp{id="",mac="xx:xx:xx:xx:xx:xx"}23.10
# HELP rhum Relative humidtily, in percent
# TYPE rhum gauge
rhum{id="",mac="xx:xx:xx:xx:xx:xx"}42' %}
{{ a | regex_findall_index('pm02{[^}]*}\-?\d+\.?\d*') | regex_replace('^[^}]*}') }}

You’re right, I tried it in the template editor and it worked there for me as well. When I put it into my configuration.yaml, though, I get this error:

unknown escape sequence at line 19, column 62:
     ... egex_findall_index('pm02{[^}]*}\-?\d+\.?\d*') | regex_replace('^ ... 
                           ^

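The issue is with how the template is quoted, not with the regex itself.
The YAML parser treats backslashes inside a double-quoted string as escape sequences, and `\-` and `\d` are not valid ones. I have updated the configuration below so that the outer quotes are single quotes.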
Try this:

sensor:
  - platform: rest
    name: pm02
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    value_template: '{{ value | regex_findall_index("pm02{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
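
To illustrate the quoting difference, here is a small comparison. The two key names are made up for the example and are not Home Assistant options. Both lines should yield the same template text apart from the inner quote characters.

```yaml
# Hypothetical keys, only to compare the two YAML quoting styles.

# Single-quoted scalar: backslashes are kept literally, so the regex
# can be written exactly as it appears in the template editor.
single_quoted: '{{ value | regex_findall_index("pm02{[^}]*}\-?\d+\.?\d*") }}'

# Double-quoted scalar: a backslash starts an escape sequence, so each
# one has to be doubled to reach the template engine as a single backslash.
double_quoted: "{{ value | regex_findall_index('pm02{[^}]*}\\-?\\d+\\.?\\d*') }}"
```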

Thanks for your continued help!

That definitely got me past the error, but no sensor shows up after a reboot. I’ll keep playing with it.

EDIT: I added all four of the sensors now and two of the four do show up so I’m assuming something is wrong with parsing the other two (PM2.5 and CO2). Hmm.

Please post configuration of the sensors that do not work.
I’m not sure what you mean by the sensors not showing up. If the entities aren’t created, there should definitely be some errors in your logs. If the entities exist but they have incorrect states, then there are probably errors in our regex.

That’s the weirdest thing. The entities neither exist nor are there any errors in my logs. It’s almost as if the PM2.5 and CO2 sensors haven’t been initialized or something…

sensor:
  - platform: rest
    name: Basement PM25
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    value_template: '{{ value | regex_findall_index("rc02{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
  - platform: rest
    name: Basement CO2
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    value_template: '{{ value | regex_findall_index("rc02{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
  - platform: rest
    name: Basement Temperature
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    value_template: '{{ value | regex_findall_index("atmp{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
  - platform: rest
    name: Basement Humidity
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    value_template: '{{ value | regex_findall_index("rhum{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
#

EDIT: Looking back, I see I used ‘rco2’ twice which is a mistake, but that shouldn’t prevent either from working.

Ha, I’m a moron.

“rc02” is an amalgamation of ‘rco2’ and ‘pm02’; “rc02” doesn’t actually get reported so the template for those two sensors would have failed.

Strange that it wouldn’t flag an error or something, but after updating those names all four sensors now work.
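
On the missing error: one possible way to make a mistyped metric name easier to spot is to use the `regex_findall` filter, which returns a list, together with a capture group, and to fall back to a visible placeholder when nothing matches. This is an untested sketch. The `not found` placeholder is just an example value, and unit_of_measurement is left out here because a non-numeric state would not fit a numeric sensor.

```yaml
sensor:
  - platform: rest
    name: Basement CO2
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    # The capture group returns only the number after the closing brace,
    # so no regex_replace step is needed.
    # If the metric name is mistyped (e.g. "rc02"), the list is empty and
    # the state becomes "not found" instead of the lookup failing.
    value_template: >-
      {% set m = value | regex_findall("rco2{[^}]*}(-?\d+\.?\d*)") %}
      {{ m[0] if m else "not found" }}
```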

Thanks so much!

In case anyone else stumbles across this thread in an attempt to do similarly in the future, here is my final configuration:

sensor:
  - platform: rest
    name: Basement PM25
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    unit_of_measurement: "µg/m³"
    value_template: '{{ value | regex_findall_index("pm02{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
  - platform: rest
    name: Basement CO2
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    unit_of_measurement: "ppm"
    value_template: '{{ value | regex_findall_index("rco2{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
  - platform: rest
    name: Basement Temperature
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    unit_of_measurement: "°C"
    value_template: '{{ value | regex_findall_index("atmp{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'
  - platform: rest
    name: Basement Humidity
    resource: http://192.168.xx.xx:9926/
    scan_interval: 30
    unit_of_measurement: "%"
    value_template: '{{ value | regex_findall_index("rhum{[^}]*}\-?\d+\.?\d*") | regex_replace("^[^}]*}") }}'