Using REST command to scrape NOAA data

I was able to successfully scrape some data from NOAA.gov, but I’m having trouble with a couple of data points. I’m not sure I’m using the correct path in the value_template.

I was using this to determine the path: https://jsonpathfinder.com/

The URL: https://services.swpc.noaa.gov/products/noaa-scales.json
The Response:
{"0":{"DateStamp":"2024-09-19","TimeStamp":"22:44:00","R":{"Scale":"0","Text":"none","MinorProb":null,"MajorProb":null},"S":{"Scale":"0","Text":"none","Prob":null},"G":{"Scale":"0","Text":"none"}},"1":{"DateStamp":"2024-09-19","TimeStamp":"22:44:00","R":{"Scale":null,"Text":null,"MinorProb":"45","MajorProb":"10"},"S":{"Scale":null,"Text":null,"Prob":"10"},"G":{"Scale":"0","Text":"none"}},"2":{"DateStamp":"2024-09-20","TimeStamp":"00:00:00","R":{"Scale":null,"Text":null,"MinorProb":"45","MajorProb":"10"},"S":{"Scale":null,"Text":null,"Prob":"10"},"G":{"Scale":"0","Text":"none"}},"3":{"DateStamp":"2024-09-21","TimeStamp":"00:00:00","R":{"Scale":null,"Text":null,"MinorProb":"45","MajorProb":"10"},"S":{"Scale":null,"Text":null,"Prob":"10"},"G":{"Scale":"0","Text":"none"}},"-1":{"DateStamp":"2024-09-18","TimeStamp":"22:44:00","R":{"Scale":"0","Text":"none","MinorProb":null,"MajorProb":null},"S":{"Scale":"0","Text":"none","Prob":null},"G":{"Scale":"1","Text":"minor"}}}

The sensor:

  - platform: template
    sensors:
      solar_r_noaa:
         friendly_name: "Solar R"
         value_template: "{{ state_attr('sensor.aurora_noaa_space_weather_conditions', '0.R.Scale')}}"

Can you post what is shown in Developer Tools for sensor.aurora_noaa_space_weather_conditions?

I don’t think state_attr supports dot notation, so you’d have to get the first attribute and then move the rest to after the call like this:

state_attr('sensor.aurora_noaa_space_weather_conditions', '0').R.Scale
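
To illustrate the point about dot notation, here is a rough sketch in plain Python (not Home Assistant code). The get_attr helper is a hypothetical, simplified stand-in for state_attr that is assumed to do a single lookup by attribute name, and the payload is a trimmed copy of the response above.

```python
import json

# Trimmed copy of the noaa-scales.json response shown in the first post.
payload = json.loads(
    '{"0": {"DateStamp": "2024-09-19", "R": {"Scale": "0", "Text": "none"}}}'
)

def get_attr(attributes, name):
    """Hypothetical stand-in for state_attr: one flat lookup by name."""
    return attributes.get(name)

# The dotted string is treated as one attribute name, which does not exist.
dotted = get_attr(payload, "0.R.Scale")

# Fetch the top-level attribute first, then walk into the nested object.
nested = get_attr(payload, "0")["R"]["Scale"]

print(dotted, nested)
```

The second form mirrors the template above: the call returns the whole "0" object, and the rest of the path is applied to that result.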

I tried state_attr('sensor.aurora_noaa_space_weather_conditions', '0').R.Scale, and get this:

Your sensor.aurora_noaa_space_weather_conditions is not loading - it’s empty, so getting attributes from it is failing. Can you post your REST sensor that is retrieving it? You would expect to see YAML-ified JSON under the “friendly_name”.

  - platform: rest
    name: aurora noaa space weather conditions
    json_attributes:
      - R
      - S
      - G
    scan_interval: 60
    resource: https://services.swpc.noaa.gov/products/noaa-scales.json
    value_template: "{{ value_json.noaadata }}"

JSON:
{
  "0": {
    "DateStamp": "2024-09-21",
    "TimeStamp": "03:35:00",
    "R": {
      "Scale": "0",
      "Text": "none",
      "MinorProb": null,
      "MajorProb": null
    },
    "S": {
      "Scale": "0",
      "Text": "none",
      "Prob": null
    },
    "G": {
      "Scale": "0",
      "Text": "none"
    }
  },
}

It's the same format as:
 
  - platform: rest
    name: aurora noaa solar winds magnetic fields
    json_attributes:
      - Bt
      - Bz
      - TimeStamp
    scan_interval: 60
    resource: https://services.swpc.noaa.gov/products/summary/solar-wind-mag-field.json
    value_template: "{{ value_json.noaadata }}"

JSON:
{
  "Bt": "3",
  "Bz": "1",
  "TimeStamp": "2024-09-21 03:30:00.000"
}

The returned JSON does not have a top-level attribute called “noaadata”, so it’s not finding anything.
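
A quick way to see this is to list the top-level keys of the response. The sketch below is plain Python, not Home Assistant code, and uses a trimmed copy of the noaa-scales.json response from the first post. The names in wanted are the ones the REST sensor above asks for.

```python
import json

# Trimmed copy of the noaa-scales.json response from the first post.
response = json.loads(
    '{"0": {"R": {"Scale": "0"}, "S": {"Scale": "0"}, "G": {"Scale": "0"}},'
    ' "-1": {"R": {"Scale": "0"}, "S": {"Scale": "0"}, "G": {"Scale": "1"}}}'
)

# Keys that exist at the top level of the document.
top_level = sorted(response.keys())

# Names the REST sensor asks for at the top level.
wanted = ["noaadata", "R", "S", "G"]
missing = [key for key in wanted if key not in response]

print(top_level)
print(missing)
```

None of the requested names exist at the top level. R, S and G only exist one level down, inside "0", "1" and so on.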

You may also find it easier to use the other REST format (lives under “rest:” in your config). Your current “sensor:” version requires you to restart HA on every change (or possibly a full config reload), whereas the “rest:” version can be reloaded under “Developer Tools > YAML > Rest Entities and Notify Services”. The following will then load attribute “0”:

- resource: https://services.swpc.noaa.gov/products/noaa-scales.json
  sensor:
    - name: MB Test Rest 1
      unique_id: mb_test_rest_1
      value_template: "OK"
      json_attributes:
        - "0"

I was able to scrape more data using the above REST: method:

Now I’m moving on to another of the NOAA JSON files, where the syntax is different: this one uses arrays (“[” “]”) rather than objects (“{” “}”).

JSON: https://services.swpc.noaa.gov/products/noaa-planetary-k-index-forecast.json

- resource: https://services.swpc.noaa.gov/products/noaa-planetary-k-index-forecast.json
  sensor:
    - name: "Aurora NOAA Planetary"
      unique_id: aurora_noaa_planetary
      value_template: "OK"
      json_attributes:
        - "time_tag"
        - "kp"
        - "observed"
        - "noaa_scale"

BUT, the sensor isn’t populating:

This one kicks back a lot of entries (81 just now), and since you do not (or cannot?) specify which entry to use, there is no result in the sensor.
If you add a json_attributes_path of $.0 then you will see the values for the first entry. The question is: do you need only the first?

Missed it… it is a weird JSON: the first set contains the keys and the rest contain the values, so it is not JSON as it was meant to be used.
I am a bit confused by the postings, as you start off with one URL/response and now have another. What do you want?

EDIT: as to ‘what you want’ , please show one dataset/json and tell us what you want to get out of that … i.e. the ultimate goal is data in (?) state or (?) attribute or (?) both or (?) multiple sensors? Are you going to use it for graphs or you just need it as data?

EDIT2: for the last one, there is no way to get this loaded properly as it lacks the keys, so you would need to rebuild them, e.g. via jq.
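
To show what "rebuilding the keys" means, here is a rough sketch in plain Python (outside Home Assistant, and not the jq command itself). It assumes the first row of the array is the header row time_tag, kp, observed, noaa_scale, and it uses a few sample rows rather than the full response.

```python
import json

# Sample in the same shape as the k-index forecast file:
# the first row holds the keys, every later row holds values.
raw = json.loads("""[
  ["time_tag", "kp", "observed", "noaa_scale"],
  ["2024-09-21 09:00:00", "1.33", "observed", null],
  ["2024-09-21 12:00:00", "1.33", "observed", null],
  ["2024-09-21 15:00:00", "2.00", "estimated", null]
]""")

# Split the header row from the data rows.
header, *rows = raw

# Pair each value with its key to get one object per row.
records = [dict(zip(header, row)) for row in rows]

print(records[0])
```

Each row then becomes an object with named fields, which is the shape that attribute-based sensors and templates can work with.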

I’m attempting to scrape as much data from the NOAA.gov JSON files as pertains to aurora.

Unfortunately, the data I want is spread across different JSON files, each with a different syntax.

The current data point I want, Kp, is inside:
https://services.swpc.noaa.gov/products/noaa-planetary-k-index-forecast.json

I want the last “observed” data point of Kp, which is located in the middle of this JSON

Once I get all the data points I want, I’m building a Dashboard in HA to display these points, graph them over time, and then create an automation to determine the best visible chance for my area, also taking into account cloud cover, darkness, etc

There are no current integrations on HA that take in all the data points.

I can try to help you with the second one (I know jq a bit) but still not sure what you want out of that. Just the first set of values and just forget all the others? Would that need to be in attributes?

I want these:

    - "time_tag"
    - "kp"
    - "observed"
    - "noaa_scale"

As seen from the middle of the JSON:

 [
    "2024-09-21 12:00:00",
    "1.33",
    "observed",
    null],
  [

Try this as an example… I fail to see how to use it, as it is not my thing. It will create one sensor named something with an attribute named somekey… yours to edit.

command_line:
  - sensor:
      name: something
      scan_interval: 1500
      # jq skips the header row (.[1:]) and turns each remaining row into an object
      command: >
        echo "{\"somekey\":" $(
        curl -s 'https://services.swpc.noaa.gov/products/noaa-planetary-k-index-forecast.json' | jq '[.[1:][] | {"time_tag": .[0], "kp": .[1], "observed": .[2], "noaa_scale": .[3]}]'
        ) "}"
      # state = number of entries; the full list goes into the "somekey" attribute
      value_template: >
        {{ value_json.somekey | length }}
      json_attributes:
        - somekey

I was successful in creating a sensor that contains all of the info that was embedded in the JSON:

Now I want to create another sensor.solar_kp_noaa that pulls data from sensor.aurora_noaa_planetary_2

I want to use the data in the last “observed” section of the sensor.aurora_noaa_planetary_2:

  [
    "2024-09-21 09:00:00",
    "1.33",
    "observed",
    null],
  [
    "2024-09-21 12:00:00",
    "1.33",
    "observed",
    null],
  [
    "2024-09-21 15:00:00",
    "2.00",
    "estimated",
    null],
  [
    "2024-09-21 18:00:00",
    "1.67",
    "estimated",
    null],

which at present is the section for “2024-09-21 12:00:00”
kp = 1.33
observed

This section will change daily; I want the kp right before the first “estimated”.

This is what I have so far:

sensor:
  solar_kp_noaa:
    friendly_name: "Solar Kp"
    value_template: "{{ state_attr('sensor.aurora_noaa_planetary_2', 'noaa_data_key').kp }}"

I need to loop through noaa_data_key[x] until “observed” = “estimated”, then back up one entry and pull the data: {{ states.sensor.aurora_noaa_planetary_2.attributes.noaa_data_key[x-1].observed }}
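
Spelled out in plain Python (a sketch of the idea only, not something Home Assistant runs), the loop described above might look like this. It assumes noaa_data_key holds a list of objects with the keys time_tag, kp and observed. The function name entry_before_first_estimated is made up for illustration.

```python
# Sample entries taken from the excerpt above, already converted to objects.
entries = [
    {"time_tag": "2024-09-21 09:00:00", "kp": "1.33", "observed": "observed"},
    {"time_tag": "2024-09-21 12:00:00", "kp": "1.33", "observed": "observed"},
    {"time_tag": "2024-09-21 15:00:00", "kp": "2.00", "observed": "estimated"},
    {"time_tag": "2024-09-21 18:00:00", "kp": "1.67", "observed": "estimated"},
]

def entry_before_first_estimated(items):
    """Return the entry just before the first 'estimated' one, or None."""
    for x, item in enumerate(items):
        if item["observed"] == "estimated":
            # x == 0 means nothing was observed before the first estimate.
            return items[x - 1] if x > 0 else None
    # No estimate found: fall back to the last entry, if there is one.
    return items[-1] if items else None

latest = entry_before_first_estimated(entries)
print(latest["time_tag"], latest["kp"])
```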

{{ state_attr('sensor.aurora_noaa_planetary_2', 'noaa_data_key')| selectattr('observed','eq','observed')|list)[-1].kp }}

EDIT: you could do this via jq too, i.e. no need to have two sensors if this is the only value you are interested in

"{{ state_attr('sensor.aurora_noaa_planetary_2', 'noaa_data_key')| selectattr('observed','eq','observed')|list)[-1].kp }}"

I tried plugging this into Developer Tools > Template but get an error:
TemplateSyntaxError: unexpected ')'

I tried removing “)” and adding “(” but can’t get the error to disappear.

{{ (state_attr('sensor.aurora_noaa_planetary_2', 'noaa_data_key')| selectattr('observed','eq','observed')|list)[-1].kp }}

In short: you first build a list of only the ‘observed’ entries, then pick the last one and take its kp.
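
For anyone who wants to check the logic outside Home Assistant, the same filter-then-take-last idea can be sketched in plain Python. This is an illustration under assumptions, not the template engine itself: the entries list stands in for the noaa_data_key attribute, and last_observed_kp is a made-up helper name.

```python
# Stand-in for the noaa_data_key attribute, using values from the excerpt above.
entries = [
    {"time_tag": "2024-09-21 09:00:00", "kp": "1.33", "observed": "observed"},
    {"time_tag": "2024-09-21 12:00:00", "kp": "1.33", "observed": "observed"},
    {"time_tag": "2024-09-21 15:00:00", "kp": "2.00", "observed": "estimated"},
    {"time_tag": "2024-09-21 18:00:00", "kp": "1.67", "observed": "estimated"},
]

def last_observed_kp(items):
    """Keep only 'observed' entries, then read kp from the last one."""
    observed = [item for item in items if item.get("observed") == "observed"]
    # Return None instead of raising when nothing has been observed yet.
    return observed[-1]["kp"] if observed else None

kp = last_observed_kp(entries)
print(kp)
```

One difference from the template: this sketch returns None when there are no ‘observed’ entries, whereas indexing an empty list with [-1] would fail.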

Works perfectly now, it’s exactly what I was trying to achieve.

Thanks for all of your help @vingerha and @michaelblight