Processing a large JSON file

Hi All,

I have a very large JSON file, which is the output of an API call that fetches the firewall rules from my firewall. I need to process the file, find a specific value in a description field, and then store the corresponding rule ID from the entry that matches. I have been experimenting with the python_scripts and pyscript modules, but there seem to be limitations with certain functions such as get.
Any suggestions on how I can approach this would be appreciated.

Here is a sample of the JSON file:

{
  "code": 200,
  "status": "ok",
  "response_id": "SUCCESS",
  "message": "",
  "data": [
    {
      "id": 0,
      "type": "reject",
      "interface": ["lan"],
      "ipprotocol": "inet",
      "protocol": null,
      "icmptype": null,
      "source": "any",
      "source_port": null,
      "destination": "pfB_PRI1_v4",
      "destination_port": null,
      "descr": "pfB_PRI1_v4",
      "disabled": false,
      "log": true,
      "statetype": "keep state",
      "tcp_flags_any": false,
      "tcp_flags_out_of": null,
      "tcp_flags_set": null,
      "gateway": null,
      "sched": null,
      "dnpipe": null,
      "pdnpipe": null,
      "defaultqueue": null,
      "ackqueue": null,
      "floating": false,
      "quick": null,
      "direction": null,
      "tracker": 1770009341,
      "created_time": 1722967240,
      "created_by": "Auto",
      "updated_time": 1723300215,
      "updated_by": null
    },
    // More rules here...
  ]
}

If you can use curl to reach the API, then you can probably pipe the output through jq for this. From what you write, I guess descr is the key you are looking for, but I cannot find any rule ID, so I am not sure it would work.
EDIT:
An example that I use myself.
I receive the JSON output via curl and clean it up a bit. Then, with jq and two select statements, I filter out only the sections I need and construct a new JSON from them that is easier to manage in HA.
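For readers who would rather see the same filter-then-rebuild idea outside jq, here is a minimal Python sketch. The two conditions (skip disabled rules, skip floating rules) and the list of keys to keep are assumptions picked from the sample above for illustration; they are not the filters from my own setup, and slim_rules is a hypothetical helper name.

```python
import json


def slim_rules(payload, wanted_keys=("id", "descr", "type", "interface")):
    """Return a smaller list of rules: two filters, then keep only a few keys.

    The filters and wanted_keys are illustrative assumptions, not a fixed recipe.
    """
    slimmed = []
    for rule in payload.get("data", []):
        # First filter (assumed): ignore rules that are disabled.
        if rule.get("disabled"):
            continue
        # Second filter (assumed): ignore floating rules.
        if rule.get("floating"):
            continue
        # Rebuild each entry with only the keys that are needed downstream.
        slimmed.append({key: rule.get(key) for key in wanted_keys})
    return slimmed


# Small made-up payload shaped like the sample above (second rule is invented).
raw = """
{"code": 200, "data": [
  {"id": 0, "type": "reject", "interface": ["lan"], "descr": "pfB_PRI1_v4",
   "disabled": false, "floating": false, "log": true},
  {"id": 1, "type": "pass", "interface": ["wan"], "descr": "old rule",
   "disabled": true, "floating": false, "log": false}
]}
"""

slim = slim_rules(json.loads(raw))
print(json.dumps(slim, indent=2))
```

The result is a much shorter JSON document, which is the point of the exercise: the smaller it is, the easier it is to handle in HA.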

I can try to help if you can:

  • confirm you have a way to pipe the JSON response through jq
  • provide a more precise explanation of the data you are searching for and of the ?? rule-id ??

Thanks for the guidance! I decided to drop the Python approach and went with jq instead, and it worked out perfectly. Here’s the final setup:

sensor:
  name: Letsencrypt Firewall Rule ID
  # Print the id of the rule whose descr matches exactly
  command: >
    jq -r '.data[] | select(.descr == "NAT Allow port 80 for letsencrypt") | .id'
    /config/pfsense_firewall_rules.json
  scan_interval: 86400

I’m now using a separate API call to fetch the data and store it in a JSON file. This way, I avoid the “data too large” error that occurs when trying to handle everything directly in the sensor. The process is streamlined and works smoothly!
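In case anyone lands here and wants to stay with Python, this is roughly what the same lookup looks like as a plain Python script. It is only a sketch: find_rule_id and find_rule_id_in_file are hypothetical helper names, the id 7 in the test data is made up, and it assumes a regular Python interpreter where imports are allowed, which as far as I can tell is not the case inside the python_script integration.

```python
import json


def find_rule_id(payload, description):
    """Return the id of the first rule whose descr equals description, else None."""
    for rule in payload.get("data", []):
        if rule.get("descr") == description:
            return rule.get("id")
    return None


def find_rule_id_in_file(path, description):
    """Load the saved API response from path and look up the rule id."""
    with open(path, encoding="utf-8") as handle:
        return find_rule_id(json.load(handle), description)


# Made-up data in the same shape as the API response (the ids are invented).
sample = {
    "data": [
        {"id": 0, "descr": "pfB_PRI1_v4"},
        {"id": 7, "descr": "NAT Allow port 80 for letsencrypt"},
    ]
}

rule_id = find_rule_id(sample, "NAT Allow port 80 for letsencrypt")
print(rule_id)
```

With the real file it would be called as find_rule_id_in_file("/config/pfsense_firewall_rules.json", "NAT Allow port 80 for letsencrypt"). One difference from the jq filter: jq prints the id of every matching rule, while this sketch stops at the first match.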
