Find and extract data from JSON

Okay… trying to learn how to do this with regex… and failing miserably! The following regex expression tested in regex101 seems to get me what I want (using a test date of 10/10/2022):
/(10\/10\/2022","ZONE":").*?("},)/gm

However, trying to use this in a template is misconfigured:

  - platform: template
    sensors:
      collection_current_week_zone:
        friendly_name: "Current zone"
        value_template: >-
          {{ states('sensor.collection_week') | regex_findall("/(10\/10\/2022","ZONE":").*?("},)/gm"}}

For reference, my config error is:

Invalid config for [sensor.template]: invalid template (TemplateSyntaxError: expected token ',', got ':') for dictionary value @ data['sensors']['collection_current_week_zone']['value_template']. Got '{{ states(\'sensor.collection_week\') | regex_findall("/(10\\/10\\/2022","ZONE":").*?("},)/gm"}}'. (See ?, line ?).

I am no regex person so not able to help but… I seem to have read a similar request recently and the answer then seems to apply here… if it is all on the same day then why not use the garbage collector plugin?

I wish it were that easy… the main reason is that while our rubbish collection is always the same day (though this could change), the zone information I’m trying to extract will tell me if it is either ‘recycling day’ or ‘green waste day’ depending on whether zone for the current week (from the JSON data) matching my zone (i.e. I’m in ZONE 2 - if it’s a ZONE 2 week, it’s recycling week for me (I think - need to confirm once I have it working), if it’s a ZONE 1 week, it’s green waste week for me).

Slight progress… config now valid… but not returning what I want. Updated config:

- platform: template
   sensors:
     collection_current_week_zone:
       friendly_name: "Current zone"
       value_template: >-
         {{ states('sensor.collection_week') | regex_findall(find='/(10\/10\/2022","ZONE":").*?("},)/gm', ignorecase=False) }}

This returns []. I’d like it to return ZONE 1.

For any regex experts out there, next step is to replace this part of the regex (10\/10\/2022","ZONE":") with something like states('sensor.collection_matching_string') (current value of which is 6/06/2022 ","ZONE":").

I looked at the output that your example above provides… it seems to just alternate between Z1 and Z2, hence regular (from where I am standing).
Alternatives are to import sensors for the next X weeks, so your X sensors would always shave the outlook for X weeks … but it seems the output starts in 2021 so from a REST pov difficult to find out where the future starts
Q: I donot know the source but can you possibly get a link with output to start on a specific date (i.e. later than “today”) ??

This link I used: https://www.data.brisbane.qld.gov.au/data/api/3/action/datastore_search?resource_id=c6dbb0b3-1e00-4bb8-8776-aa1b8f1ecfaa

Another option could be to use REST and a for-loop where you would extract only values if the date matches ‘something’… you could then create (say) 4 sensors when first looks at next week, second the week after, third again a week…etc.
EDIT: with the disclaimer that below is WRONG template (I am not an expert)…this is the direction I was thinking for the rest value_template

      {% for x in range(value_json) %}
        {% if  value_json["result"]["records"][x]["WEEK_STARTING"] < (now() + 7 days) %}
			{{ value_json["result"]["records"][x]["ZONE"] }}
		{% endif %}
      {% endfor %}

It should put the ZONE1 or ZONE2 in the sensor state, you could probably also try to add things to attributes…
Again… not the exact solution but a direction

There are some other posts:
Looping through an attribute derived from a JSON using Restful sensor - Configuration - Home Assistant Community (home-assistant.io)

For loop within json table - Configuration - Home Assistant Community (home-assistant.io)

Did some further research and my comment on also being able to add attributes via a for-loop seems not to work. The one thing I could achieve is to have the state of the sensor show ZONE X and a date but not sure if that would help you
If you want to play around as well… here is where I am ‘at’ just trying things out
As one can reload REST without restarting HA, quite ‘easy’ to check if it works or not

rest:
  - authentication: basic
    scan_interval: 3600
    verify_ssl: false
    headers:
      Content-Type: application/json
      User-Agent: Home Assistant
      Accept-Encoding: identity
    resource: "https://www.data.brisbane.qld.gov.au/data/api/3/action/datastore_search?resource_id=c6dbb0b3-1e00-4bb8-8776-aa1b8f1ecfaa"
    sensor:
      - name: Garbage
        value_template: >
             {% for x in (0,1,2,3,4) %}
                {% if value_json["result"]["records"][x]["WEEK_STARTING"] == "13/12/2021" %}
                    {{ value_json["result"]["records"][x]["ZONE"] }}
                {% else %}
                    {{ "NOK" }}
                {% endif %}
             {% endfor %}

This would work if it was ALWAYS a two-week cycle… and while it usually is, it sometimes gets reset (schedule change, special week added, week cycle drift over the year etc.), hence trying to get the data from “first principles”.

So, knowing a bit more now what can(not) be achieved through rest… what DO you want to have in a sensor? That may help with ideas.

Thanks @vingerha… I’ll have a play with this approach.

You can modify this and loop through using a relative date value eg based on ‘now’ or extract only dates for ZONE2 or…
But the lack of attributes does not make it very attractive (to me)…this would then mean that you’d have to add another template sensor to get data from the state of the above one(s). Repeating, try to imagine what you want in the end, this may also help others come with ideas…I am just 1 person :slight_smile:

Thanks @vingerha… what I’m hoping to be able to achieve is to be able to regularly (once a week would be fine, but I’ve set the REST sensor to pull the JSON daily at the moment) pull the list of week start dates and relevant ZONE # and extract the ZONE # for the current week. By then comparing the ZONE # with my ZONE # (extracted from a separate JSON based on an address query - this bit is already working), I can tell if the current week is a ‘green waste bin’ week or recycling bin week.

So, if you have a sensor called e.g. garbage_current_week and a ZONE# as state value, would that be enough?
You could have additional sensors too for upcoming weeks but all are hard-name-coded, e.g. garbage_week_plus1, garbage_week_plus2 etc.
As the json file also presents historical entries you would need to loop through it anyhow and compare now/today with the json date. Or do you have possibly access to a json file that starts from ‘today’ ?

I believe this is all that’s needed to extract the ZONE value for a specific WEEK_STARTING.

       value_template: >
         {% set x = value_json.result.records 
           | selectattr('WEEK_STARTING', 'eq', '10/01/2022') | list %}
         {{ x[0].ZONE if x != [] else 'unknown' }}

The template’s result is ZONE 2.

You can confirm it works by copy-pasting the following into the Template Editor:

Click to reveal example
{% set value_json = 
{
   "help":"https://www.data.brisbane.qld.gov.au/data/api/3/action/help_show?name=datastore_search",
   "success":true,
   "result":{
      "include_total":true,
      "limit":53,
      "records_format":"objects",
      "resource_id":"c6dbb0b3-1e00-4bb8-8776-aa1b8f1ecfaa",
      "total_estimation_threshold":null,
      "records":[
         {
            "_id":1,
            "WEEK_STARTING":"29/11/2021",
            "ZONE":"ZONE 2"
         },
         {
            "_id":2,
            "WEEK_STARTING":"6/12/2021",
            "ZONE":"ZONE 1"
         },
         {
            "_id":3,
            "WEEK_STARTING":"13/12/2021",
            "ZONE":"ZONE 2"
         },
         {
            "_id":4,
            "WEEK_STARTING":"20/12/2021",
            "ZONE":"ZONE 1"
         },
         {
            "_id":5,
            "WEEK_STARTING":"27/12/2021",
            "ZONE":"ZONE 2"
         },
         {
            "_id":6,
            "WEEK_STARTING":"3/01/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":7,
            "WEEK_STARTING":"10/01/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":8,
            "WEEK_STARTING":"17/01/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":9,
            "WEEK_STARTING":"24/01/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":10,
            "WEEK_STARTING":"31/01/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":11,
            "WEEK_STARTING":"7/02/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":12,
            "WEEK_STARTING":"14/02/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":13,
            "WEEK_STARTING":"21/02/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":14,
            "WEEK_STARTING":"28/02/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":15,
            "WEEK_STARTING":"7/03/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":16,
            "WEEK_STARTING":"14/03/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":17,
            "WEEK_STARTING":"21/03/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":18,
            "WEEK_STARTING":"28/03/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":19,
            "WEEK_STARTING":"4/04/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":20,
            "WEEK_STARTING":"11/04/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":21,
            "WEEK_STARTING":"18/04/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":22,
            "WEEK_STARTING":"25/04/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":23,
            "WEEK_STARTING":"2/05/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":24,
            "WEEK_STARTING":"9/05/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":25,
            "WEEK_STARTING":"16/05/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":26,
            "WEEK_STARTING":"23/05/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":27,
            "WEEK_STARTING":"30/05/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":28,
            "WEEK_STARTING":"6/06/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":29,
            "WEEK_STARTING":"13/06/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":30,
            "WEEK_STARTING":"20/06/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":31,
            "WEEK_STARTING":"27/06/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":32,
            "WEEK_STARTING":"4/07/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":33,
            "WEEK_STARTING":"11/07/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":34,
            "WEEK_STARTING":"18/07/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":35,
            "WEEK_STARTING":"25/07/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":36,
            "WEEK_STARTING":"1/08/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":37,
            "WEEK_STARTING":"8/08/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":38,
            "WEEK_STARTING":"15/08/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":39,
            "WEEK_STARTING":"22/08/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":40,
            "WEEK_STARTING":"29/08/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":41,
            "WEEK_STARTING":"5/09/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":42,
            "WEEK_STARTING":"12/09/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":43,
            "WEEK_STARTING":"19/09/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":44,
            "WEEK_STARTING":"26/09/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":45,
            "WEEK_STARTING":"3/10/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":46,
            "WEEK_STARTING":"10/10/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":47,
            "WEEK_STARTING":"17/10/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":48,
            "WEEK_STARTING":"24/10/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":49,
            "WEEK_STARTING":"31/10/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":50,
            "WEEK_STARTING":"7/11/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":51,
            "WEEK_STARTING":"14/11/2022",
            "ZONE":"ZONE 2"
         },
         {
            "_id":52,
            "WEEK_STARTING":"21/11/2022",
            "ZONE":"ZONE 1"
         },
         {
            "_id":53,
            "WEEK_STARTING":"28/11/2022",
            "ZONE":"ZONE 2"
         }
      ],
      "fields":[
         {
            "id":"_id",
            "type":"int"
         },
         {
            "id":"WEEK_STARTING",
            "type":"text"
         },
         {
            "id":"ZONE",
            "type":"text"
         }
      ],
      "_links":{
         "start":"/api/3/action/datastore_search?resource_id=c6dbb0b3-1e00-4bb8-8776-aa1b8f1ecfaa&limit=53",
         "next":"/api/3/action/datastore_search?resource_id=c6dbb0b3-1e00-4bb8-8776-aa1b8f1ecfaa&limit=53&offset=53"
      },
      "total":162,
      "total_was_estimated":false
   }
}

%}

{% set x = value_json.result.records | selectattr('WEEK_STARTING', 'eq', '10/01/2022') | list %}
{{ x[0].ZONE if x != [] else 'unknown' }}

That is a good idea but not dynamic.
As mentioned, myself being no template specialist, one can have this based on today (now) and a set of data 53 weeks long
@123 any improvement ideas here?

rest:
  - authentication: basic
    scan_interval: 3600
    verify_ssl: false
    headers:
      Content-Type: application/json
      User-Agent: Home Assistant
      Accept-Encoding: identity
    resource: "https://www.data.brisbane.qld.gov.au/data/api/3/action/datastore_search?resource_id=c6dbb0b3-1e00-4bb8-8776-aa1b8f1ecfaa"
    sensor:
      - name: GarbageThisWeek
        value_template: >
             {% for x in range(0,42) %}           
                {% set days = (( as_timestamp(strptime(value_json["result"]["records"][x]["WEEK_STARTING"], '%d/%m/%Y')) - as_timestamp(now()) ) / 86400 ) | round() %}
                    {% if days >= 0 and days < 7 %}
                        {{ value_json["result"]["records"][x]["WEEK_STARTING"]}} - {{value_json["result"]["records"][x]["ZONE"]}}
                    {% endif %}
             {% endfor %}  
      - name: GarbageNextWeek
        value_template: >
             {% for x in range(0,52) %}           
                {% set days = (( as_timestamp(strptime(value_json["result"]["records"][x]["WEEK_STARTING"], '%d/%m/%Y')) - as_timestamp(now()) ) / 86400 ) | round() %}
                    {% if days >= 7 and days < 14 %}
                        {{ value_json["result"]["records"][x]["WEEK_STARTING"]}} - {{value_json["result"]["records"][x]["ZONE"]}}
                    {% endif %}
             {% endfor %}    

Oops! You’re right; I missed that important detail. :man_facepalming:

What’s the reason for the template’s for-loop to iterate through X items of the records list, using range(0,42) and range(0,52), instead of iterating through the list’s length?

Thanks and 42 is a typo, should have been 52
But you are right as one should iterate through the list but I do not know how to do this, just noticed the list length to be fixed … without asking for the solution itself…what is the method?

If the list’s length is always the same then the technique you used is fine. However the ‘stop’ argument in range should be 53.

The linked range example requires 0 to 5 and its ‘stop’ argument for range is 6.

For this application, the list contains 53 items so the list’s index is 0 to 52 and therefore the ‘stop’ argument for range is 53.

Understood and yes…should be 53 (mistake) but… what when I do not know the number … says it once give out 50 instead of 53, would the script crash or just not continue post 50?

If will fail because the for-loop will attempt to access a list item that doesn’t exist.

You can use the count or length filter to get the number of items in a list.

value_json.result.records | count