Extracting printer data from complex XML with REST integration

Sharing this in case it’s useful to anyone who wants to see how to handle complex XML with the REST integration.

My HP printer makes a lot of data available via XML. I use this to pull ink levels more precisely than the HP printer integration reports them. Here’s a snippet of the XML with a lot of surplus sections removed:

<pudyn:ProductUsageDyn xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/" xmlns:dd2="http://www.hp.com/schemas/imaging/con/dictionaries/2008/10/10" xmlns:pudyn="http://www.hp.com/schemas/imaging/con/ledm/productusagedyn/2007/12/11" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.hp.com/schemas/imaging/con/ledm/productusagedyn/2007/12/11 ../schemas/ProductUsageDyn.xsd">
<dd:TotalImpressions PEID="5082">13647</dd:TotalImpressions>
*** loads of irrelevant stuff removed ***
*** lots more magenta stuff removed **
... and so on

As you can see, the cartridge information is available under ['pudyn:ProductUsageDyn']['pudyn:ConsumableSubunit'] under a separate ['pudyn:Consumable'] for each cartridge: magenta first (['dd:MarkerColor']), then cyan, yellow and black.

I didn’t want to assume that the cartridge data is always in this order, so I needed to read through all the pudyn:Consumable elements to find the one I wanted. Luckily, value_json['foo'] returns a list of foo elements if more than one is available.
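To make that concrete, here is a rough picture of the parsed structure, written as YAML purely for illustration. It shows only the nesting described above. All other keys are omitted, and the colour strings are an assumption about how the printer spells them.

```yaml
# Illustrative shape only, not real printer output.
"pudyn:ProductUsageDyn":
  "pudyn:ConsumableSubunit":
    "pudyn:Consumable":            # a list, because the element repeats
      - "dd:MarkerColor": Magenta  # other per-cartridge keys omitted
      - "dd:MarkerColor": Cyan
      - "dd:MarkerColor": Yellow
      - "dd:MarkerColor": Black
```

Because the repeated element arrives as a list, a template can filter it by colour rather than rely on position.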

Most of the REST examples in the documentation create a lot of attributes on a single sensor, whereas I wanted a separate sensor for each measurement while still reading the XML from the printer only once.

Here is my REST sensor configuration (pulled in via a rest: !include in configuration.yaml). It extracts page counts, ink usage (not visible in the XML above) and, for each cartridge, the level remaining and the number of cartridges the printer has used. Only cyan is shown; the other colours use the same config with the colour name replaced:

- resource:
  scan_interval: 86400
  sensor:
    - name: "Printer total page count"
      value_template: "{{ value_json['pudyn:ProductUsageDyn']['pudyn:PrinterSubunit']['dd:TotalImpressions']['#text'] }}"
      unit_of_measurement: "pages"

    - name: "Printer colour page count"
      value_template: "{{ value_json['pudyn:ProductUsageDyn']['pudyn:PrinterSubunit']['dd:ColorImpressions'] }}"
      unit_of_measurement: "pages"

    - name: "Printer monochrome page count"
      value_template: "{{ value_json['pudyn:ProductUsageDyn']['pudyn:PrinterSubunit']['dd:MonochromeImpressions'] }}"
      unit_of_measurement: "pages"

    - name: "Lifetime ink usage"
      value_template: "{{ value_json['pudyn:ProductUsageDyn']['pudyn:PrinterSubunit']['pudyn:UsageByMarkingAgent']['dd2:CumulativeMarkingAgentUsed']['dd:ValueFloat'] }}"
      unit_of_measurement: "ml"

    - name: "Cyan cartridge level"
      value_template: "{{ ((value_json['pudyn:ProductUsageDyn']['pudyn:ConsumableSubunit']['pudyn:Consumable'])|selectattr('dd:MarkerColor','eq','Cyan')|first)['dd:ConsumableRawPercentageLevelRemaining'] }}"
      unit_of_measurement: '%'

    - name: "Cyan cartridge count"
      value_template: "{{ ((value_json['pudyn:ProductUsageDyn']['pudyn:ConsumableSubunit']['pudyn:Consumable'])|selectattr('dd:MarkerColor','eq','Cyan')|first)['dd2:CumulativeConsumableCount'] }}"
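One edge case worth noting: a repeated XML element is parsed into a list, but a lone element comes through as a single mapping, which the selectattr filter would not handle as intended. I don't expect that on a four-cartridge printer, but a more defensive variant of the cyan level sensor might look like the sketch below. It is untested against other printer models, and the variable names c and items are just illustrative.

```yaml
# Sketch of a defensive variant; goes alongside the other sensor entries.
- name: "Cyan cartridge level"
  value_template: >-
    {% set c = value_json['pudyn:ProductUsageDyn']['pudyn:ConsumableSubunit']['pudyn:Consumable'] %}
    {# A lone Consumable arrives as a mapping, so wrap it in a list #}
    {% set items = [c] if c is mapping else c %}
    {{ (items | selectattr('dd:MarkerColor', 'eq', 'Cyan') | first)['dd:ConsumableRawPercentageLevelRemaining'] }}
  unit_of_measurement: '%'
```

The same wrapping would apply to the cartridge count sensor and to the other colours.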

I refresh this once per day in the sensor config. I also have a pretty good idea of when the printer is printing, because it shares a UPS with my NAS and causes a spike in UPS load. I use that spike to trigger an update via this automation:

- alias: "Printer - update sensors"

  description: >
    Updates printer REST data when UPS load suggests printing
    or when page count increases (ongoing job).

  id: b1b97a07-252b-4f4b-ab20-68ccfee3e571

  trigger:
    - platform: numeric_state
      entity_id: sensor.ups_load
      above: '10'
    - platform: state
      entity_id: sensor.printer_total_page_count

  action:
    - delay: 60
    - service: homeassistant.update_entity
      target:
        entity_id: sensor.printer_total_page_count

I used to do the same thing, then found this custom component:

It doesn’t cover quite everything you have, though.

I’m aware of the integrations but prefer rolling my own code where I can. I’m then less reliant on other people dealing with breaking changes between updates, for example.

Is it correct to assume that calling the update_entity service on that single target sensor in your example triggers the REST call so that ALL the defined sensors are updated, or is only that single sensor updated?

I stumbled on this thread because I’m switching from the RESTful sensor to the REST integration. I used update_entity in an automation for the single sensor, and it worked OK. Now that I’m migrating to the REST integration with multiple sensors for the endpoint, I didn’t know whether I had to target all the sensors with update_entity or whether one would be enough to trigger the REST call and update them all.

Thanks for any clarification on this point.

Updating one sensor triggers the REST call, which causes re-calculation of all the related sensors.

That’s what I thought, just wanted to be sure. 🙂