Trying to make a scrape sensor and failing miserably

I have a webpage which is my media player on the Pi, J.River Media Center. The address is: http://192.168.3.1:16000/mcws/v1/Playback/Info?Zone=-1

This is what comes back:

<Response Status="OK">
<Item Name="ZoneID">0</Item>
<Item Name="ZoneName">Hovedsone</Item>
<Item Name="State">2</Item>
<Item Name="FileKey">13659</Item>
<Item Name="NextFileKey">4034279</Item>
<Item Name="PositionMS">149492</Item>
<Item Name="DurationMS">255791</Item>
<Item Name="ElapsedTimeDisplay">2:29</Item>
<Item Name="RemainingTimeDisplay">-1:46</Item>
<Item Name="TotalTimeDisplay">4:15</Item>
<Item Name="PositionDisplay">2:29 / 4:15</Item>
<Item Name="PlayingNowPosition">0</Item>
<Item Name="PlayingNowTracks">9</Item>
<Item Name="PlayingNowPositionDisplay">1 of 9</Item>
<Item Name="PlayingNowChangeCounter">12</Item>
<Item Name="Bitrate">256</Item>
<Item Name="Bitdepth">64</Item>
<Item Name="SampleRate">44100</Item>
<Item Name="Channels">2</Item>
<Item Name="Chapter">0</Item>
<Item Name="Volume">1</Item>
<Item Name="VolumeDisplay">100% (+0,0 dB)</Item>
<Item Name="ImageURL">MCWS/v1/File/GetImage?File=13659</Item>
<Item Name="Artist">AC/DC</Item>
<Item Name="Album">Back in Black</Item>
<Item Name="Name">Back in Black</Item>
<Item Name="Status">Playing</Item>
</Response>

I am looking for Status (in this case “Playing”), Artist, Name (which is song title) and PositionDisplay (which is minutes/seconds played of total minutes and seconds). But I have been hitting a brick wall for an hour, I just can’t understand how to configure this. Can somebody please help me out?

sensor:
  - platform: rest
    resource: http://192.168.3.1:16000/mcws/v1/Playback/Info?Zone=-1
    name: "Music Player Info"
    value_template: "OK"
    xml_attributes: false
    scan_interval: 30
    json_attributes:
      - Artist
      - Name
      - Status
      - PositionDisplay
    value_template: "{{ value_json.Response.Status }}"

  - platform: template
    sensors:
      music_player_artist:
        value_template: "{{ state_attr('sensor.music_player_info', 'Artist') }}"
        friendly_name: "Artist"
      music_player_name:
        value_template: "{{ state_attr('sensor.music_player_info', 'Name') }}"
        friendly_name: "Song Name"
      music_player_status:
        value_template: "{{ state_attr('sensor.music_player_info', 'Status') }}"
        friendly_name: "Playback Status"
      music_player_position:
        value_template: "{{ state_attr('sensor.music_player_info', 'PositionDisplay') }}"
        friendly_name: "Position Display"

I’m old school: I know it might be possible to extract this with the UI helper, but I’m still using the YAML version in my config.

I can’t test what I wrote as I don’t have access to your internal url http://192.168.3.1:16000/mcws/v1/Playback/Info?Zone=-1 obviously.

But looking at what I’ve done, it should work. Even if it is XML, not JSON, Home Assistant should be able to parse it as JSON.

You can adjust the scan_interval, but be careful not to kill your LAN.

Thanks for answering! :+1: I had to remove xml_attributes, because that is not a valid option for the REST sensor. But the sensors are still shown as “unknown”. I get this in the log:

Logger: homeassistant.helpers.template
Source: helpers/template.py:2651
First occurred: 21:13:24 (8 occurrences)
Last logged: 21:16:54

Template variable warning: 'dict object' has no attribute 'Status' when rendering '{{ value_json.Response.Status }}'

So it seems it doesn’t see that. And I see in the REST sensor that it should convert it, but I don’t understand the specs:

A list of keys to extract values from a JSON dictionary result and then set as sensor attributes. If the endpoint returns XML with the text/xml, application/xml or application/xhtml+xml content type, it will automatically be converted to JSON according to this specification

Edit: I found a double value_template in the first sensor, one with OK and one with {{ value_json.Response.Status }} but removing the OK one didn’t help.

Hmmm … I wish I could help, but I don’t have the time right now to set up a web server that replies with the same XML.
Sorry for the mistakes, copy/paste from multiple sensors was not an easy task :wink:

I’ll have a look later. Hold on tight :slight_smile:

Thanks, of course I understand that you don’t have time to set up a web server for me. :slight_smile: If you need to see the info, it would be one easy install of the trial version of J.River Media Center 32, and you’ll have the same server (you’d just need to enable Tools/Options/Media Network/Use Media Network, and then, in the Advanced menu seven entries below that, set TCP Port to 16000; that would make an identical setup). And I’m not in a hurry, I can’t solve this myself anyway.

Ok, first, as your XML is more than 255 characters, the rest sensor will not work as such. Sorry, it will need some reworking.

Scrape could work, but you’ll end up with selectors like #folder0 > div.opened > div:nth-child(15) > span:nth-child(2) to pick up the Artist, which I don’t find very convenient or maintainable.

I’ll try to help you again later

Ok, got a solution, I think

First, create a RESTful command

rest_command:
  get_playback_info:
    url: "http://192.168.3.1:16000/mcws/v1/Playback/Info?Zone=-1"
    method: GET

Then create an automation to fill the various input_text helpers (create them via the UI). I did it for one field, but you can handle the others by adding a new action to the list for each.

alias: Get Playback Info
description: ""
trigger:
  - platform: time_pattern
    seconds: /30
action:
  - action: rest_command.get_playback_info
    response_variable: playback_info
    data: {}
  - delay: "00:00:02"
  - action: input_text.set_value
    data:
      entity_id: input_text.media_player_info_artist
      value: >-
        {{ (playback_info | regex_findall_index('<Item
        Name="Artist">(.*?)</Item>')) }}

Not sure that you need the delay; adjust it or remove it after testing.
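
For the other fields, the extra actions might look like the sketch below. It simply mirrors the Artist action above with a different Name="..." in the pattern. The helper names input_text.media_player_info_name and input_text.media_player_info_status are placeholders of mine, so swap in whatever you created in the UI.

```yaml
# Sketch only: append these to the `action:` list of the automation above.
# The entity_id values are hypothetical helper names, not ones from this thread.
  - action: input_text.set_value
    data:
      entity_id: input_text.media_player_info_name
      value: >-
        {{ (playback_info | regex_findall_index('<Item Name="Name">(.*?)</Item>')) }}
  - action: input_text.set_value
    data:
      entity_id: input_text.media_player_info_status
      value: >-
        {{ (playback_info | regex_findall_index('<Item Name="Status">(.*?)</Item>')) }}
```

The pattern for Status only matches the <Item Name="Status"> element, so the Status="OK" attribute on the <Response> tag should not get in the way.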

Thank you very much for your help! That works! :grin:

That’s an inelegant solution when REST sensors can read XML. Here’s a RESTful integration version that directly populates sensors and does not require regex:

rest:
  - resource: http://192.168.3.1:16000/mcws/v1/Playback/Info?Zone=-1
    scan_interval: 30
    sensor:
      - name: "Artist"
        value_template: >
          {{ (value_json['Response']['Item']|selectattr('@Name','==','Artist')|first)['#text'] }}
      - name: "Name"
        value_template: >
          {{ (value_json['Response']['Item']|selectattr('@Name','==','Name')|first)['#text'] }}
      - name: "Status"
        value_template: >
          {{ (value_json['Response']['Item']|selectattr('@Name','==','Status')|first)['#text'] }}
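
The original question also asked for PositionDisplay. Following the same pattern, a fourth sensor could look something like this. The sensor name "Position" is just my pick, and the entry is meant to be added to the existing sensor: list rather than pasted as a second rest: block.

```yaml
rest:
  - resource: http://192.168.3.1:16000/mcws/v1/Playback/Info?Zone=-1
    scan_interval: 30
    sensor:
      # ...the Artist, Name and Status sensors from above go here unchanged...
      # "Position" is an assumed name; pick whatever suits your setup.
      - name: "Position"
        value_template: >
          {{ (value_json['Response']['Item']|selectattr('@Name','==','PositionDisplay')|first)['#text'] }}
```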

I tried that one after my initial comment, but the error was

homeassistant.exceptions.InvalidStateError: Invalid state with length 1245. State max length is 255 characters.

But @Mastiff, feel free to change, it is indeed a more elegant solution if it is working for you.

You’d get that error if you omitted the value_template, as it would try to store the entire response in the sensor state, which is over the limit.

The RESTful integration allows multiple sensors to be populated from a single call to the resource, without needing the entire response to fit in 255 characters.
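
In case it helps to see why the templates use '@Name' and '#text': after the XML-to-JSON conversion, value_json has roughly the shape below. I have written it as YAML purely for illustration (it is not configuration), and it assumes the usual convention where XML attributes get an @ prefix and element text ends up under #text.

```yaml
# Illustration only: approximate shape of value_json for the response in the first post.
Response:
  "@Status": "OK"        # attribute on <Response>, not the playback status
  Item:                  # the repeated <Item> elements become a list
    - "@Name": "ZoneID"
      "#text": "0"
    - "@Name": "Artist"
      "#text": "AC/DC"
    - "@Name": "Status"
      "#text": "Playing"
    # ...remaining items omitted...
```

That would also explain the earlier warning about value_json.Response.Status: there is no Status key under Response, only @Status and a list of items that has to be filtered by @Name.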

That worked, thanks! And I’m guessing with a bit less processing. I am going to fire this every 5 seconds, so a bit less processing is good. :+1: