Help Using the Scrape Sensor

I’m sure I’m just an idiot on multiple fronts, but I both can’t figure out the right place to put this post and I’m having trouble using the Scrape sensor for my intended purposes.

Background: I have a CNC machine controlled via UGS running on an old laptop. It hosts a local webpage that can be accessed (for instance on one’s phone) so you can control the machine without interacting directly with said laptop (i.e. use it as a ‘pendant’). I’d really like to be able to pull into HA a basic ‘state’ of my CNC machine and this local webpage has a field which identifies exactly this. Unfortunately I know nothing about HTML and have been beating my head against a wall trying to pull this in.

Here’s a screenshot of both the webpage and the relevant line I’m trying to scrape (note the current state of ‘IDLE’):

and here’s the ‘inspect’ pane zoomed in with the relevant line highlighted (I couldn’t find a way to copy/paste the text):
image

I got the Scrape sensor set up, but I’m struggling to configure it to grab this field (the only value I’ve ever gotten out of it is ‘unknown’). Could someone help me out?

Thanks an absolute ton in advance!

In my continued attempt to get this working, I came across this post:

However, even when using this method, all I get is a value of “unknown”.

I also went ahead and created a sensor using the example from the HA Scrape page to pull the current HA release and it works without any issues, so I’m extra-stumped as clearly my methodology ‘works’, just not with my specific webpage and value I’m trying to pull.

Any ideas?

This may not work at all and apologies if it doesn’t help but give this a go:

scrape:
  - resource: https://your-url/your-page.html
    sensor:
      - name: State
        select: "div"
        index: 7

I’m hoping this returns the contents of the 8th DIV (the index is zero-based, so 7 is the eighth match) at the URL you provide.

I’ve always done scraping in Node-Red, so haven’t tried the Scrape Sensor personally but interested if it works.

Thanks for the quick reply! Unfortunately, no luck; still just got ‘unknown’.

Looks like the site is built with Angular, and it’s probably retrieving your desired value via a separate (JSON) request. The Scrape sensor only sees the initial HTML and doesn’t run JavaScript, so the value isn’t there to scrape.
You might want to look in the network tab of your browser’s developer tools and see if any of those requests there is retrieving the value you want.

To check, do View Source rather than Inspect — Ctrl-U on Chrome / Windows, for example. If the state that you want is not visible in there, then you’ll need to investigate other methods as @danieldotnl suggests.

@danieldotnl & @Troon

Looks like it’s pulling in the data via JavaScript. When doing View Source there isn’t any ‘data’, just a single line that, to my layman’s eyes, looks like what the page uses to pull everything in:

<script src="runtime.js" type="module"></script><script src="polyfills.js" type="module"></script><script src="main.js" type="module"></script>

When looking at the network tab, I see a waterfall of requests, but they all look basically identical to me. Does it make sense that it would be pulling everything in at once like that? Guessing there’s no way for me to parse out what I want?

Thanks a ton for all your help!

Have a look at the response for one of those getStatus lines. You might be able to call that URL directly if it’s a simple setup with no authentication.

@Troon
No authentication, etc. is required.

Here’s one of the responses (took me longer than I care to admit to figure out how to find this):

{"machineCoord":{"x":0.0,"y":175.0,"z":0.0,"a":"NaN","b":"NaN","c":"NaN","units":"MM"},"workCoord":{"x":-176.0,"y":95.5,"z":28.463,"a":"NaN","b":"NaN","c":"NaN","units":"MM"},"feedSpeed":0.0,"spindleSpeed":0.0,"state":"IDLE","rowCount":0,"completedRowCount":2,"remainingRowCount":0,"fileName":"Za Lord.nc","sendDuration":0,"sendRemainingDuration":-1}

A little over halfway is what I’m after: "state":"IDLE".

Is there a simple way for me to pull this value (in this case, “IDLE”) into Home Assistant?

I apologize for being completely ignorant about this; this isn’t my world at all. Thanks a ton for being patient.

RESTful sensor:

Use the getStatus URL as the resource, and this:

value_template: "{{ value_json['state'] }}"
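
For anyone landing here later, a minimal sketch of what the complete sensor might look like in configuration.yaml. The resource URL is a placeholder (the thread doesn’t give the real one; copy the getStatus request URL from your browser’s network tab), and the sensor name and scan interval are just assumptions for illustration:

```yaml
# configuration.yaml
sensor:
  - platform: rest
    # Hypothetical name; call it whatever you like.
    name: CNC State
    # Placeholder: replace with the getStatus URL seen in the network tab.
    resource: http://LAPTOP_IP:PORT/PATH/TO/getStatus
    # Pull the "state" key out of the JSON response (e.g. IDLE).
    value_template: "{{ value_json['state'] }}"
    # Assumed polling interval in seconds; adjust to taste.
    scan_interval: 10
```

After restarting Home Assistant, the sensor should report whatever the "state" field contains, rather than ‘unknown’, as long as the laptop is reachable from the HA host.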

Thank you an absolute ton! I’d started down that path but I knew I wouldn’t be able to parse out the correct formatting.

Now I can even pull in some other data that I don’t really need, but why not?!
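
If it helps anyone else, here is a rough sketch of how several values could be pulled from the same getStatus response with a single request, using the rest integration. The keys (state, fileName, workCoord.x, completedRowCount) come from the JSON posted above; the URL is a placeholder and the sensor names are made up for illustration:

```yaml
# configuration.yaml
rest:
  # Placeholder: replace with the getStatus URL seen in the network tab.
  - resource: http://LAPTOP_IP:PORT/PATH/TO/getStatus
    # Assumed polling interval in seconds.
    scan_interval: 10
    sensor:
      # All sensor names below are hypothetical.
      - name: CNC State
        value_template: "{{ value_json.state }}"
      - name: CNC File
        value_template: "{{ value_json.fileName }}"
      - name: CNC Work X
        value_template: "{{ value_json.workCoord.x }}"
        unit_of_measurement: "mm"
      - name: CNC Completed Rows
        value_template: "{{ value_json.completedRowCount }}"
```

Grouping the sensors under one resource means the machine is polled once per interval instead of once per sensor.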
