Multiscrape setup for an idiot

Hi all,
Reluctantly posting after searching for hours and trying what feels like millions of different configs. Getting started with Home Assistant is tricky!

The goal: get this data into Home Assistant to trigger stuff:

https://remote.mkovedata.com/send_dataview.php?id=um5iexUA&identifier=The+Shack

Fail. Nothing is showing. Logs are blank.

I currently have all the multiscrape files in their folder and have tried reloading.

Just have this added in configuration.yaml at the moment as a generic to just get something to show.

Have tried rest, but that was also a fail. It’s like HA just doesn’t see anything.

Thanks so much for any help!


No need for multiscrape.
A standard REST sensor will do the job because it’s already JSON.
Oh wait.

The connection stays open, and keeps streaming data. I’m not sure if there is an integration in Home Assistant that can cope with that.

No specific integration to do what, sorry?

I thought it was common for people to retrieve data like this to control other devices through automations?

So what options do I have to get the data into Home Assistant so I can set up different rules to trigger other events?

Yes they do, but usually you request data, and the data is delivered and then the connection closed.

This resource does not do that, it keeps the connection open and keeps sending new data. That’s much harder to deal with.
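To make the difference concrete, here is a minimal Python sketch of a client that reads a stream like this, keeps only the first event, and then closes the connection itself. The function names and the 20 second timeout are my own assumptions, not something the server or Home Assistant defines, and the User-Agent just mimics a browser.

```python
import urllib.request
from typing import Iterable, Optional

# URL taken from the first post in this thread.
STREAM_URL = "https://remote.mkovedata.com/send_dataview.php?id=um5iexUA&identifier=The+Shack"


def first_data_line(lines: Iterable[bytes]) -> Optional[str]:
    """Return the payload of the first 'data:' line and stop reading."""
    for raw in lines:
        text = raw.decode("utf-8", errors="replace").strip()
        if text.startswith("data:"):
            return text[len("data:"):].strip()
    return None


def fetch_once(url: str = STREAM_URL, timeout: float = 20.0) -> Optional[str]:
    """Open the stream, keep only the first event, then close the connection."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    # Leaving the 'with' block closes the socket, so the client ends the stream
    # even though the server would happily keep sending.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return first_data_line(response)
```

Calling fetch_once() should return one payload string, or raise a timeout error if nothing arrives within the timeout. I have not run this against the live server, so treat it as a starting point.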

In the meantime though - I wrote the code for fixing the JSON, so if we can find a way to get the data, we can convert it to something we can use.

{% set json = ("{" ~ value|replace("data:","\"data\":")|replace("[\"","[")|replace("}\"","}") ~ "}")|from_json %}

At that point we can then access the values like: {{ json['data'][0]['volts'] }}
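For anyone doing the conversion outside a template, the same idea can be sketched in Python. Each event is a JSON array whose first element is itself a JSON-encoded string, so it needs decoding twice. The function name parse_event is made up for illustration, and the sample line is copied from the log output posted in this thread.

```python
import json


def parse_event(line: str) -> dict:
    """Turn one 'data:[...]' line into a flat dict of readings."""
    payload = line.strip()
    if payload.startswith("data:"):
        payload = payload[len("data:"):]
    # Outer layer: a two-element array of [encoded readings, timestamp].
    encoded_readings, timestamp = json.loads(payload)
    # Inner layer: the readings are a JSON object stored as a string.
    readings = json.loads(encoded_readings)
    readings["timestamp"] = timestamp
    return readings


sample = r'data:["{\"volts\":52.41,\"amps\":-8.24,\"watts\":-432.03,\"soc\":55.65,\"basz\":280,\"batyp\":1}","2023-12-28 03:09:22"]'
print(parse_event(sample)["volts"])  # 52.41
```

Decoding twice avoids string replacements altogether, which matters here because the raw stream escapes the inner quotes with backslashes.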

I’ve been testing against that server to see if we can get it to spit out ONE bit of data and then close the connection. But it’s ignoring the headers sent to it, including Connection: close.

Thank you for looking into it.

I left my HA for a few hrs and got these logs.

Page_response_headers

Headers({'date': 'Wed, 27 Dec 2023 16:09:36 GMT', 'server': 'Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/8.1.26', 'x-powered-by': 'PHP/8.1.26', 'cache-control': 'no-cache', 'x-accel-buffering': 'no', 'connection': 'close', 'transfer-encoding': 'chunked', 'content-type': 'text/event-stream;charset=UTF-8'})

Page_soup

data:["{\"volts\":52.41,\"amps\":-8.24,\"watts\":-432.03,\"soc\":55.65,\"basz\":280,\"batyp\":1}","2023-12-28 03:09:22"]

data:["{\"volts\":52.41,\"amps\":-8.22,\"watts\":-430.68,\"soc\":55.63,\"basz\":280,\"batyp\":1}","2023-12-28 03:09:38"]

data:["{\"volts\":52.43,\"amps\":-5.34,\"watts\":-279.75,\"soc\":55.62,\"basz\":280,\"batyp\":1}","2023-12-28 03:09:53"]

data:["{\"volts\":52.44,\"amps\":-5.35,\"watts\":-280.47,\"soc\":55.62,\"basz\":280,\"batyp\":1}","2023-12-28 03:10:08"]

data:["{\"volts\":52.44,\"amps\":-5.50,\"watts\":-288.66,\"soc\":55.61,\"basz\":280,\"batyp\":1}","2023-12-28 03:10:24"]

data:["{\"volts\":52.44,\"amps\":-5.54,\"watts\":-290.72,\"soc\":55.60,\"basz\":280,\"batyp\":1}","2023-12-28 03:10:39"]

data:["{\"volts\":52.45,\"amps\":-5.36,\"watts\":-281.20,\"soc\":55.59,\"basz\":280,\"batyp\":1}","2023-12-28 03:10:54"]

data:["{\"volts\":52.45,\"amps\":-5.36,\"watts\":-281.21,\"soc\":55.58,\"basz\":280,\"batyp\":1}","2023-12-28 03:11:10"]

data:["{\"volts\":52.45,\"amps\":-5.36,\"watts\":-281.23,\"soc\":55.57,\"basz\":280,\"batyp\":1}","2023-12-28 03:11:25"]

data:["{\"volts\":52.45,\"amps\":-5.32,\"watts\":-279.19,\"soc\":55.57,\"basz\":280,\"batyp\":1}","2023-12-28 03:11:40"]

data:["{\"volts\":52.45,\"amps\":-5.47,\"watts\":-286.69,\"soc\":55.56,\"basz\":280,\"batyp\":1}","2023-12-28 03:11:55"]

data:["{\"volts\":52.45,\"amps\":-5.65,\"watts\":-296.21,\"soc\":55.55,\"basz\":280,\"batyp\":1}","2023-12-28 03:12:11"]

data:["{\"volts\":52.43,\"amps\":-9.01,\"watts\":-472.36,\"soc\":55.54,\"basz\":280,\"batyp\":1}","2023-12-28 03:12:26"]

data:["{\"volts\":52.39,\"amps\":-13.36,\"watts\":-699.88,\"soc\":55.52,\"basz\":280,\"batyp\":1}","2023-12-28 03:12:41"]

data:["{\"volts\":52.37,\"amps\":-14.61,\"watts\":-764.81,\"soc\":55.50,\"basz\":280,\"batyp\":1}","2023-12-28 03:12:57"]

data:["{\"volts\":52.36,\"amps\":-14.55,\"watts\":-761.94,\"soc\":55.48,\"basz\":280,\"batyp\":1}","2023-12-28 03:13:12"]

data:["{\"volts\":52.36,\"amps\":-11.70,\"watts\":-612.52,\"soc\":55.46,\"basz\":280,\"batyp\":1}","2023-12-28 03:13:27"]

data:["{\"volts\":52.39,\"amps\":-8.48,\"watts\":-444.14,\"soc\":55.44,\"basz\":280,\"batyp\":1}","2023-12-28 03:13:43"]

data:["{\"volts\":52.40,\"amps\":-8.36,\"watts\":-438.08,\"soc\":55.43,\"basz\":280,\"batyp\":1}","2023-12-28 03:13:58"]

data:["{\"volts\":52.40,\"amps\":-8.30,\"watts\":-434.71,\"soc\":55.42,\"basz\":280,\"batyp\":1}","2023-12-28 03:14:13"]

data:["{\"volts\":52.40,\"amps\":-8.19,\"watts\":-429.29,\"soc\":55.40,\"basz\":280,\"batyp\":1}","2023-12-28 03:14:28"]

data:["{\"volts\":52.41,\"amps\":-8.06,\"watts\":-422.50,\"soc\":55.39,\"basz\":280,\"batyp\":1}","2023-12-28 03:14:44"]

data:["{\"volts\":52.41,\"amps\":-8.27,\"watts\":-433.39,\"soc\":55.38,\"basz\":280,\"batyp\":1}","2023-12-28 03:14:59"]

data:["{\"volts\":52.43,\"amps\":-5.52,\"watts\":-289.26,\"soc\":55.37,\"basz\":280,\"batyp\":1}","2023-12-28 03:15:14"]

data:["{\"volts\":52.43,\"amps\":-5.35,\"watts\":-280.45,\"soc\":55.36,\"basz\":280,\"batyp\":1}","2023-12-28 03:15:30"]

data:["{\"volts\":52.44,\"amps\":-5.32,\"watts\":-279.12,\"soc\":55.35,\"basz\":280,\"batyp\":1}","2023-12-28 03:15:45"]

data:["{\"volts\":52.44,\"amps\":-5.47,\"watts\":-286.62,\"soc\":55.34,\"basz\":280,\"batyp\":1}","2023-12-28 03:16:00"]

Page_request_body looks much the same

Nothing showing in entities or devices

This is killing me.



default_config:

frontend:
  themes: !include_dir_merge_named themes

automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml

logger:
  default: warning
  logs:
    homeassistant.components.command_line: debug

command_line:
  sensor:
    name: "BBattery"
    unique_id: bbattery_sensor
    command: "bash /config/fetch_data.sh"
    json_attributes:
      - volts
      - amps
      - watts
      - soc
      - basz
      - batyp
    value_template: '{{ value_json[0].volts }}' 

////////////////////////////////////////////////////////
The above is my latest test, using a command_line sensor. The fetch_data.sh script it calls is below.



#!/bin/bash

while true; do
  # Fetch data and extract the JSON string
  response=$(curl -A 'Mozilla/5.0' -m 8 -s 'https://remote.mkovedata.com/send_dataview.php?id=um5iexUA&identifier=The+Shack')

  # Extract the JSON string from the array and remove the timestamp
  json_string=$(echo "$response" | awk -F'[][]' '{print $2}' | sed 's/,"[^"]*"$//')

  # Check if the JSON string is not empty
  if [ -n "$json_string" ]; then
    # Output the raw JSON for debugging
    echo "Raw JSON: $json_string"

    # Attempt to parse the JSON for additional debugging
    jq . <<< "$json_string"
  else
    echo "Failed to fetch data. Full response: $response"
  fi

  sleep 15
done

////////////////////////////////////////////////////////////////

When I run the fetch_data.sh script I get:

[core-ssh ~]$ bash /config/fetch_data.sh

Raw JSON: "{\"volts\":55.02,\"amps\":0.26,\"watts\":14.29,\"soc\":100.00,\"basz\":280,\"batyp\":1}"

"{\"volts\":55.02,\"amps\":0.26,\"watts\":14.29,\"soc\":100.00,\"basz\":280,\"batyp\":1}"

/////////////////////////////////

So I believe it’s being received correctly.

In HA the sensor shows as unknown…

Right, but your bash script is a loop, that never ends.
It’s exactly the same problem as trying to scrape the server.

Home Assistant needs ONE set of data to be spit out, so that it can be processed.
You can’t stream data to Home Assistant over http or a bash script.

If you don’t want to figure that out, then you might be better looking at creating an AppDaemon script which can sit and listen to the streaming server if you wish, and you can parse the data yourself and update a sensor yourself. But you will need to read up on how to write Python scripts to integrate with Home Assistant.

This is probably the only option: Python script calling the HA REST API.
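As a rough idea of what that looks like, here is a minimal Python sketch that builds the request for Home Assistant's REST API endpoint POST /api/states/<entity_id>. The base URL, the entity ID sensor.bbattery_volts and the token value are placeholders I made up. You would create a long-lived access token in your HA user profile and substitute your own values.

```python
import json
import urllib.request


def build_state_request(base_url: str, token: str, entity_id: str,
                        state, attributes: dict) -> urllib.request.Request:
    """Build (but do not send) a POST that sets an entity's state in HA."""
    body = json.dumps({"state": str(state), "attributes": attributes})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/states/{entity_id}",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_state_request(
    "http://homeassistant.local:8123",
    "YOUR_LONG_LIVED_TOKEN",
    "sensor.bbattery_volts",
    52.41,
    {"unit_of_measurement": "V", "friendly_name": "BBattery volts"},
)
```

Sending it is then a call to urllib.request.urlopen(request, timeout=10). An entity created this way only exists as a state, so it would need to be posted again after a Home Assistant restart.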

Yeah I am willing to learn. This key data is extremely important in my life and will save SO many issues if I can use this data in automations.

Thank you. Where should I read up on how to do this?

ChatGPT has only got me so far haha.

The other possibility is to experiment with the timeout. It looks like it spits out data every 5 seconds or so, so provided the timeout is less than 5 seconds (I’d recommend 2 or 3, to be honest), you would only get one line of data. As long as you can get to the point where you only get one line of data and then the connection is closed, it would be possible to help.

In the meantime though - I am going to experiment with Node-RED and see if that is a possibility, because it is more likely to be able to cope with streaming data.

Oh wow, thanks heaps. I have tried so much and spent days and just given up for now, discouraged. Outside my skill level. But would be very grateful. Thank you.

Can you send me a reminder during the day (GMT)? I completely forgot to look into this.