Error scraping APSystems ECU-3 webpage

Hi, first post here so let me know if this is in the wrong place :wink:

I’m trying to set up a scraper to get solar data from the solar controller’s web page, but I keep getting this error in the log:

Error fetching data: http://192.168.68.120/cgi-bin/home/ failed with illegal header line: bytearray(b'debug9')

In my YAML config I have this:
sensor:

 - platform: scrape
   name: APSystems 
   resource: "http://192.168.68.120/cgi-bin/home/" # your ecu ip address
   select: "tr:nth-child(2) > td:nth-child(2)"
   value_template: '{{ ((value.split(" ")[0]) | replace ("KW", "")) }}'
   unit_of_measurement: "kWh"
   headers:
      User-Agent: Mozilla/5.0

The page’s source (sorry, I don’t know how to display it better):

<html><head><meta http-equiv=pragma content=no-cache><meta http-equiv=expire content=now><title></title><script type="text/javascript" src="http://gc.kis.v2.scr.kaspersky-labs.com/FD126C42-EBFA-4E12-B309-BB3FDD723AC1/main.js?attr=6vTldPB7mX88hZJxLeAMQcGRUOPM43nNwfnYm5N3ZKv8hr8BzpJgTQjKg36p7jFUX0gJClEopUR7Tn7pDDX9iA" charset="UTF-8"></script></head><body bgcolor=ffffff text=black><form action=config.cgi method=get><br><br><table align=center border=1 cellpadding=0 cellspacing=0 bordercolor=#008000 bordercolorlight=#ffffff borderdark=#808000 width=500><center><tr><td align=center>ECU ID</td><td align=center>203000016362</td></tr></center><center><tr><td align=center>Lifetime generation</td><td align=center>39567.56 kWh</td></tr></center><center><tr><td align=center>Last System Power</td><td align=center>879 W</td></tr></center><center><tr><td align=center>Generation Of Current Day</td><td align=center>13.81 kWh</td></tr></center><center><tr><td align=center>Last connection to website</td><td align=center>2021-04-11 16:23:09</td></tr></center><center><tr><td align=center>Number of Inverters</td><td align=center>10</td></tr></center><center><tr><td align=center>Last Number of Inverters Online</td><td align=center>10</td></tr></center><center><tr><td align=center>Current Software Version</td><td align=center>V3.11.4</td></tr></center><center><tr><td align=center>Database Size</td><td align=center>29298 kB</td></tr></center><center><tr><td align=center>Current Timezone</td><td align=center>Australia/Melbourne</td></tr></center><center><tr><td align=center>ECU Mac Address</td><td align=center>80:97:1B:00:40:3F</td></tr></center></table><br><br><hr></hr><center><tr><td>&copy2013 Altenergy Power System Inc.</td></tr></center></body></html>

I’ve googled the error but can’t find anything out about it. To my limited knowledge it looks like it is receiving something not expected, but I can’t figure out where I’m going wrong.

If anyone can help, it would be very much appreciated.
Thanks.

I’m now wondering if it is because the page is in /cgi-bin/home. Can any web people out there tell me if that page would be available to Home Assistant, or is it only available to a web browser?
Thanks
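One way to check what the ECU actually sends, independently of Home Assistant, is to look at the raw header lines of the response. The error text suggests a stray `debug9` line among the HTTP headers, but that is an inference from the log message, not something confirmed here. A minimal sketch using only the standard library follows; `fetch_raw` and `find_illegal_header_lines` are names made up for illustration, and the sample response is invented to resemble what the error implies:

```python
import socket


def fetch_raw(host: str, path: str, port: int = 80) -> bytes:
    """Fetch a page over a plain socket so no HTTP library rejects the headers."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def find_illegal_header_lines(raw_response: bytes) -> list:
    """Return header lines that are not 'Name: value' pairs."""
    # Normalise line endings, since CGI scripts sometimes send bare \n
    text = raw_response.replace(b"\r\n", b"\n")
    # Headers end at the first blank line; everything after it is the body
    head = text.partition(b"\n\n")[0]
    lines = head.split(b"\n")
    bad = []
    # lines[0] is the status line (e.g. b"HTTP/1.0 200 OK"), so skip it
    for line in lines[1:]:
        name, sep, _ = line.partition(b":")
        if not sep or not name.strip():
            bad.append(line)
    return bad


# Hypothetical response, invented to match what the error message implies
sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"debug9\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html></html>"
)
print(find_illegal_header_lines(sample))
```

Against the real device it would be called as `find_illegal_header_lines(fetch_raw("192.168.68.120", "/cgi-bin/home/"))`. If a line like `debug9` shows up, the page is reachable and the problem is the malformed headers, not the /cgi-bin/ location.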

If anyone else is looking for answers to this: the APS ECU-3 I have is running software version V3.11.4.
It seems that trying to scrape data from its web server doesn’t work in YAML or Node-RED, as it returns malformed pages.
I am a beginner when it comes to programming, so this is messy, and there are probably better ways to do it, but here is what I did:
Installed AppDaemon.
Wrote this app by plagiarizing from the forums:


import appdaemon.plugins.hass.hassapi as hass
import requests
from bs4 import BeautifulSoup
from datetime import datetime


#
# APS web scraper
#

class ApsScraper(hass.Hass):

  def initialize(self):
   
    now = datetime.now()
    #APS ECU only updates once per 5 mins :(
    self.run_every(self.get_values, now, 5 * 60)


  def get_values(self,kwargs):
      
      self.log("watch me do my thing")
      self.url = 'http://192.168.68.120/cgi-bin/home'
      self.sensorname = "sensor.aps"
      self.friendly_name = "APS Energy Monitor"

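      try:
        response = requests.get(self.url,timeout=5)
      except requests.RequestException as err:
        # Covers timeouts, refused connections and similar network errors
        self.log("i couldnt read the APS page: %s", err)
        return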
      page = response.content

      soup = BeautifulSoup(page, "html.parser")

      # The page has a single table of label/value rows
      table = soup.find('table')

      table_data = []

      table_rows = table.find_all('tr')
      for row in table_rows:
        table_cols = row.find_all('td')
        table_cols = [ele.text.strip() for ele in table_cols]
        table_data.append([ele for ele in table_cols if ele])  # Get rid of empty values


      ecu_id = table_data[0][1]
      lifetime_generation = table_data[1][1]
      last_system_power = table_data[2][1]
      generation_of_current_day = table_data[3][1]
      last_connection_to_website = table_data[4][1]
      number_of_inverters = table_data[5][1]
      last_number_of_inverters_online = table_data[6][1]
      current_software_version = table_data[7][1]
      database_size = table_data[8][1]
      current_timezone = table_data[9][1]
      ecu_mac_address = table_data[10][1]

      self.set_state(self.sensorname, state = "on", attributes = {"ecu_id": ecu_id, "lifetime_generation": lifetime_generation, 
      "last_system_power": last_system_power, "generation_of_current_day": generation_of_current_day, "last_connection_to_website" : last_connection_to_website,
      "number_of_inverters":number_of_inverters,"last_number_of_inverters_online":last_number_of_inverters_online, "current_software_version":current_software_version,
      "database_size": database_size,"current_timezone":current_timezone, "ecu_mac_address":ecu_mac_address})
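Indexing by position (`table_data[2][1]` and so on) relies on the rows always being in the same order. An alternative sketch that looks values up by their label is below. It uses only the standard library so it can be tried outside AppDaemon; it is not the code from the app above, and `TableCells` and `parse_ecu_table` are names invented for this example. The sample HTML is a shortened copy of the page source posted earlier.

```python
from html.parser import HTMLParser


class TableCells(HTMLParser):
    """Collect the text of each <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag == "td" and self.rows:
            self._in_td = True
            self.rows[-1].append("")      # start a new cell in the current row

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.rows[-1][-1] += data


def parse_ecu_table(html: str) -> dict:
    """Return {label: value} for every row that has exactly two cells."""
    parser = TableCells()
    parser.feed(html)
    parser.close()
    return {row[0].strip(): row[1].strip() for row in parser.rows if len(row) == 2}


# Shortened copy of the page source from the first post
sample = (
    "<table><center><tr><td align=center>ECU ID</td>"
    "<td align=center>203000016362</td></tr></center>"
    "<center><tr><td align=center>Last System Power</td>"
    "<td align=center>879 W</td></tr></center></table>"
    "<center><tr><td>&copy2013 Altenergy Power System Inc.</td></tr></center>"
)
data = parse_ecu_table(sample)
print(data["Last System Power"])
```

The single-cell copyright row is skipped because it does not have two cells, so a lookup such as `data["Lifetime generation"]` keeps working even if the firmware adds or reorders rows.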

Hi,
Thanks a lot for your work. I had the exact same issue with my APS system.
Not sure why the /cgi-bin/home page can’t be read when the /cgi-bin/parameters has no issue (I’m using multiscrape to retrieve individual panel details).

For info, in the Supervisor page, to make this work, we need to add the BeautifulSoup package for AppDaemon:

system_packages: []
python_packages:
  - beautifulsoup4
init_commands: []

And I also updated your app script to use “now” instead of using datetime (I had issues with timezone I believe):

def initialize(self):
    self.log("Setting APS timer")
    #APS ECU only updates once per 5 mins :(
    self.run_every(self.get_values, "now", 5 * 60)

And exposed individual values as numbers to home assistant:

      self.set_state("sensor.panels_lifetime_generation", 
        state = float(lifetime_generation.split()[0]), 
        attributes = {"unit_of_measurement":"kWh","friendly_name": "Lifetime generation"})
      self.set_state("sensor.panels_last_system_power", 
        state = float(last_system_power.split()[0]),
        attributes = {"unit_of_measurement":"W","friendly_name": "Last System Power"})
      self.set_state("sensor.panels_generation_of_current_day", 
        state = float(generation_of_current_day.split()[0]),
        attributes = {"unit_of_measurement":"kWh","friendly_name": "Generation Of Current Day"})
      self.set_state("sensor.panels_number_of_inverters", 
        state = int(number_of_inverters),
        attributes = {"friendly_name": "Number of Inverters"})
      self.set_state("sensor.panels_last_number_of_inverters_online", 
        state = int(last_number_of_inverters_online),
        attributes = {"friendly_name": "Last Number of Inverters Online"})
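The number/unit split used in those `set_state` calls can be pulled into a small helper, which also makes it easy to try outside AppDaemon. This is just a sketch; `split_reading` is a made-up name, and it assumes each cell is a number optionally followed by one unit, as in the page source above.

```python
def split_reading(cell: str):
    """Split a cell like '39567.56 kWh' into (39567.56, 'kWh')."""
    parts = cell.split()
    value = float(parts[0])
    # Cells such as the inverter count have no unit at all
    unit = parts[1] if len(parts) > 1 else None
    return value, unit


print(split_reading("39567.56 kWh"))
print(split_reading("879 W"))
print(split_reading("10"))
```

Cells that are not numeric, such as the timestamp or the software version, would raise `ValueError` here, so only the numeric rows should be passed in.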

Out of curiosity, have you done more since your last post or improved it?
Thanks a lot mate.

I’m glad it helped, thanks for pointing out it needs beautifulsoup4.
I haven’t done anything with it since then, it’s been happily working so I haven’t had a need to change it.

One thing I think I overlooked in mentioning previously is that the APS web page only seems to reload each 5 mins, so there is no point trying any more often (not that you had changed that).

Thanks for adding to it on here, hopefully your changes will help others in the future.

AJ.

Just wanted to say thank you to both of you. I moved into a house with one of these systems 3 years ago, and the lack of ability to track trends and generation history had been annoying me endlessly. I now have Home Assistant displaying and logging all of the available data. Thanks for taking the time to post your work :grinning:

Phil

Thanks for taking the time to post, I appreciate it. :+1:

Hello there,

I own an ECU-R (with some DS3) and an ECU-3 (with 11 YC-500),
and I thought it would be a piece of cake to get the data into H.A.
Well, the ECU-R was done in a day with the ksheumaker integration,
but the ECU-3 is more difficult; it is older and has fewer owners, I guess.

Could somebody please summarize in a short manual which components
are needed, and share this scraping script somewhere?

thanks in advance
Jakkes


Would also greatly appreciate a bit more of a how-to. I’m not familiar with AppDaemon

First let me say sorry I didn’t reply to this sooner.

An update along the way deleted my AppDaemon configs, so I had to relearn how to get it working again.

  1. Go to Settings, Add-ons, and install AppDaemon.
  2. Go to the AppDaemon add-on, Configuration, click the 3 dots, select “Edit in YAML” and add beautifulsoup4:
system_packages: []
python_packages:
  - beautifulsoup4
init_commands: []
  3. Connect over Samba to your Home Assistant share. Under “addon_configs” you should have a folder for AppDaemon, and in there should be an “apps” folder. Make a new file called “aps_scraper.py” and paste this in:

import appdaemon.plugins.hass.hassapi as hass
import requests
from bs4 import BeautifulSoup
from datetime import datetime


#
# APS web scraper
#

class ApsScraper(hass.Hass):

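  def initialize(self):
    # Fetch once at startup if the time is between 02:00 and 23:00
    if self.now_is_between("02:00:00", "23:00:00"):
        self.get_values({})
    # Zero the daily counter at midnight
    self.run_daily(self.reset_current_day, "00:00:00")
    # The APS ECU page only refreshes about once per 5 mins, so polling every 120 s is plenty.
    # "now" avoids the timezone issue reported above with datetime.now()
    self.run_every(self.get_values, "now", 120)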


  def get_values(self,kwargs):
      
      self.url = 'http://192.168.1.33/cgi-bin/home'

      time_is = self.time()
      self.log("getting aps stuff at %s", time_is)
   
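      try:
        response = requests.get(self.url,timeout=5)
      except requests.RequestException as err:
        # Covers timeouts, refused connections and similar network errors
        self.log("i couldnt read the APS page: %s", err)
        return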
      page = response.content

      soup = BeautifulSoup(page, "html.parser")

      # The page has a single table of label/value rows
      table = soup.find('table')

      table_data = []

      table_rows = table.find_all('tr')
      for row in table_rows:
        table_cols = row.find_all('td')
        table_cols = [ele.text.strip() for ele in table_cols]
        table_data.append([ele for ele in table_cols if ele])  # Get rid of empty values


      ecu_id = table_data[0][1]
      lifetime_generation = table_data[1][1]
      last_system_power = table_data[2][1]
      generation_of_current_day = table_data[3][1]
      last_connection_to_website = table_data[4][1]
      number_of_inverters = table_data[5][1]
      last_number_of_inverters_online = table_data[6][1]
      current_software_version = table_data[7][1]
      database_size = table_data[8][1]
      current_timezone = table_data[9][1]
      ecu_mac_address = table_data[10][1]
      
      self.sensorname = "sensor.aps_lifetime_generation"
      self.friendly_name = "APS Energy Monitor Lifetime Generation"
      # The cell reads like "39567.56 kWh", so take the number before the unit
      lifetime_generation_kWh = float(lifetime_generation.split()[0])
      self.set_state(self.sensorname, state = lifetime_generation_kWh, attributes = {"friendly_name": self.friendly_name, "unit_of_measurement": "kWh", "state_class": "total_increasing", "device_class": "energy" })

      self.sensorname = "sensor.aps_last_system_power"
      self.friendly_name = "APS Energy Monitor last_system_power"
      # The cell reads like "879 W", so convert W to kW
      last_system_power_kW = float(last_system_power.split()[0])/1000
      self.set_state(self.sensorname, state = last_system_power_kW, attributes = {"friendly_name": self.friendly_name, "unit_of_measurement": "kW"})

      self.sensorname = "sensor.aps_generation_of_current_day"
      self.friendly_name = "APS Energy Monitor generation_of_current_day"
      # The cell reads like "13.81 kWh"
      generation_of_current_day_kWh = float(generation_of_current_day.split()[0])
      self.set_state(self.sensorname, state = generation_of_current_day_kWh, attributes = {"friendly_name": self.friendly_name, "unit_of_measurement": "kWh", "state_class": "measurement", "device_class": "energy", "last_reset": self.date()})
      
      
  def reset_current_day(self,kwargs):    

      self.sensorname = "sensor.aps_generation_of_current_day"
      self.friendly_name = "APS Energy Monitor generation_of_current_day"
      generation_of_current_day = 0
      self.set_state(self.sensorname, state = generation_of_current_day, attributes = {"unit_of_measurement": "kWh", "state_class": "measurement", "device_class": "energy", "last_reset": self.date()})
      
      time_is = self.time()
      self.log("i'm in def reset_current_day(self,kwargs): ")
      self.log("generation_of_current_day is %s", generation_of_current_day)

You will need to change the IP address.

  4. In the same folder should be apps.yaml. Edit that to include:

aps_web_scraper:
  module: aps_scraper
  class: ApsScraper
  log: test_log

I think that is all, but others might be able to add more detail or a better way to do it.

AJ


Hi

Has anyone got this working on an ECU-3 running firmware version 4.1?
In 4.1 the URLs have changed from /cgi-bin/ to:
http://192.168.2.88/index.php/home → landing page
http://192.168.2.88/index.php/realtimedata → realtime landing page
http://192.168.2.88/index.php/realtimedata/power_graph → power section
http://192.168.2.88/index.php/realtimedata/energy_graph → Energy section
http://192.168.2.88/index.php/management → management

It would be good to scrape the landing page, as it gives this info:
ECU ID
Lifetime generation 2925.64 kWh
Last System Power 166 W
Generation of Current Day 0.29 kWh
Last Connection to website 2018-12-26 12:17:16
Number of Inverters 6
Last Number of Inverters Online 6

The realtime landing page would also be worth scraping, as it gives the details per inverter/panel.

Does anyone have an idea how to get this scraped/imported into HA?
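As a starting point only: the fields listed above look the same as on the older firmware, so the visible text could in principle be matched against the known labels. The sketch below works on plain text lines like the ones pasted in this post. The HTML structure of the 4.1 pages is not known here, so how that text is extracted from the page is left open, and `KNOWN_LABELS` and `parse_landing_text` are names invented for this example.

```python
# Labels as they appear in the 4.1 landing page text quoted in this post
KNOWN_LABELS = [
    "Lifetime generation",
    "Last System Power",
    "Generation of Current Day",
    "Last Connection to website",
    "Number of Inverters",
    "Last Number of Inverters Online",
]


def parse_landing_text(text: str) -> dict:
    """Map each known label to the rest of the line that starts with it."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        for label in KNOWN_LABELS:
            # Case-insensitive match, since capitalisation varies between firmwares
            if line.lower().startswith(label.lower()):
                result[label] = line[len(label):].strip()
                break
    return result


# Text copied from the post above
sample = """Lifetime generation 2925.64 kWh
Last System Power 166 W
Generation of Current Day 0.29 kWh
Last Connection to website 2018-12-26 12:17:16
Number of Inverters 6
Last Number of Inverters Online 6"""

for label, value in parse_landing_text(sample).items():
    print(label, "=", value)
```

Whether the page can be fetched at all by the scrape integration or by `requests` on 4.1 would still need to be tested against a real device.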

Thanks for the support.

Bernard

For the longest time I didn’t think I could add my solar, but this post is amazing. However, I’m not sure how to implement it. Is this code on something like GitHub, and where do I need to put this web scraping tool?