Brultech Greeneye Issues

So I did have the Dashbox forward packets to HA and it worked for a while, but it broke at some point over the last few months. It looks like it is still working when I do this:

hass_netstat_8009

My IPs are as follows:
10.0.1.150 Dashbox
10.0.1.151 GEM #1
10.0.20.99 HA Server

But all my GEM sensor data shows as unavailable in my dashboards. So I went to my GEM with an Ethernet module and pointed it directly to HA as follows:

I then deleted the packet forwarding I had set up on the Dashbox to HA and ran netstat again from HA:

hass_netstat_8009_2

So it is intermittent for some reason, but maybe that is just because it only sends packets every 8 seconds.

When I point a browser at the GEM, I have this on the Packet Send tab:

And the Network tab is blank as I apparently can only configure the network from the Network Utility shown above:

My config in HA is as follows:

GEM HA Config
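In case the screenshot doesn't come through, the core greeneye_monitor YAML entry looks roughly like this (a sketch, not my exact config; the serial and names are placeholders, and port 8009 matches the netstat screenshots above):

greeneye_monitor:
  port: 8009
  monitors:
    - serial_number: "00001234"
      channels:
        - number: 1
          name: total_power
      voltage:
        - number: 1
          name: house_voltage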

After a restart, none of the channels showed up as entity entries. But after about 10 minutes they finally did, and I could see the values changing in real time in my dashboards.

I then added a few more channels in my configuration.yaml file, restarted HASS, and now the dashboard just shows unavailable for all the values as seen here:

When I look at the entity entries, the ones that were already there look like this:

That red circle apparently means they have been “restored” and the new ones I added are not showing up at all.

Any idea what’s going on?

I also had packet forwarding from the Dashbox to HA briefly working, but then it would stop. I never investigated much, but it's got something to do with the GEM and HA “auto configuration”. Once that's set up, Dashbox packet forwarding will work and then stop if, for example, you restart Home Assistant. The error in the HA logs comes from “siobrultech_protocols.gem.protocol” on Home Assistant and forces the connection to close.

Ah ok, that makes sense.

So what would cause a GEM with an Ethernet module sending directly to HA to also be unreliable? That's what I'm fighting with now, per the details above. Should I send packets every 5 seconds instead of every 8?

I have tried this without success. I miss having my GEM data in HA.

I’ve found the GEM → HA to be pretty reliable. I have two GEMs that are sending every 8 seconds. One of the GEMs might have a weak wifi connection which would cause it to go offline until it was restarted. In my case I was able to solve this by using the watchdog (reset timer) for the appropriate COM port. I used a setting of 100.


That’s good to hear. Since I just started doing GEM → HA yesterday, my experience is very limited. It seems that after an HA restart, it takes 15 minutes to “settle in” and then appears to be working. During the “settle in” period I see those red restored circles next to each entity entry. Then after that period of time, those symbols go away, things start to work, and they seem to stay working until the next time I restart HA after making a change.

I did a restart at 5:20AM and it took right around 15 minutes before the GEM sensors started reporting again:

restart_delay

Is anyone else seeing that, or is this unique to my setup?

I had some Brultech Etherport adapters laying around for my ECM-1240s, so I put those on my other 2 GEMs, and HA was able to pick up from them within seconds of a HA restart.

hass_netstat_8009_3

Every time I run the netstat command, both of the Etherport adapters report established, but it’s hit or miss with the GEM that has the internal Ethernet daughterboard. I’ll try power cycling that GEM and check to make sure the Ethernet cable is plugged in securely.

UPDATE:

Ben fixed me right up. The idle time was set to 2 seconds on the GEM with the Ethernet daughterboard. Once I changed that to 15 seconds, the issue disappeared completely.


After a long period of stability my two GEMs are no longer sending data consistently to HA. I haven’t changed anything in my setup except for updating HA Core but I’m not sure when the issue actually started.

  • If I restart the GEM or enter setup and exit then some data starts flowing to HA
  • Then it stops. Usually I’ll only get data for under a minute before something goes wrong
  • The logs do have some interesting / related errors but I am at a loss as to what caused this (see below)
  • I’ve restarted HA, the GEMs, and all of my networking equipment just in case
2022-12-07 17:08:46.716 WARNING (MainThread) [siobrultech_protocols.gem.protocol] 140207094692352: Connection lost due to exception                                                                                                                                   
Traceback (most recent call last):                                                                                                                       
  File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 854, in _read_ready__data_received                                                                                                                                                                 
    data = self._sock.recv(self.max_size)                                                                                           
ConnectionResetError: [Errno 104] Connection reset by peer 

Anyone else experiencing this or have any ideas what’s going on?

Has anyone managed to get this to be stable, especially with a GEM → Dashbox → HA configuration? I have that setup, with a GEM sending to the Dashbox using Bin32-NET format. That’s working fine at the Dashbox. I have this in my configuration.yaml:

Setup Brultech GreenEye integration

greeneye_monitor:
  port: 8001
  monitors:
    - serial_number: "XXXXXXXX"
      channels:
        number: 1
        name: total_power
      voltage:
        - number: 1
          name: house_voltage

If I do a capture on the network I can see that the Dashbox is delivering TCP packets to port 8001 as requested. I’ve parsed the packets myself and they are valid Bin32-NET packets according to the Brultech specs. But I can’t get HA to read these. I get a ton of log entries like this:

Logger: greeneye.monitor
Source: /usr/local/lib/python3.10/site-packages/greeneye/monitor.py:431
First occurred: March 3, 2023 at 2:28:42 PM (8253 occurrences)
Last logged: 9:21:42 AM

Bad logger message: Exception while calling the listener! ((AssertionError(),))
Traceback (most recent call last):
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 429, in _consumer
await self._listener(message)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 522, in _handle_message
await self._add_monitor(serial_number, message.protocol)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 536, in _add_monitor
await self._set_monitor_protocol(monitor, protocol)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 546, in _set_monitor_protocol
await monitor._set_protocol(protocol)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 292, in _set_protocol
await self._sync_with_settings(self._protocol)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 295, in _sync_with_settings
settings = await get_all_settings(protocol, self.serial_number)
File “/usr/local/lib/python3.10/site-packages/greeneye/api.py”, line 61, in get_all_settings
return await f(None)
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/api.py”, line 83, in send
return api.receive_response(protocol)
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/api.py”, line 70, in receive_response
return self._parse_response(protocol.receive_api_response())
File “/usr/local/lib/python3.10/site-packages/greeneye/api.py”, line 68, in _parse_all_settings
assert response.startswith(_ALL_SETTINGS_RESPONSE_PREFIX)
AssertionError

I don’t really understand where it’s going wrong. I also got a couple of messages like this:

Logger: greeneye.monitor
Source: /usr/local/lib/python3.10/site-packages/greeneye/monitor.py:431
First occurred: 9:21:48 AM (2 occurrences)
Last logged: 9:21:48 AM

Bad logger message: Exception while calling the listener! ((ProtocolStateException(),))
Traceback (most recent call last):
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/api.py”, line 88, in call_api
yield send
File “/usr/local/lib/python3.10/site-packages/greeneye/api.py”, line 61, in get_all_settings
return await f(None)
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/api.py”, line 83, in send
return api.receive_response(protocol)
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/api.py”, line 70, in receive_response
return self._parse_response(protocol.receive_api_response())
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/protocol.py”, line 296, in receive_api_response
response = bytes(self._api_buffer).decode()
UnicodeDecodeError: ‘utf-8’ codec can’t decode byte 0xfe in position 0: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 429, in _consumer
await self._listener(message)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 522, in _handle_message
await self._add_monitor(serial_number, message.protocol)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 536, in _add_monitor
await self._set_monitor_protocol(monitor, protocol)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 546, in _set_monitor_protocol
await monitor._set_protocol(protocol)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 292, in _set_protocol
await self._sync_with_settings(self._protocol)
File “/usr/local/lib/python3.10/site-packages/greeneye/monitor.py”, line 295, in _sync_with_settings
settings = await get_all_settings(protocol, self.serial_number)
File “/usr/local/lib/python3.10/site-packages/greeneye/api.py”, line 60, in get_all_settings
async with call_api(_GET_ALL_SETTINGS, protocol, serial_number) as f:
File “/usr/local/lib/python3.10/contextlib.py”, line 217, in aexit
await self.gen.athrow(typ, value, traceback)
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/api.py”, line 90, in call_api
protocol.end_api_request()
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/protocol.py”, line 307, in end_api_request
self._expect_state(
File “/usr/local/lib/python3.10/site-packages/siobrultech_protocols/gem/protocol.py”, line 327, in _expect_state
raise ProtocolStateException(actual=self._state, expected=expected_state)
siobrultech_protocols.gem.protocol.ProtocolStateException: Expected state to be RECEIVED_API_RESPONSE, or SENT_PACKET_DELAY_REQUEST; but got SENT_API_REQUEST!

Interestingly, the invalid start byte that the integration is complaining about is 0xfe, which is the correct first byte of a Bin32-NET packet.
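For anyone wanting to do the same check, a standalone listener along these lines works (just a sketch, not part of HA; it assumes the Dashbox is forwarding to port 8001 on whatever machine runs it, and that machine must not already have HA bound to that port):

import socket

# Minimal sketch: accept one connection on the forwarding port and print the
# first bytes of each chunk received. Bin32-NET data packets should start with 0xFE.
with socket.create_server(("0.0.0.0", 8001)) as srv:
    conn, addr = srv.accept()
    with conn:
        print("connection from", addr)
        while True:
            data = conn.recv(2048)
            if not data:
                break
            print(f"{len(data):4d} bytes, starting with {data[:4].hex()}")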

Not sure what to do next to troubleshoot this. Any suggestions?

Thanks

Update: I did realize that there was a missing ‘-’ in the collection of channels, so I added it, but I still get all the same errors. So, on to the code.

It seems that what happens over and over is this: a packet gets received, it is properly recognized as a Bin32-NET packet and parsed, and then the integration decides it has “Discovered a new monitor”. After this it tries to set the protocol on the new monitor. It determines that the protocol is not currently known, and this triggers an API call: an attempt to send an RQSALL message back to the GEM, which should cause it to respond with all of its settings.

This seems to be the problem. Most of the time this sequence completes, but the result is a 0-byte string, so it can't be parsed; this is where it complains that the message doesn't start with _ALL_SETTINGS_RESPONSE_PREFIX. In the other failure case the state machine breaks because there is no response.

In truth, I can’t see a request even being sent back, and the Dashbox certainly isn’t listening for it. This is the core problem: there is no way for this to work with a middlebox like the Dashbox in place. I have confirmed that if I point the GEM directly at the HA box, this sequence completes and all is happy (at least for now), but that renders the Dashbox unusable.
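To make the expected flow concrete, here is a rough sketch (under assumptions, not the integration's actual code) of the round trip the integration relies on: it keeps the TCP connection that the GEM or Dashbox opened, writes the settings request back down that same socket, and waits for the reply. The "^^^RQSALL" command string is my assumption of the GEM request-all-settings call, so check the Brultech API docs. With the Dashbox in the middle, the read just times out, because nothing relays the request on to the GEM:

import asyncio

async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    data = await reader.read(1024)              # first data packet from the GEM/Dashbox
    print(f"got {len(data)} bytes, first byte {data[:1].hex()}")
    writer.write(b"^^^RQSALL")                  # settings query back on the same socket (assumed syntax)
    await writer.drain()
    try:
        reply = await asyncio.wait_for(reader.read(4096), timeout=10)
        print("reply starts with:", reply[:8])  # note: this could also just be another data packet
    except asyncio.TimeoutError:
        print("no reply; the query never reached the GEM")
    writer.close()

async def main() -> None:
    server = await asyncio.start_server(handle, "0.0.0.0", 8001)
    async with server:
        await server.serve_forever()

asyncio.run(main())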

It would be great if the integration didn't need to query the GEM; I'm not completely sure what it needs from that query (maybe some things for net metering?). I haven't found a use for that data yet, but I've really only just started on the code.

Anyway, that’s the issue as I see it, and the only way forward as far as I can tell.

Thanks

So, overnight this seemed to work well. I was able to create sensors and helpers in HA. I like having the Dashbox but my long term direction is HA for sure. One thing that did happen is that the sensors stopped updating at 8:45 AM today, about 20 hours after I got it working. Restarting HA didn’t help and I couldn’t connect to the GEM. I ended up having to reboot the GEM.

I did not confirm during this that the GEM wasn’t sending packets to the HA server. I didn’t leave it configured for packet capture; I’ll have to do that so that if/when this happens again I can isolate the problem to one end or the other.

UPDATE:
This actually turns out to be a bit more of a problem than I initially realized. The basic situation is this: when the integration starts, it loads the configuration and then just waits for a packet on the specified port. When it gets the first packet, it looks at it, determines that it doesn't have configuration for that monitor, and sends a query back to the GEM, expecting a response over the same connection.

Now, if the response is the next thing received, all is good, because the response satisfies the integration. But if, by chance, the GEM sends another data packet before it sends the response to the RQSALL query, the integration decides it has received a malformed packet. It doesn't have the ability to recognize that what arrives could be either the response or more data, at least not that I can see in the code.

This becomes a startup race condition. On startup I’ve had all the sensors from one of my GEMs disappear, with log entries saying there’s an error parsing the response; HA says the integration is no longer providing the data. If I restart I can get it to work, though it might take a couple of restarts. This makes behavior after upgrades problematic. @jkeljo, I don’t see a clean fix for this, since the code that sent the request is never in a position to process anything but the response. Maybe recognize the incoming bytes as a data packet, discard it, and wait for a valid response, at least for a couple of tries?
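Something along these lines is what I have in mind; it is only a sketch of the idea, not the integration's code, and it assumes (per the above) that regular data packets start with 0xFE while the API response does not:

import asyncio

async def read_api_response(reader: asyncio.StreamReader, max_skipped: int = 3) -> bytes:
    """Wait for the RQSALL reply, discarding data packets that race ahead of it."""
    for _ in range(max_skipped + 1):
        chunk = await reader.read(4096)
        if not chunk:
            break                      # connection closed
        if chunk[:1] == b"\xfe":       # looks like a regular data packet: skip it and keep waiting
            continue
        return chunk                   # treat anything else as the API response
    raise RuntimeError("no API response received")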

Hey all, I’m moving the greeneye_monitor integration from core to HACS. The HACS version has energy dashboard support, ECM-1220/1240 support, is Dashbox-tolerant, and more. Check out this thread for more info: Brultech GreenEye Monitor and ECM-1220/1240 integration

Thanks @jkeljo, I’ll look at the version in HACS. Curious, though: why do you want to move from core to HACS?

I wasn’t getting enough traction in core, unfortunately. My pull requests to add UI configuration support sat blocked for months and months waiting for a review, to the point where I actually got demotivated and walked away from the project entirely for a year. :frowning:

I think the volume of pull requests is just so overwhelming for the maintainers that nontrivial PRs for less-used integrations tend to languish. Until something changes about that, I think HACS will be a better place for integrations like this one. (I suspect what it would take is significant investment in automation of reviews and/or Nabu Casa hiring people to provide an SLA for review turnaround time.)


Yeah, that makes sense. Thanks

I’m having trouble migrating from the core integration to the HACS one. I installed the HACS version and it read my existing config, but it shows no devices or hubs, and when I try to configure the hub I get the following:

I have commented out the legacy configuration bits, but even after a restart I still don’t have any devices or entities. Any thoughts?

Just a follow-up: I finally fixed this. It turns out my DHCP configuration was corrupted around the same time as the upgrade, so my GEM was pointing at the wrong local IP for the HA instance. Simple fix, and working great now. Thanks.

@jkeljo - I saw an issue you opened on GitHub to allow changing the pulse counter config after setup. It says it needs to be done in YAML now. Is there a pointer to how this is done in YAML?

I have added a pulse counter. When I download the diagnostics I can see that the Greeneye add-in is getting the pulse data; it’s just not available in HA. I tried adding back a section of YAML to the config that has just the pulse counter. I see this YAML in the diagnostics, but it isn’t merged into the config. I would appreciate any pointers on how to add this in YAML; I’m totally fine adding it that way, I just can’t figure out how to get it to take.

I actually got this to work. I added the old YAML section and restarted again (not a YAML reload and add-in reload, but a full HA restart). That gave me the “this no longer uses YAML” warning again, and it did merge in the setting.
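For reference, the section I mean looks roughly like this (a sketch based on the core integration's documented YAML schema; the name and values are placeholders, so double-check the keys against the docs):

greeneye_monitor:
  port: 8001
  monitors:
    - serial_number: "XXXXXXXX"
      pulse_counters:
        - number: 2
          name: water_meter
          counted_quantity: "gal"
          counted_quantity_per_pulse: 1.0
          time_unit: "min"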

@jkeljo - There is one nasty little issue here. If you add just the YAML for the counter, the add-in seems to recreate its config (mostly). In my case I had 4 temperature sensors originally configured in YAML. When the HACS add-in imported that, they had the correct unit of degrees F. When I made this change to add the counter, the add-in seems to have decided the unit was degrees C. The result is that it took the Greeneye-reported measurements, which are in F, and used them as C. Consequently my display had really wrong numbers, since 130 C translates to way more than 130 F!

If you look at the diagnostics after I created the pulse counter, you can see that it got the counter but switched the units to C:

    "monitors": [
      {
        "serial_number": XXXXXXX,
        "temperature_unit": "\u00b0C",
        "net_metering": [],
        "pulse_counters": [
          {
            "number": 2,
            "device_class": null,
            "counted_quantity": "gal",
            "counted_quantity_per_pulse": 1.0,
            "is_aux": false
          }
        ]

My solution was to add the temp sensors to the YAML and restart so it could re-import them. This worked, as the diagnostics show:

    "monitors": [
      {
        "serial_number": XXXXXXX,
        "temperature_unit": "\u00b0F",
        "net_metering": [],
        "pulse_counters": [
          {
            "number": 2,
            "device_class": null,
            "counted_quantity": "gal",
            "counted_quantity_per_pulse": 1.0,
            "is_aux": false
          }
        ]
      },

I can find no other way to change this. The configure link in the add-in only produces an error.