Image recognition (not facial) input to drive automations?

So, here’s my challenge…
I have a display (light emitting) which will tell me when I have enough power from my solar panels to justify running an optional pond pump. I don’t need to run it all the time so I’d like to run it when I’ve got power to waste.
I already have a home automation that tracks power production, with a production threshold at which I automatically start the pump using a couple of TAPO power sockets. BUT I don’t know what the sum total of all other consumption around the house is, so I don’t know if I have spare power.
Now I have an indicator that lights up when I have excess power - but it is only a visual indication and I can’t ‘read’ it using any of the usual options. So I have built a light detector to drive an ESP-01m (still waiting for some parts for the restart tri-state interlock circuit). This is a bit prone to light pollution where the sensitivity changes depending on how bright the incidental light on the screen is. So the preferred solution would be a camera that literally looks at the display and when this field is active, starts the pump.

I have some ESP32s, ESP8266s, several Raspberry Pis and I’m running Home Assistant on a mini PC. My background is in SW development, but I haven’t used C++ and I dabble in simple electronics. Can anyone suggest a way to reliably monitor this display for this event, on a device that can report to Home Assistant when the status changes?
I’ll need to be able to restrict the detection area of the camera or produce a mask for the icon I am watching for.
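
To make the "restrict the detection area" idea concrete, here is a minimal sketch in plain Python. It assumes the camera frame has already been converted to a grayscale grid (a list of rows of 0-255 values), and that a second rectangle over a dark part of the display is available as a reference, so that ambient light shifts both readings together. The names `region_mean` and `IconDetector`, the box coordinates and the two ratio thresholds are all made up for illustration and would need tuning against the real display.

```python
def region_mean(frame, box):
    """Mean brightness inside a rectangle of a grayscale frame.

    frame: list of rows, each a list of 0-255 ints.
    box:   (top, left, bottom, right), bottom and right exclusive.
    """
    top, left, bottom, right = box
    pixels = [p for row in frame[top:bottom] for p in row[left:right]]
    return sum(pixels) / len(pixels)


class IconDetector:
    """Decides whether the icon is lit by comparing it to a reference area.

    Uses two thresholds (hysteresis) so the result does not flicker
    when the ratio hovers around a single cut-off value.
    """

    def __init__(self, icon_box, reference_box, on_ratio=2.0, off_ratio=1.5):
        self.icon_box = icon_box            # where the icon appears (assumed)
        self.reference_box = reference_box  # unlit part of the display (assumed)
        self.on_ratio = on_ratio            # hypothetical value, tune on site
        self.off_ratio = off_ratio          # hypothetical value, tune on site
        self.is_lit = False

    def update(self, frame):
        icon = region_mean(frame, self.icon_box)
        # Clamp the reference to avoid dividing by zero on a black frame.
        reference = max(region_mean(frame, self.reference_box), 1.0)
        ratio = icon / reference
        if self.is_lit:
            if ratio < self.off_ratio:
                self.is_lit = False
        elif ratio > self.on_ratio:
            self.is_lit = True
        return self.is_lit
```

A ratio copes with a camera's auto-exposure scaling the whole picture; if the problem is mostly stray light adding to the whole screen, comparing the difference between the two regions instead may work better.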

My first thought was an ESP with a camera… but I can’t see a way to monitor a specific part of the field of view on devices running ESPHome, or even to report changes of image. Maybe there is a suitable HA add-on to do image recognition on the PC?
Then I considered a Raspberry Pi Zero with a camera. There should be some suitable SW available to do the image processing - but how do I make an RPi into a HA client?
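
One possible answer to the "HA client" part, sketched with only the Python standard library: Home Assistant exposes a REST API where a `POST` to `/api/states/<entity_id>` with a long-lived access token sets an entity's state. The entity name `binary_sensor.solar_excess_icon`, the friendly name, the base URL and the helper names below are assumptions for illustration, not anything from an existing setup. As far as I know, a state set this way is not backed by a device, so it may need to be re-sent after HA restarts.

```python
import json
import urllib.request


def build_state_request(base_url, token, entity_id, is_on):
    """Build (but do not send) the request that sets an entity state in HA."""
    payload = {
        "state": "on" if is_on else "off",
        "attributes": {"friendly_name": "Solar excess icon"},  # made-up name
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/states/{entity_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # long-lived access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def report_state(base_url, token, entity_id, is_on):
    """Send the state to Home Assistant and return the HTTP status code."""
    request = build_state_request(base_url, token, entity_id, is_on)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

On the Pi this would be called only when the detected status changes, for example `report_state("http://homeassistant.local:8123", token, "binary_sensor.solar_excess_icon", True)`. MQTT would be another common route, but that needs a broker and a third-party client library.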

All to save a couple of pence worth of power… Still it’s a fun little project, and any suggestions will be most welcome.

This is so true, the effort and material many of us spend on this will push break-even further and further away :rofl: but the fun part is what counts for me too.
When you say that the display lights up, this also means that something is triggering it. Can’t you capture that trigger/signal? Is the light-up part possibly on a webpage? Then you might be able to scrape it, with or without more complex tools like Selenium.