I’m slowly getting my home together with Home Assistant, and it’s actually really cool using things like the Hikvision line crossing binary sensor to ring a doorbell before people press the button, or using konnected.io to monitor PIRs instead of paying >$1k for a locked-in security vendor.
I have, however, entered a brave new frontier of trying to read the status of things that aren’t 7 numbers. I have a cobbled-together process which, from a Linux bash script:
- grabs a frame from an RTSP feed
- slightly rotates and cleans it up using ImageMagick
- runs it through the Tesseract OCR engine
- places the resulting text file where the File sensor can bring it into Home Assistant
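For anyone who wants to try the same steps, here is a rough Python sketch of that pipeline. It is not my actual bash script. It assumes `ffmpeg` is used to grab the frame (the tool is my assumption here), that ImageMagick is installed as `convert` (newer installs may use `magick` instead), and that `tesseract` is on the PATH. The function names, the clean-up options and the `--psm 7` single-line mode are illustrative choices only. The result is written as a single line because the File sensor reads the last line of the file.

```python
import os
import subprocess
from pathlib import Path


def grab_frame_cmd(rtsp_url: str, frame: Path) -> list[str]:
    # Assumption: ffmpeg pulls one video frame from the RTSP feed.
    return ["ffmpeg", "-y", "-rtsp_transport", "tcp",
            "-i", rtsp_url, "-frames:v", "1", str(frame)]


def cleanup_cmd(frame: Path, cleaned: Path, degrees: float) -> list[str]:
    # Rotate slightly, convert to grayscale and stretch the contrast.
    return ["convert", str(frame), "-rotate", str(degrees),
            "-colorspace", "Gray", "-normalize", str(cleaned)]


def ocr_cmd(cleaned: Path) -> list[str]:
    # "stdout" makes tesseract print the text instead of writing a file;
    # --psm 7 treats the image as a single line of text.
    return ["tesseract", str(cleaned), "stdout", "--psm", "7"]


def write_sensor_file(text: str, out_path: Path) -> str:
    # Collapse the OCR output to one line, then swap the file into place
    # so a reader never sees a half-written result.
    line = " ".join(text.split())
    tmp = out_path.with_suffix(out_path.suffix + ".tmp")
    tmp.write_text(line + "\n", encoding="utf-8")
    os.replace(tmp, out_path)
    return line


def run_pipeline(rtsp_url: str, workdir: Path, out_path: Path,
                 degrees: float = 0.0) -> str:
    frame = workdir / "frame.png"
    cleaned = workdir / "cleaned.png"
    subprocess.run(grab_frame_cmd(rtsp_url, frame), check=True)
    subprocess.run(cleanup_cmd(frame, cleaned, degrees), check=True)
    result = subprocess.run(ocr_cmd(cleaned), check=True,
                            capture_output=True, text=True)
    return write_sensor_file(result.stdout, out_path)
```

A call would look something like `run_pipeline("rtsp://camera.local/stream", Path("/tmp"), Path("/config/ocr_reading.txt"), degrees=-2.5)`, where the URL and both paths are placeholders, and it could be run on a schedule from cron.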
For me that was quite the Linux learning curve, but it did give me a sense of accomplishment.
But it did get me thinking!! How often could a generic OCR service work for other things? A silly example would be tapping into the feed of a camera in a garage and sensing whether the number plate is there or not. Another option would be an appropriately sized sign which says “OPEN”, visible from a camera when something is open but obviously not visible when it is closed.
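To make the “OPEN” sign idea a bit more concrete, here is a small Python sketch of how OCR text could be turned into an on/off state. The function name `sign_visible` and the 0.7 similarity cutoff are made up for illustration. The fuzzy match is there because OCR tends to misread characters (a zero for the letter O, for example), so an exact string comparison would probably be too strict.

```python
import re
from difflib import get_close_matches


def sign_visible(ocr_text: str, keyword: str = "OPEN",
                 cutoff: float = 0.7) -> bool:
    # Split the OCR output into upper-case alphanumeric tokens.
    tokens = re.findall(r"[A-Z0-9]+", ocr_text.upper())
    # True if any token is close enough to the expected word.
    return bool(get_close_matches(keyword.upper(), tokens,
                                  n=1, cutoff=cutoff))
```

The same function would cover the garage case by passing the expected number plate as `keyword`; the cutoff would need tuning against real camera frames.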
There are probably another 100+ examples that I could come up with, and looking through the discussions I haven’t found similar ideas apart from meter reading.
So this post is about me looking for a skilled collaborator to help give back to the community by introducing a reusable OCR integration.
Key component building blocks:
- https://github.com/tesseract-ocr/ (or other)
- ImageMagick (or other)