Image processing and OCR beyond SSOCR - Tesseract?

markss · December 2, 2019, 3:02am

I’m slowly getting my home together with Home Assistant and it’s actually really cool using things like the Hikvision line crossing binary sensor to activate a doorbell before people press the button, or using konnected.io to monitor PIRs instead of paying >$1k for a locked-in security vendor.

I have, however, entered a brave new frontier of trying to read the status of things that aren’t 7-segment numbers. I have a cobbled-together process which:

From a Linux bash script:

grabs a frame from an RTSP feed
slightly rotates and cleans it up using ImageMagick
runs it through the Tesseract OCR engine
writes the result to a file so the file sensor can bring it into Home Assistant
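
For anyone who wants to try the same four steps, here is a minimal sketch in Python rather than bash. It is not the original script. It assumes `ffmpeg`, ImageMagick's `convert` (called `magick` on ImageMagick 7) and `tesseract` are on the PATH, and the file names, rotation angle, 50% threshold and `--psm 7` (single text line) are illustrative guesses that will need tuning per camera.

```python
import subprocess
from pathlib import Path


def grab_frame_cmd(rtsp_url: str, frame: Path) -> list[str]:
    """ffmpeg command that saves a single frame from an RTSP stream."""
    return ["ffmpeg", "-y", "-rtsp_transport", "tcp", "-i", rtsp_url,
            "-frames:v", "1", str(frame)]


def clean_cmd(frame: Path, cleaned: Path, degrees: float = 1.5) -> list[str]:
    """ImageMagick command: small rotation, greyscale, hard threshold."""
    return ["convert", str(frame), "-rotate", str(degrees),
            "-colorspace", "Gray", "-threshold", "50%", str(cleaned)]


def ocr_cmd(cleaned: Path) -> list[str]:
    """Tesseract command that prints the recognised text to stdout."""
    return ["tesseract", str(cleaned), "stdout", "--psm", "7"]


def read_display(rtsp_url: str, workdir: Path, sensor_file: Path) -> str:
    """Run grab -> clean -> OCR and write the text where a file sensor can read it."""
    frame = workdir / "frame.png"      # hypothetical working file names
    cleaned = workdir / "cleaned.png"
    subprocess.run(grab_frame_cmd(rtsp_url, frame), check=True)
    subprocess.run(clean_cmd(frame, cleaned), check=True)
    result = subprocess.run(ocr_cmd(cleaned), check=True,
                            capture_output=True, text=True)
    # Collapse all whitespace so the value sits on one line of the file.
    text = " ".join(result.stdout.split())
    sensor_file.write_text(text + "\n")
    return text
```

Calling `read_display(...)` on a schedule (cron or similar) with `sensor_file` pointing at a path Home Assistant is allowed to read would reproduce the flow above; the command builders are kept separate so each step can be tested on its own.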

For me that was quite the Linux learning curve, but it did give me a sense of accomplishment.

But it did get me thinking!! How many times would a generic OCR service work for other things? A silly example would be tapping into the feed of a camera in a garage and sensing when the number plate was there or not there. Another option would be an appropriately sized sign which says “OPEN” visible from a camera when something is open, but obviously not visible when it is closed.
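
As a rough illustration of the “OPEN” sign idea, the OCR output could be reduced to a simple on/off value like this. It is only a sketch: the function names are made up for the example, and it assumes the OCR text arrives as a plain string from whatever engine is used.

```python
import re


def tokens(ocr_text: str) -> list[str]:
    """Split OCR output into upper-case alphanumeric words."""
    return re.findall(r"[A-Z0-9]+", ocr_text.upper())


def sign_visible(ocr_text: str, keyword: str = "OPEN") -> bool:
    """True when the keyword appears as a whole word in the OCR output."""
    return keyword.upper() in tokens(ocr_text)
```

Matching whole words rather than substrings avoids false hits when the letters happen to run across neighbouring words, at the cost of missing a sign that the OCR engine splits into separate characters.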

There are probably another 100+ examples that I could come up with and looking through the discussions I haven’t found similar ideas apart from meter reading.

So this post is me looking for a skilled collaborator to try and give back to the community by building a re-usable OCR integration.

Key component building blocks:

https://github.com/tesseract-ocr/ (or other)
imagemagick (or other)

Any takers?

coderanger · January 3, 2020, 8:25pm

For what it’s worth, I think this would be a valuable addition to HA. I was just looking for something like this to read my natural gas meter and ran across your post.

I found a couple of people doing similar things that might be helpful for someone building this out.

This guy is reading a water meter using OCR. He has it split into two separate services: an image provider and the OCR server. I could see HA being a great asset here, as it could be a generic image provider for the system. Maybe someone could take some of his ideas and make them into a hassio add-on: https://github.com/jomjol/water-meter-measurement-system

I really like what this other guy did as far as hardware goes, and how it isn’t dependent on any external services, but it only works with 7-segment displays. https://medium.com/@trumpetgod/integrating-my-neptune-water-meter-with-home-assistant-896712a8c893

ronald1705 · December 4, 2021, 5:12pm

Sorry for digging up this old topic, but since this is the top Google result for many searches and has 1200 views I don’t feel too bad.

I am in need of a way to read the text from an LCD. Would you mind elaborating a bit on how you got your setup to work?

markss · December 5, 2021, 9:02pm

Hi @ronald1705, I only managed to bodge it together using scripts with the information I provided above. Unfortunately I never got as far as getting it working neatly.

The process was pretty much as I said above:

Grab image (RTSP grab) - HA can do this
Isolate text in image using an ImageMagick command
Read with Tesseract
Pass output

I did have this working 100% reliably, but my Linux knowledge was not sufficient to ‘package’ it together in a Docker container. I might give it another go over the Xmas break though.

markss · December 16, 2021, 7:06am

@ronald1705 I have an update for you!

There is now a containerised version of Tesseract with an API here on Docker Hub: Docker Hub - though it is only linux/amd64

Depending on your use case, it could be as (cough) simple as creating a RESTful command

and/or a restful sensor:
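
To show the general shape of talking to a containerised OCR API, here is a hedged Python sketch. It is not the Home Assistant configuration from this post. The URL, the `/ocr` path, the raw-bytes upload and the JSON `text` field are all assumptions made for illustration, so check the container's own documentation for the real endpoint and payload.

```python
import json
import urllib.request

# Hypothetical endpoint: the real container's URL, path and payload format
# are not given in this thread, so adjust all of these to match its docs.
OCR_URL = "http://localhost:8080/ocr"


def build_ocr_request(image_bytes: bytes,
                      url: str = OCR_URL) -> urllib.request.Request:
    """Build a POST request that uploads the raw image bytes."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def ocr_image(image_bytes: bytes, url: str = OCR_URL,
              timeout: float = 10.0) -> str:
    """Send the image and return the recognised text.

    Assumes the service answers with JSON like {"text": "..."}.
    """
    request = build_ocr_request(image_bytes, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return str(payload.get("text", "")).strip()
```

A RESTful command or sensor in Home Assistant would play the same role as `ocr_image` here: send the frame to the service and pick the text out of the response.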

That is where I am currently at. I don’t have unlimited time, so I would appreciate help from anyone else who is interested!

emeyedeejay · June 2, 2022, 7:47pm

Hi … did you get any further with this? I want to read my electricity meter and am looking at options. Thanks!

LaVillaBlanche · August 18, 2022, 3:19pm

I’m looking for something similar. I’ve got some TTGO T-Journals lying around and was hoping to finally put them to use.
However, I’m stuck with writing software for them. I once wrote something that took a picture and posted it to some sort of fishy Chinese Big Data platform.

If anyone has simple code lying around that can be deployed to the TTGO T-Journal and send the data to my HAOS server, I could run a second container on that server to take care of the OCR.

Thanks for any input/help!

markss · August 18, 2022, 10:29pm

Three years is a long time, isn’t it?

I’m still interested in this topic, and with my job I have been playing with AI and APIs, so if I can find the time between other things I’d be interested in teaming up with someone to build this out.

My current thinking is to leverage one of the many “up to X uses per month” APIs.