Neural Network Human Presence Detection

Yes, most of the processing is offloaded. The only work the CPU does is grabbing the camera frames, doing the image transformations, sending them to the stick, and rendering the output image. You get even better CPU performance running it in headless mode (“-no_image”), since the CPU isn’t working to display the window with the camera preview.

OpenVINO with the compute stick is designed to run on even a Raspberry Pi.

Cool, they look quite affordable; I may have to get one.


Is it something you build or buy? I couldn’t find any information on where to buy one online.

Yeah, if you went that route, I’d go for the 2nd-gen model, since it’s only $10 more but has 8x the performance.

The great thing about YOLOv3 is that it’s actually detecting 80 different classes of objects; I’m just ignoring the other 79 in my repo’s program.

Also, if you take a look at some of the other samples in OpenVINO, you can actually load multiple neural networks onto a single stick, so you could do facial recognition along with YOLO simultaneously if you wanted. Although, I haven’t really experimented with that yet.

https://www.mouser.com/ProductDetail/Intel/NCSM2485DK?qs=byeeYqUIh0OB4GXNqgW8aw%3D%3D

Here’s where I got mine.

Ok thanks.

So the stick is the hardware, and YOLOv3 + OpenVINO is the software. It’s great that it works on the Raspberry Pi, though I’m confused; I thought it would run only on Ubuntu 16.04. I need to study a bit more, lol. Also, 90 euros for presence detection is a bit steep for me, so it’s something to consider in the future.

They’ve recently come out with a version that installs on Raspbian, although I have not attempted that.

https://software.intel.com/en-us/articles/OpenVINO-Install-RaspberryPI

Also, you can run this on a CPU without the stick; it’s just that I use the stick to keep my CPU utilization low.
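For example, something like this (a rough sketch: “MYRIAD” is the usual OpenVINO device name for the compute stick, and the MYRIAD plugin generally wants FP16 models rather than FP32, so treat those details and the FP16 path as assumptions to verify against your own install):

# Run inference on the CPU plugin (no stick needed):
./neural_security_system -i http://192.168.10.60:8081 -m ./models/tiny_yolov3/FP32/frozen_tiny_yolov3_model.xml -d CPU

# Run on the compute stick instead (the FP16 model path is hypothetical):
./neural_security_system -i http://192.168.10.60:8081 -m ./models/tiny_yolov3/FP16/frozen_tiny_yolov3_model.xml -d MYRIAD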

Yes, that’s clear. How accurate is it, from 1 to 10? Will you add other cameras? I’d also like to know how many cameras one stick can handle.

For example, Deepstack/Facebox are only about 60% accurate for me, which I can’t use for automations/scripts.

Yes, I have been able to run the OpenVINO samples, both demo_security_barrier_camera.sh and demo_squeezenet_download_convert_run.sh.

I have also copied the file libcpu_extension.so to the /neural_security_system/lib folder.

I have also built both YOLOv3 FP32 models (full and tiny), following the instructions in your Git repo.

This is the command line I use:

./neural_security_system -i http://192.168.10.60:8081 -m ./models/yolov3/FP32/frozen_yolov3_model.xml -d CPU -t 0.2 -u xxxxx -p xxxxxx -tp cameras/front_door/humans -mh tcp://192.168.10.6:1883 -cr 150

This one works OK.

./neural_security_system -i http://192.168.10.60:8081 -m ./models/tiny_yolov3/FP32/frozen_tiny_yolov3_model.xml -d CPU -t 0.2 -u user -p password -tp cameras/front_door/humans -mh tcp://192.168.10.6:1883 -cr 150

This one gives a core dump:

[ INFO ] Parsing input parameters

MQTT Username: CPU
Connecting to server 'tcp://192.168.10.6:1883'…OK

InferenceEngine:
API version … 1.4
Build … 19154
[ INFO ] Reading input
[ INFO ] Loading plugin

API version ............ 1.5
Build .................. lnx_20181004
Description ....... MKLDNNPlugin

[ INFO ] Loading network files
[ INFO ] Batch size is forced to 1.
[ INFO ] Checking that the inputs are as the demo expects
[ INFO ] Checking that the outputs are as the demo expects
[ INFO ] Loading model to the plugin
[ INFO ] Start inference
Segmentation fault (core dumped)

And this one (removing -t 0.2 and -cr 150) runs OK…

./neural_security_system -i http://192.168.10.60:8081 -m ./models/tiny_yolov3/FP32/frozen_tiny_yolov3_model.xml -d CPU -u user -p password -tp cameras/front_door/humans -mh tcp://192.168.10.6:1883

…but as soon as I get in front of the camera, I again get a core dump.

I use Ubuntu 16.04 and Intel SDK 2018.5.455.

Any clue?

Honestly, for detecting people, I’ve found that even Tiny YOLOv3 is pretty near 90% at near-to-medium range. You can get more accuracy with the larger “non-tiny” version, but performance is slower.

For instance, there has never been a time when a person was on my porch or front lawn and it did not detect them. There are plenty of times people across the street were not detected, but I guess that could be expected because of how blurry that area is (my camera is focused on the mid-range). I just crop out that section anyway, since I’m more concerned about people on my property.

Keep in mind, however, that even though YOLOv3 is one of the best in terms of accuracy, it still won’t find a person in every frame of video (which is why the bounding box in my video above flickers on and off). That’s why I implemented a timeout: it won’t publish the absence of humans to MQTT until no humans have been seen for the timeout period.
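As a rough illustration of that timeout idea (a conceptual shell sketch, not the actual C++ code from my repo; the ON/OFF payloads and the 5-second timeout are assumptions):

#!/bin/bash
# Debounce detections: publish OFF only after TIMEOUT seconds with no
# detections, so a single missed frame doesn't flap the sensor.
TIMEOUT=5
last_seen=0
while read -r detected; do   # a stream of 1/0 per-frame detection flags
  now=$(date +%s)
  if [ "$detected" = "1" ]; then
    last_seen=$now
    mosquitto_pub -h 192.168.10.6 -t cameras/front_door/humans -m ON
  elif [ $((now - last_seen)) -ge $TIMEOUT ]; then
    mosquitto_pub -h 192.168.10.6 -t cameras/front_door/humans -m OFF
  fi
done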

I believe I might have an idea. You said the FP32 version of Tiny YOLOv3 works in your last example, but it crashes as soon as you step into the frame.

This leads me to believe it has something to do with what runs when it detects a person. It doesn’t seem to be MQTT-related (since the output shows it successfully connecting to your server).

What is the resolution of your camera feed? If you turn your camera resolution down to 1024x768, does it work then?

I have just tested with 1024x768 at a 24 fps frame rate, and I still get a core dump.
I have tried different resolutions and frame rates with the same result…

It runs OK, but as soon as I get in front of the camera, I get a core dump.

Actually, I might’ve figured it out. My instructions in the readme are wrong. When you copy coco.names into the FP32 folder, it needs to be named the same as the XML file, except with “.labels” as the extension.

I’ve edited my instructions to fix that. Thanks for pointing out this issue. It was a copy-paste error, haha

EDIT: So, I believe the crash happens because the neural net detects something, then tries to look up the label for the corresponding class index in the “.labels” file, and it can’t open the file since it is not named what it expects.
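In other words, assuming the folder layout from the commands above, the fix is just:

# Copy the class names next to each model, named after the XML file
# but with a .labels extension:
cp coco.names ./models/tiny_yolov3/FP32/frozen_tiny_yolov3_model.labels
cp coco.names ./models/yolov3/FP32/frozen_yolov3_model.labels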

If I set -no_image, it runs OK and I don’t get any core dump.

./neural_security_system -i http://192.168.10.60:8081 -m ./models/tiny_yolov3/FP32/frozen_tiny_yolov3_model.xml -u user -p password -tp cameras/front_door/humans -no_image -mh tcp://192.168.10.6:1883

EDIT1: You are right; if I rename the label file, it works perfectly!!

frozen_tiny_yolov3_model.bin
frozen_tiny_yolov3_model.labels
frozen_tiny_yolov3_model.xml

Thanks a lot for your support!!


Awesome, glad I was able to figure that out for you! What FPS are you getting with Tiny YOLOv3 on your system? It should be displayed on the video window as it runs.

Also, you can switch on “async” mode with the Tab key, which might have an effect on performance.

It seems a bit weird that it detects the remote control as a cake…

That is definitely strange. Those are low confidence values, though. Although I’m fairly certain it is accurately detecting the chair (that’s one of the YOLOv3 classes).

Remote is a valid class as well. Perhaps the ordering of the labels in the file is wrong; I’d have to do more testing. Does it still say apple when you bring it closer?
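If you want to rule out a label-ordering problem, a quick check (assuming coco.names is still sitting in your working directory) would be:

# The .labels file should be a line-for-line copy of coco.names;
# any diff output means the class indices won't line up.
diff coco.names ./models/tiny_yolov3/FP32/frozen_tiny_yolov3_model.labels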

Well, increasing -t to something around 0.4, it starts to be more accurate…

By the way, could you please describe a bit how to integrate all this with Home Assistant + Node-RED?

Ah yes, the low-probability matches will be all over the place. Also, accuracy is improved in the “non-tiny” version, although the FPS is much lower.

The way I’ve integrated with Home Assistant is to create a binary sensor that picks up my MQTT topic. That way the sensor is on when people are in the frame and off when there are none.

So, if you have this all working, the next step would be to create a binary sensor that matches the same MQTT topic you are running this program with. Then you should see live results.
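If you want to confirm messages are actually arriving before wiring up the sensor, you can watch the topic directly; here’s a sketch using the broker and credentials from the commands earlier in this thread:

# Print every message published to the topic as it arrives:
mosquitto_sub -h 192.168.10.6 -p 1883 -u user -P password -t cameras/front_door/humans -v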

You should then be able to create automations based around the state of that binary sensor.