Face and person detection with Deepstack - local and free!

Thanks for letting me know. I’ve been testing out the recent CPU versions, and someone else on this thread has reported the same problem (though they said it did work on an older CPU version).

EDIT UPDATE: I just tried the Windows 10 version (CPU 2021.01.2) and face registration actually works!! The one I was having a problem with was the Docker version (CPU 2021.01.2).

Hmm, might just be an issue with that “CPU” image. The Jetson one is specific to Jetsons too: deepquestai/deepstack:jetpack-2021.02.1

@Alex_Pupkin amazing work on your TorchServe integration; can I respectfully suggest that, as well as posting it here, you create a thread just for it?

Another approach to consider (or perhaps it is what you mean) is to standardise the APIs across them all, like @synesthesiam has here: GitHub - synesthesiam/opentts: Open Text to Speech Server
^^ THIS is a truly awesome piece of work


@robmarkcole may I get some help with this question?
Could you at least comment on whether the issue is on the Deepstack side, or whether I have something wrong in your component? Sorry for tagging you, but the situation is pretty confusing; it seems like everything is set up according to the docs.

Can you confirm you have it licensed properly?

Also, try placing a picture in the www folder and creating a camera of the type local file:

Then run your check on that camera to make sure the basic config is working. This is my “test” picture setup, which I have left in place for the moment:

homeassistant:
  whitelist_external_dirs:
    - /config/www/cameras

camera:
  - platform: local_file
    name: file_front_door
    file_path: /config/www/cameras/motion-snapshot-frontdoor1.jpg

Good suggestions, Mark. Also, to take HA out of the equation altogether, try querying Deepstack directly, e.g.:

curl -X POST -F image=@your_image.jpg 'http://192.168.1.26:5000/v1/vision/detection'

Let us know what response you get back.
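
If Deepstack is up and the detection endpoint is enabled, you should get back JSON along these lines (the values here are illustrative, not from a real run):

{"success": true, "predictions": [{"label": "person", "confidence": 0.89, "x_min": 10, "y_min": 20, "x_max": 110, "y_max": 220}]}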

I created that camera based on my image and changed the image_processing config. I restarted Home Assistant, tried to call the service, and got the same result:
Depstack error : Timeout connecting to Deepstack, the current timeout is 15 seconds, try increasing this value

I am not sure how to check that, but here is Deepstack’s page

P.S.
Here is my config:

camera:
  - platform: local_file
    name: file_front_door
    file_path: /config/www/images/Argo.jpg
image_processing:
  - platform: deepstack_face
    ip_address: 192.168.1.14
    port: 83
    timeout: 15
    save_file_folder: /config/snapshots/
    save_timestamped_file: True
    save_faces: True
    save_faces_folder: /config/faces/
    source:
      - entity_id: camera.file_front_door
        name: file_front_door

I opened up my laptop’s terminal and went to the directory with my image (Argo.jpg), then I tried:
curl -X POST -F image=@Argo.jpg ‘http://192.168.1.14:83/v1/vision/detection’
And got:
curl: (6) Could not resolve host: ‘http

What did I do wrong here?

P.S.
http://192.168.1.14:83/v1/vision/detection - gets me a 404 in the browser

Maybe check the type of quotation mark in your curl statement. It should look more like:
curl -X POST -F image=@Argo.jpg 'http://192.168.1.14:83/v1/vision/detection'

{"success":false,"error":"Detection endpoint not activated","duration":0} **%**

Restarting the container did not help

When you started the container, did you use the following environment variable?
VISION-DETECTION=True
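
For example, a sketch of a run command enabling both the detection and face APIs (the port mapping and image tag are placeholders, adjust to your setup):

docker run -e VISION-DETECTION=True -e VISION-FACE=True -p 83:5000 deepquestai/deepstack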

I had only VISION-FACE set to True.
Should I add detection too?

I have just started using face and object recognition (Deepstack), but I have a question about face recognition: I never get unknown faces, even when the face is not recognised. Even with confidence set high, the system always returns a name I have already registered, just with lower confidence, never unknown.
Can you please help me?
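
For reference, the Deepstack face recognition endpoint accepts a min_confidence parameter, and matches scoring below that threshold should come back as “unknown”. A hedged example worth trying (the IP, port, and filename are placeholders):

curl -X POST -F image=@face.jpg -F min_confidence=0.7 'http://192.168.1.14:83/v1/vision/face/recognize'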

Hi, I’m running DeepStack on a Jetson Nano.
Only just started with it, but have got face recognition working, which is great.

If anyone else is using the Jetson Nano, do you know whether I should use "--gpus all"?

Reading https://docs.deepstack.cc/nvidia-jetson/ it is unclear.

They show:

sudo docker run --runtime nvidia -e VISION-DETECTION=True -p 80:5000 deepquestai/deepstack:jetpack

But just above that, they say you should use --gpus all

Any idea?

Thanks

Yeah, I haven’t done a lot of “real world” testing yet, but I’m certainly finding it pretty inaccurate. The only two real-world tests I’ve done returned 75%+ confidence matches on me for people who really don’t look anything like me.

My understanding is that "--runtime nvidia" is what you need; it takes the place of the (I assume) older "--gpus" flag.

So perhaps I’m making up a strange use case, but would it be possible to exclude objects via custom models? For example, I’d like to train Deepstack to know which cars I own, and to only notify me if there is an unknown car on the driveway. Or is there another way to do this?

That was my motivation to move to a TorchServe server instead of Deepstack. I have a built-in fastrcnn detecting cars, and I trained a densenet121 to recognize mine. It also recognizes me, my wife, and our dogs versus other people and dogs. I also trained it to recognize delivery trucks.
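
For reference, querying a TorchServe model is a simple HTTP call too; assuming a model registered under the name fastrcnn and the default inference port, something like:

curl http://127.0.0.1:8080/predictions/fastrcnn -T image.jpg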

That’s where I am going, @markss. Inference APIs are typically very simple, so it should be possible to build one HA component on top that focuses on the HA use case. What do you think, @robmarkcole?

I added some pipeline capabilities to my component:

models: # labels_in | model_name | label remap (* means pass on, null means drop) | filter (Python expression)
      - '> | fastrcnn | {"person": ">", "car,truck,bus": "car", "dog,cat,bear,teddy bear,sheep,cow": "animal", "*": "null"} | "(\"946e\" not in self.entity_id or obj[\"centroid\"][\"y\"]>0.46) and (\"946e\" not in self.entity_id or obj[\"confidence\"]>76) and (\"car\" not in obj[\"name\"] or obj[\"box_area\"]>0.01)"'
      - 'person,car,animal | unifiprotect-densenet121-ethereal-sweep-78 | {"object,package": "null", "*": ">"} | *'

Here is what this does:

  1. Takes the image_processing.scan image (">").
  2. Sends it to the fastrcnn object detection model ("fastrcnn").
  3. Remaps labels: "person" is passed on as-is, all vehicle labels are mapped to "car", all animal labels to "animal", and all other labels ("*") are discarded ("null").
  4. Applies 3 filters to my front camera, which has "946e" in its name: a) keeps only the lower part of the image, since there is a busy street above, by checking centroid.y; b) keeps detections with confidence of 76% or higher; c) drops "car" detections that are too small - for some reason some plants are occasionally recognized as cars.
  5. Takes the person, car and animal detections.
  6. Sends them to the unifiprotect-densenet121-ethereal-sweep-78 image classification model (my custom model).
  7. Discards the object and package labels ("null") and passes on all others ("*": ">"). This model is trained to recognize my car, me, my wife, delivery trucks, USPS, my dogs, and other dogs - 14 classes of objects commonly seen in front of my house.
  8. Sends these classifications on for processing, where HA events are fired and crops are saved for iOS notifications, etc.

This allows me to build Computer Vision-based automations for front of the house activity.
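
To make the remap semantics concrete, here is a minimal Python sketch of how a remap table like the one above could be applied to a list of detections (the function and the detection structure are illustrative, not the component’s actual internals):

# Illustrative sketch of the label-remap step described above.
# ">" passes a label through unchanged, "null" drops the detection.
def apply_remap(detections, remap):
    # Expand "car,truck,bus"-style keys into per-label lookups
    table = {}
    for labels, target in remap.items():
        for label in labels.split(","):
            table[label.strip()] = target
    out = []
    for det in detections:
        target = table.get(det["name"], table.get("*", "null"))
        if target == "null":
            continue  # discarded by the remap
        if target != ">":
            det = {**det, "name": target}  # renamed label
        out.append(det)
    return out

remap = {"person": ">", "car,truck,bus": "car", "*": "null"}
print(apply_remap([{"name": "truck"}, {"name": "kite"}], remap))
# -> [{'name': 'car'}]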

Instead of 2 components for 2 different backends (TorchServe, Deepstack), it would be great to have 1 component and focus on making it a more usable HA component with easier syntax, etc.

By the way, another direction to go here, instead of an image_processing component, is to have a Docker image scan the RTSP streams, send frames to the Deepstack or TorchServe container, do all the processing there, and publish the results to HA over MQTT. I am going to test this route out because I want to do high-FPS processing of camera streams and send events to HA; using image_processing.scan seems very inefficient for this - it results in delayed recognition events.
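
For anyone wanting to try the same route, here is a rough Python sketch of the loop I have in mind, using OpenCV for RTSP capture, the Deepstack detection endpoint, and paho-mqtt for publishing (the URLs, topic, and broker address are placeholders):

# Rough sketch: grab RTSP frames, run them through Deepstack
# object detection, and publish predictions to HA over MQTT.
import json
import cv2
import requests
import paho.mqtt.client as mqtt

RTSP_URL = "rtsp://user:pass@192.168.1.20:554/stream"   # placeholder
DEEPSTACK_URL = "http://192.168.1.14:83/v1/vision/detection"
MQTT_TOPIC = "cameras/front_door/detections"            # placeholder

client = mqtt.Client()
client.connect("192.168.1.10")  # MQTT broker address, placeholder
client.loop_start()

cap = cv2.VideoCapture(RTSP_URL)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Encode the frame as JPEG and send it to Deepstack
    _, jpg = cv2.imencode(".jpg", frame)
    resp = requests.post(DEEPSTACK_URL, files={"image": jpg.tobytes()}, timeout=10)
    predictions = resp.json().get("predictions", [])
    if predictions:
        # An MQTT sensor/automation on the HA side can consume these
        client.publish(MQTT_TOPIC, json.dumps(predictions))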
