Local realtime person detection for RTSP cameras

Very interesting component, nicely done! I was attempting something similar myself but it wouldn’t have turned out this nicely. I am trying it out but am getting “Unable to capture video stream”. Here is what I’m given when I go to the RTSP URL of my camera. Do you see anything it would choke on? Is H.264 a problem?

<StreamingChannel xmlns="http://www.hikvision.com/ver10/XMLSchema" version="1.0">
  <id>1</id>
  <channelName>Back</channelName>
  <enabled>true</enabled>
  <Transport>
    <rtspPortNo>554</rtspPortNo>
    <maxPacketSize>1000</maxPacketSize>
    <sourcePortNo>8200</sourcePortNo>
    <ControlProtocolList>
      <ControlProtocol>
        <streamingTransport>RTSP</streamingTransport>
      </ControlProtocol>
    </ControlProtocolList>
    <Unicast>
      <enabled>true</enabled>
    </Unicast>
    <Multicast>
      <enabled>true</enabled>
      <destIPAddress>0.0.0.0</destIPAddress>
      <destPortNo>8600</destPortNo>
    </Multicast>
  </Transport>
  <Video>
    <enabled>true</enabled>
    <videoInputChannelID>1</videoInputChannelID>
    <videoCodecType>H.264</videoCodecType>
    <videoScanType>progressive</videoScanType>
    <videoResolutionWidth>1280</videoResolutionWidth>
    <videoResolutionHeight>720</videoResolutionHeight>
    <videoQualityControlType>VBR</videoQualityControlType>
    <constantBitRate>2048</constantBitRate>
    <fixedQuality>100</fixedQuality>
    <maxFrameRate>600</maxFrameRate>
    <keyFrameInterval>6</keyFrameInterval>
    <BPFrameInterval>0</BPFrameInterval>
    <snapShotImageType>JPEG</snapShotImageType>
  </Video>
</StreamingChannel>

I would try getting an RTSP URL working in VLC first. If that works, you should be able to pass the same URL to the container.
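
If VLC works but the container still can't connect, a quick way to reproduce roughly what the container does is to open the stream with OpenCV directly. A minimal sketch (the URL here is just an example; substitute your camera's actual credentials and path):

import cv2

# try to open the stream and grab one frame, similar to what the container does on startup
url = "rtsp://username:password@192.168.1.10:554/Streaming/channels/1"
cap = cv2.VideoCapture(url)
ret, frame = cap.read()
print("got a frame" if ret else "unable to capture video stream")
cap.release()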


Got an error, but shortly after I realized my mistake.
Now I'm happy to report I got this working!
I followed your suggestions above (for the .pb and label map).
I set up four regions (thinking this would help distribute the load across multiple cores).
When it does detection, the video feed seems to lag a bit.
I'm feeding it low-res video at only 2 fps, so I don't know whether the detection itself is this slow or just the output feed.

Either way, it just needs some tweaking, but "it's alive!"

Thanks again!

Quick follow-up: I checked the MQTT messages and it seemed like it was pumping out a massive number of them… is there a way to adjust the frequency of both the detection and (probably related) the MQTT message sends?

Okay, so I needed to add the port (554), and then it choked on my password, which had a ! in it (though VLC managed with it), so I changed my password, and now I'm happy to report it's working as well! Here is the URL format for my HikVision camera in case it helps anyone:

rtsp://USERNAME:PASSWORD@IP_ADDRESS:554/Streaming/channels/1

A very nice and easy way to implement it. I'm very impressed; thanks so much for sharing this. I think it will get a lot of love.

A few questions:

Is there any way to define rectangular regions? I think I’d just prefer having it scan the whole stream all at once, or is that a problem?

If not, can regions overlap?

And finally, is there any way to have it dump a photo to disk of what it saw? I use this with the current tensorflow integration to show a little dashboard of what was detected in case I didn’t get a chance to see it in realtime on the stream. It’s pretty handy.

EDIT: From the FAQ it looks like probably not yet on that one, but I'm open to any creative ideas.

Thanks again for a really nice component. This opens up a lot of new possibilities for me, especially in finally getting off of my overburdened Pi and onto a more robust containerized environment.

I have thought about rate limiting the MQTT messages. I will add it to my list.


In theory, yes. The models are mostly trained on square images, so when the detection runs, it will squish a rectangular region into a square. I believe this will throw off the aspect ratio and decrease accuracy. Also of note, the fast models are all trained on 300x300 pixel images, and your region is resized to that before processing by tensorflow. I chose my camera resolution so that my smallest region would be as close to that size as possible. Because the images are resized, tensorflow will have a harder time picking up smaller items in your region. For best accuracy, you want to create your regions so that a person will fill as much of the area as possible.
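
Roughly what happens to a region before detection, as a sketch (this is not frigate's actual code; it just assumes OpenCV):

import cv2

# a rectangular 1280x720 region would get squeezed ~4.3x horizontally but
# only ~2.4x vertically to fit the 300x300 model input, so people end up
# distorted and small objects lose most of their pixels
region = cv2.imread("region.jpg")  # pretend this is a 1280x720 crop
model_input = cv2.resize(region, (300, 300))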

Regions can overlap, but you will waste some CPU cycles processing the same thing multiple times. Also, you will detect 2 people when one person stands in the overlapping parts.

The DEBUG parameter will write images to disk at /lab/debug, as well as a few other debug images. I do have the /best_person.jpg endpoint, so I can see the last person detected, in case I missed it, integrated as a camera in homeassistant. This container is one part of a full docker-based NVR solution I built from scratch. I haven't open sourced the rest of it yet, but I am working towards all the same use cases.
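
For anyone wanting the same view, a generic camera in homeassistant pointed at that endpoint would look roughly like this (the host and port are assumptions; use whatever you published the container's web port as):

camera:
  - platform: generic
    name: frigate best person
    still_image_url: http://192.168.1.10:5000/best_person.jpg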


Okay, that all makes sense. Additional accuracy is definitely the goal for me; I'm willing to throw some CPU cycles at it, and so far it doesn't seem too bad.

So I will make some big rectangles in the front and some small ones in the back for when people are further away. I would guess about half a person's worth of overlap is probably optimal, so that it can see a person regardless of whether he's standing between the regions; otherwise it would be "blind" to anything between the regions, wouldn't it?

I will try it with debug and see what comes out for viewing later.


So, I made two 720x720 regions and I'm actually pretty impressed with how well it works, even when I'm far from the camera. It's much better than the one I had running on my Pi: it works at night, which the other one couldn't do, has a high framerate, and its accuracy is much better. I don't think anyone could make it past it…

Couple more questions…

The DEBUG parameter: what are the options? I set it to 1; is that right? I don't see anything in /lab/debug (absolute path) inside the container. The folder didn't exist, and when I created one, nothing showed up there.

best_person.jpg is good enough for now, though… but it's always good to have persistent storage to look back at.

I’ve started trying to set up the MQTT component, but I get
Socket error on client <unknown>, disconnecting.
in the log when it tries to connect… Anything I should know about the client? Does it require SSL or authentication or anything? I'm trying it on a new test instance, so it's quite possible it's something on my side, but it does seem to connect to hass okay. I might try tomorrow from my production instance, which I know works well (but also often cracks under the pressure of chatty MQTT clients).

Yea, 1 is the only option for DEBUG. If the debug folder isn't there on startup, it might cause some of the subprocesses to fail until you restart the container. I usually mount a volume from my local machine at that location when running the container. It only writes on motion or detected objects, and you can search the source to see what I'm writing. The MQTT client doesn't currently work if you have a username and password set. You could test with the mosquitto_pub command line tool from within the container.
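
For example, assuming the container is named frigate and your broker is at 192.168.1.5, something like:

docker exec -it frigate mosquitto_pub -h 192.168.1.5 -t frigate/test -m "hello"

If that publish fails the same way, the problem is between the container and the broker rather than in frigate itself.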


Finally got the MQTT going… The Hass.io MQTT addon really doesn't like operating without users and security, so that would be a good addition someday, along with the ability to change the MQTT subscriber name. But it's pretty amazing to have something up and running in a docker container with this much functionality and this little fuss, so great work again.

If anyone wants a setup which works, I used the MQTT with Web Client Hass.io addon with the below config

{
  "log_level": "debug",
  "certfile": "fullchain.pem",
  "keyfile": "privkey.pem",
  "web": {
    "enabled": true,
    "ssl": false
  },
  "broker": {
    "enabled": true,
    "enable_ws": false,
    "enable_mqtt": true,
    "enable_ws_ssl": false,
    "enable_mqtt_ssl": false,
    "allow_anonymous": true
  },
  "mqttusers": []
}

One more question: is there any way to set the detection threshold for /best_person.jpg? It's been occasionally showing my umbrellas as people with 65% confidence, while real people are always above 80%, usually high 90s, so I think 80% is a good cutoff to filter them out (they are unfortunately on wheels, so they're hard to mask).

Also, 3.5GB of debug data today, very comprehensive ;-), but probably a bit much to use every day. Is there any way to turn off the motion detection debug output? Another nice feature would be the ability to set the interval between captures, but now I'm just being annoying.

Thanks again for the great work.

I actually just added support for MQTT username and password this morning and pushed a new image. You could comment out the lines that write on motion and rebuild the container yourself. Can you create github issues for feature requests? I am losing track of the things people are asking for.


Sorry, absolutely will do. Thanks again.

This is off to a great start, thank you for creating and sharing. I had two quick questions.

  1. About the person score sensor/MQTT values versus the bounding boxes on the images: I usually see scores in the 90s for the bounding boxes, but the person score is usually lower. Is this from the /5 calculation that occurs? Is it the sum of the scores of all detected objects or just person objects?

  2. NUM_CLASSES is set to 90 and the mapping files I was using have 90 entries, but frigate only reports on person. I built a new container with NUM_CLASSES set to 1 and trimmed the mapping file down to just person, but person detection stopped working. The goal was to see if this reduced CPU load. (I'm very new to tensorflow/ML, so I'm fumbling around a bit.) Should this work? I saw a comment about making this dynamic, so perhaps you have ideas in the works. FWIW, I think the tensorflow component lets us select object types.

thanks again for your time

  1. The sum of only the person scores in the past second, divided by 5 (see the sketch below).
  2. That is the number of classes in the model. In order to reduce the number of objects, you would have to retrain or use a different model. However, from my understanding, the additional objects improve the accuracy of person detection because the model has learned what isn't a person. It is in fact looking for all 90 object types; I am just reporting only on person. Adding an objects parameter would be fairly easy to do.
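
As a rough sketch of that first calculation, with made-up numbers:

# hypothetical person scores seen within the past second
scores = [0.98, 0.95, 0.90]
person_score = sum(scores) / 5  # 0.566, lower than any single box score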

Hi, I was just trying to get docker-compose going and started getting the error

Unable to find image 'frigate:latest' locally

docker: Error response from daemon: pull access denied for frigate, repository does not exist or may require 'docker login'.

when I try to rebuild it with

docker build -t frigate .

I get this now (it didn't happen the first time):

Step 5/17 : RUN GIT_SSL_NO_VERIFY=true git clone -q https://github.com/tensorflow/models /usr/local/lib/python3.5/dist-packages/tensorflow/models
 ---> Running in 773b58481072
error: RPC failed; curl 56 GnuTLS recv error (-54): Error in the pull function.
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
The command '/bin/sh -c GIT_SSL_NO_VERIFY=true git clone -q https://github.com/tensorflow/models /usr/local/lib/python3.5/dist-packages/tensorflow/models' returned a non-zero code: 128

Did anything change that could have caused this, or is it something on my side? Now it doesn't work with compose or the command line.

I'm running Ubuntu docker-ce and struggling with the volume setting.

docker-compose.yml is run from /home/user/frigate

yet it dosn’t matter what format I add the path this error shows:

tensorflow.python.framework.errors_impl.FailedPreconditionError: /label_map.pbtext; Is a directory

I figured it would be:
- /home/user/frigate:/label_map.pbtext:ro

any ideas?

Hi cooloo,

It should actually be:

  • /home/user/frigate/label_map.pbtext:/label_map.pbtext:ro

You should have the label map inside your frigate folder.
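
In docker-compose.yml, that mapping goes under the service's volumes key; something like this (the service and image names here are just examples):

version: "3"
services:
  frigate:
    image: frigate:latest
    volumes:
      - /home/user/frigate/label_map.pbtext:/label_map.pbtext:ro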

Regards

I just want to share something for those of you who, like me, might try to run the container on an older CPU.

It appears that Tensorflow, starting from version 1.6.0, requires an AVX-capable CPU. However, I was trying to run it on an Intel Xeon E5520, which apparently dates back to 2010 and does not support AVX.

The solution was to pin the tensorflow version inside the Dockerfile; specifically, version 1.5.0 is the last one that does not require AVX (or SSE, for that matter).
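
For example, if the Dockerfile installs tensorflow with pip, the pin would look something like this (a sketch; the actual install line in the Dockerfile may differ):

# pin the last tensorflow release that runs without AVX
RUN pip install tensorflow==1.5.0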

@blakeblackshear Thanks for the amazing work. I did manage to implement the MQTT authentication as well. Running on the aforementioned CPU, though, it does use quite a lot of resources: with 4 cores, it utilizes all of them at 30% when idle, and when scanning it jumps up to about 85-90%. If there is any way to decrease the load, that would be great.

Regards


Legend, @Memphisdj, thank you sir, that was the path.

Not sure how our older CPUs will fare with multiple cams,

but the functionality of @blakeblackshear's work is worth a new system. Awesome!