Local realtime person detection for RTSP cameras

For anyone interested in comparing: I found out last night that the HA Docker image already has TensorFlow installed. I configured it following the instructions on the HA component page and got it working.
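For reference, this is roughly the shape of what I put in configuration.yaml (the camera entity, scan interval, and model path are just my values/placeholders; the component docs have the full schema):

    image_processing:
      - platform: tensorflow
        scan_interval: 10                # seconds between scans; my value, not a default
        source:
          - entity_id: camera.driveway   # placeholder camera entity
        model:
          graph: /config/tensorflow/frozen_inference_graph.pb
          categories:
            - person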

Using BlakeBlackshear’s setup, I currently see a noticeable delay before I get a notification. Also, because of the averaging, I miss some alerts (the average stays low). I haven’t finished the HA TensorFlow automation yet, but I am interested in comparing the two.

@blakeblackshear, are you aware that TensorFlow is already an HA component? I know you are probably doing more/different things and like having control over the pipeline, but I just wanted your thoughts on this.

The existing tensorflow component is designed to run periodically against an image captured by another camera component. My goal with frigate is to process several video streams at 5fps+ in near real-time. It already accomplishes this for me, but it requires more processing power than I would like for something I plan to run 24/7. I will ultimately aim and zoom my PTZ camera based on other still cameras. Home Assistant is simply not architected to manage multiple processes that consume significant CPU load; trying to run frigate inside of it would likely peg the CPU at 100% and make everything else slow.

The problems you are having with frigate are probably due to running on a lower-powered machine, and not enough of the parameters are configurable yet to tune for that. At the moment I am more interested in incorporating the Google Coral than in tuning for lower-powered CPUs.

Thanks for the reply.
I don’t think my machine is the problem; it’s an AMD Ryzen 5 2400G with 16 GB of RAM.

I don’t think the delay comes from the object detection.
I am actually wondering if the MQTT part is somehow slowing it down this much.
I know this project is still very new; I just want to be clear that I think it is awesome and am by no means trying to poke holes in it.
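(For what it’s worth, one thing I plan to try is watching the broker directly to see when messages actually arrive, to rule MQTT in or out. Something like the following, assuming the mosquitto client tools are installed; the broker hostname is a placeholder, and the topic filter can be narrowed to frigate’s topic prefix:)

    # print every message as it reaches the broker, with its topic
    mosquitto_sub -h mqtt-broker.local -v -t '#'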

Great work!

Yea. I would expect it to run fine then. When you view the MJPEG stream on port 5000, does the video feed have a significant delay?

Looks like TF 1.8 may also work on CPUs without AVX. I will test with this.

EDIT: 1.8 did not work for me

I’ve been running the standard tensorflow homeassistant component since pretty much the day it came out, and my issues with it are the following:

  • Must run on the same server that hass is installed on
    (in my case an rPi, which isn’t strong enough to do much TensorFlow heavy lifting, meaning one frame every 10 seconds with a low-accuracy MobileNet model… an intruder could easily get past the system just by moving quickly out of camera range)
  • Cannot be dockerized at all (at least not in a different container than hass is in)
  • Does not work with hass.io (at least in any maintainable way)

Frigate fixes all of those issues: it lets me run much more powerful hardware in a Docker container on a different server, get much better accuracy, and analyze every single frame of the video. There are still a few drawbacks (the main one being hardware overhead), but overall it is already well worth the switch for me. When he finishes up the Coral integration, you can spend $100 to have pretty much all of the hardware overhead solved too, meaning you could probably run it at a high frame rate on an rPi, which is impossible with the standard integration.

Progress update on the Google Coral: a Raspberry Pi is able to process 8 regions for objects simultaneously on a 5fps video stream without breaking a sweat. That is with the Coral’s reduced clock speed and without the USB 3.0 speeds that supposedly help too. It takes just as much CPU to decode the video stream (I haven’t looked at hardware-accelerated decoding yet) as it does to do everything else. Pretty promising.
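(If I do get to hardware decoding, the rough idea would be something like the following on the Pi. This is an untested sketch: h264_mmal is the Pi hardware decoder in ffmpeg builds with MMAL support, and the RTSP URL is a placeholder.)

    # decode the RTSP stream with the Pi's hardware H.264 decoder and
    # throw the raw frames away, just to measure the decode cost
    ffmpeg -c:v h264_mmal -i "rtsp://user:pass@camera-ip/stream" \
           -f rawvideo -pix_fmt rgb24 pipe: > /dev/null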

That’s awesome! Does resolution make a difference, or does it just resize everything? (I’m using rcnn_inception_v2.)

I’ve got my finger hovering over the purchase button for when you say it’s ready :wink: .

How did the Pi do on handling 8 regions of motion detection? And is resolution a factor there? I just migrated my hass from a 3B+, so that would be a great solution with a Coral rather than chewing up a whole i7 laptop like I am now :crazy_face:.

It resizes everything. The Coral requires models to be exported in a special format. You should be able to use the same model, but it will require a conversion.

As far as motion detection goes, I am just bypassing it altogether for now. I have some ideas to make it much more efficient, but if it ends up requiring too much CPU, it may not be worth running at all given how little processing power it is now taking to just look for objects instead.

Yeah, I don’t really have a problem with having a $75 USB stick running full bore all the time analyzing everything… Might catch something the motion detection misses… I’d probably just leave motion detection off if that was an option.

How hard is the model conversion?

Seemed pretty simple, but I just downloaded a pre-converted SSD model.
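Roughly, the flow looks like this; since I only grabbed a model that was already converted, treat it as a sketch rather than something I have run end to end:

    # start from a quantized .tflite model (Coral hosts pre-quantized MobileNet SSD variants),
    # then compile it for the EdgeTPU
    edgetpu_compiler ssd_mobilenet_v2_coco_quant_postprocess.tflite
    # output: ssd_mobilenet_v2_coco_quant_postprocess_edgetpu.tflite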

Though, thinking about it, if you really wanted to match the 300x300 resolution of most of these models, you’d probably want about 7 regions per stream without overlap or 15 with overlap, and then motion detection would come in handy for directing analysis to the right region. So for peak accuracy, it would still be good to have. Having it analyze 60 regions is probably still a bit much to ask…

How do I install it on an RPi?
What is the image name for the RPi?

So, I have never tried anything with TensorFlow, so I guess I am having a little trouble getting started. I am trying to understand which label map and model you want me to download from TensorFlow?

Is there any reason you don’t bundle these with your project so that it is more of a simple onboarding experience for folks?

Perhaps I just need an idiot’s guide to person detection with your project… :slight_smile:

Thanks

Check comment 9 in this topic. The instructions are pretty clear

I decided to see how big of a difference it would make to run it on a laptop with USB 3.0 rather than an RPi 3B+. It is a HUGE difference: inference times went from ~50ms to <4ms. That means a single Coral will be able to look at 250+ 300x300 regions per second. If you assume 5fps on your cameras, that means you can look for objects in 50 regions simultaneously.

Great code, thank you for sharing this – it is very awesome! I have been trying to build something similar off and on for the last few months and never got very far. Right now I have settled for emailing camera clips to AWS Rekognition to offload the object detection (but sadly this means I need motion triggers, which are riddled with false positives (noise) and false negatives (missed persons)).

After reading this thread I ordered a Google Coral!

Rather than (or in addition to) getting best_person, it would be great to have short video clips or images with the bounding box and score of an alert (I see that you can get the debug clips, but those are just of the one region rather than the full image), and then ultimately to be able to send push notifications with that content!

Anyway, I just wanted to thank you for sharing your efforts on this and I hope you keep building on it, very exciting stuff!

Wow, that’s amazing. So I guess I will put it on my new Mac mini instead of the Pi… Though you are still using MobileNet, I believe, so I will probably get something more like 30 per stick with Inception ResNet v2 Faster R-CNN… I’d probably go for something like 15 regions optimally, so I might still want 2 sticks if I weren’t using motion detection with my 4 cameras, and maybe 3 if I were to go to 6 cameras… They are cheap enough for that not to be too much of an issue.

Is the Coral support released for general consumption with multiple regions yet?

I just started running it myself today. It only supports a single camera at the moment, but you can check out the edgetpu branch of the repo and build the container yourself. I haven’t updated the docs yet either.
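If you want to try it anyway, building it yourself looks roughly like this (assuming the Dockerfile at the repo root; the image tag is whatever you want to call it):

    git clone https://github.com/blakeblackshear/frigate.git
    cd frigate
    git checkout edgetpu
    docker build -t frigate:edgetpu .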

I’ll wait; I’ve got 4 cameras working pretty well on the current version. I look forward to it, though!