Local realtime person detection for RTSP cameras

Got my Google Coral stick today and I'm really looking forward to setting this up (running two 2MP Hikvision IP cams, some cheap 720p no-name cams, and some Wyze cams).

If you do pursue reporting on any object type, it would be great to be able to specify the object types per bounding box. If that’s too tall an order, I’d guess you could just pull another RTSP stream and define the objects to detect on the additional stream(s).

Absolutely. I intend to allow everything to be set at the region, camera, or global levels. Setting a value at the camera level would override the global default, and region would override the camera default.
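For anyone curious what that could look like, here is a rough sketch of the override hierarchy in a config file. The key names are purely illustrative assumptions, not the final schema:

```yaml
# Illustrative sketch only -- key names are hypothetical, not the final schema.
objects:
  person:
    threshold: 0.5            # global default

cameras:
  back_yard:
    objects:
      person:
        threshold: 0.7        # overrides the global default for this camera
    regions:
      - x_offset: 0
        y_offset: 0
        size: 300
        objects:
          person:
            threshold: 0.85   # overrides the camera default for this region
```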


The update is working fine on Unraid too: just hit the update button and edit the config file. Not too many changes.

P.S. Has anyone had much success with hardware acceleration? It all seems a bit vague to me. I run pretty huge detection areas, so I would like to bring CPU usage down where I can.

Do you have an Intel processor? The CPU usage comes from decoding the video stream and resizing your regions. If you have an Intel processor, you should be able to enable hardware acceleration for ffmpeg using the example in the docs. You also want to use the stream from your camera that results in your smallest region being as close to 300px as possible. You get no additional accuracy from a higher resolution: if you just have one large region for the camera, you will get the same accuracy with a 360p video feed as with an 8K video feed. The higher resolution just makes your machine work really hard to resize the large image down to 300x300px.
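As a rough sketch, hardware-accelerated decoding on an Intel iGPU usually comes down to passing VAAPI arguments to ffmpeg. The exact config keys depend on your Frigate version, so treat this as an assumption and compare it against the example config in the repo:

```yaml
# Hedged sketch: VAAPI decoding for Intel iGPUs. Check the example config in
# the repo for the exact keys your Frigate version expects.
ffmpeg:
  hwaccel_args:
    - -hwaccel
    - vaapi
    - -hwaccel_device
    - /dev/dri/renderD128
    - -hwaccel_output_format
    - yuv420p
```

You also need to pass the GPU device (e.g. /dev/dri) through to the Docker container; if ffmpeg can't reach it, the container typically fails to start.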

Just pushed up a new beta release: https://github.com/blakeblackshear/frigate/releases/tag/v0.3.0-beta

Breaking Changes:

  • Configuration file changes to support all objects in the model. See updated example.
  • Images are now served up at /<camera_name>/<object_name>/best.jpg
  • MQTT messages are published to <camera_name>/<object_name> and <camera_name>/<object_name>/snapshot

Changes:

  • Frigate now reports on every object type in the model. You can configure thresholds and min/max areas for each object type at a global, camera, or region level.

Image is available with docker pull blakeblackshear/frigate:0.3.0-beta
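If it helps anyone wiring this into Home Assistant, here is a hedged sketch against the new topic and URL layout. The camera name, host, and payloads are assumptions for your own setup; check the actual MQTT payloads Frigate publishes before copying this:

```yaml
# Hedged sketch for Home Assistant. "back_yard" and the frigate host are
# placeholders; verify the payloads on <camera_name>/<object_name> first.
binary_sensor:
  - platform: mqtt
    name: Back Yard Person
    state_topic: "back_yard/person"
    device_class: motion

camera:
  - platform: mqtt
    name: Back Yard Person Snapshot
    topic: "back_yard/person/snapshot"
  - platform: generic
    name: Back Yard Best Person
    still_image_url: http://frigate:5000/back_yard/person/best.jpg
```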


Stupid question, but where is the updated example? Is it the one linked in the readme?


Thank you! Very excited to try this new beta

Wow, that was quick! Looking forward to trying this one.

Yep, I have Intel processors. I just found that uncommenting any of the hardware acceleration options stopped it from starting.

I’ll give it another go though.

A couple of questions about the Docker container builds, regarding OpenCV:

  1. Do you know if there are any particular differences between the version that is compiled manually versus the one provided by opencv-python? I believe they will use different versions of libraries (the manual build pulls in its own copies), but otherwise I’m not sure.
  2. Any reasons not to use opencv-python?
  3. Any reasons not to use OpenCV 4.1? I am now using opencv-python 4.1.0.25 and it seems to work fine.

I’m asking because I have a couple of Dockerfiles for Raspberry Pi. One builds OpenCV manually (mimicking the Dockerfile on GitHub as much as possible), while the other just installs opencv-python-headless (via piwheels). The OpenCV build takes forever (possibly hours) on an RPi (or even when cross-compiling). It’s much simpler and faster to just install the wheel.

The new beta for all objects is working well, with one small issue.

If there is a car parked in the detection area, it is detected and a message is fired off almost continually (128 messages in 4 minutes). Is there a way to alert only on new objects or changes?

Thanks

I don’t remember why I am installing from source. When I start building and publishing ARM images, I will look at optimizing it again.

If you already have your threshold at 0.5 for cars, there isn’t a good way to do anything about it without some significant changes.

OK, thanks for replying. I was guessing that a level of refactoring would be needed, even just to ask whether the detection area or confidence level has changed.

I’d much rather see local face recognition though :wink:

Great work - really nice project :slight_smile:

What next?

  • Performance enhancements for running on Raspberry Pi to lower CPU usage and support more cameras per device
  • Official ARM docker builds to support Raspberry Pi
  • Dynamic regions that resize and follow detected objects (at the moment, people are often missed when they stand between regions; this would also enable counting and tracking speeds)
  • Face detection (recognition to be added after)
  • Save detections for training custom models or transfer learning



Dynamic regions would also be a first step towards recording video of the person to send in notifications, I assume.

You can already create video clips for alerts with the record service in Home Assistant if you want.
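For reference, a hedged sketch of that approach, assuming you already have a camera entity for the stream. The entity name, MQTT topic, and file path are placeholders, and the clip directory may need to be added to whitelist_external_dirs:

```yaml
# Hedged sketch: record a short clip when Frigate reports a person.
# camera.back_yard, the MQTT topic, and the file path are placeholders.
automation:
  - alias: Record clip on person detection
    trigger:
      - platform: mqtt
        topic: back_yard/person
        payload: "ON"
    action:
      - service: camera.record
        data_template:
          entity_id: camera.back_yard
          filename: "/config/www/clips/back_yard_{{ now().strftime('%Y%m%d_%H%M%S') }}.mp4"
          duration: 20
```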

Tracking an object across many frames would help prevent blind spots and give me more information to filter out false positives. If the person is completely stationary or moving too fast, I could prevent alerts. I could also compute whether or not the person is approaching or walking away and select the best face the camera saw for that object. It is a much clearer representation of an “event” with attributes such as: object type, time in the frame, min speed, max speed, best face, recognized face, current position, direction of movement, etc.

Once I finish a few more of these issues with Frigate, I want to build a “homeassistant native” NVR that lets you view historical footage, event history intermixed with any other homeassistant events, and realtime low-latency video streams.


I’m going to start working on dynamic regions and object tracking. This is going to be a bigger architecture change than was required to track all object types. I’m hoping I can finish this over the holidays.
