Local realtime person detection for RTSP cameras

I just sent you logs from my container. After a while of running rc3, I see the /camera feeds freeze up and detections stop working.

Happening here as well.

That is correct. The process that runs detections on the Coral keeps a rolling average of the time it takes to run inference with tensorflow. The lower that number, the better the performance. I haven't seen anything lower than 8-10ms, so I think that may be as good as it gets. My Atomic Pi gets ~18-20ms. The theoretical max FPS is 1000/coral_inference_speed. That should give you an idea of what % of your Coral's capacity you are using. Do keep in mind that frigate will often run inference multiple times on a single frame depending on where the motion is and what objects it is currently tracking, so 100FPS on your Coral doesn't necessarily mean 100FPS from the camera.
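To put numbers on that formula, here is a quick illustrative calculation (plain arithmetic on the values mentioned above, not anything from frigate itself):

    # Theoretical max detections per second on the Coral, given the rolling
    # average inference time frigate reports (in milliseconds).
    def max_coral_fps(inference_ms: float) -> float:
        return 1000.0 / inference_ms

    # Inference times quoted in this thread (illustrative only):
    for ms in (8.0, 10.0, 18.0, 20.0):
        print(f"{ms:>4.1f} ms per inference -> ~{max_coral_fps(ms):.0f} FPS theoretical max")

At ~10ms per inference that works out to roughly 100 detections per second, and because a single busy frame can consume several of those detections, the camera FPS you can actually cover is lower.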

Ok. I wasn’t expecting these changes to make a big difference, but I pushed up rc4. This version has been running successfully for 13 hours with my setup. It isn’t as aggressive in ensuring your cameras are still sending frames, but I want to handle that in a different place anyway.

Changes:

  • Increase the buffer size for the ffmpeg subprocess
  • Stop killing the camera process from the main process
  • Print some more information to the logs when the ffmpeg process isn’t returning data

Docker image is available with docker pull blakeblackshear/frigate:0.5.0-rc4
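For anyone curious what the buffer-size change refers to, below is a rough sketch of reading raw frames from an ffmpeg subprocess pipe with a larger stdout buffer. The resolution, RTSP URL, and 10-frame buffer are placeholder assumptions for illustration, not frigate's actual code:

    import subprocess

    # Illustrative values; frigate derives the real frame shape from ffprobe.
    width, height = 640, 480
    frame_size = width * height * 3  # rgb24 = 3 bytes per pixel

    ffmpeg_cmd = [
        "ffmpeg", "-hide_banner", "-loglevel", "panic",
        "-rtsp_transport", "tcp",
        "-i", "rtsp://user:password@camera-ip:554/stream",  # placeholder URL
        "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:",
    ]

    # A pipe buffer larger than a single frame gives the reader some slack when
    # the main loop briefly falls behind, instead of ffmpeg blocking on a full pipe.
    proc = subprocess.Popen(ffmpeg_cmd, stdout=subprocess.PIPE, bufsize=frame_size * 10)

    while True:
        frame_bytes = proc.stdout.read(frame_size)
        if len(frame_bytes) < frame_size:
            print("ffmpeg process isn't returning data")
            break
        # hand frame_bytes off to motion detection / object tracking here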

Thanks Blake, will give it a go.

As luck would have it, it has gone hours on the last run without stopping this time round (still on RC3)…

Screenshot 2020-02-26 at 08.59.14

It crashed 20 mins after I posted my previous message. Currently testing out RC4 with a notification if it stands still for longer than 5 minutes; I will then grab the logs for you.

Hello,

Still the same crash with the rc4 update (it was OK between 11h41 and 12h14). Like ebendl, my rc3 was OK between 16h00 and 10h00 this morning.

Unlike rc3, when rc4 crashed, the debug MJPEG endpoints for my five cameras were frozen at the time of the crash (still image)!

Yes, unfortunately RC4 also “froze” on my side (the only major difference is that the Coral FPS didn't freeze at 0, but at 0.3).

Screenshot 2020-02-26 at 17.05.26
(Yellow bar marks RC3, green bar marks RC4)

Haven’t had time to look at the logs yet, will do that once I’m home.

Hopefully the additional logging in rc4 will help track down what is happening.

Mine finally stopped too. It looks like the process running detection on the coral is getting hung up sometimes on the invoke method within tensorflow. I was able to catch mine in the stuck state and grab stack traces. I don’t see anything unusual about the data being passed to tensorflow. I am going to look into monitoring that process and see if we can restart it if it gets into this state.
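For anyone curious what that monitoring could look like, here is a rough sketch of the idea: the detection process updates a shared heartbeat timestamp after each inference, and a supervisor restarts it if the heartbeat stops advancing. The names, the shared-value mechanism, and the 10-second threshold are illustrative assumptions, not frigate's actual implementation:

    import datetime
    import multiprocessing as mp
    import time

    def detection_loop(heartbeat):
        # Stand-in for the process that calls interpreter.invoke() on the Coral.
        while True:
            # ... dequeue a frame and run inference here ...
            heartbeat.value = datetime.datetime.now().timestamp()
            time.sleep(0.01)

    def start_detection(heartbeat):
        proc = mp.Process(target=detection_loop, args=(heartbeat,), daemon=True)
        proc.start()
        return proc

    if __name__ == "__main__":
        heartbeat = mp.Value("d", datetime.datetime.now().timestamp())
        detect_process = start_detection(heartbeat)

        while True:
            time.sleep(10)
            stalled_for = datetime.datetime.now().timestamp() - heartbeat.value
            if stalled_for > 10:
                print("Detection appears to be stuck. Restarting detection process")
                detect_process.terminate()
                detect_process.join(timeout=5)
                detect_process = start_detection(heartbeat)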

Sometime overnight one of my 2 cameras stopped working. It appears rc4 successfully restarted only one of the camera processes. This is what I found in the container log:

/arrow/cpp/src/plasma/eviction_policy.cc:134: There is not enough space to create this object, so evicting 87 objects to free up 80246016 bytes. The number of bytes in use (before this eviction) is 399385344.

/arrow/cpp/src/plasma/eviction_policy.cc:134: There is not enough space to create this object, so evicting 87 objects to free up 80246016 bytes. The number of bytes in use (before this eviction) is 399385344.

/arrow/cpp/src/plasma/eviction_policy.cc:134: There is not enough space to create this object, so evicting 87 objects to free up 80246016 bytes. The number of bytes in use (before this eviction) is 399385344.

back: ffmpeg_process exited unexpectedly with 0

back: exiting subprocess

/arrow/cpp/src/plasma/store.cc:738: Disconnecting client on fd 7

garage: ffmpeg_process exited unexpectedly with 0

garage: exiting subprocess

Process for back is not alive. Starting again…

Camera_process started for back: 4986

Starting process for back: 4986
ffprobe -v panic -show_error -show_streams -of json "rtsp://xxxxxx:[email protected]:554//h264Preview_01_sub"

{'error': {'code': -101, 'string': 'Network is unreachable'}}

Process Process-4:

Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/frigate/frigate/video.py", line 130, in track_camera
    frame_shape = get_frame_shape(ffmpeg_input)
  File "/opt/frigate/frigate/video.py", line 41, in get_frame_shape
    video_info = [s for s in info['streams'] if s['codec_type'] == 'video'][0]
KeyError: 'streams'

Process for back is not alive. Starting again…

Camera_process started for back: 4994

Starting process for back: 4994

ffprobe -v panic -show_error -show_streams -of json "rtsp://xxxxx:[email protected]:554//h264Preview_01_sub"

/arrow/cpp/src/plasma/store.cc:738: Disconnecting client on fd 8

Process for garage is not alive. Starting again…

Camera_process started for garage: 5906

Starting process for garage: 5906

ffprobe -v panic -show_error -show_streams -of json "rtsp://xxxxx:[email protected]:554//h264Preview_01_sub"

{'streams': [{'index': 0, 'codec_name': 'h264', 'codec_long_name': 'H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10', 'profile': 'High', 'codec_type': 'video', 'codec_time_base': '0/2', 'codec_tag_string': '[0][0][0][0]', 'codec_tag': '0x0000', 'width': 640, 'height': 480, 'coded_width': 640, 'coded_height': 480, 'has_b_frames': 0, 'sample_aspect_ratio': '0:1', 'display_aspect_ratio': '0:1', 'pix_fmt': 'yuv420p', 'level': 51, 'chroma_location': 'left', 'field_order': 'progressive', 'refs': 1, 'is_avc': 'false', 'nal_length_size': '0', 'r_frame_rate': '12/1', 'avg_frame_rate': '0/0', 'time_base': '1/90000', 'start_pts': 15030, 'start_time': '0.167000', 'bits_per_raw_sample': '8', 'disposition': {'default': 0, 'dub': 0, 'original': 0, 'comment': 0, 'lyrics': 0, 'karaoke': 0, 'forced': 0, 'hearing_impaired': 0, 'visual_impaired': 0, 'clean_effects': 0, 'attached_pic': 0, 'timed_thumbnails': 0}}, {'index': 1, 'codec_name': 'aac', 'codec_long_name': 'AAC (Advanced Audio Coding)', 'profile': 'LC', 'codec_type': 'audio', 'codec_time_base': '1/16000', 'codec_tag_string': '[0][0][0][0]', 'codec_tag': '0x0000', 'sample_fmt': 'fltp', 'sample_rate': '16000', 'channels': 1, 'channel_layout': 'mono', 'bits_per_sample': 0, 'r_frame_rate': '0/0', 'avg_frame_rate': '0/0', 'time_base': '1/16000', 'start_pts': 0, 'start_time': '0.000000', 'disposition': {'default': 0, 'dub': 0, 'original': 0, 'comment': 0, 'lyrics': 0, 'karaoke': 0, 'forced': 0, 'hearing_impaired': 0, 'visual_impaired': 0, 'clean_effects': 0, 'attached_pic': 0, 'timed_thumbnails': 0}}]}

ffmpeg -hide_banner -loglevel panic -avoid_negative_ts make_zero -fflags nobuffer -flags low_delay -strict experimental -fflags +genpts+discardcorrupt -vsync drop -rtsp_transport tcp -stimeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://xxxxx:[email protected]:554//h264Preview_01_sub -f rawvideo -pix_fmt rgb24 pipe:
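The KeyError in the traceback above comes from ffprobe returning an error object ('Network is unreachable') instead of a 'streams' list while the camera was offline, which then kills the freshly restarted camera process. Purely as an illustration (a hedged sketch built around the same ffprobe invocation, not frigate's actual video.py), a more defensive probe could retry instead of crashing:

    import json
    import subprocess
    import time

    def get_frame_shape(rtsp_url, retries=5, delay=10):
        # Probe a stream with ffprobe and return (height, width, channels).
        # Retries when ffprobe reports an error (e.g. 'Network is unreachable')
        # instead of letting a missing 'streams' key crash the camera process.
        cmd = ["ffprobe", "-v", "panic", "-show_error", "-show_streams",
               "-of", "json", rtsp_url]
        for attempt in range(1, retries + 1):
            result = subprocess.run(cmd, capture_output=True)
            info = json.loads(result.stdout or "{}")
            if "streams" in info:
                video = [s for s in info["streams"] if s["codec_type"] == "video"][0]
                return (video["height"], video["width"], 3)
            print(f"ffprobe attempt {attempt} failed: {info.get('error')}")
            time.sleep(delay)
        raise RuntimeError(f"Unable to probe {rtsp_url} after {retries} attempts")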

Any chance you can tie the process health to the health check endpoint? Or will that only cover the health of the HTTP server?

I think this corresponds to what I'm seeing too, but I think there are 2 bugs.

On the one hand, it does look like some of the ffmpeg processes fail to restart after a couple of tries. I had one camera which was fairly noisy, causing the Coral FPS to increase, so I was able to see when it died. The MJPEG endpoint also stopped updating for it, but the process didn't recover. This is also a fairly cheap camera, so I'm not too concerned.

The other issue is where the detection process stops. I'm not sure if it sometimes kills individual capture processes too and sometimes just the detection. I think I've seen cases where all the cameras' detection FPS stops changing along with the Coral's, and sometimes just the Coral's.

I’ve gone back to 0.4.0 for now but will happily spin up 0.5.0 again.

I plan on finishing a new rc this weekend that should address both of these issues.

Hello, so I gave up on my plan of using the mPCIe Coral on my ESXi 6.7 server.
I couldn't get PCI passthrough working. It's very likely an ESXi issue, as someone else here also tried with their hardware.

So plan B: I re-purposed an old laptop (Lenovo R61 - these things are indestructible and still running after 12 years!) and swapped the wifi card for the mPCIe Coral device.

Tada:

$ sudo docker run --rm --privileged -v /dev/apex_0:/dev/apex_0 -v /etc/localtime:/etc/localtime:ro -v /data/frigate:/config:ro -p 5000:5000 -e FRIGATE_RTSP_PASSWORD='password' blakeblackshear/frigate:0.5.0-rc4 python3.7 -u benchmark.py
Average inference time: 12.54ms

Now I will get started with the actual fun part 🙂

This version has been running without any issues for 24 hours.

Changes:

  • Handle ffmpeg restarts in the camera subprocess to prevent zombies
  • Restart the detection process if it gets stuck
  • Switch the detection process to use a queue and the plasma store to avoid locks
  • Update the benchmark script to run detection the way frigate does
  • Allow the mqtt password to pull from env vars just like RTSP (cc @jon102034050)

Docker image is available with docker pull blakeblackshear/frigate:0.5.0-rc5
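For anyone wondering what "a queue and the plasma store" means in practice: the plasma object store that ships with pyarrow keeps frames in shared memory, so only small object IDs need to travel through a multiprocessing queue. A minimal sketch of that pattern follows; the socket path, frame shape, and process wiring are illustrative assumptions rather than frigate's actual code, and it assumes a plasma store server is already running on that socket:

    import multiprocessing as mp

    import numpy as np
    import pyarrow.plasma as plasma

    PLASMA_SOCKET = "/tmp/plasma"  # assumes a plasma store is already listening here

    def camera_process(frame_queue):
        client = plasma.connect(PLASMA_SOCKET)
        frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a decoded frame
        object_id = client.put(frame)        # the frame itself lives in shared memory
        frame_queue.put(object_id.binary())  # only the 20-byte ID crosses the queue

    def detection_process(frame_queue):
        client = plasma.connect(PLASMA_SOCKET)
        object_id = plasma.ObjectID(frame_queue.get())
        frame = client.get(object_id)        # read the shared frame back out
        print("got frame with shape", frame.shape)

    if __name__ == "__main__":
        queue = mp.Queue()
        cam = mp.Process(target=camera_process, args=(queue,))
        det = mp.Process(target=detection_process, args=(queue,))
        cam.start()
        det.start()
        cam.join()
        det.join()

Keeping full frames out of the queue means the capture and detection sides never contend on large payloads, which is the sort of locking the change above is meant to avoid.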

This is a great idea. I have an old laptop without USB3 but with Wifi. I plug the laptop into ethernet in any case, so I could do the same!

Thank you for the updates.

I keep getting this error as soon as there is motion:

Detection appears to be stuck. Restarting detection process

Any pointers?

In the example docker-compose, you show

    restart: unless-stopped

There is no mention of it in the “Run the container with” section.

Is the restart part missing or is it not required?

It’s not required.