Local realtime person detection for RTSP cameras

I assume you need to set MONITORDIR at the top of the script. Also, do you run this with nohup, in a screen session, or from cron?

Great info. Are you able to access the GPU inside Docker using QEMU & GVT-g?
I have enabled it on my Windows QEMU VM, but I'm not sure how to do the same for the QEMU VM that runs Docker. Is there anything to configure on the Docker side?
Right now I am unable to create a second virtual GPU; I need to check the hardware BIOS.

I'm not using Docker with frigate, I'm just running it in the VM. But the Docker config should be the same as for bare metal. I think you should add something like "--device /dev/dri/renderD128:/dev/dri/renderD128" to the Docker command line.
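
A minimal sketch of what that could look like (config path, port, and image tag are generic placeholders, not from this thread):

# Expose the Intel render node to the container so ffmpeg can use VAAPI /
# Quick Sync for decoding; the Coral and other options are omitted here.
docker run -d --name frigate \
  --device /dev/dri/renderD128:/dev/dri/renderD128 \
  -v /path/to/config:/config:ro \
  -p 5000:5000 \
  blakeblackshear/frigate:stable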

With QEMU I use the QXL video device (for console video). If you choose virtio, it will create a new render device, so GVT-g will not use the default /dev/dri/renderD128, which is a problem. Also, if you use the Q35 chipset, you will need to configure the PCIe addresses very specifically to put your GVT-g device at PCIe address 0000:00:02.0.

For the VM, I did not create a GVT-g virtual display in QEMU. I also added "i915.disable_display=1" to the guest VM kernel command line to get the virtual i915 to initialize properly.
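
For reference, a rough sketch of that kind of GVT-g setup (the mdev type name is an example and the QEMU flags are my assumption of the usual approach; on Q35 the guest address still has to be forced to 00:02.0 as noted above):

# Host: create a GVT-g mediated device under the iGPU at 0000:00:02.0.
# List mdev_supported_types first to see which types your GPU actually offers.
GVT_TYPE=i915-GVTg_V5_4
UUID=$(uuidgen)
echo "$UUID" | sudo tee \
  "/sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/$GVT_TYPE/create"

# QEMU: attach it with no virtual display (matching the setup described above):
#   -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$UUID,addr=0x02,display=off
# Guest kernel command line addition:
#   i915.disable_display=1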

Some UEFI firmwares unfortunately do not allow editing the aperture size. But there is a workaround, as shown here: https://github.com/intel/gvt-linux/issues/131

I am using Q35 on my Windows VM. It has worked overnight so far, but it seems there are some hiccups until they move to a newer Linux kernel. I will let it run for some weeks and see, but the CPU load on my Blue Iris Windows machine has already dropped greatly (15% CPU, including Windows and 4 cameras at 1080p@15fps doing motion detection).

You say you run it outside of Docker: are there any instructions on how to do this?
Sub-question (probably more for @blakeblackshear): could I run Frigate on my Windows machine (with or without Docker, preferably without)?

Docker is a very thin layer and introduces an insignificant amount of overhead. Its performance is nearly identical to bare metal. I am not aware of any reason to go through the hassle of trying to maintain it directly in a VM, but you should be able to follow the Dockerfile. You will have to rebuild your VM from scratch with each update since many system packages change with each release.

I would be very surprised if it was possible to get it working on Windows given how many dynamically linked Linux libraries are required by the underlying Python packages. I won't ever add or maintain Windows support myself.

Thanks, I will try to optimize the performance of my Docker QEMU setup and enable HW acceleration.
Is there any issue with running at an average inference speed of 60 ms?
I've limited it to a single 1080p@5fps camera stream until HW acceleration is enabled.
Average CPU use for this config is 5-10%.

{"coral":{"detection_start":0.0,"fps":5.7,"inference_speed":56.55,"pid":22},"plasma_store_rc":null,"z5":{"camera_fps":5.0,"detection_fps":5.7,"ffmpeg_pid":33,"frame_info":{"detect":1601468995.087345,"process":0.0,"read":1601468995.087345},"pid":35,"process_fps":5.1,"read_start":0.0,"skipped_fps":0.0}}

I'm not a fan of Docker because it seems to have some weird issues. For example, it doesn't support cgroups v2, which means it doesn't work by default in Fedora 32. I'd rather not have another layer of abstraction, with volumes and images that are hard to access.
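
For what it's worth, the usual workaround on Fedora 31/32 (assuming you want to keep Docker rather than switch to podman) is to boot the host back into the legacy cgroup v1 hierarchy:

# Revert to cgroups v1 so Docker works again, then reboot to apply.
sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"
sudo reboot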

I don't know of any official instructions for running it outside of Docker. I had a bit of trouble because my OS did not support Python 3.7, which libedgetpu1 (and therefore frigate) requires. But looking at GitHub I see Python 3.8 support, so maybe it will come to stable soon. https://github.com/google-coral/edgetpu

As mentioned I just used the Dockerfile to see the dependencies.
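
As a rough sketch of the Coral-specific part when running bare metal (these commands come from the Coral getting-started docs, not from the frigate Dockerfile, so treat them as an assumption):

# Add the Coral apt repository and install the Edge TPU runtime
# (libedgetpu1-std = standard clock; libedgetpu1-max is faster but runs hotter).
echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | \
  sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install -y libedgetpu1-std

# The Python packages and remaining system libraries should be copied from
# the apt/pip lines in the frigate Dockerfile for your release.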

It is still faster than not using a Coral, but for comparison, both the Raspberry Pi 4 and the Atomic Pi get ~15 ms, which is about 4x faster. Your NUC should be ~10 ms. There is a bottleneck somewhere in the layers of virtualization for your USB 3 passthrough.

The CPU load won't increase linearly with each camera you add, since some of the processing is shared across cameras. My ffmpeg processes use about 5% CPU per 1080p 5fps stream with hwaccel via Quick Sync on my i3 NUC.
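
A quick way to check that Quick Sync decoding actually works in a given VM or container (the RTSP URL is a placeholder) is to run a VAAPI decode by hand and watch the CPU usage of the ffmpeg process:

# Decode ten seconds of the stream with VAAPI and discard the output; low CPU
# usage here suggests hardware decoding is really being used.
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
  -i "rtsp://user:pass@camera-ip:554/stream" \
  -t 10 -f null -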

I also run 4 1080p 5fps cameras on the J4125 mini PC linked in the Readme. It doesn't even break a sweat, and I run several other services on it too.

The next version of frigate will use python 3.8

Sorry for the stupid questions, but can you explain in a few words (or give a link) what inference speed is? From what I understand, it somehow represents the speed of communication between the CPU and the Coral device, so enabling HW acceleration for H264 decoding will have no impact on it?
In that case, I will first look at solving this USB3 forwarding issue.

Here is what I have so far:

[    1.629471] usb usb3: We don't know the algorithms for LPM for this host, disabling LPM.
[    1.630260] usb usb3: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.04
[    1.630916] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.631602] usb usb3: Product: xHCI Host Controller
[    1.632248] usb usb3: Manufacturer: Linux 5.4.0-48-generic xhci-hcd
[    1.632942] usb usb3: SerialNumber: 0000:01:1b.0
...
[    2.621515] usb 3-1: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[    2.646069] usb 3-1: LPM exit latency is zeroed, disabling LPM.
[    2.646440] usb 3-1: New USB device found, idVendor=18d1, idProduct=9302, bcdDevice= 1.00
[    2.646462] usb 3-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
...
[   65.409349] usb 3-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[   65.429977] usb 3-1: LPM exit latency is zeroed, disabling LPM.

and the USB devices are as follows:

sleepy@docker:~$ lsusb
Bus 003 Device 002: ID 18d1:9302 Google Inc. 
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd QEMU USB Tablet
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
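
One extra check that may help here (assuming lsusb is recent enough for the tree view) is to confirm the negotiated link speed of the Coral inside the VM:

# The tree view shows the speed per device: the Coral (18d1:9302 once its
# firmware is loaded) should be attached at 5000M; 480M would mean it fell
# back to USB 2 somewhere in the passthrough chain.
lsusb -t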

It is the amount of time it takes to run object detection on a single 300x300 px image. Said another way, it is the time it takes to execute the TensorFlow AI model.

I see that frigate uses the ssd_mobilenet_v2_coco_quant_postprocess_edgetpu.tflite model. I sometimes get false-positive results, and I've read in other threads that the Inception model could improve object detection.

So I would just have to volume mount the inception_v4_299_quant_edgetpu.tflite file and that's it? Or would that be too easy :grin:

Thanks!

That is an image classification model, not an object detection model. The shapes of the inputs and outputs are completely different. That may be a good model for downstream confirmations if the existing model can't be improved. I would rather try to improve the existing model first before adding another step to the processing that would slow down the entire pipeline.
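
For completeness, and only as a sketch: swapping in a different object detection model is normally just a matter of volume mounting it over the bundled files. The container paths and image tag below are my assumption from the frigate README of that era, so verify them against your release:

# Mount a custom Edge TPU *detection* model and its label map over the
# bundled ones; all other options stay as in your existing command.
docker run -d --name frigate \
  -v /path/to/custom_detect_edgetpu.tflite:/edgetpu_model.tflite \
  -v /path/to/labelmap.txt:/labelmap.txt \
  -v /path/to/config:/config:ro \
  -p 5000:5000 \
  blakeblackshear/frigate:stable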

I am now getting an average inference speed of 20 ms. I saw that I had somehow messed up enabling USB3. I guess it will be hard to get faster speeds now, given that I go through Host => Proxmox => QEMU => Docker.
I will try to enable the virtual GPU, but right now CPU is between 7 and 15% depending on whether there is movement or not. I would say this is acceptable for now.
Now I need to work on better handling of the zones and the associated MQTT events + notifications.
Thanks for this great program.

Correct me if I am wrong, but YOLOv5 would give the most accurate detection, right? And they already run it on a Xavier: https://www.youtube.com/watch?v=0RqvMBj7A4I

Or on a Coral…

It would be great to make it work with frigate so we could test it.

I am facing multiple restarts of frigate; the issue is only visible in the logs…
The "online" status in MQTT would need to be refreshed sometimes: maybe that is already done, in which case, how often?

2020-10-01T08:56:00.201491233Z Detection appears to be stuck. Restarting detection process
2020-10-01T08:56:00.201548617Z Waiting for detection process to exit gracefully...
2020-10-01T08:56:30.231687322Z Detection process didnt exit. Force killing...
2020-10-01T08:56:30.249847191Z Starting detection process: 1590
2020-10-01T08:56:30.249877884Z Attempting to load TPU as usb
2020-10-01T08:56:33.059774812Z TPU found
2020-10-01T08:57:20.292767526Z Detection appears to be stuck. Restarting detection process
2020-10-01T08:57:20.292878023Z Waiting for detection process to exit gracefully...
2020-10-01T08:57:50.323175863Z Detection process didnt exit. Force killing...
2020-10-01T08:57:50.343283158Z Starting detection process: 1600
2020-10-01T08:57:50.343889590Z Attempting to load TPU as usb
2020-10-01T08:57:53.155763407Z TPU found
2020-10-01T08:59:10.421967072Z Detection appears to be stuck. Restarting detection process
2020-10-01T08:59:10.422003579Z Waiting for detection process to exit gracefully...
2020-10-01T08:59:40.452301109Z Detection process didnt exit. Force killing...
2020-10-01T08:59:40.468399846Z Starting detection process: 1609
2020-10-01T08:59:40.468888389Z Attempting to load TPU as usb
2020-10-01T08:59:43.294166999Z TPU found

… and it continues like this …
I have moved back to a single camera to see if I still face this issue.

Is there a way to enable more debug information? How? Do I need to add an additional volume to the Docker container?
Thanks

There isn't anything else to debug. The detection process is dedicated to communicating with the Coral. Those messages mean that the Coral has stopped responding and the process is being restarted in an attempt to fix it. I have seen this happen when the Coral does not have enough power, but it is probably some hiccup in all the virtualization layers.

I did not have good luck with the USB Coral when passed through using QEMU. I have a feeling that QEMU USB passthrough adds a lot of latency. For a different project I have had much better results with VirtualHere, but I haven't specifically tried it with the USB Coral.

Now I am using the A+E key PCIe Coral with VFIO passthrough on QEMU, with much better results.
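
For anyone wanting to try the same, a rough sketch of binding the PCIe Coral to vfio-pci on the host (the 1ac1:089a ID and the 0000:03:00.0 address are examples; verify both with lspci on your system):

# Locate the Coral; it usually reports as Global Unichip Corp. [1ac1:089a].
lspci -nn | grep -i 1ac1

# Bind it to vfio-pci; if the apex (gasket) driver already claimed it,
# unbind it first.
sudo modprobe vfio-pci
echo 0000:03:00.0 | sudo tee /sys/bus/pci/drivers/apex/unbind
echo "1ac1 089a"  | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id

# Then hand it to the guest, e.g. in the QEMU command line:
#   -device vfio-pci,host=0000:03:00.0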