Object detection for video surveillance

This becomes a larger concern once we start multiplying cameras. As you can see, the CPU load of the image-processing Python process goes up as expected and is quite heavy in spite of using the GPU as well, but the FFmpeg load is what really multiplies. I wonder if there is any way to curb this.

@rafale77, though I don’t use this great project, I have been reading and following this thread out of interest. I too have surveillance software I have been building, and I ran into the same results you found. I thought I was crazy, but I'm glad someone else has seen it.

OpenCV really does use fewer resources than FFmpeg, and even though I had activated hardware acceleration in my FFmpeg process, the results were the same. Here is what I think happens: since I use FFmpeg for my custom-built NVR, hardware acceleration works great there, as both the decoding and the encoding happen on the GPU using h264_vaapi. But when using FFmpeg just to grab frames, I think the GPU benefit is lost, because every decoded frame has to be sent back to system memory. I had the exact same CPU utilisation running FFmpeg with and without the GPU.
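
To illustrate what I mean, here is a minimal sketch of grabbing frames through FFmpeg from Python (the camera URL and the 1280x720 resolution are placeholders): the rawvideo pipe at the end is what forces every decoded frame back into system memory, hwaccel or not.

import subprocess
import numpy as np

WIDTH, HEIGHT = 1280, 720  # assumed stream resolution

# Even with -hwaccel vaapi, piping rawvideo forces every decoded
# frame to be copied from GPU memory back to system memory.
cmd = [
    "ffmpeg",
    "-hwaccel", "vaapi",
    "-hwaccel_device", "/dev/dri/renderD128",
    "-i", "rtsp://camera/stream",   # hypothetical camera URL
    "-f", "rawvideo",
    "-pix_fmt", "bgr24",
    "pipe:1",
]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=10**8)

frame_size = WIDTH * HEIGHT * 3
while True:
    raw = proc.stdout.read(frame_size)
    if len(raw) < frame_size:
        break
    frame = np.frombuffer(raw, dtype=np.uint8).reshape((HEIGHT, WIDTH, 3))
    # ... hand the frame to detection ...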

The one good thing about using FFmpeg compared to OpenCV is that, since FFmpeg runs as a subprocess, it doesn't eat into the resources of the core running your code as much as OpenCV does. That is easily fixed by using multiprocessing, though, which is why I am rewriting the way my frames are captured with OpenCV, as sketched below. Another good reason is rescaling the image on the GPU before retrieving it: FFmpeg does that better in terms of CPU resources.
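
The capture rewrite I mean is something like this minimal sketch (the RTSP URL is hypothetical):

import multiprocessing as mp
import cv2

def capture(url, queue):
    # cv2.VideoCapture runs in its own process here, so decoding no
    # longer competes with the main process for the same core.
    cap = cv2.VideoCapture(url)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if not queue.full():
            queue.put(frame)        # frames arrive as BGR24 numpy arrays
    cap.release()

if __name__ == "__main__":
    frames = mp.Queue(maxsize=2)    # small queue keeps latency low
    worker = mp.Process(target=capture,
                        args=("rtsp://camera/stream", frames),  # hypothetical URL
                        daemon=True)
    worker.start()
    while True:
        frame = frames.get()
        # ... run detection on frame ...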

I am sorry for barging in like this, but I'm just glad someone else has seen what I have been battling with for some days now :smile:

Regards


Thanks for confirming I am not insane… Watsor is a very impressive piece of work and it’s really worth giving it a shot. It is amazing, though, that in spite of using the GPU, FFmpeg actually uses more CPU resources and almost the same GPU resources as the TensorRT part of it. I will run it for some time but will likely build my own component using OpenCV…

From my understanding, OpenCV is using the FFmpeg libraries under the covers anyway. I wonder if it has something to do with conversion between pixel formats. I know that when I was testing, the hardware acceleration available on the RPi did not support YUV420-to-RGB24 pixel format conversion, and that almost negated all the hwaccel benefits. The downside to using OpenCV directly is that your frames will be BGR24 and you will have to convert to RGB24 for TensorFlow, which adds an extra pixel format conversion. If we can figure out why OpenCV is more efficient, we should be able to adjust the ffmpeg params and get those improvements plus the benefits of hwaccel.
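
That extra conversion is just one line per frame, e.g. (hypothetical camera URL):

import cv2

cap = cv2.VideoCapture("rtsp://camera/stream")        # hypothetical URL
ok, frame = cap.read()                                # frame arrives as BGR24
if ok:
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)      # extra conversion for TensorFlow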

@blakeblackshear, thanks for your reply; your Frigate is another project I follow with a lot of interest, even though I don’t use it :wink:. I actually opened your config quite often when I was trying to figure out the ffmpeg settings for RTSP, as I didn't really understand the params.

Anyway, on what you said about the conversion: I use BGR24 in my system as it's heavily built around OpenCV, so I have no use for the RGB24 that TensorFlow needs, and therefore no use for the extra conversion. I still needed to convert from the NV12 used by VAAPI while avoiding doing it on the CPU, and boy did I try different combinations in the filter_complex until I ended up with format=nv12|vaapi,hwupload,hwdownload,format=bgr32,format=bgr24 (when not scaling, anyway).
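
In context, that chain slots into the command roughly like this (sketched from Python with a hypothetical camera URL; -vf is used here, which for a single stream behaves the same as filter_complex):

import subprocess

# hwupload/hwdownload round-trip the frames through the VAAPI device,
# and the trailing format filters land on bgr24 without a CPU-side
# NV12-to-BGR conversion.
subprocess.Popen([
    "ffmpeg",
    "-hwaccel", "vaapi",
    "-hwaccel_device", "/dev/dri/renderD128",
    "-i", "rtsp://camera/stream",   # hypothetical camera URL
    "-vf", "format=nv12|vaapi,hwupload,hwdownload,format=bgr32,format=bgr24",
    "-f", "rawvideo",
    "pipe:1",
], stdout=subprocess.PIPE)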

As to why OpenCV is more efficient, I just see it as the folks at Intel being freaking smart cookies, lol. At least I would expect them to understand it better than anyone, since OpenCV came from them. I even attempted emailing one whose address I saw in one of the source files I was reviewing to understand what is going on, but I haven't gotten a response yet.

Anyway will keep looking until I find something.

Regards

@blakeblackshear, my facial recognition component relies on a Caffe model run by OpenCV for face detection, and in spite of running this pixel conversion from YUV to BGR, the full 3MP conversion is 2-3x more efficient than using FFmpeg directly. I really am not sure why.
I just evaluated the difference in wattage consumption per real-time stream for the video capture alone, using the GPU, on my setup:
OpenCV: 3W / 3MP 20fps stream
FFmpeg: 10W / 1MP 20fps stream, 25W / 3MP
The additional load is on the CPU; the additional GPU decoder load is about the same.

Video + processing:
Face detection + face recognition at 3MP 20fps, using OpenCV image processing + Caffe DNN model + Dlib encoding + scikit-learn linear classifier: 8W/stream
Watsor person detection at 1MP (downscaled) 20fps: 20W/stream. At the full 3MP resolution the consumption goes to ~50W/stream.
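
For reference, the detection side of that pipeline looks roughly like the sketch below. The model file names are hypothetical placeholders (any ResNet-SSD Caffe face detector and the standard dlib models would do), and the classifier is assumed to be a scikit-learn linear model pre-fitted on known face encodings; this illustrates the approach, not my exact code.

import cv2
import dlib
import numpy as np

# Hypothetical model files: a Caffe ResNet-SSD face detector plus the
# standard dlib landmark and encoding models.
detector = cv2.dnn.readNetFromCaffe("deploy.prototxt",
                                    "res10_300x300_ssd.caffemodel")
shaper = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat")
encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

def recognize(frame_bgr, classifier):
    """Detect faces (Caffe/OpenCV), encode them (dlib), classify (scikit-learn)."""
    h, w = frame_bgr.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame_bgr, (300, 300)),
                                 1.0, (300, 300), (104.0, 177.0, 123.0))
    detector.setInput(blob)
    detections = detector.forward()
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)   # dlib expects RGB
    names = []
    for i in range(detections.shape[2]):
        if detections[0, 0, i, 2] < 0.5:               # confidence threshold
            continue
        x1, y1, x2, y2 = (detections[0, 0, i, 3:7] *
                          np.array([w, h, w, h])).astype(int)
        shape = shaper(rgb, dlib.rectangle(x1, y1, x2, y2))
        vec = np.array(encoder.compute_face_descriptor(rgb, shape))
        names.append(classifier.predict([vec])[0])     # pre-fitted linear classifier
    return names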

Further observations:
As you know, I have rewritten the FFmpeg camera component to use OpenCV instead of FFmpeg directly, so I thought I would compare the CPU load by swapping my code with the original code; in both cases the GPU is being used.
The CPU load on my 3MP camera increases by 32% when I open the MJPEG stream in the HA UI with FFmpeg and by 4% when I use OpenCV, and I verified that both use nvdec. There is something else going on here, as the output in both cases should be the same. I suspect that FFmpeg is re-encoding the stream, and I don’t know how to prevent it from doing that.

Edit: Not meaning to beat a dead horse, but here is another piece of data:

I moved 2 of the cameras to my own FFmpeg camera component (which uses OpenCV) within Home Assistant at full size; you can compare to the other 2 cameras still on Watsor/FFmpeg.
The increase in CPU load for Home Assistant is… 7%, i.e. ~3.5% per stream, while FFmpeg required ~40% per stream at full size and 15% downscaled. GPU loading has not budged.

Edit 2: I moved all 4 camera streams I had on Watsor to OpenCV within Home Assistant, and my CPU load has indeed dropped from 15%/stream to 3.5%/stream. I have also rewritten the OpenCV component in HA to run object detection using a TensorFlow MobileNet v3 SSD model available for OpenCV. Unfortunately there are still some bugs with the latest CUDA and cuDNN causing Home Assistant to crash with a GPU memory allocation error, so I moved the model to run on the CPU, and to my surprise the CPU load is about the same as when I was running Watsor inference. Result:
Watsor: 40% CPU + 40W GPU for inference, 60% CPU + 20W GPU for the FFmpeg stream (downscaled).
Now on OpenCV: 40% CPU + 0W GPU for inference, 14% CPU + 20W GPU for the video stream (full frame).
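
For anyone curious, the OpenCV DNN side looks roughly like this (file names are hypothetical; the CUDA lines are the ones that triggered the memory-allocation crash for me, hence the CPU target):

import cv2

# Hypothetical file names; OpenCV's DNN module loads TensorFlow
# MobileNet-SSD graphs from a frozen .pb plus a .pbtxt description.
net = cv2.dnn.readNetFromTensorflow("frozen_inference_graph.pb",
                                    "ssd_mobilenet_v3.pbtxt")

# The CUDA backend is what crashed with the memory-allocation error,
# so fall back to the (surprisingly competitive) CPU target:
# net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
# net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

def detect(frame_bgr, conf=0.5):
    blob = cv2.dnn.blobFromImage(frame_bgr, size=(320, 320), swapRB=True)
    net.setInput(blob)
    out = net.forward()
    # Each row: [image_id, class_id, confidence, x1, y1, x2, y2] (relative coords)
    return [row for row in out[0, 0] if row[2] > conf]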

Hello @rafale77,

I really like the results you are giving, and I wish we could discuss this more. But so as not to hijack this thread, is there any other place we can continue?

Regards

Hi,

Can someone post a working config.yaml for a Pi 4 with a Coral USB?

Thanks

Hi @Eeeeeediot,

this is what works for my two Foscams; my two Linksys cameras don’t work with HW acceleration enabled, though:

# Optional HTTP server configuration and authentication.
# http:
  # port: 8080
  # username: !env_var "USERNAME john"
  # password: !env_var "PASSWORD qwerty"


# Optional MQTT client configuration and authentication.
mqtt:
  host: 192.168.7.10
  port: 1883
  username: mqtt_username
  password: ''


# Default FFmpeg arguments for decoding video stream before detection and encoding back afterwards.
# Optional, can be overwritten per camera.
ffmpeg:
  decoder:
    - -hide_banner
    - -loglevel
    -  error
    - -nostdin
    - -fflags
    -  nobuffer
    - -flags
    -  low_delay
    - -fflags
    -  +genpts+discardcorrupt
    - -c:v
    -  h264_mmal
    - -i                          # camera input field will follow '-i' ffmpeg argument automatically
    - -filter:v
    -  fps=fps=10,scale=-1:480
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24

#  encoder:                        # Encoder is optional, remove the entire list to disable.



# Detect the following labels of the object detection model.
# Optional, can be overwritten per camera.
detect:
  - person:
      area: 10                    # Minimum area of the bounding box an object should have in
#                                  # order to be detected. Defaults to 10% of entire video resolution.
      confidence: 50              # Confidence threshold that a detection is what it's guessed to be,
#                                  # otherwise it's ruled out. 50% if not set.
  - car:
      area: 20
      confidence: 50

# List of cameras and their configurations.
cameras:
# WORKING
  - outdoorcam:                   # Camera name
      width: 640                 #
      height: 480                 # Video feed resolution in pixels
      input: !ENV "rtsp://user:[email protected]:88/videoMain"
#      mask: /etc/watsor/outdoorcam.png

Thanks! I got it working, but I'm finding it difficult to tailor it to my needs as there is so much of it I don’t understand.

Hi @asmirnou! Great work on the Addon. I have a few questions if you don’t mind.

  1. Are there any recommended FFmpeg decoder/encoder settings for an RPi4 + Coral USB? Even after reading up on FFmpeg arguments my understanding of what they do is very poor, and it would be helpful to have recommended settings for such a popular setup.

  2. Moreover, I am experiencing high CPU usage even though I have a Coral attached. My usual CPU usage is ~10%. What can I do to get it back down to that level?

  3. I am unable to record movement for some reason, and I believe the following logs, which I receive every few seconds, are linked to the reason why - any suggestions?
watchdog         WatchDog                 WARNING : Thread front_door (FFmpegEncoder) is not alive, restarting...
watchdog         WatchDog                 WARNING : Thread garden (FFmpegEncoder) is not alive, restarting...
front_door       FFmpegEncoder            INFO    : /config/www/watsor/front_door.mp4: Read-only file system
garden           FFmpegEncoder            INFO    : /config/www/watsor/garden.mp4: Read-only file system

Run a “ps aux” command from the shell and you will see a breakdown. I suspect a very large load is due to the FFmpeg streams.

I’m on Hass.io so when I tried the command it didn’t work.

I thought the Coral USB was meant to be utilised so that the CPU didn’t get used?

The Coral doesn’t help with the video decoding, which is where I found FFmpeg to be very resource-hungry.

Hello,

I installed the addon successfully and created a file /config/watsor/config.yaml:

# Optional HTTP server configuration and authentication.
http:
  port: 8080
  # username: !env_var "USERNAME john"
  # password: !env_var "PASSWORD qwerty"


# Optional MQTT client configuration and authentication.
mqtt:
  host: localhost
  # port: 1883
  # username: !secret mqtt_username
  # password: !secret mqtt_password


# Default FFmpeg arguments for decoding video stream before detection and encoding back afterwards.
# Optional, can be overwritten per camera.
ffmpeg:
  decoder:
    - -hide_banner
    - -loglevel
    -  error
    - -nostdin
    - -hwaccel                   # These options enable hardware acceleration of
    -  vaapi                     # video de/encoding. You need to check what methods
    - -hwaccel_device            # (if any) are supported by running the command:
    -  /dev/dri/renderD128       #    ffmpeg -hwaccels
    - -hwaccel_output_format     # Then refer to the documentation of the method
    -  yuv420p                   # to enable it in ffmpeg. Remove if not sure.
    - -fflags
    -  nobuffer
    - -flags
    -  low_delay
    - -fflags
    -  +genpts+discardcorrupt
    - -i                          # camera input field will follow '-i' ffmpeg argument automatically
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24
  encoder:                        # Encoder is optional, remove the entire list to disable.
    - -hide_banner
    - -loglevel
    -  error
    - -hwaccel
    -  vaapi
    - -hwaccel_device
    -  /dev/dri/renderD128
    - -hwaccel_output_format
    -  yuv420p
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24
    - -i                          # detection output stream will follow '-i' ffmpeg argument automatically
    - -an
    - -f
    -  mpegts
    - -vcodec
    -  libx264
    - -pix_fmt
    -  yuv420p
    - -vf
    - "drawtext='text=%{localtime\\:%c}': x=w-tw-lh: y=h-2*lh: fontcolor=white: box=1: [email protected]"


# Detect the following labels of the object detection model.
# Optional, can be overwritten per camera.
detect:
  - person:
      area: 20                    # Minimum area of the bounding box an object should have in
                                  # order to be detected. Defaults to 10% of entire video resolution.
      confidence: 60              # Confidence threshold that a detection is what it's guessed to be,
                                  # otherwise it's ruled out. 50% if not set.
  - car:
      zones: [1, 3, 5]            # Limit the zones on mask image, where detection is allowed.
                                  # If not set or empty, all zones are allowed.
                                  # Run `zones.py -m mask.png` to figure out a zone number.
  - truck:


# List of cameras and their configurations.
cameras:
  - voordeur:                        # Camera name
      width: 2560                  #
      height: 1440                 # Video feed resolution in pixels

      input: !ENV "rtsp://admin:[email protected]:554/Streaming/Channels/101/"

      #mask: porch.png             # Optional mask. Must be the same size as your video feed.

      detect:                     # The values below override
        - person:                 # detection defaults for just
        - car:                    # this camera

#  - backyard:                     # Camera name
#      width: 640                  #
#      height: 480                 # Video feed resolution in pixels

#      input: !ENV "rtsp://${RTSP_USERNAME}:${RTSP_PASSWORD}@192.168.0.20:554/cam/realmonitor?channel=1&subtype=2"
#      output: !ENV "${HOME}/Videos/backyard.mp4"

      ffmpeg:                     # These values override FFmpeg defaults
        decoder:                  # for just this camera
          - -hide_banner
          - -loglevel
          -  error
          - -nostdin
          - -hwaccel
          -  vaapi
          - -hwaccel_device
          -  /dev/dri/renderD128
          - -hwaccel_output_format
          -  yuv420p
          - -i                    # camera input field will follow '-i' ffmpeg argument automatically
          - -filter:v
          -  fps=fps=15
          - -f
          -  rawvideo
          - -pix_fmt
          -  rgb24
        encoder:
          - -hide_banner
          - -loglevel
          -  error
          - -hwaccel
          -  vaapi
          - -hwaccel_device
          -  /dev/dri/renderD128
          - -hwaccel_output_format
          -  yuv420p
          - -f
          -  rawvideo
          - -pix_fmt
          -  rgb24
          - -i                    # detection output stream will follow '-i' ffmpeg argument automatically
          - -an
          - -f
          -  mp4
          - -vcodec
          -  libx264
          - -pix_fmt
          -  yuv420p
          - -vf
          - "drawtext='text=%{localtime\\:%c}': x=w-tw-lh: y=h-2*lh: fontcolor=white: box=1: [email protected]"
          - -y

But it refuses to start, with no error in the log. I do get a strange message when editing the file, though, even though I am sure the string is correct (it works in HA and via VLC).
error:
unknown tag !<!ENV> at line 88, column 83:
… .20:554/Streaming/Channels/101/"
^

This happened to me too, and it was because I had DOODS running as another addon and it uses the same port.

Sorry, forgot to mention I tried several ports: 8090, 8092, etc. No difference.

Try removing the !ENV from this line:

input: !ENV "rtsp://admin:[email protected]:554/Streaming/Channels/101/"
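
For background, that unknown tag !<!ENV> message is what a plain YAML loader prints for any tag it has no constructor registered for. An application that supports the tag has to register one, roughly like the sketch below (an illustration of the mechanism, not necessarily Watsor's exact implementation):

import os
import re
import yaml

# A plain yaml.safe_load() has no constructor for !ENV, hence
# "unknown tag". An application that supports it registers one:
def env_constructor(loader, node):
    value = loader.construct_scalar(node)
    # substitute ${VAR} references with environment variables
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: os.environ.get(m.group(1), ""), value)

yaml.SafeLoader.add_constructor("!ENV", env_constructor)
config = yaml.safe_load(open("/config/watsor/config.yaml"))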

same error… no difference :frowning:

Any other tips?