Object detection for video surveillance

So I am paying the price of living on the bleeding edge with CUDA… I can’t launch it and am getting a CUDA initialization failure running CUDA 11.0, cuDNN 8.0.1 and TensorRT 7.1.3.4. I was able to compile and run OpenCV, but not Watsor…

Will try in the next few weeks!

If VAAPI doesn’t work for you, use hardware acceleration on a Tesla K80:

          ...
          - -c:v
          -  h264_cuvid
          - -i 
          ...

or

          ...
          - -hwaccel
          -  cuvid
          - -i 
          ...

The GPU will be responsible for both decoding and detection, which is less efficient than using a dedicated hardware-accelerated decoder, but better than no acceleration at all.
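To check whether your FFmpeg build includes those decoders, running ffmpeg -decoders | grep cuvid should list h264_cuvid among others.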

Managed to get mine up and running, but am struggling with an RTSP UDP stream. I keep getting TCP connection refusals in spite of having added rtsp_transport udp. Seems to be more of a problem with ffmpeg, but… I have no problem running ffmpeg on that stream…
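(For reference, something along the lines of ffmpeg -rtsp_transport udp -i rtsp://<camera>/stream -f null - runs fine from the command line, with my actual camera URL in place of the placeholder.)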

edit: streaming directly from the camera instead of through a proxy apparently fixed it.
Under metrics, I am seeing an inference time of 5.0. Is that in seconds?

Milliseconds; it’s the time spent by the neural network processing one frame.
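At 5 ms per frame, that puts the theoretical ceiling of the detector at around 1000 / 5 = 200 frames per second.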


Just installed the add-on in my supervised install of HA.
I’m wondering how to run the command to detect what hardware accelerators I have available. In the comments of the sample config you said # ffmpeg -hwaccels.
Thing is, as I understand it, supervised HA has its own ffmpeg (maybe I misunderstood that), but either way I don’t know where to run that command.
Any advice?

Thanks

Thanks asmirnou, I had some time today to get back into this. Whenever I refer to any form of hardware acceleration in the config, the system stops responding. Removing all hardware references from the config:

  decoder:
    - -hide_banner
    - -loglevel
    -  error
    - -nostdin
    - -fflags
    -  nobuffer
    - -flags
    -  low_delay
    - -fflags
    -  +genpts+discardcorrupt
    - -i                          # camera input field will follow '-i' ffmpeg argument automatically
    - -filter:v
    -  fps=fps=10,scale=-1:360
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24


  encoder:                        # Encoder is optional, remove the entire list to disable.
    - -hide_banner
    - -loglevel
    -  error
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24
    - -i
It works, but gives errors:


watchdog         WatchDog                 WARNING : Thread def1 (FFmpegEncoder) is not alive, restarting...
def              FFmpegEncoder            INFO    : pipe:: Invalid data found when processing input
unifi            FFmpegEncoder            INFO    : pipe:: Invalid data found when processing input
def1             FFmpegEncoder            INFO    : pipe:: Invalid data found when processing input
unifi            FFmpegDecoder            INFO    : [h264 @ 0x56212a58a8c0] error while decoding MB 34 32, bytestream -30
def1             FFmpegDecoder            INFO    : [h264 @ 0x55f1dd05dc20] error while decoding MB 24 42, bytestream -29
Thread-244       werkzeug                 INFO    : 192.168.1.125 - - [22/Jul/2020 13:25:09] "GET /metrics HTTP/1.1" 200 -
unifi            FFmpegDecoder            INFO    : [h264 @ 0x56212a58a8c0] error while decoding MB 60 32, bytestream -16
unifi            FFmpegDecoder            INFO    : [h264 @ 0x56212a58a8c0] error while decoding MB 40 32, bytestream -26
Thread-245       werkzeug                 INFO    : 192.168.1.125 - - [22/Jul/2020 13:25:14] "GET /metrics HTTP/1.1" 200 -
unifi            FFmpegDecoder            INFO    : [h264 @ 0x56212a58a8c0] error while decoding MB 82 29, bytestream -13
unifi            FFmpegDecoder            INFO    : [h264 @ 0x56212a58a8c0] error while decoding MB 33 30, bytestream -2
watchdog         WatchDog                 WARNING : Thread unifi (FFmpegEncoder) is not alive, restarting...
watchdog         WatchDog                 WARNING : Thread def (FFmpegEncoder) is not alive, restarting...
watchdog         WatchDog                 WARNING : Thread def1 (FFmpegEncoder) is not alive, restarting...
unifi            FFmpegEncoder            INFO    : pipe:: Invalid data found when processing input
def              FFmpegEncoder            INFO    : pipe:: Invalid data found when processing input
def1             FFmpegEncoder            INFO    : pipe:: Invalid data found when processing input
unifi            FFmpegDecoder            INFO    : [h264 @ 0x56212a58a8c0] error while decoding MB 18 30, bytestream -29
Thread-246       werkzeug                 INFO    : 192.168.1.125 - - [22/Jul/2020 13:25:20] "GET /metrics HTTP/1.1" 200 -
unifi            FFmpegDecoder            INFO    : [h264 @ 0x56212a58a8c0] error while decoding MB 8 30, bytestream -14

But yet my metrics read that the GPU is being used:



So I think the GPU is being used, but not very much :thinking: Any ideas? Thanks, Andy.

Hi,

I’ve installed Watsor via Hass’s add-on. My Hass is running on Ubuntu, on an Intel NUC i3 with 8 GB RAM.

http:
  port: 8080
ffmpeg:
  decoder:
    - -hide_banner
    - -loglevel
    -  error
    - -nostdin
    - -hwaccel                   # These options enable hardware acceleration of
    -  vaapi                     # video de/encoding. You need to check what methods
    - -hwaccel_device            # (if any) are supported by running the command:
    -  /dev/dri/renderD128       #    ffmpeg -hwaccels
    - -hwaccel_output_format     # Then refer to the documentation of the method
    -  yuv420p                   # to enable it in ffmpeg. Remove if not sure.
    - -fflags
    -  nobuffer
    - -flags
    -  low_delay
    - -fflags
    -  +genpts+discardcorrupt
    - -i                          # camera input field will follow '-i' ffmpeg argument automatically
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24
  encoder:                        # Encoder is optional, remove the entire list to disable.
    - -hide_banner
    - -loglevel
    -  error
    - -hwaccel
    -  vaapi
    - -hwaccel_device
    -  /dev/dri/renderD128
    - -hwaccel_output_format
    -  yuv420p
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24
    - -i                          # detection output stream will follow '-i' ffmpeg argument automatically
    - -an
    - -f
    -  mpegts
    - -vcodec
    -  libx264
    - -pix_fmt
    -  yuv420p
    - -vf
    - "drawtext='text=%{localtime\\:%c}': x=w-tw-lh: y=h-2*lh: fontcolor=white: box=1: [email protected]"
detect:
  - person:
      area: 10                   # Minimum area of the bounding box an object should have in
                                  # order to be detected. Defaults to 10% of entire video resolution.
      confidence: 60              # Confidence threshold that a detection is what it's guessed to be,
                                  # otherwise it's ruled out. 50% if not set.
  - car:
      area: 15                    # Minimum area of the bounding box an object should have in
                                  # order to be detected. Defaults to 10% of entire video resolution.
      confidence: 60   
cameras:
  - porch:                        # Camera name
      width: 640                  #
      height: 480                 # Video feed resolution in pixels

      input: !ENV "rtsp://admin:password@<camera-ip>:554/Streaming/Channels/102"
      detect:                     # The values below override
        - person:                 # detection defaults for just
        - car:                    # this camera

I did not see any errors in Watsor’s log. I’ve set up my Hikvision IP cam substream to 640x480 at 6 fps, and I’ve played with multiple frame rates ranging from 4 to 12. Here is the metrics output:

{
    "cameras": [
        {
            "name": "porch",
            "fps": {
                "decoder": 6.1,
                "sieve": 6.1,
                "visual_effects": 6.1,
                "snapshot": 6.1,
                "encoder": 6.1
            },
            "buffer_in": 10,
            "buffer_out": 0
        }
    ],
    "detectors": [
        {
            "name": "CPU",
            "fps": 6.1,
            "fps_max": 11,
            "inference_time": 94.5
        }
    ]
}

htop shows some processing being done, perhaps by Watsor.

http://192.168.1.3:8080/snapshot/porch/person yields nothing, even when I walk in front of the cam.

If I browse http://192.168.1.3:8080/video/mjpeg/porch and walk in front of the camera, it should produce my video with a bounding box, no?

Any tips? Thanks.

@asmirnou, I am also finding some oddities with ffmpeg, and I am pretty sure they are related more to ffmpeg itself than to Watsor. Have you thought about using OpenCV (which also uses ffmpeg) to manage the video streams? It seems a little more efficient and has a Python API. That’s what I ended up using in my facial recognition component; it made my Python code very simple, only a dozen lines, and also somehow significantly reduced my CPU utilization. For one video stream this is what I am observing:
ffmpeg with CPU: 30%
ffmpeg with GPU: 40%
OpenCV with GPU: 6%

This is my camera code to maintain the video stream:
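In essence it is just a small OpenCV capture loop, something like this simplified sketch (the URL is a placeholder, and error handling is trimmed down):

    import cv2

    URL = "rtsp://camera/stream"                 # placeholder, not my real camera

    cap = cv2.VideoCapture(URL, cv2.CAP_FFMPEG)  # force the FFmpeg backend
    while True:
        ok, frame = cap.read()                   # frame is a BGR numpy array
        if not ok:                               # stream dropped: reconnect
            cap.release()
            cap = cv2.VideoCapture(URL, cv2.CAP_FFMPEG)
            continue
        # hand the frame over to the detector here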

That paragraph in the documentation describes a different type of deployment, where the command line is available on the machine. In Hassio that’s difficult, because a lot is hidden behind the Supervisor; the only way I know of to get inside is through the SSH add-on. Then one needs to exec into the Docker container where Watsor is running in order to run ffmpeg -hwaccels.
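Something like this should do it (the exact container name varies, the first command shows it):

    docker ps | grep -i watsor
    docker exec -it <container-name> ffmpeg -hwaccels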

What machine are you running HA on? Maybe someone already set it up.

Getting the camera to work efficiently in FFmpeg is the hardest part of the whole setup. Unfortunately, due to the variety of devices, formats and hardware, it is difficult to advise without seeing…

A good practice is to get FFmpeg working with the camera outside of the Docker container by running FFmpeg from the command line, and then move the options into the application config.
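For example, once a command along these lines decodes the stream cleanly (the URL is an example):

    ffmpeg -hide_banner -loglevel error -i rtsp://camera/stream -f null -

each option can be moved into the decoder list, one token per line, replacing -f null - with the rawvideo output Watsor expects.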

Right. The only thing I can think of is that your camera is mounted high and a person walking in front of it occupies less than 10% of the image. Set the area threshold to zero; will anything show up?

      detect:
        - person:
            area: 0
        - car:
            area: 0
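For a sense of scale: at 640x480 the default area of 10% requires the bounding box to cover about 0.1 x 640 x 480 ≈ 30,700 pixels, roughly a 175x175 square, which a distant person hardly ever reaches.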

I hardly believe that OpenCV can be so much more efficient than FFmpeg. Decoding is not performed by either of them anyway, but by the avcodec library. Most probably hardware acceleration wasn’t enabled in FFmpeg, or the input conditions were different.

I am scratching my head too, to be honest, but running FFmpeg stand-alone with the same command gets me the same result. I can’t explain it either: indeed OpenCV uses FFmpeg and the avcodec library, and I did verify from the nvidia-smi output that hardware acceleration is enabled. The CPU usage is significantly higher, even higher than using CPU decoding alone… it is strange. I see the exact opposite using OpenCV on the same camera:
OpenCV CPU: +20%
OpenCV GPU: +6%
FFmpeg CPU: +30%
FFmpeg GPU: +40%

The output is a bit different with FFmpeg, though: the RGB24 conversion, it appears…
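If that is the culprit, the numbers would add up: piping rawvideo RGB24 at, say, 1280x720 and 10 fps means 1280 x 720 x 3 x 10 ≈ 27.6 MB/s being converted and pushed through a pipe on the CPU, regardless of where the decoding itself happened.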

I’ve reduced both person and car area to 0 and even reduced confidence to 10. That car in the picture wasn’t detected. A few minutes ago I saw a car pass the camera, but none were detected. Hmm.

Could it be detected as object type "truck"?

Due to the absence of any hardware accelerator, detection falls back to the CPU. To minimize the load (as the CPU is not suited for that kind of mathematical task), the lightest possible model is used, trained on a smaller dataset. That model doesn’t recognize the yellow van in your example, nor anything else in that image. The bigger models I tried managed to detect several objects with very low confidence, ~10%; the van was detected with 17% certainty. This is quite unusual, as most often any model, even the lightest one, provides decent accuracy.

Try rotating the camera slightly to get a better angle.

(example snapshots: car, person)

Hi, managed to get it working for a while… I changed the area to 1 for both car and person. For pictures it can capture objects, but in the video the bounding box was not permanent: it appeared only briefly and reappeared at random. Then it stopped working, and I had to restart Watsor for it to start working again. After successfully detecting some objects it stopped working again and needed another restart.

I’ll try reducing the source video to 640x480 at 2 fps and see what happens.

@asmirnou

Just to show what I am talking about, this is a screenshot of the ps aux output on the machine I am running Home Assistant on:

You can see that Watsor created 4 processes, and I am using the GPU-driven version. Looking at the FFmpeg processes, they drive more CPU utilization even though I already downsized the video stream to <1 MP, down from 3 MP. For reference, I am comparing with the home-assistant process, which processes a 3 MP video stream through the GPU using OpenCV with the same codec, and also does facial recognition on the CPU (facial detection is done by the GPU). Without the video processing, HA typically sits around 5-6% CPU utilization.

And you can see below that FFmpeg and Watsor are indeed using the GPU.

As I said earlier, if I do not downsize the video (which is an FFmpeg filter), the CPU utilization shoots to 40%, so I don’t think it is the decoding but the output processing that uses the CPU and causes the high load.

edit: with the video downsizing, using CPU decoding doubles FFmpeg’s CPU utilization to 30%, about the same as without the downsizing. Not sure what is going on.

FFmpeg CPU decoding with or without downsizing: CPU ~30%
FFmpeg GPU decoding with downsizing: CPU ~15%
FFmpeg GPU decoding without downsizing: CPU ~40%

openCV with CPU decoding without downsizing: CPU ~20%
openCV with GPU decoding without downsizing: CPU ~6%

This becomes a larger concern once we start multiplying cameras: as you can see, the image-processing Python process goes up as expected and is quite hard on the CPU in spite of using the GPU as well, but FFmpeg is really multiplying. I wonder if there is any way to curb this.