Jeonghun (James) Lee: EVM-Jetson AGX Xavier

8/02/2019

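DeepStream SDK 4.0: Changes, Plugin Structure, and How to Create a Plugin (GStreamer Changes)

1. Changes in DeepStream SDK 4.0 (GStreamer changes)

Much of DeepStream SDK 4.0 is incompatible with SDK 3.0. To get up to speed quickly, I went through the Plugin Manual and the sources to find out what changed.

Many GStreamer commands that worked under DeepStream SDK 3.0 no longer work.

Full DeepStream documentation (must read)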

https://docs.nvidia.com/metropolis/index.html

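Of the documents listed there, the following three are probably the ones I will consult most.

DeepStream Release Notes

Check the changes from the previous version, the differences between x86 and Jetson, and the new features.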
https://docs.nvidia.com/metropolis/deepstream/4.0/DeepStream_4.0_Release_Notes.pdf

DeepStream Quick Guide: basic usage

Installation is easy with sdkmanager. This guide gives a concise, easy-to-read summary of development and related topics.
https://docs.nvidia.com/metropolis/deepstream/4.0/dev-guide/index.html

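Plugin Manual and DeepStream API for DeepStream development

When developing against DeepStream you need to know what each plugin does and offers, as well as the APIs used internally. These are the documents that cover that, so treat them as required reading.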
https://docs.nvidia.com/metropolis/deepstream/4.0/DeepStream_Plugin_Manual.pdf
https://docs.nvidia.com/metropolis/deepstream/4.0/dev-guide/DeepStream_Development_Guide/baggage/index.html

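1.1 INT8 on Jetson AGX Xavier

Unlike the other Jetson boards, the Xavier has a DLA (Deep Learning Accelerator), which provides INT8 inference.
It also offers OpenVX functionality, though I still need to find out whether that carries over to OpenCV.
None of this is new in DeepStream 4.0; it reportedly existed before. Since this is my first time using a Xavier, I summarize it briefly here. Jetson Nano and TX2 are excluded from this part.

INT8 inference reference (supported on Jetson AGX Xavier)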

http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf

INT8 requires a GPU of SM 61 (compute capability 6.1) or later, and the GPU must support the operation shown below.

SM61 reference

https://devtalk.nvidia.com/default/topic/1026069/jetson-tx2/how-to-use-int8-inference-with-jetsontx2-tensorrt-2-1-2/

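The goal of INT8 inference is faster computation than FP32/FP16 by converting to INT8, without a significant loss of accuracy.

Need for the bias (my guess: it becomes unnecessary once values are treated as relative rather than absolute)

However, as shown below, the overall bias drops out of the product of the two terms, and this part confuses me somewhat.
I need to look into it further.

The FP32 bias below is said to be removed because it is unnecessary.

Quantization (values are scaled proportionally to fit INT8)

When quantizing as shown below, a threshold is set to control saturation.

Accuracy comparison of INT8 inference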

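To summarize the INT8 document above as I understand it: the weights are kept as they are, the bias is removed, and something like a hash table is built
so that FP32 values are mapped to INT8 through the table (quantization).
The bias seems to become unnecessary because values are treated as relative rather than absolute, and after reading the document carefully I do not expect much data loss (my guess).

The interesting part is accuracy during quantization: a threshold can be set to control saturation.
Values beyond the threshold that are not needed are clipped away.
When exactly this is needed is something I will look into later.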
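
To make the threshold and saturation idea concrete, here is a minimal Python sketch. It assumes the symmetric linear mapping that I understand the slides to describe: a single scale factor of 127 / threshold, no bias term, and anything beyond the threshold clamped to ±127. The names quantize_int8 and dequantize_int8 and the sample values are hypothetical; this is not the TensorRT implementation or its API.

```python
def quantize_int8(values, threshold):
    """Map floats to INT8 codes with a symmetric scale; |x| > threshold saturates."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    scale = 127.0 / threshold            # one scale factor, no bias/offset term
    codes = []
    for x in values:
        q = round(x * scale)             # nearest integer code
        q = max(-127, min(127, q))       # saturate everything beyond the threshold
        codes.append(q)
    return codes


def dequantize_int8(codes, threshold):
    """Approximate inverse: turn INT8 codes back into floats."""
    scale = 127.0 / threshold
    return [q / scale for q in codes]


if __name__ == "__main__":
    activations = [0.0, 1.0, -2.5, 4.0, 10.0, -100.0]   # hypothetical FP32 values
    codes = quantize_int8(activations, threshold=4.0)
    print(codes)
    print(dequantize_int8(codes, threshold=4.0))
```

With a threshold of 4.0, the inputs 10.0 and -100.0 both saturate, while values inside the range keep a resolution of roughly threshold / 127. A smaller threshold gives finer resolution for typical values at the cost of clipping more outliers.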

I do not know whether a hash table is actually used for the mapping during quantization, but I once implemented something similar
with a hash table, so that is what I would have used.

INT8 inference is quite an interesting feature, and I have become very interested in it.
It is just a pity that I could only read the slides and not hear the talk itself.

INT8 inference settings to check in the config file

model-engine-file : TensorRT model engine (serialized, INT8)
int8-calib-file : calibration table file, labeled with the TensorRT version it belongs to
network-mode : 0=FP32, 1=INT8, 2=FP16; when no model engine exists yet, the engine is generated according to this mode

Config file example for Jetson AGX Xavier

model-engine-file=model_b1_int8.engine
int8-calib-file=yolov3-calibration.table.trt5.1
int8-calib-file=../../models/Primary_Detector/cal_trt4.bin
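
The relationship between these three keys can be checked with a few lines of Python. This is only a sketch: it assumes the keys live in a [property] group, that a missing network-mode means FP32, and that each key appears once (Python's configparser rejects duplicate keys). The function name check_precision and the sample text are made up for the example.

```python
import configparser

NETWORK_MODES = {0: "FP32", 1: "INT8", 2: "FP16"}    # values as listed above


def check_precision(config_text):
    """Return (mode_name, problems) for an nvinfer-style config string."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    props = parser["property"]                        # assumed group name
    mode = NETWORK_MODES[props.getint("network-mode", fallback=0)]
    problems = []
    has_engine = "model-engine-file" in props
    if mode == "INT8" and not has_engine and "int8-calib-file" not in props:
        # Without a prebuilt engine, INT8 needs the calibration table to build one.
        problems.append("INT8 without an engine file needs int8-calib-file")
    return mode, problems


SAMPLE = """
[property]
model-engine-file=model_b1_int8.engine
int8-calib-file=yolov3-calibration.table.trt5.1
network-mode=1
"""

if __name__ == "__main__":
    print(check_precision(SAMPLE))
```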

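1.2 DeepStream SDK 3.0 vs. 4.0

Moving from 3.0 to 4.0 added many features but also broke compatibility in many places, so the differences are worth writing down.
I had built and ported various things on DS 3.0. DS 4.0 seems to support much of that, but all of it needs to be retested.

According to NVIDIA's documentation, the biggest change is the unification of the Jetson and dGPU platforms. In short, this should mean better performance through optimization.
For the details I will have to go through the Plugin Manual.

In addition, NGC Docker containers, previously available only on x86, are now supported on ARM as well, which should make setting up an environment easier.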

Gst-nvinfer changes

New interface for custom models beyond UFF/ONNX/Caffe (TensorRT IPlugin)
Segmentation and grayscale model support
FP16 / INT8 DLA (Jetson Xavier) support (previously INT8 only)
Source code provided (to be analyzed later)

New plugins

H.265 and H.264 encode and decode based on Gst-V4L2 (changed from before)
JPEG and MJPEG decoder support
gst-nvvideoconvert (extends the earlier gst-nvconv)
gst-nvof (optical flow)
nvofvisual / nvsegvisual are said to be supported; I need to test them through configuration.
A dewarper is also provided, and gst-msgbroker has been extended considerably.
Beyond that, the names of the existing plugins are not backward compatible.

See the Release Notes below for details.
https://docs.nvidia.com/metropolis/deepstream/4.0/DeepStream_4.0_Release_Notes.pdf

1.3 Plugins in DeepStream 4.0

DS 4.0 Plugin Manual
https://docs.nvidia.com/metropolis/deepstream/4.0/DeepStream_Plugin_Manual.pdf

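Because gst-inspect did not work for me as it had before, at first I could rely only on the manual above for the details. The cause turned out to be the DISPLAY setting described below.

Inspecting elements with gst-inspect

It works fine in a local terminal, but gst-inspect runs into trouble when connected over SSH.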

$ ssh  nvidia@192.168.55.1  
$ echo $DISPLAY  # not set

$ gst-inspect-1.0 -a   # lists every plugin
.......
$ gst-inspect-1.0 -a |  grep dsexample     
.....
$ gst-inspect-1.0 dsexample    
Factory Details:
  Rank                     primary (256)
  Long-name                DsExample plugin
  Klass                    DsExample Plugin
......
GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstBaseTransform
                         +----GstDsExample


$ ssh -X  nvidia@192.168.55.1         # with X11 forwarding

$ echo $DISPLAY   # now set, unlike above, and this is what causes the malfunction
localhost:10.0

$ export DISPLAY=:1   # use :1 or :0; the value must start with a colon (=:)

$ gst-inspect-1.0 dsexample    
Factory Details:
  Rank                     primary (256)
  Long-name                DsExample plugin
  Klass                    DsExample Plugin
......
GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstBaseTransform
                         +----GstDsExample

Gstreamer gst-inspect
https://gstreamer.freedesktop.org/documentation/tools/gst-inspect.html?gi-language=c#

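Caveat when using gst-inspect over SSH

When connecting with SSH X forwarding (ssh -X), the commands above do not work unless DISPLAY is set as shown above.

I did not know the cause, but NVIDIA pinpointed it, and that resolved the issue.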
https://devtalk.nvidia.com/default/topic/1058525/deepstream-sdk/gst-inspect-is-not-work-properly-in-ds4-0/post/5368380/#5368380
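
The DISPLAY check above is easy to automate before calling gst-inspect from a script. The sketch below is a simple heuristic of my own, not anything provided by NVIDIA or GStreamer: it treats a value with a host part before the colon (such as localhost:10.0) as SSH-forwarded and a value starting with a colon (such as :0 or :1) as local. The function name display_kind is hypothetical.

```python
import os


def display_kind(value):
    """Classify a DISPLAY value as 'unset', 'forwarded', or 'local'."""
    if not value:
        return "unset"
    host = value.split(":", 1)[0]
    # ssh -X yields something like "localhost:10.0"; a local X server is ":0" or ":1".
    return "forwarded" if host else "local"


if __name__ == "__main__":
    kind = display_kind(os.environ.get("DISPLAY", ""))
    if kind == "forwarded":
        print("DISPLAY points at an SSH-forwarded X server; try: export DISPLAY=:1")
    else:
        print("DISPLAY is", kind)
```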

Ubuntu DISPLAY settings
https://help.ubuntu.com/community/EnvironmentVariables

If problems persist, delete the cache as follows.

$ rm ~/.cache/gstreamer-1.0/registry.aarch64.bin   # remove the GStreamer registry cache, then try again
......

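1.4 DeepStream plugin structure and locations

This is the same as in DS SDK 3.0: pick a sample plugin from the sources below, copy it under a new name, and test with that.

Layout of the DeepStream Gst plugin examples

Get familiar with the GStreamer plugin structure shown below, then copy the provided dsexample under a new name and run a simple test.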

$ cd ~/deepstream-4.0/sources/gst-plugins
$ tree .
.
├── gst-dsexample               # copy this one under a new name for testing
│   ├── dsexample_lib
│   │   ├── dsexample_lib.c
│   │   ├── dsexample_lib.h
│   │   ├── dsexample_lib.o
│   │   ├── libdsexample.a
│   │   └── Makefile
│   ├── gstdsexample.cpp
│   ├── gstdsexample.h
│   ├── gstdsexample.o
│   ├── libnvdsgst_dsexample.so
│   ├── Makefile
│   └── README
├── gst-nvinfer                 # nvinfer sources, newly added
│   ├── gstnvinfer_allocator.cpp
│   ├── gstnvinfer_allocator.h
│   ├── gstnvinfer_allocator.o
│   ├── gstnvinfer.cpp
│   ├── gstnvinfer.h
│   ├── gstnvinfer_meta_utils.cpp
│   ├── gstnvinfer_meta_utils.h
│   ├── gstnvinfer_meta_utils.o
│   ├── gstnvinfer.o
│   ├── gstnvinfer_property_parser.cpp
│   ├── gstnvinfer_property_parser.h
│   ├── gstnvinfer_property_parser.o
│   ├── libnvdsgst_infer.so
│   ├── Makefile
│   └── README
├── gst-nvmsgbroker
│   ├── gstnvmsgbroker.c
│   ├── gstnvmsgbroker.h
│   ├── gstnvmsgbroker.o
│   ├── libnvdsgst_msgbroker.so
│   ├── Makefile
│   └── README
└── gst-nvmsgconv
    ├── gstnvmsgconv.c
    ├── gstnvmsgconv.h
    ├── gstnvmsgconv.o
    ├── libnvdsgst_msgconv.so
    ├── Makefile
    └── README

Locating the Gst and DeepStream plugins

To find out where the DeepStream plugins and the GStreamer plugins are installed, I searched as follows.

$ ls /opt/nvidia/deepstream/deepstream-4.0/lib/gst-plugins/    # install location of the DeepStream plugins only
libnvdsgst_dewarper.so   libnvdsgst_msgbroker.so    libnvdsgst_multistreamtiler.so  libnvdsgst_osd.so        libnvdsgst_tracker.so
libnvdsgst_dsexample.so  libnvdsgst_msgconv.so      libnvdsgst_of.so          libnvdsgst_infer.so      libnvdsgst_multistream.so  libnvdsgst_ofvisual.so       
libnvdsgst_segvisual.so

$ cat /etc/ld.so.conf.d/deepstream.conf              # DeepStream shared-library path registered with the dynamic linker
/opt/nvidia/deepstream/deepstream-4.0/lib

$ echo $PATH   # PATH lists the directories searched for executables
/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin

$ echo $LD_LIBRARY_PATH   # library path for the dynamic linker; the ld.so.conf setting above applies as well
/usr/local/cuda-10.0/lib64:

# I checked every GStreamer plugin-path environment variable,
# but none of them revealed the plugin location.
 
$ echo $GST_PLUGIN_PATH  # not set
$ echo $GST_PLUGIN_PATH_1_0 # not set
$ echo $GST_PLUGIN_SYSTEM_PATH  # not set
$ echo $GST_PLUGIN_SYSTEM_PATH_1_0  # not set
$ ls ~/.local/share/gstreamer-1.0/presets/    # empty

# So I searched for the GStreamer plugin directory directly.

$ find / -name gstreamer-1.0 2> /dev/null    # found the related directories
/usr/share/gstreamer-1.0
/usr/include/gstreamer-1.0
/usr/lib/aarch64-linux-gnu/gstreamer1.0/gstreamer-1.0
/usr/lib/aarch64-linux-gnu/gstreamer-1.0
/home/nvidia/.cache/gstreamer-1.0
/home/nvidia/.local/share/gstreamer-1.0

$ cat /home/nvidia/.cache/gstreamer-1.0/registry.aarch64.bin   # analyzing this file reveals the directory structure

$ ls /usr/lib/aarch64-linux-gnu/gstreamer-1.0  # confirmed: the DeepStream plugins are linked in next to the other GStreamer plugins ("deepstream" is a symlink to DeepStream)
deepstream                 libgstcurl.so                libgstisomp4.so            libgstomx.so              libgsttaglib.so
include                    libgstcutter.so              libgstivfparse.so          libgstopenal.so           libgsttcp.so
libcluttergst3.so          libgstdashdemux.so           libgstivtc.so              libgstopenexr.so          libgstteletext.so
libgst1394.so              libgstdc1394.so              libgstjack.so              libgstopenglmixers.so     libgsttheora.so
..............

1.5 DeepStream plugin development

Now that the basic layout is clear, read the GStreamer plugin development manual and keep it at hand.

GStreamer plugin development (must read)

https://gstreamer.freedesktop.org/documentation/plugin-development/basics/boiler.html?gi-language=c

Pads in GStreamer plugin development (must read)

https://gstreamer.freedesktop.org/documentation/plugin-development/basics/pads.html?gi-language=c

GStreamer properties (must read)

https://gstreamer.freedesktop.org/documentation/plugin-development/basics/args.html?gi-language=c

Other callbacks to hook up

Chain function: a callback registered on a pad and invoked for incoming data; this is where the element processes its data
Event function: a callback installed on a pad; events can be generated for the pad depending on the state
Query function: a callback invoked when a query is received

GStreamer Plugin Writer's Guide

To understand the plugin structure in detail, read the Plugin Writer's Guide below carefully.
https://gstreamer.freedesktop.org/documentation/plugin-development/index.html?gi-language=c

Example: GStreamer boilerplate for a plugin named SAMPLE

To create a GStreamer plugin named SAMPLE, define the type macros as shown below.
Function names likewise follow the gst_sample_xxx pattern.

$ vi sample.h
G_BEGIN_DECLS
....
#define GST_TYPE_SAMPLE (gst_sample_get_type())
#define GST_SAMPLE(obj) (G_TYPE_CHECK_INSTANCE_CAST((obj),GST_TYPE_SAMPLE,GstSAMPLE))
#define GST_SAMPLE_CLASS(klass) (G_TYPE_CHECK_CLASS_CAST((klass),GST_TYPE_SAMPLE,GstSAMPLEClass))
#define GST_SAMPLE_GET_CLASS(obj) (G_TYPE_INSTANCE_GET_CLASS((obj), GST_TYPE_SAMPLE, GstSAMPLEClass))
#define GST_IS_SAMPLE(obj) (G_TYPE_CHECK_INSTANCE_TYPE((obj),GST_TYPE_SAMPLE))
#define GST_IS_SAMPLE_CLASS(klass) (G_TYPE_CHECK_CLASS_TYPE((klass),GST_TYPE_SAMPLE))
#define GST_SAMPLE_CAST(obj)  ((GstSAMPLE *)(obj))

..
G_END_DECLS

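Creating your own plugin from dsexample

Copy dsexample, change only the name, and build it; a quick test shows that it works without much effort.
Once installed, its description can also be viewed with gst-inspect.
Further details are omitted here (see the GStreamer manual).

Comparing dsexample and nvmsgbroker

The two plugins are structured and behave differently, as the brief comparison below shows.
It is worth comparing the class structure used in each gst_xxxxx_class_init function and the functions attached to it.

// nvmsgbroker
// GstBaseSinkClass: built around the sink pad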

  GObjectClass *gobject_class = G_OBJECT_CLASS (klass);
  GstBaseSinkClass *base_sink_class = GST_BASE_SINK_CLASS (klass);
.....
  base_sink_class->set_caps = GST_DEBUG_FUNCPTR (gst_nvmsgbroker_set_caps);
  base_sink_class->start = GST_DEBUG_FUNCPTR (gst_nvmsgbroker_start);
  base_sink_class->stop = GST_DEBUG_FUNCPTR (gst_nvmsgbroker_stop);
  base_sink_class->render = GST_DEBUG_FUNCPTR (gst_nvmsgbroker_render);

// dsexample
// GstBaseTransformClass: built around transforming data inside the plugin (check how the data is transformed)

  GObjectClass *gobject_class;
  GstElementClass *gstelement_class;
  GstBaseTransformClass *gstbasetransform_class;

  gobject_class = (GObjectClass *) klass;
  gstelement_class = (GstElementClass *) klass;
  gstbasetransform_class = (GstBaseTransformClass *) klass;

  /* Override base class functions */
  gobject_class->set_property = GST_DEBUG_FUNCPTR (gst_dsexample_set_property);
  gobject_class->get_property = GST_DEBUG_FUNCPTR (gst_dsexample_get_property);

  gstbasetransform_class->set_caps = GST_DEBUG_FUNCPTR (gst_dsexample_set_caps);
  gstbasetransform_class->start = GST_DEBUG_FUNCPTR (gst_dsexample_start);
  gstbasetransform_class->stop = GST_DEBUG_FUNCPTR (gst_dsexample_stop);

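As shown above, each class is given its callback functions; the next step is to work out when each one is called.

GstBaseSink and GstBaseTransform structure

Both have GstElement as their parent class, so learn the characteristics of GstElement subclasses
and go through each of their methods.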
https://gstreamer.freedesktop.org/documentation/base/gstbasetransform.html?gi-language=c#GstBaseTransform
https://gstreamer.freedesktop.org/documentation/base/gstbasesink.html?gi-language=c#GstBaseSink

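Questions I asked NVIDIA directly while writing the DeepStream program and plugin

Depending on the full-frame property of dsexample, OpenCV and the crop function come into play. I applied this to my own source and confirmed that it works.
To understand it precisely, you must read the DeepStream SDK API documentation and understand each operation.
https://devtalk.nvidia.com/default/topic/1061422/deepstream-sdk/how-to-crop-the-image-and-save/

I had already implemented an ROI with a plugin, but I was told it can also be done with a line; I will test that later.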
https://devtalk.nvidia.com/default/topic/1061791/deepstream-sdk/about-roi-in-ds4-0-on-xavier-/post/5378191/#5378191

1.6 Performance comparison of the models (TensorRT)

When I first worked with TensorRT I did not know how to use trtexec properly; now I do.
The tool is used to measure the performance of UFF/Caffe/ONNX models and TensorRT engines.

# Use trtexec to measure the performance of each model
$ /usr/src/tensorrt/bin/trtexec --help
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --help
[I] help

Mandatory params:
  --deploy=          Caffe deploy file
  OR --uff=          UFF file
  OR --onnx=         ONNX Model file
  OR --loadEngine=   Load a saved engine

Mandatory params for UFF:
  --uffInput=,C,H,W Input blob name and its dimensions for UFF parser (can be specified multiple times)
  --output=      Output blob name (can be specified multiple times)

Mandatory params for Caffe:
  --output=      Output blob name (can be specified multiple times)

Optional params:
  --model=          Caffe model file (default = no model, random weights used)
  --batch=N               Set batch size (default = 1)
  --device=N              Set cuda device to N (default = 0)
  --iterations=N          Run N iterations (default = 10)
  --avgRuns=N             Set avgRuns to N - perf is measured as an average of avgRuns (default=10)
  --percentile=P          For each iteration, report the percentile time at P percentage (0<=P<=100, with 0 representing min, and 100 representing max; default = 99.0%)
  --workspace=N           Set workspace size in megabytes (default = 16)
  --safe                  Only test the functionality available in safety restricted flows.
  --fp16                  Run in fp16 mode (default = false). Permits 16-bit kernels
  --int8                  Run in int8 mode (default = false). Currently no support for ONNX model.
  --verbose               Use verbose logging (default = false)
  --saveEngine=     Save a serialized engine to file.
  --loadEngine=     Load a serialized engine from file.
  --calib=          Read INT8 calibration cache file.  Currently no support for ONNX model.
  --useDLACore=N          Specify a DLA engine for layers that support DLA. Value can range from 0 to n-1, where n is the number of DLA engines on the platform.
  --allowGPUFallback      If --useDLACore flag is present and if a layer can't run on DLA, then run on GPU. 
  --useSpinWait           Actively wait for work completion. This option may decrease multi-process synchronization time at the cost of additional CPU usage. (default = false)
  --dumpOutput            Dump outputs at end of test. 
  -h, --help              Print usage

# Look up the values trtexec needs in each config file
$ cd ~/deepstream-4.0/sources/apps/sample_apps/deepstream-test2
$ cat dstest2_pgie_config.txt

# Following properties are mandatory when engine files are not specified:
#   int8-calib-file(Only in INT8)
#   Caffemodel mandatory properties: model-file, proto-file, output-blob-names   // a Caffe model needs all three: model-file, proto-file, output-blob-names
#   UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names            // UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
#   ONNX: onnx-file                                                              // ONNX: onnx-file

model-file=../../../../samples/models/Primary_Detector/resnet10.caffemodel
proto-file=../../../../samples/models/Primary_Detector/resnet10.prototxt
...
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid

//TensorRT Engine 
$ /usr/src/tensorrt/bin/trtexec   --loadEngine=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel_b4_int8.engine 
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --loadEngine=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel_b4_int8.engine
[I] loadEngine: /home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel_b4_int8.engine
[I] /home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel_b4_int8.engine has been successfully loaded.
[I] Average over 10 runs is 1.01944 ms (host walltime is 1.10421 ms, 99% percentile time is 1.06387).
[I] Average over 10 runs is 1.00943 ms (host walltime is 1.06669 ms, 99% percentile time is 1.02454).
[I] Average over 10 runs is 1.00519 ms (host walltime is 1.06144 ms, 99% percentile time is 1.01184).
[I] Average over 10 runs is 1.00898 ms (host walltime is 1.07056 ms, 99% percentile time is 1.02982).
[I] Average over 10 runs is 1.00417 ms (host walltime is 1.06018 ms, 99% percentile time is 1.02707).
[I] Average over 10 runs is 1.00541 ms (host walltime is 1.06557 ms, 99% percentile time is 1.02682).
[I] Average over 10 runs is 1.00323 ms (host walltime is 1.0602 ms, 99% percentile time is 1.03834).
[I] Average over 10 runs is 1.00476 ms (host walltime is 1.06061 ms, 99% percentile time is 1.02954).
[I] Average over 10 runs is 1.00358 ms (host walltime is 1.05957 ms, 99% percentile time is 1.00902).
[I] Average over 10 runs is 1.00232 ms (host walltime is 1.05585 ms, 99% percentile time is 1.00704).

# The Caffe model can be compared with the TensorRT engine above (the elapsed time is clear, but what does "percentile time" mean? A percentage?)
$ /usr/src/tensorrt/bin/trtexec --deploy=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.prototxt \
  --model=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel \
  --output=conv2d_bbox

&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --deploy=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.prototxt --model=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel --output=conv2d_bbox
[I] deploy: /home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.prototxt
[I] model: /home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel
[I] output: conv2d_bbox
[I] Input "input_1": 3x368x640
[I] Output "conv2d_bbox": 16x23x40
[I] Average over 10 runs is 4.64663 ms (host walltime is 4.72585 ms, 99% percentile time is 4.7319).
[I] Average over 10 runs is 4.62393 ms (host walltime is 4.69601 ms, 99% percentile time is 4.64422).
[I] Average over 10 runs is 4.6295 ms (host walltime is 4.69154 ms, 99% percentile time is 4.64858).
[I] Average over 10 runs is 4.62978 ms (host walltime is 4.68834 ms, 99% percentile time is 4.64538).
[I] Average over 10 runs is 4.62103 ms (host walltime is 4.68236 ms, 99% percentile time is 4.63843).
[I] Average over 10 runs is 4.62193 ms (host walltime is 4.68143 ms, 99% percentile time is 4.64042).
[I] Average over 10 runs is 4.61595 ms (host walltime is 4.67465 ms, 99% percentile time is 4.62768).
[I] Average over 10 runs is 4.61807 ms (host walltime is 4.67505 ms, 99% percentile time is 4.63514).
[I] Average over 10 runs is 4.61827 ms (host walltime is 4.68276 ms, 99% percentile time is 4.62362).
[I] Average over 10 runs is 4.62702 ms (host walltime is 4.69345 ms, 99% percentile time is 4.64864).
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --deploy=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.prototxt --model=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel --output=conv2d_bbox

$ /usr/src/tensorrt/bin/trtexec --deploy=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.prototxt \
  --model=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel \
  --output=conv2d_cov/Sigmoid
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --deploy=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.prototxt --model=/home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel --output=conv2d_cov/Sigmoid
[I] deploy: /home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.prototxt
[I] model: /home/nvidia/deepstream-4.0/samples/models/Primary_Detector/resnet10.caffemodel
[I] output: conv2d_cov/Sigmoid
[I] Input "input_1": 3x368x640
[I] Output "conv2d_cov/Sigmoid": 4x23x40
[I] Average over 10 runs is 4.6327 ms (host walltime is 4.70868 ms, 99% percentile time is 4.6904).
[I] Average over 10 runs is 4.62576 ms (host walltime is 4.69326 ms, 99% percentile time is 4.66947).
[I] Average over 10 runs is 4.62769 ms (host walltime is 4.68794 ms, 99% percentile time is 4.65818).
[I] Average over 10 runs is 4.62516 ms (host walltime is 4.69319 ms, 99% percentile time is 4.66173).
[I] Average over 10 runs is 4.62184 ms (host walltime is 4.68396 ms, 99% percentile time is 4.64534).
[I] Average over 10 runs is 4.62518 ms (host walltime is 4.67966 ms, 99% percentile time is 4.64067).
[I] Average over 10 runs is 4.62082 ms (host walltime is 4.68281 ms, 99% percentile time is 4.64358).
[I] Average over 10 runs is 4.62256 ms (host walltime is 4.68476 ms, 99% percentile time is 4.65318).
[I] Average over 10 runs is 4.62129 ms (host walltime is 4.68117 ms, 99% percentile time is 4.64125).
[I] Average over 10 runs is 4.62561 ms (host walltime is 4.6864 ms, 99% percentile time is 4.64435).

//UFF Model 
$ cd ~/deepstream-4.0/sources/objectDetector_SSD
$ cat config_infer_primary_ssd.txt
.......
uff-file=sample_ssd_relu6.uff
uff-input-dims=3;300;300;0
uff-input-blob-name=Input
...
output-blob-names=MarkOutput_0
parse-bbox-func-name=NvDsInferParseCustomSSD
custom-lib-path=nvdsinfer_custom_impl_ssd/libnvdsinfer_custom_impl_ssd.so

# The UFF model uses a custom layer, so it probably will not run as-is (I also do not know exactly how --uffInput relates to the config values above)
$ /usr/src/tensorrt/bin/trtexec --uff=/home/nvidia/deepstream-4.0/sources/objectDetector_SSD/sample_ssd_relu6.uff \
  --uffInput=Input,3,300,300 \
  --output=MarkOutput_0
[I] uff: /home/nvidia/deepstream-4.0/sources/objectDetector_SSD/sample_ssd_relu6.uff
[I] uffInput: Input,3,300,300
[I] output: MarkOutput_0
[E] [TRT] UffParser: Validator error: concat_box_loc: Unsupported operation _FlattenConcat_TRT
[E] Engine could not be created
[E] Engine could not be created
&&&& FAILED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --uff=/home/nvidia/deepstream-4.0/sources/objectDetector_SSD/sample_ssd_relu6.uff --uffInput=Input,3,300,300 --output=MarkOutput_0

Best Practices For TensorRT Performance (trtexec and other tools)
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-best-practices/index.html

Using trtexec with a Caffe model in practice
  https://devtalk.nvidia.com/default/topic/1061845/deepstream-sdk/resnet50-classification-as-primary-gie/

Model parser errors
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#error-messaging

The yais tool: to try later
  https://github.com/NVIDIA/tensorrt-laboratory/tree/master/examples/00_TensorRT
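
On the "percentile time" question raised in the trtexec output above: the --percentile help text says it reports the time at P percent, with 0 meaning the minimum and 100 the maximum. So the 99% figure is a latency in milliseconds below which 99% of the measured runs fall, not a percentage. The sketch below illustrates this with the nearest-rank method; whether trtexec computes it in exactly this way is an assumption, and the function name and sample timings are hypothetical.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the smallest sample such that at least p percent of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)    # 1-based rank in the sorted list
    return ordered[max(rank, 1) - 1]            # p = 0 falls back to the minimum


if __name__ == "__main__":
    run_times_ms = [4.61, 4.62, 4.62, 4.63, 4.63, 4.64, 4.64, 4.65, 4.66, 4.73]  # made-up timings
    print("min :", percentile_nearest_rank(run_times_ms, 0))
    print("p50 :", percentile_nearest_rank(run_times_ms, 50))
    print("p99 :", percentile_nearest_rank(run_times_ms, 99))
```

A high percentile close to the average, as in the logs above, suggests that the run-to-run latency is stable.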

2. GStreamer tests with DeepStream

DeepStream SDK 3.0 and GStreamer notes
https://ahyuo79.blogspot.com/2019/07/deepstream-sdk-30-gstreamer.html

2.1 Debugging GStreamer

Set the GST_DEBUG environment variable to the desired value to see the corresponding debug messages.

Levels from 1 to 9 can be selected:
( 1 - ERROR, 2 - WARNING, 3 - FIXME, 4 - INFO )
( 5 - DEBUG, 6 - LOG, 7 - TRACE, 9 - MEMDUMP )

GStreamer environment variables (GST_DEBUG and many others)
https://gstreamer.freedesktop.org/documentation/gstreamer/running.html?gi-language=c

How to use GST_DEBUG
https://gstreamer.freedesktop.org/documentation/tutorials/basic/debugging-tools.html?gi-language=c

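Global debug setting

To debug all of GStreamer at once, do the following, though I do not really recommend it.

$ export GST_DEBUG="*:2"  # WARNING level for everything; the most reasonable global setting
$ export GST_DEBUG="*:4"  # INFO level for everything
$ export GST_DEBUG=""     # clear the setting

Debugging only the plugins you want

Rather than debugging every plugin, debug only a specific one to track down the problem in that area.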

# Debug every plugin; useful for discovering the internal plugin structure when you do not know it yet
$ GST_DEBUG="*:5" deepstream-app -c deepstream_app_config_yoloV3.txt 

# Debug specific plugins only
$ GST_DEBUG="dsexample:5" ./deepstream-rtsp-app rtsp://10.0.0.199:554/h264    # detailed output for dsexample only
$ GST_DEBUG="qtdemux:5" deepstream-app -c deepstream_app_config_yoloV3.txt  # detailed output for qtdemux only

# Debug several plugins at once
$ GST_DEBUG="qtdemux:5,dsexample:4" deepstream-app -c deepstream_app_config_yoloV3.txt  # qtdemux at DEBUG, dsexample at INFO
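
To double-check what a GST_DEBUG value will enable before launching a long run, the category:level pairs can be decoded with a small Python helper. This is a sketch of my own that handles numeric levels only (GStreamer itself also accepts level names); the function name parse_gst_debug is hypothetical, and the level table follows the list given earlier plus 0 for no output.

```python
LEVEL_NAMES = {0: "NONE", 1: "ERROR", 2: "WARNING", 3: "FIXME", 4: "INFO",
               5: "DEBUG", 6: "LOG", 7: "TRACE", 9: "MEMDUMP"}


def parse_gst_debug(spec):
    """Turn 'qtdemux:5,dsexample:4' into {'qtdemux': 'DEBUG', 'dsexample': 'INFO'}."""
    result = {}
    for entry in spec.split(","):
        entry = entry.strip()                 # tolerate stray spaces around entries
        if not entry:
            continue
        category, _, level = entry.rpartition(":")
        if not category:                      # a bare number applies to every category
            category = "*"
        result[category] = LEVEL_NAMES[int(level)]
    return result


if __name__ == "__main__":
    print(parse_gst_debug("qtdemux:5,dsexample:4"))
    print(parse_gst_debug("*:2"))
```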

Debugging with a custom category

You can also declare your own category and debug only what belongs to it. The plugins above all declare and use categories internally as well.

// Define the category you want directly in the source (my_category)
GST_DEBUG_CATEGORY_STATIC (my_category);  // define category (statically)
#define GST_CAT_DEFAULT my_category       // set as default

  if (!my_category) {
   GST_DEBUG_CATEGORY_INIT (my_category, "MY_CAT", 0, NULL);
  }

  GST_CAT_INFO (my_category, "TEST Info %s", "Category TEST");
  GST_CAT_DEBUG (my_category, "TEST Debug %s", "Category TEST");
  GST_CAT_ERROR (my_category, "TEST error %s", "Category TEST");

# Actual test
$ GST_DEBUG="MY_CAT:5" deepstream-app -c deepstream_app_config_yoloV3.txt 
$ GST_DEBUG="MY_CAT:9" deepstream-app -c deepstream_app_config_yoloV3.txt

https://gstreamer.freedesktop.org/documentation/gstreamer/gstinfo.html?gi-language=c

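2.2 Revisiting the GStreamer pipelines from SDK 3.0

I already ran many GStreamer tests on SDK 3.0, so this part is brief. Many GStreamer commands do not behave the same on SDK 4.0 as they did on SDK 3.0.
Plugin names and settings have changed, so check them with gst-inspect as described above before running anything.

TEST2 example on DeepStream SDK 3.0

I tested with decodebin or uridecodebin as shown below; see the link below for details.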

$ pwd
/home/nvidia/deepstream_sdk_on_jetson/sources/apps/sample_apps/deepstream-test2

# Play the sample video full screen with the primary GIE, the secondary GIEs, and nvtracker (for X window playback, see above)

$ gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.h264 ! \
        decodebin ! nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        capsfilter caps='video/x-raw(memory:NVMM), format=NV12' ! \
        nvvidconv ! \
        capsfilter caps='video/x-raw(memory:NVMM), format=RGBA' ! \
        nvosd font-size=15 ! nvoverlaysink

# Play an RTSP stream full screen with the primary GIE, the secondary GIEs, and nvtracker

$ gst-launch-1.0 uridecodebin uri=rtsp://10.0.0.199:554/h264 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        capsfilter caps='video/x-raw(memory:NVMM), format=NV12' ! \
        nvvideoconvert ! \
        capsfilter caps='video/x-raw(memory:NVMM), format=RGBA' ! \
        nvdsosd ! nvoverlaysink

# Play the sample video in an X window with the primary GIE, the secondary GIEs, and nvtracker

$ gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.mp4 ! \
        decodebin ! nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvidconv ! nvosd ! nvegltransform ! nveglglessink

DeepStream SDK 3.0 reference

Compared with the post linked below, a lot has changed since DeepStream SDK 3.0, and the GStreamer pipelines are not compatible either.
https://ahyuo79.blogspot.com/2019/07/deepstream-sdk-30-gstreamer.html

2.3 GStreamer pipelines on SDK 4.0

The biggest difference in DeepStream SDK 4.0 is that nvstreammux must be used even with a single channel.
In addition, nvvideoconvert must be used instead of nvvidconv, and the caps filters used before no longer seem to be needed.
I believe the nvvidconv pipeline fails because nvosd has been removed.
In SDK 4.0 the plugin (element) properties have grown in number and changed, so take care.

Basic TEST2 setup on DeepStream SDK 4.0

First configure the TensorRT engines (primary and secondary) as shown below.
As explained above, the Jetson AGX Xavier supports INT8 mode; when enabling it, the calibration table must be set as well (see above).
- model-engine-file
- int8-calib-file
- network-mode=1 # 0=FP32, 1=INT8, 2=FP16 mode

DeepStream SDK 4.0 TEST4 and IPlugin notes (see the earlier post)
https://ahyuo79.blogspot.com/2019/08/ds-sdk-40-test4-iplugin-sample.html

$ cd ~/deepstream-4.0/sources/apps/sample_apps/deepstream-test2
$ pwd
/home/nvidia/deepstream-4.0/sources/apps/sample_apps/deepstream-test2

$ vi dstest2_pgie_config.txt 
model-engine-file=../../../../samples/models/Primary_Detector/resnet10.caffemodel_b1_int8.engine

$ vi dstest2_sgie1_config.txt 
model-engine-file=../../../../samples/models/Secondary_CarColor/resnet18.caffemodel_b16_int8.engine

$ vi dstest2_sgie2_config.txt
model-engine-file=../../../../samples/models/Secondary_CarMake/resnet18.caffemodel_b16_int8.engine

$ vi dstest2_sgie3_config.txt
model-engine-file=../../../../samples/models/Secondary_VehicleTypes/resnet18.caffemodel_b16_int8.engine

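Basic usage of nvstreammux

It is mandatory, so learn what each of its settings does (it goes before nvinfer).
As shown below, the upstream branch links to m.sink_0; further sink pads allow multiple channels to be fed in.
In the settings that follow, batch-size sets the number of channels (frames) per batch, and width and height set the resolution, so scaling is possible too.
If nvinfer is not used, it seems nvstreammux can be left out, as in the test below.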

m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720
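
Because every SDK 4.0 pipeline in this post repeats the same nvstreammux stage, I find it handy to generate the gst-launch description from a script. The following Python sketch only concatenates element strings already used in this post; the function name build_ds4_pipeline and its parameters are hypothetical, and it does not validate anything against GStreamer.

```python
def build_ds4_pipeline(source, pgie_config, width=1280, height=720, batch_size=1):
    """Compose a gst-launch-1.0 description with nvstreammux placed ahead of nvinfer."""
    stages = [
        source,                                            # e.g. filesrc ! h264parse ! nvv4l2decoder
        f"m.sink_0 nvstreammux name=m batch-size={batch_size} "
        f"width={width} height={height}",                  # mandatory in SDK 4.0, even for one channel
        f"nvinfer config-file-path={pgie_config}",
        "nvvideoconvert",
        "nvdsosd",
        "nvegltransform",
        "nveglglessink",
    ]
    return " ! ".join(stages)


if __name__ == "__main__":
    src = "filesrc location=sample_720p.h264 ! h264parse ! nvv4l2decoder"
    print("gst-launch-1.0 " + build_ds4_pipeline(src, "dstest2_pgie_config.txt"))
```

Keeping the muxer settings in one place makes it easier to change the batch size or output resolution consistently across tests.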

Related question (I asked out of curiosity)
https://devtalk.nvidia.com/default/topic/1061785/deepstream-sdk/about-streammux-in-ds-on-xavier/

Plugin Manual (note that some features are not supported yet, and that Jetson and Tesla are covered separately)
https://docs.nvidia.com/metropolis/deepstream/4.0/DeepStream_Plugin_Manual.pdf

Basic TEST2 run on DeepStream SDK 4.0

First run it with h264parse and nvv4l2decoder (nvv4l2decoder is new) and render the output to an X window (using OpenGL).
As shown below, the second pipeline does not work because nvosd is gone.

# Sample TEST2: confirmed working
$ gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.h264 ! \
        h264parse !  nvv4l2decoder ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink

# Same pipeline with nvvidconv and nvosd (fails because nvosd is no longer supported); from 4.0 on, use nvvideoconvert and nvdsosd
$ gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.h264 ! \
        h264parse !  nvv4l2decoder ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvidconv ! nvosd ! nvegltransform ! nveglglessink

Running the TEST2 decode tests in DeepStream SDK 4.0

As before, decodebin and uridecodebin can be used conveniently with deepstream-test2.

 
// Note: on the first run no TensorRT engine exists yet, so generating it takes a long time.
// Setting model-engine-file in each config avoids this (the Jetson AGX Xavier supports INT8 engines).

// Sample TEST 2 with decodebin (verified working)
$ gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.h264 ! \
        decodebin ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink

// Sample TEST 2 with uridecodebin (RTSP source), verified working
$ gst-launch-1.0 uridecodebin uri=rtsp://10.0.0.199:554/h264 ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink

// Sample TEST 2 with uridecodebin (file source), verified working
$ gst-launch-1.0 uridecodebin uri=file:///home/nvidia/deepstream-4.0/samples/streams/sample_720p.h264  ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink

// Sample TEST 2 with uridecodebin (RTSP source), does NOT work (nvstreammux removed)
$ gst-launch-1.0 uridecodebin uri=rtsp://10.0.0.199:554/h264 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink

// Sample TEST 2 with uridecodebin (RTSP source), verified working (nvstreammux, nvinfer and nvtracker all removed)
$ gst-launch-1.0 uridecodebin uri=rtsp://10.0.0.199:554/h264 ! \
        nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink

It seems nvstreammux is required because of nvinfer (my guess).

Testing TEST2 in DeepStream SDK 4.0 with different outputs

nvegltransform ! nveglglessink renders into an X-Window.
nvoverlaysink renders full screen.
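
The TEST2 variations in this post differ only in the source URI and the sink, so the command line can be assembled from parts. The following is a minimal Python sketch under that assumption; build_test2_pipeline, MUX, INFER and SINKS are hypothetical names of my own, not part of DeepStream or GStreamer, and the element strings are simply copied from the commands in this post.

```python
# Hypothetical helper for assembling the TEST2 command lines used in this post.
# Element names and properties are copied from the gst-launch-1.0 commands above;
# nothing here talks to GStreamer, it only builds a string.

MUX = "m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720"

INFER = [
    "nvinfer config-file-path=dstest2_pgie_config.txt",
    "nvtracker tracker-width=640 tracker-height=368 gpu-id=0 "
    "ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so "
    "ll-config-file=tracker_config.yml enable-batch-process=1",
    "nvinfer config-file-path=dstest2_sgie1_config.txt",
    "nvinfer config-file-path=dstest2_sgie2_config.txt",
    "nvinfer config-file-path=dstest2_sgie3_config.txt",
]

SINKS = {
    "window": ["nvegltransform", "nveglglessink"],  # X-Window output
    "fullscreen": ["nvoverlaysink"],                # full-screen output
}


def build_test2_pipeline(uri, sink="window"):
    """Return a gst-launch-1.0 command line for the given URI and sink kind."""
    stages = [f"uridecodebin uri={uri}", MUX, *INFER,
              "nvvideoconvert", "nvdsosd", *SINKS[sink]]
    return "gst-launch-1.0 " + " ! ".join(stages)


if __name__ == "__main__":
    print(build_test2_pipeline("rtsp://10.0.0.199:554/h264", sink="fullscreen"))
```

Keeping nvstreammux in the fixed part reflects the observation above that the nvinfer chain did not run without it.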

// Sample TEST 2 with nvoverlaysink (full screen), verified working
$ gst-launch-1.0 uridecodebin uri=file:///home/nvidia/deepstream-4.0/samples/streams/sample_720p.h264  ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! nvoverlaysink

MPEG4 -> H.264 transcoding test

This works the same as before; unless the stream is being played back, nvstreammux does not need to be configured.

 $  gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.mp4 ! decodebin ! omxh264enc !  h264parse ! qtmux ! filesink location=test.h264

Using tee to split into two branches: playback and transcoding at the same time

As in SDK 3.0, I tried to use tee to play back and transcode simultaneously, but it does not work, and looking at it element by element I cannot see why.

// Basic video playback, verified working (nvstreammux recommended)
$  gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.mp4 ! decodebin ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvegltransform ! nveglglessink

// Switch to uridecodebin for RTSP and run without nvstreammux
$ gst-launch-1.0 uridecodebin uri=rtsp://10.0.0.201:554/h264 ! nvvideoconvert ! nvegltransform ! nveglglessink

// One branch through tee for playback (verified working)
$  gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.mp4 ! decodebin ! tee name=t ! queue ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvegltransform ! nveglglessink 

// Two branches through tee (fails)
$  gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.mp4 ! decodebin ! tee name=t ! queue ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvegltransform ! nveglglessink t. ! queue ! omxh264enc !  h264parse ! qtmux  ! filesink location=test.mp4 

// Two branches through tee (as recommended by NVIDIA); works, but occasionally throws errors
$ gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.mp4 ! decodebin ! tee name=t ! queue ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvegltransform ! nveglglessink t. ! queue ! nvvideoconvert ! nvv4l2h264enc !  h264parse ! qtmux  ! filesink location=test.mp4

// Two branches through tee with fakesink (verified working); from 4.0 on, NVIDIA does not seem to recommend omxh264enc
$  gst-launch-1.0 filesrc location=../../../../samples/streams/sample_720p.mp4 ! decodebin ! tee name=t ! queue ! fakesink t. ! queue ! omxh264enc !  h264parse ! qtmux  ! filesink location=test.mp4

I asked NVIDIA directly, and the answer was to replace omxh264enc with nvv4l2h264enc.
https://devtalk.nvidia.com/default/topic/1061803/deepstream-sdk/tee-in-ds-4-0-on-xavier-/

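On-screen playback and H.264 encoding at the same time, after nvinfer and nvdsosd

Unlike DS 3.0, nvv4l2h264enc must be used instead of omxh264enc, as noted above, for this to work properly.

// Sample TEST 2 with nvoverlaysink (full screen), verified working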

$ gst-launch-1.0 uridecodebin uri=file:///home/nvidia/deepstream-4.0/samples/streams/sample_720p.h264  ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! nvoverlaysink

// Same as above, but after nvdsosd the stream is split into two branches for on-screen playback and H.264 encoding at the same time
$ gst-launch-1.0 uridecodebin uri=file:///home/nvidia/deepstream-4.0/samples/streams/sample_720p.h264  ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! \
        tee name=t ! queue ! nvegltransform ! nveglglessink \
        t. ! queue ! nvvideoconvert ! nvv4l2h264enc !  h264parse ! qtmux  ! filesink location=test.mp4

// Re-tested after NVIDIA later suggested switching to nvvideoconvert
$ gst-launch-1.0 uridecodebin uri=file:///home/nvidia/deepstream-4.0/samples/streams/sample_720p.h264  ! \
        m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
        nvinfer config-file-path= dstest2_pgie_config.txt ! \
        nvtracker tracker-width=640 tracker-height=368  gpu-id=0  \
        ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so \
        ll-config-file=tracker_config.yml enable-batch-process=1 ! \
        nvinfer config-file-path= dstest2_sgie1_config.txt ! \
        nvinfer config-file-path= dstest2_sgie2_config.txt ! \
        nvinfer config-file-path= dstest2_sgie3_config.txt ! \
        nvvideoconvert ! nvdsosd ! \
        tee name=t ! queue ! nvvideoconvert ! nveglglessink \
        t. ! queue ! nvvideoconvert ! nvv4l2h264enc !  h264parse ! qtmux  ! filesink location=test.mp4

7/22/2019

DeepStream SDK 3.0 and OpenALPR

1. Basic OpenALPR installation alongside the DeepStream SDK

OpenALPR (Automatic License Plate Recognition) is an open-source project for recognizing vehicle license plates; internally it reportedly relies on Tesseract OCR.

NVIDIA DeepStream SDK 4.0 API

An API manual exists for SDK 4.0. It does not apply to the Jetson AGX Xavier yet, but keep it as a reference.

https://docs.nvidia.com/metropolis/deepstream/4.0/dev-guide/DeepStream_Development_Guide/baggage/modules.html

NVIDIA L4T Multimedia API

When modifying the sources you will run into many Multimedia API functions and structures, so refer to this.

https://docs.nvidia.com/jetson/l4t-multimedia/group__l4t__mm__nvbuffer__group.html

Installing Tesseract OCR

OpenALPR is based on Tesseract, so learn the basic Tesseract installation and usage first.

https://ahyuo79.blogspot.com/2019/05/tesseract-ocr.html

A system built with OpenALPR

This appears to be a system made by Rekor Systems; I found it online, so here are the related links.

  https://www.rekorsystems.com/
  https://youtu.be/ofpxX49vdXY

Another video
  https://www.youtube.com/watch?v=w6gs10P2e1k

1.1 OpenALPR sources and usage

OpenALPR (Automatic License Plate Recognition)
https://www.openalpr.com/

OpenALPR Source
https://github.com/openalpr/openalpr

All OpenALPR-related projects
https://github.com/openalpr

1.2 Installing OpenALPR on Jetson

Install OpenALPR on the Jetson AGX Xavier and run a quick test.

$ sudo apt-get update && sudo apt-get install -y openalpr openalpr-daemon openalpr-utils libopenalpr-dev

$ sudo apt list | grep tesseract  // check that version 4.0 is available
libtesseract-dev/bionic 4.00~git2288-10f4998a-2 arm64
libtesseract4/bionic,now 4.00~git2288-10f4998a-2 arm64 [installed,automatic]
tesseract-ocr/bionic 4.00~git2288-10f4998a-2 arm64
tesseract-ocr-afr/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-all/bionic 4.00~git2288-10f4998a-2 all
tesseract-ocr-amh/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-ara/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-asm/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-aze/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-aze-cyrl/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-bel/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-ben/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-bod/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-bos/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-bre/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-bul/bionic 4.00~git24-0e00fe6-1.2 all
tesseract-ocr-cat/bionic 4.00~git24-0e00fe6-1.2 all
....

$ sudo apt install tesseract-ocr
$ sudo apt install libtesseract-dev
$ sudo apt install tesseract-ocr-eng   // English model
$ sudo apt install tesseract-ocr-kor   // Korean model

$ alpr ./test.JPG 
Error opening data file /usr/share/openalpr/runtime_data/ocr/lus.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'lus'
Tesseract couldn't load any languages!
Image file not found: test.JPG

$ sudo ln -s /usr/share/openalpr/runtime_data/ocr/tessdata/lus.traineddata /usr/share/openalpr/runtime_data/ocr/lus.traineddata

$ alpr ./test.JPG 
plate0: 10 results
    - GS00  confidence: 79.1004
    - GSH00  confidence: 78.1927
    - NGS00  confidence: 77.5058
    - NGSH00  confidence: 76.5981
    - GSD0  confidence: 76.2403
    - GSHD0  confidence: 75.3326
    - HGS00  confidence: 75.3263
    - GSO0  confidence: 75.1033
    - NGSD0  confidence: 74.6456
    - HGSH00  confidence: 74.4186
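
The alpr output above is a ranked list of candidates with confidence values. As a minimal sketch, assuming the text format stays exactly as printed here, the candidates can be pulled out with a regular expression; parse_alpr_output and CANDIDATE are hypothetical names of my own, not part of OpenALPR.

```python
import re

# Matches candidate lines such as "    - GS00  confidence: 79.1004".
CANDIDATE = re.compile(r"^\s*-\s+(\S+)\s+confidence:\s+([0-9.]+)\s*$")


def parse_alpr_output(text):
    """Return [(plate, confidence), ...] in the order printed by alpr."""
    results = []
    for line in text.splitlines():
        match = CANDIDATE.match(line)
        if match:
            results.append((match.group(1), float(match.group(2))))
    return results


# First lines of the output shown above.
SAMPLE = "\n".join([
    "plate0: 10 results",
    "    - GS00  confidence: 79.1004",
    "    - GSH00  confidence: 78.1927",
    "    - NGS00  confidence: 77.5058",
])

if __name__ == "__main__":
    best = max(parse_alpr_output(SAMPLE), key=lambda item: item[1])
    print(best)
```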

# Test US plates
$ wget http://plates.openalpr.com/ea7the.jpg
$ alpr -c us ea7the.jpg

# Test European plates
$ wget http://plates.openalpr.com/h786poj.jpg
$ alpr -c eu h786poj.jpg

$ ls /etc/openalpr/
alprd.conf  openalpr.conf

$ ls  /usr/share/openalpr/config/
alprd.defaults.conf  openalpr.defaults.conf

$ vi /usr/share/openalpr/config/openalpr.defaults.conf         // debug options and other settings can be changed here
.....
debug_general         = 0
debug_timing          = 0
debug_detector        = 1
debug_prewarp         = 0
debug_state_id        = 0
debug_plate_lines     = 1
debug_plate_corners   = 0
debug_char_segment    = 0
debug_char_analysis   = 0
debug_color_filter    = 0
debug_ocr             = 1
debug_postprocess     = 0
debug_show_images     = 0
debug_pause_on_frame  = 0

OpenALPR installation and basic usage
http://doc.openalpr.com/compiling.html

http://doc.openalpr.com/video_processing.html

http://doc.openalpr.com/on_premises.html

2. Testing DeepStream SDK 3.0 with OpenALPR

There is source code that adds OpenALPR to DeepStream as a Gst-Plugin. A quick attempt to build it fails, apparently because the source was written against DeepStream SDK 1.0.

GStreamer OpenALPR plugin for the DeepStream SDK
https://github.com/openalpr/deepstream_jetson

$ cd ~ 
$ git clone https://github.com/openalpr/deepstream_jetson
$ cd deepstream_jetson

$ make   // the build fails
gstdsexample.h:21:10: fatal error: nvbuf_utils.h: No such file or directory

$ vi Makefile   // the DeepStream SDK version differs; the Makefile seems to target 1.0

#CFLAGS+= \
#  -I../nvgstiva-app_sources/nvgstiva-app/includes 
CFLAGS+= \
  -I../deepstream_sdk_on_jetson/sources/includes

$ make   // still fails because the DeepStream SDK version does not match
gstdsexample.h:22:10: fatal error: gstnvivameta_api.h: No such file or directory

DeepStream SDK 1.0 reference
https://tech-blog.abeja.asia/entry/deepstream-sdk-jetson

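2.1 Porting the commercial OpenALPR (paid version)

DeepStream SDK 3.0 also ships gst-dsexample, so let's modify that source and test with it.
A quick look at the plugin source shows that it uses the commercial OpenALPR code, so the corresponding SDK is required and a commercial license would have to be purchased to really use it; for now, download the free installer and try it.

Installing the commercial OpenALPR SDK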

$ sudo apt install curl
$ bash <(curl -s https://deb.openalpr.com/install)

Select the components to install as follows.

install_webserver : currently fails during installation
install_agent : installed
install_sdk : installed
install_nvidia : installed

Commercial Version
http://doc.openalpr.com/sdk.html#linux

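gst-dsopenalpr basic layout

Copy only dsopenalpr_lib into a copy of gst-dsexample, then adapt the Makefile and the other sources in a similar way and build.
The two source trees have to be compared, and a lot needs to be changed.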

$ cp -a ../deepstream_sdk_on_jetson/sources/gst-plugins/gst-dsexample ../deepstream_sdk_on_jetson/sources/gst-plugins/gst-openaplr
$ cp -a dsopenalpr_lib ../deepstream_sdk_on_jetson/sources/gst-plugins/gst-openaplr/
$ cd ~/deepstream_sdk_on_jetson/sources/gst-plugins/gst-openaplr
$ rm -rf dsexample_lib

Even with the sources adapted, the internal functions do nothing without a license key, so I need to find another way. With the paid version, this is the layout to use.

Advantages of the commercial version of OpenALPR
  http://doc.openalpr.com/opensource.html#commercial-enhancements

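2.2 Connecting OpenALPR to gst-dsexample (free)

As above, the only option seems to be to build a similar structure on top of the DeepStream SDK 3.0 gst-dsexample, analyze the whole source, and write the plugin myself.

Analyzing the OpenALPR source above reveals the files behind the executable, but building all of them into the plugin and getting it to run is not easy.
So rather than doing that, analyzing the code and writing my own version seems like the better route.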

https://github.com/openalpr/openalpr/tree/master/src/openalpr
  https://github.com/openalpr/openalpr/blob/master/src/openalpr/alpr.cpp

Following the structure below, let's redo this with the open-source version, using these functions as a reference.
  http://doc.openalpr.com/opensource.html#developers-guide

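2.3 Connecting through the OpenALPR Cloud API (free/paid)

OpenALPR offers a cloud service; you simply connect through the Cloud API and upload an image file.

Testing the OpenALPR Cloud API

If you have a photo of a car, you can test OpenALPR through the Cloud API on the site below; it appears to be free up to 1,000 requests per month.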
http://www.openalpr.com/cloud-api.html

How to use the OpenALPR Cloud API

Cloud API examples are provided for Shell, Python and C#, so refer to them.
http://doc.openalpr.com/cloud_api.html

7/19/2019

Connecting DeepStream SDK 3.0 to Kafka

1. Configuring and testing the DeepStream SDK 3.0 messaging components

Basic installation

See "DeepStream SDK 3.0 installation" at the link below
  https://ahyuo79.blogspot.com/2019/06/deepstream-sdk-30-jetpack-42-411.html

NVIDIA msgbroker references
  https://devtalk.nvidia.com/default/topic/1048534/deepstream-for-tesla/deepstream-test4-app-stalled-after-the-first-few-frames/
https://devtalk.nvidia.com/default/topic/1044365/deepstream-for-tesla/using-kafka-protocol-for-retrieving-data-from-a-deepstream-pipeline-/

NVIDIA Analytics Server (Smart Parking System)
https://github.com/NVIDIA-AI-IOT/deepstream_360_d_smart_parking_application/

Kafka Site
https://kafka.apache.org/intro.html

Understanding the Apache Kafka architecture
  https://epicdevs.com/17
  https://www.joinc.co.kr/w/man/12/Kafka/chatting

1.1 DeepStream SDK 3.0 TEST 4

See section 3.4 Deepstream TEST 4 (Kafka) at the link below again
  https://ahyuo79.blogspot.com/2019/06/deepstream-sdk-30-jetpack-42-411.html

$ vi deepstream_test4_app.c
#define PROTOCOL_ADAPTOR_LIB  "/usr/lib/aarch64-linux-gnu/tegra/libnvds_kafka_proto.so"
#define CONNECTION_STRING "127.0.0.1;9092;dsapp2"

First install Kafka as described below.
After installation, create the dsapp2 topic.
Run make on the modified source above, then run deepstream-test4-app.
Check the messages received on that topic with bin/kafka-console-consumer.sh.

This sample uses Gst-msgconv and Gst-msgbroker together.
Gst-msgconv takes a separate config file, dstest4_msgconv_config.txt, and the code that corresponds to those settings is in the source above.
Gst-msgbroker sends the metadata it receives to Kafka; at the moment everything is in JSON format.

$ cd /usr/local/kafka
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic dsapp2 --from-beginning            // metadata sent by deepstream-test4-app, shown as JSON
{
  "messageid" : "220a8cea-9dc5-448c-8da6-cc9998a5fefe",
  "mdsversion" : "1.0",
  "@timestamp" : "2019-07-22T02:01:54.067Z",
  "place" : {
    "id" : "1",
    "name" : "XYZ",
    "type" : "garage",
    "location" : {
      "lat" : 30.32,
      "lon" : -40.549999999999997,
      "alt" : 100.0
    },
    "aisle" : {
      "id" : "walsh",
      "name" : "lane1",
      "level" : "P2",
      "coordinate" : {
        "x" : 1.0,
        "y" : 2.0,
        "z" : 3.0
      }
    }
  },
  "sensor" : {
    "id" : "CAMERA_ID",
    "type" : "Camera",
    "description" : "\"Entrance of Garage Right Lane\"",
    "location" : {
      "lat" : 45.293701446999997,
      "lon" : -75.830391449900006,
      "alt" : 48.155747933800001
    },
    "coordinate" : {
      "x" : 5.2000000000000002,
      "y" : 10.1,
      "z" : 11.199999999999999
    }
  },
  "analyticsModule" : {
    "id" : "XYZ",
    "description" : "\"Vehicle Detection and License Plate Recognition\"",
    "source" : "OpenALR",
    "version" : "1.0",
    "confidence" : 0.0
  },
  "object" : {
    "id" : "db521849-00f8-44f6-b20c-daf3111b6ccd",
    "speed" : 0.0,
    "direction" : 0.0,
    "orientation" : 0.0,
    "vehicle" : {
      "type" : "sedan",
      "make" : "Bugatti",
      "model" : "M",
      "color" : "blue",
      "licenseState" : "CA",
      "license" : "XX1234",
      "confidence" : 0.0
    },
    "bbox" : {
      "topleftx" : 0,
      "toplefty" : 0,
      "bottomrightx" : 0,
      "bottomrighty" : 0
    },
    "location" : {
      "lat" : 0.0,
      "lon" : 0.0,
      "alt" : 0.0
    },
    "coordinate" : {
      "x" : 0.0,
      "y" : 0.0,
      "z" : 0.0
    }
  },
  "event" : {
    "id" : "b63d0d65-eaf8-44c5-be51-979a009f8a33",
    "type" : "moving"
  },
  "videoPath" : ""
}
....   // stop with Ctrl+C
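
Once the messages arrive as JSON, a consumer only needs a few of the fields. The following minimal Python sketch, using only the standard json module, picks some of them out of a trimmed copy of the message above; the function name summarize and the choice of fields are my own, and no Kafka client is involved here.

```python
import json

# Trimmed copy of the message printed above; only a few fields are kept.
SAMPLE = """
{
  "messageid" : "220a8cea-9dc5-448c-8da6-cc9998a5fefe",
  "@timestamp" : "2019-07-22T02:01:54.067Z",
  "sensor" : { "id" : "CAMERA_ID", "type" : "Camera" },
  "object" : {
    "id" : "db521849-00f8-44f6-b20c-daf3111b6ccd",
    "vehicle" : { "type" : "sedan", "make" : "Bugatti", "color" : "blue", "license" : "XX1234" }
  },
  "event" : { "type" : "moving" }
}
"""


def summarize(message_text):
    """Extract a flat summary from one deepstream-test4-app style message."""
    msg = json.loads(message_text)
    vehicle = msg.get("object", {}).get("vehicle", {})
    parts = (vehicle.get("color"), vehicle.get("make"), vehicle.get("type"))
    return {
        "timestamp": msg.get("@timestamp"),
        "sensor": msg.get("sensor", {}).get("id"),
        "event": msg.get("event", {}).get("type"),
        "license": vehicle.get("license"),
        "vehicle": " ".join(p for p in parts if p),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```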

2. Installing and testing Apache Kafka

Apache Kafka needs Apache ZooKeeper, so install it as follows.

apt update and installing a Java JDK or default-jre

Kafka needs a JDK or JRE

$ sudo apt-get update
$ sudo apt-get install openjdk-8-jdk

2.1 Apache ZooKeeper basic installation

zookeeper site
  https://zookeeper.apache.org/releases.html

package download
  https://www-eu.apache.org/dist/zookeeper/
http://www-us.apache.org/dist/zookeeper
  http://apache.mirror.globo.tech/zookeeper

Installing ZooKeeper (via apt)

$ sudo apt-get install zookeeperd

$ find / -name zkServer.sh 2> /dev/null 
/usr/share/zookeeper/bin/zkServer.sh

Installing from the downloaded package

The apt package installs under /usr/share, so this manual install goes to /usr/local/ instead.

$ wget https://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
$ tar zxvf zookeeper-3.4.14.tar.gz
$ sudo mv zookeeper-3.4.14/ /usr/local/zookeeper
$ sudo chown -R root:root /usr/local/zookeeper

$ find / -name zkServer.sh 2> /dev/null      // a newer version, so the layout is slightly different
/usr/local/zookeeper/zookeeper-contrib/zookeeper-contrib-rest/src/test/zkServer.sh
/usr/local/zookeeper/zookeeper-contrib/zookeeper-contrib-zkpython/src/test/zkServer.sh
/usr/local/zookeeper/zookeeper-client/zookeeper-client-c/tests/zkServer.sh
/usr/local/zookeeper/bin/zkServer.sh
/usr/local/zookeeper/zookeeper-recipes/zookeeper-recipes-lock/src/main/c/tests/zkServer.sh
/usr/local/zookeeper/zookeeper-recipes/zookeeper-recipes-queue/src/main/c/tests/zkServer.sh

2.2 Registering Apache ZooKeeper as a service

For convenience, register it as a systemd service as follows.

Registering zookeeper.service with systemd

$ sudo vi /etc/systemd/system/zookeeper.service
[Unit]
Description=zookeeper-server
After=network.target

[Service]
Type=forking
User=root
Group=root
SyslogIdentifier=zookeeper-server
WorkingDirectory=/usr/share/zookeeper
Restart=on-failure
RestartSec=0s
ExecStart=/usr/share/zookeeper/bin/zkServer.sh start
ExecStop=/usr/share/zookeeper/bin/zkServer.sh stop

[Install]
WantedBy=multi-user.target

Enabling the service and checking it

$ sudo systemctl daemon-reload
$ sudo systemctl enable zookeeper.service
$ sudo systemctl start zookeeper.service
$ sudo systemctl status zookeeper.service

Checking ZooKeeper status

$ netstat -nlp|grep 2181
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6       0      0 :::2181                 :::*                    LISTEN      -  

$ sudo /usr/share/zookeeper/bin/zkServer.sh status     
ZooKeeper JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Error contacting service. It is probably not running.

or 

$ echo status | nc 127.0.0.1 2181
Zookeeper version: 3.4.10-3--1, built on Sat, 03 Feb 2018 14:58:02 -0800
Clients:
 /127.0.0.1:45894[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 3
Sent: 2
Connections: 1
Outstanding: 0
Zxid: 0x0
Mode: standalone
Node count: 4
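
The status text returned over port 2181 is a series of "key: value" lines, so a health check can be scripted. This is a minimal sketch that only parses text already captured (for example from the nc command above); parse_zk_stat is a hypothetical name of my own, and the sample is a shortened copy of the output shown here.

```python
def parse_zk_stat(text):
    """Turn the ZooKeeper status text into a dict of top-level fields."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip blank lines and the indented per-client entries.
        if not sep or line.startswith(" ") or not key.strip():
            continue
        info[key.strip()] = value.strip()
    return info


# Shortened copy of the output shown above.
SAMPLE = "\n".join([
    "Zookeeper version: 3.4.10-3--1, built on Sat, 03 Feb 2018 14:58:02 -0800",
    "Clients:",
    " /127.0.0.1:45894[0](queued=0,recved=1,sent=0)",
    "",
    "Latency min/avg/max: 0/0/0",
    "Mode: standalone",
    "Node count: 4",
])

if __name__ == "__main__":
    status = parse_zk_stat(SAMPLE)
    print(status["Mode"], status["Node count"])
```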

2.3 Installing Apache Kafka and registering the service

Apache Kafka
  https://kafka.apache.org/

Package download (check which versions are available)
  https://www-eu.apache.org/dist/kafka
  http://www-us.apache.org/dist/kafka
  http://apache.mirror.globo.tech/kafka

The version numbers in the download names were confusing at first.
Kafka is written in Scala and Java, so the archive name carries two versions:
the Scala version it was built against, followed by the Kafka version.

e.g. kafka/2.2.1/kafka_2.12-2.2.1.tgz (Scala: 2.12, Kafka: 2.2.1)
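
To make the naming rule concrete, here is a minimal Python sketch that splits such an archive name into its two versions. The function name split_kafka_archive and the regular expression are my own and only assume the kafka_<scala>-<kafka>.tgz pattern seen in the example.

```python
import re

# kafka_<scala version>-<kafka version>.tgz
ARCHIVE = re.compile(r"^kafka_(?P<scala>\d+\.\d+(?:\.\d+)?)-(?P<kafka>\d+(?:\.\d+)+)\.tgz$")


def split_kafka_archive(path):
    """Return (scala_version, kafka_version) for a Kafka binary archive name."""
    name = path.rsplit("/", 1)[-1]  # drop any directory part
    match = ARCHIVE.match(name)
    if match is None:
        raise ValueError(f"unexpected archive name: {name}")
    return match.group("scala"), match.group("kafka")


if __name__ == "__main__":
    print(split_kafka_archive("kafka/2.2.1/kafka_2.12-2.2.1.tgz"))
```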

The download sites currently only carry Kafka 2.x.x, so I switched to that line.

Kafka download and basic test

$ wget http://www-us.apache.org/dist/kafka/2.2.1/kafka_2.12-2.2.1.tgz  // available for download

$ tar zxvf kafka_2.12-2.2.1.tgz
$ sudo mv kafka_2.12-2.2.1 /usr/local/kafka

$ cd /usr/local/kafka

$ vi config/server.properties     // check or change where the log data is stored
log.dirs=/tmp/kafka-logs/

If you create a topic, its data can be found under the log.dirs directory above.

Registering the Kafka service

$ sudo vi /etc/systemd/system/kafka.service          // create a new file
[Unit]
Description=kafka-server
After=network.target

[Service]
Type=simple
User=root
Group=root
SyslogIdentifier=kafka-server
WorkingDirectory=/usr/local/kafka
Restart=no
RestartSec=0s
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target

$ sudo systemctl daemon-reload
$ sudo systemctl enable kafka.service
$ sudo systemctl start kafka.service
$ sudo systemctl status kafka.service

Checking that Kafka is running

$ sudo apt install lsof 
$ sudo lsof -i :2181                       // with only ZooKeeper running
COMMAND   PID      USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    18819 zookeeper   43u  IPv6 759187      0t0  TCP *:2181 (LISTEN)

$ sudo lsof -i :2181      // with both ZooKeeper and Kafka running
COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    22350 nvidia  109u  IPv6 767085      0t0  TCP *:2181 (LISTEN)
java    22350 nvidia  110u  IPv6 761679      0t0  TCP localhost:2181->localhost:46018 (ESTABLISHED)
java    22675 nvidia  109u  IPv6 766151      0t0  TCP localhost:46018->localhost:2181 (ESTABLISHED)

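2.4 Testing Kafka

Kafka reportedly stores each topic as files on disk, and clients communicate through those topics.
Create a new topic, testTopic, as below, then send and receive messages on it.

Creating and checking testTopic in Kafka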

$ cd /usr/local/kafka
$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic testTopic
Created topic "testTopic".

$ bin/kafka-topics.sh --list --zookeeper localhost:2181     // list the created topics
testTopic

$ ls /tmp/kafka-logs/     // check that testTopic exists
..
testTopic-0/

Sending messages to Kafka (testTopic)

$ cd /usr/local/kafka
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic testTopic    
>Welcome to kafka
>This is my first topic      // Ctrl+C

Checking the received messages in Kafka (testTopic)

$ cd /usr/local/kafka
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic testTopic --from-beginning
Welcome to kafka
This is my first topic

Kafka installation guides
  https://zzsza.github.io/data/2018/07/24/apache-kafka-install/
  https://tecadmin.net/install-apache-kafka-ubuntu/
  https://linuxhint.com/install-apache-kafka-ubuntu/
  https://jwon.org/install-kafka-on-ubuntu/
  https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-18-04

2.5 Using systemctl

Checking the systemd default target

$ systemctl get-default
multi-user.target

$ sudo systemctl set-default graphical.target  // switch to graphical (X-Window) login

$ systemctl get-default
graphical.target

Listing systemd targets

$ systemctl list-units --type target
UNIT                  LOAD   ACTIVE SUB    DESCRIPTION              
basic.target          loaded active active Basic System             
bluetooth.target      loaded active active Bluetooth                
cryptsetup.target     loaded active active Encrypted Volumes        
getty.target          loaded active active Login Prompts            
graphical.target      loaded active active Graphical Interface      
local-fs-pre.target   loaded active active Local File Systems (Pre) 
local-fs.target       loaded active active Local File Systems       
multi-user.target     loaded active active Multi-User System        
network-online.target loaded active active Network is Online        
network.target        loaded active active Network                  
nfs-client.target     loaded active active NFS client services      
paths.target          loaded active active Paths                    
remote-fs-pre.target  loaded active active Remote File Systems (Pre)
remote-fs.target      loaded active active Remote File Systems      
slices.target         loaded active active Slices                   
sockets.target        loaded active active Sockets                  
sound.target          loaded active active Sound Card               
swap.target           loaded active active Swap                     
sysinit.target        loaded active active System Initialization    
time-sync.target      loaded active active System Time Synchronized 
timers.target         loaded active active Timers                   

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

systemctl usage
https://www.lesstif.com/pages/viewpage.action?pageId=24445064

How to register a system service
https://fmd1225.tistory.com/93
https://pinedance.github.io/blog/2017/09/12/Ubuntu-16.04-system-service-%EB%93%B1%EB%A1%9D%ED%95%98%EA%B8%B0

7/10/2019

Structure of DeepStream's Gst-nvinfer and the TensorRT IPlugin feature

1. Basic structure of DeepStream's Gst-nvinfer

Gst-nvinfer runs inference inside GStreamer using TensorRT.
Its main functions are object detection and classification.

For details on TensorRT itself, look up NVIDIA's TensorRT documentation or refer to the manual notes I have already written up.

Gst-nvinfer does not only wrap TensorRT; it also offers an extension interface called IPlugin.
This links against a shared library to extend the functionality, and is mainly used for custom layers.

It operates within the GStreamer pipeline structure as follows; note the data that is passed along.

As with any GStreamer element, input buffers come in and output buffers go out.
nvll_infer, the low-level library (libnvll_infer), is said to process float RGB or BGR planar data according to the network dimensions.

Input

Gst_Buffer
Meta Data (NvStreamMeta)
Caffe Model and Caffe Prototxt
ONNX
UFF file

For items 3, 4 and 5 above (Caffe, ONNX, UFF), looking at the platforms TensorRT supports and at its parsers makes this easy to understand.

Output

Gst_Buffer
Meta Data (NvStreamMeta)
Infer Meta Data (NvDsMeta): class information and bounding-box information generated by nvinfer

Control Parameters

These are the control parameters of Gst-nvinfer as a GStreamer element; they are currently set through the config file.

Batch size
Inference interval
Clustering parameters
Class threshold
Bounding box color
Width and height of bounding box to filter boxes for downstream component

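Once you understand the structure of Gst-nvinfer, the other Gst plugins are easy to follow because they are structured almost the same way; the important part seems to be how TensorRT is plugged in and applied.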

1.1 Gst-nvinfer properties

Gst properties can be thought of as the arguments you pass directly when running a GStreamer command line.
In source code they are set with the g_object_set function; make sure you know exactly what each one does.

The config-file-path property points to a file with the detailed settings, described in 1.2.

1.2 Gst-nvinfer File Configuration Specifications

The Gst-nvinfer config file uses the Key File Format; the site below or any of the sample config files makes it easy to understand.

[groupname]
Key=Value
https://specifications.freedesktop.org/desktop-entry-spec/latest/

The sample config_infer_primary.txt is in this Key File Format.
A Gst-nvinfer config file is broadly organized as follows; see the manual for the detailed settings.

[property] : mandatory settings, so you must know them; see the manual
[class-attrs-all] : detection settings for all classes
[class-attrs-<class-id>] : separate settings for a specific class-id only

Only the Gst-nvinfer plugin supports this feature. Its source is currently shipped as a library only, so the detailed structure and the parsing code are not visible;
refer to the manual and the sample configs, set each value, and test.

Gst-nvinfer (1st GIE, TensorRT) settings

$ cd ~/deepstream_sdk_on_jetson/samples/configs/deepstream-app
$ cat config_infer_primary.txt  
....
[property]
net-scale-factor=0.0039215697906911373
## caffe model 
model-file=../../models/Primary_Detector/resnet10.caffemodel
## layer structure and layer names of the model
proto-file=../../models/Primary_Detector/resnet10.prototxt
model-engine-file=../../models/Primary_Detector/resnet10.caffemodel_b30_int8.engine
labelfile-path=../../models/Primary_Detector/labels.txt
int8-calib-file=../../models/Primary_Detector/cal_trt4.bin
batch-size=30
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
## the labels file above has 4 classes
num-detected-classes=4
interval=0
gie-unique-id=1
## 0: Custom 4 : Resnet 
parse-func=4

## output layer names from the proto-file above, i.e. the layers that produce the results
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid

#parse-bbox-func-name=NvDsInferParseCustomResnet
#custom-lib-path=/home/nvidia/deepstream_sdk_on_jetson/sources/libs/nvdsparsebbox/libnvdsparsebbox.so

## DBSCAN or OpenCV groupRectangles for grouping detected objects (clustering)
#enable-dbscan=1

[class-attrs-all]
threshold=0.2
group-threshold=1
## Set eps=0.7 and minBoxes for enable-dbscan=1
eps=0.2
#minBoxes=3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

## Per class configuration
#[class-attrs-2]
#threshold=0.6
#eps=0.5
#group-threshold=3
#roi-top-offset=20
#roi-bottom-offset=10
#detected-min-w=40
#detected-min-h=40
#detected-max-w=400
#detected-max-h=800

[property] settings to know

model-file: the base Caffe model
proto-file: the layer graph provided by Caffe, as far as I can tell
model-engine-file: the TensorRT model engine, generated on first run from the model-file above
labelfile-path: the label file currently provided
network-mode: selects whether the engine is built for FP32, INT8 or FP16
batch-size: number of incoming buffers (frames/objects); for now think of it as frames
num-detected-classes: match this to the number of labels
parse-func: [0: custom] [4: resnet]; see the manual for details
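
Since the config is a plain [group] / key=value file, the [property] values can be inspected outside DeepStream. The following is a minimal sketch using Python's configparser as a stand-in; the plugin's own parser is not visible, so this only illustrates the file format. read_nvinfer_config and NETWORK_MODES are hypothetical names of my own, and the sample is a shortened copy of config_infer_primary.txt above.

```python
import configparser

# Shortened copy of the config shown above.
SAMPLE = """
[property]
net-scale-factor=0.0039215697906911373
model-file=../../models/Primary_Detector/resnet10.caffemodel
batch-size=30
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=4
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid

[class-attrs-all]
threshold=0.2
eps=0.2
"""

# Mapping taken from the comment in the config file.
NETWORK_MODES = {0: "FP32", 1: "INT8", 2: "FP16"}


def read_nvinfer_config(text):
    """Parse the key-file text and return a few [property] values."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    prop = parser["property"]
    return {
        "precision": NETWORK_MODES[prop.getint("network-mode")],
        "batch_size": prop.getint("batch-size"),
        "num_classes": prop.getint("num-detected-classes"),
        "output_layers": prop["output-blob-names"].split(";"),
        "threshold": parser["class-attrs-all"].getfloat("threshold"),
    }


if __name__ == "__main__":
    print(read_nvinfer_config(SAMPLE))
```

Lines starting with # are treated as comments by configparser, which matches how the sample file disables options.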

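The lines commented out above (shown in blue in the original post) belong to the libs/nvdsparsebbox feature, which is not installed as a binary by default; its role is bounding box parsing.
To use it, change to parse-func=0 and build sources/libs/nvdsparsebbox/libnvdsparsebbox.so, then set it in the config (covered again in the IPlugin section below).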

DBSCAN (Density-based spatial clustering of applications with noise)

Among the options above there is DBSCAN, which appears to be used for clustering; the OpenCV option is probably similar. (It is used together with the eps setting above.)

enable-dbscan
eps=0.2

https://bcho.tistory.com/1205
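
To make the clustering idea concrete, here is a minimal sketch of the generic DBSCAN algorithm applied to box centers. It is not the Gst-nvinfer implementation, which is not available in source form; how nvinfer maps `eps` and `minBoxes` onto the algorithm is an assumption, and the function names and sample points are made up for illustration.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = noise  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
        cluster += 1
    return labels


# Hypothetical normalized box centers: two dense groups and one stray box.
centers = [(0.10, 0.10), (0.12, 0.11), (0.11, 0.13),
           (0.80, 0.80), (0.82, 0.81), (0.81, 0.79),
           (0.50, 0.10)]
labels = dbscan(centers, eps=0.05, min_pts=3)
print(labels)  # -> [0, 0, 0, 1, 1, 1, -1]
```

A detector typically emits several overlapping boxes per object, so each dense group would be merged into one box and the isolated one dropped as noise.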

ROI (Region of Interest) settings

roi-top-offset=0 : offset of the ROI from the top of the frame
roi-bottom-offset=0 : offset of the ROI from the bottom of the frame

TensorRT Engine

model-engine-file

The *.caffemodel_xxx.engine file above is the TensorRT engine; it is generated on first run and can be deleted if not needed.

Caution

batch-size
network-mode

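The settings above are reinterpreted by the deepstream-app config values described below (they can override the Gst-nvinfer config values), so look at the overall configuration as well, not only the settings above.

More examples (for the Tesla version)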
https://github.com/NVIDIA-AI-IOT/redaction_with_deepstream/blob/master/configs/pgie_config_fd_lpd.txt

1.3 deepstream-app Config

This config file is used only by deepstream-app. Its structure is the same as the one above; the difference is that it can override the Gst-nvinfer File Configuration values among the Gst-nvinfer properties.

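The parser reads each value from this file, as in the part analyzed earlier, and stores them separately. (From here on this can be verified in the source.)
The previous config file holds the Primary GIE (TensorRT) settings; the same keys can be set again in the config below to override them.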
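
The override rule can be pictured as a simple overlay of two key/value maps. The sketch below is only an illustration of that idea in Python, with values copied from the two sample files quoted in this post; `effective_settings` is a hypothetical helper, and the real merge happens inside deepstream-app and Gst-nvinfer.

```python
# [property] values from config_infer_primary.txt (excerpt).
INFER_CONFIG = {
    "model-engine-file": "../../models/Primary_Detector/resnet10.caffemodel_b30_int8.engine",
    "batch-size": "30",
    "interval": "0",
    "gie-unique-id": "1",
}

# [primary-gie] values from the deepstream-app config (excerpt).
APP_PRIMARY_GIE = {
    "model-engine-file": "../../models/Primary_Detector/resnet10.caffemodel_b4_int8.engine",
    "batch-size": "4",
    "interval": "0",
    "gie-unique-id": "1",
}


def effective_settings(infer_cfg, app_cfg):
    """Overlay app-level keys on the infer config; app values win."""
    merged = dict(infer_cfg)
    merged.update(app_cfg)
    return merged


effective = effective_settings(INFER_CONFIG, APP_PRIMARY_GIE)

# Keys that exist in both files with different values.
overridden = sorted(k for k, v in APP_PRIMARY_GIE.items()
                    if k in INFER_CONFIG and INFER_CONFIG[k] != v)

print(effective["batch-size"])  # -> 4
print(overridden)               # -> ['batch-size', 'model-engine-file']
```

This is why the b30 engine named in config_infer_primary.txt is not the one used by the 4-channel sample: the app-level batch-size=4 and b4 engine path take precedence.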

Configuration Groups (see the manual)

Group	Configuration
application	Configuration settings that are not related to a specific component.
tiled-display	Settings that configure tiled display in the application.
source	Settings that specify source properties. There can be multiple sources, with one configuration group for each source. The groups must be named [source0], [source1], and so on.
streammux	Settings that specify the properties and modify the behavior of the streammux component.
primary-gie	Settings that specify the properties and modify the behavior of the primary GIE.
secondary-gie	Settings that specify the properties and modify the behavior of the secondary GIE. There may be multiple secondary GIEs, with one configuration group for each GIE. These groups must be named [secondary-gie0], [secondary-gie1], etc.
tracker	Settings that specify the properties and modify the behavior of the object tracker.
osd	Settings that specify the properties and modify the behavior of the on-screen display (OSD) component which overlays text and rectangles on the frame.
sink	Settings that specify the properties and modify the behavior of the sink components, which represent outputs such as displays or files for rendering, encoding, and file saving. The pipeline may contain multiple sinks, with one configuration group for each sink. The groups must be named [sink0], [sink1], etc.
tests	Settings for diagnostics and debugging. This configuration group is experimental.

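Other group settings

Everything else under sources/gst-plugins does not seem to be officially supported yet and looks like source still under development. For example, a [ds-example] option exists, but it appears to be in testing.
[message-broker] and [message-conv] also look like they will be supported, but the current source does not support them.

(see deepstream_config_file_parser.c/h)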

Full configuration for 4-channel DeepStream

Each [] marks a group. A group holds the property values of one GStreamer plugin, and these values can override the Gst-nvinfer file config values above.

For the 1st GIE (Gst-nvinfer), the [primary-gie] group overlaps with config_infer_primary.txt and can redefine its values, so the settings can be made here.

$ cd ~/deepstream_sdk_on_jetson/samples/configs/deepstream-app
$ cat source4_720p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt 
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=1
rows=2
columns=2
width=1280
height=720

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file://../../streams/sample_720p.mp4
num-sources=4

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=2
sync=1
display-id=0
offset-x=0
offset-y=0
width=0
height=0
overlay-id=1
source-id=0

[sink1]
enable=0
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
codec=1
sync=0
bitrate=2000000
output-file=out.mp4
source-id=0

[sink2]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=4
#1=h264 2=h265 3=mpeg4
## only h264 is supported right now.
codec=1
sync=0
bitrate=4000000
# set below properties in case of RTSPStreaming
rtsp-port=8554
udp-port=5400


[osd]
enable=1
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0

[streammux]
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=4
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1280
height=720

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
model-engine-file=../../models/Primary_Detector/resnet10.caffemodel_b4_int8.engine
labelfile-path=../../models/Primary_Detector/labels.txt
batch-size=4
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
config-file=config_infer_primary.txt

[tracker]
enable=1
tracker-width=640
tracker-height=368
#1 - KLT, 2 - IOU, other values are invalid
tracker-algorithm=1
iou-threshold=0.1

[secondary-gie0]
enable=1
model-engine-file=../../models/Secondary_VehicleTypes/resnet18.caffemodel_b16_int8.engine
labelfile-path=../../models/Secondary_VehicleTypes/labels.txt
batch-size=16
gie-unique-id=4
operate-on-gie-id=1
operate-on-class-ids=0;
config-file=config_infer_secondary_vehicletypes.txt

[secondary-gie1]
enable=1
model-engine-file=../../models/Secondary_CarColor/resnet18.caffemodel_b16_int8.engine
labelfile-path=../../models/Secondary_CarColor/labels.txt
batch-size=16
gie-unique-id=5
operate-on-gie-id=1
operate-on-class-ids=0;
config-file=config_infer_secondary_carcolor.txt

[secondary-gie2]
enable=1
model-engine-file=../../models/Secondary_CarMake/resnet18.caffemodel_b16_int8.engine
labelfile-path=../../models/Secondary_CarMake/labels.txt
batch-size=16
gie-unique-id=6
operate-on-gie-id=1
operate-on-class-ids=0;
config-file=config_infer_secondary_carmake.txt

[tests]
file-loop=0

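Interestingly, do not assume that the groups above run in the listed order as they would in a GStreamer pipeline; the source has to be analyzed to understand how it actually works.
The information above is configuration only, so the overall pipeline structure must be checked in the source.

Summary of the basics to know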

[plug-in ]
enable : whether the plugin is used
batch-size : number of data items

[tiled-display]

Settings for the window shown on screen

rows: 2, columns: 2 displays 4 channels
width, height : set as you like
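
As a quick arithmetic check of the tiled-display values, the sketch below computes the size and origin of each tile for a rows x columns grid. The row-major ordering and the helper name `tile_layout` are assumptions for illustration; the actual placement is decided by the nvmultistreamtiler element.

```python
def tile_layout(rows, columns, width, height):
    """Return (tile_w, tile_h, origins) for a rows x columns grid.

    origins lists the top-left corner of each tile in row-major order
    (an assumption, not taken from the tiler source).
    """
    tile_w, tile_h = width // columns, height // rows
    origins = [(c * tile_w, r * tile_h)
               for r in range(rows) for c in range(columns)]
    return tile_w, tile_h, origins


# Values from the [tiled-display] group of the 4-channel sample config.
tile_w, tile_h, origins = tile_layout(rows=2, columns=2, width=1280, height=720)
print(tile_w, tile_h)  # -> 640 360
print(origins)         # -> [(0, 0), (640, 0), (0, 360), (640, 360)]
```

With four 720p sources in a 1280x720 window, each stream is therefore shown at a quarter of its original area.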

[source0]

Settings for the 1st input source

type=3 // MultiURI; both RTSP and file URIs can be used
uri=rtsp://10.0.0.196:554/h264
uri=file://../../streams/sample_720p.mp4
num-sources: 4 # 4 channels

[sink0]

Output of the 1st plugin; when using RTSP, the sync setting below is also important

sync: Indicates how the stream is to be rendered ( 0 = As fast as possible ,1 = Synchronously)

See the DeepStream SDK manual

2. DeepStream basic source analysis

Let's go through the GStreamer structure and the DeepStream properties described above.

Overall structure of the DeepStream SDK source

$ cd ~/deepstream_sdk_on_jetson/sources

$ tree -t -L 2 .
.
├── apps                                //Deepstream Sample Application 
│   ├── apps-common             
│   └── sample_apps             // DeepStream Samples App
├── gst-plugins                        // DeepStream PlugIn  
│   ├── gst-dsexample
│   ├── gst-nvmsgbroker
│   └── gst-nvmsgconv
├── libs                                  // Deepstream Plugin Lib      
│   ├── kafka_protocol_adaptor
│   ├── nvdsparsebbox        // IPlugin for nvinfer (loadable from the Gst-nvinfer config file)
│   ├── nvmsgconv
│   └── nvds_logger
├── objectDetector_FasterRCNN      // IPlugin for nvinfer, FasterRCNN
│   ├── config_infer_primary_fasterRCNN.txt
│   ├── deepstream_app_config_fasterRCNN.txt
│   ├── labels.txt
│   ├── README
│   └── nvdsinfer_custom_impl_fasterRCNN
├── objectDetector_SSD              // IPlugin for nvinfer, SSD
│   ├── config_infer_primary_ssd.txt
│   ├── deepstream_app_config_ssd.txt
│   ├── nvdsinfer_custom_impl_ssd
│   └── README
└── includes
    ├── gstnvdsinfer.h
    ├── gstnvdsmeta.h
    ├── gst-nvquery.h
    ├── gstnvstreammeta.h
    ├── nvbuffer.h
    ├── nvbuf_utils.h
    ├── nvdsinfer_custom_impl.h
    ├── nvdsinfer.h
    ├── nvds_logger.h
    ├── nvdsmeta.h
    ├── nvds_msgapi.h
    └── nvosd.h

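GStreamer plugin structure of the DeepStream sample apps

To understand the GStreamer plugin structure of the apps, find the gst_element_factory_make calls and check how the elements are added with gst_bin_add or gst_bin_add_many and linked with gst_element_link.

deepstream-app : analyze create_pipeline; the pipeline changes with the number of config files and their settings
deepstream-test1-app : the remaining three (test1 to test3) are easy to understand
deepstream-test2-app
deepstream-test3-app

$ vi apps/apps-common/includes/deepstream_config.h   // overall GStreamer element names (DeepStream structure)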
...... 
#define NVDS_ELEM_SRC_CAMERA_CSI "nvarguscamerasrc"
#define NVDS_ELEM_SRC_CAMERA_V4L2 "v4l2src"
#define NVDS_ELEM_SRC_URI "uridecodebin"

#define NVDS_ELEM_QUEUE "queue"
#define NVDS_ELEM_CAPS_FILTER "capsfilter"
#define NVDS_ELEM_TEE "tee"

#define NVDS_ELEM_PGIE "nvinfer"
#define NVDS_ELEM_SGIE "nvinfer"
#define NVDS_ELEM_TRACKER "nvtracker"

#define NVDS_ELEM_VIDEO_CONV "nvvidconv"
#define NVDS_ELEM_STREAM_MUX "nvstreammux"
#define NVDS_ELEM_STREAM_DEMUX "nvstreamdemux"
#define NVDS_ELEM_TILER "nvmultistreamtiler"
#define NVDS_ELEM_OSD "nvosd"
#define NVDS_ELEM_DSEXAMPLE_ELEMENT "dsexample"

#define NVDS_ELEM_MSG_CONV "nvmsgconv"
#define NVDS_ELEM_MSG_BROKER "nvmsgbroker"
........

$ grep -r gst_element_factory_make .   // find how the GStreamer pipeline is built
./apps/apps-common/src/deepstream_primary_gie_bin.c:      gst_element_factory_make (NVDS_ELEM_VIDEO_CONV, "primary_gie_conv");
./apps/apps-common/src/deepstream_primary_gie_bin.c:  bin->queue = gst_element_factory_make (NVDS_ELEM_QUEUE, "primary_gie_queue");
./apps/apps-common/src/deepstream_primary_gie_bin.c:      gst_element_factory_make (NVDS_ELEM_CAPS_FILTER, "primary_gie_caps");
./apps/apps-common/src/deepstream_primary_gie_bin.c:      gst_element_factory_make (NVDS_ELEM_PGIE, "primary_gie_classifier");
./apps/apps-common/src/deepstream_tracker_bin.c:      gst_element_factory_make (NVDS_ELEM_TRACKER, "tracking_tracker");
./apps/apps-common/src/deepstream_sink_bin.c:      bin->sink = gst_element_factory_make (NVDS_ELEM_SINK_EGL, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:      bin->sink = gst_element_factory_make (NVDS_ELEM_SINK_OVERLAY, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:          gst_element_factory_make (NVDS_ELEM_SINK_FAKESINK, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:        gst_element_factory_make (NVDS_ELEM_EGLTRANSFORM, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->queue = gst_element_factory_make (NVDS_ELEM_QUEUE, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->queue = gst_element_factory_make (NVDS_ELEM_QUEUE, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->transform = gst_element_factory_make (NVDS_ELEM_VIDEO_CONV, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:      bin->encoder = gst_element_factory_make (NVDS_ELEM_ENC_H264, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:      bin->encoder = gst_element_factory_make (NVDS_ELEM_ENC_H265, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:      bin->encoder = gst_element_factory_make (NVDS_ELEM_ENC_MPEG4, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:      bin->mux = gst_element_factory_make (NVDS_ELEM_MUX_MP4, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:      bin->mux = gst_element_factory_make (NVDS_ELEM_MKV, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->sink = gst_element_factory_make (NVDS_ELEM_SINK_FILE, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->queue = gst_element_factory_make (NVDS_ELEM_QUEUE, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->transform = gst_element_factory_make (NVDS_ELEM_VIDEO_CONV, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:      bin->encoder = gst_element_factory_make (NVDS_ELEM_ENC_H264, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:      gst_element_factory_make (NVDS_ELEM_CAPS_FILTER, elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->rtph264pay = gst_element_factory_make ("rtph264pay", elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->sink = gst_element_factory_make ("udpsink", elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->queue = gst_element_factory_make (NVDS_ELEM_QUEUE, "sink_bin_queue");
./apps/apps-common/src/deepstream_sink_bin.c:  bin->tee = gst_element_factory_make (NVDS_ELEM_TEE, "sink_bin_tee");
./apps/apps-common/src/deepstream_dsexample.c:      gst_element_factory_make (NVDS_ELEM_QUEUE, "dsexample_queue");
./apps/apps-common/src/deepstream_dsexample.c:      gst_element_factory_make (NVDS_ELEM_DSEXAMPLE_ELEMENT, "dsexample0");
..............
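
The grep output is easier to read once it is grouped by source file. The sketch below does that with a regular expression over a few lines copied from the output above; the names `CALL` and `factories_by_file` are hypothetical, and the script only processes text, it does not touch GStreamer.

```python
import re
from collections import defaultdict

# A few lines copied from the grep output above.
GREP_OUTPUT = """\
./apps/apps-common/src/deepstream_primary_gie_bin.c:      gst_element_factory_make (NVDS_ELEM_PGIE, "primary_gie_classifier");
./apps/apps-common/src/deepstream_tracker_bin.c:      gst_element_factory_make (NVDS_ELEM_TRACKER, "tracking_tracker");
./apps/apps-common/src/deepstream_sink_bin.c:  bin->rtph264pay = gst_element_factory_make ("rtph264pay", elem_name);
./apps/apps-common/src/deepstream_sink_bin.c:  bin->sink = gst_element_factory_make ("udpsink", elem_name);
"""

# <file>: ... gst_element_factory_make (<factory>, <name>)
CALL = re.compile(
    r'^(?P<file>[^:]+):.*gst_element_factory_make\s*\(\s*(?P<factory>[^,]+?)\s*,'
)


def factories_by_file(text):
    """Map each source file to the factory arguments it instantiates."""
    result = defaultdict(list)
    for line in text.splitlines():
        match = CALL.match(line)
        if match:
            # Factory is either an NVDS_ELEM_* macro or a quoted literal.
            result[match.group("file")].append(match.group("factory").strip('"'))
    return dict(result)


by_file = factories_by_file(GREP_OUTPUT)
for path, factories in by_file.items():
    print(path, factories)
```

The NVDS_ELEM_* macros can then be resolved to element names through deepstream_config.h, for example NVDS_ELEM_PGIE to "nvinfer".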

Structure of the deepstream-app config from 1.3 above

Finding the parser code for the deepstream-app config file

$ tree -t apps/apps-common
apps/apps-common
├── includes
│   ├── deepstream_app.h
│   ├── deepstream_colors.h
│   ├── deepstream_common.h
│   ├── deepstream_config_file_parser.h         // GROUPs of the deepstream-app config in 1.3 above
│   ├── deepstream_config.h                           // GStreamer plugin names (see apps-common *.c)
│   ├── deepstream_dsexample.h
│   ├── deepstream_gie.h
│   ├── deepstream_metadata_pool.h
│   ├── deepstream_osd.h
│   ├── deepstream_perf.h
│   ├── deepstream_primary_gie.h
│   ├── deepstream_secondary_gie.h
│   ├── deepstream_sinks.h
│   ├── deepstream_sources.h
│   ├── deepstream_streammux.h
│   ├── deepstream_tiled_display.h
│   └── deepstream_tracker.h
└── src
    ├── deepstream_common.c
    ├── deepstream_config_file_parser.c                 // GROUPs of the deepstream-app config in 1.3 above
    ├── deepstream_dsexample.c
    ├── deepstream_metadata_pool.c
    ├── deepstream_osd_bin.c
    ├── deepstream_perf.c
    ├── deepstream_primary_gie_bin.c            
    ├── deepstream_secondary_gie_bin.c       
    ├── deepstream_sink_bin.c                      // GStreamer output
    ├── deepstream_source_bin.c                 // GStreamer input
    ├── deepstream_streammux.c                 // video mux
    ├── deepstream_tiled_display_bin.c
    ├── deepstream_tracker_bin.c                  // tracker, KLT or IOU
    ├── deepstream_primary_gie_bin.o
    ├── deepstream_tracker_bin.o
    ├── deepstream_config_file_parser.o
    ├── deepstream_metadata_pool.o
    ├── deepstream_source_bin.o
    ├── deepstream_common.o
    ├── deepstream_perf.o
    ├── deepstream_sink_bin.o
    ├── deepstream_dsexample.o
    ├── deepstream_osd_bin.o
    ├── deepstream_secondary_gie_bin.o
    ├── deepstream_tiled_display_bin.o
    └── deepstream_streammux.o

$ cat apps/apps-common/includes/deepstream_config_file_parser.h   // see 1.3; the config for the Gst properties can be written based on the GROUPs declared here
......
#define CONFIG_GROUP_SOURCE "source"
#define CONFIG_GROUP_OSD "osd"
#define CONFIG_GROUP_PRIMARY_GIE "primary-gie"
#define CONFIG_GROUP_SECONDARY_GIE "secondary-gie"
#define CONFIG_GROUP_TRACKER "tracker"
#define CONFIG_GROUP_SINK "sink"
#define CONFIG_GROUP_TILED_DISPLAY "tiled-display"
#define CONFIG_GROUP_DSEXAMPLE "ds-example"
#define CONFIG_GROUP_STREAMMUX "streammux"
......

$ cat apps/apps-common/src/deepstream_config_file_parser.c   // see 1.3; the property keys of each GROUP declared here and the related handling can be checked
......
#define CONFIG_GROUP_ENABLE "enable"

#define CONFIG_GROUP_APP "application"
#define CONFIG_GROUP_APP_ENABLE_PERF_MEASUREMENT "enable-perf-measurement"
#define CONFIG_GROUP_APP_PERF_MEASUREMENT_INTERVAL "perf-measurement-interval-sec"
#define CONFIG_GROUP_APP_GIE_OUTPUT_DIR "gie-kitti-output-dir"

#define CONFIG_GROUP_TESTS "tests"
#define CONFIG_GROUP_TESTS_FILE_LOOP "file-loop"

#define CONFIG_GROUP_SOURCE_ENABLE "enable"
#define CONFIG_GROUP_SOURCE_TYPE "type"
#define CONFIG_GROUP_SOURCE_CAMERA_WIDTH "camera-width"
#define CONFIG_GROUP_SOURCE_CAMERA_HEIGHT "camera-height"
#define CONFIG_GROUP_SOURCE_CAMERA_FPS_N "camera-fps-n"
#define CONFIG_GROUP_SOURCE_CAMERA_FPS_D "camera-fps-d"
#define CONFIG_GROUP_SOURCE_CAMERA_CSI_SID "camera-csi-sensor-id"
#define CONFIG_GROUP_SOURCE_CAMERA_V4L2_DEVNODE "camera-v4l2-dev-node"
#define CONFIG_GROUP_SOURCE_URI "uri"
#define CONFIG_GROUP_SOURCE_LATENCY "latency"
#define CONFIG_GROUP_SOURCE_NUM_SOURCES "num-sources"
#define CONFIG_GROUP_SOURCE_INTRA_DECODE "intra-decode-enable"

#define CONFIG_GROUP_STREAMMUX_WIDTH "width"
#define CONFIG_GROUP_STREAMMUX_HEIGHT "height"
#define CONFIG_GROUP_STREAMMUX_BATCH_SIZE "batch-size"
#define CONFIG_GROUP_STREAMMUX_BATCHED_PUSH_TIMEOUT "batched-push-timeout"
#define CONFIG_GROUP_STREAMMUX_LIVE_SOURCE "live-source"

#define CONFIG_GROUP_OSD_MODE "osd-mode"
#define CONFIG_GROUP_OSD_BORDER_WIDTH "border-width"
#define CONFIG_GROUP_OSD_BORDER_COLOR "border-color"
#define CONFIG_GROUP_OSD_TEXT_SIZE "text-size"
#define CONFIG_GROUP_OSD_TEXT_COLOR "text-color"
#define CONFIG_GROUP_OSD_TEXT_BG_COLOR "text-bg-color"
#define CONFIG_GROUP_OSD_FONT "font"
#define CONFIG_GROUP_OSD_CLOCK_ENABLE "show-clock"
#define CONFIG_GROUP_OSD_CLOCK_X_OFFSET "clock-x-offset"
#define CONFIG_GROUP_OSD_CLOCK_Y_OFFSET "clock-y-offset"
#define CONFIG_GROUP_OSD_CLOCK_TEXT_SIZE "clock-text-size"
#define CONFIG_GROUP_OSD_CLOCK_COLOR "clock-color"

// Properties currently configurable for a GIE; see the manual for detailed usage
#define CONFIG_GROUP_GIE_BATCH_SIZE "batch-size"
#define CONFIG_GROUP_GIE_MODEL_ENGINE "model-engine-file"
#define CONFIG_GROUP_GIE_CONFIG_FILE "config-file"
#define CONFIG_GROUP_GIE_LABEL "labelfile-path"
#define CONFIG_GROUP_GIE_UNIQUE_ID "gie-unique-id"
#define CONFIG_GROUP_GIE_ID_FOR_OPERATION "operate-on-gie-id"
#define CONFIG_GROUP_GIE_BBOX_BORDER_COLOR "bbox-border-color"
#define CONFIG_GROUP_GIE_BBOX_BG_COLOR "bbox-bg-color"
#define CONFIG_GROUP_GIE_CLASS_IDS_FOR_OPERATION "operate-on-class-ids"
#define CONFIG_GROUP_GIE_INTERVAL "interval"
#define CONFIG_GROUP_GIE_RAW_OUTPUT_DIR "infer-raw-output-dir"
.............

These keys are read by the parse_xxx function of each GROUP; for a GIE that is parse_gie.
The Gst properties config file above is read and the values are stored in the config variable of each group; for a GIE they are stored in NvDsGieConfig.
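
The dispatch from group name to parser can be sketched as follows. This is a Python illustration only: the real parser is the C code in deepstream_config_file_parser.c, and `classify_group` is a hypothetical helper. The group names come from the headers quoted above and from the manual's table, where source, sink, and secondary-gie carry a numeric suffix.

```python
import re

# Groups that appear once per config file.
SINGLE_GROUPS = {"application", "tiled-display", "osd", "primary-gie",
                 "tracker", "streammux", "ds-example", "tests"}

# Groups that are numbered: [source0], [sink1], [secondary-gie2], ...
INDEXED_GROUPS = ("source", "sink", "secondary-gie")
INDEXED = re.compile(r"^(%s)(\d+)$" % "|".join(INDEXED_GROUPS))


def classify_group(name):
    """Return (base_group, index); index is None for unnumbered groups."""
    if name in SINGLE_GROUPS:
        return name, None
    match = INDEXED.match(name)
    if match:
        return match.group(1), int(match.group(2))
    raise ValueError("unknown group: %s" % name)


for group in ("application", "source0", "sink2", "primary-gie", "secondary-gie1"):
    print(group, "->", classify_group(group))
```

A name such as `[source]` without an index, or a misspelled group, raises `ValueError` here, which is a convenient way to catch typos in a config before launching deepstream-app.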

3. IPlugin extension of Gst-nvinfer (TensorRT)

This is an extension mechanism for TensorRT custom layers. It reportedly works by implementing the IPlugin interface and building the layer you want as a shared library.

Testing it requires a model and the matching label information for each framework; these are taken from the existing TensorRT samples before running the tests.
So let's first review the TensorRT parts used earlier.

Requirements

Pre-trained model of the framework (Caffe, UFF (TensorFlow))
label or prototxt file (framework dependent)

Samples shipped with DeepStream

IPlugin for nvinfer: FasterRCNN (Caffe)
IPlugin for nvinfer: SSD (TensorFlow)

TensorRT layer features and precision (plugin layer)

Let's review the features and restrictions of TensorRT layers and then look at the plugin layer.

https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html#layers-matrix
https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html#hardware-precision-matrix

Full index of TensorRT layers

https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#layers

TensorRT plugin layer and API

https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#pluginv2-layer
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#plugin-api-desc

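3.1 Checking the IPlugin sources: basic understanding and analysis

In the DeepStream 3.0 SDK, Gst-nvinfer (TensorRT) supports the IPluginV2 and IPluginCreator interfaces; see chapter 5 of the DeepStream SDK manual.

The IPlugin parts can be found in the DeepStream SDK 3.0 source tree shown in section 2 above.
Gst-nvinfer loads a user-built shared library that is specified by a key in its config file, and each sample implements its own bbox (bounding box) parsing.

For now the main use appears to be bbox parsing only, but other uses should be possible if needed.
You would process the data you want from the output layers, but since the function has to be registered through a config key, the possibilities are probably limited.

TensorRT IPluginV2 sample sources

libs/nvdsparsebbox: bounding box implementation
objectDetector_FasterRCNN: bounding box implementation
objectDetector_SSD: bounding box implementation

TensorRT is a C++ library engine. Even when classes have the same name, which namespace they belong to matters, so take care when reading each manual.

Main TensorRT namespaces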

nvcaffeparser1
nvinfer1
nvonnxparser
nvuffparser

Class List
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/annotated.html

nvdsinfer_custom_impl.h

NvDsInferPluginFactoryCaffeGet
NvDsInferPluginFactoryCaffeDestroy
NvDsInferPluginFactoryUffGet
NvDsInferPluginFactoryUffDestroy

Main classes offering an IPluginFactory interface

nvinfer1::IPluginFactory : TensorRT
nvuffparser::IPluginFactory : UFF Parser
nvuffparser::IPluginFactoryExt : UFF Parser
nvcaffeparser1::IPluginFactory : Caffe Parser
nvcaffeparser1::IPluginFactoryExt : Caffe Parser
nvcaffeparser1::IPluginFactoryV2 : Caffe Parser

See section 5.0 IPLUGIN INTERFACE of the DeepStream SDK manual

To use the IPluginCreator or IPluginFactory interface, a standalone custom library (shared library) must be implemented, and this library is set through the custom-lib-path key of Gst-nvinfer.
libs/nvdsparsebbox is one example: with the setting above applied to Gst-nvinfer, the library can be plugged in.

1st GIE config files

Comparing the Gst-nvinfer config files shows that a custom shared library and a custom function are registered for bbox parsing, as below (shown in blue in the original post).

$ cat deepstream_sdk_on_jetson/samples/configs/deepstream-app/config_infer_primary.txt
## the 1st GIE classifies only 4 classes
num-detected-classes=4
## array of output layer names, separated by semicolons
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid
...
# 0:custom 1:GoogleNet 2:NVIDIA Type0 3:NVIDIA Type1 4: ResNet
parse-func=4
## the function is implemented in libs/nvdsparsebbox/nvdsparsebbox.cpp
#parse-bbox-func-name=NvDsInferParseCustomResnet
## IPlugin hookup (Gst-nvinfer loads this with dlopen)
#custom-lib-path=/home/nvidia/deepstream_sdk_on_jetson/sources/libs/nvdsparsebbox/libnvdsparsebbox.so

$ cat deepstream_sdk_on_jetson/sources/objectDetector_FasterRCNN/config_infer_primary_fasterRCNN.txt 
..

# the 1st GIE classifies only 21 classes
num-detected-classes=21
# array of output layer names, separated by semicolons
output-blob-names=bbox_pred;cls_prob;rois         

...
# 0:custom 1:GoogleNet 2:NVIDIA Type0 3:NVIDIA Type1 4: ResNet 
parse-func=0
parse-bbox-func-name=NvDsInferParseCustomFasterRCNN                                                           
custom-lib-path=nvdsinfer_custom_impl_fasterRCNN/libnvdsinfer_custom_impl_fasterRCNN.so

$ cat deepstream_sdk_on_jetson/sources/objectDetector_SSD/config_infer_primary_ssd.txt 

## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0       

## the 1st GIE classifies only 91 classes
num-detected-classes=91

## array of output layer names, separated by semicolons
output-blob-names=MarkOutput_0    

...
## 0:custom 1:GoogleNet 2:NVIDIA Type0 3:NVIDIA Type1 4: ResNet 
parse-func=0        
## function that implements bbox parsing
parse-bbox-func-name=NvDsInferParseCustomSSD
custom-lib-path=nvdsinfer_custom_impl_ssd/libnvdsinfer_custom_impl_ssd.so

For now only UFF and Caffe seem to be supported; ONNX is not mentioned and there is no sample for it yet.
It is said to be supported, but I suspect it does not work at the moment.
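
The three config files above follow one pattern: when parse-func is 0 (custom), both parse-bbox-func-name and custom-lib-path are set. A small consistency check along those lines is sketched below. The rule is inferred from the samples in this post rather than from the Gst-nvinfer source, and `check_custom_parser` is a hypothetical helper.

```python
def check_custom_parser(prop):
    """Return a list of problems for a Gst-nvinfer [property] dict.

    Assumed rule (inferred from the samples): parse-func=0 needs both
    a parser function name and a shared library path.
    """
    problems = []
    if prop.get("parse-func") == "0":
        for key in ("parse-bbox-func-name", "custom-lib-path"):
            if not prop.get(key):
                problems.append("parse-func=0 but %s is not set" % key)
    return problems


# Values from config_infer_primary_fasterRCNN.txt as quoted above.
faster_rcnn = {
    "parse-func": "0",
    "parse-bbox-func-name": "NvDsInferParseCustomFasterRCNN",
    "custom-lib-path": "nvdsinfer_custom_impl_fasterRCNN/libnvdsinfer_custom_impl_fasterRCNN.so",
}

# Default ResNet config: built-in parser, custom keys commented out.
resnet = {"parse-func": "4"}

# A broken variant: custom parser requested but nothing registered.
broken = {"parse-func": "0"}

print(check_custom_parser(faster_rcnn))  # -> []
print(check_custom_parser(resnet))       # -> []
print(check_custom_parser(broken))
```

Whether the exported symbol actually exists in the library can only be confirmed at load time, when Gst-nvinfer opens the shared object.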

nvdsiplugin_xxx.cpp : (IPluginV2, IPluginCreator)

This appears to build an IPlugin from the classes mentioned above and register it with TensorRT.

nvinfer1

nvinfer1::IPluginV2 Class Reference
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_plugin_v2.html

nvinfer1::IPluginCreator Class Reference
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_plugin_creator.html

nvinfer1::IPluginFactory Class Reference
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_plugin_factory.html

nvdsparsebbox_xxx.cpp

It looks up layers of the pre-trained model by name and builds the bbox (bounding box) function from the values those layers output.
A parser is used to find the layers, and the nvinfer1 classes above appear to be used as well.

nvuffparser and nvcaffeparser1

nvuffparser::IPluginFactory Class Reference
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvuffparser_1_1_i_plugin_factory.html

nvcaffeparser1::IPluginFactory Class Reference
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvcaffeparser1_1_1_i_plugin_factory.html

For details see the DeepStream SDK manual

FasterRCNN and SSD sources in DeepStream SDK 3.0

The functions to implement appear to differ between UFF and Caffe, and UFF looks like the easiest to hook up.

$ cd ~/deepstream_sdk_on_jetson/sources/objectDetector_FasterRCNN/nvdsinfer_custom_impl_fasterRCNN
$ ls
factoryFasterRCNN.h        libnvdsinfer_custom_impl_fasterRCNN.so  nvdsinitinputlayers_fasterRCNN.cpp  nvdsparsebbox_fasterRCNN.cpp
factoryFasterRCNNLegacy.h  Makefile                                nvdsiplugin_fasterRCNN.cpp          nvdssample_fasterRCNN_common.h

$ cd ~/deepstream_sdk_on_jetson/sources/objectDetector_SSD/nvdsinfer_custom_impl_ssd
$ ls
libnvdsinfer_custom_impl_ssd.so  Makefile  nvdsiplugin_ssd.cpp  nvdsparsebbox_ssd.cpp

TensorRT IPlugin C++ API
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_plugin.html

TensorRT IPlugin Python API
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/infer/Plugin/pyPlugin.html

Extending TensorRT With Custom Layers
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#extending

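3.2 Checking the IPlugin FasterRCNN sample (Caffe based)

Build the FasterRCNN sample included in the DeepStream SDK (objectDetector_FasterRCNN) and check it against the sample video.

DeepStream FasterRCNN preparation

According to the README, a Caffe model and the prototxt file from the existing TensorRT samples are required.
Find the Caffe model on the internet, for example from the site below, and set up the environment to match the config file.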
https://docs.openvinotoolkit.org/2018_R5/_samples_object_detection_demo_README.html

$ cd ~/deepstream_sdk_on_jetson/sources/objectDetector_FasterRCNN

$ ls   // see the README; there are two config files
config_infer_primary_fasterRCNN.txt  deepstream_app_config_fasterRCNN.txt  labels.txt  nvdsinfer_custom_impl_fasterRCNN  README

$ cd nvdsinfer_custom_impl_fasterRCNN/
$ make    // check that libnvdsinfer_custom_impl_fasterRCNN.so is built

$ cd ..

//download VGG16_faster_rcnn_final.caffemodel
$ wget https://dl.dropboxusercontent.com/s/o6ii098bu51d139/faster_rcnn_models.tgz?dl=0  

$ mv 'faster_rcnn_models.tgz?dl=0' faster_rcnn_models.tgz

$ tar zxvf faster_rcnn_models.tgz 
faster_rcnn_models/
faster_rcnn_models/ZF_faster_rcnn_final.caffemodel
faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel

$ ln -s faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel VGG16_faster_rcnn_final.caffemodel

$ cp /usr/src/tensorrt/data/faster-rcnn/faster_rcnn_test_iplugin.prototxt .

FasterRCNN: comparing the video from command-line GStreamer and deepstream-app

$ pwd
/home/nvidia/deepstream_sdk_on_jetson/sources/objectDetector_FasterRCNN

$ gst-launch-1.0 filesrc location=../../samples/streams/sample_720p.mp4 ! \
        decodebin ! nvinfer config-file-path= config_infer_primary_fasterRCNN.txt ! \
        nvvidconv ! nvosd ! nvegltransform ! nveglglessink

$ deepstream-app -c deepstream_app_config_fasterRCNN.txt

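Both videos are slow and hardly usable in real time; the engine currently runs as FP32.

With command-line GStreamer, Gst-nvinfer can be configured, but there is no callback function to process the metadata, so the OSD cannot show the class labels and only detection is possible.

deepstream-app works as before, but PERF reports 7 to 8 FPS, which is slow (real time is difficult).

3.3 Checking the IPlugin SSD sample (TensorFlow based)

In the same way, build the UFF SSD sample in the DeepStream SDK (objectDetector_SSD), download a *.pb file trained with TensorFlow, convert it to UFF,
and then check the video using objectDetector_SSD.

DeepStream SSD preparation

Again, the README lists the preparation steps; install and prepare what is needed.
UFF model: download the TensorFlow PB file on x86, convert it to UFF, and copy it over
Label file: taken from the TensorRT sample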

$ cd ~/deepstream_sdk_on_jetson/sources/objectDetector_SSD

$ ls   // see the README; there are two config files
config_infer_primary_ssd.txt  deepstream_app_config_ssd.txt  nvdsinfer_custom_impl_ssd  README


$ cd nvdsinfer_custom_impl_ssd/
$ make    // check that libnvdsinfer_custom_impl_ssd.so is built
$ cd  ..

//$ cat  /usr/src/tensorrt/samples/sampleUffSSD/README   
//download ssd_inception_v2_coco_2017_11_17.tar.gz
$ wget  http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_2017_11_17.tar.gz 

$ tar zxvf ssd_inception_v2_coco_2017_11_17.tar.gz
ssd_inception_v2_coco_2017_11_17/
ssd_inception_v2_coco_2017_11_17/model.ckpt.index
ssd_inception_v2_coco_2017_11_17/model.ckpt.meta
ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.pb
ssd_inception_v2_coco_2017_11_17/model.ckpt.data-00000-of-00001
ssd_inception_v2_coco_2017_11_17/saved_model/
ssd_inception_v2_coco_2017_11_17/saved_model/saved_model.pb
ssd_inception_v2_coco_2017_11_17/saved_model/variables/
ssd_inception_v2_coco_2017_11_17/checkpoint

//$ cp   /usr/src/tensorrt/samples/sampleUffSSD/config.py  .                    (config.py)
//$ convert-to-uff --input-file frozen_inference_graph.pb -O NMS -p config.py   (convert on x86)

$ mv frozen_inference_graph.uff sample_ssd_relu6.uff

$ cp /usr/src/tensorrt/data/ssd/ssd_coco_labels.txt .

SSD: comparing the video from command-line GStreamer and deepstream-app

Running the same function with GStreamer and with deepstream-app gives different results, because deepstream-app has a callback function
that processes the stream metadata and applies it to the OSD.

In conclusion, the callback functions for metadata processing have to be written.

$ pwd
/home/nvidia/deepstream_sdk_on_jetson/sources/objectDetector_SSD

$ gst-launch-1.0 filesrc location=../../samples/streams/sample_720p.mp4 ! \
        decodebin ! nvinfer config-file-path= config_infer_primary_ssd.txt ! \
        nvvidconv ! nvosd ! nvegltransform ! nveglglessink

$ deepstream-app -c deepstream_app_config_ssd.txt

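Command-line GStreamer run: far faster than the earlier Faster RCNN, and real time is possible
(as above, no callback function runs, so the OSD labels are not shown)

Running deepstream-app, each class is distinguished on the OSD, and it runs in real time at 30 FPS

RTSP (switching to an IP camera)

To test in real time with an IP camera over RTSP, just edit the config file; the test procedure is the same as above (the edits are shown below).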

$ pwd
/home/nvidia/deepstream_sdk_on_jetson/sources/objectDetector_SSD

$ cp deepstream_app_config_ssd.txt  deepstream_app_rtsp.txt

$ vi deepstream_app_rtsp.txt

[source0]
...
#uri=file://../../samples/streams/sample_720p.mp4
uri=rtsp://10.0.0.196:554/h264

[sink0]
..
sync=0


[tracker]         // switch this to IOU
..
tracker-algorithm=2

Other values such as batch-size can be changed as well

$ deepstream-app -c  deepstream_app_rtsp.txt
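
The same three edits can be scripted instead of made by hand in vi. The sketch below applies them with Python's `configparser`; the `BASE` text is a reduced, hypothetical stand-in for deepstream_app_config_ssd.txt (the full file is not shown in this post), and `to_rtsp` is a hypothetical helper. Note that `configparser` drops comments when writing, so this suits generated configs better than hand-annotated ones.

```python
import configparser
import io

# Hypothetical, reduced stand-in for deepstream_app_config_ssd.txt.
BASE = """\
[source0]
enable=1
type=3
uri=file://../../samples/streams/sample_720p.mp4

[sink0]
enable=1
sync=1

[tracker]
enable=1
tracker-algorithm=1
"""


def to_rtsp(text, rtsp_uri):
    """Return config text switched to an RTSP source, as in the steps above."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written
    parser.read_string(text)
    parser["source0"]["uri"] = rtsp_uri          # IP camera instead of file
    parser["sink0"]["sync"] = "0"                # render as fast as possible
    parser["tracker"]["tracker-algorithm"] = "2" # 2 = IOU per the sample comment
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


rtsp_config = to_rtsp(BASE, "rtsp://10.0.0.196:554/h264")
print(rtsp_config)
```

The result can be written to deepstream_app_rtsp.txt and passed to deepstream-app with -c as shown above.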

3.4 Revisiting TensorRT

Checking the TensorRT samples

sampleFasterRCNN (uses a Caffe model)
sampleUffSSD (UFF format, currently mostly TensorFlow)

It has been a while since I tested these, so the TensorRT parts are rechecked below.

$ ls /usr/src/tensorrt/samples  // list all TensorRT sample sources
common     Makefile         sampleCharRNN     sampleGoogleNet  sampleINT8API  sampleMNIST     sampleMovieLens  sampleOnnxMNIST  sampleSSD       sampleUffSSD
getDigits  Makefile.config  sampleFasterRCNN  sampleINT8       sampleMLP      sampleMNISTAPI  sampleNMT        samplePlugin     sampleUffMNIST  trtexec

$ ls /usr/src/tensorrt/samples/sampleFasterRCNN  // check the FasterRCNN source and README
factoryFasterRCNN.h  Makefile  README.txt  sampleFasterRCNN.cpp

$ ls /usr/src/tensorrt/samples/sampleUffSSD  // check the sampleUffSSD source and README
BatchStreamPPM.h  config.py  Makefile  README.txt  sampleUffSSD.cpp


$ ls /usr/src/tensorrt/bin/    // check the TensorRT binaries
chobj/                    sample_fasterRCNN         sample_int8_api_debug     sample_mnist_api_debug    sample_onnx_mnist         sample_uff_mnist          
dchobj/                   sample_fasterRCNN_debug   sample_int8_debug         sample_mnist_debug        sample_onnx_mnist_debug   sample_uff_mnist_debug    
download-digits-model.py  sample_googlenet          sample_mlp                sample_movielens          sample_plugin             sample_uff_ssd            
giexec                    sample_googlenet_debug    sample_mlp_debug          sample_movielens_debug    sample_plugin_debug       sample_uff_ssd_debug      
sample_char_rnn           sample_int8               sample_mnist              sample_nmt                sample_ssd                trtexec                   
sample_char_rnn_debug     sample_int8_api           sample_mnist_api          sample_nmt_debug          sample_ssd_debug          trtexec_debug   

$ ls /usr/src/tensorrt/data/faster-rcnn/
000456.ppm  000542.ppm  001150.ppm  001763.ppm  004545.ppm  faster_rcnn_test_iplugin.prototxt

$ ls /usr/src/tensorrt/data/ssd/
bus.ppm  dog.ppm  ssd_coco_labels.txt