With JetPack 4.2 and its SDK components installed, checking the stock samples as I did on JetPack 3.3 shows that Python is finally supported, and the C++ examples are still there.
There are more samples than in the previous SDK, and judging by their names, server-based APIs appear to be supported as well.
To cover every feature I will need to work through the manual and the sources bit by bit.
1.1 TensorRT Sample TEST
As before, go to the TensorRT sample sources, build them, and test each of the resulting binaries in bin.
- Sample Build
$ ssh -X jetsontx2@192.168.55.1   // connect to the Jetson TX2
jetsontx2@jetsontx2-desktop:~$ ls /usr/src/tensorrt/samples/
common     Makefile.config  sampleFasterRCNN  sampleINT8API  sampleMNISTAPI   sampleOnnxMNIST  sampleUffMNIST
getDigits  python           sampleGoogleNet   sampleMLP      sampleMovieLens  samplePlugin     sampleUffSSD
Makefile   sampleCharRNN    sampleINT8        sampleMNIST    sampleNMT        sampleSSD        trtexec
jetsontx2@jetsontx2-desktop:~$ cd /usr/src/tensorrt/samples
jetsontx2@jetsontx2-desktop:/usr/src/tensorrt/samples$ sudo make
- Sample Test
jetsontx2@jetsontx2-desktop:/usr/src/tensorrt/samples$ cd ..
jetsontx2@jetsontx2-desktop:/usr/src/tensorrt$ ls
bin  data  python  samples
jetsontx2@jetsontx2-desktop:/usr/src/tensorrt$ cd ./bin
jetsontx2@jetsontx2-desktop:/usr/src/tensorrt/bin$ ls
chobj                     sample_googlenet        sample_mnist            sample_onnx_mnist        sample_uff_ssd
dchobj                    sample_googlenet_debug  sample_mnist_api        sample_onnx_mnist_debug  sample_uff_ssd_debug
download-digits-model.py  sample_int8             sample_mnist_api_debug  sample_plugin            trtexec
giexec                    sample_int8_api         sample_mnist_debug      sample_plugin_debug      trtexec_debug
sample_char_rnn           sample_int8_api_debug   sample_movielens        sample_ssd
sample_char_rnn_debug     sample_int8_debug       sample_movielens_debug  sample_ssd_debug
sample_fasterRCNN         sample_mlp              sample_nmt              sample_uff_mnist
sample_fasterRCNN_debug   sample_mlp_debug        sample_nmt_debug        sample_uff_mnist_debug
jetsontx2@jetsontx2-desktop:/usr/src/tensorrt/bin$ ./sample_mlp   // a simple test; I ran each of the binaries the same way
---------------------------
(28x28 ASCII-art rendering of the MNIST test digit, as printed by the sample, omitted here)
2. YOLOv3 Python Test
Starting with TensorRT 5.0, YOLOv3 is supported in Python through ONNX. Running the test below shows that, in the end, a TensorRT-specific engine file is generated and then used for inference.
- Check the Python Packages
$ dpkg -l | grep TensorRT   // confirm the TensorRT-related packages are installed
ii graphsurgeon-tf         5.0.6-1+cuda10.0    arm64  GraphSurgeon for TensorRT package
ii libnvinfer-dev          5.0.6-1+cuda10.0    arm64  TensorRT development libraries and headers
ii libnvinfer-samples      5.0.6-1+cuda10.0    all    TensorRT samples and documentation
ii libnvinfer5             5.0.6-1+cuda10.0    arm64  TensorRT runtime libraries
ii python-libnvinfer       5.0.6-1+cuda10.0    arm64  Python bindings for TensorRT
ii python-libnvinfer-dev   5.0.6-1+cuda10.0    arm64  Python development package for TensorRT
ii python3-libnvinfer      5.0.6-1+cuda10.0    arm64  Python 3 bindings for TensorRT
ii python3-libnvinfer-dev  5.0.6-1+cuda10.0    arm64  Python 3 development package for TensorRT
ii tensorrt                5.0.6.3-1+cuda10.0  arm64  Meta package of TensorRT
ii uff-converter-tf        5.0.6-1+cuda10.0    arm64  UFF converter for TensorRT package
Verifying the TensorRT installation (caution: the guide below is written for x86, so use it as reference only)
https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html
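Beyond dpkg, the Python bindings can be checked directly. A minimal sketch (run it with python2, since the yolov3_onnx sample below is Python 2 only):

# Sanity check of the TensorRT Python bindings installed by JetPack 4.2.
import tensorrt as trt

print(trt.__version__)   # should report 5.0.6.x, matching the dpkg listing above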
- Install the Python Packages
$ sudo apt-get install -y python-pip python-dev     // Python 2: the sample source is Python 2, so this alone would do
$ sudo apt-get install -y python3-pip python3-dev   // Python 3: other samples use Python 3, so install it as well
$ pip install wget                                  // needed by the sample scripts
// to resolve the onnx install errors, install the following:
$ sudo apt-get install cmake                        // required to build onnx
$ sudo apt-get install build-essential              // general build tools
$ sudo apt-get install protobuf-compiler libprotoc-dev   // protobuf, also required by onnx
$ pip install onnx                                  // needs cmake and protobuf; installs 1.5.0
$ pip uninstall onnx; pip install onnx==1.4.1       // 1.5.0 hits a known NVIDIA issue, so pin 1.4.1
$ pip uninstall onnx; pip install onnx==1.2.2       // also tried, but this version failed to install
$ pip install Pillow==2.2.1                         // fails in the source below because libjpeg was missing
$ sudo apt-get install libjpeg-dev                  // install libjpeg, then reinstall Pillow
$ pip uninstall Pillow; pip install --no-cache-dir -I pillow   // ends up as Pillow==6.0.0
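After reinstalling Pillow, it is worth confirming that JPEG support was actually compiled in, since the earlier "decoder jpeg not available" failure came from a Pillow build made before libjpeg-dev was present. A minimal sketch (the features check is my assumption; opening a JPEG is the definitive test):

# Confirm the reinstalled Pillow can actually decode JPEG.
from PIL import Image, features

print(features.check('jpg'))        # True once libjpeg support is compiled in
print(Image.open('dog.jpg').size)   # raises IOError if the JPEG decoder is still missing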
pip installation and basic TensorFlow setup
https://github.com/jetsonhacks/installTensorFlowJetsonTX
https://github.com/onnx/onnx-tensorrt/issues/62
https://github.com/onnx/onnx/issues/389
Known ONNX issues reported by NVIDIA
https://docs.nvidia.com/deeplearning/sdk/tensorrt-release-notes/tensorrt-5.html
https://devtalk.nvidia.com/default/topic/1047487/tensorrt-5-0-2-6-yolov3_onnx-sample-error-/
PIL (the sample uses JPEG decoding, so install libjpeg first)
https://stackoverflow.com/questions/29649941/pil-decoder-jpeg-not-available-raspberry
Other references
https://blog.csdn.net/xxradon/article/details/89160576
2.1 Running the Python yolov3_onnx Test
Every Python package listed above must be installed for this to work; in my case I installed each package as the tests demanded it, which is how the list above came about.
Be sure to read the README and understand the relevant details.
- Final conversion path: YOLOv3 -> ONNX -> TensorRT
$ cd /usr/src/tensorrt/samples/python
$ ls
common.py                    fc_plugin_caffe_mnist        network_api_pytorch_mnist  uff_ssd
end_to_end_tensorflow_mnist  introductory_parser_samples  uff_custom_plugin          yolov3_onnx
$ cd yolov3_onnx/
$ vi README.md          // confirms Python 3 is not supported and documents the commands below and what each script does
$ cat requirements.txt  // Python packages that must already be installed
numpy>=1.15.1
onnx
pycuda>=2017.1.1
Pillow>=5.2.0
wget>=3.2
$ python2 -m pip install -r requirements.txt  // Python 2: installs onnx 1.5.0 (I re-pinned to 1.4.1 separately before proceeding)
$ python3 -m pip install -r requirements.txt  // Python 3 (I have not installed this yet)

// Python 2: downloads yolov3.weights and yolov3.cfg, then converts them to ONNX
$ sudo python yolov3_to_onnx.py   // runs with onnx 1.4.1; the three 'yolo' layers are not converted -- they are decoded later in NumPy post-processing (data_processing.py)
Layer of type yolo not supported, skipping ONNX node generation.
Layer of type yolo not supported, skipping ONNX node generation.
Layer of type yolo not supported, skipping ONNX node generation.
graph YOLOv3-608 (
  0_net[FLOAT, 64x3x608x608]
) initializers (
  1_convolutional_bn_scale[FLOAT, 32]
  1_convolutional_bn_bias[FLOAT, 32]
  1_convolutional_bn_mean[FLOAT, 32]
  1_convolutional_bn_var[FLOAT, 32]
  1_convolutional_conv_weights[FLOAT, 32x3x3x3]
  2_convolutional_bn_scale[FLOAT, 64]
  2_convolutional_bn_bias[FLOAT, 64]
  2_convolutional_bn_mean[FLOAT, 64]
  2_convolutional_bn_var[FLOAT, 64]
  2_convolutional_conv_weights[FLOAT, 64x32x3x3]
  3_convolutional_bn_scale[FLOAT, 32]
  3_convolutional_bn_bias[FLOAT, 32]
  3_convolutional_bn_mean[FLOAT, 32]
  3_convolutional_bn_var[FLOAT, 32]
  3_convolutional_conv_weights[FLOAT, 32x64x1x1]
  4_convolutional_bn_scale[FLOAT, 64]
  4_convolutional_bn_bias[FLOAT, 64]
  4_convolutional_bn_mean[FLOAT, 64]
  4_convolutional_bn_var[FLOAT, 64]
  4_convolutional_conv_weights[FLOAT, 64x32x3x3]
  6_convolutional_bn_scale[FLOAT, 128]
  6_convolutional_bn_bias[FLOAT, 128]
  6_convolutional_bn_mean[FLOAT, 128]
  6_convolutional_bn_var[FLOAT, 128]
  ..........
  %104_convolutional_conv_weights[FLOAT, 128x256x1x1]
  %105_convolutional_bn_scale[FLOAT, 256]
  %105_convolutional_bn_bias[FLOAT, 256]
  %105_convolutional_bn_mean[FLOAT, 256]
  %105_convolutional_bn_var[FLOAT, 256]
  %105_convolutional_conv_weights[FLOAT, 256x128x3x3]
  %106_convolutional_conv_bias[FLOAT, 255]
  %106_convolutional_conv_weights[FLOAT, 255x256x1x1]
  ...
) {
  1_convolutional = Conv[auto_pad = u'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](0_net, 1_convolutional_conv_weights)
  1_convolutional_bn = BatchNormalization[epsilon = 1e-05, momentum = 0.99](1_convolutional, 1_convolutional_bn_scale, 1_convolutional_bn_bias, 1_convolutional_bn_mean, 1_convolutional_bn_var)
  1_convolutional_lrelu = LeakyRelu[alpha = 0.1](1_convolutional_bn)
  2_convolutional = Conv[auto_pad = u'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [2, 2]](1_convolutional_lrelu, 2_convolutional_conv_weights)
  ...
  %103_convolutional_bn = BatchNormalization[epsilon = 1e-05, momentum = 0.99](%103_convolutional, %103_convolutional_bn_scale, %103_convolutional_bn_bias, %103_convolutional_bn_mean, %103_convolutional_bn_var)
  %103_convolutional_lrelu = LeakyRelu[alpha = 0.1](%103_convolutional_bn)
  %104_convolutional = Conv[auto_pad = u'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%103_convolutional_lrelu, %104_convolutional_conv_weights)
  %104_convolutional_bn = BatchNormalization[epsilon = 1e-05, momentum = 0.99](%104_convolutional, %104_convolutional_bn_scale, %104_convolutional_bn_bias, %104_convolutional_bn_mean, %104_convolutional_bn_var)
  %104_convolutional_lrelu = LeakyRelu[alpha = 0.1](%104_convolutional_bn)
  %105_convolutional = Conv[auto_pad = u'SAME_LOWER', dilations = [1, 1], kernel_shape = [3, 3], strides = [1, 1]](%104_convolutional_lrelu, %105_convolutional_conv_weights)
  %105_convolutional_bn = BatchNormalization[epsilon = 1e-05, momentum = 0.99](%105_convolutional, %105_convolutional_bn_scale, %105_convolutional_bn_bias, %105_convolutional_bn_mean, %105_convolutional_bn_var)
  %105_convolutional_lrelu = LeakyRelu[alpha = 0.1](%105_convolutional_bn)
  %106_convolutional = Conv[auto_pad = u'SAME_LOWER', dilations = [1, 1], kernel_shape = [1, 1], strides = [1, 1]](%105_convolutional_lrelu, %106_convolutional_conv_weights, %106_convolutional_conv_bias)
  return %082_convolutional, %094_convolutional, %106_convolutional
}

// Python 2: convert the ONNX model to a TensorRT engine (see the README)
$ sudo python onnx_to_tensorrt.py   // requires the Pillow 6.0.0 module built against libjpeg
$ sudo python onnx_to_tensorrt.py
Loading ONNX file from path yolov3.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file yolov3.onnx; this may take a while...
Completed creating Engine
Running inference on image dog.jpg...
[[135.04631129 219.14287094 184.31729756 324.86083388]
 [ 98.95616386 135.5652711  499.10095358 299.16207424]
 [477.88943795  81.22835189 210.86732516  86.96319981]]
[0.99852328 0.99881124 0.93929232]
[16  1  7]
Saved image with bounding boxes of detected objects to dog_bboxes.png

$ eog dog_bboxes.png   // boxes and labels are drawn on the image
$ ls
coco_labels.txt     data_processing.pyc  dog.jpg              README.md         yolov3.cfg   yolov3_to_onnx.py   yolov3.trt
data_processing.py  dog_bboxes.png       onnx_to_tensorrt.py  requirements.txt  yolov3.onnx  yolov3_to_onnx.pyc  yolov3.weights

$ sudo python onnx_to_tensorrt.py   // on a second run the ONNX->TRT build is no longer needed, so it starts quickly
Reading engine from file yolov3.trt
Running inference on image dog.jpg...
[[135.04631129 219.14287094 184.31729756 324.86083388]
 [ 98.95616386 135.5652711  499.10095358 299.16207424]
 [477.88943795  81.22835189 210.86732516  86.96319981]]
[0.99852328 0.99881124 0.93929232]
[16  1  7]
Saved image with bounding boxes of detected objects to dog_bboxes.png.

$ vi coco_labels.txt   // lists the classes that can be detected
$ pip list             // pip defaults to pip2 (Python 2) here; use pip3 for Python 3
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
appdirs (1.4.3)
asn1crypto (0.24.0)
atomicwrites (1.3.0)
attrs (19.1.0)
configparser (3.7.4)
contextlib2 (0.5.5)
cryptography (2.1.4)
decorator (4.4.0)
enum34 (1.1.6)
funcsigs (1.0.2)
gps (3.17)
graphsurgeon (0.3.2)
idna (2.6)
importlib-metadata (0.17)
ipaddress (1.0.17)
keyring (10.6.0)
keyrings.alt (3.0)
Mako (1.0.10)
MarkupSafe (1.1.1)
more-itertools (5.0.0)
numpy (1.16.4)
onnx (1.4.1)
pathlib2 (2.3.3)
Pillow (6.0.0)
pip (9.0.1)
pluggy (0.12.0)
protobuf (3.8.0)
py (1.8.0)
pycairo (1.16.2)
pycrypto (2.6.1)
pycuda (2019.1)
pygobject (3.26.1)
pytest (4.5.0)
pytools (2019.1.1)
pyxdg (0.25)
scandir (1.10.0)
SecretStorage (2.3.1)
setuptools (41.0.1)
six (1.12.0)
tensorrt (5.0.6.3)
typing (3.6.6)
typing-extensions (3.7.2)
uff (0.5.5)
unity-lens-photos (1.0)
wcwidth (0.1.7)
wget (3.2)
wheel (0.30.0)
zipp (0.5.1)
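The three arrays printed by onnx_to_tensorrt.py are the box coordinates, the confidences, and the class IDs; the IDs index into coco_labels.txt, one label per line. A minimal sketch of the mapping:

# Map the class IDs printed above to names from coco_labels.txt.
with open('coco_labels.txt') as f:
    labels = [line.strip() for line in f]

print([labels[i] for i in [16, 1, 7]])   # the dog.jpg result -> ['dog', 'bicycle', 'truck']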
In short, the YOLO -> ONNX -> TensorRT conversion takes a long time, but once converted the cached engine makes subsequent runs reasonably fast.
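The caching works because the sample serializes the built engine to yolov3.trt and deserializes it on later runs. Roughly, its engine loader looks like the sketch below (paraphrased from the sample's flow, not the exact code):

import os
import tensorrt as trt

TRT_LOGGER = trt.Logger()

def get_engine(onnx_file_path, engine_file_path):
    # Fast path: a serialized engine already exists, so just deserialize it.
    if os.path.exists(engine_file_path):
        with open(engine_file_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
            return runtime.deserialize_cuda_engine(f.read())
    # Slow path: parse the ONNX file, build the engine, then cache it to disk.
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network() as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = 1 << 28   # keep the workspace modest on the TX2
        builder.max_batch_size = 1
        with open(onnx_file_path, 'rb') as model:
            parser.parse(model.read())
        engine = builder.build_cuda_engine(network)   # this is the slow step
        with open(engine_file_path, 'wb') as f:
            f.write(engine.serialize())
        return engine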
- Testing Other Images
With a small source change, other images can be tested easily. Note that up to this point I had not been testing at maximum performance: if the Jetson TX2 fan is not spinning, the board is still in its default state.
$ sudo cp ~/download/*.jpg .                 // car and cat photos downloaded with Chrome
$ sudo cp onnx_to_tensorrt.py jhleetest.py   // a copy for testing other images (owned by root)
$ sudo vi jhleetest.py                       // change the input image to car.jpg
$ sudo python jhleetest.py                   // fast run, car detection
Reading engine from file yolov3.trt
Running inference on image car.jpg...
[[ 120.05153578  152.42545467  966.89172317  486.4402317 ]
 [  89.13414976  131.88476328 1018.99139214  434.55479845]]
[0.96183921 0.78680305]
[2 7]
Saved image with bounding boxes of detected objects to car_bboxes.png.

$ sudo vi jhleetest.py                       // change the input image to cat.jpg
$ sudo python jhleetest.py                   // fast run
Reading engine from file yolov3.trt
Running inference on image cat.jpg...
[[113.97585209  53.73459241 781.95893924 365.30765023]]
[0.85985616]
[15]
Saved image with bounding boxes of detected objects to cat_bboxes.png.

$ eog car_bboxes.png
$ eog cat_bboxes.png
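Instead of re-editing the copy with vi for every image, the hard-coded image path in jhleetest.py could be taken from the command line. A hypothetical tweak (the variable names are illustrative, not necessarily the sample's own):

# Hypothetical tweak to the jhleetest.py copy: read the image path from
# argv so each new test image doesn't require editing the source.
import sys

input_image_path = sys.argv[1] if len(sys.argv) > 1 else 'dog.jpg'
output_image_path = input_image_path.rsplit('.', 1)[0] + '_bboxes.png'

# usage: sudo python jhleetest.py car.jpg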
- Testing at Maximum Performance
Run the commands below first, then test again.
$ sudo nvpmodel -m 0        // switch to the max performance (MAXN) mode
$ sudo ~/jetson_clocks.sh   // note: JetPack 4.2 ships this as /usr/bin/jetson_clocks (see the link below)
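To quantify what the performance mode changes, a simple timing wrapper (a hypothetical helper, not part of the sample) can be put around the inference call in onnx_to_tensorrt.py:

import time

def timed(fn, *args, **kwargs):
    # Wrap any call (e.g. the sample's inference step) and print its wall time,
    # so runs before and after nvpmodel/jetson_clocks can be compared.
    t0 = time.time()
    result = fn(*args, **kwargs)
    print('%s took %.3f s' % (getattr(fn, '__name__', 'call'), time.time() - t0))
    return result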
jetson_clocks.sh (JetPack 4.2)
https://devtalk.nvidia.com/default/topic/1049117/jetson-agx-xavier/jetpack-4-2-missing-jetson_clocks-sh-/
Related
https://devtalk.nvidia.com/default/topic/1047018/tensorrt/yolov3_to_onnx-py-sample-failure/
TensorRT backend for ONNX
https://github.com/onnx/onnx-tensorrt#tests
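The onnx-tensorrt project above also exposes a small backend API; per its README it can be used roughly as follows (a sketch: the onnx_tensorrt package must be built and installed separately, JetPack does not ship it):

import numpy as np
import onnx
import onnx_tensorrt.backend as backend

# Load the ONNX model produced earlier and wrap it as a TensorRT backend.
model = onnx.load('yolov3.onnx')
engine = backend.prepare(model, device='CUDA:0')
# The input shape must match the model (the dump above shows 64x3x608x608).
input_data = np.random.random((64, 3, 608, 608)).astype(np.float32)
output = engine.run(input_data)[0]
print(output.shape)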
NVIDIA Multimedia
https://docs.nvidia.com/jetson/archives/l4t-multimedia-archived/l4t-multimedia-281/index.html
2.2 ONNX Model
There are separate examples for ONNX models below; I will run them and study the related material when time permits.
The document below also covers profiling and optimization along with much other useful content, so it is well worth studying later.
How to Speed Up Deep Learning Inference Using TensorRT
https://devblogs.nvidia.com/speed-up-inference-tensorrt/
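That post discusses per-layer profiling; in the Python API this is done by attaching a profiler to the execution context. A sketch, assuming the trt.Profiler interface works as documented (profiling needs the synchronous execute(), if I recall correctly):

import tensorrt as trt

class LayerTimer(trt.Profiler):
    # Called back by TensorRT after each layer while attached to a context.
    def report_layer_time(self, layer_name, ms):
        print('%s: %.3f ms' % (layer_name, ms))

# usage (e.g. inside onnx_to_tensorrt.py): context.profiler = LayerTimer()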
I am not sure whether I will use them, but the Jetson TX2 also ships with the following samples, so I am attaching them here as well.
They are VisionWorks examples; JetsonHacks walks through them, so they are easy to run by following along.
https://www.youtube.com/watch?v=tFrrCrSTCig
https://www.youtube.com/watch?v=KROP46Wte4Q&t=552s