Jeonghun (James) Lee: November 2019

11/29/2019

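Installing GitLab-CE with Docker and Checking Basic Usage

1. Installing and Running a GitLab-CE Server with Docker

Docker GitLab-CE installation and basic operation

The official GitLab documentation explains this well and is easy to follow.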
https://docs.gitlab.com/omnibus/docker/
https://hub.docker.com/r/gitlab/gitlab-ce/tags/

Installing GitLab directly on Ubuntu

https://about.gitlab.com/install/#ubuntu

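According to the documentation, the current Docker image does not include an SMTP server for e-mail.
If it turns out to be needed later, install an SMTP server and test that as well.

An SSH server was already installed on the host, so stop it (the container publishes port 22)

$ sudo service ssh status  // check the SSH server status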
or 
$ sudo systemctl status ssh

$ sudo service ssh stop // stop the SSH server
or
$ sudo systemctl stop ssh

$ sudo systemctl enable ssh    // start the SSH service at boot
$ sudo systemctl disable ssh   // do not start the SSH service at boot

$ systemctl list-units --type service
$ systemctl list-units --type service --all

1.1 Image Download and Run Gitlab-CE Container

The container runs in the background (detached) by default, so no other Docker options such as terminal mode are needed.
To look around inside the container from a terminal, use docker exec.

$ docker pull gitlab/gitlab-ce

$ docker run --detach \
  --hostname gitlab.example.com \
  -p 443:443 -p 80:80 -p 22:22 \
  --name gitlab \
  --restart always \
  -v /srv/gitlab/config:/etc/gitlab \
  -v /srv/gitlab/logs:/var/log/gitlab \
  -v /srv/gitlab/data:/var/opt/gitlab \
  gitlab/gitlab-ce:latest

$ docker ps -a  // check that the container is running
CONTAINER ID        IMAGE                     COMMAND             CREATED             STATUS                 PORTS                                                          NAMES
e96d2500ff76        gitlab/gitlab-ce:latest   "/assets/wrapper"   2 hours ago         Up 2 hours (healthy)   0.0.0.0:22->22/tcp, 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   gitlab

Port Mapping

SSH : 22
HTTP: 80
HTTPS:443

If another server on the host already uses these ports, map them to different host ports.
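
As a rough aid for choosing the mapping, the sketch below checks whether something on the host already listens on a port and builds the matching -p flags. The helper names and the offset of 10000 for alternative host ports are assumptions made for illustration; they are not part of the GitLab instructions.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds (something is listening)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def publish_flags(container_ports=(22, 80, 443), in_use=port_in_use, offset=10000):
    """Build docker -p flags, moving busy host ports up by `offset` (hypothetical choice)."""
    flags = []
    for port in container_ports:
        host_port = port + offset if in_use(port) else port
        flags += ["-p", f"{host_port}:{port}"]
    return flags
```

For example, if only the host SSH server is running, publish_flags() would return -p 10022:22 together with the unchanged 80 and 443 mappings.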

Data shared between Host and Container

/srv/gitlab/data For storing application data
/srv/gitlab/logs For storing logs
/srv/gitlab/config For storing the GitLab configuration files

With the settings above, everything is stored under /srv/gitlab on the host.

hostname

I do not have a hostname, so I ran the command as-is; later I will create a hostname with DDNS and test again.

1.2 GitLab Config Settings

GitLab is configured in /etc/gitlab/gitlab.rb; for details, see the settings links below.

Editing and checking the GitLab config

$ docker exec -it gitlab /bin/bash
root@gitlab:/#   vi /etc/gitlab/gitlab.rb 

or 

$ docker exec -it gitlab editor /etc/gitlab/gitlab.rb

SMTP settings in the config
https://docs.gitlab.com/omnibus/settings/smtp.html

HTTPS settings in the config
https://docs.gitlab.com/omnibus/settings/nginx.html#enable-https

Restarting the GitLab container

$ docker restart gitlab  // restart the container after changing the settings

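1.3 GitLab Administration

Delete the container and start it again to verify the data backup

Create several user IDs and store some data, then delete the GitLab container and start it again; this shows whether everything was kept under /srv/gitlab.

$ docker ps -a  // check that the container is running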
CONTAINER ID        IMAGE                     COMMAND             CREATED             STATUS                 PORTS                                                          NAMES
e96d2500ff76       gitlab/gitlab-ce:latest   "/assets/wrapper"   2 hours ago         Up 2 hours (healthy)   0.0.0.0:22->22/tcp, 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   gitlab

$ docker stop e96d2500ff76   // stop the container

$ docker rm e96d2500ff76    // remove the container

Run the docker run command above again and retest (confirmed: no problems)
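
Before removing the container, it can help to confirm that the three host directories actually hold data. The following is a minimal sketch under the assumption that the volumes live under /srv/gitlab as in the docker run command above; the function name is made up for illustration, and reading these directories may require root.

```python
from pathlib import Path


def volume_report(root="/srv/gitlab", names=("config", "logs", "data")):
    """Map each volume name to True if the directory exists and is not empty."""
    report = {}
    for name in names:
        path = Path(root) / name
        report[name] = path.is_dir() and any(path.iterdir())
    return report
```

If any entry is False after GitLab has been used for a while, the corresponding -v mapping is worth checking before running docker rm.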

Checking the GitLab logs

$ docker logs gitlab | tail

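2. GitLab-CE Basic Settings and Checks

Open the link below (your own IP address or hostname also works)
http://localhost

Set the root password

Register a new user and start managing
( Perhaps because of root privileges, there is a Register option in addition to Sign in )

Use groups and personal accounts separately

Create a group

Create a personal project and run basic tests

I have not yet tried a separate setup with Docker Compose; for now I use this only for testing.
I also do not run several Docker containers at the same time.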

Differences between GitLab-CE and EE
http://developer.gaeasoft.co.kr/development-guide/gitlab/gitlab-introduce/

Running a GitLab server directly on a Raspberry Pi

https://hackernoon.com/create-your-own-git-server-using-raspberry-pi-and-gitlab-f64475901a66
https://projects.raspberrypi.org/en/projects/getting-started-with-git

11/14/2019

Custom Object Detection: Running and Analyzing SSD / Faster RCNN (3rd analysis)

1. Preparing Tensorflow and Custom Object Detection

To prepare for object detection, install the following.

Install Tensorflow
Install the required Python packages and other required packages
Download the model

NVIDIA Docker and SSD Training (2nd analysis)
https://ahyuo79.blogspot.com/2019/10/docker-tensorflow.html

NVIDIA Docker and basic Tensorflow usage
https://ahyuo79.blogspot.com/2019/10/nvidia-docker.html

IOU (Intersection over Union)

https://ahyuo79.blogspot.com/2019/09/iou-intersection-over-union.html

Tensorflow Model
https://github.com/tensorflow/models

Checking the Tensorflow Model branch
Switch the model source to the branch that matches your Tensorflow version before downloading
https://github.com/tensorflow/models/branches

1.1 Installing and Configuring Tensorflow Directly

To use Tensorflow Object Detection, install Tensorflow first, then download the Object Detection model and install the related packages.

Custom Object Detection installation guide
  https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html

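1.2 Using the General Tensorflow Docker Image

Starting from the Tensorflow Docker image, download the matching model version below, build a single image from it, and work from that.
The points to watch are the tag information and the Tensorflow version.

Meaning of the Docker tags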
  https://www.tensorflow.org/install/docker?hl=ko

Tensorflow Docker
Match the Tensorflow version with the model version above
  https://hub.docker.com/r/tensorflow/tensorflow
  https://hub.docker.com/r/tensorflow/tensorflow/tags

1.3 Using the NVIDIA Tensorflow Docker Image

The NVIDIA Tensorflow Docker image installed earlier can also be used for Object Detection.

NVIDIA Tensorflow Docker
  https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow/tags

See the earlier NVIDIA SSD Docker analysis
  https://ahyuo79.blogspot.com/2019/10/docker-tensorflow.html

2. Building the Custom Data Set

Most people start with dog and cat photos when building a custom data set for testing, so I also began with this easy approach.

Getting dog and cat photos (DATASET)

$ cd ~/works/custom
$ git clone https://github.com/hardikvasa/google-images-download.git
$ cd google-images-download
$ python google_images_download/google_images_download.py --keywords "dogs" --size medium --output_directory ~/works/custom/data/
$ python google_images_download/google_images_download.py --keywords "cats" --size medium --output_directory ~/works/custom/data/

The images available through google image download are currently limited; at most 100 can be downloaded.
Raising limit above 100 in the options still does not return more than 100 images at once.

google_image_download
https://google-images-download.readthedocs.io/en/latest/installation.html

google_image_download argument
https://google-images-download.readthedocs.io/en/latest/arguments.html

Organizing the images

$ cd ~/works/custom/data/
$ mkdir images        // collect all images in one place
$ mkdir annotation    // where LabelImg saves its XML files
$ mv ./dogs/*.jpg images/
$ mv ./cats/*.jpg images/
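
The same reorganization can be sketched in Python. One difference from the mv commands is an assumption of mine: when two classes contain a file with the same name, the second one gets the class name as a prefix instead of overwriting the first. The function name and defaults are illustrative only.

```python
import shutil
from pathlib import Path


def collect_images(data_dir, class_dirs=("dogs", "cats"), pattern="*.jpg"):
    """Move class images into data_dir/images and create data_dir/annotation."""
    data_dir = Path(data_dir)
    images = data_dir / "images"
    images.mkdir(exist_ok=True)
    (data_dir / "annotation").mkdir(exist_ok=True)
    moved = []
    for name in class_dirs:
        for src in sorted((data_dir / name).glob(pattern)):
            dst = images / src.name
            if dst.exists():
                # Avoid overwriting an image of the same name from another class.
                dst = images / f"{name}_{src.name}"
            shutil.move(str(src), str(dst))
            moved.append(dst)
    return moved
```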

Annotation (using LabelImg, saved as PascalVOC)

$ cd ~/works/custom/labelImg    // labelImg was already installed earlier
$ cat data/predefined_classes.txt   // check the default classes (dog and cat exist); add a name here if it is missing
dog
person
cat
tv
car
meatballs
marinara sauce
tomato soup
chicken noodle soup
french onion soup
chicken breast
ribs
pulled pork
hamburger

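$ python3 labelImg.py   ~/works/custom/data/images    // by default the XML is saved inside images as well

Notes
After starting labelImg, always use Change Save Dir to set the XML save location to ~/works/custom/data/annotation
The order of the classes defined above does not matter; only the names must be identical to the names in label_map.pbtxt.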

https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#annotating-images

Defining the label map

$ cd ~/works/custom/data
$ vi label_map.pbtxt
item {
    id: 1
    name: 'cat'
}

item {
    id: 2
    name: 'dog'
}

https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#creating-label-map
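
Since the class names in the annotations have to match label_map.pbtxt, a small consistency check can catch typos before training. This is a minimal sketch that reads the label map with regular expressions rather than the protobuf parser, which is an assumption that only holds for simple files like the one above; the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path


def read_label_map(path):
    """Parse simple `item { id: N name: '...' }` blocks into {name: id}."""
    text = Path(path).read_text()
    mapping = {}
    for body in re.findall(r"item\s*\{(.*?)\}", text, flags=re.S):
        item_id = re.search(r"\bid\s*:\s*(\d+)", body)
        name = re.search(r"\bname\s*:\s*['\"]([^'\"]+)['\"]", body)
        if item_id and name:
            mapping[name.group(1)] = int(item_id.group(1))
    return mapping


def unknown_labels(annotation_dir, label_map):
    """Return {class name: [xml files]} for names missing from the label map."""
    missing = {}
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(str(xml_path)).getroot()
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name not in label_map:
                missing.setdefault(name, []).append(xml_path.name)
    return missing
```

An empty result from unknown_labels("/data/annotation", read_label_map("/data/label_map.pbtxt")) would mean every annotated class has an id.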

1.1 Creating the TF Record File

Other blogs and the Tensorflow example site convert XML->CSV and then CSV->TFRecord,
but a quick look at other sources shows the TF Record step is nearly the same in each, and I could not see why two conversions are needed, so I tried converting directly as below.

How to make a TF Record

I am not using this method here
https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#creating-tensorflow-records

Creating TF_RECORD

tf_record can only be run where Tensorflow is installed

 root@3aac229c45c3:/workdir/models/research# pip install lxml
## as before, take care with the --data_dir path
root@3aac229c45c3:/workdir/models/research# python create_pascal_tf_record.py \
 --data_dir=/data \
 --annotations_dir=/data/annotation \
 --label_map_path=/data/label_map.pbtxt \
 --output_path=/data/pascal.record

How to create a TF Record from LabelImg output
https://ahyuo79.blogspot.com/2019/11/coco-set-annotation-tools.html
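
To show why the direct XML -> TFRecord route needs no CSV step, the sketch below pulls from one Pascal VOC XML the values that a TFRecord example stores: image size, box corners normalized by width and height, class text, and class id. It stops before building the tf.train.Example so that it runs without Tensorflow; the function name and the returned dict layout are my own assumptions, not the script's API.

```python
import xml.etree.ElementTree as ET


def voc_to_fields(xml_text, label_map):
    """Extract normalized boxes and labels from one Pascal VOC annotation."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    fields = {"filename": root.findtext("filename"), "width": width, "height": height,
              "xmin": [], "ymin": [], "xmax": [], "ymax": [], "text": [], "label": []}
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        # Box corners are stored as fractions of the image size.
        fields["xmin"].append(float(box.findtext("xmin")) / width)
        fields["ymin"].append(float(box.findtext("ymin")) / height)
        fields["xmax"].append(float(box.findtext("xmax")) / width)
        fields["ymax"].append(float(box.findtext("ymax")) / height)
        fields["text"].append(name)
        fields["label"].append(label_map[name])
    return fields
```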

2. Custom Training/Evaluation

Test with two custom models and compare them

2.1 Pre-trained Model Download

SSD (Single Shot MultiBox Detector) uses a separate network as its feature extractor, so download that part first to put the basic setup in place.

Basic layout of check

$ cd ~/works/custom/check
$ mkdir -p models/configs
$ mkdir -p models/resnet_v1_50_2016_08_28
$ mkdir -p train_resnet                  // checkpoint directory for SSD-Resnet50 (filled by training)
$ mkdir -p train_inception               // checkpoint directory for SSD-Inceptionv2 (filled by training)
$ mkdir -p fasterrcnn_train_resnet       // checkpoint directory for Faster RCNN-Resnet50 (filled by training)

Resnet 50 Download

$ cd ~/works/custom/check/models
$ wget http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz
$ tar -xzf resnet_v1_50_2016_08_28.tar.gz
$ mv resnet_v1_50.ckpt resnet_v1_50_2016_08_28/model.ckpt

InceptionV2 Download

$ cd ~/works/custom/check/models
$ wget http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_11_06_2017.tar.gz
$ tar -xzf ssd_inception_v2_coco_11_06_2017.tar.gz

Pre-trained model information
https://github.com/tensorflow/models/tree/master/research/slim

Model layout under check

$ cd ~/works/custom/check/models
$ tree 
.
├── configs                          // where the pipeline configs are stored (Resnet, Inceptionv2)
├── resnet_v1_50_2016_08_28          // Resnet 50 (Pre-trained Model)
│   └── model.ckpt                   // checkpoint   
├── resnet_v1_50_2016_08_28.tar.gz
├── ssd_inception_v2_coco_11_06_2017    // Inception V2 (Pre-trained Model)
│   ├── frozen_inference_graph.pb          // Inception Pb file 
│   ├── graph.pbtxt                        // Inception graph definition
│   ├── model.ckpt.data-00000-of-00001     // checkpoint
│   ├── model.ckpt.index
│   └── model.ckpt.meta
└── ssd_inception_v2_coco_11_06_2017.tar.gz

2.2 SSD / Faster RCNN Pipeline Settings

SSD can use Resnet 50 or Inception V2 as its feature extractor, and other networks can be configured as well.
The pipeline fields apparently have to be declared in the *.proto files in order to work.
I should look at this more closely when I have time.

Running the Docker container

$ docker run --gpus all --rm -it \
--shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
-p 8888:8888 -p 6006:6006  \
-v /home/jhlee/works/custom/data:/data \
-v /home/jhlee/works/custom/check:/checkpoints \
--ipc=host \
--name nvidia_ssd \
nvidia_ssd

Changing the SSD-Resnet 50 pipeline settings

root@f46c490016e0:/workdir/models/research# cp configs/ssd320_full_1gpus.config  /checkpoints/models/configs
root@f46c490016e0:/workdir/models/research# vi /checkpoints/models/configs/ssd320_full_1gpus.config 

model {
  ssd {
    inplace_batchnorm_update: true
    freeze_batchnorm: true
    num_classes: 2    # number of labels (Cat/Dog)
    box_coder {
      faster_rcnn_box_coder {   
        y_scale: 10.0              
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5    ## IoU threshold for matching anchors to ground-truth boxes during training (not a cutoff on output_dict['detection_scores'])
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
...
    image_resizer {            # 
      fixed_shape_resizer {
        height: 320
        width: 320
      }
    }
....

    feature_extractor {
      type: 'ssd_resnet50_v1_fpn'  # use resnet 50 as the SSD feature extractor
      fpn {
        min_level: 3
        max_level: 7
      }
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.0004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          scale: true,
          decay: 0.997,
          epsilon: 0.001,
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.25
          gamma: 2.0
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    normalize_loc_loss_by_codesize: true
    post_processing {                    ## check the post-processing settings
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100    ## up to 100 per class              output_dict['detection_classes']
        max_total_detections: 100        ## up to 100 detections in total    output_dict['num_detections']
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  fine_tune_checkpoint: "/checkpoints/models/resnet_v1_50_2016_08_28/model.ckpt"
  fine_tune_checkpoint_type: "classification"
  batch_size: 2            # changed 32->2 because of OUT OF MEMORY; keep 32 if the GPU has enough memory
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 8
  num_steps: 100         # steps 100000 -> 100  (reduced for a quick test; use the original value for real training)
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
....


train_input_reader: {
  tf_record_input_reader {
    input_path: "/data/pascal.record"  # train TF Record 
  } 
  label_map_path: "/data/label_map.pbtxt" # label_map.pbtxt
}

eval_config: {
  #metrics_set: "coco_detection_metrics"
  #use_moving_averages: false
  num_examples: 8000   # left unchanged because eval is not used here
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/data/pascal.record"  # no separate tfrecord for eval yet (same file as training)
  }
  label_map_path: "/data/label_map.pbtxt" # only the path was changed; revisit later
  shuffle: false
  num_readers: 1
}

Changing the Faster RCNN-Resnet 50 pipeline settings

root@f46c490016e0:/workdir/models/research# cp ./object_detection/samples/configs/faster_rcnn_resnet50_coco.config  /checkpoints/models/configs
root@f46c490016e0:/workdir/models/research# vi /checkpoints/models/configs/faster_rcnn_resnet50_coco.config
model {
  faster_rcnn {
    num_classes: 2     # number of labels 90->2 (Cat/Dog)
    image_resizer {                 #  
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet50'     ## confirms that Resnet 50 is used
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }

....
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6       ## the IOU threshold can also be tuned
        max_detections_per_class: 100      ## as before, post-processing keeps at most 100 per class
        max_total_detections: 300          ## unlike before, the total maximum is 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}


train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "/checkpoints/models/resnet_v1_50_2016_08_28/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 300          ### total steps 200000->300 (changed for a quick test)
  data_augmentation_options { 
    random_horizontal_flip {
    }
  }
}

....  
###  same as the SSD config above

train_input_reader: {
  tf_record_input_reader {
    input_path: "/data/pascal.record"  # train TF Record 
  } 
  label_map_path: "/data/label_map.pbtxt" # label_map.pbtxt
}

eval_config: {
  num_examples: 8000                                  ## evaluation
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/data/pascal.record"  # only the path was changed; update again if eval is used later
  }
  label_map_path: "/data/label_map.pbtxt" # only the path was changed; revisit later
  shuffle: false
  num_readers: 1
}

Faster RCNN has to be run with FP32 precision, and the optimizer part currently has a problem.
Training does run, but this part needs a closer look.

Changing the SSD Inception v2 pipeline settings

root@f46c490016e0:/workdir/models/research# cp ./object_detection/samples/configs/ssd_inception_v2_coco.config  /checkpoints/models/configs 
root@f46c490016e0:/workdir/models/research# vi /checkpoints/models/configs/ssd_inception_v2_coco.config 

model {
  ssd {
    num_classes: 2   ## number of labels; see label_map.pbtxt
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5   ## IoU threshold for matching anchors to ground-truth boxes during training (not a cutoff on output_dict['detection_scores'])
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }

..........

    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }

..........
    feature_extractor {
      type: 'ssd_inception_v2'    # use Inception_v2 as the SSD feature_extractor
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6                 ## IOU Threshold 
        max_detections_per_class: 100  ## up to 100 per class              output_dict['detection_classes']
        max_total_detections: 100      ## up to 100 detections in total    output_dict['num_detections']
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 6     ## 24 -> 6  reduced because of my GPU's limits
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "/checkpoints/models/ssd_inception_v2_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 1000      ## 200000 -> 1000   reduced for a short test on a laptop
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "/data/pascal.record"          ## Train Record
  }
  label_map_path: "/data/label_map.pbtxt"      ## Train Labelmap
}

eval_config: {
  num_examples: 8000                                  ## evaluation
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/data/pascal.record"                 ## test record for evaluation
  }
  label_map_path: "/data/label_map.pbtxt"             ## label map for evaluation
  shuffle: false
  num_readers: 1
}

For a detailed analysis, see the earlier SSD analysis

num_examples in eval_config
https://github.com/tensorflow/models/issues/5059
https://stackoverflow.com/questions/47086630/what-does-num-examples-2000-mean-in-tensorflow-object-detection-config-file

2.3 SSD / Faster RCNN Training

NVIDIA provides shell scripts that make training easy to set up. SSD uses FP16 precision,
but Faster RCNN fails with an error under FP16, so be careful.
A quick look inside the shell script shows that it runs ./object_detection/model_main.py, so calling that directly works just as well.

Editing the training shell script and basic analysis

root@1bfb89078878:/workdir/models/research# vi ./examples/SSD320_FP16_1GPU.sh     // check and edit the pipeline config part
CKPT_DIR=${1:-"/results/SSD320_FP16_1GPU"}
### Add the pipelines and select one: Resnet50 or InceptionV2
## SSD-Resnet 50 Pipeline  (FP16 supported, default setting)
PIPELINE_CONFIG_PATH=${2:-"/workdir/models/research/configs"}"/ssd320_full_1gpus.config"

## SSD-Inception v2 Pipeline  (FP16 supported, added setting)
#PIPELINE_CONFIG_PATH=${2:-"/workdir/models/research/configs"}"/ssd_inception_v2_coco.config"

## Faster RCNN-Resnet 50 Pipeline (FP32 only)
#PIPELINE_CONFIG_PATH=${2:-"/workdir/models/research/configs"}"/faster_rcnn_resnet50_coco.config"

# Enable FP16 precision mode (comment this out for FP32)
export TF_ENABLE_AUTO_MIXED_PRECISION=1


TENSOR_OPS=0
export TF_ENABLE_CUBLAS_TENSOR_OP_MATH_FP32=${TENSOR_OPS}
export TF_ENABLE_CUDNN_TENSOR_OP_MATH_FP32=${TENSOR_OPS}
export TF_ENABLE_CUDNN_RNN_TENSOR_OP_MATH_FP32=${TENSOR_OPS}

time python -u ./object_detection/model_main.py \
       --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
       --model_dir=${CKPT_DIR} \
       --alsologtostder \
       "${@:3}"

Choose the pipeline you want in the script above, then run it as follows
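
The script takes its paths from positional arguments with defaults (the ${1:-...} and ${2:-...} forms above). As a reading aid, here is a small Python sketch of the same resolution; the function name is made up, and the config file name is the one left uncommented in the script.

```python
def resolve_training_paths(argv, config_name="ssd320_full_1gpus.config"):
    """Mimic ${1:-default}, ${2:-default} and "${@:3}" from the shell script."""
    # ${N:-default} falls back when the argument is missing or empty.
    ckpt_dir = argv[0] if len(argv) > 0 and argv[0] else "/results/SSD320_FP16_1GPU"
    config_dir = argv[1] if len(argv) > 1 and argv[1] else "/workdir/models/research/configs"
    extra_flags = list(argv[2:])  # passed through to model_main.py unchanged
    return ckpt_dir, config_dir + "/" + config_name, extra_flags
```

With the two arguments used in the runs below, the pipeline path resolves to /checkpoints/models/configs/ssd320_full_1gpus.config and the checkpoints go to the first argument.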

SSD-Resnet 50 Training

root@f46c490016e0:/workdir/models/research#  bash ./examples/SSD320_FP16_1GPU.sh /checkpoints/train_resnet /checkpoints/models/configs 
// 1st checkpoints path , output
// 2nd pipeline path
..........

The checkpoints produced by training are saved in: /checkpoints/train_resnet

Faster RCNN-Resnet 50 Training

root@f46c490016e0:/workdir/models/research#  bash ./examples/SSD320_FP16_1GPU.sh /checkpoints/fasterrcnn_train_resnet /checkpoints/models/configs 
// 1st checkpoints path , output
// 2nd pipeline path
..........

The checkpoints produced by training are saved in: /checkpoints/fasterrcnn_train_resnet

SSD-Inception V2 Training

root@1bfb89078878:/workdir/models/research# bash ./examples/SSD320_FP16_1GPU.sh /checkpoints/train_inception /checkpoints/models/configs 
// 1st checkpoints path , output
// 2nd pipeline path

The checkpoints produced by training are saved in: /checkpoints/train_inception

Checking the checkpoint files created by training (e.g. SSD-Resnet50)

root@f46c490016e0:/workdir/models/research# ls /checkpoints/train_resnet/
checkpoint                                   graph.pbtxt                       model.ckpt-0.index                  model.ckpt-300.data-00001-of-00002
eval                                         model.ckpt-0.data-00000-of-00002  model.ckpt-0.meta                   model.ckpt-300.index
events.out.tfevents.1574317170.7bdf29dc41cb  model.ckpt-0.data-00001-of-00002  model.ckpt-300.data-00000-of-00002  model.ckpt-300.meta
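
The step number in the file names (model.ckpt-0, model.ckpt-300 above) depends on num_steps, so the prefix needed later for export changes from run to run. The sketch below finds the newest prefix from the .index files alone; it is an illustrative helper that needs no Tensorflow, whereas inside Tensorflow the tf.train.latest_checkpoint call seen in model_main.py reads the checkpoint file instead.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(train_dir):
    """Return '<train_dir>/model.ckpt-N' for the largest N, or None if none exist."""
    steps = []
    for path in Path(train_dir).glob("model.ckpt-*.index"):
        match = re.fullmatch(r"model\.ckpt-(\d+)\.index", path.name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return str(Path(train_dir) / f"model.ckpt-{max(steps)}")
```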

About TF_ENABLE_AUTO_MIXED_PRECISION
https://medium.com/tensorflow/automatic-mixed-precision-in-tensorflow-for-faster-ai-training-on-nvidia-gpus-6033234b2540

2.4 SSD validation/evaluation

Validation reportedly uses part of the training data and serves to check the model during training; I still need to understand the exact settings involved.

Editing the shell script

root@f46c490016e0:/workdir/models/research# vi examples/SSD320_evaluate.sh  // set the pipeline as below
CHECKPINT_DIR=$1

TENSOR_OPS=0
export TF_ENABLE_CUBLAS_TENSOR_OP_MATH_FP32=${TENSOR_OPS}
export TF_ENABLE_CUDNN_TENSOR_OP_MATH_FP32=${TENSOR_OPS}
export TF_ENABLE_CUDNN_RNN_TENSOR_OP_MATH_FP32=${TENSOR_OPS}

## choose Resnet or Inception
python object_detection/model_main.py --checkpoint_dir $CHECKPINT_DIR --model_dir /results --run_once --pipeline_config_path /checkpoints/models/configs/ssd320_full_1gpus.config

# python object_detection/model_main.py --checkpoint_dir $CHECKPINT_DIR --model_dir /results --run_once --pipeline_config_path /checkpoints/models/configs/ssd_inception_v2_coco.config

Running validation

root@f46c490016e0:/workdir/models/research# bash examples/SSD320_evaluate.sh /checkpoints/train_resnet 
or 
root@f46c490016e0:/workdir/models/research# bash examples/SSD320_evaluate.sh /checkpoints/train_inception

To view the results above in Tensorboard, point it at the location below

root@f46c490016e0:/workdir/models/research# ls /results/eval/    // Tensorboard logs are written to /results/eval
events.out.tfevents.1574322912.74244b7e90c7

2.5 Basic Analysis of Training and Validation

Training and validation use the same command shown below, and my current understanding is that running training also runs validation.
The relevant options are already set in the pipeline config, which would explain why validation runs too;

the evidence is that the Tensorboard validation logs appear even when only training is run.

The validation-only command from earlier apparently just adds --run_once to perform a single eval-only pass.

root@f46c490016e0:/workdir/models/research# python object_detection/model_main.py -h

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

Binary to run train and evaluation on object detection model.
flags:

object_detection/model_main.py:
  --[no]allow_xla: Enable XLA compilation
    (default: 'false')
  --checkpoint_dir: Path to directory holding a checkpoint.  If `checkpoint_dir` is provided, this binary operates in eval-only mode, writing
    resulting metrics to `model_dir`.
  --eval_count: How many times the evaluation should be run
    (default: '1')
    (an integer)
  --[no]eval_training_data: If training data should be evaluated for this job. Note that one call only use this in eval-only mode, and
    `checkpoint_dir` must be supplied.
    (default: 'false')
  --hparams_overrides: Hyperparameter overrides, represented as a string containing comma-separated hparam_name=value pairs.
  --model_dir: Path to output model directory where event and checkpoint files will be written.
  --num_train_steps: Number of train steps.
    (an integer)
  --pipeline_config_path: Path to pipeline config file.
  --[no]run_once: If running in eval-only mode, whether to run just one round of eval vs running continuously (default).
    (default: 'false')
  --sample_1_of_n_eval_examples: Will sample one of every n eval input examples, where n is provided.
    (default: '1')
    (an integer)
  --sample_1_of_n_eval_on_train_examples: Will sample one of every n train input examples for evaluation, where n is provided. This is only used if
    `eval_training_data` is True.
    (default: '5')
    (an integer)

root@f46c490016e0:/workdir/models/research# vi object_detection/model_main.py 
..........
  if FLAGS.checkpoint_dir:
    if FLAGS.eval_training_data:    ## FALSE by default
      name = 'training_data'
      input_fn = eval_on_train_input_fn   
    else:
      name = 'validation_data'     ## name is set to this
      # The first eval input will be evaluated.
      input_fn = eval_input_fns[0]
    if FLAGS.run_once:             ## with --run_once, a single evaluation pass runs here
      estimator.evaluate(input_fn,
                         steps=None,
                         checkpoint_path=tf.train.latest_checkpoint(
                             FLAGS.checkpoint_dir))
    else:                          ## without --run_once, evaluation repeats as new checkpoints appear (still eval-only)
      model_lib.continuous_eval(estimator, FLAGS.checkpoint_dir, input_fn,
                                train_steps, name)  
.........
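
The branch structure in the excerpt can be summarized as a small decision function. This is only a sketch of my reading of the flags: the mode names are labels I made up, and the training case (no --checkpoint_dir) lies in the part of model_main.py that is elided above, where training and evaluation run together.

```python
def model_main_mode(checkpoint_dir=None, run_once=False, eval_training_data=False):
    """Return (mode, eval_name) for a given combination of model_main.py flags."""
    if not checkpoint_dir:
        # No checkpoint_dir: train and evaluate together (the elided branch).
        return ("train_and_evaluate", None)
    name = "training_data" if eval_training_data else "validation_data"
    if run_once:
        return ("evaluate_once", name)
    return ("continuous_eval", name)
```

Under this reading, SSD320_evaluate.sh corresponds to ("evaluate_once", "validation_data") and SSD320_FP16_1GPU.sh to ("train_and_evaluate", None).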

There is also a simpler training command, and using that is fine too.

3. Inference (ckpt -> pb)

Once training finishes, convert the checkpoint to a pb file for final inference as below.
The checkpoint file name depends on the number of steps in the pipeline, so adjust the commands below to your own settings.

SSD-Resnet 50 inference

root@f46c490016e0:/workdir/models/research# python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path /checkpoints/models/configs/ssd320_full_1gpus.config \
    --trained_checkpoint_prefix  /checkpoints/train_resnet/model.ckpt-100 \
    --output_directory /checkpoints/train_resnet/inference_graph_100

SSD-Inception V2 inference

root@f46c490016e0:/workdir/models/research# python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path /checkpoints/models/configs/ssd_inception_v2_coco.config \
    --trained_checkpoint_prefix  /checkpoints/train_inception/model.ckpt-100 \
    --output_directory /checkpoints/train_inception/inference_graph_100

Faster RCNN-Resnet 50 inference

root@f46c490016e0:/workdir/models/research# python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path /checkpoints/models/configs/faster_rcnn_resnet50_coco.config \
    --trained_checkpoint_prefix  /checkpoints/fasterrcnn_train_resnet/model.ckpt-100 \
    --output_directory /checkpoints/fasterrcnn_train_resnet/inference_graph_100

4. Checking with Tensorboard

Analyze the Tensorflow logs after training or validation has finished
Only the training-related part is analyzed here

Tensorboard

root@f46c490016e0:/workdir/models/research# tensorboard --logdir=/checkpoints/train_resnet  // SSD-Resnet50
or 
root@f46c490016e0:/workdir/models/research# tensorboard --logdir=/checkpoints/train_inception     //SSD-Inceptionv2  
or
root@f46c490016e0:/workdir/models/research# tensorboard --logdir=/checkpoints/fasterrcnn_train_resnet     //Faster RCNN-Resnet50

Connecting to Tensorboard from a browser

http://localhost:6006/

5. Object Detection TEST

Using jupyter, take the pb file created above, prepare test images, edit the related source, and run the final test

root@f46c490016e0:/workdir/models/research# jupyter notebook --ip=0.0.0.0 --port=8888 --allow-root

Connecting to Jupyter

http://localhost:8888/

Run object_detection/object_detection_tutorial.ipynb to verify

5.1 Changes to object_detection_tutorial.ipynb

With the exported pb file, a quick test is possible by editing the source in object_detection/object_detection_tutorial.ipynb.

Skip the download step and change the variables

MODEL_NAME = '/checkpoints/train_resnet/inference_graph_100'
PATH_TO_LABELS = os.path.join('/data', 'label_map.pbtxt')
PATH_TO_TEST_IMAGES_DIR = '/data/test_images'

The original source downloads a pre-trained model and uses it directly; change it to use the model we exported
and run the test on our own test images
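
The three variables lead to a frozen graph path, a label map path, and a list of test images. The sketch below derives them in one place; it assumes the export step wrote frozen_inference_graph.pb into the output directory and that the test images are .jpg files, and the function name is hypothetical.

```python
import os
from pathlib import Path


def notebook_paths(model_dir, data_dir="/data", test_dir_name="test_images"):
    """Return (frozen graph path, label map path, sorted list of test image paths)."""
    frozen_graph = os.path.join(model_dir, "frozen_inference_graph.pb")
    labels = os.path.join(data_dir, "label_map.pbtxt")
    test_dir = Path(data_dir) / test_dir_name
    # Pick up every jpg instead of a fixed image1.jpg, image2.jpg naming scheme.
    images = sorted(str(p) for p in test_dir.glob("*.jpg"))
    return frozen_graph, labels, images
```

Calling notebook_paths('/checkpoints/train_resnet/inference_graph_100') would give values matching the variables set above.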

If a GPU memory problem occurs, add the following

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)

Allocator (GPU_0_bfc) ran out of memory trying to allocate
https://eehoeskrap.tistory.com/360

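5.2 Understanding the Basic Source of object_detection_tutorial.ipynb

The key part of this source is run_inference_for_single_image; the values it returns are applied to the test images to check the result.

The keys listed in the for key in [...] loop below are tied to the pipeline config defined above, so keep this in mind. (based on SSD)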

def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])       ### the pipeline currently sets the maximum to 100, so 100 detections are always returned
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)                                ### class ids of the 100 detections, based on my label_map.pbtxt
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]          ### bbox coordinates of the 100 detections
      output_dict['detection_scores'] = output_dict['detection_scores'][0]        ### confidence of each of the 100 detections; only those above the threshold are drawn
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict

for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,                                          ## output image
      output_dict['detection_boxes'],      ## 100 bboxes, 4 normalized values each: [ymin, xmin, ymax, xmax]; pixel value = y * height, x * width
      output_dict['detection_classes'],    ## 100 class ids (1 or 2)
      output_dict['detection_scores'],     ## 100 confidence values (only 0.5 and above are drawn)
      category_index,                                    ## label_map.pbtxt information defined above
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=8)                                  ## line thickness
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image_np)
  plt.title(image_path)

 # print("num_detections",output_dict['num_detections'])         ### after training, up to 100 detections are returned (see the SSD pipeline section above)
 # print("detection_boxes",output_dict['detection_boxes'])       ### box positions of the 100 detections
 # print("detection_classes",output_dict['detection_classes'])   ### class of the 100 detections, 1 or 2 (only 1 and 2 are declared)
 # print("detection_scores",output_dict['detection_scores'])     ### confidence of the 100 detections; only those above the threshold are drawn

  for i,v in enumerate (output_dict['detection_scores']):   ### i : index  v: list member
      if v > 0.5:                                           ### print only those above 0.5 out of the 100
        print("  - class-name:", category_index.get(output_dict['detection_classes'][i]).get('name') )   ### category_index comes from the label_map.pbtxt defined above and gives the name
        print("  - confidence: ",v * 100 )                  ### convert to percent
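The threshold loop above can be packaged as a small helper that also converts the normalized boxes to pixel coordinates. This is a sketch under stated assumptions: output_dict holds per-image sequences as returned by run_inference_for_single_image, boxes are normalized [ymin, xmin, ymax, xmax], and category_index maps a class id to a dict with a 'name' entry. The function name filter_detections is hypothetical.

```python
def filter_detections(output_dict, category_index, image_width, image_height,
                      min_score=0.5):
    """Keep detections scoring above min_score and convert boxes to pixels."""
    results = []
    for box, cls, score in zip(output_dict["detection_boxes"],
                               output_dict["detection_classes"],
                               output_dict["detection_scores"]):
        if score > min_score:
            ymin, xmin, ymax, xmax = box
            # Unknown ids fall back to a placeholder instead of raising.
            name = category_index.get(int(cls), {}).get("name", "unknown")
            results.append({
                "name": name,
                "score": float(score),
                # (left, top, right, bottom) in pixels
                "box_px": (int(round(xmin * image_width)),
                           int(round(ymin * image_height)),
                           int(round(xmax * image_width)),
                           int(round(ymax * image_height))),
            })
    return results
```

Inside the loop over TEST_IMAGE_PATHS it could be called with image_np.shape[1] as the width and image_np.shape[0] as the height.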

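6. Conclusion

SSD and Faster RCNN basically work well. Simple tests are possible on my laptop,
but increasing STEPS for the final test was too heavy, so I ran it on a server.
(Almost all problems on the laptop were related to GPU memory.)
On a laptop you always have to watch GPU memory, it has limits, and it behaves differently from the server, so be careful.

I spent about a month on Transfer Learning and Fine Tuning because of work for an acquaintance, and I think it becomes familiar quickly with a bit more practice.

I may try it again if I get the chance; there is no need to think of it as too difficult.

Always check GPU memory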

$ watch -n 0.1 nvidia-smi
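When the memory figures need to be logged from a script rather than watched in a terminal, nvidia-smi can print them as CSV. A minimal sketch, assuming nvidia-smi is on the PATH; the function names parse_gpu_memory and gpu_memory_usage are hypothetical:

```python
import subprocess

# Query used and total memory in MiB, one CSV row per GPU.
QUERY_CMD = ["nvidia-smi", "--query-gpu=memory.used,memory.total",
             "--format=csv,noheader,nounits"]


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows into a list of (used_mib, total_mib)."""
    usage = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        usage.append((used, total))
    return usage


def gpu_memory_usage():
    """Run nvidia-smi and return [(used_mib, total_mib), ...]."""
    out = subprocess.check_output(QUERY_CMD, universal_newlines=True)
    return parse_gpu_memory(out)
```

Calling gpu_memory_usage() between training runs gives one tuple per GPU; the parsing is kept separate so it can be checked without a GPU.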

7. Other references and sites to review later

Other references

  https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md
  https://github.com/vijendra1125/Tensorflow_Object_detection_API-Custom_Faster_RCNN
  https://github.com/vijendra1125/Tensorflow_Object_detection_API-Custom_Faster_RCNN/issues/1

Various TFRecord format references

https://github.com/tensorflow/models/tree/master/research/object_detection/dataset_tools
https://github.com/tensorflow/models/blob/master/research/object_detection/dataset_tools/create_pascal_tf_record.py

Review and organize the related sites below later

Object Detection references

https://bcho.tistory.com/1192
https://ukayzm.github.io/python-object-detection-tensorflow/

TF-TRT (TensorFlow and TensorRT)

https://developers-kr.googleblog.com/2018/05/tensorrt-integration-with-tensorflow.html

CHPT and PT

  https://gusrb.tistory.com/21
  http://jaynewho.com/post/8
  https://goodtogreate.tistory.com/entry/Saving-and-Restoring

Custom Object Detection

  https://github.com/5taku/custom_object_detection
  https://github.com/engiego/Custom-Object-Detection-engiegocustom
  https://www.slideshare.net/fermat39/mlnet-automl?next_slideshow=1
  https://towardsdatascience.com/custom-object-detection-using-tensorflow-from-scratch-e61da2e10087
  https://medium.com/coinmonks/tensorflow-object-detection-with-custom-objects-34a2710c6de5
  https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73
  https://hwauni.tistory.com/entry/API-Custom-Object-Detection-API-Tutorial-%EB%8D%B0%EC%9D%B4%ED%84%B0-%EC%A4%80%EB%B9%84-Part-1
  https://pythonprogramming.net/training-custom-objects-tensorflow-object-detection-api-tutorial/
  https://jameslittle.me/blog/2019/tensorflow-object-detection

Colab

https://hackernoon.com/object-detection-in-google-colab-with-custom-dataset-5a7bb2b0e97e
https://medium.com/analytics-vidhya/detecting-fires-using-tensorflow-b5b148952495

11/11/2019

LabelImg - Annotation

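1. Installing and modifying LabelImg

LabelImg seems to be the most widely used annotation tool. Annotations can be saved in Pascal VOC format (XML) or in YOLO format (text).

LabelImg installation instructions and source
https://github.com/tzutalin/labelImg

Download LabelImg and install the required packages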

$ cd /works/custom
$ git clone https://github.com/tzutalin/labelImg.git
$ cd labelImg

$ sudo apt install pyqt5-dev-tools
$ sudo pip3 install -r requirements/requirements-linux-python3.txt
$ make qt5py3

I installed from source as above so that I could change the hotkeys directly in the source.
If you just want an easy install, pip install also works.

Reference
https://eehoeskrap.tistory.com/331

Changing the hotkeys directly in the source

To annotate faster, I changed the hotkeys in the source as shown below.
The source is written in Python and easy to follow, so install from source and change whatever you want.

$ vi labelImg.py 
        openNextImg = action(getStr('nextImg'), self.openNextImg,
#                             'd', 'next', getStr('nextImgDetail'))
                             'f', 'next', getStr('nextImgDetail'))
....
        openPrevImg = action(getStr('prevImg'), self.openPrevImg,
#                             'a', 'prev', getStr('prevImgDetail'))
                             's', 'prev', getStr('prevImgDetail'))
....
        save = action(getStr('save'), self.saveFile,
#                     'Ctrl+S', 'save', getStr('saveDetail'), enabled=False)
                      'a', 'save', getStr('saveDetail'), enabled=False)
....
        createMode = action(getStr('crtBox'), self.setCreateMode,
#                            'w', 'new', getStr('crtBoxDetail'), enabled=False)
                            'e', 'new', getStr('crtBoxDetail'), enabled=False)
....
        create = action(getStr('crtBox'), self.createShape,
#                        'w', 'new', getStr('crtBoxDetail'), enabled=False)
                        'e', 'new', getStr('crtBoxDetail'), enabled=False)

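Hotkey
w (Create a rect box) : changed to e
d (Next image) : changed to f
a (Previous image) : changed to s
Ctrl + s (Save) : changed to a

I changed the hotkeys this way so that all of them sit under one hand for faster editing.

Running LabelImg (arguments are optional)

1st Arg : path of the images to annotate

2nd Arg : file defining the classes to attach during annotation (can be chosen directly)

Predefined classes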
https://github.com/tzutalin/labelImg/blob/master/data/predefined_classes.txt

$ python3 labelImg.py

$ python3 labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]

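After launch, use the menu on the left

OpenDir : set the image location
Change Save Dir : set the XML save location

On launch, LabelImg loads the XML files from the previously used save location and draws the bounding boxes.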

https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#annotating-images

1.1 How to rename files

After downloading the images for your dataset, you will often want to rename many files at once.
Use the rename command as shown below, or write a simple shell script.

Renaming files with the rename command

$ ls 
image_01.jpg image_02.jpg image_10.jpg 

$ rename 's/image/image_test/' *.jpg 
$ ls
image_test_01.jpg image_test_02.jpg image_test_10.jpg

The rename command works when the names share a pattern that can be matched and replaced.
If there is no such pattern, use a shell script like the one below and rename the files however you like.

rename.sh shell script (written because the rename command has limits)

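	#!/bin/bash
	#
	# rename jpg files in sequence with same pattern
	#
	# Author : Jeonghun Lee
	# Version : 0.1
	#
	# ./rename.sh
	# or
	# ./rename.sh 1stArg 2ndArg
	#   1stArg : prefix of the new names (default "image_")
	#   2ndArg : file extension to match (default ".jpg")
	CNT=0
	PREFIX=${1:-"image_"}
	POSTFIX=${2:-".jpg"}

	echo -e "\e[91mStart: rename all ${POSTFIX} files to ${PREFIX}x${POSTFIX}\e[39m\n"

	for FILE in *"$POSTFIX"
	do
		# skip the literal pattern when no file matches
		[ -e "$FILE" ] || continue
		NAME=${PREFIX}${CNT}${POSTFIX}
		echo -e "ORG:$FILE \e[34m NEW:$NAME \e[39m Index:$CNT"
		# quote both names so files containing spaces are handled
		mv -- "$FILE" "$NAME"
		CNT=$((CNT+1))
	done

	echo -e "\e[91mfinished \e[39m\n"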


Running the rename.sh shell script

$ cd ~/works/custom/data/images
$ chmod +x rename.sh
$ ./rename.sh          // finds *.jpg files and renames them to image_N.jpg (N = 0, 1, 2, ...)
or 
$ ./rename.sh Image_data .png  // finds *.png files and renames them to Image_dataN.png
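The same sequential rename can be written in Python. The sketch below is not the author's script: it adds one behavior of its own, a pass through temporary names, so that a file already called image_1.jpg is not overwritten when another file is renamed to that name. The function name rename_in_sequence and the temporary-name prefix are hypothetical.

```python
import os


def rename_in_sequence(directory, prefix="image_", postfix=".jpg"):
    """Rename every file ending in postfix to <prefix><index><postfix>.

    Returns a list of (old_name, new_name) pairs in sorted order.
    """
    names = sorted(n for n in os.listdir(directory) if n.endswith(postfix))

    # Pass 1: move everything to temporary names so targets are free.
    temp_names = []
    for index, old in enumerate(names):
        temp = ".tmp_rename_{}{}".format(index, postfix)
        os.rename(os.path.join(directory, old), os.path.join(directory, temp))
        temp_names.append(temp)

    # Pass 2: move the temporary names to the final sequential names.
    renamed = []
    for index, (old, temp) in enumerate(zip(names, temp_names)):
        new = "{}{}{}".format(prefix, index, postfix)
        os.rename(os.path.join(directory, temp), os.path.join(directory, new))
        renamed.append((old, new))
    return renamed
```

The returned pairs can be printed to review the mapping, much like the ORG/NEW lines echoed by rename.sh.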

1.2 Building the dataset and annotating

Dataset layout

Create the layout below and place the images you want to use under images.

$ cd ~/works/fire/data
$ mkdir -p images/train
$ mkdir -p images/test
$ mkdir -p annotation/train
$ mkdir -p annotation/test

$ tree -L 2
.
├── annotation
│   ├── test    // test xml  (pascal voc type)
│   └── train   // train xml (pascal voc type)
└── images
    ├── test    // test image (eval)
    └── train   // train image
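For repeated experiments the same layout can be created from Python. A small sketch, assuming the root directory is passed in by the caller; make_dataset_dirs is a hypothetical helper name:

```python
import os


def make_dataset_dirs(root):
    """Create images/{train,test} and annotation/{train,test} under root."""
    created = []
    for kind in ("images", "annotation"):
        for split in ("train", "test"):
            path = os.path.join(root, kind, split)
            # exist_ok avoids an error when the directory is already there.
            os.makedirs(path, exist_ok=True)
            created.append(path)
    return created
```

For the layout above it would be called as make_dataset_dirs(os.path.expanduser("~/works/fire/data")).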

Defining the LabelImg classes

$ cd /works/custom/labelImg

$ vi data/fire_classes.txt
fire
smoke

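Note
The names defined above must be exactly the same as the names defined in label_map.pbtxt,
because the lookup is done using only the name stored in the XML.

labelImg (train)

Run with the image path and the class text path.
Use Change Save Dir on the left to set the XML save location to ~/works/fire/data/annotation/train.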
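Because a mismatched name only surfaces later as a lookup error during conversion, it can help to compare the names up front. A minimal sketch using only the standard library, assuming Pascal VOC XML as written by LabelImg and a label_map.pbtxt in the item/id/name form used in this post; all three function names are hypothetical:

```python
import re
import xml.etree.ElementTree as ET


def label_map_names(pbtxt_text):
    """Extract the name of every item from label_map.pbtxt text."""
    # \b keeps 'display_name' from matching as 'name'.
    return set(re.findall(r"\bname\s*:\s*['\"]([^'\"]+)['\"]", pbtxt_text))


def annotation_names(xml_text):
    """Extract every <object><name> value from one Pascal VOC XML string."""
    root = ET.fromstring(xml_text)
    return {obj.findtext("name") for obj in root.iter("object")}


def unknown_names(xml_text, pbtxt_text):
    """Return annotation names that are missing from the label map."""
    return annotation_names(xml_text) - label_map_names(pbtxt_text)
```

An empty result from unknown_names means every annotated name has a matching label map entry; the comparison is case-sensitive, so 'Smoke' and 'smoke' count as different names.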

$ python3 labelImg.py ~/works/fire/data/images/train  data/fire_classes.txt

labelImg (test)

Run with the image path and the class text path.
Use Change Save Dir on the left to set the XML save location to ~/works/fire/data/annotation/test.
The change does not take effect immediately, so refresh with Prev Image / Next Image.

$ python3 labelImg.py ~/works/fire/data/images/test  data/fire_classes.txt

1.3 Creating and defining the label map file

Choose the items you want and define a name and id for each.

Creating the label map

$ cd ~/works/fire/data  // names must match the ones defined above
$ vi label_map.pbtxt
item {
    id: 1
    name: 'fire'
}

item {
    id: 2
    name: 'smoke'
}

https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#creating-label-map
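Since the class file and the label map must carry identical names, the label map can be generated from the class file instead of being typed twice. A sketch under the assumption that the class file has one name per line, as in data/fire_classes.txt above; ids start at 1 in file order, and both function names are hypothetical:

```python
def read_class_names(path):
    """Read one class name per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def build_label_map(class_names):
    """Return label_map.pbtxt text; ids start at 1 in list order."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append(
            "item {{\n    id: {}\n    name: '{}'\n}}\n".format(index, name))
    return "\n".join(items)
```

Writing build_label_map(read_class_names("data/fire_classes.txt")) to label_map.pbtxt would reproduce the two items shown above.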

1.4 Converting to TFRecord files

Creating TFRecords requires TensorFlow, so install it beforehand or work inside Docker.

Tensorflow Manual
https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#creating-tensorflow-records

Starting TensorFlow and preparing

$ docker run --gpus all --rm -it \
--shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
-p 8888:8888 -p 6006:6006  \
-v /home/jhlee/works/fire/data:/data \
--ipc=host \
--name nvidia_ssd \
nvidia_ssd

root@3aac229c45c3:/workdir/models/research# pip install lxml    // required for XML parsing

root@3aac229c45c3:/workdir/models/research# vi create_pascal_tf_record.py  // write it with the source shown below

Generating TFRecords from the Pascal VOC set

Note that --data_dir must not be set to /data/images/train or /data/images/test,
because, as the source below shows, the image path is built from --data_dir plus the folder value in the XML.

root@3aac229c45c3:/workdir/models/research#  python create_pascal_tf_record.py \
 --data_dir=/data/images \
 --annotations_dir=/data/annotation/train \
 --label_map_path=/data/label_map.pbtxt \
 --output_path=/data/train.record

/workdir/models/research/object_detection/utils/dataset_util.py:75: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
  if not xml:

root@3aac229c45c3:/workdir/models/research#  python create_pascal_tf_record.py \
 --data_dir=/data/images \
 --annotations_dir=/data/annotation/test \
 --label_map_path=/data/label_map.pbtxt \
 --output_path=/data/test.record

/workdir/models/research/object_detection/utils/dataset_util.py:75: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
  if not xml:
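The --data_dir caveat above can be checked before running the conversion. The sketch below mirrors how the converter joins --data_dir with the folder and filename fields of each XML and reports images that would not be found. It uses the standard library's ElementTree instead of lxml, and the function names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def image_path_from_root(data_dir, root):
    """Join data_dir with the <folder> and <filename> of one annotation."""
    folder = root.findtext("folder", default="")
    filename = root.findtext("filename", default="")
    return os.path.join(data_dir, folder, filename)


def resolve_image_path(data_dir, xml_text):
    """Same as image_path_from_root, starting from an XML string."""
    return image_path_from_root(data_dir, ET.fromstring(xml_text))


def missing_images(data_dir, annotations_dir):
    """Return resolved image paths that do not exist on disk."""
    missing = []
    for name in sorted(os.listdir(annotations_dir)):
        if not name.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(annotations_dir, name)).getroot()
        path = image_path_from_root(data_dir, root)
        if not os.path.isfile(path):
            missing.append(path)
    return missing
```

With the layout used here, missing_images("/data/images", "/data/annotation/train") should return an empty list when the folder value in each XML is train.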

2. Analyzing the create_pascal_tf_record.py source

In short, create_pascal_tf_record.py reads each XML annotation, loads the matching JPEG image, and writes both into a TFRecord.

	# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.
	# You may obtain a copy of the License at
	#
	# http://www.apache.org/licenses/LICENSE-2.0
	#
	# Unless required by applicable law or agreed to in writing, software
	# distributed under the License is distributed on an "AS IS" BASIS,
	# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	# See the License for the specific language governing permissions and
	# limitations under the License.
	# ==============================================================================

	r"""Convert raw PASCAL dataset to TFRecord for object_detection.
	Example usage:

	python object_detection/dataset_tools/create_pascal_tf_record.py \
	--data_dir=/home/user/VOCdevkit \
	--output_path=/home/user/pascal.record \
	--label_map_path=/home/user/dataset/label.pbtxt


	"""
	from __future__ import absolute_import
	from __future__ import division
	from __future__ import print_function

	import hashlib
	import io
	import logging
	import os

	from lxml import etree
	import PIL.Image
	import tensorflow as tf

	from object_detection.utils import dataset_util
	from object_detection.utils import label_map_util


	flags = tf.app.flags
	flags.DEFINE_string('data_dir', '', 'Root directory to raw PASCAL VOC dataset.')
	flags.DEFINE_string('set', 'train', 'Convert training set, validation set or '
	                    'merged set.')
	flags.DEFINE_string('annotations_dir', 'Annotations',
	                    '(Relative) path to annotations directory.')
	flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
	flags.DEFINE_string('label_map_path', 'data/pascal_label_map.pbtxt',
	                    'Path to label map proto')
	flags.DEFINE_boolean('ignore_difficult_instances', False, 'Whether to ignore '
	                     'difficult instances')
	FLAGS = flags.FLAGS

	SETS = ['train', 'val', 'trainval', 'test']

	def dict_to_tf_example(data,
	                       dataset_directory,
	                       label_map_dict,
	                       ignore_difficult_instances=False):
	  """Convert XML derived dict to tf.Example proto.

	  Notice that this function normalizes the bounding box coordinates provided
	  by the raw data.

	  Args:
	    data: dict holding PASCAL XML fields for a single image (obtained by
	      running dataset_util.recursive_parse_xml_to_dict)
	    dataset_directory: Path to root directory holding PASCAL dataset
	    label_map_dict: A map from string label names to integers ids.
	    ignore_difficult_instances: Whether to skip difficult instances in the
	      dataset (default: False).

	  Returns:
	    example: The converted tf.Example.

	  Raises:
	    ValueError: if the image pointed to by data['filename'] is not a valid JPEG
	  """
	  ## Image path: --data_dir / folder from the XML / filename
	  img_path = os.path.join(data['folder'], data['filename'])
	  full_path = os.path.join(dataset_directory, img_path)
	  with tf.gfile.GFile(full_path, 'rb') as fid:
	    encoded_jpg = fid.read()
	  encoded_jpg_io = io.BytesIO(encoded_jpg)
	  image = PIL.Image.open(encoded_jpg_io)
	  if image.format != 'JPEG':  ## only JPEG is accepted
	    raise ValueError('Image format not JPEG')
	  key = hashlib.sha256(encoded_jpg).hexdigest()

	  width = int(data['size']['width'])  ## XML information
	  height = int(data['size']['height'])

	  xmin = []
	  ymin = []
	  xmax = []
	  ymax = []
	  classes = []
	  classes_text = []
	  truncated = []
	  poses = []
	  difficult_obj = []
	  if 'object' in data:
	    for obj in data['object']:  # <object> entries of the XML
	      difficult = bool(int(obj['difficult']))  # <difficult> of the XML
	      if ignore_difficult_instances and difficult:
	        continue

	      difficult_obj.append(int(difficult))

	      ## XML information, normalized to the 0..1 range
	      xmin.append(float(obj['bndbox']['xmin']) / width)
	      ymin.append(float(obj['bndbox']['ymin']) / height)
	      xmax.append(float(obj['bndbox']['xmax']) / width)
	      ymax.append(float(obj['bndbox']['ymax']) / height)
	      classes_text.append(obj['name'].encode('utf8'))
	      classes.append(label_map_dict[obj['name']])
	      truncated.append(int(obj['truncated']))
	      poses.append(obj['pose'].encode('utf8'))

	  example = tf.train.Example(features=tf.train.Features(feature={
	      'image/height': dataset_util.int64_feature(height),
	      'image/width': dataset_util.int64_feature(width),
	      'image/filename': dataset_util.bytes_feature(
	          data['filename'].encode('utf8')),
	      'image/source_id': dataset_util.bytes_feature(
	          data['filename'].encode('utf8')),
	      'image/key/sha256': dataset_util.bytes_feature(key.encode('utf8')),
	      'image/encoded': dataset_util.bytes_feature(encoded_jpg),
	      'image/format': dataset_util.bytes_feature('jpeg'.encode('utf8')),
	      'image/object/bbox/xmin': dataset_util.float_list_feature(xmin),
	      'image/object/bbox/xmax': dataset_util.float_list_feature(xmax),
	      'image/object/bbox/ymin': dataset_util.float_list_feature(ymin),
	      'image/object/bbox/ymax': dataset_util.float_list_feature(ymax),
	      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
	      'image/object/class/label': dataset_util.int64_list_feature(classes),
	      'image/object/difficult': dataset_util.int64_list_feature(difficult_obj),
	      'image/object/truncated': dataset_util.int64_list_feature(truncated),
	      'image/object/view': dataset_util.bytes_list_feature(poses),
	  }))
	  return example


	def main(_):
	  if FLAGS.set not in SETS:
	    raise ValueError('set must be in : {}'.format(SETS))

	  ## --data_dir is the directory in front of the images; the image path is
	  ## built from --data_dir and the folder value of the XML
	  data_dir = FLAGS.data_dir
	  writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
	  label_map_dict = label_map_util.get_label_map_dict(FLAGS.label_map_path)

	  ## If --annotations_dir is an absolute path, os.path.join returns that
	  ## path as-is and data_dir is ignored
	  annotations_dir = os.path.join(data_dir, FLAGS.annotations_dir)

	  ## Keep only the *.xml files and strip the extension. The original loop
	  ## deleted entries while iterating over the list, which can skip entries.
	  examples_list = [el[0:-4] for el in os.listdir(annotations_dir)
	                   if el[-4:] == '.xml']

	  for idx, example in enumerate(examples_list):
	    if idx % 100 == 0:
	      logging.info('On image %d of %d', idx, len(examples_list))
	    path = os.path.join(annotations_dir, example + '.xml')
	    with tf.gfile.GFile(path, 'r') as fid:
	      xml_str = fid.read()
	    xml = etree.fromstring(xml_str)
	    ## parse the <annotation> element of the XML
	    data = dataset_util.recursive_parse_xml_to_dict(xml)['annotation']

	    tf_example = dict_to_tf_example(data, FLAGS.data_dir, label_map_dict,
	                                    FLAGS.ignore_difficult_instances)
	    writer.write(tf_example.SerializeToString())

	  writer.close()


	if __name__ == '__main__':
	  tf.app.run()


https://github.com/vijendra1125/Tensorflow_Object_detection_API-Custom_Faster_RCNN/blob/master/extra/create_pascal_tf_record.py
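The normalization step in dict_to_tf_example can be tried on a single annotation without TensorFlow. The sketch below reads the same size and bndbox fields with the standard library's ElementTree and divides by width and height in the same way; it illustrates the arithmetic only and does not replace the converter. The function name normalized_boxes is hypothetical.

```python
import xml.etree.ElementTree as ET


def normalized_boxes(xml_text):
    """Return [(name, xmin, ymin, xmax, ymax)] scaled to the 0..1 range."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            float(box.findtext("xmin")) / width,   # x values divide by width
            float(box.findtext("ymin")) / height,  # y values divide by height
            float(box.findtext("xmax")) / width,
            float(box.findtext("ymax")) / height,
        ))
    return boxes
```

Values outside the 0..1 range in the result usually point to a size entry in the XML that does not match the annotated image.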

11/10/2019

COCO Set Annotation Tools

1. COCOSET Annotation Tools

Annotation tools for COCO sets

From trying it out, editing a COCO set is done over an internet connection, so it seems hard to use for a custom dataset.
  https://eehoeskrap.tistory.com/353

Other annotation tools
  https://patrickwasp.com/create-your-own-coco-style-dataset/

Other annotation tools
  https://rectlabel.com/
  https://rectlabel.com/help
