Jeonghun (James) Lee: AI-OCR


5/19/2019

tesseract OCR

1. What is Tesseract?

Tesseract is said to be one of the most widely used open-source OCR engines. Development started at HP, and Google later sponsored its continued development.

It handles the core OCR task, image to text, well, and is said to support right-to-left scripts such as Arabic.
It supports many languages, and reportedly any language can be supported as long as trained data exists for it.

As the last part of the Wikipedia article also notes, Tesseract's results are poor unless the input image is preprocessed appropriately.
Preprocessing with OpenCV therefore looks important.

https://en.wikipedia.org/wiki/Tesseract_(software)

Official homepage and related documentation

  https://opensource.google.com/projects/tesseract
  https://github.com/tesseract-ocr/tesseract
  https://github.com/tesseract-ocr/tesseract/wiki

1.1 License plate recognition

I became interested because of the article below, so I did a quick install to test it. It seems to work well on ordinary text (books, documents).

I also installed it on an embedded board, and the performance was acceptable there too.

  http://www.epnc.co.kr/news/articleView.html?idxno=83022

When applying it to license plate recognition:

Detect the car with something like YOLO
Locate the license plate on the car
Preprocess the plate image with OpenCV
Run Tesseract
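
The four steps above can be sketched as a chain of stages. This is only a structural sketch, not a working recognizer: `detect_cars`, `find_plate`, `preprocess`, and `ocr` are hypothetical callables (none of them come from the post) standing in for a YOLO-style detector, a plate locator, OpenCV preprocessing, and Tesseract.

```python
def read_plates(frame, detect_cars, find_plate, preprocess, ocr):
    """Run the four stages in order and return one string per plate found.

    All four stage arguments are hypothetical callables:
      detect_cars(frame) -> list of car crops   (e.g. a YOLO detector)
      find_plate(car)    -> plate crop or None  (plate localization)
      preprocess(plate)  -> cleaned image       (e.g. OpenCV thresholding)
      ocr(image)         -> recognized text     (e.g. Tesseract)
    """
    results = []
    for car in detect_cars(frame):
        plate = find_plate(car)
        if plate is None:
            continue  # no plate located in this car crop
        text = ocr(preprocess(plate)).strip()
        if text:
            results.append(text)
    return results
```

Keeping each stage as a separate function makes it easy to swap the preprocessing step, which is the part expected to matter most for OCR quality.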

For the best results, version 4.0 is probably the one to use, and both locating the plate and the OpenCV preprocessing look very important.

Finally, it will probably be necessary to build a dedicated Tesseract data set for license plates and train on it.

1.2 Installing and using Tesseract

Version 4.0 is the latest at the time of writing, and it is easy to install on Ubuntu.
Tesseract 4.0 adds support for LSTM (Long Short-Term Memory), a type of RNN.

Detailed installation guides are easy to find online. Installation is simple, the tool is easy to use, and it can also be driven from Python.

OS: Ubuntu 16.04 LTS

Installing Tesseract 4.0 (steps omitted here; see the links below)

  https://m.blog.naver.com/tommybee/221307497468
  https://www.pyimagesearch.com/2018/09/17/opencv-ocr-and-text-recognition-with-tesseract/
  http://m.blog.daum.net/rayolla/1141?tp_nil_a=2

About LSTMs (a type of RNN)

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Installing Tesseract (3.0)

$ sudo apt list | grep tesseract   # check whether it is installed

$ sudo apt install tesseract-ocr
$ sudo apt install tesseract-ocr-dev
$ sudo apt install tesseract-ocr-eng   # English model
$ sudo apt install tesseract-ocr-kor   # Korean model

With the Python packages installed as well, the same thing can be written in Python.

Basic usage (3.0)

$ tesseract test.jpg outbase -l eng -psm 6

To use two languages, install the additional language pack and append it to the -l option above as -l eng+<language> (for example, -l eng+kor).
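
As a small illustration of the command line above, here is a sketch that builds the argument list from Python. The helper name `tesseract_cmd` and its `major_version` parameter are made up for this example. It assumes the 3.x CLI spells the page segmentation option `-psm`, as in the command above, while 4.0 spells it `--psm`.

```python
def tesseract_cmd(image, outbase, langs, psm=6, major_version=3):
    """Build the argv list for the tesseract CLI (hypothetical helper).

    Languages are joined with '+', e.g. ["eng", "kor"] -> "eng+kor".
    """
    psm_flag = "-psm" if major_version < 4 else "--psm"
    return ["tesseract", image, outbase,
            "-l", "+".join(langs),
            psm_flag, str(psm)]


# Prints: tesseract test.jpg outbase -l eng+kor -psm 6
print(" ".join(tesseract_cmd("test.jpg", "outbase", ["eng", "kor"])))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to actually run the OCR, provided the language packs are installed.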

Installing the related Python packages

With the base packages above installed, install the Python-related parts.

$ sudo apt install python-pip   # install pip if it is missing

$ pip install opencv-python     # OpenCV
$ pip install opencv-contrib-python

$ pip install pytesseract       # Python wrapper for Tesseract

$ sudo apt-get install python-matplotlib   # when using Python 2
$ pip install matplotlib        # matplotlib via pip
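
After installing, a quick way to confirm that the packages are visible to the interpreter is to check for their import names. The sketch below is only a convenience check written for this post. `missing_modules` is a hypothetical helper, and `cv2` is the import name of the opencv-python package.

```python
import importlib.util


def missing_modules(names=("cv2", "pytesseract", "matplotlib")):
    """Return the module names from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

Note that this only checks the Python side. pytesseract still needs the tesseract binary itself to be installed.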

OpenCV installation
  https://dejavuqa.tistory.com/228

Installing and using pytesseract
  https://pypi.org/project/pytesseract/

2. Training Tesseract and trained data

Pre-trained data is available from many places besides the one below; download it and test each set individually.

Tesseract 4.0 TrainedData

https://github.com/tesseract-ocr/tessdata

How to Train Tesseract Data

To create your own trained data, see the sites below.

  https://github.com/tesseract-ocr/tesseract/wiki/Training-Tesseract
  https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
  https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM

3. OpenCV+Tesseract Python

  https://www.pyimagesearch.com/2017/07/10/using-tesseract-ocr-python/
  https://github.com/goncalopp/simple-ocr-opencv

Trained data location

$ ls /usr/share/tesseract-ocr/tessdata/
configs           eng.cube.lm      eng.cube.size          eng.traineddata  osd.traineddata
eng.cube.bigrams  eng.cube.nn      eng.cube.word-freq     equ.traineddata  pdf.ttf
eng.cube.fold     eng.cube.params  eng.tesseract_cube.nn  kor.traineddata  tessconfigs
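
To see which language codes can be passed to -l, the directory listing above can be reduced to the names of its .traineddata files. This is a minimal sketch; `installed_languages` is a hypothetical helper, and the directory path varies between Tesseract versions and distributions.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted codes that have a <code>.traineddata file."""
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


if __name__ == "__main__":
    # Path taken from the listing above; adjust it for your installation.
    print(installed_languages("/usr/share/tesseract-ocr/tessdata/"))
```

For the listing above this would give eng, equ, kor, and osd. Of these, osd (orientation and script detection) and equ (equation detection) are helper data rather than languages.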

	import cv2
	import sys
	import pytesseract
	import numpy as np
	from matplotlib import pyplot as plt
	#import matplotlib.pyplot as plt

	#
	# /usr/share/tesseract-ocr/4.00/tessdata
	#

	def ImageToText(string,img):
	    """Run Tesseract on img and print the result under the label `string`."""
	    # psm (page segmentation mode) values:
	    # 0 = Orientation and script detection (OSD) only.
	    # 1 = Automatic page segmentation with OSD.
	    # 2 = Automatic page segmentation, but no OSD, or OCR. (not implemented)
	    # 3 = Fully automatic page segmentation, but no OSD. (Default)
	    # 4 = Assume a single column of text of variable sizes.
	    # 5 = Assume a single uniform block of vertically aligned text.
	    # 6 = Assume a single uniform block of text.
	    # 7 = Treat the image as a single text line.
	    # 8 = Treat the image as a single word.
	    # 9 = Treat the image as a single word in a circle.
	    # 10 = Treat the image as a single character.
	    # 11 = Sparse text. Find as much text as possible in no particular order.
	    # 12 = Sparse text with OSD.
	    # 13 = Raw line. Treat the image as a single text line,
	    # bypassing hacks that are Tesseract-specific.
	    config = ('-l eng+kor --psm 6')
	    text = pytesseract.image_to_string(img,config=config)
	    print("------start --------(%s)" % (string))
	    print(text)
	    print("------end ---------\n")
	    return text

	if __name__ == '__main__':

	    if len(sys.argv) < 2:
	        print('Usage : python ocr_simple.py image.jpg')
	        sys.exit(1)

	    imPath = sys.argv[1]
	    print("OpenCV Version " + cv2.__version__)

	    # Load as grayscale; imread returns None if the file cannot be read
	    img = cv2.imread(imPath,cv2.IMREAD_GRAYSCALE)
	    if img is None:
	        print('Cannot read image: ' + imPath)
	        sys.exit(1)

	    #Global Thresholding (v = 127)
	    ret,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)

	    #Adaptive Mean Thresholding
	    th2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,11,2)

	    #Adaptive Gaussian Thresholding
	    th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY,11,2)

	    #others
	    #blur = cv2.GaussianBlur(img,(5,5),0)
	    #ret3,th3 = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

	    #th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,11,2)
	    #th3 = cv2.Laplacian(th3,cv2.CV_64F)

	    titles = ['Original Image', 'Global Thresholding (v = 127)',
	              'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
	    images = [img, th1, th2, th3]

	    # OCR each variant and show it with the recognized text as the x label
	    for i in range(4):
	        data = ImageToText(titles[i],images[i])
	        plt.subplot(2,2,i+1)
	        plt.imshow(images[i],'gray')
	        plt.title(titles[i])
	        plt.xlabel(data)
	        plt.xticks([])
	        plt.yticks([])

	    plt.show()

	    # ImageToText(titles[0],img)
	    # ImageToText(titles[1],th1)
	    # ImageToText(titles[2],th2)
	    # ImageToText(titles[3],th3)


Running the Python script runs OCR on the output of each OpenCV filter and displays the results side by side for comparison.
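
To make the difference between the filters concrete, here is a pure-Python illustration of what global and adaptive mean thresholding compute on a tiny grayscale image (a list of rows). It is a teaching sketch and does not use OpenCV. The function names are made up, the edge handling assumes a replicated border, and the rounding may not match OpenCV exactly.

```python
def global_threshold(gray, thresh=127, maxval=255):
    """Pixels strictly above `thresh` become `maxval`, all others 0."""
    return [[maxval if px > thresh else 0 for px in row] for row in gray]


def adaptive_mean_threshold(gray, block=3, c=2, maxval=255):
    """Compare each pixel with (mean of its block x block neighbourhood) - c.

    Neighbours outside the image reuse the nearest edge pixel.
    """
    h, w = len(gray), len(gray[0])
    r = block // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image rows
                    xx = min(max(x + dx, 0), w - 1)  # clamp to image columns
                    total += gray[yy][xx]
            local = total / (block * block) - c
            row.append(maxval if gray[y][x] > local else 0)
        out.append(row)
    return out


# One row: a dark region and a bright region, each with a darker "stroke".
row = [[40, 20, 40, 200, 150, 200]]
print(global_threshold(row))         # a single cut-off for the whole image
print(adaptive_mean_threshold(row))  # cut-off follows local brightness
```

With a single global cut-off, the darker pixel inside the bright region is lost, while the local threshold still separates it from its neighbours. This is why unevenly lit images such as license plates often benefit from adaptive thresholding.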
