Tesseract OCR: recognizing small digits

Question

Tesseract OCR fails to recognize small digits, specifically 6 and 9; the other digits are recognized correctly.

Source image:

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"

img = cv2.imread('src_path...')

# Convert to grayscale and upscale to 400% of the original size
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
scale_percent = 400  # percent of original size
width = int(img_gray.shape[1] * scale_percent / 100)
height = int(img_gray.shape[0] * scale_percent / 100)
dim = (width, height)
resized_img = cv2.resize(img_gray, dim, interpolation=cv2.INTER_AREA)

# Light smoothing before thresholding
blur_img = cv2.GaussianBlur(resized_img, (3, 3), 0)
blur_img = cv2.medianBlur(blur_img, 3)

# Binarize with Otsu's automatic threshold
thresh, new_img = cv2.threshold(blur_img, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY)

# psm 12 (sparse text with OSD), output restricted to digits
custom_config = '--psm 12 --oem 3 -c tessedit_char_whitelist=0123456789'
digits = pytesseract.image_to_string(new_img, lang='eng', config=custom_config)
print(digits)

After all the transformations, the image looks like this:

But Tesseract doesn't recognize this either. What can I do?

upd 1: As a temporary workaround, I scaled the image up 10x and applied a strong blur (medianBlur with a kernel size from 13 to 21), and Tesseract started recognizing the digits. But is there a more proper solution to my problem?

Answer 1

I used this as the source image:

Invert the colors, since Tesseract handles dark text on a light background better, and set psm to 8.

Possible psm values:

0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR. (not implemented)
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
11 = Sparse text. Find as much text as possible in no particular order.
12 = Sparse text with OSD.
13 = Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.

oem:

0 = Original Tesseract only.
1 = Neural nets LSTM only.
2 = Tesseract + LSTM.
3 = Default, based on what is available.

The mode descriptions above are quoted from the linked source.
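
As a rough sketch, the psm and oem values listed above can be assembled into the config string that pytesseract passes to Tesseract. Note that `build_config` is a hypothetical helper of mine, not part of pytesseract; it only validates the ranges from the lists above and joins the flags:

```python
def build_config(psm, oem=None, whitelist=None):
    """Build a Tesseract config string (hypothetical helper, not a pytesseract API)."""
    # psm values 0..13 and oem values 0..3 come from the lists above
    if psm not in range(14):
        raise ValueError(f"psm must be in 0..13, got {psm}")
    parts = [f"--psm {psm}"]
    if oem is not None:
        if oem not in range(4):
            raise ValueError(f"oem must be in 0..3, got {oem}")
        parts.append(f"--oem {oem}")
    if whitelist:
        # Restrict the characters Tesseract is allowed to output
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


print(build_config(8, whitelist="0123456789"))
# --psm 8 -c tessedit_char_whitelist=0123456789
```

The result can be passed as the `config` argument of `pytesseract.image_to_string`.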

import pytesseract
import cv2

img = cv2.imread('ZZt0xKV.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
scale_percent = 400  # percent of original size
width = int(img_gray.shape[1] * scale_percent / 100)
height = int(img_gray.shape[0] * scale_percent / 100)
dim = (width, height)
resized_img = cv2.resize(img_gray, dim, interpolation=cv2.INTER_AREA)

blur_img = cv2.GaussianBlur(resized_img, (3, 3), 0)
blur_img = cv2.medianBlur(blur_img, 3)

thresh, new_img = cv2.threshold(blur_img, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY)

pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"

# Invert so the digits are dark on a light background
new_img = cv2.bitwise_not(new_img)

# psm 8: treat the image as a single word; restrict output to digits
custom_config = '--psm 8 -c tessedit_char_whitelist=0123456789'
digits = pytesseract.image_to_string(new_img, lang='eng', config=custom_config)
print(digits.strip())  # 9

A useful link with tips on improving recognition quality

upd: I took the preprocessing code from the question and changed only the psm parameter.
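
Since the fix came down to the psm value, one way to find a working mode for a particular image is to try several of the single-word, single-line and single-character modes in turn. This is only a minimal sketch: `first_digit_result` and the order of the modes are my own assumptions, and the OCR call is passed in as a function so the loop can be tried without Tesseract installed:

```python
def first_digit_result(image, ocr, psm_modes=(8, 7, 10, 13)):
    """Return (psm, text) for the first mode whose output is all digits, else None.

    `ocr` is any callable taking (image, config) and returning a string.
    This is a hypothetical helper, not part of pytesseract.
    """
    for psm in psm_modes:
        config = f"--psm {psm} -c tessedit_char_whitelist=0123456789"
        text = ocr(image, config).strip()
        if text.isdigit():  # non-empty and digits only
            return psm, text
    return None


# Stand-in for the real OCR call: pretends only psm 10 yields a result
def fake_ocr(image, config):
    return "9\n" if "--psm 10 " in config else ""


print(first_digit_result(None, fake_ocr))
# (10, '9')
```

With the real library, `ocr` could be `lambda img, cfg: pytesseract.image_to_string(img, lang='eng', config=cfg)`, and `image` would be the inverted `new_img` from the code above.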
