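Let's look at what your code is actually doing.
We need to see which part of the text is being localized and detected.
To understand the code's behavior, we will use the image_to_data function, which reports which parts of the image were detected.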
# Imports needed by this snippet
from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to grayscale
finalImg = Image.open('hP5Pt.jpg').convert('L')
# Keep an unmarked copy for the final OCR call, since rectangles are drawn
# on finalImg below (nothing is cropped yet; the name matches that call)
cropped = finalImg.copy()
# Initialize ImageDraw for drawing the detected rectangles on the image
finalImgDraw = ImageDraw.Draw(finalImg)
# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)
# Number of detected regions
n_boxes = len(d['level'])
# For each detected region
for i in range(n_boxes):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # rectangle() expects two corners, so the second point is (x + w, y + h)
    shape = [(x, y), (x + w, y + h)]
    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")
# Display
finalImg.show()
# OCR "psm 6: Assume a single uniform block of text."
txt = pytesseract.image_to_string(cropped, config="--psm 6")
# Result
print(txt)
Result:
Nothing that the image actually shows is detected. The code does not work as intended, and the output does not contain the desired text.
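To see exactly which entries the detection returned, it can help to list only the entries that carry text. The following is a minimal sketch, assuming the dictionary layout that image_to_data returns with Output.DICT (keys 'text', 'conf', 'left', 'top', 'width', 'height'). The helper name detected_words, the min_conf parameter, and the sample values are hypothetical and only for illustration.

```python
def detected_words(d, min_conf=0.0):
    """Return (text, conf, box) for each non-empty entry of an image_to_data dict."""
    words = []
    for i, text in enumerate(d['text']):
        # conf may arrive as str, int or float depending on the pytesseract version
        conf = float(d['conf'][i])
        if text.strip() and conf >= min_conf:
            x, y, w, h = d['left'][i], d['top'][i], d['width'][i], d['height'][i]
            # Pillow-style box: (left, top, right, bottom)
            words.append((text, conf, (x, y, x + w, y + h)))
    return words


# Hand-made sample in the shape of Output.DICT (values are illustrative only)
sample = {
    'text': ['', ' ', 'Hello'],
    'conf': ['-1', '-1', '91.5'],
    'left': [0, 0, 10],
    'top': [0, 0, 20],
    'width': [100, 100, 40],
    'height': [50, 50, 12],
}
print(detected_words(sample))  # [('Hello', 91.5, (10, 20, 50, 32))]
```

With the real image you would pass the dictionary d from the code above. An empty list would confirm that no text was recognized, even if layout boxes were drawn.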
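There could be several reasons.
Here are some facts about the input image:
It is a binary image.
It contains a large rectangular artifact.
The text is slightly dilated.
One option is to run OCR only on the part of the image that contains the desired text: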
# Imports needed by this snippet
from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to grayscale
finalImg = Image.open('hP5Pt.jpg').convert('L')
# Get the width and height of the image
w, h = finalImg.size
# Crop the band that contains the desired text
finalImg = finalImg.crop((0, int(h/6), w, int(h/4)))
# Keep an unmarked copy of the crop for the final OCR call,
# since rectangles are drawn on finalImg below
cropped = finalImg.copy()
# Initialize ImageDraw for drawing the detected rectangles on the image
finalImgDraw = ImageDraw.Draw(finalImg)
# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)
# Number of detected regions
n_boxes = len(d['level'])
# For each detected region
for i in range(n_boxes):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # rectangle() expects two corners, so the second point is (x + w, y + h)
    shape = [(x, y), (x + w, y + h)]
    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")
# Display
finalImg.show()
# OCR "psm 6: Assume a single uniform block of text."
txt = pytesseract.image_to_string(cropped, config="--psm 6")
# Result
print(txt)
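The crop box passed to crop() is (left, upper, right, lower), so the call above keeps the full width and only the rows between h/6 and h/4. Here is a small worked sketch of that arithmetic. The helper name band_box, its divisor parameters, and the 600 x 1200 size are hypothetical; the real size of hP5Pt.jpg is not stated here.

```python
def band_box(width, height, top_div=6, bottom_div=4):
    """Full-width crop box (left, upper, right, lower) between height/top_div and height/bottom_div."""
    upper = int(height / top_div)
    lower = int(height / bottom_div)
    if upper >= lower:
        raise ValueError("band is empty: upper edge must lie above lower edge")
    return (0, upper, width, lower)


# Illustrative size only
box = band_box(600, 1200)
print(box)              # (0, 200, 600, 300)
print(box[3] - box[1])  # 100 rows are kept
```

The resulting tuple could be passed straight to finalImg.crop(). The fractions themselves depend on where the text sits in the image, so they would need adjusting for a different input.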