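Problem recognizing text in an image with the pytesseract Python module

0 followers

I have attached a 300 DPI image. I am using the code below to extract the text, but I get no text back. Does anyone know what the problem is?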

from PIL import Image
import pytesseract
finalImg = Image.open('withdpi.jpg')
text = pytesseract.image_to_string(finalImg)

Extract text from image

1 comment

Dag:

By the way, for some reason, when I uploaded the image here, its DPI was reduced to 96.

python

image-processing

python-tesseract

Dag

Posted on 2021-02-26

1 answer

Ahx

Posted on 2021-02-26

Accepted

0 upvotes

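Let's look at what your code is doing.

We need to see which part of the text is localized and detected.

To understand the code's behavior, we will use the image_to_data function.

image_to_data will show which part of the image is detected.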

from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to gray-scale
finalImg = Image.open('hP5Pt.jpg').convert('L')
# Draw on an RGB copy so the red outlines stay out of the OCR input
annotated = finalImg.convert('RGB')
finalImgDraw = ImageDraw.Draw(annotated)
# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)
# Number of detected regions
n_boxes = len(d['level'])
# For each detected part
for i in range(n_boxes):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # Skip degenerate regions that cannot be cropped
    if w <= 0 or h <= 0:
        continue
    # ImageDraw.rectangle expects two corner points, not (width, height)
    shape = [(x, y), (x + w, y + h)]
    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")
    # Crop the region from the untouched gray-scale image
    cropped = finalImg.crop((x, y, x + w, y + h))
    # OCR "psm 6: Assume a single uniform block of text."
    txt = pytesseract.image_to_string(cropped, config="--psm 6")
    # Result
    print(txt)
# Display the annotated image once, after all regions are drawn
annotated.show()
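Result:
So the content shown in the image itself is not detected. The code does not do its job: the output does not contain the desired text.
There can be various reasons for this.
Here are some facts about the input image:
It is a binary image.
It contains large rectangular artifacts.
The text is slightly dilated.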
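Since the text looks dilated, one thing worth trying before OCR is thinning the strokes. The following is only a minimal sketch, assuming dark text on a light background; the helper name thin_dark_text and the threshold and kernel values are my own choices for illustration, not something from the question. It binarizes the image and then applies Pillow's MaxFilter, which grows the light background and so shrinks the dark strokes.

```python
from PIL import Image, ImageFilter


def thin_dark_text(img, threshold=128, kernel=3):
    """Binarize an image and thin dark strokes (hypothetical helper).

    Assumes dark text on a light background. `kernel` must be odd.
    """
    # Work on a gray-scale copy
    gray = img.convert("L")
    # Map every pixel to pure black or pure white
    binary = gray.point(lambda p: 255 if p >= threshold else 0)
    # MaxFilter takes the brightest pixel in each window, so the white
    # background expands and dark strokes lose (kernel - 1) / 2 pixels per side
    return binary.filter(ImageFilter.MaxFilter(kernel))
```

The result can be passed to pytesseract.image_to_string in place of the original image. Whether this actually helps depends on the image, so it is worth comparing the output with and without it.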
Cropping the image to the part that contains the desired text before running OCR:
from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to gray-scale
finalImg = Image.open('hP5Pt.jpg').convert('L')
# Get width and height of the image
img_w, img_h = finalImg.size
# Keep only the horizontal band that contains the desired text
finalImg = finalImg.crop((0, img_h // 6, img_w, img_h // 4))
# Draw on an RGB copy so the red outlines stay out of the OCR input
annotated = finalImg.convert('RGB')
finalImgDraw = ImageDraw.Draw(annotated)
# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)
# Number of detected regions
n_boxes = len(d['level'])
# For each detected part
for i in range(n_boxes):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # Skip degenerate regions that cannot be cropped
    if w <= 0 or h <= 0:
        continue
    # ImageDraw.rectangle expects two corner points, not (width, height)
    shape = [(x, y), (x + w, y + h)]
    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")
    # Crop the region from the untouched gray-scale image
    cropped = finalImg.crop((x, y, x + w, y + h))
    # OCR "psm 6: Assume a single uniform block of text."
    txt = pytesseract.image_to_string(cropped, config="--psm 6")
    # Result
    print(txt)
# Display the annotated image once, after all regions are drawn
annotated.show()
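
The loop above runs OCR on every region image_to_data reports, including page, block and line level entries. A shorter way to see what Tesseract actually recognized is to read the words and confidences straight from the returned dictionary. This is a rough sketch: the helper names confident_words and ocr_confident_words and the cut-off of 60 are assumptions for illustration, so adjust them to your image.

```python
def confident_words(data, min_conf=60.0):
    """Return (text, (left, top, right, bottom)) for entries at or above min_conf.

    `data` is the dict returned by pytesseract.image_to_data with
    output_type=pytesseract.Output.DICT.
    """
    words = []
    for i, text in enumerate(data["text"]):
        # conf can be an int, float, or str depending on the pytesseract
        # version; non-word entries carry a confidence of -1
        conf = float(data["conf"][i])
        if conf < min_conf or not text.strip():
            continue
        left, top = data["left"][i], data["top"][i]
        box = (left, top, left + data["width"][i], top + data["height"][i])
        words.append((text, box))
    return words


def ocr_confident_words(path, min_conf=60.0):
    """Run image_to_data on the file at `path` and keep confident words only."""
    # Imported here so confident_words stays usable without Tesseract installed
    from PIL import Image
    import pytesseract

    img = Image.open(path).convert("L")
    data = pytesseract.image_to_data(
        img, config="--psm 6", output_type=pytesseract.Output.DICT
    )
    return confident_words(data, min_conf)
```

Calling ocr_confident_words('hP5Pt.jpg') returns a list of words with their boxes. An empty list would suggest that Tesseract found no usable text in the image as it stands.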