Problem recognizing text in an image with the pytesseract Python module


I have attached a 300 DPI image. I am using the code below to extract the text, but I get no text back. Does anyone know what the problem is?

    from PIL import Image
    import pytesseract

    finalImg = Image.open('withdpi.jpg')
    text = pytesseract.image_to_string(finalImg)

Extract text from image

1 comment
Dag
By the way, for some reason, when I uploaded the image here, its DPI was reduced to 96.
python
image-processing
python-tesseract
Dag
Posted on 2021-02-26
1 answer
Ahx
Posted on 2021-02-26
Accepted

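Let's observe what your code is doing.

  • We need to see which part of the text is localized and detected.

  • To understand the code's behavior, we will use the image_to_data function.

  • image_to_data will show which part of the image is detected.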

    from PIL import Image, ImageDraw
    import pytesseract

    # Open the image and convert it to gray-scale
    finalImg = Image.open('hP5Pt.jpg').convert('L')
    # Draw on a copy so the rectangles do not end up in the image passed to OCR
    preview = finalImg.copy()
    finalImgDraw = ImageDraw.Draw(preview)
    # OCR detection
    d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)
    # Number of detected regions
    n_boxes = len(d['level'])
    # For each detected region
    for i in range(n_boxes):
        # Get the localized region
        (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
        # rectangle() expects two corners, so turn width/height into the bottom-right corner
        shape = [(x, y), (x + w, y + h)]
        # Draw the region
        finalImgDraw.rectangle(shape, outline="red")
    # Display the image once, after all regions are drawn
    preview.show()
    # OCR "psm 6: Assume a single uniform block of text."
    txt = pytesseract.image_to_string(finalImg, config="--psm 6")
    # Result
    print(txt)
    

    Result:

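  • So the text shown in the image is not detected. The code does not work, and the output does not contain the desired result.

  • There may be various reasons.

  • Here are some facts about the input image:

  • It is a binary image.

  • It contains a large rectangular artifact.

  • The text is slightly dilated.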
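The "binary image" observation can be checked numerically rather than by eye. The following is only a minimal sketch: the helper names `extreme_fraction` and `looks_binary` and the thresholds (32, 223, 0.95) are assumptions chosen for illustration, not part of pytesseract or Pillow. It works on a 256-bin gray-scale histogram, such as the list returned by Pillow's `Image.histogram()` for a mode 'L' image.

```python
def extreme_fraction(histogram, low=32, high=223):
    """Fraction of pixels that are near-black (<= low) or near-white (>= high)."""
    if len(histogram) != 256:
        raise ValueError("expected a 256-bin histogram of an 8-bit gray-scale image")
    total = sum(histogram)
    if total == 0:
        return 0.0
    # Count the pixels in the darkest and the brightest bins
    extreme = sum(histogram[: low + 1]) + sum(histogram[high:])
    return extreme / total


def looks_binary(histogram, min_fraction=0.95):
    """Heuristic: treat the image as binary if almost all pixels are near 0 or 255."""
    return extreme_fraction(histogram) >= min_fraction
```

With Pillow installed, this could be called as looks_binary(Image.open('hP5Pt.jpg').convert('L').histogram()). JPEG compression adds some gray noise around edges, which is why the sketch tolerates values near, not exactly at, 0 and 255.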

    from PIL import Image, ImageDraw
    import pytesseract

    # Crop the image to the band that contains the desired text before running OCR
    # Open the image and convert it to gray-scale
    finalImg = Image.open('hP5Pt.jpg').convert('L')
    # Get width and height of the image
    img_w, img_h = finalImg.size
    # Keep only the part that contains the desired text
    finalImg = finalImg.crop((0, int(img_h / 6), img_w, int(img_h / 4)))
    # Draw on a copy so the rectangles do not end up in the image passed to OCR
    preview = finalImg.copy()
    finalImgDraw = ImageDraw.Draw(preview)
    # OCR detection
    d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)
    # Number of detected regions
    n_boxes = len(d['level'])
    # For each detected region
    for i in range(n_boxes):
        # Get the localized region
        (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
        # rectangle() expects two corners, so turn width/height into the bottom-right corner
        shape = [(x, y), (x + w, y + h)]
        # Draw the region
        finalImgDraw.rectangle(shape, outline="red")
    # Display the image once, after all regions are drawn
    preview.show()
    # OCR "psm 6: Assume a single uniform block of text."
    txt = pytesseract.image_to_string(finalImg, config="--psm 6")
    # Result
    print(txt)
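When inspecting the image_to_data output, it can help to keep only the entries that contain recognized text and to convert each (left, top, width, height) record into the two corners that ImageDraw's rectangle() expects. The sketch below is one possible way to do that. The helper name `word_boxes` and its `min_conf` and `offset` parameters are hypothetical, introduced here for illustration. It assumes the dictionary layout produced with output_type=pytesseract.Output.DICT, where 'conf' may arrive as a number or a string depending on the pytesseract version.

```python
def word_boxes(data, min_conf=0.0, offset=(0, 0)):
    """Return (text, (x0, y0, x1, y1)) for each detected word.

    data     -- dict with 'text', 'conf', 'left', 'top', 'width', 'height' lists
    min_conf -- drop entries whose confidence is below this value
    offset   -- (dx, dy) added to every box, e.g. the top-left corner of a crop,
                so that boxes can be drawn on the uncropped image
    """
    dx, dy = offset
    boxes = []
    for i, text in enumerate(data["text"]):
        # conf can be an int, a float or a string such as '-1'
        conf = float(data["conf"][i])
        # Structural entries (page, block, line) carry empty text and conf -1
        if not text.strip() or conf < min_conf:
            continue
        x0 = data["left"][i] + dx
        y0 = data["top"][i] + dy
        x1 = x0 + data["width"][i]
        y1 = y0 + data["height"][i]
        boxes.append((text, (x0, y0, x1, y1)))
    return boxes
```

For the cropped example, word_boxes(d, min_conf=50, offset=(0, int(img_h / 6))) would give boxes in the coordinates of the full image, since the crop starts at that vertical position. The confidence threshold of 50 is an arbitrary starting point and would need tuning for a real image.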
    