8 OCR Libraries for Extracting Text from Images in Python

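In Python, extracting text from images usually relies on optical character recognition (OCR). Several libraries and tools implement it, each with its own strengths and typical use cases. Below are eight commonly used options, each with a brief description and a code example.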

1. Tesseract

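Tesseract is a popular open-source OCR engine. It was originally developed at HP and later developed further under Google's sponsorship. It supports more than 100 languages. From Python it is usually called through the pytesseract wrapper, which requires the Tesseract binary and the relevant language data (here chi_sim, Simplified Chinese) to be installed separately.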

import pytesseract
from PIL import Image

# Load the image
img = Image.open('example.png')
# Extract text with Tesseract (chi_sim = Simplified Chinese)
text = pytesseract.image_to_string(img, lang='chi_sim')
print(text)
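
Tesseract output for Chinese text often contains stray spaces between characters. The following is a minimal post-processing sketch, assuming that is the case for your images; the helper name `strip_cjk_spaces` and the chosen character range (CJK Unified Ideographs) are illustrative assumptions, not part of pytesseract.

```python
import re

# Spaces or tabs that sit between two CJK Unified Ideographs (U+4E00-U+9FFF).
# Newlines are deliberately not matched, so line breaks are preserved.
_CJK_GAP = re.compile(r'(?<=[\u4e00-\u9fff])[ \t]+(?=[\u4e00-\u9fff])')


def strip_cjk_spaces(text: str) -> str:
    """Remove spaces between adjacent Chinese characters; keep everything else."""
    return _CJK_GAP.sub('', text)
```

It could be applied directly to the OCR output, for example `strip_cjk_spaces(pytesseract.image_to_string(img, lang='chi_sim'))`.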

2. EasyOCR

EasyOCR is an easy-to-use OCR library built on PyTorch. It supports more than 80 languages and handles Chinese text reasonably well.

import easyocr

reader = easyocr.Reader(['ch_sim', 'en'])  # Simplified Chinese and English
result = reader.readtext('example.png')

# Each item is (bounding box, recognized text, confidence)
for (bbox, text, prob) in result:
    print(f"Text: {text}, confidence: {prob:.2f}")
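
Because each detection carries a confidence score, low-confidence results can be dropped before further processing. This is a small sketch of that idea; the function name `confident_text` and the 0.5 default threshold are assumptions chosen for illustration, and a suitable threshold depends on your images.

```python
from typing import List, Sequence, Tuple

# One detection in the shape shown above: (bounding box, text, confidence)
Detection = Tuple[Sequence, str, float]


def confident_text(result: List[Detection], min_prob: float = 0.5) -> List[str]:
    """Return the recognized strings whose confidence is at least min_prob."""
    return [text for _bbox, text, prob in result if prob >= min_prob]
```

For instance, `confident_text(reader.readtext('example.png'), min_prob=0.6)` would keep only the more reliable strings.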

3. OCR.space

OCR.space is an online OCR service that you call through an HTTP API. You need to register to obtain an API key.

import requests

API_KEY = 'your_api_key'  # Replace with your API key
url = 'https://api.ocr.space/parse/image'

# Upload the image as multipart form data
with open('example.png', 'rb') as f:
    r = requests.post(url,
                      files={'file': f},
                      data={'apikey': API_KEY},
                      timeout=60)

r.raise_for_status()  # Fail early on HTTP errors
result = r.json()
print(result['ParsedResults'][0]['ParsedText'])
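
Indexing straight into `ParsedResults` fails if the service reports an error or returns no pages. Below is a defensive sketch for reading the JSON; it assumes the response carries `IsErroredOnProcessing` and `ErrorMessage` fields alongside `ParsedResults`, and the helper name `parsed_text` is hypothetical. Check the response format in the OCR.space documentation before relying on it.

```python
def parsed_text(result: dict) -> str:
    """Join the text of all parsed pages, or raise if the API reported an error."""
    # Assumed error fields; verify them against the OCR.space API docs.
    if result.get('IsErroredOnProcessing'):
        raise RuntimeError(f"OCR.space error: {result.get('ErrorMessage')}")
    pages = result.get('ParsedResults') or []
    return '\n'.join(page.get('ParsedText', '') for page in pages)
```

With this helper, the final line of the example could become `print(parsed_text(r.json()))`.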

4. Keras-OCR

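Keras-OCR is a Keras-based OCR package that uses deep learning for both text detection (CRAFT) and recognition (CRNN). Its pretrained recognizer covers Latin letters and digits, so it is best suited to English text.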

import keras_ocr

# Build the detection + recognition pipeline
pipeline = keras_ocr.pipeline.Pipeline()
# Load the image
images = [keras_ocr.tools.read('example.png')]
# Run OCR; returns one list of (text, box) pairs per image
prediction_groups = pipeline.recognize(images)

for predictions in prediction_groups:
    for text, box in predictions:
        print(f"Text: {text}")
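
The pipeline returns individual words with their boxes, not necessarily in reading order. The sketch below shows one simple way to group them into lines, assuming each box is a sequence of four (x, y) corner points and that the text is roughly horizontal. The function name `words_to_lines` and the `tolerance` heuristic are illustrative assumptions, not part of keras-ocr.

```python
def words_to_lines(predictions, tolerance=0.5):
    """Group (text, box) pairs into text lines, top to bottom, left to right."""
    words = []
    for text, box in predictions:
        xs = [float(p[0]) for p in box]
        ys = [float(p[1]) for p in box]
        # (vertical center, left edge, box height, text)
        words.append((sum(ys) / len(ys), min(xs), max(ys) - min(ys), text))
    words.sort(key=lambda w: w[0])  # top to bottom by vertical center

    lines = []  # each entry: [reference_y, [(x, text), ...]]
    for cy, x, height, text in words:
        # Same line if the center is within tolerance * height of the line's first word
        if lines and abs(cy - lines[-1][0]) <= tolerance * height:
            lines[-1][1].append((x, text))
        else:
            lines.append([cy, [(x, text)]])
    # Order the words of each line from left to right
    return [' '.join(t for _x, t in sorted(items)) for _y, items in lines]
```

Skewed or multi-column layouts would need a more careful approach than this vertical-center heuristic.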

5. PaddleOCR

PaddleOCR is Baidu's OCR toolkit, built on the PaddlePaddle framework. It supports many languages and scenarios and performs well.

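# Install first: pip install paddlepaddle paddleocr
# Written against the PaddleOCR 2.x API; newer releases may differ
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='ch')  # 'ch' covers Chinese and English
# Extract text
result = ocr.ocr('example.png')

# result[0] may be None when no text is detected
for line in result[0] or []:
    print(f"Text: {line[1][0]}")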

6. PyOCR

PyOCR is a Python wrapper for OCR engines. It supports Tesseract and Cuneiform.

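import sys
from PIL import Image
import pyocr
import pyocr.builders

# Use the first OCR engine PyOCR can find (e.g. Tesseract)
tools = pyocr.get_available_tools()
if not tools:
    sys.exit("No OCR tool found; install Tesseract or Cuneiform first")
tool = tools[0]
img = Image.open('example.png')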

text = tool.image_to_string(
    img,
    lang='chi_sim',
    builder=pyocr.builders.TextBuilder()
)

print(text)

7. OCRmyPDF

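OCRmyPDF is a command-line tool that adds an OCR text layer to PDF files, using Tesseract as its OCR engine. The command below runs OCR in Simplified Chinese (-l chi_sim) and corrects page orientation (--rotate-pages).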

ocrmypdf --rotate-pages -l chi_sim input.pdf output.pdf
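
Since the other examples are Python scripts, the same command can be driven from Python with `subprocess`. This is a minimal sketch that uses only the two flags from the command above; the helper names `build_ocrmypdf_cmd` and `run_ocrmypdf` are hypothetical, and it assumes the `ocrmypdf` executable is on your PATH.

```python
import subprocess
from typing import List


def build_ocrmypdf_cmd(src: str, dst: str, lang: str = 'chi_sim',
                       rotate_pages: bool = True) -> List[str]:
    """Build the argument list for the ocrmypdf command shown above."""
    cmd = ['ocrmypdf']
    if rotate_pages:
        cmd.append('--rotate-pages')
    cmd += ['-l', lang, src, dst]
    return cmd


def run_ocrmypdf(src: str, dst: str, **options) -> None:
    """Run ocrmypdf; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_ocrmypdf_cmd(src, dst, **options), check=True)
```

Calling `run_ocrmypdf('input.pdf', 'output.pdf')` would then be equivalent to the shell command. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.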

8. Textract

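Textract (the textract Python package, not the AWS Textract service) extracts text from many document types, including PDF and DOCX. It returns the extracted text as bytes, which the example decodes as UTF-8.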

import textract

text = textract.process('example.pdf', encoding='utf-8')
print(text.decode('utf-8'))

Summary

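Each of these eight tools has its own strengths, so choose based on your requirements. For simple text extraction from images, Tesseract or EasyOCR is a reasonable starting point. For complex documents or multilingual text, PaddleOCR may be a better fit, with Keras-OCR as a TensorFlow-based option for English text. Hopefully these examples help you get started with OCR in Python.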

Published: 2024-09-25 09:43:45
Original link: http://makehui.com/houduan/1285.html
