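Save the captcha as an image with the .png extension.
Download and install Tesseract.
Configure the environment variable: in the upper list, click New. In the first field (variable name) enter TESSDATA_PREFIX. For the second field (variable value), locate tesseract.exe, right-click it, open Properties, and copy the path shown on the Security tab (object name).
Ctrl+left-click pytesseract in the editor to open its source and change the tesseract_cmd path to C:\Tesseract-OCR\tesseract.exe, written with double backslashes (C:\\Tesseract-OCR\\tesseract.exe).
Code: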
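As an alternative to editing the pytesseract source by hand, the same two settings can usually be made from the script itself. The following is only a sketch under stated assumptions: the install directory is the one used in these notes, the helper name tesseract_paths is made up for illustration, and whether TESSDATA_PREFIX should point at the tessdata folder or at its parent depends on the Tesseract version.

```python
import os
from pathlib import PureWindowsPath

# Assumed install location, taken from the path used in these notes.
INSTALL_DIR = PureWindowsPath(r"C:\Tesseract-OCR")


def tesseract_paths(install_dir=INSTALL_DIR):
    """Hypothetical helper: return (tesseract.exe path, tessdata path) as strings."""
    install_dir = PureWindowsPath(install_dir)
    return str(install_dir / "tesseract.exe"), str(install_dir / "tessdata")


exe_path, tessdata_dir = tesseract_paths()

# Set the variable for this process only, instead of in the Windows dialog.
# Assumption: this Tesseract version expects the tessdata folder itself here.
os.environ["TESSDATA_PREFIX"] = tessdata_dir

try:
    import pytesseract

    # pytesseract reads the executable location from this module-level variable.
    pytesseract.pytesseract.tesseract_cmd = exe_path
except ImportError:
    # pytesseract is not installed; the paths above are still computed.
    pass

print(exe_path)
print(tessdata_dir)
```

Setting the path this way keeps the installed library untouched, so the configuration survives a reinstall or upgrade of pytesseract.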
import pytesseract
from PIL import Image

# image = Image.open('code.png')
image = Image.open('code4.png')
# image.show()

# Tell Tesseract where its language data (tessdata) lives
tesseract_data = '--tessdata-dir "C:\\Tesseract-OCR\\tessdata"'

# Convert the color image to grayscale
image = image.convert('L')
# image.show()

# Remove the interference lines by binarizing:
# gray levels below the threshold become 0 (black), the rest 1 (white)
# threshold = 170
threshold = 125
table = []
for i in range(256):
    if i < threshold:
        table.append(0)
    else:
        table.append(1)
image = image.point(table, '1')
# image.show()

# Run OCR on the cleaned-up image
image_str = pytesseract.image_to_string(image, config=tesseract_data)
print(image_str)
Output:
E:\project\python.exe C:/Users/Administrator/Desktop/四阶xpat爬虫系列/Requests/Requests01/Requests01/day14/demo_tesseract.py
KVGi
Process finished with exit code 0
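The script tried two threshold values (170 and 125) before settling on one. To make that kind of experiment easier, the lookup table can be built by a small function. This is a minimal sketch in plain Python; the name build_threshold_table is made up for illustration and is not part of Pillow or pytesseract.

```python
def build_threshold_table(threshold):
    """Return a 256-entry lookup table: 0 for gray levels below the threshold, 1 otherwise."""
    return [0 if level < threshold else 1 for level in range(256)]


# Compare the two thresholds that appear in the script.
for threshold in (125, 170):
    table = build_threshold_table(threshold)
    black_levels = table.count(0)
    print(threshold, len(table), black_levels)

# With Pillow, a table would be applied to an image the same way as in the script:
# image = image.convert('L').point(build_threshold_table(125), '1')
```

A higher threshold turns more gray levels black, which keeps faint characters but also keeps more of the interference lines, so the best value depends on the captcha style.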