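Table of Contents

  • Method 1: Using the easyocr module
  • Method 2: Calling tesseract through pytesseract
  • Method 3: Calling the Baidu API
  • Option 1: Calling directly with urllib (just substitute your own api_key and secret_key)
  • Option 2: Calling through the HTTP-SDK module

    Method 1: Using the easyocr module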

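    easyocr is a deep learning module built on torch.
    After installing easyocr, I ran into an OpenCV version incompatibility when calling it, so I abandoned this approach.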
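
For reference, here is a minimal sketch of what an easyocr call typically looks like, assuming the package is installed and the OpenCV conflict has been resolved. `easyocr.Reader` and `readtext` are the library's entry points; the helper names `join_easyocr_text` and `recognize_with_easyocr`, the confidence threshold, and the language list are illustrative assumptions, not part of the original post.

```python
def join_easyocr_text(results, min_confidence=0.0):
    """Join the text of easyocr results.

    With default settings, readtext() yields (bounding_box, text, confidence)
    entries; entries below min_confidence are dropped.
    """
    return "".join(text for _bbox, text, conf in results if conf >= min_confidence)


def recognize_with_easyocr(image_path, languages=('ch_sim', 'en')):
    # Imported lazily so join_easyocr_text stays usable without easyocr installed.
    import easyocr

    # 'ch_sim' is Simplified Chinese; model files are downloaded on first use.
    reader = easyocr.Reader(list(languages))
    return join_easyocr_text(reader.readtext(image_path))
```

Calling `recognize_with_easyocr('test.jpg')` would return the recognized text as a single string.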

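    Method 2: Calling tesseract through pytesseract

    Pros: quick to deploy, lightweight, works offline, free
    Cons: the bundled Chinese language data has a low recognition rate, so you need to build your own dataset and train it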
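
A minimal sketch of the pytesseract route, assuming the Tesseract binary and the `chi_sim` language data are installed. `pytesseract.image_to_string` is the library's real call; the helper names and the space-stripping step are my own assumptions (Tesseract's Chinese output may contain spaces between characters, which this removes).

```python
import re

# Whitespace that sits directly between two CJK ideographs.
_CJK_GAP = re.compile(u'(?<=[\u4e00-\u9fff])[ \t]+(?=[\u4e00-\u9fff])')


def remove_cjk_spaces(text):
    """Drop spaces between Chinese characters, keeping spaces around Latin words."""
    return _CJK_GAP.sub(u'', text)


def recognize_with_tesseract(image_path, lang='chi_sim+eng'):
    # Imported lazily so remove_cjk_spaces works without these packages installed.
    import pytesseract
    from PIL import Image

    # 'chi_sim+eng' asks Tesseract to use both language packs.
    raw = pytesseract.image_to_string(Image.open(image_path), lang=lang)
    return remove_cjk_spaces(raw)
```

Calling `recognize_with_tesseract('test.jpg')` would return the cleaned-up text.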

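    Method 3: Calling the Baidu API

    Pros: easy to use, powerful
    Cons: heavy usage incurs fees
    I went with the Baidu API myself. Here are my steps:
    Register a Baidu account and create an OCR application (other tutorials cover this).
    After purchasing the service, call it from Python as follows.

    Option 1: Calling directly with urllib (just substitute your own api_key and secret_key)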

    # coding=utf-8
    
    import sys
    import json
    import base64
    
    
    # Stay compatible with both Python 2 and Python 3
    IS_PY3 = sys.version_info.major == 3
    if IS_PY3:
        from urllib.request import urlopen
        from urllib.request import Request
        from urllib.error import URLError
        from urllib.parse import urlencode
        from urllib.parse import quote_plus
    else:
        import urllib2
        from urllib import quote_plus
        from urllib2 import urlopen
        from urllib2 import Request
        from urllib2 import URLError
        from urllib import urlencode
    
    # Work around HTTPS certificate verification failures (note: this disables certificate verification)
    import ssl
    ssl._create_default_https_context = ssl._create_unverified_context
    
    # Replace these with your own API key and secret key
    API_KEY = 'YsZKG1wha34PlDOPYaIrIIKO'
    
    SECRET_KEY = 'HPRZtdOHrdnnETVsZM2Nx7vbDkMfxrkD'
    
    
    OCR_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic"
    
    
    # Token endpoint
    TOKEN_URL = 'https://aip.baidubce.com/oauth/2.0/token'
    
    
    # Fetch an access token
    def fetch_token():
        params = {'grant_type': 'client_credentials',
                  'client_id': API_KEY,
                  'client_secret': SECRET_KEY}
        post_data = urlencode(params)
        if IS_PY3:
            post_data = post_data.encode('utf-8')
        req = Request(TOKEN_URL, post_data)
        try:
            f = urlopen(req, timeout=5)
            result_str = f.read()
        except URLError as err:
            # Without a response there is nothing to parse, so stop here
            print(err)
            sys.exit(1)
        if IS_PY3:
            result_str = result_str.decode()
    
        result = json.loads(result_str)
    
        if 'access_token' in result and 'scope' in result:
            if 'brain_all_scope' not in result['scope'].split(' '):
                print('please ensure the required ability is enabled')
                sys.exit(1)
            return result['access_token']
        else:
            print('please overwrite the correct API_KEY and SECRET_KEY')
            sys.exit(1)
    
    
    # Read a file as bytes; returns None on failure
    def read_file(image_path):
        try:
            with open(image_path, 'rb') as f:
                return f.read()
        except IOError:
            print('read image file fail')
            return None
    
    
    # Call the remote service; returns the response body as a string, or None on failure
    def request(url, data):
        req = Request(url, data.encode('utf-8'))
        try:
            f = urlopen(req)
            result_str = f.read()
            if IS_PY3:
                result_str = result_str.decode()
            return result_str
        except URLError as err:
            print(err)
            return None
    
    
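    if __name__ == '__main__':
    
        # Fetch the access token
        token = fetch_token()
    
        # Build the URL for high-accuracy general text recognition
        image_url = OCR_URL + "?access_token=" + token
    
        text = ""
    
        # Read the test image
        file_content = read_file('test.jpg')
        if file_content is None:
            sys.exit(1)
    
        # Call the text recognition service
        result = request(image_url, urlencode({'image': base64.b64encode(file_content)}))
        if result is None:
            sys.exit(1)
    
        # Parse the response
        result_json = json.loads(result)
        print(result_json)
        # On failure the response carries error fields instead of "words_result"
        for words_result in result_json.get("words_result", []):
            text = text + words_result["words"]
    
        # Print the recognized text
        print(text)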
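
The script above requests a new access token on every run. Since the token response also carries an `expires_in` field (in seconds), the token could be cached between runs. The following is only a sketch under that assumption: the function names, the JSON cache file, and the 60-second safety margin are hypothetical, and `fetch_response` stands for any callable that returns the parsed token response as a dict.

```python
import json
import time


def load_cached_token(cache_path, now=None):
    """Return the cached token, or None if it is missing, unreadable, or expired."""
    now = time.time() if now is None else now
    try:
        with open(cache_path) as fp:
            cached = json.load(fp)
    except (IOError, ValueError):
        return None
    if cached.get('expires_at', 0) > now:
        return cached.get('access_token')
    return None


def save_token(cache_path, token_response, now=None, safety_margin=60):
    """Store the token with an absolute expiry time and return the token."""
    now = time.time() if now is None else now
    record = {
        'access_token': token_response['access_token'],
        # Expire slightly early so a token is never used right at its deadline.
        'expires_at': now + token_response['expires_in'] - safety_margin,
    }
    with open(cache_path, 'w') as fp:
        json.dump(record, fp)
    return record['access_token']


def get_token(cache_path, fetch_response):
    """Use the cached token when still valid; otherwise fetch and cache a new one."""
    token = load_cached_token(cache_path)
    if token is None:
        token = save_token(cache_path, fetch_response())
    return token
```

To plug this into the script, `fetch_token` would need a variant that returns the whole parsed JSON response rather than only `access_token`.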
    
    

    Option 2: Calling through the HTTP-SDK module

    from aip import AipOcr
    APP_ID = '25**9878'
    API_KEY = 'VGT8y***EBf2O8xNRxyHrPNr'
    SECRET_KEY = 'ckDyzG*****N3t0MTgvyYaKUnSl6fSw'
    
    client = AipOcr(APP_ID,API_KEY,SECRET_KEY)
    
    
    def get_file_content(filePath):
        with open(filePath, 'rb') as fp:
            return fp.read()
    
    image = get_file_content('test.jpg')
    res = client.basicGeneral(image)
    print(res)
    #res = client.basicAccurate(image)
    #print(res)
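
The SDK call returns a dict rather than plain text. A small helper like the one below could turn it into a string; it assumes the response layout used elsewhere in this post (a `words_result` list whose items have a `words` key) and that failed calls carry `error_code` and `error_msg` fields. The name `extract_text` is hypothetical.

```python
def extract_text(response, separator='\n'):
    """Join the recognized lines of an OCR response dict; raise on an API error."""
    if 'error_code' in response:
        raise RuntimeError('OCR request failed: %s %s'
                           % (response['error_code'], response.get('error_msg', '')))
    return separator.join(item['words'] for item in response.get('words_result', []))
```

With the snippet above, `print(extract_text(res))` would print one recognized line per row.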
    

    Recognizing text directly from a specified region of the screen

    from aip import AipOcr
    APP_ID = '25**9878'
    API_KEY = 'VGT8y***EBf2O8xNRxyHrPNr'
    SECRET_KEY = 'ckDyzG*****N3t0MTgvyYaKUnSl6fSw'
    
    client = AipOcr(APP_ID,API_KEY,SECRET_KEY)
    
    from io import BytesIO
    from PIL import ImageGrab
    out_buffer = BytesIO()
    img = ImageGrab.grab((100,200,300,400))
    img.save(out_buffer,format='PNG')
    res = client.basicGeneral(out_buffer.getvalue())
    print(res)
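
`ImageGrab.grab` takes its box as (left, top, right, bottom), not (x, y, width, height), which is easy to mix up. A sketch of a small conversion helper follows; `region_to_bbox` and `grab_region_png` are hypothetical names introduced here for illustration.

```python
def region_to_bbox(x, y, width, height):
    """Convert (x, y, width, height) into the (left, top, right, bottom) box Pillow uses."""
    if width <= 0 or height <= 0:
        raise ValueError('width and height must be positive')
    return (x, y, x + width, y + height)


def grab_region_png(x, y, width, height):
    # Imported lazily so region_to_bbox works without Pillow or a display.
    from io import BytesIO
    from PIL import ImageGrab

    buffer = BytesIO()
    # Capture the region and encode it as PNG bytes in memory.
    ImageGrab.grab(bbox=region_to_bbox(x, y, width, height)).save(buffer, format='PNG')
    return buffer.getvalue()
```

The region used above, `(100,200,300,400)`, corresponds to `region_to_bbox(100, 200, 200, 200)`, and the bytes from `grab_region_png` can be passed to `client.basicGeneral` in the same way.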
    

    Source: dandeseed
