OCRSpace python example code problem

raywat · September 10, 2023, 2:11am

Hi there,

I tried your example Python code, but the result is not the same as when I try on https://ocr.space
On https://ocr.space your result is perfect for me.

How can I solve this problem?

On the website I use these options:

Detect orientation and auto-rotate image if needed
Do receipt scanning and/or table recognition
Auto-enlarge content (recommended for low DPI)
Create Searchable PDF : Just extract text and show overlay (fastest option)
Select OCR Engine to use (View current load) : Use OCR Engine2 (Good OCR also for numbers and special characters like %@$ )

With these options the OCR’ed Result Text is perfect for me. I am running the code in a Jupyter notebook on my PC. How can I get the same result there?

Here is my code:

import requests


def ocr_space_file(filename, overlay=False, api_key='helloworld', language='eng'):
    """ OCR.space API request with local file.
        Python3.5 - not tested on 2.7
    :param filename: Your file path & name.
    :param overlay: Is OCR.space overlay required in your response.
                    Defaults to False.
    :param api_key: OCR.space API key.
                    Defaults to 'helloworld'.
    :param language: Language code to be used in OCR.
                    List of available language codes can be found on https://ocr.space/OCRAPI
                    Defaults to 'eng'.
    :return: Result in JSON format.
    """

    payload = {'isOverlayRequired': overlay,
               'apikey': api_key,
               'language': language,
               }
    with open(filename, 'rb') as f:
        r = requests.post('https://api.ocr.space/parse/image',
                          files={filename: f},
                          data=payload,
                          )
    return r.content.decode()


def ocr_space_url(url, overlay=False, api_key='helloworld', language='eng'):
    """ OCR.space API request with remote file.
        Python3.5 - not tested on 2.7
    :param url: Image url.
    :param overlay: Is OCR.space overlay required in your response.
                    Defaults to False.
    :param api_key: OCR.space API key.
                    Defaults to 'helloworld'.
    :param language: Language code to be used in OCR.
                    List of available language codes can be found on https://ocr.space/OCRAPI
                    Defaults to 'eng'.
    :return: Result in JSON format.
    """

    payload = {'url': url,
               'isOverlayRequired': overlay,
               'apikey': api_key,
               'language': language,
               }
    r = requests.post('https://api.ocr.space/parse/image',
                      data=payload,
                      )
    return r.content.decode()


# Use examples:
test_file = ocr_space_file(filename='example_image.png', language='pol')
test_url = ocr_space_url(url='http://i.imgur.com/31d5L5y.jpg')

Here is my jpg file

Thanks

ocr-api-team · September 14, 2023, 7:18pm

Hi, are you actually using OCR Engine2? At first glance, the “ocrengine=2” parameter seems to be missing from your code.

raywat · September 15, 2023, 5:28am

Hi,

The problem is solved now. I added the missing options to the payload:

    payload = {'isOverlayRequired': overlay,
               'apikey': api_key,
               'detectOrientation': True,
               'scale': True,
               'isTable': True,
               'OCREngine': 2,
               'language': language,
               }

Thanks
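
For readers who land here with the same mismatch, the fix above can be folded back into the original helper roughly as follows. This is a minimal sketch, not official example code: the helper names `build_payload` and `extract_text` are made up for illustration, the lowercase 'true'/'false' strings are an assumption about the form the API expects (the thread's payload passes Python booleans, which worked for the poster), and the response fields `ParsedResults`, `ParsedText`, `IsErroredOnProcessing` and `ErrorMessage` are assumed from the API's JSON output.

```python
import json

API_URL = 'https://api.ocr.space/parse/image'


def build_payload(api_key='helloworld', language='eng', overlay=False,
                  engine=2, detect_orientation=True, scale=True,
                  is_table=True):
    """Build form fields that mirror the website options from the question."""
    def flag(value):
        # Assumption: booleans are sent as lowercase strings.
        return 'true' if value else 'false'

    return {
        'apikey': api_key,
        'language': language,
        'isOverlayRequired': flag(overlay),
        'detectOrientation': flag(detect_orientation),  # auto-rotate
        'scale': flag(scale),                           # auto-enlarge content
        'isTable': flag(is_table),                      # receipt/table mode
        'OCREngine': str(engine),                       # the missing parameter
    }


def extract_text(response_text):
    """Return the recognized text from a JSON response, or raise on error."""
    result = json.loads(response_text)
    if result.get('IsErroredOnProcessing'):
        raise RuntimeError(result.get('ErrorMessage'))
    pages = result.get('ParsedResults') or []
    return '\n'.join(page.get('ParsedText', '') for page in pages)


def ocr_space_file(filename, **options):
    """Upload a local file and return only the recognized text."""
    # Imported here so the two helpers above work without requests installed.
    import requests

    with open(filename, 'rb') as f:
        # Same upload form as the code in the question.
        r = requests.post(API_URL, files={filename: f},
                          data=build_payload(**options), timeout=60)
    r.raise_for_status()
    return extract_text(r.text)
```

Calling `ocr_space_file('example_image.png', api_key='your-key')` would then send the same options as the website form listed in the question. Keeping the payload in its own function makes it easy to print and compare against the website settings when results differ.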