AWS S3 file url returns "Unable to recognize the file type"

Question from a user: When I use an AWS S3 link like this:

https://s3.us-west-2.amazonaws.com/rpa.upload/files/23225/16789892345.jpg

…I get an “Unable to recognize the file type” error.

{
    "OCRExitCode": 99,
    "IsErroredOnProcessing": true,
    "ErrorMessage": [
        "Unable to recognize the file type",
        "Unable to detect the file extension, or the file extension is incorrect, and no 'file type' provided in request. Please provide a file with a proper content type or extension, or provide a file type in the request to manually set the file extension."
    ],
    "ProcessingTimeInMilliseconds": "406"
}

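Solution: AWS does not send the correct file type info for this file, so the OCR API cannot tell what kind of file it is. You can either configure the content type (for example image/jpeg) on the file inside the AWS S3 bucket, or simply tell the OCR API what file type it is: add filetype=jpg to your API call.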

Also, for this image, use OCR Engine2 instead of Engine1. The result is much better.
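
For illustration, here is a minimal sketch of an API call with the file type set explicitly. The thread does not name the endpoint or the other parameters, so the endpoint URL, the apikey and url parameter names, the OCREngine parameter name, and the YOUR_API_KEY placeholder are all assumptions (modeled on the OCR.space GET endpoint); only filetype=jpg and the choice of Engine2 come from the answer above. Check the names against your API documentation before using this.

```python
from urllib.parse import urlencode

# Assumption: OCR.space-style GET endpoint; the thread does not name it.
OCR_ENDPOINT = "https://api.ocr.space/parse/imageurl"


def build_ocr_url(image_url, api_key, filetype="jpg", engine=2):
    """Build a GET request URL that states the file type explicitly."""
    params = {
        "apikey": api_key,      # assumed parameter name
        "url": image_url,       # assumed parameter name
        "filetype": filetype,   # from the answer: tells the API the file type
        "OCREngine": engine,    # assumed parameter name for selecting Engine2
    }
    # urlencode percent-encodes the S3 link so it is safe inside a query string
    return f"{OCR_ENDPOINT}?{urlencode(params)}"


request_url = build_ocr_url(
    "https://s3.us-west-2.amazonaws.com/rpa.upload/files/23225/16789892345.jpg",
    api_key="YOUR_API_KEY",  # placeholder, replace with your own key
)
print(request_url)
```

Open the printed URL with any HTTP client to send the request. The point is only that filetype=jpg travels with the request, so the API no longer depends on the content type that S3 reports.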
