I’m struggling to understand what is meant by the following;
“file : Multipart encoded image file with filename”
I am trying to POST a PDF file, but I get back a 411 “Content-Length required” error. I am unsure what is meant by an encoded image file; I am just trying to POST a PDF file. I am also unsure what is meant by “with filename”. Should I be sending the PDF document in the body of the HTTP POST request, or is it a value in a “file” request header?
Please advise and all your help is much appreciated.
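For readers with the same question: in a multipart/form-data request the file travels in the request body as one "part", and "with filename" refers to the filename parameter on that part's Content-Disposition line. A 411 response generally means the request was sent without a Content-Length header. The following is a minimal sketch of building such a body by hand with the Python standard library; the field name "file" comes from the quoted documentation line, while the function name, the example filename and the placeholder bytes are illustrative assumptions.

```python
import uuid


def build_multipart_body(field_name, filename, file_bytes,
                         content_type="application/pdf"):
    """Return (body, headers) for a single-file multipart/form-data POST."""
    # The boundary is an arbitrary token that must not occur in the payload.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + file_bytes + tail
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        # Sending the exact byte length is what avoids a 411 response.
        "Content-Length": str(len(body)),
    }
    return body, headers


# Hypothetical usage with placeholder bytes instead of a real PDF.
body, headers = build_multipart_body("file", "example.pdf", b"%PDF-1.4 placeholder")
print(headers["Content-Type"])
```

Most HTTP clients build this body and set Content-Length automatically when given a file, so writing it by hand is usually only needed when the client offers nothing but a raw string body.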
Thanks for the swift response ulrich. This worked in Postman; however, I wonder if Postman is in fact encoding the document before posting it. Would you know?
Hi ulrich, thank you so much for your help. This did in fact work somewhat, though not completely. I had to build the entire body for the POST request myself; it’s basically just a raw string, which looks like this:
This is my result;
{"ParsedResults":[{"TextOrientation":"0","FileParseExitCode":1,"ParsedText":"","ErrorMessage":"","ErrorDetails":""}],"OCRExitCode":1,"IsErroredOnProcessing":false,"ProcessingTimeInMilliseconds":"437","SearchablePDFURL":"Searchable PDF not generated as it was not requested."}
I’m wondering whether or not the API expects a JSON formatted object (containing the PDF) in the actual body of the POST request.
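The result above reports no processing error, yet ParsedText is empty, which is easy to miss when reading the raw string. A small sketch of checking the response in Python follows; the field names are taken from the result shown in this post, and the helper name extract_text is a hypothetical one chosen for illustration.

```python
import json

# The response from this post, with straight quotes so it parses as JSON.
sample = (
    '{"ParsedResults":[{"TextOrientation":"0","FileParseExitCode":1,'
    '"ParsedText":"","ErrorMessage":"","ErrorDetails":""}],'
    '"OCRExitCode":1,"IsErroredOnProcessing":false,'
    '"ProcessingTimeInMilliseconds":"437",'
    '"SearchablePDFURL":"Searchable PDF not generated as it was not requested."}'
)


def extract_text(response_json):
    """Return the ParsedText of every result, raising if the API flagged an error."""
    data = json.loads(response_json)
    if data.get("IsErroredOnProcessing"):
        raise RuntimeError(f"OCR reported an error: {data.get('ErrorMessage')}")
    return [result.get("ParsedText", "") for result in data.get("ParsedResults", [])]


texts = extract_text(sample)
if not any(texts):
    print("No error reported, but no text was extracted.")
```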
Ok, so I’ve tried a few different things. I think everything is generally ok; however, when I POST a base64-encoded PDF, I get a response of “PDF Corrupted”. When I send the raw PDF contents, the ParsedText in the JSON is empty. I think I would need someone on the OCR.Space side to check what they’re receiving; maybe something is being lost during the HTTP communication, but I can’t see that from my side.
Ok, so I found the solution. It is actually very simple, and I can see why many people have hit a similar issue without resolving it. It would help if the OCR API docs were updated to make this clearer: what stumped me was that a different method is documented for posting a PDF document, when in fact it’s exactly the same as posting a regular base64 image. Anyway, the solution is as follows:
Instead of trying to upload an actual file (or blob, which is the file content), POST a “Data URI” instead. Note: it’s not a URL, as you’ll see below.
Convert the file into a base64 string.
Prepend the following to the base64 string:
NOTE: Upload the file and select the correct output format - Data URI.
Add this to the form body with a key; in this case the key would be “base64image”. Note that this isn’t literally an image such as a .gif, .png or .jpg; it actually means the base64 string. See screenshot below.
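The steps above can be sketched in Python roughly as follows. This is a sketch under several assumptions, not the documented client: the prefix "data:application/pdf;base64," is the standard Data URI form for a PDF and is assumed to be the prefix meant above; the endpoint URL and the "apikey" form field are assumptions to verify against the OCR.space docs; the key "base64image" is spelled as in this thread, so its exact casing should also be checked. The function names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against the OCR.space documentation.
OCR_ENDPOINT = "https://api.ocr.space/parse/image"


def to_data_uri(file_bytes, mime_type="application/pdf"):
    """Base64-encode the file and prepend the Data URI prefix."""
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_form(file_bytes, api_key, field_name="base64image"):
    """Build a URL-encoded form body holding the Data URI under the given key."""
    fields = {"apikey": api_key, field_name: to_data_uri(file_bytes)}
    return urllib.parse.urlencode(fields).encode("ascii")


def post_pdf(path, api_key):
    """Read a PDF from disk and POST it as a Data URI; returns the raw response text."""
    with open(path, "rb") as handle:
        form = build_form(handle.read(), api_key)
    # urllib sets Content-Length automatically for a bytes body.
    request = urllib.request.Request(OCR_ENDPOINT, data=form, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling post_pdf("document.pdf", "YOUR_API_KEY") would send the request; the path and key here are placeholders. Sending only the bare base64 string without the prefix would match the “PDF Corrupted” response described earlier in the thread.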
Thank you very much for this thread. I have been trying to solve this PDF document problem for the past week, and I found my answer here. This thread is full of knowledge and information. So thank you very much.
I don’t know about you, but for me the only way out of this whole situation with documents is software. Programs can help with any document. I like one program, while others prefer different ones because they are used to them or find them comfortable. I am now using a program that I recently found on the Internet, thanks to a site that provided it. I have no more questions, as the program explains almost everything to me and helps in difficult situations.