The output from the two OCR engines is different

I am using your website, https://ocr.space/, and OCR Engine 1 gives me a different result than OCR Engine 2. I am not talking about misread letters, words or numbers here. The issue is the order of the results. With Engine 1 I get the results in the proper order: 1st result for the 1st text, 2nd for the 2nd text and 3rd for the 3rd text. With Engine 2 the order is wrong: 1st result for the 1st text, 2nd for the 3rd text and 3rd for the 2nd text. Why is this happening? I checked the JSON output too, and it is the same. I also tried table recognition and scaling, but the results from OCR Engine 2 stay the same. The results from both engines are as follows:
OCR Engine 1 Results:
Waue City Mayor (here “Waue” is actually “Wave”, which is fine for now, but please fix it if possible)
Veronica Boroswick
Enmu

OCR Engine 2 Results:
Waue City Mayor (here “Waue” is actually “Wave”, which is fine for now, but please fix it if possible)
Enmu
Veronica Boroswick

Hi, did you use the isTable = true switch? That should fix it.

https://ocr.space/tablerecognition

In the OCR API, the isTable = true switch triggers the table scanning logic. More details are available in the table OCR flag section of the OCR API documentation.
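
For an API call from JavaScript, a minimal sketch of setting the switch might look like the following. The endpoint and the field names (apikey, url, OCREngine, isTable, scale, isOverlayRequired) are assumptions based on the public OCR API documentation, and buildOcrForm and runOcr are hypothetical helper names, so please check them against the documentation before relying on this.

```javascript
// Sketch only: endpoint and field names are assumed from the public OCR API
// documentation; buildOcrForm and runOcr are hypothetical helper names.
function buildOcrForm(imageUrl, apiKey) {
  const form = new FormData();
  form.append("apikey", apiKey);            // your own API key
  form.append("url", imageUrl);             // image to scan
  form.append("OCREngine", "2");            // select OCR Engine 2
  form.append("isTable", "true");           // line-by-line table sorting of ParsedText
  form.append("scale", "true");             // the website uses scale=true
  form.append("isOverlayRequired", "true"); // ask for word coordinates in the JSON
  return form;
}

// Sends the form to the API and returns the parsed JSON response.
async function runOcr(imageUrl, apiKey) {
  const response = await fetch("https://api.ocr.space/parse/image", {
    method: "POST",
    body: buildOcrForm(imageUrl, apiKey),
  });
  return response.json();
}
```

Calling runOcr("https://example.com/image.png", "YOUR_API_KEY") would then resolve to the JSON response; both arguments are placeholders.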

I tried it. The result on the website was correct, but when I try it with the API (JavaScript) the result remains the same. I don’t know why. The ParsedText changes when I turn on the isTable feature, but the data in the JSON doesn’t change at all.

The website uses scale=true - did you try this?

I have set the options isTable=true and scale=true. The ParsedText result is correct, but the result in JSON format remains the same as before.

So you mean the result on the OCR website is ok, but not in your OCR API call?

No, the result on the OCR website is also wrong. Let me give you an example, based on the one from my first message.
If I turn on scale=true and isTable=true, the result in TEXT is:
Waue City Mayor
Veronica Boroswick
Enmu

And the result in JSON remains the same:
Waue City Mayor
Enmu
Veronica Boroswick

So I am saying that the result in JSON format doesn’t change. It’s a bug if I am not wrong.

Ah, now I understand the issue. But this is not a bug:

(1) In the JSON (“Overlay” section): each word includes its x/y bounding box coordinates, so you can sort the words yourself in any way you want.

(2) In the text, with isTable=true, the text is sorted line by line. Of course, the text is also part of the JSON OCR API response: it is the “ParsedText” section. Example:

"ParsedText": ". 55\n18:49:42\nQUITE AN EXPERIENCE\nTO LIVE IN FEAR, ISNT\nI1? THATS WHAT IT IS\nTO BE A SLAVE\n. 20",

So what isTable=true does is trigger a line-by-line table sorting of the OCR ParsedText output.
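
If you prefer to sort the overlay yourself as in (1), a rough sketch could look like this. It assumes the overlay section holds a Lines array, that each line has a MinTop value and a Words array, and that each word has WordText and Left fields. Please verify these names against your own response. sortOverlayLines is a hypothetical helper, and the coordinates in the sample are invented for illustration.

```javascript
// Sketch only: field names (Lines, Words, WordText, Left, MinTop) are assumed
// to match the overlay section of your own API response.
function sortOverlayLines(textOverlay) {
  const lines = (textOverlay.Lines || []).map((line) => {
    // Order the words of one line from left to right.
    const words = [...(line.Words || [])].sort((a, b) => a.Left - b.Left);
    return {
      text: words.map((w) => w.WordText).join(" "),
      top: line.MinTop,
      left: words.length > 0 ? words[0].Left : 0,
    };
  });
  // Order the lines top to bottom; break ties left to right.
  lines.sort((a, b) => a.top - b.top || a.left - b.left);
  return lines.map((line) => line.text);
}

// Invented coordinates, listed in the order Engine 2 returned the lines.
const sampleOverlay = {
  Lines: [
    {
      MinTop: 10,
      Words: [
        { WordText: "Waue", Left: 20 },
        { WordText: "City", Left: 78 },
        { WordText: "Mayor", Left: 120 },
      ],
    },
    { MinTop: 90, Words: [{ WordText: "Enmu", Left: 20 }] },
    {
      MinTop: 50,
      Words: [
        { WordText: "Veronica", Left: 20 },
        { WordText: "Boroswick", Left: 110 },
      ],
    },
  ],
};

console.log(sortOverlayLines(sampleOverlay));
// [ 'Waue City Mayor', 'Veronica Boroswick', 'Enmu' ]
```

Sorting by the top coordinate alone works when each text block sits on its own row. For multi-column layouts you would probably need to group lines with similar top values first.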

Okay, thanks for the explanation. I guess I have to sort it on my own.

Edit: I sorted it out. Thanks once again.
