Word block coordinates

When I send a PDF via the API, I get a JSON file with coordinates.
However, I have no reference for the width and height of the page.
In my program, I want to receive the searchable PDF, generate a preview (TIFF image) with aspose.net at the same size as yours, and merge it with the word coordinates so that the user can click on a given block with the mouse.
Currently, when I send different A4 PDFs, I receive different results for the blocks.
Even on your online test web page, the previews generated from A4 PDFs have different sizes (801x1136, sometimes 629x892), so I can’t use a constant factor ;-(
How do I obtain it properly?
Thank you.

Hi, is this the same question as “Words coordinates in pdf file - #2 by admin”?

It was asked by you a long time ago :wink:

We went with ABBYY at the time, but now that engine2 has been released, we are giving ocr.space a second try.
The old problem still exists. A static scale does not work because your engine generates previews and coordinates of seemingly random size (based on A4 PDF files).
We also tried sending a TIFF in addition to the PDF, but you still don’t support it on engine2 ;-(
So we need to know the min and max page size, or you need to tell us how we should get these dimensions (what algorithm or component are you using?). We will use the same one to generate a preview on our side.
In fact, if we get a word at Left: 500, Top: 200, we don’t know where to draw the block. In the middle? (We don’t know what the page size is.)
We currently use ABBYY for over 300 clients and we are looking for alternatives.
Thank you.

Thanks for the info. Let me make sure I understand the issue correctly:

  1. You send PDF documents of various sizes and formats

  2. You get the OCR API json data back

  3. Now you convert the PDF to images. And on these images you want to mark the bounding box in the document (= very similar to what we do with the searchable PDF creator option).

Is that understanding correct?

PS: Tiff support for engine2 will be available soon

Generally yes,
Ad 1: From my perspective, the PDFs all look the same: A4 format, 8.27 x 11.69 in (sometimes generated by a scanner, sometimes sent by clients from their ERP software).

When I call the API with a single page in PNG format, all coordinates fit. But when I send a PDF, I can’t find the rule for generating a preview (on my side) that fits your JSON coordinates. I’m trying to avoid making 2 or more API calls (one for the searchable PDF and then one for each page independently). It would be stupid.
Thank you.

P.S
When will multipage TIFF be available in Engine2?

The OCR API always takes screenshots of the PDF at the same resolution (PDF to PNG conversion). So to match this with your PDF to image conversion, do the following:

  • OCR a test PDF (any PDF will do).
  • Look at the returned X/Y for certain words, e.g. maybe you find 100/200 for the word “Hello”
  • Now compare this to your image, maybe the word “Hello” is at 80/160
  • This way you know that the correction factor for x is 80/100 and for y it is 160/200.
  • This factor will be constant for all the PDF documents
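
A minimal Python sketch of the calibration steps above, in case it helps. It uses the 100/200 and 80/160 numbers from the example. The function names are made up for illustration, and the "Width"/"Height" keys are an assumption about the word entries (only Left and Top are mentioned in this thread). It also assumes both coordinate systems start at the top-left corner and differ only by scale, with no cropping or offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """Correction factors that map OCR coordinates to local image coordinates."""
    x: float
    y: float


def scale_from_reference(ocr_xy, image_xy):
    """Derive the factors from one word located in both coordinate systems.

    ocr_xy   -- (x, y) of the word as returned by the OCR API
    image_xy -- (x, y) of the same word in the locally rendered image
    """
    ocr_x, ocr_y = ocr_xy
    img_x, img_y = image_xy
    if ocr_x == 0 or ocr_y == 0:
        # A word sitting on the left or top edge cannot define a ratio.
        raise ValueError("pick a reference word away from the left/top edge")
    return Scale(img_x / ocr_x, img_y / ocr_y)


def to_image_box(word, scale):
    """Scale one word box (hypothetical dict layout) into image coordinates."""
    return {
        "Left": word["Left"] * scale.x,
        "Top": word["Top"] * scale.y,
        "Width": word["Width"] * scale.x,    # assumed key
        "Height": word["Height"] * scale.y,  # assumed key
    }


# "Hello" is at 100/200 in the OCR result and at 80/160 in the local image.
scale = scale_from_reference((100, 200), (80, 160))

# A word reported at Left: 500, Top: 200; the box size is made up.
box = to_image_box({"Left": 500, "Top": 200, "Width": 60, "Height": 20}, scale)
print(scale, box)
```

If the factors computed from two different test PDFs disagree, the local rendering is probably not using a fixed resolution, so it is worth checking the calibration with more than one document.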

Multipage TIFF support for Engine2 should be available in a few weeks (September).