The OCR API always renders PDF pages to PNG at the same resolution. So to map its coordinates onto your own PDF-to-image conversion, do the following:
- OCR a test PDF (any PDF will do).
- Look at the returned X/Y coordinates for a few words, e.g. you might find 100/200 for the word “Hello”.
- Now compare this with your own image, where the word “Hello” might be at 80/160.
- This tells you that the correction factor is 80/100 = 0.8 for x and 160/200 = 0.8 for y. Multiply the OCR coordinates by these factors to get the positions in your image.
- This factor will be constant for all PDF documents, provided your own conversion settings (e.g. the DPI) stay the same.
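
The steps above can be sketched in a few lines of Python. This is only an illustration: the function names `correction_factors` and `ocr_to_image` are made up for this example and are not part of the OCR API, and the coordinates are the sample values from the list plus one hypothetical word position.

```python
def correction_factors(ocr_xy, image_xy):
    """Return (fx, fy) from one reference word found in both coordinate systems."""
    ocr_x, ocr_y = ocr_xy
    img_x, img_y = image_xy
    return img_x / ocr_x, img_y / ocr_y


def ocr_to_image(ocr_xy, factors):
    """Scale an OCR coordinate into the coordinate system of your own image."""
    return ocr_xy[0] * factors[0], ocr_xy[1] * factors[1]


# Reference word "Hello": 100/200 in the OCR result, 80/160 in your image.
factors = correction_factors((100, 200), (80, 160))

# Hypothetical position of another word as returned by the OCR result.
mapped = ocr_to_image((250, 400), factors)
print(factors, mapped)
```

Pick a reference word that is far from the top-left corner, since rounding errors matter less for larger coordinates. Checking a second word is a cheap way to confirm that the factors really are the same across the page.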