Hello!
I would like to request support for the JBIG2 image format, which provides lossless compression and is specifically optimized for 2-color (black and white) images. It is well suited to scanned documents, particularly ones that have been preprocessed.
For example, a 250 KB PNG of a black and white image compresses to about 20 KB, which I believe would be of interest to both the API provider and consumers.
I understand that this format is supported by Tesseract, but even so, jb2 files can easily be decoded losslessly into PNG as a step in the API workflow, at very high speed (fractions of a second).
Example command which would convert a jb2 file into a PNG:
jbig2dec img_01.png.jb2 -o img_01.png.jb2.png
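To illustrate how small such a decode step could be inside a workflow, here is a minimal Python sketch. It assumes `jbig2dec` is installed and on the PATH; the function names `build_decode_command` and `decode_jb2_to_png` are hypothetical and only wrap the command shown above. This is not a description of how OCR.space works internally.

```python
import shutil
import subprocess
from pathlib import Path


def build_decode_command(src: str, dst: str) -> list:
    """Build the same invocation as the example command: jbig2dec <src> -o <dst>."""
    return ["jbig2dec", src, "-o", dst]


def decode_jb2_to_png(src: str) -> Path:
    """Decode a .jb2 file into a PNG written next to it and return the PNG path.

    Assumption: the jbig2dec binary is available on PATH.
    """
    if shutil.which("jbig2dec") is None:
        raise RuntimeError("jbig2dec not found on PATH")
    dst = src + ".png"  # e.g. img_01.png.jb2 -> img_01.png.jb2.png
    # check=True raises CalledProcessError if jbig2dec exits with an error.
    subprocess.run(build_decode_command(src, dst), check=True)
    return Path(dst)
```

Calling `decode_jb2_to_png("img_01.png.jb2")` would run the command from the example and return the path `img_01.png.jb2.png`.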
More info on the Ghostscript affiliated site: https://jbig2dec.com/
Would OCR.space consider supporting this format in order to reduce API invocation payload sizes, or at least supporting it in the paid version?
Currently, when uploading an image of a scanned page in jb2 format, the API responds with:
"ErrorMessage": [
"File failed validation. File does not have a valid extension. Allowed file extensions: .pdf,.jpg,.png,.jpeg,.bmp,.gif,.tif,.tiff,.webp"
],
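Until the format is accepted, a client can check the extension locally before uploading and decode to PNG first when needed. The sketch below is only an illustration: the allowed list is copied from the error message above, and the helper name `needs_conversion` is hypothetical.

```python
from pathlib import Path

# Allowed extensions, copied from the API error message quoted above.
ALLOWED_EXTENSIONS = {
    ".pdf", ".jpg", ".png", ".jpeg", ".bmp", ".gif", ".tif", ".tiff", ".webp",
}


def needs_conversion(filename: str) -> bool:
    """Return True if the file's final extension is not in the allowed list."""
    return Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS
```

With this check, `img_01.png.jb2` is flagged for conversion, while the decoded `img_01.png.jb2.png` passes.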
Thanks for your consideration!