If TESSERACT_DISABLE_DEBUG_FONTS is defined, Tesseract doesn't
attempt to create any debug fonts. This not only saves memory,
but, combined with the change to optionally use Pix as
internal storage for the ImageData, also allows us to use an
embedded Leptonica library with no format handlers at all.
By default, when we call ImageData::SetPix, we write the data out as a
PNG, just to read it back in to get a compressed buffer of data.
We then use this buffer to generate a new Pix.
In builds of Tesseract on systems where we don't have temp files,
writing files out is problematic.
Not only that, but compressing/uncompressing is slow, and on minimal
builds of leptonica, where we've disabled the format writers to reduce
memory footprint, we get no compression anyway.
In such cases, it'd be far nicer just to keep the original Pix as
the internal data.
Also, when recovering the pixmap from the ImageData, if we know we're
only going to read from the data, we can avoid duplicating it and
just use the original. This is exactly the case when GRAPHICS_DISABLED
is set.
So, introduce a TESSERACT_IMAGEDATA_AS_PIX predefine that we can use
to cause the internal data to be a Pix rather than a compressed
buffer.
Given we don't do compression, and they were writing to memory,
this was all just more effort than we needed.
Also, if we're using GRAPHICS_DISABLED, we might as well just
pixClone rather than pixCopy, as only the scaler uses this and it
only reads from the data.
clang warnings:
src/ccstruct/pageres.cpp:903:20: warning:
implicit conversion from 'int' to 'float' changes value from
2147483647 to 2147483648 [-Wimplicit-int-float-conversion]
src/ccstruct/pageres.cpp:904:23:
warning: implicit conversion from 'int' to 'float' changes value from
-2147483647 to -2147483648 [-Wimplicit-int-float-conversion]
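
The warning comes from float having only a 24-bit significand, so the
largest 32-bit int is not representable and gets rounded. A minimal
sketch of the effect follows; the helper names are hypothetical and not
taken from pageres.cpp, and the float limit is only one possible
replacement sentinel, not necessarily the one used in this commit.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: converts the largest 32-bit int to float.
// The explicit cast silences the warning, but the value is still rounded
// to the nearest representable float.
inline float MaxIntAsFloat() {
  return static_cast<float>(std::numeric_limits<std::int32_t>::max());
}

// Hypothetical alternative: use the float type's own limit as a sentinel,
// so no int-to-float conversion is involved at all.
inline float FloatSentinel() {
  return std::numeric_limits<float>::max();
}
```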
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Those files are C++, and the wrong modeline is not needed at all.
Also remove some empty descriptions and old history in the comments.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Commit 94d0f77f56 tried to fix issue #2741
but created a new problem.
This commit should fix both the old and the new issue.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Fix two occurrences of this LGTM warning:
Multiplication result may overflow 'double'
before it is converted to 'long double'.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The function derives the file name for the .box file from an image name.
For training from existing line images, it is useful to directly support
the image names which are commonly used.
While generated images for Tesseract training typically use the name
pattern NAME.tif, other ground truth sets use NAME.bin.png for binarized
or NAME.nrm.png for grayscale images.
BoxFileName is also now a local function as it is only used locally.
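
The naming rule described above can be sketched as follows. This is an
illustration under assumptions, not the actual BoxFileName: the function
name SketchBoxFileName, its signature, and the handling of names without
an extension are all hypothetical.

```cpp
#include <string>

// Hypothetical sketch: derive NAME.box from NAME.tif, NAME.bin.png or
// NAME.nrm.png. Directory separators are not treated specially here.
std::string SketchBoxFileName(const std::string &image_name) {
  static const std::string kDoubleSuffixes[] = {".bin.png", ".nrm.png"};
  std::string base = image_name;
  bool stripped = false;
  // First try the double extensions used by some ground truth sets.
  for (const std::string &suffix : kDoubleSuffixes) {
    if (base.size() > suffix.size() &&
        base.compare(base.size() - suffix.size(), suffix.size(), suffix) == 0) {
      base.resize(base.size() - suffix.size());
      stripped = true;
      break;
    }
  }
  if (!stripped) {
    // Otherwise drop only the last extension, for example ".tif".
    const std::string::size_type dot = base.find_last_of('.');
    if (dot != std::string::npos) {
      base.resize(dot);
    }
  }
  return base + ".box";
}
```

With this sketch, both "line.bin.png" and "line.tif" map to "line.box".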
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes a clang warning:
src/ccstruct/polyblk.cpp:412:12: warning: result of comparison of
unsigned enum expression >= 0 is always true
[-Wtautological-unsigned-enum-zero-compare]
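
A minimal sketch of the pattern behind this warning, using a
hypothetical enum rather than the one in polyblk.cpp: with an unsigned
underlying type the lower-bound check is always true, so only the
upper-bound check carries information.

```cpp
// Hypothetical enum with an unsigned underlying type.
enum SketchKind : unsigned { kText, kImage, kCount };

// A check such as `kind >= 0 && kind < kCount` would trigger the warning,
// because `kind >= 0` can never be false. Keeping only the upper bound
// gives the same result without the tautological comparison.
inline bool IsValidKind(SketchKind kind) {
  return kind < kCount;
}
```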
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Replace the macros which were declared in vecfuncs.h by member functions
and move a function which was only used in chop.cpp to that file.
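
As an illustration of this kind of change, the sketch below replaces a
function-like macro by member functions. The type Vec2 and the macro
name in the comment are hypothetical and do not reproduce vecfuncs.h.

```cpp
// Hypothetical 2-D vector type; the names are illustrative only.
struct Vec2 {
  int x;
  int y;

  // Replaces a macro of the form
  //   #define SKETCH_CROSS(a, b) ((a).x * (b).y - (a).y * (b).x)
  // Unlike the macro, the arguments are type checked and evaluated once.
  int cross(const Vec2 &other) const { return x * other.y - y * other.x; }

  // Scalar (dot) product, written the same way.
  int dot(const Vec2 &other) const { return x * other.x + y * other.y; }
};
```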
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Report from Coverity Scan:
CID 1405560 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
2. uninit_member: Non-static class member end is not initialized in
this constructor nor in any functions that it calls.
CID 1405561 [...]
Modernize and optimize class WERD_RES. This not only fixes the issues
but also reduces the size and eliminates the functions InitNonPointers
and InitPointers.
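
One common way to get this effect is to use default member initializers,
sketched below for a hypothetical class that is far smaller than
WERD_RES. Whether the commit uses exactly this technique for every
member is an assumption.

```cpp
// Hypothetical class; the member names are illustrative only.
struct WordResultSketch {
  // Every constructor starts from these values, so separate init helpers
  // for pointer and non-pointer members are no longer needed and no
  // member can be left uninitialized by accident.
  const char *text = nullptr;
  float certainty = 0.0f;
  int end = 0;
  bool done = false;

  WordResultSketch() = default;
  explicit WordResultSketch(const char *t) : text(t) {}
};
```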
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The class no longer uses bit fields. Re-ordering the member variables
avoids holes and reduces the size of BLOBNBOX from 168 to 152 bytes.
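
The effect of re-ordering can be sketched with two hypothetical layouts
(not the actual BLOBNBOX members). Exact sizes depend on the platform
ABI, so the sketch only asserts that grouping does not grow the struct.

```cpp
#include <cstdint>

// Small members are interleaved with large ones, so the compiler has to
// insert padding before each 8-byte aligned member on typical 64-bit ABIs.
struct Scattered {
  bool flag_a;
  void *owner;
  bool flag_b;
  double width;
  std::int16_t count;
};

// Same members, ordered from large to small alignment, which leaves at
// most some tail padding.
struct Grouped {
  void *owner;
  double width;
  std::int16_t count;
  bool flag_a;
  bool flag_b;
};

static_assert(sizeof(Grouped) <= sizeof(Scattered),
              "grouping members by alignment should not grow the struct");
```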
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes three LGTM warnings:
Multiplication result may overflow 'float' before it is converted to 'double'.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It is only used in unittest/layout_test.cc after moving a test from
baseapi_test.cc to that file, so it can be made local.
Signed-off-by: Stefan Weil <sw@weilnetz.de>