This fixes an LGTM alert:
This parameter of type ParamsTrainingHypothesis is 136 bytes -
consider passing a const pointer/reference instead.
It might also improve the performance.
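
As a rough illustration of the alert, here is a minimal sketch comparing the two calling conventions. BigHypothesis, CostByValue, CostByRef and BothAgree are hypothetical names made up for this example; the real ParamsTrainingHypothesis is defined in the Tesseract sources and is not reproduced here.

```cpp
#include <cassert>

// Hypothetical stand-in for a large parameter type (not the real
// ParamsTrainingHypothesis). With 8-byte doubles this is 17 * 8 bytes.
struct BigHypothesis {
  double features[16];
  double cost;
};

// Pass by value: the whole struct is copied on every call.
double CostByValue(BigHypothesis hypo) { return hypo.cost; }

// Pass by const reference: no copy is made, and the callee
// cannot modify the caller's object.
double CostByRef(const BigHypothesis &hypo) { return hypo.cost; }

// Both variants return the same result; only the copying differs.
bool BothAgree(const BigHypothesis &hypo) {
  assert(CostByValue(hypo) == CostByRef(hypo));
  return true;
}
```

Whether the change is measurable depends on how often the function is called and on what the optimizer already does with the copy.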
Signed-off-by: Stefan Weil <sw@weil.de>
If building with TESSERACT_IMAGEDATA_AS_PIX, then tesseract
doesn't compress/decompress images, but rather holds the
data as internal Pix structures. Unfortunately, I forgot to
make the ImageData destructor free these, so memory leaked
during use. Fixed here.
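
The shape of such a fix can be sketched as follows. FakePix, FakePixCreate, FakePixDestroy and ImageHolder are hypothetical stand-ins so that the example compiles without Leptonica; they are not the actual ImageData code.

```cpp
#include <cassert>

// Stand-in for a reference-counted image type (hypothetical, not Leptonica's Pix).
struct FakePix {
  int width = 0;
  int height = 0;
};

// Counts live images so that a leak becomes observable.
static int g_live_pix = 0;

FakePix *FakePixCreate() {
  ++g_live_pix;
  return new FakePix;
}

// Destroys the image and nulls the caller's pointer; a null pointer is a no-op.
void FakePixDestroy(FakePix **pix) {
  if (pix == nullptr || *pix == nullptr) return;
  assert(g_live_pix > 0);
  delete *pix;
  *pix = nullptr;
  --g_live_pix;
}

// Hypothetical owner class: the destructor releases the held image,
// which is the step that was missing in the leaking version.
class ImageHolder {
 public:
  ImageHolder() = default;
  ~ImageHolder() { FakePixDestroy(&pix_); }
  ImageHolder(const ImageHolder &) = delete;
  ImageHolder &operator=(const ImageHolder &) = delete;

  // Takes ownership of pix, releasing any image held before.
  void SetPix(FakePix *pix) {
    FakePixDestroy(&pix_);
    pix_ = pix;
  }

 private:
  FakePix *pix_ = nullptr;
};
```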
Some runtime parameters which are only relevant with graphics enabled
are now removed from builds when graphics is disabled.
TableFinder::DisplayColSegmentGrid is never used, so remove it completely.
Builds with --disable-graphics significantly reduce the code size and avoid
some function calls which might be important for certain applications:
   text    data     bss     dec     hex filename
3219230   41136   13920 3274286  31f62e .libs/libtesseract.so (--disable-graphics, old)
3211347   40976   13600 3265923  31d583 .libs/libtesseract.so (--disable-graphics, new)
3360942   43656   15392 3419990  342f56 .libs/libtesseract.so (default)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This replaces the proprietary STRING data type
(801 instead of 838 lines remaining).
It also removes STRING from osdetect.h and serialis.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Runtime error reported by sanitizer:
src/ccstruct/coutln.cpp:1018:19: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:48:14: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/ccstruct/coutln.cpp:1018:19 in
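
The note points at a nonnull attribute in string.h, so the sketch below assumes the flagged call is a memcpy-style copy whose source pointer can be null when there is nothing to copy. CopySteps is a hypothetical helper, not the code in coutln.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies n bytes only when there is something to copy,
// so memcpy never receives a null pointer, not even with a zero length
// (which is still undefined behavior for memcpy).
void CopySteps(unsigned char *dest, const unsigned char *src, std::size_t n) {
  if (n == 0 || src == nullptr) return;
  assert(dest != nullptr);
  std::memcpy(dest, src, n);
}
```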
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes a bug reported by OSS Fuzz:
https://oss-fuzz.com/issue/5697280134348800
The old code passed a negative value (-1) as argument to step_dir
when destindex was 0.
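
One possible way to avoid the negative index is to wrap around to the last step of the closed outline, as sketched below. PreviousStep is a hypothetical helper, and the actual fix in this commit may handle the case differently.

```cpp
#include <cassert>

// Hypothetical helper: index of the step before `index` in a closed outline
// of `length` steps. Index 0 wraps to length - 1 instead of becoming -1.
int PreviousStep(int index, int length) {
  assert(length > 0 && index >= 0 && index < length);
  return (index + length - 1) % length;
}
```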
Signed-off-by: Stefan Weil <sw@weilnetz.de>
If TESSERACT_DISABLE_DEBUG_FONTS is defined, tesseract doesn't
attempt to create any debug fonts. This not only saves memory,
but it (combined with the change to optionally use Pix as
internal storage for the ImageData) allows us to use an
embedded Leptonica library with no format handlers at all.
By default, when we call ImageData::SetPix, we write the data out as a
PNG, just to read it back in to get a compressed buffer of data.
We then use this to generate a new Pix.
In builds of Tesseract on systems where we don't have temp files,
writing files out is problematic.
Not only that, but compressing/uncompressing is slow, and on minimal
builds of leptonica, where we've disabled the format writers to reduce
memory footprint, we get no compression anyway.
In such cases, it'd be far nicer just to keep the original Pix as
the internal data.
Also, when recovering the pixmap from the ImageData, if we know we're
only going to read from the data, we can avoid duplicating it and
just use the original. This is exactly the case when GRAPHICS_DISABLED
is set.
So, introduce a TESSERACT_IMAGEDATA_AS_PIX predefine that we can use
to cause the internal data to be a Pix rather than a compressed
buffer.
Given we don't do compression, and they were writing to memory,
this was all just more effort than we needed.
Also, if we're using GRAPHICS_DISABLED, we might as well just
pixClone rather than pixCopy as only the scaler uses this.
clang warnings:
src/ccstruct/pageres.cpp:903:20: warning:
implicit conversion from 'int' to 'float' changes value from
2147483647 to 2147483648 [-Wimplicit-int-float-conversion]
src/ccstruct/pageres.cpp:904:23:
warning: implicit conversion from 'int' to 'float' changes value from
-2147483647 to -2147483648 [-Wimplicit-int-float-conversion]
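
The value change in the warning can be reproduced in isolation. The sketch below is not the code in pageres.cpp; Int32MaxRoundsUpAsFloat and kMaxAsFloat are names made up for the example. An explicit cast makes the conversion intentional, which should silence the warning without changing the resulting value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// A float has a 24-bit significand, so 2^31 - 1 is not representable
// and the conversion rounds to the nearest float, 2^31.
bool Int32MaxRoundsUpAsFloat() {
  const float f = static_cast<float>(std::numeric_limits<std::int32_t>::max());
  return f == 2147483648.0f;
}

// The explicit static_cast documents that the rounding is accepted.
constexpr float kMaxAsFloat =
    static_cast<float>(std::numeric_limits<std::int32_t>::max());
```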
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Those files are C++, and the wrong modeline is not needed at all.
Also remove some empty descriptions and old history in the comments.
Signed-off-by: Stefan Weil <sw@weilnetz.de>