The kDictWildcard is never actually used, so removing it makes
no difference. It causes warnings in MSVC builds as MSVC doesn't
know how to pack a Unicode value into chars.
If building with TESSERACT_IMAGEDATA_AS_PIX, then tesseract
doesn't compress/decompress images, but rather holds the
data as internal Pix structures. Unfortunately, I forgot to
make the ImageData destructor free these, so memory leaked
during use. Fixed here.
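
A minimal sketch of the kind of fix described, not the actual patch: a destructor that releases every image it owns. `FakePix`, `CreateImage`, `DestroyImage` and `PageImages` are hypothetical stand-ins so the example builds without Leptonica; the live counter exists only to make the leak observable.

```cpp
#include <vector>

// Stand-in for Leptonica's Pix (assumption, for illustration only).
struct FakePix {
  int width = 0;
  int height = 0;
};

int g_live_images = 0;  // number of images currently allocated

FakePix *CreateImage() {
  ++g_live_images;
  return new FakePix;
}

// Frees the image and nulls the caller's pointer.
void DestroyImage(FakePix **pix) {
  if (pix != nullptr && *pix != nullptr) {
    delete *pix;
    *pix = nullptr;
    --g_live_images;
  }
}

// Hypothetical owner of uncompressed images.
class PageImages {
 public:
  PageImages() = default;
  PageImages(const PageImages &) = delete;
  PageImages &operator=(const PageImages &) = delete;
  // Without this loop every stored image would leak.
  ~PageImages() {
    for (FakePix *&pix : images_) {
      DestroyImage(&pix);
    }
  }
  void Add() { images_.push_back(CreateImage()); }

 private:
  std::vector<FakePix *> images_;
};
```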
strtok_s can be used with MSVC as a replacement for strtok_r, so less
special handling is needed in the code and class SVNetwork can be
made smaller by removing member has_content.
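
A minimal sketch of the idea, not the actual patch: MSVC's strtok_s takes the same three arguments as POSIX strtok_r, so a single alias is enough. The helper name `SplitTokens` is hypothetical.

```cpp
#include <string.h>

#include <string>
#include <vector>

#if defined(_MSC_VER)
// MSVC has no strtok_r, but its strtok_s has the same signature.
#define strtok_r strtok_s
#endif

// Hypothetical helper: splits a copy of `text` at any character in `delims`.
// Like strtok, it skips empty tokens between consecutive delimiters.
std::vector<std::string> SplitTokens(const std::string &text,
                                     const char *delims) {
  // strtok_r modifies its input, so work on a NUL-terminated copy.
  std::vector<char> buffer(text.begin(), text.end());
  buffer.push_back('\0');

  std::vector<std::string> tokens;
  char *context = nullptr;  // per-call state, which makes this reentrant
  for (char *tok = strtok_r(buffer.data(), delims, &context); tok != nullptr;
       tok = strtok_r(nullptr, delims, &context)) {
    tokens.emplace_back(tok);
  }
  return tokens;
}
```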
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Some runtime parameters which are only relevant with graphics enabled
are now removed from builds when graphics is disabled.
TableFinder::DisplayColSegmentGrid is never used, so remove it completely.
Builds with --disable-graphics significantly reduce the code size and avoid
some function calls which might be important for certain applications:
   text    data     bss     dec     hex filename
3219230   41136   13920 3274286  31f62e .libs/libtesseract.so (--disable-graphics, old)
3211347   40976   13600 3265923  31d583 .libs/libtesseract.so (--disable-graphics, new)
3360942   43656   15392 3419990  342f56 .libs/libtesseract.so (default)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It was reported by oss-fuzz (issue 23962).
Add log output to find real images which trigger that issue.
Also avoid some conversions from float to double by always using float.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This replaces the proprietary STRING data type
(801 instead of 838 lines remaining).
It also removes STRING from osdetect.h and serialis.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
MSVC does not support /arch:FMA or /arch:SSE4.1.
For /arch:AVX and /arch:AVX2 no check is needed because they have been supported for a long time.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Runtime error reported by sanitizer:
src/ccstruct/rect.h:191:44: runtime error: 50961 is outside the range of representable values of type 'short'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/ccstruct/rect.h:191:44 in
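
One way to avoid this class of error, sketched under the assumption that the out-of-range value comes from a floating-point computation: saturate to the 16-bit range before converting. `ClipToInt16` is a hypothetical helper, not necessarily what the fix uses.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: saturates a floating-point value into the int16_t
// range so the final cast is always well-defined.
int16_t ClipToInt16(double value) {
  if (std::isnan(value)) {
    return 0;  // arbitrary choice for the sketch; NaN has no meaningful clamp
  }
  const double lo = std::numeric_limits<int16_t>::min();
  const double hi = std::numeric_limits<int16_t>::max();
  return static_cast<int16_t>(std::min(std::max(value, lo), hi));
}
```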
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Runtime error reported by sanitizer:
src/ccstruct/coutln.cpp:1018:19: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:48:14: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/ccstruct/coutln.cpp:1018:19 in
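
A hedged sketch of a typical guard for this report, assuming the flagged call is a <string.h> copy such as memcpy with a zero length and a null source. `CopyBytes` is a hypothetical wrapper, not the code in coutln.cpp.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: skips the copy when there is nothing to copy, so a
// null pointer never reaches a function whose arguments are declared nonnull.
void *CopyBytes(void *dest, const void *src, std::size_t count) {
  if (count > 0 && dest != nullptr && src != nullptr) {
    std::memcpy(dest, src, count);
  }
  return dest;
}
```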
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Runtime errors reported by sanitizer:
src/textord/pithsync.cpp:75:31: runtime error: unsigned integer overflow: 2147483648 + 2147483648 cannot be represented in type 'unsigned int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/textord/pithsync.cpp:75:31 in
src/textord/pithsync.cpp:75:43: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'unsigned int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/textord/pithsync.cpp:75:43 in
src/textord/pithsync.cpp:125:29: runtime error: unsigned integer overflow: 2147483648 + 2147483648 cannot be represented in type 'unsigned int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/textord/pithsync.cpp:125:29 in
src/textord/pithsync.cpp:125:41: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'unsigned int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/textord/pithsync.cpp:125:41 in
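
Unsigned wraparound is well-defined in C++, so these reports come from the opt-in unsigned-integer-overflow check. The two reported steps (2147483648 + 2147483648 wrapping to 0, then 0 - 1) suggest an expression of the form a + b - 1. The sketch below evaluates that form in 64 bits so no intermediate step wraps. The function name and the shape of the expression are assumptions; whether the real fix widens the type or restructures the code is not stated here.

```cpp
#include <cstdint>

// Hypothetical: computes (a + b - 1) modulo 2^32 without any wrapping
// intermediate result, by doing the arithmetic in a signed 64-bit type.
uint32_t SumMinusOne(uint32_t a, uint32_t b) {
  const int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b) - 1;
  // Keep the low 32 bits explicitly; the masked value always fits uint32_t.
  return static_cast<uint32_t>(wide & 0xFFFFFFFF);
}
```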
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Runtime error with enabled sanitizer:
src/textord/colpartition.cpp:2243:66: runtime error: index -1 out of bounds for type 'tesseract::ColPartition *[6]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/textord/colpartition.cpp:2243:66 in
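
A minimal sketch of a bounds check for a fixed array of six pointers like the one named in the report. `Partition`, `kSlotCount` and `SlotOrNull` are hypothetical names; the actual fix in colpartition.cpp may differ.

```cpp
#include <array>
#include <cstddef>

struct Partition {};  // stand-in for tesseract::ColPartition

constexpr int kSlotCount = 6;

// Hypothetical accessor: returns nullptr instead of reading outside the
// array when the index is negative or too large.
Partition *SlotOrNull(const std::array<Partition *, kSlotCount> &slots,
                      int index) {
  if (index < 0 || index >= kSlotCount) {
    return nullptr;
  }
  return slots[static_cast<std::size_t>(index)];
}
```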
Signed-off-by: Stefan Weil <sw@weilnetz.de>