The fontconfig library has some (intentional) memory leaks which
must be suppressed for unit tests with the LeakSanitizer.
This fixes issues #3156 and #3157.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Avoid 1) floating point division by 127, 2) conversion of
bias to double, 3) FP addition, in favour of 1) integer
multiplication by 127, and 2) integer addition.
(This also costs extra work in the serialisation/deserialisation of
the scale values and in the conversion of weights to int formats, but
these are all one-off costs.)
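The arithmetic behind this change can be sketched as follows. This is only
an illustration of the identity (dot / 127 + bias) * scale ==
(dot + bias * 127) * (scale / 127), not the actual kernel code; the function
names OutputOld, OutputNew, PrescaleOnce and CheckEquivalent are hypothetical
and exist only in this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Old form: FP division by 127, conversion of the bias to double,
// FP addition, then multiplication by the scale.
double OutputOld(int32_t dot, int8_t bias, double scale) {
  return (static_cast<double>(dot) / 127.0 + static_cast<double>(bias)) * scale;
}

// One-off conversion, assumed to happen when the scales are loaded.
double PrescaleOnce(double scale) { return scale / 127.0; }

// New form: integer multiplication of the bias by 127 and integer addition;
// the division by 127 is already folded into the scale.
double OutputNew(int32_t dot, int8_t bias, double scale_div_127) {
  const int32_t total = dot + static_cast<int32_t>(bias) * 127;
  return static_cast<double>(total) * scale_div_127;
}

// Self-check: both forms agree up to floating point rounding.
void CheckEquivalent(int32_t dot, int8_t bias, double scale) {
  const double a = OutputOld(dot, bias, scale);
  const double b = OutputNew(dot, bias, PrescaleOnce(scale));
  assert(std::fabs(a - b) < 1e-9);
}
```

The per-output work in the new form is one integer multiply-add and one FP
multiplication; the division moves into the one-off preparation step.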
Currently, the size of the scales array is not rounded up
in the same way as the weights are. This blocks us from pushing
the scale calculations into the SIMD code, because when we "overread"
the end of the scale array, we potentially get errors.
Here, we adjust the IntSimdMatrix code to ensure that the
scales array reserves enough entries to allow such overreads
to work.
This doesn't make any difference for now, but opens the way
for future optimisations.
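A minimal sketch of the padding idea, assuming the scales are kept in a
std::vector<double> and that the SIMD code reads them in fixed-size groups.
RoundUp and PadScales are hypothetical helper names for this illustration,
and the group size is a free parameter here, not a value taken from the code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of factor.
std::size_t RoundUp(std::size_t n, std::size_t factor) {
  assert(factor > 0);
  return (n + factor - 1) / factor * factor;
}

// Copy the scales into a buffer whose size is a multiple of the group size.
// The extra entries are zero, so a full-width read past the last real scale
// stays inside the allocation.
std::vector<double> PadScales(const std::vector<double>& scales,
                              std::size_t group) {
  std::vector<double> padded(RoundUp(scales.size(), group), 0.0);
  for (std::size_t i = 0; i < scales.size(); ++i) {
    padded[i] = scales[i];
  }
  return padded;
}
```

With a group size of 8, for example, an array of 10 scales would occupy
16 entries, the last 6 of which are padding.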
They used the function pango_coverage_max which does nothing and
which has been deprecated since pango version 1.44.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This replaces the proprietary STRING data type
(801 instead of 838 lines remaining).
It also removes STRING from osdetect.h and serialis.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This is a copy of projects/tesseract-ocr/build.sh including its history from
https://github.com/google/oss-fuzz.git.
It allows maintaining the build rules with the Tesseract source code.
The build rules for Leptonica were slightly modified to avoid
unneeded compilations.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
- Replace AVX_OPT, AVX2_OPT, FMA_OPT, SSE41_OPT
- Replace AVX, AVX2, FMA, SSE4_1
- Write new HAVE_AVX, HAVE_AVX2, HAVE_FMA, HAVE_SSE4_1 into config_auto.h
- Put related conditionals in Makefile.am in one place
This makes the code clearer and fixes a log message in
IntSimdMatrixTest.AVX2.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
They are moved from src/classify and src/lstm to src/training.
This reduces the size of the Tesseract library.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It is only used in unittest/layout_test.cc after moving a test from
baseapi_test.cc to that file, so it can be made local.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The method was only used in the unit tests, where it can be replaced by
UNICHARSET::load_from_file, which also simplifies the code.
This allows removing the class InMemoryFilePointer and fixes a TODO.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Now more tests (those which use fileio) depend on the training build.
This is required since commit c5a50b93ce.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
GTEST_SKIP() returns from the function, which caused two warnings:
CID 1402755 (#1 of 1): Resource leak (RESOURCE_LEAK)
CID 1402761 (#1 of 1): Structurally dead code (UNREACHABLE)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The test submodule now adds an image which is needed by the
pagesegmode_test.
That image was newly created for the test. Therefore the box
coordinates in the test had to be fixed by using data from
the hOCR output for the full image.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The test submodule now includes the files needed by the tatweel_test.
Also fix a linker error for tatweel_test.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The function pointers and callbacks file_reader_, file_writer_,
checkpoint_reader_ and checkpoint_writer_ are always set to
the same values. Replacing them with direct function calls
simplifies the code and allows removing more code from tesscallback.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Skip the tests which need the legacy code.
Also add code to those tests to use the user's locale, so that it is tested, too.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Linker error reported in issue #2439:
unittest/baseapi_test.cc:190:
undefined reference to
`tesseract::TessBaseAPI::AdaptToWordStr(tesseract::PageSegMode, char const*)'
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This needs the latest test submodule.
The test uses LoadFromFile which is not used otherwise, so remove that
function from class ParamsModel.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang warnings:
src/ccutil/unicharcompress.cpp:172:27: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
src/lstm/recodebeam.cpp:129:29: warning: comparison of integers of different signs: 'std::__cxx1998::vector::size_type' (aka 'unsigned long') and 'int' [-Wsign-compare]
src/lstm/recodebeam.cpp:276:48: warning: comparison of integers of different signs: 'std::__cxx1998::vector::size_type' (aka 'unsigned long') and 'int' [-Wsign-compare]
unittest/imagedata_test.cc:101:21: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
unittest/linlsq_test.cc:33:23: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
unittest/linlsq_test.cc:44:23: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
unittest/nthitem_test.cc:27:23: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
unittest/nthitem_test.cc:68:21: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
unittest/stats_test.cc:26:23: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
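For reference, the usual pattern behind this kind of warning and two common
ways to avoid it are sketched below. This is a generic illustration; the
function names are hypothetical, and which variant was applied at each of
the locations listed above is not stated here.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// The pattern that triggers -Wsign-compare:
//   for (int i = 0; i < values.size(); ++i) { ... }
// 'i' is a signed int, values.size() is an unsigned size_type.

// Variant 1: use an unsigned index that matches the container's size type.
int SumWithSizeT(const std::vector<int>& values) {
  int sum = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    sum += values[i];
  }
  return sum;
}

// Variant 2: keep the signed index and convert the size once, explicitly.
int SumWithCast(const std::vector<int>& values) {
  assert(values.size() <= static_cast<std::size_t>(INT_MAX));
  const int n = static_cast<int>(values.size());
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += values[i];
  }
  return sum;
}
```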
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Add more subtests to langmodel_test
fix and enable lstmtrainer_test
fix and enable some subtests from recodebeam_test
partial fix for resultiterator_test
fix typo removing the terminating linefeed.
fix typo
changes