Commit Graph

244 Commits

Author SHA1 Message Date
Stefan Weil
46e2a0f106 Remove more code for builds with disabled legacy engine
Now the Tesseract library no longer includes unused code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-13 17:49:10 +02:00
Egor Pugin
73f713519c
Merge pull request #2614 from stweil/training
Move source files which are used for training only to src/training
2019-08-12 19:35:50 +03:00
Stefan Weil
e84cb24def Move source files which are used for training only to src/training
They are moved from src/classify and src/lstm to src/training.

This reduces the size of the Tesseract library.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-12 17:08:08 +02:00
Stefan Weil
bce585286d Remove global array kPolyBlockNames from Tesseract library
It is only used in unittest/layout_test.cc after moving a test from
baseapi_test.cc to that file, so it can be made local.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-12 14:33:55 +02:00
Stefan Weil
beec85e023 Remove UNICHARSET::load_from_inmemory_file and related code
The method was only used in unittest where it can be replaced by
UNICHARSET::load_from_file which also simplifies the code.

This allows removing the class InMemoryFilePointer and fixes a TODO.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-12 13:07:15 +02:00
Stefan Weil
ab953c1d51 unittest: Fix build and simplify build rules
Now more tests (those which use fileio) depend on the training build.
This is required since commit c5a50b93ce.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-07 13:58:12 +02:00
Stefan Weil
2ba90f02cb unittest: Initialize non-static class members in RecodeBeamTest (CID 1402765)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-10 16:52:27 +02:00
Stefan Weil
d40a2423e8 unittest: Fix two issues reported by Coverity Scan (CID 1402761, 1402755)
GTEST_SKIP() returns from the function which caused two warnings:

CID 1402755 (#1 of 1): Resource leak (RESOURCE_LEAK)
CID 1402761 (#1 of 1): Structurally dead code (UNREACHABLE)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-10 16:38:30 +02:00
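The resource leak reported for commit d40a2423e8 comes from an early return that skips cleanup code. A minimal sketch of one common way to avoid this kind of leak, using RAII; the names `Resource` and `RunBody` are hypothetical and the actual fix in the commit may look different:

```cpp
#include <memory>

// Hypothetical resource that counts how many instances are alive.
struct Resource {
  static int live;
  Resource() { ++live; }
  ~Resource() { --live; }
};
int Resource::live = 0;

// Stand-in for a test body. `skip` models the early return that a macro
// such as GTEST_SKIP() performs; it is an assumption for illustration.
bool RunBody(bool skip) {
  auto resource = std::make_unique<Resource>();
  if (skip) {
    return false;  // Early return: unique_ptr still destroys the resource.
  }
  return true;  // Normal path: the resource is destroyed here as well.
}
```

With a raw `new` and a `delete` placed after the early return, the skipped path would leave the object allocated, which is the pattern a static analyzer flags as RESOURCE_LEAK.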
Stefan Weil
a85045eeb5 unittest: Add missing precision specifiers (CID 1402752)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-10 16:33:08 +02:00
Stefan Weil
7fab891e36 unittest: Don't build tatweel_test when TensorFlow is disabled
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-10 16:06:27 +02:00
Stefan Weil
ba27deb3a0 unittest: Add missing libraries to fix linker errors
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-10 15:11:51 +02:00
Stefan Weil
e94392ef10 Update test submodule and fix pagesegmode_test
The test submodule now adds an image which is needed by the
pagesegmode_test.

That image was newly created for the test. Therefore the box
coordinates in the test had to be fixed by using data from
the hOCR output for the full image.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-09 11:48:56 +02:00
Stefan Weil
098180982a Update test submodule and fix tatweel_test
The test submodule now includes the files needed by the tatweel_test.
Fix also a linker error for tatweel_test.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-09 08:03:11 +02:00
Stefan Weil
71e7e16a61 unittest: Fix and enable pagesegmode_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-07 12:35:41 +02:00
Stefan Weil
6668f2fc9e unittest: Fix and enable tatweel_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-07 10:59:27 +02:00
Stefan Weil
cf46eaeac8 unittest: Fix and enable baseapi_thread_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-30 20:21:56 +02:00
Stefan Weil
b00e53fabf unittest: Fix and enable stridemap_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-30 15:30:03 +02:00
Stefan Weil
4e576f844c unittest: Fix and enable networkio_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-30 15:28:42 +02:00
Stefan Weil
2833db7c67 unittest: Fix and enable equationdetect_test
It requires Tensorflow. Skip one test because equ_gt1.tif is missing.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-30 12:39:54 +02:00
Stefan Weil
5409299763 unittest: Fix tests which need Tensorflow headers
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-29 11:43:12 +02:00
Stefan Weil
655ba7af10 unittest: Fix compiler warnings (signed/unsigned)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-28 08:11:42 +02:00
Stefan Weil
40c1cf671f unittest: Fix and enable pango_font_info_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-28 08:09:28 +02:00
Stefan Weil
04d85b4c0f Add more test code for normstrngs_test
unilib.h is now available, so more code can be enabled.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-28 07:35:27 +02:00
Stefan Weil
aa54bf0f8b Fix code from tensorflow/models/research/syntaxnet/util/utf8
See https://github.com/tensorflow/models/issues/7090.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-28 07:30:40 +02:00
Stefan Weil
0702194246 Add code from tensorflow/models
The new code was copied from the latest code on GitHub
(https://github.com/tensorflow/models/tree/master/research/syntaxnet).

It is required for pango_font_info_test and other unit tests.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-28 07:30:40 +02:00
Stefan Weil
252d80cb6d unittest: Fix function QCHECK (issue #2517)
The function must print an error message if the condition fails.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-23 19:05:42 +02:00
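The requirement stated in commit 252d80cb6d (print an error message when the condition fails) can be sketched as follows. `ReportIfFalse` is a hypothetical stand-in, not the QCHECK implementation from the unittest directory, which may also abort the process:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes a message to `err` when `condition` is false
// and returns the condition so the caller can decide how to proceed.
bool ReportIfFalse(bool condition, const std::string& text, std::ostream& err) {
  if (!condition) {
    err << "Check failed: " << text << '\n';
  }
  return condition;
}
```

Passing a `std::ostringstream` as `err` makes the message easy to inspect in a test; production code would more likely pass `std::cerr`.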
Stefan Weil
efa3cae06d Simplify unittest/Makefile.am
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-22 20:29:39 +02:00
Stefan Weil
bd13069fe8 Simplify class LSTMTrainer
The function pointers and callbacks file_reader_, file_writer_,
checkpoint_reader_ and checkpoint_writer_ are always set to
the same values. Replacing them by direct function calls
simplifies the code and allows removing more code from tesscallback.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-22 09:18:13 +02:00
Stefan Weil
b967c62880 unittest: Add missing Leptonica library for textlineprojection_test
It is needed for builds with --enable-shared.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-17 15:10:16 +02:00
Stefan Weil
ceabab8373 unittest: Catch missing eng.traineddata in baseapi_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-16 08:11:16 +02:00
Stefan Weil
bbd3626d77 unittest: Fix and enable normstrngs_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-16 08:01:11 +02:00
Stefan Weil
73e5241004 unittest: Fix and enable textlineprojection_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-15 10:22:44 +02:00
Stefan Weil
e0e29126ac unittest: Fix and enable scanutils_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-14 16:51:39 +02:00
Stefan Weil
3c507100c6 unittest: Fix and enable ligature_table_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-11 16:40:23 +02:00
Stefan Weil
9a4bd041c8 Fix build for unittests
Commit 29f2cff203 was the wrong fix
for the compiler warnings because it broke the unittest build.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-26 21:36:34 +02:00
Stefan Weil
9551c3d413 unittest: Remove unused methods
This fixes compiler warnings.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-26 20:27:21 +02:00
zdenop
12847d58ad
Merge pull request #2455 from bact/master
Unittest: Fix Thai valid text and add Thai illegal sequences
2019-05-25 18:36:17 +02:00
Stefan Weil
1ba8c97cac Fix linking of unittest with Tensorflow
This does not add Tensorflow tests. It only fixes the linker errors.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 17:08:48 +02:00
bact
aac6f593f3
Update normstrngs_test.cc 2019-05-22 15:21:16 +07:00
bact
e05c5ecfcc
Fix Thai valid text and add Thai illegal sequences
- Fix an invalid sequence in "valid text" `kScriptText`
- Add two illegal sequences in `kBadlyFormedThaiWords`
2019-05-22 15:19:49 +07:00
Stefan Weil
639781b5c8 stringrenderer_test: Get system locale only once
This fixes a runtime exception on macOS.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-18 13:24:13 +02:00
Stefan Weil
8e7b1119b5 Run more unittests with the user's locale
Hopefully this improves the test coverage.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-16 18:12:55 +02:00
Stefan Weil
59e31e958b Fix more build errors for compilation without legacy engine
Skip the tests which need the legacy code.
Add also code to those tests to use the user's locale to test that, too.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-16 18:12:55 +02:00
Stefan Weil
780986ebfb Fix linker error for baseapi_test when building without legacy engine
Linker error reported in issue #2439:

    unittest/baseapi_test.cc:190:
      undefined reference to
      `tesseract::TessBaseAPI::AdaptToWordStr(tesseract::PageSegMode, char const*)'

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-16 18:12:55 +02:00
Stefan Weil
28a521fec2 Fix some typos (most found and fixed by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-01 20:30:41 +02:00
Stefan Weil
4194b93e3a unittest: Add missing unittests to Makefile.am as comments
This gives a good overview of the missing unittests.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-04-19 11:14:43 +02:00
Stefan Weil
5529a5db11 unittest: Fix and enable params_model_test
This needs the latest test submodule.

The test uses LoadFromFile which is not used otherwise, so remove that
function from class ParamsModel.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-04-18 17:06:48 +02:00
Stefan Weil
bb52887c36 unittest: Replace TRUE, FALSE by true, false
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-31 17:54:48 +02:00
Stefan Weil
2718b81a3e fuzzer-api: Use environment variable TESSDATA_PREFIX if set
Clean also the code a little bit.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-26 11:09:22 +01:00
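Reading an optional environment variable, as described for commit 2718b81a3e, can be sketched like this. The helper name `TessdataPrefixOr` and its fallback parameter are assumptions for illustration and are not taken from fuzzer-api.cpp:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of TESSDATA_PREFIX if it is set,
// otherwise the caller-supplied fallback path.
std::string TessdataPrefixOr(const std::string& fallback) {
  const char* prefix = std::getenv("TESSDATA_PREFIX");
  return prefix != nullptr ? std::string(prefix) : fallback;
}
```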
Stefan Weil
7e9970b4b1 Format fuzzer code with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-26 11:09:22 +01:00
Stefan Weil
7cd012f3dd Move fuzzer-api.cpp to subdirectory unittest/fuzzers
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-26 11:09:10 +01:00
Stefan Weil
aaf8c50a12 unittest: Use range-for-loops
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-25 09:36:32 +01:00
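The kind of change made in commit aaf8c50a12 can be illustrated with a small sketch. `JoinWords` is a hypothetical function, not code from the unittest directory:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins words with single spaces.
std::string JoinWords(const std::vector<std::string>& words) {
  std::string result;
  // A range-based for loop visits every element without an explicit index.
  for (const auto& word : words) {
    if (!result.empty()) {
      result += ' ';
    }
    result += word;
  }
  return result;
}
```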
Stefan Weil
631882a346 Fix compiler warnings (signed / unsigned mismatch)
clang warnings:

    src/ccutil/unicharcompress.cpp:172:27: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
    src/lstm/recodebeam.cpp:129:29: warning: comparison of integers of different signs: 'std::__cxx1998::vector::size_type' (aka 'unsigned long') and 'int' [-Wsign-compare]
    src/lstm/recodebeam.cpp:276:48: warning: comparison of integers of different signs: 'std::__cxx1998::vector::size_type' (aka 'unsigned long') and 'int' [-Wsign-compare]
    unittest/imagedata_test.cc:101:21: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
    unittest/linlsq_test.cc:33:23: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
    unittest/linlsq_test.cc:44:23: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
    unittest/nthitem_test.cc:27:23: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
    unittest/nthitem_test.cc:68:21: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
    unittest/stats_test.cc:26:23: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-25 08:36:07 +01:00
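The -Wsign-compare warnings listed for commit 631882a346 appear when an `int` is compared with the unsigned value returned by `size()`. A minimal sketch of the usual remedy, assuming a hypothetical helper `SumAll` that is not part of Tesseract:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums the elements of a vector.
int SumAll(const std::vector<int>& values) {
  int sum = 0;
  // std::size_t matches the type returned by size(), so the loop condition
  // no longer compares a signed with an unsigned integer.
  for (std::size_t i = 0; i < values.size(); ++i) {
    sum += values[i];
  }
  return sum;
}
```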
Stefan Weil
b7279f6d67 unittest: Remove tmp directory from repository and create it during build
This fixes out of tree builds.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-08 16:08:16 +01:00
Stefan Weil
bd95c9d2b8 unittest: Add missing libarchive
It is needed for the tests if Tesseract was built with libarchive.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-08 15:50:14 +01:00
Stefan Weil
b20f89006e unittest: Add another file from Abseil
It is needed for newer versions of Abseil.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-08 15:46:38 +01:00
Stefan Weil
b3bd23edb7 Remove whitespace at line endings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-02-19 13:53:31 +01:00
Shree Devi Kumar
8612170321 fix resultiterator_test for extra \n
resultiterator_test.cc
2019-02-10 04:58:40 +00:00
Shree Devi Kumar
32af6be4ba disable some subtests in resultiterator_test
(cherry picked from commit 147ef6e5f17f6cd5eedae9c81d291ad296f37090)
2019-02-02 11:54:17 +00:00
Shree Devi Kumar
1ac76d8825 Partially fix and enable more unittests
Add more subtests to langmodel_test

fix and enable lstmtrainer_test

fix and enable some subtests from recodebeam_test

partial fix for resultiterator_test

fix typo removing the terminating linefeed.

fix typo

changes
2019-01-27 06:49:57 +00:00
Shree Devi Kumar
eaf5deb6b3 Disable ligature related subtest in stringrenderer 2019-01-27 06:49:56 +00:00
Stefan Weil
50f5662723
Merge pull request #2193 from Shreeshrii/master
More updates to LSTM related unittests
2019-01-24 17:11:00 +01:00
Shree Devi Kumar
dbb12d6fde more updates to lstm related unittests 2019-01-24 15:39:37 +00:00
Stefan Weil
86b0f3625e unittest: Skip test if traineddata is missing in applybox_test
Many tests have preconditions like a correct version of the test submodule
or installed traineddata files at the right location. They fail or even
crash if those preconditions are not met.

The latest version of Googletest supports skipping single tests with
GTEST_SKIP which is used here to skip tests in applybox_test when
tessdata/eng.traineddata is missing.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-24 16:10:52 +01:00
Shree Devi Kumar
36906064a5 Add LF to INFO msgs in lstm_test 2019-01-24 11:40:53 +00:00
Stefan Weil
14086af474 unittest: Add missing Leptonica library for stringrenderer_test
It is needed for builds without `--disable-shared`.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-24 11:29:22 +01:00
Stefan Weil
6b7f7db63e Fix and enable shapetable_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-24 11:23:20 +01:00
Shreeshrii
bbd23bbfd2 Fix and enable lstm related unittests (#2180)
* Fix and build lstm related unittests
* Use ./tmp instead of ./ for files created by unittests
2019-01-24 08:01:19 +01:00
Stefan Weil
4b24d8cdf6 Fix and enable stringrenderer_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-23 13:55:13 +01:00
Stefan Weil
a6da64234e unittest: Fix and enable validate_myanmar_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-23 13:54:27 +01:00
Stefan Weil
d67287a5d9 unittest: Fix and enable validate_khmer_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-23 13:54:27 +01:00
Stefan Weil
611d5e6358 unittest: Fix and enable validate_indic_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-23 13:54:27 +01:00
Stefan Weil
d97f67da63 unittest: Fix and enable validate_grapheme_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-23 13:54:27 +01:00
Stefan Weil
a702f2d2aa unittest: Replace ABSL_ARRAYSIZE by ARRAYSIZE
Remove the local definition of ABSL_ARRAYSIZE
to avoid a conflict with Abseil.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-23 13:54:27 +01:00
Stefan Weil
2c0ddb4220 Update file paths in dawg_test
Get unicharset and wordlist files from test/testing and use the latest
test submodule which provides those files.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-21 20:08:11 +01:00
Shree Devi Kumar
57f74d2b73 Fix file location for unicharset for mastertrainer_test 2019-01-21 17:36:08 +01:00
Shree Devi Kumar
0ee4f63019 Formatting LOG messages from layout_test 2019-01-21 17:36:08 +01:00
Stefan Weil
4edc61fd3f unittest: Add missing license headers for dawg_test and layout_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-21 17:36:08 +01:00
Stefan Weil
05cdbc7c9c Fix and enable dawg_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-21 17:36:08 +01:00
Stefan Weil
aec992ebf8 Update test submodule and enable additional test in layout_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-21 17:36:08 +01:00
Stefan Weil
4b821b2c6b Fix and enable layout_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-21 17:36:08 +01:00
Shree Devi Kumar
0d6d8108c8 Add sources for layout_test and dawg_test to Makefile 2019-01-21 17:36:08 +01:00
Shree Devi Kumar
0f0eaa9f30 Partial fix for layout_test and dawg_test 2019-01-21 17:36:08 +01:00
Stefan Weil
0ae8fdc859 Fix build for unicharcompress_test
* Add abseil library
* Add minimalistic implementation for WriteStringToFile
* Add missing namespace for std::string

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-19 08:34:00 +01:00
Shree Devi Kumar
e67ad46fca fix typo 2019-01-19 05:24:17 +00:00
Shree Devi Kumar
9e599e1e54 Partial fix for unicharcompress_test 2019-01-19 05:13:03 +00:00
Stefan Weil
9b2bf10391 Fix build for unichar_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-18 21:15:33 +01:00
Shree Devi Kumar
20ed60b31f Fix unicharset_test 2019-01-18 16:41:29 +00:00
Stefan Weil
502bb624c2 More optimisations for IntSimdMatrix
* Move IntDotProductSSE. That allows inlining of the code.
* Improve IntDotProductSSE by moving some instructions.
* Remove unused num_input_groups_ from IntSimdMatrix.
* Re-order elements in IntSimdMatrix to avoid padding.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 21:34:37 +01:00
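The last item of commit 502bb624c2 (re-ordering elements to avoid padding) relies on a general property of struct layout. A minimal sketch with hypothetical structs that do not reflect the actual members of IntSimdMatrix:

```cpp
#include <cstdint>

// Small members placed around a large one force padding before and after it.
struct Padded {
  std::int8_t a;
  std::int64_t b;
  std::int8_t c;
};

// Placing the largest member first lets the small members share one slot.
struct Reordered {
  std::int64_t b;
  std::int8_t a;
  std::int8_t c;
};

static_assert(sizeof(Reordered) <= sizeof(Padded),
              "reordering must not increase the size");
```

On a typical 64-bit platform the first layout occupies 24 bytes and the second 16, but the exact numbers depend on the ABI.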
Stefan Weil
95606398f5 Clean code for IntSimdMatrix
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 21:34:37 +01:00
Stefan Weil
7fc7d28dd0 Compile files for AVX, AVX2 or SSE only when needed
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 21:34:37 +01:00
Stefan Weil
a9a1035e55 Move IntSimdMatrixNative from IntSimdMatrix to unittest
It is only used for the unit test.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 21:34:37 +01:00
Stefan Weil
605b4d66c7 Replace dynamically allocated IntSimdMatrix instances by constants
Two header files are no longer needed and could be removed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 21:34:37 +01:00
Stefan Weil
26be7c5d2e Use constructor with parameters for IntSimdMatrix
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 21:34:37 +01:00
Stefan Weil
7c70147701 Move shaped weights from IntSimdMatrix to WeightMatrix
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 21:34:37 +01:00
Stefan Weil
c4de29d16f unittest: Allow more time for apiexample_test when using a debug build
OCR of an image needs much more time than 55 s when running with
a debug build without optimisations on a slow host.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 17:56:35 +01:00
Stefan Weil
e67751633a unittest: Fix comment
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 17:56:35 +01:00
Stefan Weil
a5283f293d Add test for the C++ implementation of MatrixDotVector
Check also whether the sum of all results matches the expected value.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 17:56:35 +01:00
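The check described in commit a5283f293d (compare the sum of all results with an expected value) can be sketched with a plain integer matrix-vector product. `MatVec` and `SumOf` are hypothetical, simplified stand-ins; the real MatrixDotVector in Tesseract takes different arguments:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference: one output per weight row, computed as the dot
// product of that row with the input vector.
std::vector<int> MatVec(const std::vector<std::vector<std::int8_t>>& weights,
                        const std::vector<std::int8_t>& input) {
  std::vector<int> output;
  output.reserve(weights.size());
  for (const auto& row : weights) {
    int total = 0;
    for (std::size_t j = 0; j < row.size() && j < input.size(); ++j) {
      total += row[j] * input[j];
    }
    output.push_back(total);
  }
  return output;
}

// Sums all outputs so a test can compare a single number with a known value.
int SumOf(const std::vector<int>& values) {
  int sum = 0;
  for (int value : values) {
    sum += value;
  }
  return sum;
}
```

Comparing one aggregate value is a cheap additional safeguard on top of element-wise comparisons between implementations.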
Stefan Weil
5d3d251267 Fix build for unittest
Debug builds failed because libpthread (needed for googletest) was missing.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-14 17:56:35 +01:00
Stefan Weil
5dd606c631 Replace NULL by nullptr
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-01 22:45:49 +01:00
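One reason for a change like commit 5dd606c631 is that `nullptr` has its own type and always selects pointer overloads. A minimal sketch with a hypothetical overloaded function `Which`:

```cpp
// Hypothetical overload set used only to show overload resolution.
int Which(int) { return 0; }
int Which(const char*) { return 1; }

// Which(nullptr) selects the pointer overload.
// Which(NULL) may be ambiguous or select the int overload, depending on how
// the platform defines NULL.
int CallWithNullptr() { return Which(nullptr); }
```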