366 Commits

Author SHA1 Message Date
Stefan Weil
e0ce040832 Replace remaining STRING by std::string in src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Egor Pugin
26ceeef6c0 [training] Modernize. 2021-03-14 23:47:42 +03:00
Egor Pugin
bcebf04f8e [unittest] Use more smart ptrs, more std::make_unique instead of .reset(new T()). 2021-03-14 23:06:19 +03:00
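As a rough illustration of the pattern named in the commit above, the following sketch contrasts `.reset(new T())` with `std::make_unique`. The `Widget` type and both function names are hypothetical and do not come from the Tesseract sources.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type used only for illustration; not taken from Tesseract.
struct Widget {
  explicit Widget(std::string n) : name(std::move(n)) {}
  std::string name;
};

// Older style: create an empty smart pointer, then reset it with a raw new.
std::unique_ptr<Widget> MakeWithReset() {
  std::unique_ptr<Widget> w;
  w.reset(new Widget("reset"));
  return w;
}

// Newer style: std::make_unique allocates and wraps the object in one
// expression, so no raw new appears in the calling code.
std::unique_ptr<Widget> MakeWithMakeUnique() {
  return std::make_unique<Widget>("make_unique");
}
```

Both functions return an owning pointer to an equivalent object; the second form is shorter and keeps allocation and ownership together.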
Stefan Weil
3b0759940c Replace more STRING by std::string
Remove STRING::add_str_int and STRING::add_str_double which are now unused.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:16:35 +01:00
Stefan Weil
c9f0da49ca Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Egor Pugin
1d5b083447 [clang-format] Format unit tests. 2021-03-13 00:06:34 +03:00
Stefan Weil
b68a2a7b47 Fix tatweel_test for C++-20
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 13:16:48 +01:00
Egor Pugin
ce058604ba Pass empty strings into Tesseract::init_tesseract(). 2021-03-10 15:21:03 +03:00
Stefan Weil
c12dde2862 Use float instead of double for learning_rate, momentum and adam_beta
Only WeightMatrix::Update used double parameters, all other functions
already used float. So this change avoids unnecessary conversions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
Stefan Weil
ea446b1eae Remove blanks at line endings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:05:36 +01:00
Stefan Weil
0b8e937655 Use countof to get number of array elements
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
bc69e28de3 Update include statements for external header file allheaders.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-13 10:17:20 +01:00
Stefan Weil
971c6e6d6b automake: Flat build for unittest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-24 18:21:37 +01:00
Shree Devi Kumar
53e1ae9ebf Fix Memory leak in ligature_table_test 2021-01-24 18:17:06 +01:00
Stefan Weil
139d127ff7 Remove unneeded include statement for genericvector.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:29:57 +01:00
Stefan Weil
71fb535427 Remove unneeded include statement for strngs.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:29:57 +01:00
Stefan Weil
5a3d6e5e0d Fix memory leak in mastertrainer_test (fixes issue #3215)
The issue was introduced in commit 6e9456415.

Partially reverting this commit fixes it.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 14:54:38 +01:00
Shreeshrii
5d8594cd80 Reduce number of INFO messages from lstm_test (#3250) 2021-01-22 07:59:02 +01:00
Shree Devi Kumar
e07c99d874 Replace deprecated INSTANTIATE_TEST_CASE_P 2021-01-20 18:03:05 +00:00
Shree Devi Kumar
fe4951e4d5 Do not run textlineprojection_test with disable-legacy, uses OSD 2021-01-15 13:04:38 +00:00
Shree Devi Kumar
4df7021d98 Remove unnecessary subtest with missing input image 2021-01-14 15:38:55 +01:00
Stefan Weil
a522377993 Fix stringrenderer_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-10 23:22:45 +01:00
Stefan Weil
59b3a79e0b Fix ligature_table_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-10 23:22:45 +01:00
Stefan Weil
3851e30a48 Fix pango_font_info_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-10 23:22:45 +01:00
Shree Devi Kumar
3c71749b86 Delete TESSDATA_BEST_DIR macro 2021-01-08 20:25:26 +01:00
Stefan Weil
e46141ac99 Replace snprintf by strncpy (fix compiler warning)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-08 19:01:26 +01:00
Stefan Weil
ea4f9de4f4 Add include path for leptonica for fuzzer build
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-07 22:30:09 +01:00
Egor Pugin
9710bc0465 More std::vector. 2021-01-07 13:57:57 +03:00
Stefan Weil
66128429e5 Fix include statement for allheaders.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-05 23:18:39 +01:00
Stefan Weil
d000df7e00 Remove remaining parts of tessopt (fix autotools build)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-05 23:06:17 +01:00
Egor Pugin
db43bb43dc [test] Init FC early. 2021-01-06 00:30:52 +03:00
Egor Pugin
e6b00e6579 [test] Init fontconfig early. 2021-01-05 20:48:09 +03:00
Egor Pugin
6e94564152 [training] More unique ptrs. 2021-01-05 17:03:26 +03:00
Egor Pugin
4415209fd6 Remove tessopt. This fixes mastertrainer test in shared build. 2021-01-05 17:00:27 +03:00
Egor Pugin
d30b5415fd Reorder headers. 2021-01-05 16:46:24 +03:00
Egor Pugin
4ed601956e More std::vector. 2021-01-05 14:46:11 +03:00
Egor Pugin
e3dcfb648a Reorder includes. 2021-01-04 18:11:23 +03:00
Stefan Weil
a96a05df7a Add some basic tests for ELIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-03 22:02:51 +01:00
Stefan Weil
4186978dfc Add Leptonica library for ligature_table_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-03 21:45:27 +01:00
Egor Pugin
fd8907471c Improve C API. Add tests.
1. Add simple C API test in C++ program.
2. Add simple C API test in C program.
3. Fix including capi.h in C++ files.
2021-01-02 03:57:25 +03:00
Stefan Weil
47af1282f4 Make autotools builds for unittest less noisy by default
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-31 18:17:25 +01:00
Stefan Weil
19213e23a0 Fix broken autotools build for unittest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-31 17:53:01 +01:00
Egor Pugin
a32c8b2d93 Remove GenericVector::compare_callback. This fixes several tests after previous commit. 2020-12-31 17:26:40 +03:00
Egor Pugin
c86325e2f7 Use TESS_API for every public symbol. Public symbol is exported from the library. This also applies to unit test and training symbols. Users will be limited to public api, but set of exported symbols will be wider still.
Remove TESS_LOCAL.
Fix several symbol issues that made visible with these changes.

All build systems must set -fvisibility-hidden for *nix systems.
2020-12-31 16:32:29 +03:00
Egor Pugin
3a66282e92 Remove GOOGLE_TESSERACT ifdefs. 2020-12-31 14:23:52 +03:00
Stefan Weil
fc4002dda8 Remove helpers.h from public API
Remove also outdated references to apitypes.h which no longer exists.

Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-31 09:06:16 +01:00
Egor Pugin
7b8af67eb5 [test] Fix intsimdmatrix test. Update result value based on updated TRand engine. 2020-12-31 03:28:36 +03:00
Stefan Weil
eb9349a0eb Run more unittests without requiring tensorflow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 20:10:26 +01:00
Stefan Weil
a520b2a2fa Improve CHECK macro for unittest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 20:08:34 +01:00
Stefan Weil
f7d7aa6b95 Make tmp directory for all unit tests
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 18:17:58 +01:00
Egor Pugin
b7df4bc1dd [test] Create tmp dir in more tests. 2020-12-30 16:44:59 +03:00
Egor Pugin
a3f8172918 [test] Remove set locale as it causes errors on some systems. It includes grouping for numbers, then pid and some other numbers in gtest are formatted incorrectly. 2020-12-30 16:30:40 +03:00
Egor Pugin
aacd8ec3cf Fix more lstm tests. 6 failing tests left. 2020-12-30 15:15:11 +03:00
Egor Pugin
79226fa7cf [test] Fix params model test. 2020-12-30 14:20:15 +03:00
Egor Pugin
7300e87f3e Merge branch 'master' of github.com-egorpugin:tesseract-ocr/tesseract 2020-12-30 14:16:33 +03:00
Egor Pugin
14cc5fca5a [test] Fix shapetable test. 2020-12-30 14:16:10 +03:00
Stefan Weil
688ef20f62 Replace GenericVector<RowInfo> by std::vector<RowInfo>
This fixes an LGTM alert:

    This parameter of type RowInfo is 144 bytes -
    consider passing a const pointer/reference instead.

It might also improve the performance.

Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 12:14:43 +01:00
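The LGTM alert quoted in the commit above concerns passing a large object by value. A minimal sketch of the difference follows, assuming a hypothetical 144-byte struct `BigRow` (the real `RowInfo` layout is not shown in this log).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a large record: 18 doubles occupy 144 bytes
// on platforms where double is 8 bytes wide.
struct BigRow {
  double values[18];
};

// Pass by value: each call copies the whole struct.
double SumByValue(BigRow row) {
  double sum = 0.0;
  for (double v : row.values) sum += v;
  return sum;
}

// Pass by const reference: no copy is made and the result is the same.
double SumByConstRef(const BigRow& row) {
  double sum = 0.0;
  for (double v : row.values) sum += v;
  return sum;
}
```

Whether the copy matters in practice depends on how often the function is called and on what the compiler can optimise away, which fits the commit's cautious "might also improve the performance".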
Egor Pugin
fa776eefd9 [test] Disable loading equ.traineddata in equationsdetect test until IdentifySpecialText is turned back on. 2020-12-30 14:12:49 +03:00
Egor Pugin
b538a25809 [test] Reorder includes. 2020-12-30 13:53:49 +03:00
Stefan Weil
3a34f17037 Order and clean include statements
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 10:50:39 +01:00
Stefan Weil
deec8ef46f Replace std::list by std::vector
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 07:10:29 +01:00
Stefan Weil
4043204c2b Use old genericvector.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 07:10:29 +01:00
Egor Pugin
7e3ea8e3d3 [test] Fix bitvector test by creating tmp dir. 15 failing tests left. 2020-12-30 03:39:07 +03:00
Egor Pugin
3817fed897 [test] Reorder includes. 2020-12-30 03:33:38 +03:00
Egor Pugin
dc9bfde8ec [test] Fix mkdir on unix in dawg test. 2020-12-30 03:33:28 +03:00
Egor Pugin
f8957ebcc5 [test] Fix dawg. 2020-12-30 02:38:11 +03:00
Egor Pugin
694f0097fd Fix baseapi test. Use C++ regex instead of gtest ones. 2020-12-30 01:28:50 +03:00
Stefan Weil
f4e380f64a Remove serialis.h from public API
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-29 11:28:50 +01:00
Stefan Weil
90af3e7b5c Remove strngs.h from public API
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
fec9c11c8c Use std::vector, std::string in baseapi.h
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
64e902ddf7 Remove genericvector.h from public API
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
085f6b2572 Use std::list for paragraph models
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Egor Pugin
98974a6913 [test] Fix include order. 2020-12-28 20:36:04 +03:00
Egor Pugin
4dcfb5006c [test] Correctly use assert instead of expect. 2020-12-28 03:24:05 +03:00
Egor Pugin
3187f2ef08 Move doubleptr.h to unittests as it is used only there. 2020-12-28 02:32:27 +03:00
Egor Pugin
6cc00aa332 Improve some unit tests. 2020-12-28 01:11:13 +03:00
Stefan Weil
2fe1532926 Fix some compiler errors for heap_test (more remaining)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-27 18:30:56 +01:00
Stefan Weil
a61d7ac2ee Add / fix namespace tesseract for unittest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-27 10:54:43 +01:00
Stefan Weil
5c579de68a Fix dependency on tmp directory for unittest programs
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-24 08:54:54 +01:00
Stefan Weil
30e3f10b3f Fix tar command for variants which require -j or -z
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-23 20:06:18 +01:00
Stefan Weil
49deadd799 Simplify code for equationdetect_test
It no longer depends on TensorFlow code, so it is now always enabled.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-23 17:03:06 +01:00
Stefan Weil
fef6004e6f Simplify code for cleanapi_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-23 17:01:07 +01:00
Stefan Weil
ce8ee86204 Remove unwanted # at EOL
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-23 16:44:23 +01:00
Stefan Weil
2bfa52d517 Force fontconfig pangocairo backend for stringrenderer_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-23 12:59:57 +01:00
Stefan Weil
0d1e540267 Force fontconfig pangocairo backend for ligature_table_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-23 12:59:57 +01:00
Stefan Weil
4ce4e5ef66 Add more dependencies for unittest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-23 12:59:57 +01:00
Stefan Weil
5aec08d9f2 Add rules to get fonts required for unittest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-23 12:59:57 +01:00
Stefan Weil
00a09c2f42 Force fontconfig pangocairo backend for pango_font_info_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 21:32:05 +01:00
Stefan Weil
e75b217b37 Enable pango_font_info_test for unit tests
Most parts of that test can now be used without Tensorflow code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 08:29:53 +01:00
Stefan Weil
e66243fcea Fix unittest for flag training build
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-21 16:24:02 +01:00
Stefan Weil
0b97bc5c16 Fix include statements for Leptonica header
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-13 13:26:36 +01:00
Stefan Weil
6fcf8d23bc Use more compiler and linker flags from pkg-config
This fixes some build issues with Homebrew on MacOS.

Signed-off-by: Stefan Weil <stefan@Sabines-Mac-mini.fritz.box>
2020-12-13 13:24:46 +01:00
Stefan Weil
b303dd6ac2 Add more patterns to suppress memory leaks from libfontconfig
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-04 13:30:58 +01:00
Stefan Weil
5eb5e6ea23 Suppress some LeakSanitizer errors in unit tests
The fontconfig library has some (intentional) memory leaks which
must be suppressed for unit tests with the LeakSanitizer.

This fixes the issues #3156 and #3157.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-04 07:25:49 +01:00
Shree Devi Kumar
31710098e3 fixes issue 3099 2020-11-23 13:30:26 +00:00
Stefan Weil
92b6c652f3 Use std::vector for scales_
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-29 08:00:11 +01:00
Stefan Weil
c15dd26b84 Don't pass scales_ to IntSimdMatrix::Init
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-28 20:35:53 +01:00
Stefan Weil
fe76142a3d Remove GenericVector::scale() again
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-28 16:24:59 +01:00
Robin Watts
872816897a Rejig intsimdmatrix to reduce FP ops.
Avoid 1) floating point division by 127, 2) conversion of
bias to double, 3) FP addition, in favour of 1) integer
multiplication by 127, and 2) integer addition.

(Also costs extra work in the serialisation/deserialisation of
the scale values, and conversion of weights to int formats, but
these are all one offs).
2020-10-12 04:30:46 -07:00
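One possible reading of the arithmetic change described in the commit above is sketched below. It assumes the factor 1/127 is folded into the stored scale once, which would match the mention of one-off work when (de)serialising scale values. The function and parameter names are hypothetical, not the Tesseract API.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Before (as described): divide the integer dot product by 127 in floating
// point, add the bias as a double, then apply the scale.
double OutputOld(int32_t dot, int8_t bias, double scale) {
  return (static_cast<double>(dot) / 127.0 + bias) * scale;
}

// After (as described): multiply the bias by 127 and add it in integer
// arithmetic; the division by 127 is assumed to be pre-applied to the scale.
double OutputNew(int32_t dot, int8_t bias, double scale_over_127) {
  int32_t total = dot + static_cast<int32_t>(bias) * 127;
  return total * scale_over_127;
}
```

The two forms are algebraically equal, since (dot / 127 + bias) * scale = (dot + 127 * bias) * (scale / 127); results can differ only by floating-point rounding.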
Robin Watts
9dfdac51c6 Tweak scales array for intSimdMatrix case.
Currently, the size of the scales array is not rounded up
in the same way as the weights are. This blocks us pushing
the scale calculations into the SIMD, as when we "overread"
the end of the scale array, we potentially get errors.

Here, we adjust the intSimdMatrix stuff to ensure that the
scales array reserves enough entries to allow such overreads
to work.

This doesn't make any difference for now, but opens the way
for future optimisations.
2020-10-12 11:47:16 +01:00
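The idea of reserving extra entries so that a SIMD loop may safely read past the logical end of the scales array can be sketched as follows. `RoundUp`, `PadScales`, and the group size are hypothetical names and values chosen for illustration, not the actual Tesseract code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of factor (factor must be non-zero).
std::size_t RoundUp(std::size_t n, std::size_t factor) {
  return (n + factor - 1) / factor * factor;
}

// Pad the scales with zeros so that their count is a multiple of the number
// of outputs a SIMD kernel processes at once. Reads of the padding entries
// then stay inside the allocation.
std::vector<double> PadScales(std::vector<double> scales, std::size_t group) {
  scales.resize(RoundUp(scales.size(), group), 0.0);
  return scales;
}
```

For example, three scales padded for a group size of eight give a vector of eight entries, with the last five set to zero.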