Stefan Weil
6bf5080d4c
Remove unused include statements for strngs.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 23:11:08 +01:00
Egor Pugin
11a55c6c79
[readme] Require C++17 for building.
2021-03-13 00:56:40 +03:00
Egor Pugin
a393df5038
Add missing export header.
2021-03-13 00:07:19 +03:00
Egor Pugin
2d10be5209
[clang-format] Format generated protobuf source.
2021-03-13 00:07:03 +03:00
Egor Pugin
1d5b083447
[clang-format] Format unit tests.
2021-03-13 00:06:34 +03:00
Egor Pugin
618b185d14
Include missing config_auto.h
2021-03-12 23:39:18 +03:00
Egor Pugin
8b0c5405e2
Add missing forward decl.
2021-03-12 22:35:30 +03:00
Egor Pugin
0eb7ba88bf
[clang-format] Execute clang format on include and src dirs.
...
Script:
find include src -type f | sort > all.txt
find include src -type f | grep -v "\.cpp" | grep -v "\.h" | sort > skip.txt
comm -23 all.txt skip.txt | xargs clang-format -i
2021-03-12 22:35:02 +03:00
Egor Pugin
afa476bc23
[clang-format] Update config.
2021-03-12 22:33:22 +03:00
Egor Pugin
0e9deb68c9
Revert "Format public API files with 'clang-format-11 -i include/tesseract/*.h'"
...
This reverts commit c20da5e10f.
2021-03-12 20:20:34 +03:00
Stefan Weil
c20da5e10f
Format public API files with 'clang-format-11 -i include/tesseract/*.h'
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 13:26:38 +01:00
Stefan Weil
b68a2a7b47
Fix tatweel_test for C++-20
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 13:16:48 +01:00
Stefan Weil
4c6cc5a04d
Replace GenericVector by std::vector in class ImageData
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 13:10:25 +01:00
Egor Pugin
520aeb34aa
Merge pull request #3323 from Shreeshrii/ci
...
Actions CI: Add vcpkg build for tesseract 4.1 (windows and linux)
2021-03-12 11:51:44 +03:00
Shree
33c129f50f
Actions CI: comment #push
2021-03-12 05:02:55 +00:00
Shree
edf6e0f433
Actions CI: Add vcpkg build for tesseract 4.1
2021-03-12 04:59:41 +00:00
Stefan Weil
fc00834920
autobuild: Require C++17
...
This completes commit 73a325494e.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-11 21:57:02 +01:00
Ger Hobbelt
779aa79350
Fix build (#3322)
...
* fix errors after merge commit: missing changes that are needed too to make this codebase compile.
* Update src/wordrec/wordrec.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-11 21:43:07 +01:00
Egor Pugin
3444618075
Fix linux build.
2021-03-10 15:35:13 +03:00
Egor Pugin
ce058604ba
Pass empty strings into Tesseract::init_tesseract().
2021-03-10 15:21:03 +03:00
Egor Pugin
911dd93f12
Pass init strings as std::string instead of const char * internally. This does not affect public APIs.
2021-03-10 15:17:00 +03:00
Egor Pugin
9792f3c4ff
Remove STRING::size() method.
2021-03-10 14:58:37 +03:00
Egor Pugin
6de97309a1
Remove unused STRING::strdup().
2021-03-10 14:42:50 +03:00
Egor Pugin
f0e30a2af2
Remove unused STRING::unsigned_size().
2021-03-10 14:41:31 +03:00
Egor Pugin
d36adf3d40
Replace STRING::truncate_at() with resize().
2021-03-10 14:40:28 +03:00
Egor Pugin
e9a2fc0083
More std::string replacements.
2021-03-10 14:36:59 +03:00
Egor Pugin
73a325494e
[cmake] Require C++17.
2021-03-10 00:41:47 +03:00
Stefan Weil
0f1296c6f6
Clean implementation for (de-)serialization of a vector
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-08 13:33:48 +01:00
Egor Pugin
0cd6a07e42
Update .travis.yml
2021-03-08 03:02:25 +03:00
Stefan Weil
6cfe604d58
Fix serialization for vector of RecodedCharID
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-07 23:01:25 +01:00
Shreeshrii
33868a52ae
Travis: build linux matrix (#3320)
2021-03-07 19:31:02 +01:00
Egor Pugin
576c064b44
Merge pull request #3318 from Shreeshrii/travis
...
Add multiple architectures for travis run
2021-03-06 12:20:25 +03:00
Shree Devi Kumar
4fd0bca6c9
Add multiple architectures for travis run
2021-03-06 08:30:14 +00:00
Stefan Weil
0cde3ede98
Add heuristic to fix swap (partially fixes issue #2586)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
a2769aebb4
Replace GenericVector<TBOX> by std::vector<TBOX>
...
Fix also endianness handling for (de)serialisation of TBOX.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
c31c1a7d60
Fix two compiler warnings for serialis.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
fe614c6069
Enable less FP exceptions for clang compiler when running tesseract
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-03 22:56:07 +01:00
Egor Pugin
c39b1daa6b
GenericVector -> std::vector.
2021-03-03 22:22:00 +03:00
Egor Pugin
0a693a9519
Allow to serialize std vectors with classes from TFile. Implementation from GenericVector.
2021-03-03 22:21:40 +03:00
Stefan Weil
ff830775f9
Fix memory leak in DocumentCache
...
It was introduced in commit 5cac52173e.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-01 11:31:48 +01:00
Stefan Weil
339c01894e
Avoid fp division by 0 (fix issue #3314)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-28 19:42:01 +01:00
Egor Pugin
838a754d24
Merge pull request #3313 from stweil/learning_rate
...
Add new checks for floating point errors and fix a division by zero
2021-02-27 23:20:09 +03:00
Stefan Weil
cd60728e8a
Avoid float division by zero when calculating adaptive learning rate
...
The following line results in a division by zero when
momentum is -1 and num_samples is even:
learning_rate /= 1.0f - pow(momentum, num_samples);
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
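The failure mode described in commit cd60728e8a can be reproduced in isolation. The sketch below is illustrative only: the helper names `CorrectionDenominator` and `CorrectedLearningRate` are hypothetical, and the zero guard is one possible way to avoid the division, not necessarily the fix the commit applied.

```cpp
#include <cmath>

// Hypothetical helper (not Tesseract code): the denominator from the commit
// message, 1 - momentum^num_samples, computed in float.
inline float CorrectionDenominator(float momentum, int num_samples) {
  return 1.0f - std::pow(momentum, static_cast<float>(num_samples));
}

// Hypothetical guarded variant: leaves the learning rate unchanged when the
// denominator is exactly zero instead of dividing by it.
// Assumption: this guard only illustrates the idea; the real fix may differ.
inline float CorrectedLearningRate(float learning_rate, float momentum,
                                   int num_samples) {
  const float denominator = CorrectionDenominator(momentum, num_samples);
  if (denominator == 0.0f) {
    return learning_rate;
  }
  return learning_rate / denominator;
}
```

With `momentum == -1` and an even `num_samples`, `std::pow` returns exactly 1, so the denominator is exactly 0 and an unguarded division would raise the floating point division-by-zero exception.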
Stefan Weil
c12dde2862
Use float instead of double for learning_rate, momentum and adam_beta
...
Only WeightMatrix::Update used double parameters, all other functions
already used float. So this change avoids unnecessary conversions.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
Stefan Weil
422452b9f4
Check for float errors when running tesseract and lstmtraining
...
Some illegal floating point calculations like division by zero,
illegal value or overflow will now abort tesseract with an error
message.
For lstmtraining there is now a new parameter --debug_float to
enable the same kind of checks. It is currently disabled by default
because such errors occur and would abort the training process.
That should be fixed in the future.
If tesseract also shows floating point errors which cannot be
fixed easily, a similar parameter to enable the checks can be
added there, too.
The new code requires the function feenableexcept which is only
available with the GNU libc, so it is only used on Linux.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:49:27 +01:00
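A minimal sketch of the mechanism that commit 422452b9f4 describes, assuming a build against the GNU libc with g++ (which defines `_GNU_SOURCE` by default). The function names `EnableFloatChecks` and `FloatChecksEnabled` are hypothetical and do not come from the Tesseract sources. The exception set follows the commit text: division by zero, illegal value, and overflow.

```cpp
#include <fenv.h>  // glibc declares feenableexcept() and fegetexcept() here

#if defined(__GLIBC__)
// Exceptions that should abort the program, as listed in the commit message.
constexpr int kCheckedExceptions = FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW;
#endif

// Hypothetical helper: turns the selected floating point exceptions into
// traps (SIGFPE). Returns false where feenableexcept is unavailable or fails.
inline bool EnableFloatChecks() {
#if defined(__GLIBC__)
  return feenableexcept(kCheckedExceptions) != -1;
#else
  return false;  // Not the GNU libc: no checks are enabled.
#endif
}

// Hypothetical helper: reports whether all selected traps are active.
inline bool FloatChecksEnabled() {
#if defined(__GLIBC__)
  return (fegetexcept() & kCheckedExceptions) == kCheckedExceptions;
#else
  return false;
#endif
}
```

Once the traps are enabled, an offending calculation terminates the process with SIGFPE unless a handler is installed. That matches the commit's reason for keeping the check optional in lstmtraining.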
Stefan Weil
51a214a51b
Remove unused include statements for imagedata.h and document used ones
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:28 +01:00
Stefan Weil
1d7a981203
Disable code for unused classes WordFeature and FloatWordFeature
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:17 +01:00
Stefan Weil
5cac52173e
Replace PointerVector by std::vector in class DocumentCache
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:07 +01:00
Stefan Weil
387acd9881
Initialize weight matrix with 0.0 (fix issue #3229)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 18:49:39 +01:00
Egor Pugin
1ab6b0fbc6
Merge pull request #3311 from stweil/master
...
Replace calls of exit function
2021-02-26 17:43:53 +03:00