Stefan Weil
7178ebd799
Add missing TESS_API for new function tesseract::split
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-14 08:16:33 +01:00
Stefan Weil
36f9131e04
Move implementation of tesseract::split from header to cpp file
...
This fixes duplicate symbols for some builds.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:39:58 +01:00
Stefan Weil
3b0759940c
Replace more STRING by std::string
...
Remove STRING::add_str_int and STRING::add_str_double which are now unused.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:16:35 +01:00
Stefan Weil
c9f0da49ca
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
91f7675848
Replace more STRING by std::string for src/ccmain
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
d084c7cca8
Replace remaining STRING by std::string for src/api
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
96d1644da1
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
a42c6c7dcd
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
9cf5b9870d
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
51909d5a2e
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
d6495d9026
Replace STRING by std::string in src/lstm
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:51 +01:00
Stefan Weil
1f2ec4dfb1
Fix network specification for NT_SYMCLIP
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 13:10:37 +01:00
Stefan Weil
6bf5080d4c
Remove unused include statements for strngs.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 23:11:08 +01:00
Egor Pugin
a393df5038
Add missing export header.
2021-03-13 00:07:19 +03:00
Egor Pugin
2d10be5209
[clang-format] Format generated protobuf source.
2021-03-13 00:07:03 +03:00
Egor Pugin
618b185d14
Include missing config_auto.h
2021-03-12 23:39:18 +03:00
Egor Pugin
8b0c5405e2
Add missing forward decl.
2021-03-12 22:35:30 +03:00
Egor Pugin
0eb7ba88bf
[clang-format] Execute clang format on include and src dirs.
...
Script:
find include src -type f | sort > all.txt
find include src -type f | grep -v "\.cpp" | grep -v "\.h" | sort > skip.txt
comm -23 all.txt skip.txt | xargs clang-format -i
2021-03-12 22:35:02 +03:00
Stefan Weil
4c6cc5a04d
Replace GenericVector by std::vector in class ImageData
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 13:10:25 +01:00
Ger Hobbelt
779aa79350
Fix build (#3322)
...
* fix errors after merge commit: missing changes that are needed too to make this codebase compile.
* Update src/wordrec/wordrec.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-11 21:43:07 +01:00
Egor Pugin
3444618075
Fix linux build.
2021-03-10 15:35:13 +03:00
Egor Pugin
ce058604ba
Pass empty strings into Tesseract::init_tesseract().
2021-03-10 15:21:03 +03:00
Egor Pugin
911dd93f12
Pass init strings as std::string instead of const char * internally. This does not affect public APIs.
2021-03-10 15:17:00 +03:00
Egor Pugin
9792f3c4ff
Remove STRING::size() method.
2021-03-10 14:58:37 +03:00
Egor Pugin
6de97309a1
Remove unused STRING::strdup().
2021-03-10 14:42:50 +03:00
Egor Pugin
f0e30a2af2
Remove unused STRING::unsigned_size().
2021-03-10 14:41:31 +03:00
Egor Pugin
d36adf3d40
Replace STRING::truncate_at() with resize().
2021-03-10 14:40:28 +03:00
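The replacement of STRING::truncate_at() with resize() in commit d36adf3d40 can be illustrated with a minimal sketch. The helper name TruncateAt is hypothetical and not taken from the code base. It assumes only that the old method cut the string at a given index, which std::string::resize also does when the index is smaller than the current length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Tesseract: shortens s to at most
// `index` characters by calling std::string::resize.
inline void TruncateAt(std::string &s, std::size_t index) {
  if (index < s.size()) {
    s.resize(index);  // drops everything from position `index` onward
  }
}
```

The size check keeps the helper from growing the string, since resize() with a larger value would pad it with null characters.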
Egor Pugin
e9a2fc0083
More std::string replacements.
2021-03-10 14:36:59 +03:00
Stefan Weil
0f1296c6f6
Clean implementation for (de-)serialization of a vector
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-08 13:33:48 +01:00
Stefan Weil
6cfe604d58
Fix serialization for vector of RecodedCharID
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-07 23:01:25 +01:00
Stefan Weil
0cde3ede98
Add heuristic to fix swap (partially fixes issue #2586)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
a2769aebb4
Replace GenericVector<TBOX> by std::vector<TBOX>
...
Fix also endianness handling for (de)serialisation of TBOX.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
c31c1a7d60
Fix two compiler warnings for serialis.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
fe614c6069
Enable less FP exceptions for clang compiler when running tesseract
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-03 22:56:07 +01:00
Egor Pugin
c39b1daa6b
GenericVector -> std::vector.
2021-03-03 22:22:00 +03:00
Egor Pugin
0a693a9519
Allow to serialize std vectors with classes from TFile. Implementation from GenericVector.
2021-03-03 22:21:40 +03:00
Stefan Weil
ff830775f9
Fix memory leak in DocumentCache
...
It was introduced in commit 5cac52173e.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-01 11:31:48 +01:00
Stefan Weil
339c01894e
Avoid fp division by 0 (fix issue #3314)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-28 19:42:01 +01:00
Stefan Weil
cd60728e8a
Avoid float division by zero when calculating adaptive learning rate
...
The following line results in a division by zero when
momentum is -1 and num_samples is even:
learning_rate /= 1.0f - pow(momentum, num_samples);
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
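The division by zero described in commit cd60728e8a can be reproduced in isolation. The sketch below is not the code from the commit. Both function names are hypothetical, and the guard that skips the correction when the denominator is zero is an assumption made for illustration, not the fix that was actually applied.

```cpp
#include <cmath>

// Hypothetical helper: the denominator from the commit message,
// 1.0f - pow(momentum, num_samples).
inline float BiasCorrectionDenominator(float momentum, int num_samples) {
  return static_cast<float>(1.0f - std::pow(momentum, num_samples));
}

// Hypothetical guarded variant (an assumption, not the actual fix):
// leaves the learning rate unchanged when the denominator is zero.
inline float CorrectedLearningRate(float learning_rate, float momentum,
                                   int num_samples) {
  const float denominator = BiasCorrectionDenominator(momentum, num_samples);
  if (denominator == 0.0f) {
    return learning_rate;  // avoid dividing by zero
  }
  return learning_rate / denominator;
}
```

With momentum -1 and an even num_samples, pow() yields exactly 1, so the denominator is 0. With an odd num_samples it yields -1 and the denominator is 2.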
Stefan Weil
c12dde2862
Use float instead of double for learning_rate, momentum and adam_beta
...
Only WeightMatrix::Update used double parameters, all other functions
already used float. So this change avoids unnecessary conversions.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
Stefan Weil
422452b9f4
Check for float errors when running tesseract and lstmtraining
...
Some illegal floating point calculations like division by zero,
illegal value or overflow will now abort tesseract with an error
message.
For lstmtraining there is now a new parameter --debug_float to
enable the same kind of checks. It is currently disabled by default
because such errors occur and would abort the training process.
That should be fixed in the future.
If tesseract also shows floating point errors which cannot be
fixed easily, a similar parameter to enable the checks can be
added there, too.
The new code requires the function feenableexcept which is only
available with the GNU libc, so it is only used on Linux.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:49:27 +01:00
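The floating point checks described in commit 422452b9f4 rely on feenableexcept from the GNU libc. A minimal sketch of that technique follows. The function name EnableFloatTraps and the exact exception mask are assumptions for illustration. The commit message only names division by zero, illegal value and overflow as the error kinds.

```cpp
#include <cfenv>

// Hypothetical helper, not the code from the commit: turns selected
// floating point exceptions into traps (SIGFPE) where glibc offers
// feenableexcept. Returns true if the traps were enabled.
inline bool EnableFloatTraps() {
#if defined(__GLIBC__)
  // feenableexcept returns the previous mask, or -1 on failure.
  return feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW) != -1;
#else
  return false;  // feenableexcept is a GNU extension
#endif
}
```

Once the traps are enabled, an operation such as a float division by zero terminates the process with SIGFPE instead of silently producing inf or NaN.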
Stefan Weil
51a214a51b
Remove unused include statements for imagedata.h and document used ones
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:28 +01:00
Stefan Weil
1d7a981203
Disable code for unused classes WordFeature and FloatWordFeature
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:17 +01:00
Stefan Weil
5cac52173e
Replace PointerVector by std::vector in class DocumentCache
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:07 +01:00
Stefan Weil
387acd9881
Initialize weight matrix with 0.0 (fix issue #3229)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 18:49:39 +01:00
Egor Pugin
1ab6b0fbc6
Merge pull request #3311 from stweil/master
...
Replace calls of exit function
2021-02-26 17:43:53 +03:00
Stefan Weil
58304cbfdd
Don't compile OpenCL code when OpenCL is disabled
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 15:40:23 +01:00
Stefan Weil
a6946c3bf9
Replace calls of exit function
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:22:36 +01:00
Stefan Weil
373a3527ec
Format code
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:22:09 +01:00
Stefan Weil
ea446b1eae
Remove blanks at line endings
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:05:36 +01:00