Commit Graph

2089 Commits

Author SHA1 Message Date
Stefan Weil
158c845228 Catch another FP division by 0 (fixes issue #3483)
Rewriting the code avoids FP operations (so makes it potentially faster)
and fixes the division by 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-03 15:37:24 +02:00
Stefan Weil
4b630a8813 Catch FP division by 0 (fixes issue #3483)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-02 15:04:31 +02:00
Stefan Weil
a701454ae5
Fix vector resize with init for all elements (issue #3473) (#3474)
Fixes: c8b8d266d6
Fixes: 9710bc0465
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-06-29 21:05:29 +03:00
nagadomi
ff1062d39d
Add --reset_learning_rate option to lstmtraining (#3470)
When the --reset_learning_rate option is specified,
it resets the learning rate stored in each layer of the network
loaded with --continue_from to the value specified by the --learning_rate option.
If a checkpoint is available, it does nothing.
2021-06-28 11:48:07 +03:00
nagadomi
d8bd78f8e2
Fix missing reset of best_error_history_ in LSTMTrainer::InitIterations() (#3469) 2021-06-27 09:26:32 +03:00
nagadomi
b2fa77f8f0
Show layer specified learning rates with combine_tessdata -l (#3468) 2021-06-26 08:08:54 +03:00
MonkeybreadSoftware
75e6c3ea4c
Null check for GetSourceYResolution (#3457)
* Null check for GetSourceYResolution

Added missing NULL check to avoid crash when we read property in our tesseract wrapper.

* Added missing return value.

Added -1 as the return value if undefined.
2021-06-16 16:35:24 +03:00
Amit Dovev
bf979c801a Remove unused variable 2021-05-21 20:34:09 +03:00
Egor Pugin
a72408fdef
Merge pull request #3438 from amitdo/pango
Raise Minimum required Pango version to 1.38.0
2021-05-21 20:09:27 +03:00
Amit Dovev
8615f65cc4 Raise Minimum required Pango version to 1.38.0 2021-05-21 19:56:37 +03:00
Amit Dovev
c24538518c ThresholdMethod::TiledSauvola -> ThresholdMethod::Sauvola
The fact that this method uses tiles is an implementation detail. It does not change the result compared to Sauvola without tiles. The use of tiles minimizes memory consumption.
2021-05-21 18:15:30 +03:00
Stefan Weil
93348a83a3 Remove scripts for training
They were replaced by Python3 scripts (part of the tesstrain repository).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-18 10:47:44 +02:00
nagadomi
42e4b91132 Refactor ObjectCache::DeleteUnusedObjects with reverse iterator 2021-05-17 14:50:30 +02:00
nagadomi
dc4a8a6ce0 Fix crash in ObjectCache::DeleteUnusedObjects 2021-05-17 10:25:17 +09:00
Stefan Weil
0c4e2f1cb5 Fix comment in code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-16 07:47:19 +02:00
Stefan Weil
57b7974292 Remove an arbitrary limit for the image size
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-15 15:03:22 +02:00
Stefan Weil
a0cf117c5d Fix compiler warning in binarization code (uninitialized local variable)
Also simplify the code a little bit.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-15 15:03:22 +02:00
Stefan Weil
bf84fb9f2d Optimize code for binarization
Some code is only needed for Otsu or even not at all.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-15 15:03:22 +02:00
Stefan Weil
4b5dd25b84 Fix compiler warning
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-15 15:03:22 +02:00
Stefan Weil
12c29639fc Add conditional compilation with GRAPHICS_DISABLED
This fixes a compiler warning when GRAPHICS_DISABLED is defined.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-13 17:22:24 +02:00
Nick White
ad7010a5eb lstmeval: Only print char and word error rates for verbosity 2/3 2021-05-11 13:15:35 +01:00
Nick White
4787414d88 lstmeval: Print char and word error rates for each line tested 2021-05-11 10:54:34 +01:00
Nick White
9c82cc63c2 Switch to NFC normalisation everywhere 2021-05-11 10:18:06 +01:00
Egor Pugin
43747d6ea8 Postfix for #3418. 2021-05-10 15:06:27 +03:00
Egor Pugin
e7c01a6f15
Merge pull request #3418 from amitdo/thresholder
Add more binarization options
2021-05-10 14:45:03 +03:00
Amit Dovev
21e76c7a13 Convert enum ThreshMethod to enum class 2021-05-09 18:49:09 +03:00
Egor Pugin
176d0927bd Allow explicit casts of Image to Pix**. 2021-05-07 21:30:42 +03:00
Amit Dovev
11c73c9481 Add more binarization options
Use functions from Leptonica to provide more binarization options. The new options are: 1) Adaptive Otsu and 2) Sauvola (Tiled).
2021-05-07 16:48:26 +03:00
Egor Pugin
65118b2e3a [misc] Fix variable type. Fixes warning. 2021-05-04 16:12:40 +03:00
Egor Pugin
346b77c94e Remove unneeded header. 2021-05-04 16:10:52 +03:00
Egor Pugin
4fbe9f1de2 Revert d6cdc52. Fixes #3412. 2021-05-04 00:51:39 +03:00
Ger Hobbelt
bd8adff829 fix compile error: PrintFontsTable() is for legacy builds only
# Conflicts:
#	googletest
2021-04-29 23:27:20 +02:00
Lucas Cimon
b852d658cb Adding --print-fonts-table parameter & tessedit_font_id configuration option 2021-04-29 11:25:40 +02:00
Stefan Weil
2e2a5b3ef4 Improved fix for issue #3405
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-27 22:15:36 +02:00
Stefan Weil
0b7fc068d2 Revert "Fix double free. Closes #3405."
This reverts commit 3997cf54d2.
It will be replaced by a simpler fix.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-27 22:15:18 +02:00
Egor Pugin
3a195e5b05 Misc. 2021-04-27 22:08:29 +03:00
Egor Pugin
3997cf54d2 Fix double free. Closes #3405. 2021-04-27 22:08:06 +03:00
Egor Pugin
e3ac1835e0 Remove unneeded ctor. 2021-04-23 04:26:18 +03:00
Egor Pugin
a7f938d28e Make FontSet just a vector. 2021-04-23 04:25:45 +03:00
Egor Pugin
4ae5a7d6b5 Properly init font set. 2021-04-23 04:05:59 +03:00
Egor Pugin
048e63c02b Replace FontSet struct with vector. It may be improved further (remove pointer?). 2021-04-23 02:38:25 +03:00
Egor Pugin
d6cdc521e5 Remove unused headers. 2021-04-23 02:06:06 +03:00
Stefan Weil
740d10b61b Fix issue #3404 (empty page regression)
The regression was caused by a bug in commit 5db92b26aa.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-22 20:51:23 +02:00
Stefan Weil
66a963b50a Remove two assertions which are triggered by fuzzing
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-20 19:04:49 +02:00
Stefan Weil
26c21a6db4 Fix some compiler warnings with GRAPHICS_DISABLED
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-20 07:58:31 +02:00
Stefan Weil
6d0595b443 Fix memory leak (OSS-Fuzz issue 33220)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-19 20:59:18 +02:00
Robert Pösel
c74ff1259b Fix wrong parameter name and documentation
set_only_init_params -> set_only_non_debug_params
2021-04-19 16:55:01 +02:00
Stefan Weil
2dfa38a072 Fix old TODO for struct EDGEPT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-17 18:08:27 +02:00
Fabrizio Di Vittorio
2be896d2b9 Add SVSemaphore destructor to avoid system objects leaks 2021-04-15 09:23:22 +02:00
Stefan Weil
e6e871bc73 Replace pointer by value for ScrollView mutex
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-15 06:30:05 +02:00
Stefan Weil
4daf781916 Fix NULL pointer access (issue #3394)
The regression was caused by commit 57c90eee02.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 22:10:12 +02:00
Stefan Weil
91b2b4f4a0 Fix OSS-Fuzz issue 32142 (container-overflow write)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 13:45:12 +02:00
Stefan Weil
f83f00496e Clean, format and optimize code in edgblob.cpp / edgblob.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 08:03:30 +02:00
Egor Pugin
a732565cad Fix headers. 2021-04-12 01:40:40 +03:00
Egor Pugin
4f6ff85123 Remove unneeded header. 2021-04-12 01:19:00 +03:00
Egor Pugin
57c90eee02 [edgblob] Replace unique ptr with vector. Fix possible index issues.
Closes #1921.
2021-04-12 01:17:57 +03:00
Stefan Weil
cca46e6b29 Fix another use-after-free (issue #3394)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 21:37:46 +02:00
Stefan Weil
33fa9d3223 Fix use-after-free (issue #3394)
This bug was introduced by commit f77b1c6881.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 19:10:44 +02:00
Egor Pugin
423f00c351
Merge pull request #3393 from eighttails/fix_zero_division
Fix division by zero during CJK training.
2021-04-11 15:38:28 +03:00
Tadahito Yao
8a8204e62a Reverted one of zero value checks. 2021-04-11 21:30:02 +09:00
Tadahito Yao
05eef742df Fix division by zero during CJK training. 2021-04-11 20:14:45 +09:00
Stefan Weil
0401b9470c Fix some typos (most found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 11:06:36 +02:00
Stefan Weil
f77b1c6881 Fix memory leak (OSS-Fuzz issue #32246)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-10 21:35:31 +02:00
Amit D
a4a84c4c92
lstmrecognizer.cpp: Call OutputStats() only when 'invert' is true (#3387)
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 17:55:23 +02:00
Amit Dovev
e6ce048426 Change message from 'Found SSE' to 'Found SSE4.1' 2021-04-08 17:51:09 +02:00
Stefan Weil
63f4463028 Add const attribute to some functions (API change)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
253751c331 Simplify class REJ by replacing two std::bitset<16> by one std::bitset<32>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
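The idea behind commit 253751c331 (one std::bitset<32> instead of two std::bitset<16>) can be sketched as below. This is only an illustration: the class name RejectFlags and the flag names are hypothetical, since the log does not list the real REJ flags. With a single bitset, every flag is addressed by one index and no code has to decide which of two sets a flag belongs to.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical flag indices; the real REJ flags are not shown in the log.
enum RejectFlag : std::size_t {
  kFlagA = 0,
  kFlagB = 15,
  kFlagC = 16,  // would have lived in the second 16-bit set before
  kFlagD = 31,
  kFlagCount = 32
};

// Hypothetical stand-in for a class that stores all flags in one bitset.
class RejectFlags {
 public:
  void Set(RejectFlag flag) {
    assert(flag < kFlagCount);
    bits_.set(flag);
  }
  bool Test(RejectFlag flag) const {
    assert(flag < kFlagCount);
    return bits_.test(flag);
  }
  // True if any flag is set; one call instead of checking two sets.
  bool Any() const { return bits_.any(); }

 private:
  std::bitset<32> bits_;
};
```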
Stefan Weil
2fbcca783b Make more functions in class REJ inline
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
a74bbb6032 Remove bits16.h and BITS16 data type
Also add const attribute to some functions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
2fa96b765b Modernize and optimize list_rec a little bit
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 17:30:33 +02:00
Stefan Weil
7fd90498ca Modernize code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 17:30:33 +02:00
Egor Pugin
edfce72340 Refactor microfeatures a bit. 2021-04-07 17:29:46 +03:00
Egor Pugin
47715e576a Replace microfeatures from oldlist to std::forward_list. 2021-04-07 17:10:16 +03:00
Egor Pugin
2e17ee7327 Correct template args. 2021-04-07 13:28:57 +03:00
Stefan Weil
10255d013a Fix new / delete class mismatch
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 09:25:37 +02:00
Egor Pugin
b1731b6e73 Add missing TESS_API. 2021-04-07 00:59:36 +03:00
Egor Pugin
6e3259593a Reorder list templates. 2021-04-07 00:29:07 +03:00
Egor Pugin
409aa5296f Misc. 2021-04-07 00:17:04 +03:00
Egor Pugin
9d40512ade [elist2] Convert macros to template. Remove source file macro ELIST2IZE. 2021-04-07 00:15:01 +03:00
Egor Pugin
03435adca0 [elist] Rework macro into template and small macro. Move common iterator template into 'list_iterator.h'. 2021-04-07 00:04:30 +03:00
Egor Pugin
b9329e599f Misc. 2021-04-06 23:45:28 +03:00
Egor Pugin
746b87363b Remove unused methods. 2021-04-06 23:45:22 +03:00
Egor Pugin
29e75d0f51 [elist] Remove unused macros QUOTE_IT. 2021-04-06 23:40:56 +03:00
Egor Pugin
539f4b8255 [clist] Remove unused methods. 2021-04-06 23:40:35 +03:00
Egor Pugin
18e61d10ce Rework big clist macro into template and small macro. Remove unused macros QUOTE_IT and CLISTIZE (source file macro). 2021-04-06 23:37:14 +03:00
Raf Schietekat
6bbfef7c85 RAII: TessBaseAPI::GetIterator()
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 17:57:23 +02:00
Raf Schietekat
d71413f4aa RAII: TessBaseAPI::AnalyseLayout()
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 17:46:26 +02:00
Stefan Weil
897e59613d Clean code for hOCR renderer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 16:36:23 +02:00
Stefan Weil
3705989c94 Optimize length method for ELIST, ELIST2
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:57:12 +02:00
Stefan Weil
4104876b08 Add const attribute to some methods of ELIST, ELIST2 and related classes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:48:18 +02:00
Stefan Weil
fb904d2265 Remove redundant debug code for CLIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:26:04 +02:00
Stefan Weil
b47ce5643b Modernize CLIST code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:16:57 +02:00
Stefan Weil
fd187b0c18 Optimize CLIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:08:35 +02:00
Stefan Weil
4a628729b2 Delete assignment and copy constructor for ELIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:59:31 +02:00
Stefan Weil
b0b5600c30 Delete assignment and copy constructor for ELIST2, ELIST2_LINK
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:59:00 +02:00
Stefan Weil
24f91fab0b Delete assignment and copy constructor for CLIST, CLIST_LINK
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:42:01 +02:00
Stefan Weil
eeb67e8ae8 Replace find / insert by insert on unordered set to optimize GridSearch
Both find and insert can be slow for a large unordered set.

Instead of using both methods, it is sufficient to simply try only
the insert method which returns whether the insertion was possible
or not.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 18:11:33 +02:00
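The pattern described in commit eeb67e8ae8 can be sketched as follows. This is a minimal example, not the GridSearch code itself: the helper name MarkVisited and the use of const void* keys are assumptions made for illustration. It relies on the fact that std::unordered_set::insert returns a pair whose second member reports whether the element was newly inserted.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical helper, not Tesseract code: returns true the first time
// `item` is seen and false on every later call with the same pointer.
bool MarkVisited(std::unordered_set<const void*>& seen, const void* item) {
  assert(item != nullptr);
  // insert() returns {iterator, bool}; the bool is true only if the
  // element was not already present, so no separate find() is needed.
  return seen.insert(item).second;
}
```

Compared with calling find() and then insert(), this hashes the key and probes the table once per item instead of twice.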
Egor Pugin
50aec308b3 Remove unnecessary pointer hasher for uset. 2021-04-04 14:00:46 +03:00
Stefan Weil
0611c892b6 Disable more code with GRAPHICS_DISABLED
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-02 16:43:26 +02:00
Egor Pugin
7a73875bd1
Merge pull request #3375 from amitdo/viewer
Disable more code with GRAPHICS_DISABLED
2021-04-02 12:27:24 +03:00
Amit Dovev
6d94b22c80 Disable more code with GRAPHICS_DISABLED 2021-04-02 11:12:38 +03:00
Egor Pugin
34e0d017ab Add Image::operator&=(). 2021-04-01 19:15:58 +03:00
Egor Pugin
9e3da4a724 Add Image::operator|=(). 2021-04-01 19:10:48 +03:00
Egor Pugin
e077b7255d Remove arg from Image::copy(). 2021-04-01 19:08:47 +03:00
Egor Pugin
d5fb7f9843 Init variable. 2021-04-01 17:16:46 +03:00
Egor Pugin
fe02ba2363 Add Image::isZero(). 2021-04-01 17:15:48 +03:00
Egor Pugin
306d296979 Add Image::clone(). 2021-04-01 17:06:30 +03:00
Egor Pugin
2aca22439e Add Image::copy(). 2021-04-01 16:55:43 +03:00
Stefan Weil
5159f9aa12 Fix name conflict between class and function named Image
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-01 14:00:08 +02:00
Egor Pugin
e429b607ae [misc] Update header guard. 2021-04-01 01:36:22 +03:00
Egor Pugin
1628a9aae3 Revert 4fa05b9147. Make a note. 2021-04-01 01:35:50 +03:00
Egor Pugin
a792b67983 Basic usage of new Image class. Only pixDestroy is wrapped at the moment.
Add new methods to Image class and replace them in non-public code.
2021-03-31 22:39:43 +03:00
Egor Pugin
ce6e2f1821 Initial tesseract Image wrapper.
Provide basic Pix conversions.
Add destroy() method.

It can be extended later to 1) image owner (raii), 2) different image libraries.
2021-03-31 22:38:32 +03:00
Egor Pugin
4fa05b9147 Remove unused ifdef. 2021-03-31 21:54:12 +03:00
Stefan Weil
722767633e Partially fix issue #3374
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-31 19:23:07 +02:00
Stefan Weil
b7c6d971f3 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-31 07:08:53 +02:00
Stefan Weil
6684a727c1 Improve some structs further (fixes several CID issues)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-30 14:20:52 +02:00
Nick White
abea25ee2f lstm: Include missing header 2021-03-29 18:53:35 +02:00
Stefan Weil
2e349dbba5 Fix compilation for Tensorflow code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 16:19:06 +02:00
Stefan Weil
3c03d70e64 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 16:12:52 +02:00
Stefan Weil
f639500a81 Add missing TESS_API for sw builds
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:34:23 +02:00
Stefan Weil
5c4de14567 Replace strdup / free by std::string in SVSync::StartProcess
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:24:58 +02:00
Stefan Weil
3790413cc5 Replace remaining malloc / free in training code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:24:58 +02:00
Stefan Weil
7c1bea505a Replace strdup / free by std::string for StringRenderer::features_
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:24:58 +02:00
Stefan Weil
201686feb8 Use lept_free instead of free for memory which was allocated by Leptonica
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 10:55:33 +02:00
Stefan Weil
1b95eb1d19 Replace malloc / free by std::string for LABELEDLISTNODE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 10:29:08 +02:00
Stefan Weil
1620daffcd Replace malloc / free by std::string in LABELEDLISTNODE and MERGE_CLASS_NODE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 10:17:42 +02:00
Stefan Weil
0976e23387 Replace malloc / free by new / delete for KDTREE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 23:19:46 +02:00
Stefan Weil
c05d849381 Replace malloc / free by new / delete for NORM_PROTOS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 22:37:47 +02:00
Stefan Weil
174210c849 Replace malloc / free by new / delete for MFEDGEPT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 22:24:51 +02:00
Stefan Weil
0c3d244238 Replace new / delete by std::vector for INT_CLASS_STRUCT::ProtoLengths
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 22:09:06 +02:00
Stefan Weil
486c257f42 Replace malloc / free by new / delete for MICROFEATURE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 21:20:59 +02:00
Stefan Weil
30f44f333a Replace malloc / free by new / delete for KDNODE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 21:11:22 +02:00
Stefan Weil
47a1fd7b45 Replace malloc / free by new / delete for INT_CLASS_STRUCT::ProtoLengths
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:41:37 +02:00
Stefan Weil
d6caae3793 Replace malloc / free by std::vector for BUCKETS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:32:57 +02:00
Stefan Weil
78f8a47d05 Replace malloc / free by std::vector for PROTOTYPE::Distrib
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
b8488dac7a Replace malloc / free for TEMPCLUSTER
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
2a569c9cfb Replace malloc / free for FLOATUNION::Elliptical
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
5bf1af257c Use std::vector<BIT_VECTOR> for CLASS_STRUCT::Configurations
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
6f499f7fb5 Use std::vector<PROTO_STRUCT> for CLASS_STRUCT::Prototypes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
441f74c1e6 Replace malloc / free for STATISTICS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
57d3a1eb99 Replace malloc / free for CLUSTER::Mean and PROTOTYPE::Mean
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:32 +02:00
Stefan Weil
667eee2344 Replace malloc / free for CLIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
0077bc46cf Replace malloc / free for ELIST2
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
2c273c1b3b Replace malloc / free for ELIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
582260a9bf Replace malloc / free for C_OUTLINE::steps
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
b15b5d1de7 Replace malloc / free by new / delete for FEATURE_STRUCT, FEATURE_SET_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
aa8dda89a3 Replace malloc / free by new / delete for CHAR_DESC_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 18:43:14 +01:00
Stefan Weil
0f90ccb9cd Replace malloc / free by new / delete for CHISTRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 16:45:14 +01:00
Stefan Weil
0a46866bcd Replace malloc / free by new / delete for PERM_CONFIG_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 16:19:40 +01:00
Stefan Weil
92359a4a11 Replace malloc / free by new / delete for TEMP_CONFIG_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 15:59:28 +01:00
Stefan Weil
fdf4539769 Replace malloc / free by new / delete for ADAPT_CLASS_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 13:49:57 +01:00
Stefan Weil
0a0a3e1946 Replace malloc / free by new / delete
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 13:00:18 +01:00
Stefan Weil
884a28b366 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 13:00:18 +01:00
Stefan Weil
77514d693f Modernize BitVector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 13:00:18 +01:00
Stefan Weil
0f72e0fdb3 Simplify checks for emptiness
Replace the patterns (x.size() == 0) and (x.length() == 0) by x.empty().

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-26 23:22:50 +01:00
Egor Pugin
067c971774 Misc. 2021-03-24 14:36:45 +03:00
Egor Pugin
7c975a0eee Remove default locale setting in debug config. Any locale errors must be fixed separately.
Fixes #3290.
2021-03-24 14:36:40 +03:00
Stefan Weil
595346d548 Replace some snprintf by std::to_string and modernize more code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-24 08:01:59 +01:00
Stefan Weil
2048f328e0 Suppress output of page number for TIFF files with a single image
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 18:25:15 +01:00
Stefan Weil
264dfb3685 Don't convert for loop after '#pragma omp parallel' with clang-tidy
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 15:48:59 +01:00
Stefan Weil
1205f036ea Remove TessBaseAPI::SetThresholder (API change)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 08:59:04 +01:00
Stefan Weil
7d70ed4b41 Modernize code for OTSU and reduce public API further
Remove thresholder.h from the public API.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 08:59:04 +01:00
Stefan Weil
ef645ce334 Avoid lots of messages for training with single line images
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 16:06:30 +01:00
Egor Pugin
7677b80408
Merge pull request #3355 from eighttails/output_training_command_line
Print command line options if run_command() failed.
2021-03-22 15:13:31 +03:00
Tadahito Yao
3b436a72c5 Print command line options if run_command() failed. 2021-03-22 20:46:44 +09:00
Stefan Weil
67dcbdda2f Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 10:36:38 +01:00
Stefan Weil
4530763329 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:15:09 +01:00
Stefan Weil
fbaac9dc9d Modernize code (clang-tidy -checks='-*,google-readability-braces-around-statements')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:03:51 +01:00
Stefan Weil
a54dc6390d Modernize code (clang-tidy -checks='-*,modernize-use-auto')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:02:57 +01:00
Stefan Weil
77ed2886a7 Modernize code (clang-tidy -checks='-*,modernize-loop-convert')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:02:51 +01:00
Stefan Weil
d4d51910e1 Add braces to single line statements (clang-tidy -checks='-*,google-readability-braces-around-statements')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:02:13 +01:00
Stefan Weil
5384aa7b21 Modernize code (clang-tidy -checks='-*,modernize-use-equals-delete')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:56 +01:00
Stefan Weil
406233f1ae Modernize code (clang-tidy -checks='-*,modernize-use-equals-default')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:56 +01:00
Stefan Weil
27293fad62 Modernize code (clang-tidy -checks='-*,modernize-use-emplace')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
6fc31c44f8 Modernize code (clang-tidy -checks='-*,modernize-use-bool-literals')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
35e143ddfc Modernize code (clang-tidy -checks='-*,modernize-use-auto')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
1439efa734 Modernize code (clang-tidy -checks='-*,modernize-make-unique')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
02774bda6e Modernize code (clang-tidy -checks='-*,modernize-loop-convert')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
719dc1d7da Modernize code using override
The modifications were made using this command:

run-clang-tidy -header-filter='.*' -checks='-*,modernize-use-override' -fix

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 20:06:38 +01:00
Stefan Weil
187ac4136a Fix LGTM alert (local variable hides a parameter)
LGTM alert:

    Local variable 'correct_text' hides a parameter of the same name.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 18:20:13 +01:00
Egor Pugin
7d17b72ba5 Use more smart pointers. 2021-03-21 15:19:21 +03:00
Stefan Weil
0c20d3f843 Fix compiler warnings (mostly -Wsign-compare)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 09:29:34 +01:00
Stefan Weil
55d87f642c Disable most Leptonica messages for tesseract by default
They were disabled in earlier builds which used NDEBUG, too.

Allow manual setting of the Leptonica message level
with environment variable LEPT_MSG_SEVERITY.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 20:16:16 +01:00
Stefan Weil
19afcdb79b Remove unused function UnicharIdArrayUtils::find_in
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 15:51:28 +01:00
Stefan Weil
7af5b75b8f Disable unused WriteMemoryCallback if libcurl is not used
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 15:49:06 +01:00
Egor Pugin
db7a977eab Use smart pointers. 2021-03-20 16:04:45 +03:00
Egor Pugin
69ab5bbf65 Misc. 2021-03-20 16:04:00 +03:00
Stefan Weil
f176e7c274 Fix double free caused by commit f33e80e (fixes issue #3348)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 12:37:56 +01:00
Stefan Weil
87b0a4de97 Rename GenericVector::get
The new name GenericVector::at is compatible with standard containers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:42:19 +01:00
Stefan Weil
2c1c09bd6a Rename UnicityTable::get, UnicityTable::get_mutable
The new name UnicityTable::at is compatible with standard containers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:40:00 +01:00
Stefan Weil
883353df63 Replace std::array by std::vector to avoid stack overflow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:39:16 +01:00
Stefan Weil
ec2c989d00 Modernize code in src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:06:40 +01:00
Stefan Weil
54aec32586 Replace remaining PointerVector by std::vector for src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 22:22:04 +01:00
Stefan Weil
0d739530a5 Remove unused PointerVector::DeSerialize, PointerVector::DeSerializeElement
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 21:53:17 +01:00
Stefan Weil
7207cf13d7 Replace more PointerVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 21:53:08 +01:00
Stefan Weil
aa64d83c2f Replace more PointerVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 15:22:29 +01:00
Stefan Weil
79477dc2fe Replace more PointerVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 14:46:25 +01:00
Stefan Weil
752779aaed Replace more PointerVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
cac116dd11 Replace more PointerVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
dae5accceb Replace remaining PointerVector by std::vector for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
9e006a8bbc Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
65d882f96e Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
8ed6dee8e9 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
abc22976e4 Replace remaining PointerVector by std::vector for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
7f11261076 Suppress resolution warning if no resolution was given
Tesseract reported confusing information for images without resolution:

    Warning: Invalid resolution 0 dpi. Using 70 instead.
    Estimating resolution as 642

The warning is also shown when the resolution is not used at all
when preparing data for training.

It is now suppressed when there is no resolution information
(resolution == 0).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 10:45:54 +01:00
Stefan Weil
52a82b4356 Fix new alert reported by LGTM
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 23:27:17 +01:00
Stefan Weil
f33e80e2fb Replace remaining PointerVector by std::vector for src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 20:14:40 +01:00
Stefan Weil
07d147d4a6 Replace more PointerVector by std::vector for src/textord
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 19:04:00 +01:00
Stefan Weil
b0e30bd247 Replace remaining PointerVector by std::vector for src/wordrec
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 18:56:08 +01:00
Stefan Weil
b62a86a93f Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 17:16:43 +01:00
Stefan Weil
177703c562 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
9e566de0f2 Remove unused classes WordFeature, FloatWordFeature
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
7b92614efa Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
a584ee5ac0 Add missing include statement (fix CI build)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 15:59:58 +01:00
Stefan Weil
9eab1d60c1 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 15:04:56 +01:00
Stefan Weil
f8d55f30d8 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:31:13 +01:00
Stefan Weil
d9739ba459 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:27:37 +01:00
Stefan Weil
4b428df131 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:18:49 +01:00
Stefan Weil
92e98a30e1 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:04:22 +01:00
Stefan Weil
573e7d6bb9 Replace more GenericVector by std::vector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 11:58:13 +01:00
Stefan Weil
a80689559b Partially revert "Replace more GenericVector by std::vector for src/ccutil"
This partially reverts and cleans commit 96d72298b12f744a72e5c3cea67924779e859e42
which had broken intfeaturemap_test.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 11:43:32 +01:00
Stefan Weil
576d8d6c63 Partially revert "Replace remaining GenericVector by std::vector for src/training"
This partially reverts commit 7df1cb0bab
which had broken lstm_squashed_test.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 10:59:07 +01:00
Stefan Weil
77dbd3ee02 Remove two type casts
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 09:04:39 +01:00
Stefan Weil
7fdf79aff4 Move function ExtractFontName to baseapi.cpp
It is only used there, so now a local function.
This also allows removing blobclass.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
a847e0f9b5 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
7df1cb0bab Replace remaining GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
4d8e9dc659 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
37c9cf4940 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
a00e7bc2bb Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
1609014525 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
cb207ce645 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
b0b6bbf019 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
699f727f3e Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
edab5ddee8 Replace remaining choose_nth_item by std::nth_element
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 07:23:40 +01:00
Stefan Weil
94a3a70fda Fix new alerts reported by LGTM
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
f5a10618bf Add missing reference & for loop iterator
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
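Note on f5a10618bf: the commit message only says that a missing `&` was added to a loop iterator. As a rough illustration of why that matters, the sketch below uses two hypothetical helpers (they are not taken from the Tesseract sources). Iterating by value copies each element, so changes are lost and every iteration pays for a copy. Iterating by reference works on the stored elements.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not from the Tesseract sources.
// `item` is a copy of each element, so the container stays unchanged.
inline void AppendSuffixByValue(std::vector<std::string> &items) {
  for (auto item : items) {
    item += "_x";  // modifies only the temporary copy
  }
}

// Same loop with the reference added: `item` aliases the stored element.
inline void AppendSuffixByReference(std::vector<std::string> &items) {
  for (auto &item : items) {
    item += "_x";  // modifies the element inside the vector
  }
}
```

For read-only loops the usual form is `const auto &item`, which avoids the copy without allowing modification.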
Stefan Weil
5dc3f25aca Make only locally used functions row_y_order and row_spacing_order static
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
edd599fa7b Replace more GenericVector by std::vector and remove GenericVector::choose_nth_item
KDVector is now derived from std::vector.

This requires an update for unittest nthitem_test because
std::nth_element does not handle all corner cases of choose_nth_item.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
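Note on edd599fa7b: the message states that std::nth_element does not cover every corner case of choose_nth_item, without naming the cases. The sketch below shows one way a caller might wrap std::nth_element defensively. The helper name `NthSmallest` and the two guarded cases (empty input, index out of range) are assumptions chosen for illustration, not the behaviour of the removed function.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a Tesseract function.
// Stores in *result the value that would be at index n if `values` were
// sorted. Returns false for an out-of-range n (this includes empty input),
// because std::nth_element itself performs no such check.
inline bool NthSmallest(std::vector<int> values, std::size_t n, int *result) {
  if (n >= values.size()) {
    return false;
  }
  auto nth = values.begin() + static_cast<std::ptrdiff_t>(n);
  // Partial sort: *nth receives its final sorted value, nothing before it
  // is greater and nothing after it is smaller.
  std::nth_element(values.begin(), nth, values.end());
  *result = *nth;
  return true;
}
```

The vector is taken by value so the caller's data keeps its order. std::nth_element rearranges the range it is given.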
Stefan Weil
4779615679 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
4103c40a29 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
e0b1093249 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
71dfb82065 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
dcef5a5df1 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
314933823a Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
6c589e044f Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
9728bbc596 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
415d9aa2da Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 17:31:12 +01:00
Stefan Weil
ef39692451 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 17:31:12 +01:00
Stefan Weil
2fb6f9eb72 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
c8c9428824 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
71df85a4b1 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
d5aa220347 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
114c058fe4 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
9f1041efa7 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
aea7440847 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
a17f63f43e Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
0f632e1dda Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
6fcbea3533 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
fa93232517 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
487f5fad11 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
666ea8d560 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
c03ffda45a Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Brechtken
288b8cac11 Merge branch 'master' of https://github.com/Sintun/tesseract 2021-03-17 11:09:01 +01:00
Stefan Brechtken
ec8d7dd6bb Changing structure name MyTable -> TessTable and using tesseract namespace 2021-03-17 11:07:51 +01:00
Sintun
c4ba513994
Update src/textord/tablerecog.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:42:31 +01:00
Sintun
55fbee2d4c
Update src/textord/tablerecog.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:42:23 +01:00
Sintun
14408861ea
Update src/ccstruct/tabletransfer.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:36:15 +01:00
Sintun
02055d667c
Update src/ccstruct/tabletransfer.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:36:09 +01:00
Stefan Brechtken
5e8c8c2b4d conflict merge, removing an unnecessary include 2021-03-16 23:47:43 +01:00
Stefan Weil
223f356027 Fix alerts reported by LGTM
They were caused by recent commits which replaced GenericVector by std::vector.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 19:04:00 +01:00
Stefan Weil
8cfaf7bf64 Fix removal of duplicates in StructuredTable::FindLinedStructure
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 17:49:54 +01:00
Stefan Weil
5db92b26aa Replace remaining GenericVector by std::vector for src/textord
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 16:59:12 +01:00
Stefan Weil
1f94d79c81 Replace remaining GenericVector by std::vector for src/ccmain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 16:55:38 +01:00
Stefan Brechtken
d856acba56 Change License to Apache V2, add new file to Makefile.am, change file name to .h ending 2021-03-16 14:16:02 +01:00
Stefan Weil
bf42f8313d Replace remaining GenericVector by std::vector for src/dict
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Weil
17eee8648f Replace more GenericVector by std::vector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Weil
2a3682a35e Replace remaining GenericVector by std::vector in src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Brechtken
e10d19b084 updating function documentation and removing unnecessary include 2021-03-15 17:25:10 +01:00
Stefan Brechtken
594a000ecd merging with tesseract master in order to create a pull request 2021-03-15 17:02:19 +01:00
Stefan Weil
e51fcb2d31 Remove last usage of STRING
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
57920174dc Remove unused parts of class STRING
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
576c09bf31 Replace remaining STRING by std::string in unittest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
0edd69eb10 Replace remaining STRING by std::string in src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
d16fba9bed Replace all but one remaining STRING by std::string in src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
21cf7cf84e Replace remaining STRING by std::string in src/dict
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
21d9aad594 Replace remaining STRING by std::string in src/viewer and src/wordrec
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
e0ce040832 Replace remaining STRING by std::string in src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
db9f963411 Replace remaining STRING by std::string in src/ccmain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Egor Pugin
d7823a71c2 Remove unused file. 2021-03-15 09:47:04 +03:00
Egor Pugin
efd17e205a Replace typedef structs with structs.
typedef enums are left intact.
2021-03-15 09:47:04 +03:00
Egor Pugin
262f65a4d2
snprintf will add '\0' at the end itself. 2021-03-14 23:54:29 +03:00
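Note on 262f65a4d2: the point of the message is that snprintf terminates its output itself, so writing a trailing '\0' by hand is redundant. A minimal sketch with a hypothetical helper (the buffer size of 8 is an arbitrary choice for the example):

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper, not from the Tesseract sources.
inline std::string FormatNumber(int value) {
  char buffer[8];
  // With a non-zero size, snprintf writes at most sizeof(buffer) - 1
  // characters and always appends '\0', even when the output is truncated.
  std::snprintf(buffer, sizeof(buffer), "%d", value);
  return std::string(buffer);
}
```

A value that does not fit is cut off after seven characters, and the result is still a valid C string.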
Egor Pugin
26ceeef6c0 [training] Modernize. 2021-03-14 23:47:42 +03:00
Shree Devi Kumar
efe9ff611f Limit unicharset from training_text only to Indic languages 2021-03-14 17:58:57 +00:00
Shree Devi Kumar
a589ded25f Create unicharset from training text to avoid normalization errors 2021-03-14 16:39:00 +00:00
Egor Pugin
f06b2c7c8d [capi] Restore some of wrongly removed apis.
Removed C++ APIs are not restored.
Additionally remove unused C++ typedefs which were in removed C++ functions.
If you still need them, use C++ API instead.
2021-03-14 17:20:52 +03:00
Egor Pugin
dabdaa1def Misc. 2021-03-14 17:14:41 +03:00
Stefan Weil
7178ebd799 Add missing TESS_API for new function tesseract::split
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-14 08:16:33 +01:00
Stefan Weil
36f9131e04 Move implementation of tesseract::split from header to cpp file
This fixes duplicate symbols for some builds.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:39:58 +01:00
Stefan Weil
3b0759940c Replace more STRING by std::string
Remove STRING::add_str_int and STRING::add_str_double which are now unused.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:16:35 +01:00
Stefan Weil
c9f0da49ca Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
91f7675848 Replace more STRING by std::string for src/ccmain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
d084c7cca8 Replace remaining STRING by std::string for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
96d1644da1 Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
a42c6c7dcd Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
9cf5b9870d Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
51909d5a2e Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
d6495d9026 Replace STRING by std::string in src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:51 +01:00
Stefan Weil
1f2ec4dfb1 Fix network specification for NT_SYMCLIP
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 13:10:37 +01:00
Stefan Weil
6bf5080d4c Remove unused include statements for strngs.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 23:11:08 +01:00
Egor Pugin
a393df5038 Add missing export header. 2021-03-13 00:07:19 +03:00
Egor Pugin
2d10be5209 [clang-format] Format generated protobuf source. 2021-03-13 00:07:03 +03:00
Egor Pugin
618b185d14 Include missing config_auto.h 2021-03-12 23:39:18 +03:00
Egor Pugin
8b0c5405e2 Add missing forward decl. 2021-03-12 22:35:30 +03:00
Egor Pugin
0eb7ba88bf [clang-format] Execute clang format on include and src dirs.
Script:
find include src -type f | sort > all.txt
find include src -type f | grep -v "\.cpp" | grep -v "\.h" | sort > skip.txt
comm -23 all.txt skip.txt | xargs clang-format -i
2021-03-12 22:35:02 +03:00
Stefan Weil
4c6cc5a04d Replace GenericVector by std::vector in class ImageData
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 13:10:25 +01:00
Ger Hobbelt
779aa79350
Fix build (#3322)
* fix errors after merge commit: missing changes that are needed too to make this codebase compile.
* Update src/wordrec/wordrec.h

Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-11 21:43:07 +01:00
Egor Pugin
3444618075 Fix linux build. 2021-03-10 15:35:13 +03:00
Egor Pugin
ce058604ba Pass empty strings into Tesseract::init_tesseract(). 2021-03-10 15:21:03 +03:00
Egor Pugin
911dd93f12 Pass init strings as std::string instead of const char * internally. This does not affect public APIs. 2021-03-10 15:17:00 +03:00
Egor Pugin
9792f3c4ff Remove STRING::size() method. 2021-03-10 14:58:37 +03:00
Egor Pugin
6de97309a1 Remove unused STRING::strdup(). 2021-03-10 14:42:50 +03:00
Egor Pugin
f0e30a2af2 Remove unused STRING::unsigned_size(). 2021-03-10 14:41:31 +03:00
Egor Pugin
d36adf3d40 Replace STRING::truncate_at() with resize(). 2021-03-10 14:40:28 +03:00
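Note on d36adf3d40: the commit replaces a STRING method with std::string::resize(). One difference worth keeping in mind is that resize() can also grow a string, padding it with '\0' characters. The helper below is a hypothetical stand-in that only ever shortens. Whether the removed truncate_at() behaved this way is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in, not the removed Tesseract method.
// Shortens `text` to at most `length` characters and never enlarges it.
inline void TruncateAt(std::string &text, std::size_t length) {
  if (length < text.size()) {
    text.resize(length);
  }
}
```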
Egor Pugin
e9a2fc0083 More std::string replacements. 2021-03-10 14:36:59 +03:00
Stefan Weil
0f1296c6f6 Clean implementation for (de-)serialization of a vector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-08 13:33:48 +01:00
Stefan Weil
6cfe604d58 Fix serialization for vector of RecodedCharID
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-07 23:01:25 +01:00
Stefan Weil
0cde3ede98 Add heuristic to fix swap (partially fixes issue #2586)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
a2769aebb4 Replace GenericVector<TBOX> by std::vector<TBOX>
Fix also endianness handling for (de)serialisation of TBOX.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
c31c1a7d60 Fix two compiler warnings for serialis.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
fe614c6069 Enable less FP exceptions for clang compiler when running tesseract
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-03 22:56:07 +01:00
Egor Pugin
c39b1daa6b GenericVector -> std::vector. 2021-03-03 22:22:00 +03:00
Egor Pugin
0a693a9519 Allow to serialize std vectors with classes from TFile. Implementation from GenericVector. 2021-03-03 22:21:40 +03:00
Stefan Weil
ff830775f9 Fix memory leak in DocumentCache
It was introduced in commit 5cac52173e.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-01 11:31:48 +01:00
Stefan Weil
339c01894e Avoid fp division by 0 (fix issue #3314)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-28 19:42:01 +01:00
Stefan Weil
cd60728e8a Avoid float division by zero when calculating adaptive learning rate
The following line results in a division by zero when
momentum is -1 and num_samples is even:

     learning_rate /= 1.0f - pow(momentum, num_samples);

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
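Note on cd60728e8a: the failure follows directly from the quoted line. For momentum = -1 and an even num_samples, pow(momentum, num_samples) is exactly 1, so the denominator 1 - 1 is 0. The sketch below reproduces that arithmetic. The guard in `ScaleLearningRate` is an assumed illustration of one possible check, not the change made by the commit.

```cpp
#include <cmath>

// Denominator of the quoted line: 1 - momentum^num_samples.
inline float Denominator(float momentum, int num_samples) {
  return 1.0f - std::pow(momentum, static_cast<float>(num_samples));
}

// Hypothetical guard (an assumption, not the actual fix): leave the
// learning rate unscaled when the denominator is zero.
inline float ScaleLearningRate(float learning_rate, float momentum,
                               int num_samples) {
  const float denominator = Denominator(momentum, num_samples);
  if (denominator == 0.0f) {
    return learning_rate;
  }
  return learning_rate / denominator;
}
```

With momentum = -1 an odd num_samples gives a denominator of 2, which is why the problem only shows up for even sample counts.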
Stefan Weil
c12dde2862 Use float instead of double for learning_rate, momentum and adam_beta
Only WeightMatrix::Update used double parameters, all other functions
already used float. So this change avoids unnecessary conversions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
Stefan Weil
422452b9f4 Check for float errors when running tesseract and lstmtraining
Some illegal floating point calculations like division by zero,
illegal value or overflow will now abort tesseract with an error
message.

For lstmtraining there is now a new parameter --debug_float to
enable the same kind of checks. It is currently disabled by default
because such errors occur and would abort the training process.
That should be fixed in the future.

If tesseract also shows floating point errors which cannot be
fixed easily, a similar parameter to enable the checks can be
added there, too.

The new code requires the function feenableexcept which is only
available with the GNU libc, so it is only used on Linux.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:49:27 +01:00
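Note on 422452b9f4: the message names feenableexcept, a GNU libc extension, as the mechanism. The sketch below shows the general pattern for enabling traps for the three error kinds listed in the message (division by zero, invalid operation, overflow). The function name and the `__GLIBC__` guard are assumptions for this example and may differ from the code in the commit.

```cpp
#include <fenv.h>

// Hypothetical helper, not the code from the commit.
// Turns selected floating point exceptions into SIGFPE traps where the GNU
// extension feenableexcept exists. Returns false on other C libraries or
// when the hardware refuses to enable the traps.
inline bool EnableFloatChecks() {
#if defined(__GLIBC__)
  // feenableexcept returns the previously enabled mask, or -1 on failure.
  return feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW) != -1;
#else
  return false;
#endif
}
```

Once the traps are active, an operation such as a float division by zero terminates the process with SIGFPE instead of silently producing inf or NaN, which matches the abort behaviour described in the message.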
Stefan Weil
51a214a51b Remove unused include statements for imagedata.h and document used ones
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:28 +01:00
Stefan Weil
1d7a981203 Disable code for unused classes WordFeature and FloatWordFeature
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:17 +01:00
Stefan Weil
5cac52173e Replace PointerVector by std::vector in class DocumentCache
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:07 +01:00
Stefan Weil
387acd9881 Initialize weight matrix with 0.0 (fix issue #3229)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 18:49:39 +01:00
Egor Pugin
1ab6b0fbc6
Merge pull request #3311 from stweil/master
Replace calls of exit function
2021-02-26 17:43:53 +03:00
Stefan Weil
58304cbfdd Don't compile OpenCL code when OpenCL is disabled
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 15:40:23 +01:00
Stefan Weil
a6946c3bf9 Replace calls of exit function
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:22:36 +01:00
Stefan Weil
373a3527ec Format code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:22:09 +01:00
Stefan Weil
ea446b1eae Remove blanks at line endings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:05:36 +01:00
Stefan Weil
394c56ab15 Replace GenericVector by std::vector in class WERD_CHOICE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 23:14:25 +01:00
Stefan Weil
fccecb2d23 Replace GenericVector by std::vector in class ResultIterator
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 21:07:57 +01:00
Stefan Weil
2257028052 Replace GenericVector by std::vector in reject.cpp
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 21:06:59 +01:00