Commit Graph

1817 Commits

Author SHA1 Message Date
Stefan Weil
5384aa7b21 Modernize code (clang-tidy -checks='-*,modernize-use-equals-delete')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:56 +01:00
Stefan Weil
406233f1ae Modernize code (clang-tidy -checks='-*,modernize-use-equals-default')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:56 +01:00
Stefan Weil
27293fad62 Modernize code (clang-tidy -checks='-*,modernize-use-emplace')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
6fc31c44f8 Modernize code (clang-tidy -checks='-*,modernize-use-bool-literals')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
35e143ddfc Modernize code (clang-tidy -checks='-*,modernize-use-auto')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
1439efa734 Modernize code (clang-tidy -checks='-*,modernize-make-unique')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
02774bda6e Modernize code (clang-tidy -checks='-*,modernize-loop-convert')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
719dc1d7da Modernize code using override
The modifications were made using this command:

run-clang-tidy -header-filter='.*' -checks='-*,modernize-use-override' -fix

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 20:06:38 +01:00
Stefan Weil
187ac4136a Fix LGTM alert (local variable hides a parameter)
LGTM alert:

    Local variable 'correct_text' hides a parameter of the same name.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 18:20:13 +01:00
Egor Pugin
7d17b72ba5 Use more smart pointers. 2021-03-21 15:19:21 +03:00
Stefan Weil
0c20d3f843 Fix compiler warnings (mostly -Wsign-compare)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 09:29:34 +01:00
Stefan Weil
55d87f642c Disable most Leptonica messages for tesseract by default
They were disabled in earlier builds which used NDEBUG, too.

Allow manual setting of the Leptonica message level
with environment variable LEPT_MSG_SEVERITY.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 20:16:16 +01:00
Stefan Weil
19afcdb79b Remove unused function UnicharIdArrayUtils::find_in
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 15:51:28 +01:00
Stefan Weil
7af5b75b8f Disable unused WriteMemoryCallback if libcurl is not used
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 15:49:06 +01:00
Egor Pugin
db7a977eab Use smart pointers. 2021-03-20 16:04:45 +03:00
Egor Pugin
69ab5bbf65 Misc. 2021-03-20 16:04:00 +03:00
Stefan Weil
f176e7c274 Fix double free caused by commit f33e80e (fixes issue #3348)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 12:37:56 +01:00
Stefan Weil
87b0a4de97 Rename GenericVector::get
The new name GenericVector::at is compatible with standard containers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:42:19 +01:00
Stefan Weil
2c1c09bd6a Rename UnicityTable::get, UnicityTable::get_mutable
The new name UnicityTable::at is compatible with standard containers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:40:00 +01:00
Stefan Weil
883353df63 Replace std::array by std::vector to avoid stack overflow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:39:16 +01:00
Stefan Weil
ec2c989d00 Modernize code in src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:06:40 +01:00
Stefan Weil
54aec32586 Replace remaining PointerVector by std::vector for src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 22:22:04 +01:00
Stefan Weil
0d739530a5 Remove unused PointerVector::DeSerialize, PointerVector::DeSerializeElement
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 21:53:17 +01:00
Stefan Weil
7207cf13d7 Replace more PointerVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 21:53:08 +01:00
Stefan Weil
aa64d83c2f Replace more PointerVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 15:22:29 +01:00
Stefan Weil
79477dc2fe Replace more PointerVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 14:46:25 +01:00
Stefan Weil
752779aaed Replace more PointerVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
cac116dd11 Replace more PointerVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
dae5accceb Replace remaining PointerVector by std::vector for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
9e006a8bbc Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
65d882f96e Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
8ed6dee8e9 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
abc22976e4 Replace remaining PointerVector by std::vector for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
7f11261076 Suppress resolution warning if no resolution was given
Tesseract reported confusing information for images without resolution:

    Warning: Invalid resolution 0 dpi. Using 70 instead.
    Estimating resolution as 642

The warning is also shown when the resolution is not used at all
when preparing data for training.

It is now suppressed when there is no resolution information
(resolution == 0).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 10:45:54 +01:00
Stefan Weil
52a82b4356 Fix new alert reported by LGTM
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 23:27:17 +01:00
Stefan Weil
f33e80e2fb Replace remaining PointerVector by std::vector for src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 20:14:40 +01:00
Stefan Weil
07d147d4a6 Replace more PointerVector by std::vector for src/textord
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 19:04:00 +01:00
Stefan Weil
b0e30bd247 Replace remaining PointerVector by std::vector for src/wordrec
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 18:56:08 +01:00
Stefan Weil
b62a86a93f Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 17:16:43 +01:00
Stefan Weil
177703c562 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
9e566de0f2 Remove unused classes WordFeature, FloatWordFeature
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
7b92614efa Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
a584ee5ac0 Add missing include statement (fix CI build)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 15:59:58 +01:00
Stefan Weil
9eab1d60c1 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 15:04:56 +01:00
Stefan Weil
f8d55f30d8 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:31:13 +01:00
Stefan Weil
d9739ba459 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:27:37 +01:00
Stefan Weil
4b428df131 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:18:49 +01:00
Stefan Weil
92e98a30e1 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:04:22 +01:00
Stefan Weil
573e7d6bb9 Replace more GenericVector by std::vector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 11:58:13 +01:00
Stefan Weil
a80689559b Partially revert "Replace more GenericVector by std::vector for src/ccutil"
This partially reverts and cleans commit 96d72298b12f744a72e5c3cea67924779e859e42
which had broken intfeaturemap_test.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 11:43:32 +01:00
Stefan Weil
576d8d6c63 Partially revert "Replace remaining GenericVector by std::vector for src/training"
This partially reverts commit 7df1cb0bab
which had broken lstm_squashed_test.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 10:59:07 +01:00
Stefan Weil
77dbd3ee02 Remove two type casts
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 09:04:39 +01:00
Stefan Weil
7fdf79aff4 Move function ExtractFontName to baseapi.cpp
It is only used there, so now a local function.
This also allows removing blobclass.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
a847e0f9b5 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
7df1cb0bab Replace remaining GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
4d8e9dc659 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
37c9cf4940 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
a00e7bc2bb Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
1609014525 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
cb207ce645 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
b0b6bbf019 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
699f727f3e Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
edab5ddee8 Replace remaining choose_nth_item by std::nth_element
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 07:23:40 +01:00
Stefan Weil
94a3a70fda Fix new alerts reported by LGTM
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
f5a10618bf Add missing reference & for loop iterator
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
5dc3f25aca Make only locally used functions row_y_order and row_spacing_order static
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
edd599fa7b Replace more GenericVector by std::vector and remove GenericVector::choose_nth_item
KDVector is now derived from std::vector.

This requires an update for unittest nthitem_test because
std::nth_element does not handle all corner cases of choose_nth_item.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
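
The replacement of choose_nth_item by std::nth_element described in the entry above can be illustrated with a minimal sketch. The helper name NthSmallest is hypothetical and not taken from the Tesseract sources. Because std::nth_element does not validate its arguments, the sketch assumes the caller guarantees a valid index, which is one kind of corner case the commit message alludes to.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the Tesseract sources): returns the value
// that would be at position n if `values` were fully sorted.
// The vector is taken by value because std::nth_element reorders its input.
float NthSmallest(std::vector<float> values, std::size_t n) {
  // std::nth_element performs no bounds checking, so the index must be
  // validated by the caller; an empty vector or n >= size() is an error here.
  assert(n < values.size());
  std::nth_element(values.begin(), values.begin() + n, values.end());
  return values[n];
}
```

With the input {5, 1, 4, 2, 3} and n = 2, the helper returns 3, the median. Elements before position n end up no larger than it and elements after it no smaller, but neither side is sorted.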
Stefan Weil
4779615679 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
4103c40a29 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
e0b1093249 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
71dfb82065 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
dcef5a5df1 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
314933823a Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
6c589e044f Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
9728bbc596 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
415d9aa2da Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 17:31:12 +01:00
Stefan Weil
ef39692451 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 17:31:12 +01:00
Stefan Weil
2fb6f9eb72 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
c8c9428824 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
71df85a4b1 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
d5aa220347 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
114c058fe4 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
9f1041efa7 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
aea7440847 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
a17f63f43e Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
0f632e1dda Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
6fcbea3533 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
fa93232517 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
487f5fad11 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
666ea8d560 Replace more GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
c03ffda45a Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Brechtken
288b8cac11 Merge branch 'master' of https://github.com/Sintun/tesseract 2021-03-17 11:09:01 +01:00
Stefan Brechtken
ec8d7dd6bb Changing structure name MyTable -> TessTable and using tesseract namespace 2021-03-17 11:07:51 +01:00
Sintun
c4ba513994 Update src/textord/tablerecog.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:42:31 +01:00
Sintun
55fbee2d4c Update src/textord/tablerecog.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:42:23 +01:00
Sintun
14408861ea Update src/ccstruct/tabletransfer.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:36:15 +01:00
Sintun
02055d667c Update src/ccstruct/tabletransfer.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:36:09 +01:00
Stefan Brechtken
5e8c8c2b4d conflict merge, removing an unnecessary include 2021-03-16 23:47:43 +01:00
Stefan Weil
223f356027 Fix alerts reported by LGTM
They were caused by recent commits which replaced GenericVector by std::vector.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 19:04:00 +01:00
Stefan Weil
8cfaf7bf64 Fix removal of duplicates in StructuredTable::FindLinedStructure
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 17:49:54 +01:00
Stefan Weil
5db92b26aa Replace remaining GenericVector by std::vector for src/textord
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 16:59:12 +01:00
Stefan Weil
1f94d79c81 Replace remaining GenericVector by std::vector for src/ccmain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 16:55:38 +01:00
Stefan Brechtken
d856acba56 Change License to Apache V2, add new file to Makefile.am, change file name to .h ending 2021-03-16 14:16:02 +01:00
Stefan Weil
bf42f8313d Replace remaining GenericVector by std::vector for src/dict
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Weil
17eee8648f Replace more GenericVector by std::vector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Weil
2a3682a35e Replace remaining GenericVector by std::vector in src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Brechtken
e10d19b084 updating function documentation and removing unnecessary include 2021-03-15 17:25:10 +01:00
Stefan Brechtken
594a000ecd merging with tesseract master in order to create a pull request 2021-03-15 17:02:19 +01:00
Stefan Weil
e51fcb2d31 Remove last usage of STRING
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
57920174dc Remove unused parts of class STRING
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
576c09bf31 Replace remaining STRING by std::string in unittest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
0edd69eb10 Replace remaining STRING by std::string in src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
d16fba9bed Replace all but one remaining STRING by std::string in src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
21cf7cf84e Replace remaining STRING by std::string in src/dict
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
21d9aad594 Replace remaining STRING by std::string in src/viewer and src/wordrec
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
e0ce040832 Replace remaining STRING by std::string in src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
db9f963411 Replace remaining STRING by std::string in src/ccmain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Egor Pugin
d7823a71c2 Remove unused file. 2021-03-15 09:47:04 +03:00
Egor Pugin
efd17e205a Replace typedef structs with structs.
typedef enums are left intact.
2021-03-15 09:47:04 +03:00
Egor Pugin
262f65a4d2 snprintf will add '\0' at the end itself. 2021-03-14 23:54:29 +03:00
Egor Pugin
26ceeef6c0 [training] Modernize. 2021-03-14 23:47:42 +03:00
Shree Devi Kumar
efe9ff611f Limit unicharset from training_text only to Indic languages 2021-03-14 17:58:57 +00:00
Shree Devi Kumar
a589ded25f Create unicharset from training text to avoid normalization errors 2021-03-14 16:39:00 +00:00
Egor Pugin
f06b2c7c8d [capi] Restore some of wrongly removed apis.
Removed C++ APIs are not restored.
Additionally remove unused C++ typedefs which were in removed C++ functions.
If you still need them, use C++ API instead.
2021-03-14 17:20:52 +03:00
Egor Pugin
dabdaa1def Misc. 2021-03-14 17:14:41 +03:00
Stefan Weil
7178ebd799 Add missing TESS_API for new function tesseract::split
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-14 08:16:33 +01:00
Stefan Weil
36f9131e04 Move implementation of tesseract::split from header to cpp file
This fixes duplicate symbols for some builds.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:39:58 +01:00
Stefan Weil
3b0759940c Replace more STRING by std::string
Remove STRING::add_str_int and STRING::add_str_double which are now unused.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:16:35 +01:00
Stefan Weil
c9f0da49ca Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
91f7675848 Replace more STRING by std::string for src/ccmain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
d084c7cca8 Replace remaining STRING by std::string for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
96d1644da1 Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
a42c6c7dcd Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
9cf5b9870d Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
51909d5a2e Replace more STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
d6495d9026 Replace STRING by std::string in src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:51 +01:00
Stefan Weil
1f2ec4dfb1 Fix network specification for NT_SYMCLIP
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 13:10:37 +01:00
Stefan Weil
6bf5080d4c Remove unused include statements for strngs.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 23:11:08 +01:00
Egor Pugin
a393df5038 Add missing export header. 2021-03-13 00:07:19 +03:00
Egor Pugin
2d10be5209 [clang-format] Format generated protobuf source. 2021-03-13 00:07:03 +03:00
Egor Pugin
618b185d14 Include missing config_auto.h 2021-03-12 23:39:18 +03:00
Egor Pugin
8b0c5405e2 Add missing forward decl. 2021-03-12 22:35:30 +03:00
Egor Pugin
0eb7ba88bf [clang-format] Execute clang format on include and src dirs.
Script:
find include src -type f | sort > all.txt
find include src -type f | grep -v "\.cpp" | grep -v "\.h" | sort > skip.txt
comm -23 all.txt skip.txt | xargs clang-format -i
2021-03-12 22:35:02 +03:00
Stefan Weil
4c6cc5a04d Replace GenericVector by std::vector in class ImageData
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 13:10:25 +01:00
Ger Hobbelt
779aa79350 Fix build (#3322)
* fix errors after merge commit: missing changes that are needed too to make this codebase compile.
* Update src/wordrec/wordrec.h

Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-11 21:43:07 +01:00
Egor Pugin
3444618075 Fix linux build. 2021-03-10 15:35:13 +03:00
Egor Pugin
ce058604ba Pass empty strings into Tesseract::init_tesseract(). 2021-03-10 15:21:03 +03:00
Egor Pugin
911dd93f12 Pass init strings as std::string instead of const char * internally. This does not affect public APIs. 2021-03-10 15:17:00 +03:00
Egor Pugin
9792f3c4ff Remove STRING::size() method. 2021-03-10 14:58:37 +03:00
Egor Pugin
6de97309a1 Remove unused STRING::strdup(). 2021-03-10 14:42:50 +03:00
Egor Pugin
f0e30a2af2 Remove unused STRING::unsigned_size(). 2021-03-10 14:41:31 +03:00
Egor Pugin
d36adf3d40 Replace STRING::truncate_at() with resize(). 2021-03-10 14:40:28 +03:00
Egor Pugin
e9a2fc0083 More std::string replacements. 2021-03-10 14:36:59 +03:00
Stefan Weil
0f1296c6f6 Clean implementation for (de-)serialization of a vector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-08 13:33:48 +01:00
Stefan Weil
6cfe604d58 Fix serialization for vector of RecodedCharID
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-07 23:01:25 +01:00
Stefan Weil
0cde3ede98 Add heuristic to fix swap (partially fixes issue #2586)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
a2769aebb4 Replace GenericVector<TBOX> by std::vector<TBOX>
Fix also endianness handling for (de)serialisation of TBOX.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
c31c1a7d60 Fix two compiler warnings for serialis.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
fe614c6069 Enable less FP exceptions for clang compiler when running tesseract
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-03 22:56:07 +01:00
Egor Pugin
c39b1daa6b GenericVector -> std::vector. 2021-03-03 22:22:00 +03:00
Egor Pugin
0a693a9519 Allow to serialize std vectors with classes from TFile. Implementation from GenericVector. 2021-03-03 22:21:40 +03:00
Stefan Weil
ff830775f9 Fix memory leak in DocumentCache
It was introduced in commit 5cac52173e.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-01 11:31:48 +01:00
Stefan Weil
339c01894e Avoid fp division by 0 (fix issue #3314)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-28 19:42:01 +01:00
Stefan Weil
cd60728e8a Avoid float division by zero when calculating adaptive learning rate
The following line results in a division by zero when
momentum is -1 and num_samples is even:

     learning_rate /= 1.0f - pow(momentum, num_samples);

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
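
The failure mode in the entry above follows from simple arithmetic: with momentum = -1 and an even num_samples, pow(momentum, num_samples) is exactly 1, so the denominator 1 - 1 is 0. The sketch below shows one way such a division could be guarded. The function name CorrectedLearningRate and the choice to leave the rate unscaled are assumptions for illustration; the commit message does not say how the actual fix handles this case.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Tesseract's implementation): applies the
// correction learning_rate / (1 - momentum^num_samples) only when the
// denominator is non-zero.
float CorrectedLearningRate(float learning_rate, float momentum,
                            int num_samples) {
  assert(num_samples > 0);
  const float denominator =
      1.0f - std::pow(momentum, static_cast<float>(num_samples));
  if (denominator == 0.0f) {
    // Assumption: fall back to the unscaled rate instead of dividing by zero.
    return learning_rate;
  }
  return learning_rate / denominator;
}
```

For momentum = -1 and num_samples = 2 the guard is taken and the input rate is returned unchanged. For momentum = 0.5 and num_samples = 1 the denominator is 0.5 and the rate is doubled.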
Stefan Weil
c12dde2862 Use float instead of double for learning_rate, momentum and adam_beta
Only WeightMatrix::Update used double parameters, all other functions
already used float. So this change avoids unnecessary conversions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
Stefan Weil
422452b9f4 Check for float errors when running tesseract and lstmtraining
Some illegal floating point calculations like division by zero,
illegal value or overflow will now abort tesseract with an error
message.

For lstmtraining there is now a new parameter --debug_float to
enable the same kind of checks. It is currently disabled by default
because such errors occur and would abort the training process.
That should be fixed in the future.

If tesseract also shows floating point errors which cannot be
fixed easily, a similar parameter to enable the checks can be
added there, too.

The new code requires the function feenableexcept which is only
available with the GNU libc, so it is only used on Linux.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:49:27 +01:00
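
The entry above relies on feenableexcept, a GNU libc extension that makes the process trap on selected floating point errors. As a portable illustration of the same kind of check, the sketch below polls the standard exception flags from <cfenv> instead of trapping. It is a different technique from the one in the commit, and the helper name DivisionRaisesDivByZero is hypothetical.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: performs one division and reports whether it set the
// divide-by-zero flag. This polls the flag after the fact; it does not abort
// the program the way a trap enabled with feenableexcept would.
bool DivisionRaisesDivByZero(float numerator, float denominator) {
  std::feclearexcept(FE_ALL_EXCEPT);
  // volatile keeps the compiler from folding the division away at
  // compile time, so the operation is really executed at run time.
  volatile float n = numerator;
  volatile float d = denominator;
  volatile float quotient = n / d;
  (void)quotient;
  return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

On typical IEEE 754 hardware, dividing 1 by 0 sets the flag, while dividing 1 by 2 does not. Polling suits code that wants to log or recover, whereas trapping suits the debugging use described in the commit message.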
Stefan Weil
51a214a51b Remove unused include statements for imagedata.h and document used ones
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:28 +01:00
Stefan Weil
1d7a981203 Disable code for unused classes WordFeature and FloatWordFeature
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:17 +01:00
Stefan Weil
5cac52173e Replace PointerVector by std::vector in class DocumentCache
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:07 +01:00
Stefan Weil
387acd9881 Initialize weight matrix with 0.0 (fix issue #3229)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 18:49:39 +01:00
Egor Pugin
1ab6b0fbc6 Merge pull request #3311 from stweil/master
Replace calls of exit function
2021-02-26 17:43:53 +03:00
Stefan Weil
58304cbfdd Don't compile OpenCL code when OpenCL is disabled
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 15:40:23 +01:00
Stefan Weil
a6946c3bf9 Replace calls of exit function
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:22:36 +01:00
Stefan Weil
373a3527ec Format code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:22:09 +01:00
Stefan Weil
ea446b1eae Remove blanks at line endings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:05:36 +01:00
Stefan Weil
394c56ab15 Replace GenericVector by std::vector in class WERD_CHOICE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 23:14:25 +01:00
Stefan Weil
fccecb2d23 Replace GenericVector by std::vector in class ResultIterator
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 21:07:57 +01:00
Stefan Weil
2257028052 Replace GenericVector by std::vector in reject.cpp
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 21:06:59 +01:00
Stefan Weil
d62f27dd8f Replace GenericVector by std::vector in stepblob.cpp
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:47:06 +01:00
Stefan Weil
3e5b2760ab Replace GenericVector by std::vector for struct BlamerBundle
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:34:41 +01:00
Stefan Weil
0b8e937655 Use countof to get number of array elements
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
7097dfd41c Replace GenericVector by std::vector for parameters
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
f2d2695ce9 Replace STRING and clean declarations of local variables in eval_word_spacing
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
5277443833 Replace more STRING
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
ae00f291f6 Remove unused include statements
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-22 22:28:47 +01:00
Stefan Weil
65053890d7 Handle file list without terminating LF (fix issue #3298)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-13 11:44:47 +01:00
Stefan Weil
bc69e28de3 Update include statements for external header file allheaders.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-13 10:17:20 +01:00
Stefan Weil
e6f15621c2 Remove Python training scripts which were moved to tesstrain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-04 14:45:19 +01:00
Shree Devi Kumar
40f3c8d104 Change LATIN_FONTS to use replacement fonts from TeX Gyre collection 2021-02-04 13:51:03 +01:00
Stefan Weil
4902e68682 cmake: Use pkg_config to find required libraries
This is needed for cmake builds on MacOS (Intel and Amd64) with Homebrew.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-31 17:23:06 +01:00
Stefan Weil
e999f421bc Replace GenericVector<float> by std::vector<float> for class SimpleStats
This also fixes a runtime error:

    src/ccutil/genericvector.h:228:11: runtime error:
      null pointer passed as argument 1, which is declared to never be null

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-26 14:29:07 +01:00
Stefan Weil
4b84a56d8d Replace STRING by std::string for function read_unlv_file
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:46:12 +01:00
Stefan Weil
139d127ff7 Remove unneeded include statement for genericvector.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:29:57 +01:00
Stefan Weil
71fb535427 Remove unneeded include statement for strngs.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:29:57 +01:00
Stefan Weil
44fd1c4986 Wordrec: Modernize code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 15:53:55 +01:00
Stefan Weil
5a3d6e5e0d Fix memory leak in mastertrainer_test (fixes issue #3215)
The issue was introduced in commit 6e9456415.

Partially reverting this commit fixes it.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 14:54:38 +01:00
Stefan Weil
e3fd938bca lstmtrainer: Modernize code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-22 08:17:19 +01:00
Stefan Weil
0cdaab5ac9 lstmtrainer: Remove unused local variable
This fixes a compiler warning:
    src/training/unicharset/lstmtrainer.cpp:107:15: warning:
      unused variable 'shape' [-Wunused-variable]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-22 08:13:38 +01:00
Stefan Weil
3d47e0a91a Replace GenericVector by std::vector in LoadFileLinesToStrings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-22 08:13:38 +01:00
Stefan Weil
5d44a8216f Show names of failing lstmf files in error messages
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-20 13:36:59 +01:00
Stefan Weil
c7baf8f17d Add more information shown by combine_tessdata -l
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-15 18:49:51 +01:00
Stefan Weil
3195c8f75f Add new option -l for combine_tessdata to list the network string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-15 18:49:51 +01:00
Stefan Weil
970eba79e6 Replace STRING by std::string for LSTMRecognizer::network_str_
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-15 18:49:51 +01:00
Stefan Weil
97cfd95872 Replace STRING by char* in LSTMRecognizer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-15 18:49:51 +01:00
Stefan Weil
73ffcabfe9 lstmtraining: Interpret negative value for --max_iterations as epochs
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 19:51:58 +01:00
Stefan Weil
40bdcd2941 Add TESS_API to instantiation of template functions
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 18:07:35 +01:00
Stefan Weil
80810218f7 Use explicit int32_t for serialized data type
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 18:06:39 +01:00
Stefan Weil
05da41dc60 Replace GenericVector<BlobData> by std::vector<BlobData>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 17:23:13 +01:00
Robert Pösel
7dcd9b5095 Remove ANDROID_BUILD macro
Build fails when ANDROID_BUILD is defined, because it removes parts of the LSTM engine, but there are still some unguarded references. Removing the LSTM engine is not needed anyway, as it works perfectly fine on Android.

This macro doesn't provide any benefit anymore and is not even used in the current build config. If needed, the ANDROID macro should be used instead (it is already used in a few places).
2021-01-14 14:31:34 +01:00
Stefan Weil
08f2ba02f7 Fix memory allocation in TFile::DeSerialize(std::vector<T>& data)
lstmtraining crashed when creating traineddata files:

    Error: attempt to subscript container with out-of-bounds index 0, but
    container only holds 0 elements.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 12:11:02 +01:00
Stefan Weil
5e661b9339 Don't use local CP_RESULT_STRUCT variable to initialize elements of std::vector
std::vector passes that local variable by reference, so no individual
instances are used for the new vector elements.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-13 15:57:04 +01:00
Stefan Weil
b0e46085f4 Fix serialization of std::vector (fix issue #3220)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-12 21:23:14 +01:00
Stefan Weil
9b15e65900 Replace resize(0) by clear() for std::vector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-12 19:24:54 +01:00
Shree Devi Kumar
5104af6a15 Remove --psm 6 for lstm.train in tesstrain.py 2021-01-12 13:26:33 +01:00
Shree Devi Kumar
106b3d1ed0 No --psm 6 for lstm.train 2021-01-12 12:42:53 +01:00
Robert Pösel
ca9c7ba303 Fix NEON also in tesseractmain.cpp 2021-01-11 12:17:25 +01:00
Robert Pösel
1954ee3867 Fix use of NEON on ARMv8
The neon_available_ flag is automatically set to true when __aarch64__ is defined,
but the actual check for neon_available_ also required HAVE_NEON to be defined.

Now we check the flag also when only __aarch64__ is defined.
2021-01-11 12:17:16 +01:00
Stefan Weil
021237ad2c Add assertion for IntCastRounded
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-10 15:08:31 +01:00
Stefan Weil
209c1df599 Fix some format strings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-08 18:49:21 +01:00
Egor Pugin
8cb1c62259 More std::vector. 2021-01-07 15:13:59 +03:00
Egor Pugin
8d6cad1acc Misc. 2021-01-07 14:33:45 +03:00
Egor Pugin
4f5bd1c562 Move unicodes into files where they are used. 2021-01-07 14:33:02 +03:00
Egor Pugin
8aa5492262 Misc. 2021-01-07 14:14:40 +03:00
Egor Pugin
9cc7bdeaa6 Use std::bitset<16> instead of custom BITS16. 2021-01-07 14:14:27 +03:00
Egor Pugin
9710bc0465 More std::vector. 2021-01-07 13:57:57 +03:00
Stefan Weil
d000df7e00 Remove remaining parts of tessopt (fix autotools build)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-05 23:06:17 +01:00
Egor Pugin
8e947a98b5 Remove emalloc. Replace it with malloc. To be replaced with new later. 2021-01-06 00:30:52 +03:00
Egor Pugin
af4ebaa943 Alloc on stack. 2021-01-05 18:07:40 +03:00
Egor Pugin
d3729cb34e Remove unused members. 2021-01-05 18:07:10 +03:00
Egor Pugin
40aca00559 Remove unused var. 2021-01-05 17:56:39 +03:00
Egor Pugin
14cf6adda2 More std::vector. 2021-01-05 17:53:05 +03:00
Egor Pugin
a44d107e94 Misc. 2021-01-05 17:45:34 +03:00
Egor Pugin
6e94564152 [training] More unique ptrs. 2021-01-05 17:03:26 +03:00
Egor Pugin
4415209fd6 Remove tessopt. This fixes mastertrainer test in shared build. 2021-01-05 17:00:27 +03:00
Egor Pugin
c946a5610c Remove unused header. 2021-01-05 16:45:24 +03:00
Egor Pugin
8950e49a5d Remove unused var. 2021-01-05 16:45:07 +03:00
Egor Pugin
5160426400 Misc. 2021-01-05 16:31:09 +03:00
Egor Pugin
fb98b9b2f5 Use unique_ptr. 2021-01-05 16:00:22 +03:00
Egor Pugin
aa80aa5de1 More std::vector. 2021-01-05 15:54:30 +03:00
Egor Pugin
4f8f8e3d58 More std::vector. Simplify. 2021-01-05 15:49:53 +03:00
Egor Pugin
ca514ad91e [test] Return early on error. 2021-01-05 15:37:43 +03:00
Egor Pugin
4ed601956e More std::vector. 2021-01-05 14:46:11 +03:00
Egor Pugin
0c7139ce09 A better fix to read unichars. Imbue C locale always since on different systems, default locale will give different results. 2021-01-04 20:36:21 +03:00
Egor Pugin
0364832ab8 Correctly read cutoff classes. 2021-01-04 20:20:17 +03:00
Egor Pugin
71f578a198 Do not swap endian elements with size == 1. 2021-01-04 20:00:46 +03:00
Egor Pugin
4e59d964dc Use templates for serialize/deserialize. 2021-01-04 20:00:25 +03:00
Egor Pugin
4162e37e8c Use std::vector. 2021-01-04 19:54:51 +03:00
Egor Pugin
3aae46d53d Remove noisy message. 2021-01-04 18:11:16 +03:00
Stefan Weil
40ba25acbb Remove functions which are only used locally from scanedg.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-04 15:49:15 +01:00
Stefan Weil
709acf74fe Remove functions which are only used locally from fpchop.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-03 21:41:56 +01:00