Stefan Weil
5384aa7b21
Modernize code (clang-tidy -checks='-*,modernize-use-equals-delete')
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:56 +01:00
Stefan Weil
406233f1ae
Modernize code (clang-tidy -checks='-*,modernize-use-equals-default')
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:56 +01:00
Stefan Weil
27293fad62
Modernize code (clang-tidy -checks='-*,modernize-use-emplace')
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
6fc31c44f8
Modernize code (clang-tidy -checks='-*,modernize-use-bool-literals')
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
35e143ddfc
Modernize code (clang-tidy -checks='-*,modernize-use-auto')
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
1439efa734
Modernize code (clang-tidy -checks='-*,modernize-make-unique')
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
02774bda6e
Modernize code (clang-tidy -checks='-*,modernize-loop-convert')
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
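As a rough illustration of what the clang-tidy checks named in the commits above rewrite, the following sketch shows each pattern in its modernized form. The class PageList and the function MakePageList are hypothetical names made up for this example; they are not taken from the Tesseract sources, and the real diffs are not shown in this log.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class for illustration only; it is not part of Tesseract.
class PageList {
 public:
  PageList() = default;                           // modernize-use-equals-default
  PageList(const PageList&) = delete;             // modernize-use-equals-delete
  PageList& operator=(const PageList&) = delete;

  void Add(int number, const std::string& name) {
    assert(number > 0);
    pages_.emplace_back(number, name);            // modernize-use-emplace
    dirty_ = true;                                // modernize-use-bool-literals
  }

  int SumOfNumbers() const {
    int sum = 0;
    for (const auto& page : pages_) {             // modernize-loop-convert
      sum += page.first;
    }
    return sum;
  }

  int FirstNumber() const {
    auto it = pages_.begin();                     // modernize-use-auto
    return it == pages_.end() ? -1 : it->first;
  }

 private:
  std::vector<std::pair<int, std::string>> pages_;
  bool dirty_ = false;
};

// Builds a small list with two entries (numbers 1 and 2).
std::unique_ptr<PageList> MakePageList() {
  auto list = std::make_unique<PageList>();       // modernize-make-unique
  list->Add(1, "cover");
  list->Add(2, "index");
  return list;
}
```

std::make_unique requires C++14 or later.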
Stefan Weil
719dc1d7da
Modernize code using override
...
The modifications were made using this command:
run-clang-tidy -header-filter='.*' -checks='-*,modernize-use-override' -fix
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 20:06:38 +01:00
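A minimal sketch of the kind of declaration that modernize-use-override produces: the derived class marks its overriding function with override, so the compiler rejects it if the base signature ever changes. Network and Softmax are used here only as illustrative names with made-up members; the code is an assumption, not a copy of Tesseract's classes.

```cpp
#include <cassert>

// Hypothetical base class for illustration only.
struct Network {
  virtual ~Network() = default;
  virtual int NumOutputs() const { return 0; }
};

// The check adds the override specifier to overriding member functions.
struct Softmax : Network {
  int NumOutputs() const override { return 10; }
};

// Usage: a call through the base class reaches the override.
inline void CheckDispatch() {
  Softmax softmax;
  const Network& network = softmax;
  assert(network.NumOutputs() == 10);
  (void)network;
}
```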
Stefan Weil
187ac4136a
Fix LGTM alert (local variable hides a parameter)
...
LGTM alert:
Local variable 'correct_text' hides a parameter of the same name.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 18:20:13 +01:00
Egor Pugin
7d17b72ba5
Use more smart pointers.
2021-03-21 15:19:21 +03:00
Stefan Weil
0c20d3f843
Fix compiler warnings (mostly -Wsign-compare)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 09:29:34 +01:00
Stefan Weil
55d87f642c
Disable most Leptonica messages for tesseract by default
...
They were disabled in earlier builds which used NDEBUG, too.
Allow manual setting of the Leptonica message level
with environment variable LEPT_MSG_SEVERITY.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 20:16:16 +01:00
Stefan Weil
19afcdb79b
Remove unused function UnicharIdArrayUtils::find_in
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 15:51:28 +01:00
Stefan Weil
7af5b75b8f
Disable unused WriteMemoryCallback if libcurl is not used
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 15:49:06 +01:00
Egor Pugin
db7a977eab
Use smart pointers.
2021-03-20 16:04:45 +03:00
Egor Pugin
69ab5bbf65
Misc.
2021-03-20 16:04:00 +03:00
Stefan Weil
f176e7c274
Fix double free caused by commit f33e80e (fixes issue #3348)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 12:37:56 +01:00
Stefan Weil
87b0a4de97
Rename GenericVector::get
...
The new name GenericVector::at is compatible with standard containers.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:42:19 +01:00
Stefan Weil
2c1c09bd6a
Rename UnicityTable::get, UnicityTable::get_mutable
...
The new name UnicityTable::at is compatible with standard containers.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:40:00 +01:00
Stefan Weil
883353df63
Replace std::array by std::vector to avoid stack overflow
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:39:16 +01:00
Stefan Weil
ec2c989d00
Modernize code in src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:06:40 +01:00
Stefan Weil
54aec32586
Replace remaining PointerVector by std::vector for src/lstm
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 22:22:04 +01:00
Stefan Weil
0d739530a5
Remove unused PointerVector::DeSerialize, PointerVector::DeSerializeElement
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 21:53:17 +01:00
Stefan Weil
7207cf13d7
Replace more PointerVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 21:53:08 +01:00
Stefan Weil
aa64d83c2f
Replace more PointerVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 15:22:29 +01:00
Stefan Weil
79477dc2fe
Replace more PointerVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 14:46:25 +01:00
Stefan Weil
752779aaed
Replace more PointerVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
cac116dd11
Replace more PointerVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
dae5accceb
Replace remaining PointerVector by std::vector for src/api
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
9e006a8bbc
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
65d882f96e
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
8ed6dee8e9
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
abc22976e4
Replace remaining PointerVector by std::vector for src/api
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
7f11261076
Suppress resolution warning if no resolution was given
...
Tesseract reported confusing information for images without resolution:
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 642
The warning is also shown when the resolution is not used at all
when preparing data for training.
It is now suppressed when there is no resolution information
(resolution == 0).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 10:45:54 +01:00
Stefan Weil
52a82b4356
Fix new alert reported by LGTM
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 23:27:17 +01:00
Stefan Weil
f33e80e2fb
Replace remaining PointerVector by std::vector for src/lstm
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 20:14:40 +01:00
Stefan Weil
07d147d4a6
Replace more PointerVector by std::vector for src/textord
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 19:04:00 +01:00
Stefan Weil
b0e30bd247
Replace remaining PointerVector by std::vector for src/wordrec
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 18:56:08 +01:00
Stefan Weil
b62a86a93f
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 17:16:43 +01:00
Stefan Weil
177703c562
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
9e566de0f2
Remove unused classes WordFeature, FloatWordFeature
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
7b92614efa
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
a584ee5ac0
Add missing include statement (fix CI build)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 15:59:58 +01:00
Stefan Weil
9eab1d60c1
Replace more GenericVector by std::vector for src/ccutil
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 15:04:56 +01:00
Stefan Weil
f8d55f30d8
Replace more GenericVector by std::vector for src/ccutil
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:31:13 +01:00
Stefan Weil
d9739ba459
Replace more GenericVector by std::vector for src/ccutil
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:27:37 +01:00
Stefan Weil
4b428df131
Replace more GenericVector by std::vector for src/ccutil
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:18:49 +01:00
Stefan Weil
92e98a30e1
Replace more GenericVector by std::vector for src/ccutil
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:04:22 +01:00
Stefan Weil
573e7d6bb9
Replace more GenericVector by std::vector
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 11:58:13 +01:00
Stefan Weil
a80689559b
Partially revert "Replace more GenericVector by std::vector for src/ccutil"
...
This partially reverts and cleans commit 96d72298b12f744a72e5c3cea67924779e859e42
which had broken intfeaturemap_test.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 11:43:32 +01:00
Stefan Weil
576d8d6c63
Partially revert "Replace remaining GenericVector by std::vector for src/training"
...
This partially reverts commit 7df1cb0bab
which had broken lstm_squashed_test.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 10:59:07 +01:00
Stefan Weil
77dbd3ee02
Remove two type casts
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 09:04:39 +01:00
Stefan Weil
7fdf79aff4
Move function ExtractFontName to baseapi.cpp
...
It is only used there, so now a local function.
This also allows removing blobclass.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
a847e0f9b5
Replace remaining GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
7df1cb0bab
Replace remaining GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
4d8e9dc659
Replace more GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
37c9cf4940
Replace more GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
a00e7bc2bb
Replace more GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
1609014525
Replace more GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
cb207ce645
Replace more GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
b0b6bbf019
Replace more GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
699f727f3e
Replace more GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
edab5ddee8
Replace remaining choose_nth_item by std::nth_element
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 07:23:40 +01:00
Stefan Weil
94a3a70fda
Fix new alerts reported by LGTM
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
f5a10618bf
Add missing reference & for loop iterator
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
5dc3f25aca
Make only locally used functions row_y_order and row_spacing_order static
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
edd599fa7b
Replace more GenericVector by std::vector and remove GenericVector::choose_nth_item
...
KDVector is now derived from std::vector.
This requires an update for unittest nthitem_test because
std::nth_element does not handle all corner cases of choose_nth_item.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
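The following sketch shows how std::nth_element can stand in for an "n-th item" selection. NthItem is a hypothetical helper written for this example, not the replacement code from the commit. Note that std::nth_element needs a valid position inside a non-empty range; which corner cases choose_nth_item handled beyond that is not stated in the log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the value that would sit at index n if `values` were sorted.
// The vector is taken by value, so the caller's order is left untouched.
// Precondition: n < values.size().
float NthItem(std::vector<float> values, std::size_t n) {
  assert(n < values.size());
  std::nth_element(values.begin(),
                   values.begin() + static_cast<std::ptrdiff_t>(n),
                   values.end());
  return values[n];
}
```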
Stefan Weil
4779615679
Replace more GenericVector by std::vector for src/ccutil
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
4103c40a29
Replace more GenericVector by std::vector for src/ccutil
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
e0b1093249
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
71dfb82065
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
dcef5a5df1
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
314933823a
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
6c589e044f
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
9728bbc596
Replace more GenericVector by std::vector for src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
415d9aa2da
Replace remaining GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 17:31:12 +01:00
Stefan Weil
ef39692451
Replace remaining GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 17:31:12 +01:00
Stefan Weil
2fb6f9eb72
Replace remaining GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
c8c9428824
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
71df85a4b1
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
d5aa220347
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
114c058fe4
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
9f1041efa7
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
aea7440847
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
a17f63f43e
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
0f632e1dda
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
6fcbea3533
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
fa93232517
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
487f5fad11
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
666ea8d560
Replace more GenericVector by std::vector for src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Weil
c03ffda45a
Replace more GenericVector by std::vector for src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00
Stefan Brechtken
288b8cac11
Merge branch 'master' of https://github.com/Sintun/tesseract
2021-03-17 11:09:01 +01:00
Stefan Brechtken
ec8d7dd6bb
Changing structure name MyTable -> TessTable and using tesseract namespace
2021-03-17 11:07:51 +01:00
Sintun
c4ba513994
Update src/textord/tablerecog.h
...
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:42:31 +01:00
Sintun
55fbee2d4c
Update src/textord/tablerecog.h
...
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:42:23 +01:00
Sintun
14408861ea
Update src/ccstruct/tabletransfer.h
...
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:36:15 +01:00
Sintun
02055d667c
Update src/ccstruct/tabletransfer.h
...
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 10:36:09 +01:00
Stefan Brechtken
5e8c8c2b4d
conflict merge, removing an unnecessary include
2021-03-16 23:47:43 +01:00
Stefan Weil
223f356027
Fix alerts reported by LGTM
...
They were caused by recent commits which replaced GenericVector by std::vector.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 19:04:00 +01:00
Stefan Weil
8cfaf7bf64
Fix removal of duplicates in StructuredTable::FindLinedStructure
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 17:49:54 +01:00
Stefan Weil
5db92b26aa
Replace remaining GenericVector by std::vector for src/textord
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 16:59:12 +01:00
Stefan Weil
1f94d79c81
Replace remaining GenericVector by std::vector for src/ccmain
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 16:55:38 +01:00
Stefan Brechtken
d856acba56
Change License to Apache V2, add new file to Makefile.am, change file name to .h ending
2021-03-16 14:16:02 +01:00
Stefan Weil
bf42f8313d
Replace remaining GenericVector by std::vector for src/dict
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Weil
17eee8648f
Replace more GenericVector by std::vector
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Weil
2a3682a35e
Replace remaining GenericVector by std::vector in src/lstm
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-16 12:25:11 +01:00
Stefan Brechtken
e10d19b084
updating function documentation and removing unnecessary include
2021-03-15 17:25:10 +01:00
Stefan Brechtken
594a000ecd
merging with tesseract master in order to create a pull request
2021-03-15 17:02:19 +01:00
Stefan Weil
e51fcb2d31
Remove last usage of STRING
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
57920174dc
Remove unused parts of class STRING
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
576c09bf31
Replace remaining STRING by std::string in unittest
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
0edd69eb10
Replace remaining STRING by std::string in src/training
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
d16fba9bed
Replace all but one remaining STRING by std::string in src/ccstruct
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
21cf7cf84e
Replace remaining STRING by std::string in src/dict
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
21d9aad594
Replace remaining STRING by std::string in src/viewer and src/wordrec
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
e0ce040832
Replace remaining STRING by std::string in src/classify
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Stefan Weil
db9f963411
Replace remaining STRING by std::string in src/ccmain
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-15 09:11:41 +01:00
Egor Pugin
d7823a71c2
Remove unused file.
2021-03-15 09:47:04 +03:00
Egor Pugin
efd17e205a
Replace typedef structs with structs.
...
typedef enums are left intact.
2021-03-15 09:47:04 +03:00
Egor Pugin
262f65a4d2
snprintf will add '\0' at the end itself.
2021-03-14 23:54:29 +03:00
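A small sketch of the behaviour this commit relies on: with a non-zero buffer size, snprintf always writes the terminating '\0' and truncates the output if necessary, so no manual termination is needed. FormatIndex and the buffer size are assumptions chosen for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats "idx<number>" into a small fixed buffer.
// snprintf terminates the buffer itself, even when the text is truncated.
std::string FormatIndex(int index) {
  char buffer[8];
  const int written = std::snprintf(buffer, sizeof(buffer), "idx%d", index);
  assert(written >= 0);  // a negative value signals an encoding error
  (void)written;
  return std::string(buffer);
}
```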
Egor Pugin
26ceeef6c0
[training] Modernize.
2021-03-14 23:47:42 +03:00
Shree Devi Kumar
efe9ff611f
Limit unicharset from training_text only to Indic languages
2021-03-14 17:58:57 +00:00
Shree Devi Kumar
a589ded25f
Create unicharset from training text to avoid normalization errors
2021-03-14 16:39:00 +00:00
Egor Pugin
f06b2c7c8d
[capi] Restore some wrongly removed APIs.
...
Removed C++ APIs are not restored.
Additionally remove unused C++ typedefs which were in removed C++ functions.
If you still need them, use C++ API instead.
2021-03-14 17:20:52 +03:00
Egor Pugin
dabdaa1def
Misc.
2021-03-14 17:14:41 +03:00
Stefan Weil
7178ebd799
Add missing TESS_API for new function tesseract::split
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-14 08:16:33 +01:00
Stefan Weil
36f9131e04
Move implementation of tesseract::split from header to cpp file
...
This fixes duplicate symbols for some builds.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:39:58 +01:00
Stefan Weil
3b0759940c
Replace more STRING by std::string
...
Remove STRING::add_str_int and STRING::add_str_double which are now unused.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 23:16:35 +01:00
Stefan Weil
c9f0da49ca
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
91f7675848
Replace more STRING by std::string for src/ccmain
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
d084c7cca8
Replace remaining STRING by std::string for src/api
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
96d1644da1
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
a42c6c7dcd
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
9cf5b9870d
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
51909d5a2e
Replace more STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:52 +01:00
Stefan Weil
d6495d9026
Replace STRING by std::string in src/lstm
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 21:15:51 +01:00
Stefan Weil
1f2ec4dfb1
Fix network specification for NT_SYMCLIP
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-13 13:10:37 +01:00
Stefan Weil
6bf5080d4c
Remove unused include statements for strngs.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 23:11:08 +01:00
Egor Pugin
a393df5038
Add missing export header.
2021-03-13 00:07:19 +03:00
Egor Pugin
2d10be5209
[clang-format] Format generated protobuf source.
2021-03-13 00:07:03 +03:00
Egor Pugin
618b185d14
Include missing config_auto.h
2021-03-12 23:39:18 +03:00
Egor Pugin
8b0c5405e2
Add missing forward decl.
2021-03-12 22:35:30 +03:00
Egor Pugin
0eb7ba88bf
[clang-format] Execute clang format on include and src dirs.
...
Script:
find include src -type f | sort > all.txt
find include src -type f | grep -v "\.cpp" | grep -v "\.h" | sort > skip.txt
comm -23 all.txt skip.txt | xargs clang-format -i
2021-03-12 22:35:02 +03:00
Stefan Weil
4c6cc5a04d
Replace GenericVector by std::vector in class ImageData
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-12 13:10:25 +01:00
Ger Hobbelt
779aa79350
Fix build (#3322)
...
* fix errors after merge commit: missing changes that are needed too to make this codebase compile.
* Update src/wordrec/wordrec.h
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-03-11 21:43:07 +01:00
Egor Pugin
3444618075
Fix linux build.
2021-03-10 15:35:13 +03:00
Egor Pugin
ce058604ba
Pass empty strings into Tesseract::init_tesseract().
2021-03-10 15:21:03 +03:00
Egor Pugin
911dd93f12
Pass init strings as std::string instead of const char * internally. This does not affect public APIs.
2021-03-10 15:17:00 +03:00
Egor Pugin
9792f3c4ff
Remove STRING::size() method.
2021-03-10 14:58:37 +03:00
Egor Pugin
6de97309a1
Remove unused STRING::strdup().
2021-03-10 14:42:50 +03:00
Egor Pugin
f0e30a2af2
Remove unused STRING::unsigned_size().
2021-03-10 14:41:31 +03:00
Egor Pugin
d36adf3d40
Replace STRING::truncate_at() with resize().
2021-03-10 14:40:28 +03:00
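A sketch of truncation with std::string::resize. TruncateAt is a hypothetical wrapper, and the assumption that the old method only ever shortened the string is not confirmed by this log. The guard matters because resize with a larger size would grow the string and pad it with '\0' characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Cuts `text` down to `index` characters; shorter strings are returned as-is.
std::string TruncateAt(std::string text, std::size_t index) {
  if (index < text.size()) {
    text.resize(index);
  }
  assert(text.size() <= index);
  return text;
}
```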
Egor Pugin
e9a2fc0083
More std::string replacements.
2021-03-10 14:36:59 +03:00
Stefan Weil
0f1296c6f6
Clean implementation for (de-)serialization of a vector
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-08 13:33:48 +01:00
Stefan Weil
6cfe604d58
Fix serialization for vector of RecodedCharID
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-07 23:01:25 +01:00
Stefan Weil
0cde3ede98
Add heuristic to fix swap (partially fixes issue #2586)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
a2769aebb4
Replace GenericVector<TBOX> by std::vector<TBOX>
...
Fix also endianness handling for (de)serialisation of TBOX.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
c31c1a7d60
Fix two compiler warnings for serialis.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-05 14:27:28 +01:00
Stefan Weil
fe614c6069
Enable fewer FP exceptions for clang compiler when running tesseract
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-03 22:56:07 +01:00
Egor Pugin
c39b1daa6b
GenericVector -> std::vector.
2021-03-03 22:22:00 +03:00
Egor Pugin
0a693a9519
Allow to serialize std vectors with classes from TFile. Implementation from GenericVector.
2021-03-03 22:21:40 +03:00
Stefan Weil
ff830775f9
Fix memory leak in DocumentCache
...
It was introduced in commit 5cac52173e.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-01 11:31:48 +01:00
Stefan Weil
339c01894e
Avoid fp division by 0 (fix issue #3314)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-28 19:42:01 +01:00
Stefan Weil
cd60728e8a
Avoid float division by zero when calculating adaptive learning rate
...
The following line results in a division by zero when
momentum is -1 and num_samples is even:
learning_rate /= 1.0f - pow(momentum, num_samples);
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
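To make the failure case concrete: with momentum == -1 and an even num_samples, pow(momentum, num_samples) is exactly 1, so the divisor 1.0f - 1.0f is 0. The sketch below guards against that. CorrectLearningRate is a hypothetical helper, and leaving the rate unchanged when the divisor is zero is an assumption; the log does not show how the commit actually avoids the division.

```cpp
#include <cassert>
#include <cmath>

// Applies the correction from the commit message unless the divisor is zero.
float CorrectLearningRate(float learning_rate, float momentum, int num_samples) {
  assert(num_samples >= 0);
  const float divisor =
      1.0f - std::pow(momentum, static_cast<float>(num_samples));
  if (divisor == 0.0f) {
    return learning_rate;  // assumption: skip the correction in this case
  }
  return learning_rate / divisor;
}
```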
Stefan Weil
c12dde2862
Use float instead of double for learning_rate, momentum and adam_beta
...
Only WeightMatrix::Update used double parameters, all other functions
already used float. So this change avoids unnecessary conversions.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-27 21:08:41 +01:00
Stefan Weil
422452b9f4
Check for float errors when running tesseract and lstmtraining
...
Some illegal floating point calculations like division by zero,
illegal value or overflow will now abort tesseract with an error
message.
For lstmtraining there is now a new parameter --debug_float to
enable the same kind of checks. It is currently disabled by default
because such errors occur and would abort the training process.
That should be fixed in the future.
If tesseract also shows floating point errors which cannot be
fixed easily, a similar parameter to enable the checks can be
added there, too.
The new code requires the function feenableexcept which is only
available with the GNU libc, so it is only used on Linux.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:49:27 +01:00
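A minimal sketch of enabling floating point traps with feenableexcept, which is a GNU libc extension. The helper names and the mapping of "division by zero, illegal value or overflow" to FE_DIVBYZERO, FE_INVALID and FE_OVERFLOW are assumptions for this example. Once traps are enabled, an offending calculation raises SIGFPE instead of silently producing inf or NaN.

```cpp
#include <cassert>
#include <fenv.h>

// Assumed set of exceptions, based on the wording of the commit message.
int FloatTrapMask() {
  return FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW;
}

// Returns true if trapping was enabled, false where it is unavailable.
bool EnableFloatTraps() {
  assert(FloatTrapMask() != 0);
#if defined(__GLIBC__)
  // feenableexcept returns the previous mask, or -1 on failure.
  return feenableexcept(FloatTrapMask()) != -1;
#else
  return false;  // no feenableexcept outside the GNU libc
#endif
}
```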
Stefan Weil
51a214a51b
Remove unused include statements for imagedata.h and document used ones
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:28 +01:00
Stefan Weil
1d7a981203
Disable code for unused classes WordFeature and FloatWordFeature
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:17 +01:00
Stefan Weil
5cac52173e
Replace PointerVector by std::vector in class DocumentCache
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 21:42:07 +01:00
Stefan Weil
387acd9881
Initialize weight matrix with 0.0 (fix issue #3229)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 18:49:39 +01:00
Egor Pugin
1ab6b0fbc6
Merge pull request #3311 from stweil/master
...
Replace calls of exit function
2021-02-26 17:43:53 +03:00
Stefan Weil
58304cbfdd
Don't compile OpenCL code when OpenCL is disabled
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 15:40:23 +01:00
Stefan Weil
a6946c3bf9
Replace calls of exit function
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:22:36 +01:00
Stefan Weil
373a3527ec
Format code
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:22:09 +01:00
Stefan Weil
ea446b1eae
Remove blanks at line endings
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-26 14:05:36 +01:00
Stefan Weil
394c56ab15
Replace GenericVector by std::vector in class WERD_CHOICE
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 23:14:25 +01:00
Stefan Weil
fccecb2d23
Replace GenericVector by std::vector in class ResultIterator
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 21:07:57 +01:00
Stefan Weil
2257028052
Replace GenericVector by std::vector in reject.cpp
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 21:06:59 +01:00
Stefan Weil
d62f27dd8f
Replace GenericVector by std::vector in stepblob.cpp
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:47:06 +01:00
Stefan Weil
3e5b2760ab
Replace GenericVector by std::vector for struct BlamerBundle
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:34:41 +01:00
Stefan Weil
0b8e937655
Use countof to get number of array elements
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
7097dfd41c
Replace GenericVector by std::vector for parameters
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
f2d2695ce9
Replace STRING and clean declarations of local variables in eval_word_spacing
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
5277443833
Replace more STRING
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-23 20:20:48 +01:00
Stefan Weil
ae00f291f6
Remove unused include statements
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-22 22:28:47 +01:00
Stefan Weil
65053890d7
Handle file list without terminating LF (fix issue #3298)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-13 11:44:47 +01:00
Stefan Weil
bc69e28de3
Update include statements for external header file allheaders.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-13 10:17:20 +01:00
Stefan Weil
e6f15621c2
Remove Python training scripts which were moved to tesstrain
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-02-04 14:45:19 +01:00
Shree Devi Kumar
40f3c8d104
Change LATIN_FONTS to use replacement fonts from TeX Gyre collection
2021-02-04 13:51:03 +01:00
Stefan Weil
4902e68682
cmake: Use pkg_config to find required libraries
...
This is needed for cmake builds on MacOS (Intel and Amd64) with Homebrew.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-31 17:23:06 +01:00
Stefan Weil
e999f421bc
Replace GenericVector<float> by std::vector<float> for class SimpleStats
...
This also fixes a runtime error:
src/ccutil/genericvector.h:228:11: runtime error:
null pointer passed as argument 1, which is declared to never be null
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-26 14:29:07 +01:00
Stefan Weil
4b84a56d8d
Replace STRING by std::string for function read_unlv_file
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:46:12 +01:00
Stefan Weil
139d127ff7
Remove unneeded include statement for genericvector.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:29:57 +01:00
Stefan Weil
71fb535427
Remove unneeded include statement for strngs.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:29:57 +01:00
Stefan Weil
44fd1c4986
Wordrec: Modernize code
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 15:53:55 +01:00
Stefan Weil
5a3d6e5e0d
Fix memory leak in mastertrainer_test (fixes issue #3215)
...
The issue was introduced in commit 6e9456415.
Partially reverting this commit fixes it.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 14:54:38 +01:00
Stefan Weil
e3fd938bca
lstmtrainer: Modernize code
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-22 08:17:19 +01:00
Stefan Weil
0cdaab5ac9
lstmtrainer: Remove unused local variable
...
This fixes a compiler warning:
src/training/unicharset/lstmtrainer.cpp:107:15: warning:
unused variable 'shape' [-Wunused-variable]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-22 08:13:38 +01:00
Stefan Weil
3d47e0a91a
Replace GenericVector by std::vector in LoadFileLinesToStrings
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-22 08:13:38 +01:00
Stefan Weil
5d44a8216f
Show names of failing lstmf files in error messages
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-20 13:36:59 +01:00
Stefan Weil
c7baf8f17d
Add more information shown by combine_tessdata -l
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-15 18:49:51 +01:00
Stefan Weil
3195c8f75f
Add new option -l for combine_tessdata to list the network string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-15 18:49:51 +01:00
Stefan Weil
970eba79e6
Replace STRING by std::string for LSTMRecognizer::network_str_
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-15 18:49:51 +01:00
Stefan Weil
97cfd95872
Replace STRING by char* in LSTMRecognizer
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-15 18:49:51 +01:00
Stefan Weil
73ffcabfe9
lstmtraining: Interpret negative value for --max_iterations as epochs
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 19:51:58 +01:00
Stefan Weil
40bdcd2941
Add TESS_API to instantiation of template functions
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 18:07:35 +01:00
Stefan Weil
80810218f7
Use explicit int32_t for serialized data type
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 18:06:39 +01:00
Stefan Weil
05da41dc60
Replace GenericVector<BlobData> by std::vector<BlobData>
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 17:23:13 +01:00
Robert Pösel
7dcd9b5095
Remove ANDROID_BUILD macro
...
The build fails when ANDROID_BUILD is defined, because the macro removes parts of the LSTM engine while some unguarded references to them remain. Removing the LSTM engine is not needed anyway, as it works perfectly fine on Android.
This macro no longer provides any benefit and is not even used in the current build config. If needed, the ANDROID macro should be used instead (it is already used in a few places).
2021-01-14 14:31:34 +01:00
Stefan Weil
08f2ba02f7
Fix memory allocation in TFile::DeSerialize(std::vector<T>& data)
...
lstmtraining crashed when creating traineddata files:
Error: attempt to subscript container with out-of-bounds index 0, but
container only holds 0 elements.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-14 12:11:02 +01:00
Stefan Weil
5e661b9339
Don't use local CP_RESULT_STRUCT variable to initialize elements of std::vector
...
std::vector passes that local variable by reference, so no individual
instances are used for the new vector elements.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-13 15:57:04 +01:00
Stefan Weil
b0e46085f4
Fix serialization of std::vector (fix issue #3220)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-12 21:23:14 +01:00
Stefan Weil
9b15e65900
Replace resize(0) by clear() for std::vector
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-12 19:24:54 +01:00
Shree Devi Kumar
5104af6a15
Remove --psm 6 for lstm.train in tesstrain.py
2021-01-12 13:26:33 +01:00
Shree Devi Kumar
106b3d1ed0
No --psm 6 for lstm.train
2021-01-12 12:42:53 +01:00
Robert Pösel
ca9c7ba303
Fix NEON also in tesseractmain.cpp
2021-01-11 12:17:25 +01:00
Robert Pösel
1954ee3867
Fix use of NEON on ARMv8
...
Flag neon_available_ is automatically set to true when __aarch64__ is defined,
but the actual check for neon_available_ required having also HAVE_NEON defined.
Now we check the flag also when only __aarch64__ is defined.
2021-01-11 12:17:16 +01:00
Stefan Weil
021237ad2c
Add assertion for IntCastRounded
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-10 15:08:31 +01:00
Stefan Weil
209c1df599
Fix some format strings
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-08 18:49:21 +01:00
Egor Pugin
8cb1c62259
More std::vector.
2021-01-07 15:13:59 +03:00
Egor Pugin
8d6cad1acc
Misc.
2021-01-07 14:33:45 +03:00
Egor Pugin
4f5bd1c562
Move unicodes into files where they are used.
2021-01-07 14:33:02 +03:00
Egor Pugin
8aa5492262
Misc.
2021-01-07 14:14:40 +03:00
Egor Pugin
9cc7bdeaa6
Use std::bitset<16> instead of custom BITS16.
2021-01-07 14:14:27 +03:00
Egor Pugin
9710bc0465
More std::vector.
2021-01-07 13:57:57 +03:00
Stefan Weil
d000df7e00
Remove remaining parts of tessopt (fix autotools build)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-05 23:06:17 +01:00
Egor Pugin
8e947a98b5
Remove emalloc. Replace it with malloc. To be replaced with new later.
2021-01-06 00:30:52 +03:00
Egor Pugin
af4ebaa943
Alloc on stack.
2021-01-05 18:07:40 +03:00
Egor Pugin
d3729cb34e
Remove unused members.
2021-01-05 18:07:10 +03:00
Egor Pugin
40aca00559
Remove unused var.
2021-01-05 17:56:39 +03:00
Egor Pugin
14cf6adda2
More std::vector.
2021-01-05 17:53:05 +03:00
Egor Pugin
a44d107e94
Misc.
2021-01-05 17:45:34 +03:00
Egor Pugin
6e94564152
[training] More unique ptrs.
2021-01-05 17:03:26 +03:00
Egor Pugin
4415209fd6
Remove tessopt. This fixes mastertrainer test in shared build.
2021-01-05 17:00:27 +03:00
Egor Pugin
c946a5610c
Remove unused header.
2021-01-05 16:45:24 +03:00
Egor Pugin
8950e49a5d
Remove unused var.
2021-01-05 16:45:07 +03:00
Egor Pugin
5160426400
Misc.
2021-01-05 16:31:09 +03:00
Egor Pugin
fb98b9b2f5
Use unique_ptr.
2021-01-05 16:00:22 +03:00
Egor Pugin
aa80aa5de1
More std::vector.
2021-01-05 15:54:30 +03:00
Egor Pugin
4f8f8e3d58
More std::vector. Simplify.
2021-01-05 15:49:53 +03:00
Egor Pugin
ca514ad91e
[test] Return early on error.
2021-01-05 15:37:43 +03:00
Egor Pugin
4ed601956e
More std::vector.
2021-01-05 14:46:11 +03:00
Egor Pugin
0c7139ce09
A better fix to read unichars. Imbue C locale always since on different systems, default locale will give different results.
2021-01-04 20:36:21 +03:00
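As an illustration of why imbuing the C locale matters: a stream that uses the process-wide locale may expect a different decimal separator on some systems, so the same input can parse differently. The sketch below shows the standard-library mechanism only. The helper name `ParseDoubleClassic` is hypothetical and is not taken from the commit.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parse a number the same way on every system,
// regardless of the global locale.
bool ParseDoubleClassic(const std::string &text, double *value) {
  std::istringstream stream(text);
  // std::locale::classic() is the "C" locale: '.' is the decimal point.
  stream.imbue(std::locale::classic());
  stream >> *value;
  return !stream.fail();
}
```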
Egor Pugin
0364832ab8
Correctly read cutoff classes.
2021-01-04 20:20:17 +03:00
Egor Pugin
71f578a198
Do not swap endian elements with size == 1.
2021-01-04 20:00:46 +03:00
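The idea behind skipping single-byte elements is that a one-byte value has no byte order, so reversing it is wasted work. A minimal sketch of such a helper follows. The name `ReverseEndianIfNeeded` and its signature are assumptions for illustration and do not reflect the templates introduced in the neighbouring commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reverse the byte order of each element in place.
template <typename T>
void ReverseEndianIfNeeded(T *data, std::size_t count) {
  if (sizeof(T) == 1) {
    return;  // Single-byte elements have no byte order to swap.
  }
  for (std::size_t i = 0; i < count; ++i) {
    auto *bytes = reinterpret_cast<unsigned char *>(&data[i]);
    std::reverse(bytes, bytes + sizeof(T));
  }
}
```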
Egor Pugin
4e59d964dc
Use templates for serialize/deserialize.
2021-01-04 20:00:25 +03:00
Egor Pugin
4162e37e8c
Use std::vector.
2021-01-04 19:54:51 +03:00
Egor Pugin
3aae46d53d
Remove noisy message.
2021-01-04 18:11:16 +03:00
Stefan Weil
40ba25acbb
Remove functions which are only used locally from scanedg.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-04 15:49:15 +01:00
Stefan Weil
709acf74fe
Remove functions which are only used locally from fpchop.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-03 21:41:56 +01:00