6040 Commits

Author SHA1 Message Date
Egor Pugin
bb155a1bb4
Merge pull request #3663 from stweil/clang7
Allow compilation with clang-7
2021-11-28 23:02:41 +03:00
Stefan Weil
eb089c1346 autobuild: Fix autogen.sh (reduce build time)
After running autogen.sh and configure, the following make had to
run autoreconf because of dependencies which needed an update.

This is fixed by running aclocal twice.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 19:22:58 +01:00
Stefan Weil
a1f40cadc1 Avoid some unnecessary conversions from float to double
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 18:55:27 +01:00
Stefan Weil
5e8d877262 Modernize code in class Classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 18:44:20 +01:00
Stefan Weil
ffe2038ea6 Allow compilation with clang-7
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 10:45:46 +01:00
Stefan Weil
839f528b9a Remove unused GenericVector::contains_index, UnicityTable::contains_id
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 09:54:59 +01:00
Stefan Weil
8b21e4f0b8 Remove member function GenericVector<T>::contains
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-27 09:40:36 +01:00
Stefan Weil
739057c586 Remove member function UnicityTable<T>::contains
It was only used once, and the code using it can be simplified.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-27 09:40:17 +01:00
Egor Pugin
3313bb794b
Merge pull request #3657 from stweil/bcer
Limit BCER to interval [0,1]
2021-11-25 13:43:47 +03:00
Stefan Weil
99aea21336 Limit BCER to interval [0,1]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-25 08:04:26 +01:00
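Why a limit is useful here can be illustrated with a short sketch. It assumes the rate is an error count divided by the number of ground-truth characters, so an OCR result much longer than the ground truth could otherwise yield a value above 1. The function name `clamped_rate` and its parameters are hypothetical and are not taken from the Tesseract sources.

```python
def clamped_rate(errors: int, truth_size: int) -> float:
    """Return errors / truth_size limited to the interval [0, 1].

    Hypothetical helper: it illustrates the clamping idea only and is
    not Tesseract's implementation.
    """
    if errors <= 0:
        return 0.0
    if truth_size <= errors:
        # Also covers truth_size == 0, so no division by zero can occur.
        return 1.0
    return errors / truth_size
```

Under these assumptions, five counted errors against two ground-truth characters give 1.0 rather than 2.5.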
Egor Pugin
515e9906d4
Update sw.yml
2021-11-24 18:41:06 +03:00
Egor Pugin
6f399c0df1
[ci] Add vs2022 to sw workflow.
2021-11-24 14:17:08 +03:00
Stefan Weil
ee29fca9ce Create new release 5.0.0-rc3
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-22 22:01:18 +01:00
Amit D
2087c45f20
Update unittest-disablelegacy.yml
2021-11-22 21:28:29 +02:00
Stefan Weil
2c4665466e Format code with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-22 19:47:39 +01:00
Bernhard Liebl
555aa55f05 Add RowAttributes getter to PageIterator
[sw]: Cherry-picked commit from 4.1 branch

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-22 19:47:39 +01:00
Stefan Weil
b649222de3 Fix resultiterator_test with --disable-legacy
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-20 14:58:36 +01:00
Stefan Weil
5f27310d22 Fix some compiler warnings with --disable-legacy
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-20 14:57:08 +01:00
Amit D
47abbaa48f
Training: Fix compiler warnings (#3650)
warning: format ‘%c’ expects argument of type ‘int’, but argument 2 has type ‘tesseract::Validator::CharClass’ [-Wformat=]
2021-11-19 21:01:04 +02:00
Amit D
34b4391227
Update unittest-disablelegacy.yml
2021-11-19 11:05:20 +02:00
Amit D
75253a24c7
Improve the disable legacy build (#3649)
resultiterator_test: Disable some parts of EasyTest.
2021-11-19 10:52:18 +02:00
Stefan Weil
455feb35f2 Replace char error by BCER in more training messages
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-18 21:34:16 +01:00
Amit D
ff11f5dc65
Improve disable legacy build (#3648)
resultiterator_test: Disable SmallCapDropCapTest

Co-authored-by: Shree Devi Kumar <5095331+Shreeshrii@users.noreply.github.com>
2021-11-18 16:07:55 +02:00
Stefan Weil
981c167f8c Improve result message from lstmeval
Old message:

    At iteration 0, stage 0, BCER eval=2.553356, BWER eval=5.586173

New message:

    BCER eval=2.553356, BWER eval=5.586173

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-17 09:02:49 +01:00
Stefan Weil
c716ebdc42
Improve training messages (issue #3560) (#3644)
The old messages could wrongly be interpreted as CER / WER values,
but Tesseract training currently uses simple bag of characters /
bag of words error rates (see LSTMTrainer::ComputeCharError,
LSTMTrainer::ComputeWordError).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-17 09:39:23 +02:00
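The difference between a bag-of-items error rate and an edit-distance based CER / WER can be sketched as follows. This is a minimal illustration, assuming the rate is the sum of per-item count differences divided by the number of ground-truth items; the names `bag_error_rate`, `bcer` and `bwer` are hypothetical, and the actual `LSTMTrainer::ComputeCharError` and `LSTMTrainer::ComputeWordError` functions may count errors differently.

```python
from collections import Counter


def bag_error_rate(truth, ocr) -> float:
    """Bag-of-items error rate: ignores order, compares item counts only.

    Illustrative sketch, not Tesseract's implementation.
    """
    truth_counts = Counter(truth)
    ocr_counts = Counter(ocr)
    # Sum the absolute count difference for every item seen in either input.
    errors = sum(
        abs(truth_counts[key] - ocr_counts[key])
        for key in truth_counts.keys() | ocr_counts.keys()
    )
    if not truth:
        # Avoid division by zero for an empty ground truth.
        return float(errors > 0)
    return errors / len(truth)


def bcer(truth: str, ocr: str) -> float:
    """Bag-of-characters error rate over the characters of both strings."""
    return bag_error_rate(list(truth), list(ocr))


def bwer(truth: str, ocr: str) -> float:
    """Bag-of-words error rate over whitespace-separated words."""
    return bag_error_rate(truth.split(), ocr.split())
```

With this sketch, `bcer("abc", "cba")` is 0.0 because character order is ignored, whereas an edit-distance based CER for the same pair would be nonzero. That is why such values should not be read as CER / WER.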
Stefan Weil
ef3bf98cc1 lstmtrainer: Fix comment
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-15 20:19:54 +01:00
Stefan Weil
83ad8a18de Clean code with clang-tidy (performance-move-const)
Command used:

    clang-tidy --checks="-*,performance-move-const-arg"

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-15 20:18:29 +01:00
Stefan Weil
f48620fffb scrollview: Add const attributes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-15 20:17:59 +01:00
Stefan Weil
66dc90bc5f Create new release 5.0.0-rc2
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-14 20:04:23 +01:00
Stefan Weil
f0b8c0254b stepblob: Fix some warnings from clang-tidy
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-14 16:40:38 +01:00
Stefan Weil
25cdca6492 combine_tessdata: Print "Version:" instead of "Version string:"
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-14 16:38:52 +01:00
Stefan Weil
d8d63fd71b Optimize performance with clang-tidy
The code was partially formatted with clang-format and optimized with

    clang-tidy --checks="-*,perfor*" --fix src/*/*.cpp

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-14 15:54:04 +01:00
Stefan Weil
e5011c545a Remove unused function ScrollView::AwaitEventAnyWindow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-14 12:10:37 +01:00
Stefan Weil
37b33749da ScrollView: Fix memory leak and modernize code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-14 10:34:20 +01:00
Stefan Weil
371ee2232e Remove spaces at line endings and empty last lines
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 22:45:47 +01:00
Stefan Weil
e18826cfab Fix some compiler warnings and modernize code in class TrainingSampleSet
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 22:33:22 +01:00
Stefan Weil
6360e60877 Modernize code in TessBaseAPI::Init
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 21:43:46 +01:00
Stefan Weil
03f2cfdf02 Show tessdata directory when listing models
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 21:43:01 +01:00
Stefan Weil
c2ee0cd06f Fix listing of languages
The last fix for OCR with more than one model introduced
a regression for `tesseract --list-langs`.

Fixes: 9091055783 ("Fix loading of additional model files")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 21:34:29 +01:00
Stefan Weil
ebce8ab2eb combine_tessdata: Support -dl and -ld options
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 11:33:10 +01:00
Stefan Weil
905795041f Fix new GitHub action CIFuzz
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 09:56:26 +01:00
Stefan Weil
3378d79ae6 Add new GitHub action CIFuzz
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 09:42:04 +01:00
Stefan Weil
5884036ecd Don't use compiler flags -march=native -mtune=native in autoconf builds
Using those flags is not acceptable for Linux distributions
because the resulting code then depends on the build
infrastructure, so the build result is not deterministic.

It is still possible to use those compiler flags by specifying
CXXFLAGS.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-11 12:29:51 +01:00
Stefan Weil
9091055783 Fix loading of additional model files (issue #3635)
Modernize also a for loop statement.

Fixes: d6de055acf ("Set default language for tesseract only if required")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-10 20:34:06 +01:00
Amit D
827900675b
Don't add a page separator for a single page image (#3632)
This change was requested in issue #3628.
2021-11-08 20:49:49 +01:00
Stefan Weil
2fbe4f54bb Fix out-of-memory in fuzzer-api (oss-fuzz issue #39185)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-07 13:49:30 +01:00
Stefan Weil
183bb3f519 Use TDimension for arguments of make_edgept
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-06 10:01:22 +01:00
Stefan Weil
6c7cfe41cc Remove some unneeded type casts
Those type casts were also wrong for large image support.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-06 10:01:22 +01:00
Amit D
4469053a9b
Update unittest-disablelegacy.yml
2021-11-05 14:06:46 +02:00
Amit D
8865fefdba
Improve the disable legacy build (#3627)
Undo API changes done in e9b8b840bf.
2021-11-04 18:26:15 +02:00