Commit Graph

5905 Commits

Author SHA1 Message Date
Stefan Weil
e18826cfab Fix some compiler warnings and modernize code in class TrainingSampleSet
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 22:33:22 +01:00
Stefan Weil
6360e60877 Modernize code in TessBaseAPI::Init
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 21:43:46 +01:00
Stefan Weil
03f2cfdf02 Show tessdata directory when listing models
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 21:43:01 +01:00
Stefan Weil
c2ee0cd06f Fix listing of languages
The last fix for OCR with more than one model introduced
a regression for `tesseract --list-langs`.

Fixes: 9091055783 ("Fix loading of additional model files")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 21:34:29 +01:00
Stefan Weil
ebce8ab2eb combine_tessdata: Support -dl and -ld options
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 11:33:10 +01:00
Stefan Weil
905795041f Fix new GitHub action CIFuzz
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 09:56:26 +01:00
Stefan Weil
3378d79ae6 Add new GitHub action CIFuzz
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-13 09:42:04 +01:00
Stefan Weil
5884036ecd Don't use compiler flags -march=native -mtune=native in autoconf builds
Using those flags is not acceptable for Linux distributions
because the resulting code then depends on the build
infrastructure, so the build result is not deterministic.

It is still possible to use those compiler flags by specifying
CXXFLAGS.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-11 12:29:51 +01:00
Stefan Weil
9091055783 Fix loading of additional model files (issue #3635)
Modernize also a for loop statement.

Fixes: d6de055acf ("Set default language for tesseract only if required")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-10 20:34:06 +01:00
Amit D
827900675b Don't add a page separator for a single page image (#3632)
This change was requested in issue #3628.
2021-11-08 20:49:49 +01:00
Stefan Weil
2fbe4f54bb Fix out-of-memory in fuzzer-api (oss-fuzz issue #39185)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-07 13:49:30 +01:00
Stefan Weil
183bb3f519 Use TDimension for arguments of make_edgept
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-06 10:01:22 +01:00
Stefan Weil
6c7cfe41cc Remove some unneeded type casts
Those type casts were also wrong for large image support.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-06 10:01:22 +01:00
Amit D
4469053a9b Update unittest-disablelegacy.yml
2021-11-05 14:06:46 +02:00
Amit D
8865fefdba Improve the disable legacy build (#3627)
Undo API changes done in e9b8b840bf.
2021-11-04 18:26:15 +02:00
Amit D
49715f4d27 pagesegmode_test.cc: Disable some code for disable legacy build (#3626)
Co-authored-by: Shree Devi Kumar <5095331+Shreeshrii@users.noreply.github.com>
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-11-04 12:49:32 +01:00
Amit D
e9b8b840bf Improve the disable legacy build (#3624)
Disable more code related to equation detection and osd.
2021-11-03 19:15:15 +01:00
Amit D
5da09f241c README: Remove the reference to version 3.05.02
Versions 4.1.1 and 5.0.0 still support the legacy engine with the same functionality as 3.05.02, so there is no reason to mention 3.05.02.
2021-11-03 17:53:13 +01:00
Stefan Weil
62bfbf5aa4 Use bool instead of int8_t for boolean variable
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-03 11:22:14 +01:00
Stefan Weil
333f7bfc5c Use bool instead of int for boolean variable
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-03 11:02:30 +01:00
Stefan Weil
87a5689f8d Format code with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-03 10:57:40 +01:00
Stefan Weil
a91ea10924 Optimize function ApproximateOutline
The compiler can now inline several functions which are
only used in this compilation unit.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-03 10:53:35 +01:00
Amit D
b77009bd59 configure.ac: Update minimum required autoconf version to 2.69
This version was released in April 2012.

It is supported by old Linux distros like RHEL/CentOS 7, SLES 12 and Ubuntu 14.04.
2021-11-02 15:49:46 +02:00
Stefan Weil
17e795aaae Add missing include statement for INT_MIN, INT_MAX
Fixes: c6b25f3b6e ("Add assertions in IntCastRounded")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-02 10:20:37 +01:00
Stefan Weil
c6b25f3b6e Add assertions in IntCastRounded
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=39185 could be
caused by an integer overflow in IntCastRounded.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-02 07:52:31 +01:00
Stefan Weil
565d3912c6 Fix compiler warnings with -Wformat-security
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-01 22:58:56 +01:00
Stefan Weil
7058bbf282 Move googletest to unittest/third_party/googletest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-01 11:50:50 +01:00
Stefan Weil
a5f2f90c8d Fix legacy build
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-01 08:34:34 +01:00
Egor Pugin
1258386e72 Merge pull request #3619 from stweil/move_tesseractmain
Move src/api/tesseractmain.cpp to src/tesseract.cpp
2021-11-01 01:55:52 +03:00
Stefan Weil
104ef8f30e Move src/api/tesseractmain.cpp to src/tesseract.cpp
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-31 21:43:30 +01:00
Stefan Weil
c0b529f2e1 Move declaration of ThresholdMethod from public API to thresholder.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 20:15:25 +02:00
Stefan Weil
97cd07f2a0 Add format attributes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 19:55:27 +02:00
Stefan Weil
68017dbf2a lstmtraining: Handle missing traineddata with error message (fix issue #1075)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 12:27:35 +02:00
Stefan Weil
2a66694754 Format API headers with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 10:00:27 +02:00
Stefan Weil
ca9ea78494 Format code with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 09:42:41 +02:00
Stefan Weil
57af712f2f Fix some compiler warnings for unused parameters
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 09:39:05 +02:00
Stefan Weil
20203de8d9 Fix format strings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 09:37:30 +02:00
Stefan Weil
8b6390846e Create new release 5.0.0-rc1
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-29 22:32:11 +02:00
Stefan Weil
b4b2cacd40 Avoid segmentation fault with classify_enable_adaptive_matcher == false (issue #256)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-29 19:42:34 +02:00
Stefan Weil
676b86be4d Fix automake warning because of redefined DEFAULT_INCLUDES
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 22:14:06 +02:00
Stefan Weil
612ff9b7e8 Fix sw build error by using TESS_API for global variable log_level
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 22:13:21 +02:00
Stefan Weil
b4e4e00653 Fix two memory leaks in LineFinder::FindAndRemoveLines
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 21:09:46 +02:00
Stefan Weil
1f8835d731 Fix compiler error in try / catch statement
Fixes: 1a6c298696 ("Add new command line option --loglevel")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 20:55:46 +02:00
Stefan Weil
69e0a02399 Remove banner message completely
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 20:43:23 +02:00
Stefan Weil
491e60296c Add missing include statement
Fixes: 1a6c298696 ("Add new command line option --loglevel")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 15:14:55 +02:00
Amit D
fe16277fad Disable music staff detection and removal
Change the default value of pageseg_apply_music_mask to false. See #1255.
2021-10-28 15:04:27 +02:00
Stefan Weil
73a1bfc4e8 Run ReCachePages synchronously during training (fix issue #3111)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 14:40:27 +02:00
Stefan Weil
1a6c298696 Add new command line option --loglevel
By default some less important log messages are suppressed now.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 14:26:48 +02:00
zdenop
3ca273f914 cmake silent message about changed behaviour
2021-10-28 12:07:53 +02:00
zdenop
62566abece cmake: Hide some warnings for MSVC release target
2021-10-28 11:56:22 +02:00