Commit Graph

6046 Commits

Author SHA1 Message Date
Amit D
827900675b
Don't add a page separator for a single page image (#3632)
This change was requested in issue #3628.
2021-11-08 20:49:49 +01:00
Stefan Weil
2fbe4f54bb Fix out-of-memory in fuzzer-api (oss-fuzz issue #39185)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-07 13:49:30 +01:00
Stefan Weil
183bb3f519 Use TDimension for arguments of make_edgept
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-06 10:01:22 +01:00
Stefan Weil
6c7cfe41cc Remove some unneeded type casts
Those type casts were also wrong for large image support.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-06 10:01:22 +01:00
Amit D
4469053a9b
Update unittest-disablelegacy.yml
2021-11-05 14:06:46 +02:00
Amit D
8865fefdba
Improve the disable legacy build (#3627)
Undo API changes done in e9b8b840bf.
2021-11-04 18:26:15 +02:00
Amit D
49715f4d27
pagesegmode_test.cc: Disable some code for disable legacy build (#3626)
Co-authored-by: Shree Devi Kumar <5095331+Shreeshrii@users.noreply.github.com>
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-11-04 12:49:32 +01:00
Amit D
e9b8b840bf
Improve the disable legacy build (#3624)
Disable more code related to equation detection and osd.
2021-11-03 19:15:15 +01:00
Amit D
5da09f241c README: Remove the reference to version 3.05.02
Versions 4.1.1 and 5.0.0 still support the legacy engine with the same functionality as 3.05.02, so there is no reason to mention 3.05.02.
2021-11-03 17:53:13 +01:00
Stefan Weil
62bfbf5aa4 Use bool instead of int8_t for boolean variable
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-03 11:22:14 +01:00
Stefan Weil
333f7bfc5c Use bool instead of int for boolean variable
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-03 11:02:30 +01:00
Stefan Weil
87a5689f8d Format code with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-03 10:57:40 +01:00
Stefan Weil
a91ea10924 Optimize function ApproximateOutline
The compiler can now inline several functions which are
only used in this compilation unit.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-03 10:53:35 +01:00
Amit D
b77009bd59
configure.ac: Update minimum required autoconf version to 2.69
This version was released in April 2012.

It is supported by old Linux distros like RHEL/CentOS 7, SLES 12 and Ubuntu 14.04.
2021-11-02 15:49:46 +02:00
Stefan Weil
17e795aaae Add missing include statement for INT_MIN, INT_MAX
Fixes: c6b25f3b6e ("Add assertions in IntCastRounded")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-02 10:20:37 +01:00
Stefan Weil
c6b25f3b6e Add assertions in IntCastRounded
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=39185 could be
caused by an integer overflow in IntCastRounded.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-02 07:52:31 +01:00
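The kind of guard this commit describes could look roughly like the sketch below. It is an illustration only: `RoundToInt` is a hypothetical name, not the project's function, and the exact set of assertions is an assumption. The point is that converting a non-finite or out-of-range `double` to `int` is undefined behaviour, so the checks run before the cast.

```cpp
#include <cassert>
#include <climits>  // INT_MIN, INT_MAX
#include <cmath>    // std::isfinite

// Hypothetical helper (not the project's actual code): round a double to the
// nearest int, with halves rounded away from zero.
inline int RoundToInt(double x) {
  // Casting NaN, infinity or an out-of-range value to int is undefined
  // behaviour, so check the input before converting.
  assert(std::isfinite(x));
  assert(x < INT_MAX);
  assert(x > INT_MIN);
  return x >= 0.0 ? static_cast<int>(x + 0.5) : -static_cast<int>(-x + 0.5);
}
```

In a build with `NDEBUG` defined the assertions compile away, so a sketch like this only catches bad inputs in debug and fuzzing builds.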
Stefan Weil
565d3912c6 Fix compiler warnings with -Wformat-security
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-01 22:58:56 +01:00
Stefan Weil
7058bbf282 Move googletest to unittest/third_party/googletest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-01 11:50:50 +01:00
Stefan Weil
a5f2f90c8d Fix legacy build
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-01 08:34:34 +01:00
Egor Pugin
1258386e72
Merge pull request #3619 from stweil/move_tesseractmain
Move src/api/tesseractmain.cpp to src/tesseract.cpp
2021-11-01 01:55:52 +03:00
Stefan Weil
104ef8f30e Move src/api/tesseractmain.cpp to src/tesseract.cpp
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-31 21:43:30 +01:00
Stefan Weil
c0b529f2e1 Move declaration of ThresholdMethod from public API to thresholder.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 20:15:25 +02:00
Stefan Weil
97cd07f2a0 Add format attributes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 19:55:27 +02:00
Stefan Weil
68017dbf2a lstmtraining: Handle missing traineddata with error message (fix issue #1075)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 12:27:35 +02:00
Stefan Weil
2a66694754 Format API headers with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 10:00:27 +02:00
Stefan Weil
ca9ea78494 Format code with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 09:42:41 +02:00
Stefan Weil
57af712f2f Fix some compiler warnings for unused parameters
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 09:39:05 +02:00
Stefan Weil
20203de8d9 Fix format strings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-30 09:37:30 +02:00
Stefan Weil
8b6390846e Create new release 5.0.0-rc1
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-29 22:32:11 +02:00
Stefan Weil
b4b2cacd40 Avoid segmentation fault with classify_enable_adaptive_matcher == false (issue #256)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-29 19:42:34 +02:00
Stefan Weil
676b86be4d Fix automake warning because of redefined DEFAULT_INCLUDES
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 22:14:06 +02:00
Stefan Weil
612ff9b7e8 Fix sw build error by using TESS_API for global variable log_level
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 22:13:21 +02:00
Stefan Weil
b4e4e00653 Fix two memory leaks in LineFinder::FindAndRemoveLines
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 21:09:46 +02:00
Stefan Weil
1f8835d731 Fix compiler error in try / catch statement
Fixes: 1a6c298696 ("Add new command line option --loglevel")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 20:55:46 +02:00
Stefan Weil
69e0a02399 Remove banner message completely
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 20:43:23 +02:00
Stefan Weil
491e60296c Add missing include statement
Fixes: 1a6c298696 ("Add new command line option --loglevel")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 15:14:55 +02:00
Amit D
fe16277fad Disable music staff detection and removal
Change the default value of pageseg_apply_music_mask to false. See #1255.
2021-10-28 15:04:27 +02:00
Stefan Weil
73a1bfc4e8 Run ReCachePages synchronously during training (fix issue #3111)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 14:40:27 +02:00
Stefan Weil
1a6c298696 Add new command line option --loglevel
By default some less important log messages are suppressed now.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-28 14:26:48 +02:00
zdenop
3ca273f914 cmake: Silence message about changed behaviour
2021-10-28 12:07:53 +02:00
zdenop
62566abece cmake: Hide some warnings for MSVC release target
2021-10-28 11:56:22 +02:00
Stefan Weil
a7a729f6c3 Disable CI checks which are no longer valid with NFC normalization
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-27 19:15:44 +02:00
Stefan Weil
5cc649e5f9 Remove code which is wrong in combination with NFC
See comments in https://github.com/tesseract-ocr/tesseract/pull/3420.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-27 18:52:03 +02:00
Stefan Weil
5cee9a0cec Merge remote-tracking branch 'nickjwhite/nfc'
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-27 18:40:02 +02:00
Stefan Weil
282685d531 Enable fast float32 LSTM by default
It is still possible to build Tesseract with double LSTM:

    # autoconf
    ./configure --disable-float32

    # cmake
    cmake .. -DFAST_FLOAT=OFF

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-27 18:37:27 +02:00
Stefan Weil
c602624012 Prepare support for image width and height larger than 32767 (continued)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-27 08:58:31 +02:00
Stefan Weil
59fbad0dd5 Prepare support for image width and height larger than 32767
Avoid using int16_t and use a new data type TDimension where needed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-27 08:45:33 +02:00
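A minimal sketch of why `int16_t` limits image dimensions to 32767, assuming a wider 32-bit alias. The name `Dimension` and its underlying type are assumptions made for illustration; the log does not state how `TDimension` is actually defined.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption: a 32-bit alias. The log does not state TDimension's real type.
using Dimension = std::int32_t;

// Returns true if the value can be stored in an int16_t without loss.
inline bool FitsInInt16(Dimension value) {
  return value >= std::numeric_limits<std::int16_t>::min() &&
         value <= std::numeric_limits<std::int16_t>::max();
}
```

A width of 40000 pixels fails this check, which is the case a wider dimension type is meant to handle.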
Stefan Weil
56f54c24de Fix heap use after free (issue #3523)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-26 19:19:59 +02:00
Amit D
cea2a6015e
Thresholding: Improve some debug messages
2021-10-26 19:09:06 +03:00
Stefan Weil
d6de055acf Set default language for tesseract only if required
When running with --list-langs, --print-parameters or --print-fonts-table
no default language is needed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-10-26 11:05:06 +02:00