Commit Graph

2089 Commits

Author SHA1 Message Date
Zdenko Podobný
14e2517d6d remove src.destroy(); 2022-10-21 15:58:58 +02:00
zdenop
95019a8cf3 fix issue #3940 - remove colormap before thresholding 2022-10-15 00:11:58 +02:00
Stefan Weil
0daf18c202 Detect availability of AVX512-VNNI
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-08-06 11:23:06 +02:00
Arseniy Zaostrovnykh
749c90d92e Fix the build on CodeQL/Analyze 2022-08-02 13:04:31 +02:00
Egor Pugin
4de02dd7f9 [sw] Add svpaint. 2022-07-25 22:02:54 +03:00
Stefan Weil
989956c998 Replace call of exit function by return statement in main function
Add also a missing return statement and use EXIT_FAILURE
and EXIT_SUCCESS instead of 1 and 0 as return values.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-07-20 20:32:27 +02:00
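
The effect of commit 989956c998 can be illustrated with a minimal sketch. It is not the code from the commit: the function `run`, the `Cleanup` type and the global `g_trace` are hypothetical names. It relies on a general C++ rule: leaving a function with `return` destroys its local objects, whereas `std::exit` does not destroy objects with automatic storage duration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Trace of what happened; global only so the effect is easy to observe.
std::string g_trace;

// Hypothetical body of a command-line tool, written the way main() would be.
int run(int argc) {
  g_trace.clear();
  // Local object whose destructor stands in for any cleanup work
  // (closing files, releasing buffers).
  struct Cleanup {
    ~Cleanup() { g_trace += "cleanup;"; }
  } cleanup;

  if (argc < 2) {
    g_trace += "usage;";
    // A return statement unwinds the scope, so ~Cleanup() runs.
    // Calling std::exit(EXIT_FAILURE) here would skip it.
    return EXIT_FAILURE;
  }
  g_trace += "work;";
  return EXIT_SUCCESS;
}
```

In a real program, `main` would simply end with `return run(argc);`. The named constants `EXIT_FAILURE` and `EXIT_SUCCESS` come from `<cstdlib>`.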
Stefan Weil
ee34b100bf Fix double free in function vigorous_noise_removal (fixes issue #3876)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-07-19 09:29:58 +02:00
Stefan Weil
99d6717c10 Create to_win if needed in Textord::make_spline_rows (fixes issue #3875)
There still remain memory leaks for the test scenario, but those are less
urgent as they are related to code which is only used for debugging.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-07-19 08:45:22 +02:00
Stefan Weil
8c573e4cef autotools: Add rule for svpaint executable (#3873)
Move also its source code svpaint.cpp from src/viewer/ to src/,
so it is no longer included in libtesseract by the cmake build.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-07-19 00:50:01 +03:00
Stefan Weil
e589bfa58b Merge pull request #3872 from p12tic/fix-scrollview-double-free (fixes issue #3869)
Fix memory issues in ScrollView::MessageReceiver.
2022-07-18 17:36:04 +02:00
Povilas Kanapickas
0107687a9b viewer: Use std::unique_ptr in event_table_ data structure 2022-07-18 18:04:31 +03:00
Povilas Kanapickas
9a74c4ccad viewer: Use std::unique_ptr in waiting_for_events data structure
The current usage of waiting_for_events is taking ownership of SVEvent
pointer from a unique_ptr. This is error prone as all code paths using
waiting_for_events need to ensure deletion. We fix it by using
unique_ptr in waiting_for_events and all dependent code paths.
2022-07-18 18:04:30 +03:00
Povilas Kanapickas
4f831ff489 viewer: Fix double free caused by ScrollView::MessageReceiver
waiting_for_events takes ownership of the passed event which is later
deleted. Since we use unique_ptr::get() to acquire the pointer, we cause
double free: one free happens in the code path where the event from
waiting_for_events goes and the other free happens in unique_ptr
destructor.

The fix is to move ownership out of unique_ptr by unique_ptr::release().

Fixes: https://github.com/tesseract-ocr/tesseract/issues/3869
Fixes: 37b33749da
2022-07-18 18:04:29 +03:00
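
The ownership problem described in commit 4f831ff489 can be sketched as follows. This is a simplified illustration, not the ScrollView code: `Event`, `RawTable` and `hand_over_with_release` are hypothetical stand-ins. The table deletes every raw pointer it receives, so the caller has to give up ownership with `release()`; passing `get()` instead would leave two owners and free the object twice.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical event type that counts live instances.
struct Event {
  static int live;
  Event() { ++live; }
  ~Event() { --live; }
};
int Event::live = 0;

// Hypothetical table that takes ownership of raw pointers and deletes them.
struct RawTable {
  std::vector<Event *> slots;
  void take(Event *e) { slots.push_back(e); }
  void clear() {
    for (Event *e : slots) {
      delete e;
    }
    slots.clear();
  }
};

// Returns the number of live events before and after the table is cleared.
std::pair<int, int> hand_over_with_release() {
  RawTable table;
  {
    std::unique_ptr<Event> event = std::make_unique<Event>();
    // release() transfers ownership: the unique_ptr becomes empty.
    // table.take(event.get()) would keep the unique_ptr as a second owner.
    table.take(event.release());
  }  // the empty unique_ptr is destroyed here and deletes nothing
  int before = Event::live;
  table.clear();  // the single delete happens here
  return {before, Event::live};
}
```

Commits 9a74c4ccad and 0107687a9b then take the more robust route described in their messages: storing `std::unique_ptr` in the data structures themselves, so no code path has to remember to delete.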
Stefan Weil
02e834000c Catch potential nullptr in SVNetwork::SVNetwork
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-07-18 13:52:39 +02:00
Stefan Weil
b8b6c158a7 Mark parameter 'tessedit_do_invert' as deprecated
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-06-26 19:27:48 +02:00
Stefan Weil
96861b58ae Add new parameter for invert_threshold (#3852)
Change default value from 0.5 to 0.7.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-06-26 12:32:56 +03:00
Daniel Plakhotich
0df584e65d capi: Fix calling delete[] for memory allocated by malloc 2022-06-23 17:08:45 +02:00
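
The rule behind commit 0df584e65d is that memory from `malloc` must be returned with `free`, never with `delete[]`. The sketch below shows one way to make that pairing hard to get wrong. The function `duplicate_text`, the `FreeDeleter` type and the `CString` alias are hypothetical and not part of the Tesseract C API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical C-style function: returns a malloc'ed copy of the text.
char *duplicate_text(const char *text) {
  std::size_t size = std::strlen(text) + 1;
  char *copy = static_cast<char *>(std::malloc(size));
  if (copy != nullptr) {
    std::memcpy(copy, text, size);
  }
  return copy;
}

// Deleter that pairs malloc with free instead of delete or delete[].
struct FreeDeleter {
  void operator()(void *p) const { std::free(p); }
};
using CString = std::unique_ptr<char, FreeDeleter>;

// Takes ownership of the malloc'ed copy and releases it with free().
std::size_t owned_length(const char *text) {
  CString copy(duplicate_text(text));
  return copy ? std::strlen(copy.get()) : 0;
}
```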
zdenop
18fb5aa977 fix issue #3092 - skip removing colormap 2022-06-23 16:38:32 +02:00
Stefan Weil
27b1827ccd Update code to support Leptonica 1.83.0 and newer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-06-23 12:11:00 +02:00
Stefan Weil
70109f1e8f Use Leptonica API to access internals of Pix
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-06-23 11:23:22 +02:00
Nicolas Abram
71b045cf20 C API: Add a function to init tesseract with traineddata from memory (#3780)
Fixes #3691.

* retrigger checks
2022-06-20 14:53:42 +03:00
Stefan Weil
330d49a0a3 Replace BOX -> Box
Both are equivalent, but the rest of the code already uses Box.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-06-10 19:20:19 +02:00
zdenop
67841aa89f do not use '\0' in std::string => fixes issue #3837 (loading uzn file) 2022-06-07 21:31:00 +02:00
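
Commit 67841aa89f does not spell out the mechanism, so the following is only a general illustration of why a `'\0'` inside a `std::string` is risky. The helper `with_terminator` is hypothetical. A `std::string` counts the embedded terminator as a regular character, so the string no longer compares equal to the text it appears to hold.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appends an explicit '\0' to the string contents.
std::string with_terminator(const std::string &text) {
  std::string result = text;
  result.push_back('\0');  // becomes part of the string, size grows by one
  return result;
}
```

For a hypothetical file name `"page.uzn"`, `with_terminator("page.uzn").size()` is 9 rather than 8, and comparing the result with `"page.uzn"` yields false, although `c_str()` still looks identical when printed.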
Podobny Zdenko
d2015a6119 cmake: fix Build with clang-cl on Windows; fixes #3683 2022-06-07 11:44:29 +02:00
Yulv-git
8bc7a9591d Fix some typos. 2022-06-05 16:48:20 +08:00
Robert Clausecker
2e7ae6eeb6 Fix NEON detection on FreeBSD (#3782)
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2022-05-29 19:06:54 +02:00
Stefan Weil
64bcdce607 Replace std::regex by std::string functions (fixes issue #3830)
On Windows with UCRT and a UTF-8 locale std::regex takes a lot of time
(several minutes!). Replacing it avoids that bottleneck.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-05-29 10:21:51 +02:00
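
The kind of replacement made in commit 64bcdce607 can be sketched with an invented example. The actual pattern used in Tesseract is not shown in this log, so the URL prefixes and the function names below are assumptions chosen only for illustration. A simple prefix test written with `std::string::compare` avoids constructing and running a `std::regex` at all.

```cpp
#include <cassert>
#include <string>

// Returns true if text begins with prefix, using only std::string functions.
bool starts_with(const std::string &text, const std::string &prefix) {
  return text.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical check that could otherwise be written as a regular
// expression such as "^https?://".
bool looks_like_url(const std::string &text) {
  return starts_with(text, "http://") || starts_with(text, "https://");
}
```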
Stefan Weil
f36c0d019b Replace direct access to Leptonica internal data structures by function calls
This fixes builds with latest Leptonica code (for example for OSS-Fuzz).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-05-03 12:15:39 +02:00
Stefan Weil
b0d82879e5 Add initial support for Intel AVX512F
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-04-14 23:47:04 +02:00
Sunoru
def53f3d4e Check input for AppendString to avoid strlen(nullptr). 2022-04-11 16:37:35 -04:00
Stefan Weil
facd55ab5c Set /Os for some 32 bit MS compilers (fixes #3769)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-03-31 22:10:24 +02:00
Stefan Weil
fd48287c02 scrollview: Fix two comments
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-03-31 15:46:56 +02:00
zdenop
76dbc21233 fix OpenCL with Nvidia drivers 2022-03-19 11:54:09 +01:00
zdenop
c5007c082b cmake: fix OpenCL build 2022-03-19 11:52:57 +01:00
CSBVision
e9b3939566 Update ccutil.cpp (#3768)
Fixes #3767.

Co-authored-by: Stefan Weil <sw@weilnetz.de>
2022-03-11 15:31:27 +01:00
Stefan Weil
59b4b1eaf8 Remove unneeded include statements
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-28 23:09:23 +01:00
Stefan Weil
32e452fc50 Fix typo in descriptions of thresholding parameters
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-27 22:44:46 +01:00
Stefan Weil
424b17f997 Handle image and line regions in output formats ALTO, hOCR and text
Tested-by: Merlijn Wajer <merlijn@archive.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-10 14:23:47 +01:00
Stefan Weil
ebf367e248 Partially revert changes of list data types (fix compiler warnings)
Changing from class to struct causes clang compiler warnings like this one:

In file included from ../../../src/api/baseapi.cpp:63:
../../../include/tesseract/osdetect.h:29:1: warning: class 'BLOB_CHOICE_LIST' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
class BLOB_CHOICE_LIST;
^
../../../src/ccstruct/ratngs.h:228:1: note: previous use is here
ELISTIZEH(BLOB_CHOICE)
^
../../../src/ccutil/elst.h:804:10: note: expanded from macro 'ELISTIZEH'
  struct CLASSNAME##_LIST : X_LIST<ELIST, ELIST_ITERATOR, CLASSNAME> { \
         ^
<scratch space>:458:1: note: expanded from here
BLOB_CHOICE_LIST
^
../../../include/tesseract/osdetect.h:29:1: note: did you mean struct here?
class BLOB_CHOICE_LIST;
^~~~~

As it is not possible to change the API header tesseract/osdetect.h,
some of the changes from class to struct had to be reverted.

Fixes: 968d653f89 ("Shorten macros")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-09 12:53:56 +01:00
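
The warning quoted in commit ebf367e248 appears when a type is forward declared with one class-key and defined with the other. A minimal sketch of the consistent form, using the hypothetical names `ChoiceList` and `count_of` instead of the real list macros:

```cpp
#include <cassert>

// As in a public header: forward declaration with the class-key "class".
class ChoiceList;
int count_of(const ChoiceList &list);

// The definition uses the same class-key. Writing "struct ChoiceList" here
// is valid C++, but triggers -Wmismatched-tags in clang.
class ChoiceList {
public:
  int count = 0;
};

int count_of(const ChoiceList &list) { return list.count; }
```

When the forward declaration lives in an installed API header that cannot change, the definition has to follow it, which is why the commit reverts some of the class-to-struct changes.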
Egor Pugin
7c7dd1d889 Remove unused code. 2022-02-07 02:05:38 +03:00
Egor Pugin
58c52dbce6 Remove unused code. 2022-02-07 01:59:33 +03:00
Egor Pugin
91d836a556 Simplify. Move related function from separate file. 2022-02-07 01:53:10 +03:00
Stefan Weil
4ce8fafd82 Merge pull request #3745 from egorpugin/main
Remove unused functions in genericvector.h.
2022-02-06 23:13:03 +01:00
Egor Pugin
dbc14e68d4 Fix warnings. 2022-02-07 01:00:11 +03:00
Egor Pugin
37c62f3ae0 Remove unused fwd. 2022-02-07 01:00:05 +03:00
Egor Pugin
2882766882 Remove unused ctors in macros. 2022-02-07 00:59:41 +03:00
Egor Pugin
b4231c0cee Fix list type. 2022-02-07 00:59:27 +03:00
Egor Pugin
8eef8bc1ac Remove in-class TESS_API. 2022-02-07 00:59:15 +03:00
Egor Pugin
dfffaa28c3 Remove unused functions in genericvector.h. 2022-02-07 00:24:01 +03:00
Egor Pugin
0e7e4cf779 Fix build. 2022-02-07 00:21:32 +03:00
Egor Pugin
eeb4121888 Fix warnings. 2022-02-07 00:21:26 +03:00
Egor Pugin
7f6606ccdc Remove unneeded dtor. 2022-02-07 00:20:07 +03:00
Egor Pugin
f526bf30bb Fix warnings. 2022-02-07 00:19:52 +03:00
Egor Pugin
968d653f89 Shorten macros. 2022-02-07 00:17:29 +03:00
Stefan Weil
44ddde1692 Remove a local function from class TableRecognizer
This allows the compiler to remove the unused function IsWeakTableRow.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-05 21:14:08 +01:00
Stefan Weil
101ed0036b Remove some local functions from class ImageFind
This allows optimizations like inline code by the compiler.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-05 20:31:52 +01:00
Stefan Weil
eeda2297ca Remove unused functions ImageFind::ComposeRGB and ImageFind::ClipToByte
Fixes: a1c22fb0d0 ("Fixed issue #557")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-05 20:13:31 +01:00
Stefan Weil
f6250e6dfe Remove unused function ImageFind::ComputeRectangleColors
Fixes: a1c22fb0d0 ("Fixed issue #557")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-05 20:03:31 +01:00
Stefan Weil
14399ceb78 Remove unused resolution parameters
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-05 16:08:59 +01:00
Stefan Weil
7ea97552c6 Remove some local functions from class LineFinder
This allows optimizations like inlining by the compiler.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-05 15:58:49 +01:00
Stefan Weil
554d14d275 Fix comment
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-05 15:37:12 +01:00
Egor Pugin
8b5571f8bf Merge pull request #3742 from stweil/robustness
Catch nullptr in PageIterator::Orientation to improve robustness
2022-02-03 14:57:42 +03:00
Stefan Weil
76faf16006 Fix old TODO (STATS::rangemax_)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-03 09:15:31 +01:00
Stefan Weil
443933a75a Catch nullptr in PageIterator::Orientation to improve robustness
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-02-03 08:31:31 +01:00
Stefan Weil
24e68b9140 Add new parameter curl_timeout for curl_easy_setopt
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-01-16 14:58:06 +01:00
Stefan Weil
ad55cec472 Add missing include file for std::max, std::min
This fixes a build issue with VS 2019 Version 16.11.9
and platform toolset v141.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-01-13 22:15:50 +01:00
Gilles Talis
be15b46c60 Check if platform supports feenableexcept
feenableexcept is not supported by uclibc

Signed-off-by: Gilles Talis <gilles.talis@gmail.com>
[Retrieved (and updated to add cmake support and simplify configure.ac)
from
https://git.buildroot.net/buildroot/tree/package/tesseract-ocr/0001-Check-if-platform-supports-feenableexcept.patch]
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
2022-01-11 15:23:56 +01:00
Stefan Weil
04a66b91e6 Don't use <XXX>_LINK_LIBRARIES for cmake before version 3.12
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-01-03 16:34:10 +01:00
Stefan Weil
28f854186f cmake: reformat with cmake-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-01-03 16:34:10 +01:00
Stefan Weil
b8b2ab225f Simplify cmake check for Pango related modules
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-01-03 16:34:10 +01:00
Stefan Weil
e1764e1bc8 Use cmake policy CMP0074 only with version 3.12 or newer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-01-01 23:00:47 +01:00
Stefan Weil
6727aae7e9 Remove unused include statement
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-30 19:44:00 +01:00
Stefan Weil
df227caa87 Add function ERRCODE::error with only 2 parameters
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-30 18:46:48 +01:00
Stefan Weil
84e6f44455 Fix some compiler warnings (implicit float to double conversion)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-30 17:58:36 +01:00
Stefan Weil
25d25b5e09 Remove unused forward declaration
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-30 17:52:29 +01:00
Stefan Weil
e87969033b Remove duplicate parameter certainty_scale
It was also declared in class Dict and mostly used from that class.
Setting it via API or command line never changed that used value.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-30 10:36:37 +01:00
zdenop
86158d3978 Merge pull request #3697 from stweil/opt
Small optimizations and fixes for some compiler warnings
2021-12-29 20:13:38 +01:00
Stefan Weil
d754593a31 Catch nullptr in STATS::pile_count (fix issue #3694)
Add also a test case for this issue.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-29 17:26:32 +01:00
Stefan Weil
22e86fa75d Eliminate function NetworkIO::ZeroTimeStepGeneral
This allows more inline code (optimization).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-29 17:17:57 +01:00
Stefan Weil
03e82271bb Fix clang compiler warnings in functions.h
The new code avoids some conversions between double and float,
so it should also have a small positive effect on the performance.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-27 15:39:46 +01:00
Stefan Weil
7277963e11 Update generator for lookup tables to use TFloat instead of double
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-27 10:31:42 +01:00
Stefan Weil
706d3bac62 Fix some clang compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-26 17:45:16 +01:00
Stefan Weil
7a218f1d6c Fix compiler warning [-Wsign-compare]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-25 17:02:45 +01:00
Stefan Weil
34311179f5 Allow printing of bitfield with variadic templates
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-23 16:38:00 +01:00
Stefan Weil
edf5c91ab9 Fix compiler warnings caused by empty statements
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-23 16:35:47 +01:00
zdenop
f65fae82ac clean up condition to detect MSVC 2021-12-22 18:57:13 +01:00
Zdenko Podobný
771c1e9c9b fix lstm.cpp build with clang 2021-12-20 14:40:45 +01:00
Zdenko Podobný
8f02255294 cmake: reformat with cmake-format and check with cmake-lint 2021-12-20 13:18:01 +01:00
Stefan Weil
f728df0cfa Support up to 8 redirections when running OCR on a URL
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-12-16 14:57:18 +01:00
Amit D
d37dd73439 Fix broken msys2 build with gcc 11
Fix #3672.
2021-12-05 08:57:49 +02:00
Egor Pugin
b5d33a104b Merge pull request #3664 from stweil/classify
Fix some compiler warnings and avoid float / double conversions in class Classify
2021-11-28 23:04:01 +03:00
Stefan Weil
a1f40cadc1 Avoid some unnecessary conversions from float to double
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 18:55:27 +01:00
Stefan Weil
5e8d877262 Modernize code in class Classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 18:44:20 +01:00
Stefan Weil
ffe2038ea6 Allow compilation with clang-7
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 10:45:46 +01:00
Stefan Weil
839f528b9a Remove unused GenericVector::contains_index, UnicityTable::contains_id
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-28 09:54:59 +01:00
Stefan Weil
8b21e4f0b8 Remove member function GenericVector<T>::contains
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-27 09:40:36 +01:00
Stefan Weil
739057c586 Remove member function UnicityTable<T>::contains
It was only used once, and the code using it can be simplified.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-27 09:40:17 +01:00
Stefan Weil
99aea21336 Limit BCER to interval [0,1]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-25 08:04:26 +01:00
Stefan Weil
2c4665466e Format code with clang-format
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-22 19:47:39 +01:00
Bernhard Liebl
555aa55f05 Add RowAttributes getter to PageIterator
[sw]: Cherry-picked commit from 4.1 branch

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-11-22 19:47:39 +01:00