Commit Graph

1817 Commits

Author SHA1 Message Date
Nick White
4787414d88 lstmeval: Print char and word error rates for each line tested 2021-05-11 10:54:34 +01:00
Egor Pugin
43747d6ea8 Postfix for #3418. 2021-05-10 15:06:27 +03:00
Egor Pugin
e7c01a6f15
Merge pull request #3418 from amitdo/thresholder
Add more binarization options
2021-05-10 14:45:03 +03:00
Amit Dovev
21e76c7a13 Convert enum ThreshMethod to enum class 2021-05-09 18:49:09 +03:00
Egor Pugin
176d0927bd Allow explicit casts of Image to Pix**. 2021-05-07 21:30:42 +03:00
Amit Dovev
11c73c9481 Add more binarization options
Use functions from Leptonica to provide more binarization options. The new options are: 1) Adaptive Otsu and 2) Sauvola (Tiled).
2021-05-07 16:48:26 +03:00
Egor Pugin
65118b2e3a [misc] Fix variable type. Fixes warning. 2021-05-04 16:12:40 +03:00
Egor Pugin
346b77c94e Remove unneeded header. 2021-05-04 16:10:52 +03:00
Egor Pugin
4fbe9f1de2 Revert d6cdc52. Fixes #3412. 2021-05-04 00:51:39 +03:00
Ger Hobbelt
bd8adff829 fix compile error: PrintFontsTable() is for legacy builds only
# Conflicts:
#	googletest
2021-04-29 23:27:20 +02:00
Lucas Cimon
b852d658cb Adding --print-fonts-table parameter & tessedit_font_id configuration option 2021-04-29 11:25:40 +02:00
Stefan Weil
2e2a5b3ef4 Improved fix for issue #3405
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-27 22:15:36 +02:00
Stefan Weil
0b7fc068d2 Revert "Fix double free. Closes #3405."
This reverts commit 3997cf54d2.
It will be replaced by a simpler fix.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-27 22:15:18 +02:00
Egor Pugin
3a195e5b05 Misc. 2021-04-27 22:08:29 +03:00
Egor Pugin
3997cf54d2 Fix double free. Closes #3405. 2021-04-27 22:08:06 +03:00
Egor Pugin
e3ac1835e0 Remove unneeded ctor. 2021-04-23 04:26:18 +03:00
Egor Pugin
a7f938d28e Make FontSet just a vector. 2021-04-23 04:25:45 +03:00
Egor Pugin
4ae5a7d6b5 Properly init font set. 2021-04-23 04:05:59 +03:00
Egor Pugin
048e63c02b Replace FontSet struct with vector. It may be improved further (remove pointer?). 2021-04-23 02:38:25 +03:00
Egor Pugin
d6cdc521e5 Remove unused headers. 2021-04-23 02:06:06 +03:00
Stefan Weil
740d10b61b Fix issue #3404 (empty page regression)
The regression was caused by a bug in commit 5db92b26aa.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-22 20:51:23 +02:00
Stefan Weil
66a963b50a Remove two assertions which are triggered by fuzzing
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-20 19:04:49 +02:00
Stefan Weil
26c21a6db4 Fix some compiler warnings with GRAPHICS_DISABLED
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-20 07:58:31 +02:00
Stefan Weil
6d0595b443 Fix memory leak (OSS-Fuzz issue 33220)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-19 20:59:18 +02:00
Robert Pösel
c74ff1259b Fix wrong parameter name and documentation
set_only_init_params -> set_only_non_debug_params
2021-04-19 16:55:01 +02:00
Stefan Weil
2dfa38a072 Fix old TODO for struct EDGEPT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-17 18:08:27 +02:00
Fabrizio Di Vittorio
2be896d2b9 Add SVSemaphore destructor to avoid system objects leaks 2021-04-15 09:23:22 +02:00
Stefan Weil
e6e871bc73 Replace pointer by value for ScrollView mutex
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-15 06:30:05 +02:00
Stefan Weil
4daf781916 Fix NULL pointer access (issue #3394)
The regression was caused by commit 57c90eee02.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 22:10:12 +02:00
Stefan Weil
91b2b4f4a0 Fix OSS-Fuzz issue 32142 (container-overflow write)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 13:45:12 +02:00
Stefan Weil
f83f00496e Clean, format and optimize code in edgblob.cpp / edgblob.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 08:03:30 +02:00
Egor Pugin
a732565cad Fix headers. 2021-04-12 01:40:40 +03:00
Egor Pugin
4f6ff85123 Remove unneeded header. 2021-04-12 01:19:00 +03:00
Egor Pugin
57c90eee02 [edgblob] Replace unique ptr with vector. Fix possible index issues.
Closes #1921.
2021-04-12 01:17:57 +03:00
Stefan Weil
cca46e6b29 Fix another use-after-free (issue #3394)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 21:37:46 +02:00
Stefan Weil
33fa9d3223 Fix use-after-free (issue #3394)
This bug was introduced by commit f77b1c6881.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 19:10:44 +02:00
Egor Pugin
423f00c351
Merge pull request #3393 from eighttails/fix_zero_division
Fix division by zero during CJK training.
2021-04-11 15:38:28 +03:00
Tadahito Yao
8a8204e62a Reverted one of zero value checks. 2021-04-11 21:30:02 +09:00
Tadahito Yao
05eef742df Fix division by zero during CJK training. 2021-04-11 20:14:45 +09:00
Stefan Weil
0401b9470c Fix some typos (most found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 11:06:36 +02:00
Stefan Weil
f77b1c6881 Fix memory leak (OSS-Fuzz issue #32246)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-10 21:35:31 +02:00
Amit D
a4a84c4c92
lstmrecognizer.cpp: Call OutputStats() only when 'invert' is true (#3387)
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 17:55:23 +02:00
Amit Dovev
e6ce048426 Change message from 'Found SSE' to 'Found SSE4.1' 2021-04-08 17:51:09 +02:00
Stefan Weil
63f4463028 Add const attribute to some functions (API change)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
253751c331 Simplify class REJ by replacing two std::bitset<16> by one std::bitset<32>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
2fbcca783b Make more functions in class REJ inline
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
a74bbb6032 Remove bits16.h and BITS16 data type
Also add const attribute to some functions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
2fa96b765b Modernize and optimize list_rec a little bit
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 17:30:33 +02:00
Stefan Weil
7fd90498ca Modernize code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 17:30:33 +02:00
Egor Pugin
edfce72340 Refactor microfeatures a bit. 2021-04-07 17:29:46 +03:00
Egor Pugin
47715e576a Replace microfeatures from oldlist to std::forward_list. 2021-04-07 17:10:16 +03:00
Egor Pugin
2e17ee7327 Correct template args. 2021-04-07 13:28:57 +03:00
Stefan Weil
10255d013a Fix new / delete class mismatch
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 09:25:37 +02:00
Egor Pugin
b1731b6e73 Add missing TESS_API. 2021-04-07 00:59:36 +03:00
Egor Pugin
6e3259593a Reorder list templates. 2021-04-07 00:29:07 +03:00
Egor Pugin
409aa5296f Misc. 2021-04-07 00:17:04 +03:00
Egor Pugin
9d40512ade [elist2] Convert macros to template. Remove source file macro ELIST2IZE. 2021-04-07 00:15:01 +03:00
Egor Pugin
03435adca0 [elist] Rework macro into template and small macro. Move common iterator template into 'list_iterator.h'. 2021-04-07 00:04:30 +03:00
Egor Pugin
b9329e599f Misc. 2021-04-06 23:45:28 +03:00
Egor Pugin
746b87363b Remove unused methods. 2021-04-06 23:45:22 +03:00
Egor Pugin
29e75d0f51 [elist] Remove unused macros QUOTE_IT. 2021-04-06 23:40:56 +03:00
Egor Pugin
539f4b8255 [clist] Remove unused methods. 2021-04-06 23:40:35 +03:00
Egor Pugin
18e61d10ce Rework big clist macro into template and small macro. Remove unused macros QUOTE_IT and CLISTIZE (source file macro). 2021-04-06 23:37:14 +03:00
Raf Schietekat
6bbfef7c85 RAII: TessBaseAPI::GetIterator()
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 17:57:23 +02:00
Raf Schietekat
d71413f4aa RAII: TessBaseAPI::AnalyseLayout()
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 17:46:26 +02:00
Stefan Weil
897e59613d Clean code for hOCR renderer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 16:36:23 +02:00
Stefan Weil
3705989c94 Optimize length method for ELIST, ELIST2
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:57:12 +02:00
Stefan Weil
4104876b08 Add const attribute to some methods of ELIST, ELIST2 and related classes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:48:18 +02:00
Stefan Weil
fb904d2265 Remove redundant debug code for CLIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:26:04 +02:00
Stefan Weil
b47ce5643b Modernize CLIST code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:16:57 +02:00
Stefan Weil
fd187b0c18 Optimize CLIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:08:35 +02:00
Stefan Weil
4a628729b2 Delete assignment and copy constructor for ELIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:59:31 +02:00
Stefan Weil
b0b5600c30 Delete assignment and copy constructor for ELIST2, ELIST2_LINK
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:59:00 +02:00
Stefan Weil
24f91fab0b Delete assignment and copy constructor for CLIST, CLIST_LINK
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:42:01 +02:00
Stefan Weil
eeb67e8ae8 Replace find / insert by insert on unordered set to optimize GridSearch
Both find and insert can be slow for a large unordered set.

Instead of using both methods, it is sufficient to simply try only
the insert method which returns whether the insertion was possible
or not.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 18:11:33 +02:00
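
The idea described in commit eeb67e8ae8 can be illustrated with a minimal sketch. It is not the actual GridSearch code: the helper names `UniqueFindThenInsert` and `UniqueInsertOnly` and the use of `int` elements are assumptions made only for illustration. The sketch relies on the standard behaviour of `std::unordered_set::insert`, which returns a `std::pair<iterator, bool>` whose `second` member is true only when the element was newly inserted.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from Tesseract): returns the items in first-seen
// order with duplicates dropped, using a separate find() and insert().
// Each new element is hashed and looked up twice.
std::vector<int> UniqueFindThenInsert(const std::vector<int>& items) {
  std::unordered_set<int> seen;
  std::vector<int> result;
  for (int item : items) {
    if (seen.find(item) == seen.end()) {
      seen.insert(item);
      result.push_back(item);
    }
  }
  return result;
}

// Hypothetical helper (not from Tesseract): same result with one lookup per
// element. insert() reports through .second whether the element was new.
std::vector<int> UniqueInsertOnly(const std::vector<int>& items) {
  std::unordered_set<int> seen;
  std::vector<int> result;
  for (int item : items) {
    if (seen.insert(item).second) {
      result.push_back(item);
    }
  }
  return result;
}
```

Both helpers return the same sequence; the second one simply avoids the redundant lookup, which is the saving the commit message describes.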
Egor Pugin
50aec308b3 Remove unnecessary pointer hasher for uset. 2021-04-04 14:00:46 +03:00
Stefan Weil
0611c892b6 Disable more code with GRAPHICS_DISABLED
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-02 16:43:26 +02:00
Egor Pugin
7a73875bd1
Merge pull request #3375 from amitdo/viewer
Disable more code with GRAPHICS_DISABLED
2021-04-02 12:27:24 +03:00
Amit Dovev
6d94b22c80 Disable more code with GRAPHICS_DISABLED 2021-04-02 11:12:38 +03:00
Egor Pugin
34e0d017ab Add Image::operator&=(). 2021-04-01 19:15:58 +03:00
Egor Pugin
9e3da4a724 Add Image::operator|=(). 2021-04-01 19:10:48 +03:00
Egor Pugin
e077b7255d Remove arg from Image::copy(). 2021-04-01 19:08:47 +03:00
Egor Pugin
d5fb7f9843 Init variable. 2021-04-01 17:16:46 +03:00
Egor Pugin
fe02ba2363 Add Image::isZero(). 2021-04-01 17:15:48 +03:00
Egor Pugin
306d296979 Add Image::clone(). 2021-04-01 17:06:30 +03:00
Egor Pugin
2aca22439e Add Image::copy(). 2021-04-01 16:55:43 +03:00
Stefan Weil
5159f9aa12 Fix name conflict between class and function named Image
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-01 14:00:08 +02:00
Egor Pugin
e429b607ae [misc] Update header guard. 2021-04-01 01:36:22 +03:00
Egor Pugin
1628a9aae3 Revert 4fa05b9147. Make a note. 2021-04-01 01:35:50 +03:00
Egor Pugin
a792b67983 Basic usage of new Image class. Only pixDestroy is wrapped at the moment.
Add new methods to Image class and replace them in non-public code.
2021-03-31 22:39:43 +03:00
Egor Pugin
ce6e2f1821 Initial tesseract Image wrapper.
Provide basic Pix conversions.
Add destroy() method.

It can be extended later to 1) image owner (raii), 2) different image libraries.
2021-03-31 22:38:32 +03:00
Egor Pugin
4fa05b9147 Remove unused ifdef. 2021-03-31 21:54:12 +03:00
Stefan Weil
722767633e Partially fix issue #3374
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-31 19:23:07 +02:00
Stefan Weil
b7c6d971f3 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-31 07:08:53 +02:00
Stefan Weil
6684a727c1 Improve some structs further (fixes several CID issues)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-30 14:20:52 +02:00
Nick White
abea25ee2f lstm: Include missing header 2021-03-29 18:53:35 +02:00
Stefan Weil
2e349dbba5 Fix compilation for Tensorflow code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 16:19:06 +02:00
Stefan Weil
3c03d70e64 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 16:12:52 +02:00
Stefan Weil
f639500a81 Add missing TESS_API for sw builds
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:34:23 +02:00
Stefan Weil
5c4de14567 Replace strdup / free by std::string in SVSync::StartProcess
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:24:58 +02:00