Commit Graph

2089 Commits

Author SHA1 Message Date
Merlijn Wajer
ca177e72f3 hocrrenderer: write scan_res property to the ocr_page
This will make Tesseract emit the DPI of the document, if known at OCR
time. This is required to properly interpret the x_fsize (font size)
property of words, since Tesseract scales the font size to the DPI.

See issue #3326 (https://github.com/tesseract-ocr/tesseract/issues/3326)
2021-09-21 11:02:52 +02:00
Stefan Weil
638045133f Simplify function LoadTrainingData and fix mastertrainer_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-17 08:24:50 +02:00
Stefan Weil
d87e08f266 Fix crash of shapeclustering (fixes #3564)
Fixes: 4415209fd6 ("Remove tessopt. This fixes mastertrainer test in shared build")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-16 22:31:09 +02:00
Stefan Weil
e5e12f2856 Disable HAVE_FRAMEWORK_ACCELERATE for compilers which fail to compile with it
g++-10 and g++-11 throw compiler errors in builds with the
Accelerate framework, so disable it for all GNU compilers
before version 12 (which still has to be tested).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-06 17:15:46 +02:00
Stefan Weil
ec87dd4d49 Abort LSTM training with integer model (fixes issue #1573)
Tesseract currently cannot continue LSTM training from an
integer (fast) model.

Report this to users who try it nevertheless instead of crashing
with an assertion.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-06 08:18:55 +02:00
Stefan Weil
a027dca007 Extend URI support for Tesseract with libcurl
libcurl not only supports HTTP and HTTPS, but also a lot of other protocols,
for example FTP and SFTP. Those protocols can also be useful for Tesseract.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-05 16:49:22 +02:00
Stefan Weil
7fc9a34f79 Rename processed TIFF output file and add page number if needed (fixes issue #3544)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-01 14:16:05 +02:00
Robert Pösel
40fdacd485 Add missing check for __ARM_NEON
This makes it consistent with intsimdmatrixneon.cpp file and allows having this file included in builds even for non-NEON platforms (simplifies build config).
2021-08-26 15:28:59 +02:00
Stefan Weil
4dcd8fa591 Fix handling of TESSDATA_PREFIX containing // (fixes issue #3527)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-24 20:05:54 +02:00
Stefan Weil
391e713ae8 Use model prefix also for submodels
Fix also a regression in the for loop which handles submodels.

Fixes: 0d91c700c0 ("Modernize code in Tesseract::init_tesseract")
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-24 13:41:00 +02:00
Stefan Weil
0d91c700c0 Modernize code in Tesseract::init_tesseract
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-23 07:30:03 +02:00
Egor Pugin
1d3d1fbc62 Move member function bodies into class template. 2021-08-20 12:42:40 +03:00
Egor Pugin
c539328d7d Merge branch 'master' of github.com-egorpugin:tesseract-ocr/tesseract 2021-08-20 12:38:12 +03:00
Egor Pugin
407346246c [universalambigs] Use inline variables. 2021-08-20 12:38:03 +03:00
Stefan Weil
7acda5cb6c Fix cloning of Image with pix_ == nullptr (issue #537)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-18 19:22:23 +02:00
Egor Pugin
6056c84977 [image] Mark PIX** cast explicit to prevent implicit bool checks in ternary operators. 2021-08-18 18:14:47 +03:00
Stefan Weil
59271470b4 Remove unneeded type cast
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-12 20:55:14 +02:00
Stefan Weil
aaec341449 Avoid call of ColumnFinder::DisplayBlocks (small optimization)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-12 15:23:44 +02:00
Stefan Weil
6da7d6fcda Optimize check for non empty string and fix code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-12 14:45:22 +02:00
Stefan Weil
92cae8f194 Optimize check for non empty string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-12 14:44:45 +02:00
Stefan Weil
3ef403c345 Compile LSTM::PrintW and LSTM::PrintDW conditionally
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-10 22:04:57 +02:00
Stefan Weil
5d99041f5d Remove unused function Wordrec::merge_fragments
Remove also more functions which are now also unused.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-10 22:04:57 +02:00
Stefan Weil
f1c8df0ce9 Remove unused global variable fx_debug
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-10 22:04:57 +02:00
Stefan Weil
16fd1439fa Write image filename in ALTO output
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-07 22:14:03 +02:00
Stefan Weil
5f10fed5d9 Reduce size of TessResultRenderer
Changing the order reduces the size from 72 to 64 bytes
on 64 bit Linux.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-07 22:14:03 +02:00
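The saving from reordering members comes from alignment padding. The sketch below uses hypothetical members (not the actual `TessResultRenderer` layout) to show the effect: interleaving small fields with pointers forces padding before each pointer, while grouping the pointers first lets the small fields share one padded tail.

```cpp
#include <cstddef>

// Hypothetical members, NOT the real TessResultRenderer layout.
// Each bool is followed by padding so the next pointer is aligned.
struct Interleaved {
  bool happy;
  void *next;
  bool flag;
  void *file;
  bool done;
};

// Same members, pointers first: the three bools share one padded tail.
struct Reordered {
  void *next;
  void *file;
  bool happy;
  bool flag;
  bool done;
};

// Bytes saved by the reordering on the platform this is compiled for.
constexpr std::size_t BytesSaved() {
  return sizeof(Interleaved) - sizeof(Reordered);
}
```

On a typical 64-bit Linux ABI these two structs should occupy 40 and 24 bytes respectively; the exact numbers depend on the platform's alignment rules.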
Stefan Weil
a73e7b97a4 Add float dotproduct implementation for NEON
Signed-off-by: Stefan Weil <stefan.weil@bib.uni-mannheim.de>
2021-08-03 10:35:22 +02:00
Stefan Weil
bb4a1219d7 Improve setting of dot product functions via environment variable
Apply the settings which are selected by environment variable DOTPRODUCT
after the autodetection which detects the available SIMD hardware.

'accelerate', 'fma' and 'std::inner_product' now no longer change
the setting for intSimdMatrix to 'generic' because they don't provide
their own implementation for it.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-03 10:34:33 +02:00
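The ordering described in this commit (autodetect first, then apply the `DOTPRODUCT` override, leaving the integer matrix kernel alone for values that have no implementation of their own) can be sketched as follows. All type and function names here are hypothetical, and the autodetection is a stand-in that always reports AVX; the real logic lives in Tesseract's SIMD detection code.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical kernel identifiers for this sketch.
enum class Kernel { Avx, Fma, Accelerate, InnerProduct };

struct Selection {
  Kernel dot_product;      // floating point dot product implementation
  Kernel int_simd_matrix;  // integer matrix implementation
};

// Stand-in for hardware autodetection (assumption: AVX was detected).
Selection Autodetect() { return {Kernel::Avx, Kernel::Avx}; }

// Apply an override on top of the autodetected result. The three values
// named in the commit message only replace the dot product function and
// keep the autodetected integer matrix kernel.
Selection ApplyOverride(Selection s, const std::string &value) {
  if (value == "fma") {
    s.dot_product = Kernel::Fma;
  } else if (value == "accelerate") {
    s.dot_product = Kernel::Accelerate;
  } else if (value == "std::inner_product") {
    s.dot_product = Kernel::InnerProduct;
  }
  return s;  // unknown values leave the autodetected selection untouched
}

// Autodetect first, then honour the environment variable if it is set.
Selection SelectKernels() {
  Selection s = Autodetect();
  if (const char *value = std::getenv("DOTPRODUCT")) {
    s = ApplyOverride(s, value);
  }
  return s;
}
```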
Stefan Weil
edcf4fcd3b Fix comment
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-08-01 13:17:45 +02:00
Stefan Weil
0d0f203509 Add new configure option --enable-float32 for faster LSTM with float
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-29 06:49:08 +02:00
Stefan Weil
553ab64d8d Rename UnicityTable<T>::get_id to UnicityTable<T>::get_index
This prepares replacing UnicityTable<FontInfo> by FontInfoTable.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-26 07:59:58 +02:00
Stefan Weil
df1295ea6b Simplify *_VAR_H macros (#3508)
This avoids duplicate (and potentially inconsistent) code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-25 12:09:07 +03:00
Ger Hobbelt
27597883db Implement DotProductSSE() for FAST_FLOAT
[sw] Formatted commit message
2021-07-24 15:14:17 +02:00
Ger Hobbelt
79e8b4f344 bugfixing the AVX2 Extract8+16 codes
There are lines like `__m256d scale01234567 = _mm256_loadu_ps(scales)`,
i.e. loading float vectors into double vector types.

[sw] Formatted commit message
2021-07-24 15:14:17 +02:00
Ger Hobbelt
24a29b79e5 bugfix of FMA port to FAST_FLOAT
8 float FPs fit in a single 256bit vector (8x32)
(contrasting 4 double FPs: 4*64)

[sw] Format commit message and use float instead of TFloat
2021-07-24 15:14:17 +02:00
Stefan Weil
472f5d9020 Add TFloat data type for neural network
Up to now Tesseract used double for training and recognition
with "best" models.

This commit replaces double by a new data type TFloat which
is double by default, but float if FAST_FLOAT is defined.

Ideally this should allow faster training.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-24 15:14:17 +02:00
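A minimal sketch of the type switch this commit describes, assuming it is done with a simple alias controlled by the `FAST_FLOAT` macro. The lane-count constant and the `Dot` helper are hypothetical additions that show why the related FMA and AVX2 fixes had to change their vector widths: a 256-bit register holds 8 floats but only 4 doubles.

```cpp
#include <cstddef>

// double by default, float when FAST_FLOAT is defined at compile time.
#ifdef FAST_FLOAT
using TFloat = float;
#else
using TFloat = double;
#endif

// Number of TFloat values that fit in one 256-bit SIMD register
// (hypothetical helper constant: 8 for float, 4 for double).
constexpr std::size_t kLanesPer256Bit = 256 / (8 * sizeof(TFloat));

// Code written against TFloat works unchanged for either precision.
TFloat Dot(const TFloat *u, const TFloat *v, std::size_t n) {
  TFloat total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += u[i] * v[i];
  }
  return total;
}
```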
Stefan Weil
66b77e6639 Prepare using float instead of double for LSTM calculations
The new header file ccutils/tesstypes.h also prepares support
for larger images by introducing a new data type for image
size and coordinates (still unused).

FloatToDouble is now a local function.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-24 13:59:37 +02:00
Stefan Weil
4df822a3fc Revert "Merge pull request #3330 from Sintun/master" (#3505)
This reverts commit 122daf1d64, reversing
changes made to 4cd56dc5f5.

Those changes caused two regressions which resulted in an assertion
or a segmentation fault.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-22 09:04:23 +03:00
Stefan Weil
e176169a90 Remove stray spaces at line endings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:59:15 +02:00
Ger Hobbelt
444fe14273 Fix a couple of 'shadowed local variables' compiler warnings
These fixes got through while I manually extracted the template work
from my mainline (warnings due to running MSVC at Level 4)

[sw]: Format commit message and use different fix for blamer.cpp

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:49:03 +02:00
Stefan Weil
0fc6d8d7f0 Add missing hint for dotproduct parameter value "fma"
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:44:29 +02:00
Ger Hobbelt
f72d4b1fe7 NEON arch: dead ref cycle fix
When neon_available_ is ON, the DotProduct was set to point to DotProduct,
which should have been DotProductNative, as dotProduct is the *target* global itself:
see simddetect.h --> effectively making that part of the SetDotProduct() call
identical to this (no-op) statement: `DotProduct = DotProduct;`

Also added the Neon check in the Update() API, where it exists together
with the other checks (for AVX/SSE/etc.)

[sw: formatted commit message and merged into main branch]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:40:16 +02:00
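The self-assignment described in this commit can be reproduced with a simplified model. The names below mirror the commit message, but the signatures and the setter are hypothetical stand-ins, not the code in simddetect.h.

```cpp
// Hypothetical, simplified stand-ins for the names in the commit message.
using DotProductFunction = double (*)(const double *, const double *, int);

// The global function pointer that callers invoke (the *target*).
DotProductFunction DotProduct = nullptr;

// Plain loop implementation that the NEON branch was meant to select.
double DotProductNative(const double *u, const double *v, int n) {
  double total = 0.0;
  for (int i = 0; i < n; ++i) {
    total += u[i] * v[i];
  }
  return total;
}

void SetDotProduct(DotProductFunction f) { DotProduct = f; }

// Buggy variant: passes the target itself, which is equivalent to
// `DotProduct = DotProduct;` and therefore changes nothing.
void SelectForNeonBuggy() { SetDotProduct(DotProduct); }

// Fixed variant: passes the implementation that should be used.
void SelectForNeonFixed() { SetDotProduct(DotProductNative); }
```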
Stefan Weil
dff7312aed Modernize code in SIMDDetect::Update
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:16:49 +02:00
Stefan Weil
3ab8dcbf72 Use Apple Accelerate framework for training and best models
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 19:27:54 +02:00
Johannes Künsebeck
3be11f12a9 Removed unused parameters declarations and definitions 2021-07-20 15:08:10 +02:00
zdenop
8dd7936475 Solve clang reporting unused variable in ExtractMicros function (#3501)
* mark attribute as unused for compiler
* try c++17 standard https://en.cppreference.com/w/cpp/language/attributes/maybe_unused
2021-07-18 01:59:49 +02:00
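As an illustration of the C++17 attribute mentioned above, here is a small sketch with a hypothetical function (not `ExtractMicros`). A variable that is only read inside `assert` becomes unused when `NDEBUG` is defined, and `[[maybe_unused]]` tells the compiler that this is intentional.

```cpp
#include <cassert>

// Hypothetical example function, not taken from Tesseract.
int Half(int value) {
  // Only read by the assert below, so it is unused in NDEBUG builds.
  [[maybe_unused]] const bool non_negative = value >= 0;
  assert(non_negative);
  return value / 2;
}
```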
nagadomi
7fe0624838 Fix spec string of convolution layer (#3499) 2021-07-16 18:21:52 +03:00
Stefan Weil
88d4028a5a Enable pragma for SIMD also when _OPENMP is defined
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-15 16:03:43 +02:00
Stefan Weil
f0fb6809e3 Use SIMD instructions for DotProductNative
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-14 19:13:01 +02:00
Tadahito Yao
12e0fb4e01 Fix deadlock in lstmtraining. (#3488) 2021-07-10 10:59:10 +03:00
Stefan Weil
767fb5a177 Fix LSTMTrainerTest.BidiTest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-04 18:41:19 +02:00
Stefan Weil
158c845228 Catch another FP division by 0 (fixes issue #3483)
Rewriting the code avoids FP operations (so makes it potentially faster)
and fixes the division by 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-03 15:37:24 +02:00
Stefan Weil
4b630a8813 Catch FP division by 0 (fixes issue #3483)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-02 15:04:31 +02:00
Stefan Weil
a701454ae5 Fix vector resize with init for all elements (issue #3473) (#3474)
Fixes: c8b8d266d6
Fixes: 9710bc0465
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-06-29 21:05:29 +03:00
nagadomi
ff1062d39d Add --reset_learning_rate option to lstmtraining (#3470)
When the --reset_learning_rate option is specified,
it resets the learning rate stored in each layer of the network
loaded with --continue_from to the value specified by the --learning_rate option.
If a checkpoint is available, it does nothing.
2021-06-28 11:48:07 +03:00
nagadomi
d8bd78f8e2 Fix missing reset of best_error_history_ in LSTMTrainer::InitIterations() (#3469) 2021-06-27 09:26:32 +03:00
nagadomi
b2fa77f8f0 Show layer specified learning rates with combine_tessdata -l (#3468) 2021-06-26 08:08:54 +03:00
MonkeybreadSoftware
75e6c3ea4c Null check for GetSourceYResolution (#3457)
* Null check for GetSourceYResolution

Added missing NULL check to avoid crash when we read property in our tesseract wrapper.

* Added missing return value.

added -1 to return if undefined.
2021-06-16 16:35:24 +03:00
Amit Dovev
bf979c801a Remove unused variable 2021-05-21 20:34:09 +03:00
Egor Pugin
a72408fdef Merge pull request #3438 from amitdo/pango
Raise Minimum required Pango version to 1.38.0
2021-05-21 20:09:27 +03:00
Amit Dovev
8615f65cc4 Raise Minimum required Pango version to 1.38.0 2021-05-21 19:56:37 +03:00
Amit Dovev
c24538518c ThresholdMethod::TiledSauvola -> ThresholdMethod::Sauvola
The fact that this method uses tiles is an implementation detail. It does not change the result compared to Sauvola without tiles. The use of tiles minimizes memory consumption.
2021-05-21 18:15:30 +03:00
Stefan Weil
93348a83a3 Remove scripts for training
They were replaced by Python3 scripts (part of the tesstrain repository).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-18 10:47:44 +02:00
nagadomi
42e4b91132 Refactor ObjectCache::DeleteUnusedObjects with reverse iterator 2021-05-17 14:50:30 +02:00
nagadomi
dc4a8a6ce0 Fix crash in ObjectCache::DeleteUnusedObjects 2021-05-17 10:25:17 +09:00
Stefan Weil
0c4e2f1cb5 Fix comment in code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-16 07:47:19 +02:00
Stefan Weil
57b7974292 Remove an arbitrary limit for the image size
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-15 15:03:22 +02:00
Stefan Weil
a0cf117c5d Fix compiler warning in binarization code (uninitialized local variable)
Simplify the code also a little bit.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-15 15:03:22 +02:00
Stefan Weil
bf84fb9f2d Optimize code for binarization
Some code is only needed for Otsu or even not at all.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-15 15:03:22 +02:00
Stefan Weil
4b5dd25b84 Fix compiler warning
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-15 15:03:22 +02:00
Stefan Weil
12c29639fc Add conditional compilation with GRAPHICS_DISABLED
This fixes a compiler warning when GRAPHICS_DISABLED is defined.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-13 17:22:24 +02:00
Nick White
ad7010a5eb lstmeval: Only print char and word error rates for verbosity 2/3 2021-05-11 13:15:35 +01:00
Nick White
4787414d88 lstmeval: Print char and word error rates for each line tested 2021-05-11 10:54:34 +01:00
Nick White
9c82cc63c2 Switch to NFC normalisation everywhere 2021-05-11 10:18:06 +01:00
Egor Pugin
43747d6ea8 Postfix for #3418. 2021-05-10 15:06:27 +03:00
Egor Pugin
e7c01a6f15 Merge pull request #3418 from amitdo/thresholder
Add more binarization options
2021-05-10 14:45:03 +03:00
Amit Dovev
21e76c7a13 Convert enum ThreshMethod to enum class 2021-05-09 18:49:09 +03:00
Egor Pugin
176d0927bd Allow explicit casts of Image to Pix**. 2021-05-07 21:30:42 +03:00
Amit Dovev
11c73c9481 Add more binarization options
Use functions from Leptonica to provide more binarization options. The new options are: 1) Adaptive Otsu and 2) Sauvola (Tiled).
2021-05-07 16:48:26 +03:00
Egor Pugin
65118b2e3a [misc] Fix variable type. Fixes warning. 2021-05-04 16:12:40 +03:00
Egor Pugin
346b77c94e Remove unneeded header. 2021-05-04 16:10:52 +03:00
Egor Pugin
4fbe9f1de2 Revert d6cdc52. Fixes #3412. 2021-05-04 00:51:39 +03:00
Ger Hobbelt
bd8adff829 fix compile error: PrintFontsTable() is for legacy builds only
# Conflicts:
#	googletest
2021-04-29 23:27:20 +02:00
Lucas Cimon
b852d658cb Adding --print-fonts-table parameter & tessedit_font_id configuration option 2021-04-29 11:25:40 +02:00
Stefan Weil
2e2a5b3ef4 Improved fix for issue #3405
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-27 22:15:36 +02:00
Stefan Weil
0b7fc068d2 Revert "Fix double free. Closes #3405."
This reverts commit 3997cf54d2.
It will be replaced by a simpler fix.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-27 22:15:18 +02:00
Egor Pugin
3a195e5b05 Misc. 2021-04-27 22:08:29 +03:00
Egor Pugin
3997cf54d2 Fix double free. Closes #3405. 2021-04-27 22:08:06 +03:00
Egor Pugin
e3ac1835e0 Remove unneeded ctor. 2021-04-23 04:26:18 +03:00
Egor Pugin
a7f938d28e Make FontSet just a vector. 2021-04-23 04:25:45 +03:00
Egor Pugin
4ae5a7d6b5 Properly init font set. 2021-04-23 04:05:59 +03:00
Egor Pugin
048e63c02b Replace FontSet struct with vector. It may be improved further (remove pointer?). 2021-04-23 02:38:25 +03:00
Egor Pugin
d6cdc521e5 Remove unused headers. 2021-04-23 02:06:06 +03:00
Stefan Weil
740d10b61b Fix issue #3404 (empty page regression)
The regression was caused by a bug in commit 5db92b26aa.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-22 20:51:23 +02:00
Stefan Weil
66a963b50a Remove two assertions which are triggered by fuzzing
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-20 19:04:49 +02:00
Stefan Weil
26c21a6db4 Fix some compiler warnings with GRAPHICS_DISABLED
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-20 07:58:31 +02:00
Stefan Weil
6d0595b443 Fix memory leak (OSS-Fuzz issue 33220)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-19 20:59:18 +02:00
Robert Pösel
c74ff1259b Fix wrong parameter name and documentation
set_only_init_params -> set_only_non_debug_params
2021-04-19 16:55:01 +02:00
Stefan Weil
2dfa38a072 Fix old TODO for struct EDGEPT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-17 18:08:27 +02:00
Fabrizio Di Vittorio
2be896d2b9 Add SVSemaphore destructor to avoid system objects leaks 2021-04-15 09:23:22 +02:00
Stefan Weil
e6e871bc73 Replace pointer by value for ScrollView mutex
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-15 06:30:05 +02:00
Stefan Weil
4daf781916 Fix NULL pointer access (issue #3394)
The regression was caused by commit 57c90eee02.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 22:10:12 +02:00
Stefan Weil
91b2b4f4a0 Fix OSS-Fuzz issue 32142 (container-overflow write)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 13:45:12 +02:00
Stefan Weil
f83f00496e Clean, format and optimize code in edgblob.cpp / edgblob.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-12 08:03:30 +02:00
Egor Pugin
a732565cad Fix headers. 2021-04-12 01:40:40 +03:00
Egor Pugin
4f6ff85123 Remove unneeded header. 2021-04-12 01:19:00 +03:00
Egor Pugin
57c90eee02 [edgblob] Replace unique ptr with vector. Fix possible index issues.
Closes #1921.
2021-04-12 01:17:57 +03:00
Stefan Weil
cca46e6b29 Fix another use-after-free (issue #3394)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 21:37:46 +02:00
Stefan Weil
33fa9d3223 Fix use-after-free (issue #3394)
This bug was introduced by commit f77b1c6881.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 19:10:44 +02:00
Egor Pugin
423f00c351 Merge pull request #3393 from eighttails/fix_zero_division
Fix division by zero during CJK training.
2021-04-11 15:38:28 +03:00
Tadahito Yao
8a8204e62a Reverted one of zero value checks. 2021-04-11 21:30:02 +09:00
Tadahito Yao
05eef742df Fix division by zero during CJK training. 2021-04-11 20:14:45 +09:00
Stefan Weil
0401b9470c Fix some typos (most found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 11:06:36 +02:00
Stefan Weil
f77b1c6881 Fix memory leak (OSS-Fuzz issue #32246)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-10 21:35:31 +02:00
Amit D
a4a84c4c92 lstmrecognizer.cpp: Call OutputStats() only when 'invert' is true (#3387)
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 17:55:23 +02:00
Amit Dovev
e6ce048426 Change message from 'Found SSE' to 'Found SSE4.1' 2021-04-08 17:51:09 +02:00
Stefan Weil
63f4463028 Add const attribute to some functions (API change)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
253751c331 Simplify class REJ by replacing two std::bitset<16> by one std::bitset<32>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
2fbcca783b Make more functions in class REJ inline
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
a74bbb6032 Remove bits16.h and BITS16 data type
Add also const attribute to some functions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-08 10:43:21 +02:00
Stefan Weil
2fa96b765b Modernize and optimize list_rec a little bit
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 17:30:33 +02:00
Stefan Weil
7fd90498ca Modernize code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 17:30:33 +02:00
Egor Pugin
edfce72340 Refactor microfeatures a bit. 2021-04-07 17:29:46 +03:00
Egor Pugin
47715e576a Replace microfeatures from oldlist to std::forward_list. 2021-04-07 17:10:16 +03:00
Egor Pugin
2e17ee7327 Correct template args. 2021-04-07 13:28:57 +03:00
Stefan Weil
10255d013a Fix new / delete class mismatch
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-07 09:25:37 +02:00
Egor Pugin
b1731b6e73 Add missing TESS_API. 2021-04-07 00:59:36 +03:00
Egor Pugin
6e3259593a Reorder list templates. 2021-04-07 00:29:07 +03:00
Egor Pugin
409aa5296f Misc. 2021-04-07 00:17:04 +03:00
Egor Pugin
9d40512ade [elist2] Convert macros to template. Remove source file macro ELIST2IZE. 2021-04-07 00:15:01 +03:00
Egor Pugin
03435adca0 [elist] Rework macro into template and small macro. Move common iterator template into 'list_iterator.h'. 2021-04-07 00:04:30 +03:00
Egor Pugin
b9329e599f Misc. 2021-04-06 23:45:28 +03:00
Egor Pugin
746b87363b Remove unused methods. 2021-04-06 23:45:22 +03:00
Egor Pugin
29e75d0f51 [elist] Remove unused macros QUOTE_IT. 2021-04-06 23:40:56 +03:00
Egor Pugin
539f4b8255 [clist] Remove unused methods. 2021-04-06 23:40:35 +03:00
Egor Pugin
18e61d10ce Rework big clist macro into template and small macro. Remove unused macros QUOTE_IT and CLISTIZE (source file macro). 2021-04-06 23:37:14 +03:00
Raf Schietekat
6bbfef7c85 RAII: TessBaseAPI::GetIterator()
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 17:57:23 +02:00
Raf Schietekat
d71413f4aa RAII: TessBaseAPI::AnalyseLayout()
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 17:46:26 +02:00
Stefan Weil
897e59613d Clean code for hOCR renderer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 16:36:23 +02:00
Stefan Weil
3705989c94 Optimize length method for ELIST, ELIST2
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:57:12 +02:00
Stefan Weil
4104876b08 Add const attribute to some methods of ELIST, ELIST2 and related classes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:48:18 +02:00
Stefan Weil
fb904d2265 Remove redundant debug code for CLIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:26:04 +02:00
Stefan Weil
b47ce5643b Modernize CLIST code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:16:57 +02:00
Stefan Weil
fd187b0c18 Optimize CLIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-06 15:08:35 +02:00
Stefan Weil
4a628729b2 Delete assignment and copy constructor for ELIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:59:31 +02:00
Stefan Weil
b0b5600c30 Delete assignment and copy constructor for ELIST2, ELIST2_LINK
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:59:00 +02:00
Stefan Weil
24f91fab0b Delete assignment and copy constructor for CLIST, CLIST_LINK
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 19:42:01 +02:00
Stefan Weil
eeb67e8ae8 Replace find / insert by insert on unordered set to optimize GridSearch
Both find and insert can be slow for a large unordered set.

Instead of using both methods, it is sufficient to simply try only
the insert method which returns whether the insertion was possible
or not.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-05 18:11:33 +02:00
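The optimization in this commit relies on `std::unordered_set::insert` returning a pair whose `second` member reports whether the element was newly inserted. The sketch below contrasts the two patterns with hypothetical helper functions; it is not the `GridSearch` code itself.

```cpp
#include <unordered_set>

// Old pattern: one hash lookup for find, a second one for insert.
bool MarkSeenTwoLookups(std::unordered_set<int> &seen, int value) {
  if (seen.find(value) != seen.end()) {
    return false;  // already present
  }
  seen.insert(value);
  return true;
}

// New pattern: a single insert; .second is true only for new elements.
bool MarkSeen(std::unordered_set<int> &seen, int value) {
  return seen.insert(value).second;
}
```

Both functions return the same result for every input, so the replacement does not change behavior; it only removes the redundant lookup.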
Egor Pugin
50aec308b3 Remove unnecessary pointer hasher for uset. 2021-04-04 14:00:46 +03:00
Stefan Weil
0611c892b6 Disable more code with GRAPHICS_DISABLED
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-02 16:43:26 +02:00
Egor Pugin
7a73875bd1 Merge pull request #3375 from amitdo/viewer
Disable more code with GRAPHICS_DISABLED
2021-04-02 12:27:24 +03:00
Amit Dovev
6d94b22c80 Disable more code with GRAPHICS_DISABLED 2021-04-02 11:12:38 +03:00
Egor Pugin
34e0d017ab Add Image::operator&=(). 2021-04-01 19:15:58 +03:00
Egor Pugin
9e3da4a724 Add Image::operator|=(). 2021-04-01 19:10:48 +03:00
Egor Pugin
e077b7255d Remove arg from Image::copy(). 2021-04-01 19:08:47 +03:00
Egor Pugin
d5fb7f9843 Init variable. 2021-04-01 17:16:46 +03:00
Egor Pugin
fe02ba2363 Add Image::isZero(). 2021-04-01 17:15:48 +03:00
Egor Pugin
306d296979 Add Image::clone(). 2021-04-01 17:06:30 +03:00
Egor Pugin
2aca22439e Add Image::copy(). 2021-04-01 16:55:43 +03:00
Stefan Weil
5159f9aa12 Fix name conflict between class and function named Image
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-01 14:00:08 +02:00
Egor Pugin
e429b607ae [misc] Update header guard. 2021-04-01 01:36:22 +03:00
Egor Pugin
1628a9aae3 Revert 4fa05b9147. Make a note. 2021-04-01 01:35:50 +03:00
Egor Pugin
a792b67983 Basic usage of new Image class. Only pixDestroy is wrapped at the moment.
Add new methods to Image class and replace them in non-public code.
2021-03-31 22:39:43 +03:00
Egor Pugin
ce6e2f1821 Initial tesseract Image wrapper.
Provide basic Pix conversions.
Add destroy() method.

It can be extended later to 1) image owner (raii), 2) different image libraries.
2021-03-31 22:38:32 +03:00
Egor Pugin
4fa05b9147 Remove unused ifdef. 2021-03-31 21:54:12 +03:00
Stefan Weil
722767633e Partially fix issue #3374
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-31 19:23:07 +02:00
Stefan Weil
b7c6d971f3 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-31 07:08:53 +02:00
Stefan Weil
6684a727c1 Improve some structs further (fixes several CID issues)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-30 14:20:52 +02:00
Nick White
abea25ee2f lstm: Include missing header 2021-03-29 18:53:35 +02:00
Stefan Weil
2e349dbba5 Fix compilation for Tensorflow code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 16:19:06 +02:00
Stefan Weil
3c03d70e64 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 16:12:52 +02:00
Stefan Weil
f639500a81 Add missing TESS_API for sw builds
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:34:23 +02:00
Stefan Weil
5c4de14567 Replace strdup / free by std::string in SVSync::StartProcess
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:24:58 +02:00
Stefan Weil
3790413cc5 Replace remaining malloc / free in training code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:24:58 +02:00
Stefan Weil
7c1bea505a Replace strdup / free by std::string for StringRenderer::features_
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 11:24:58 +02:00
Stefan Weil
201686feb8 Use lept_free instead of free for memory which was allocated by Leptonica
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 10:55:33 +02:00
Stefan Weil
1b95eb1d19 Replace malloc / free by std::string for LABELEDLISTNODE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 10:29:08 +02:00
Stefan Weil
1620daffcd Replace malloc / free by std::string in LABELEDLISTNODE and MERGE_CLASS_NODE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-29 10:17:42 +02:00
Stefan Weil
0976e23387 Replace malloc / free by new / delete for KDTREE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 23:19:46 +02:00
Stefan Weil
c05d849381 Replace malloc / free by new / delete for NORM_PROTOS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 22:37:47 +02:00
Stefan Weil
174210c849 Replace malloc / free by new / delete for MFEDGEPT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 22:24:51 +02:00
Stefan Weil
0c3d244238 Replace new / delete by std::vector for INT_CLASS_STRUCT::ProtoLengths
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 22:09:06 +02:00
Stefan Weil
486c257f42 Replace malloc / free by new / delete for MICROFEATURE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 21:20:59 +02:00
Stefan Weil
30f44f333a Replace malloc / free by new / delete for KDNODE
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 21:11:22 +02:00
Stefan Weil
47a1fd7b45 Replace malloc / free by new / delete for INT_CLASS_STRUCT::ProtoLengths
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:41:37 +02:00
Stefan Weil
d6caae3793 Replace malloc / free by std::vector for BUCKETS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:32:57 +02:00
Stefan Weil
78f8a47d05 Replace malloc / free by std::vector for PROTOTYPE::Distrib
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
b8488dac7a Replace malloc / free for TEMPCLUSTER
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
2a569c9cfb Replace malloc / free for FLOATUNION::Elliptical
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
5bf1af257c Use std::vector<BIT_VECTOR> for CLASS_STRUCT::Configurations
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
6f499f7fb5 Use std::vector<PROTO_STRUCT> for CLASS_STRUCT::Prototypes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
441f74c1e6 Replace malloc / free for STATISTICS
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:35 +02:00
Stefan Weil
57d3a1eb99 Replace malloc / free for CLUSTER::Mean and PROTOTYPE::Mean
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 20:31:32 +02:00
Stefan Weil
667eee2344 Replace malloc / free for CLIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
0077bc46cf Replace malloc / free for ELIST2
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
2c273c1b3b Replace malloc / free for ELIST
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
582260a9bf Replace malloc / free for C_OUTLINE::steps
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
b15b5d1de7 Replace malloc / free by new / delete for FEATURE_STRUCT, FEATURE_SET_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-28 12:12:18 +02:00
Stefan Weil
aa8dda89a3 Replace malloc / free by new / delete for CHAR_DESC_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 18:43:14 +01:00
Stefan Weil
0f90ccb9cd Replace malloc / free by new / delete for CHISTRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 16:45:14 +01:00
Stefan Weil
0a46866bcd Replace malloc / free by new / delete for PERM_CONFIG_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 16:19:40 +01:00
Stefan Weil
92359a4a11 Replace malloc / free by new / delete for TEMP_CONFIG_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 15:59:28 +01:00
Stefan Weil
fdf4539769 Replace malloc / free by new / delete for ADAPT_CLASS_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 13:49:57 +01:00
Stefan Weil
0a0a3e1946 Replace malloc / free by new / delete
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 13:00:18 +01:00
Stefan Weil
884a28b366 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 13:00:18 +01:00
Stefan Weil
77514d693f Modernize BitVector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-27 13:00:18 +01:00
Stefan Weil
0f72e0fdb3 Simplify checks for emptiness
Replace the patterns (x.size() == 0) and (x.length() == 0) by x.empty().

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-26 23:22:50 +01:00
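The replacement pattern from this commit, shown on a hypothetical helper: `empty()` states the intent directly and is available with the same meaning on strings and all standard containers.

```cpp
#include <string>
#include <vector>

// Before: compares the element count against zero.
template <typename Container>
bool IsEmptyOld(const Container &c) {
  return c.size() == 0;
}

// After: asks the container directly.
template <typename Container>
bool IsEmptyNew(const Container &c) {
  return c.empty();
}
```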
Egor Pugin
067c971774 Misc. 2021-03-24 14:36:45 +03:00
Egor Pugin
7c975a0eee Remove default locale setting in debug config. Any locale errors must be fixed separately (if any).
Fixes #3290.
2021-03-24 14:36:40 +03:00
Stefan Weil
595346d548 Replace some snprintf by std::to_string and modernize more code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-24 08:01:59 +01:00
Stefan Weil
2048f328e0 Suppress output of page number for TIFF files with a single image
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 18:25:15 +01:00
Stefan Weil
264dfb3685 Don't convert for loop after '#pragma omp parallel' with clang-tidy
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 15:48:59 +01:00
Stefan Weil
1205f036ea Remove TessBaseAPI::SetThresholder (API change)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 08:59:04 +01:00
Stefan Weil
7d70ed4b41 Modernize code for OTSU and reduce public API further
Remove thresholder.h from the public API.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-23 08:59:04 +01:00
Stefan Weil
ef645ce334 Avoid lots of messages for training with single line images
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 16:06:30 +01:00
Egor Pugin
7677b80408 Merge pull request #3355 from eighttails/output_training_command_line
Print command line options if run_command() failed.
2021-03-22 15:13:31 +03:00
Tadahito Yao
3b436a72c5 Print command line options if run_command() failed.
2021-03-22 20:46:44 +09:00
Stefan Weil
67dcbdda2f Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 10:36:38 +01:00
Stefan Weil
4530763329 Fix some compiler warnings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:15:09 +01:00
Stefan Weil
fbaac9dc9d Modernize code (clang-tidy -checks='-*,google-readability-braces-around-statements')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:03:51 +01:00
Stefan Weil
a54dc6390d Modernize code (clang-tidy -checks='-*,modernize-use-auto')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:02:57 +01:00
Stefan Weil
77ed2886a7 Modernize code (clang-tidy -checks='-*,modernize-loop-convert')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:02:51 +01:00
Stefan Weil
d4d51910e1 Add braces to single line statements (clang-tidy -checks='-*,google-readability-braces-around-statements')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-22 09:02:13 +01:00
Stefan Weil
5384aa7b21 Modernize code (clang-tidy -checks='-*,modernize-use-equals-delete')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:56 +01:00
Stefan Weil
406233f1ae Modernize code (clang-tidy -checks='-*,modernize-use-equals-default')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:56 +01:00
Stefan Weil
27293fad62 Modernize code (clang-tidy -checks='-*,modernize-use-emplace')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
6fc31c44f8 Modernize code (clang-tidy -checks='-*,modernize-use-bool-literals')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
35e143ddfc Modernize code (clang-tidy -checks='-*,modernize-use-auto')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
1439efa734 Modernize code (clang-tidy -checks='-*,modernize-make-unique')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
02774bda6e Modernize code (clang-tidy -checks='-*,modernize-loop-convert')
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 21:45:55 +01:00
Stefan Weil
719dc1d7da Modernize code using override
The modifications were made using this command:

run-clang-tidy -header-filter='.*' -checks='-*,modernize-use-override' -fix

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 20:06:38 +01:00
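The effect of the `modernize-use-override` check can be sketched as follows. The classes `Renderer` and `TextRenderer` here are hypothetical stand-ins, not the Tesseract classes of similar names.

```cpp
#include <string>

// Hypothetical base class for illustration only.
struct Renderer {
  virtual ~Renderer() = default;
  virtual std::string Extension() const { return ""; }
};

// Hypothetical derived class.
struct TextRenderer : Renderer {
  // Before: virtual std::string Extension() const { return "txt"; }
  // After:  override makes the compiler reject the declaration if it does
  //         not actually override a virtual function of the base class.
  std::string Extension() const override { return "txt"; }
};

// Calls through the base interface to show that dynamic dispatch is unchanged.
std::string ExtensionOf(const Renderer &renderer) {
  return renderer.Extension();
}
```

With `override`, a signature mismatch (for example a missing `const`) becomes a compile error instead of silently declaring a new function.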
Stefan Weil
187ac4136a Fix LGTM alert (local variable hides a parameter)
LGTM alert:

    Local variable 'correct_text' hides a parameter of the same name.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 18:20:13 +01:00
Egor Pugin
7d17b72ba5 Use more smart pointers.
2021-03-21 15:19:21 +03:00
Stefan Weil
0c20d3f843 Fix compiler warnings (mostly -Wsign-compare)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-21 09:29:34 +01:00
Stefan Weil
55d87f642c Disable most Leptonica messages for tesseract by default
They were disabled in earlier builds which used NDEBUG, too.

Allow manual setting of the Leptonica message level
with environment variable LEPT_MSG_SEVERITY.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 20:16:16 +01:00
Stefan Weil
19afcdb79b Remove unused function UnicharIdArrayUtils::find_in
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 15:51:28 +01:00
Stefan Weil
7af5b75b8f Disable unused WriteMemoryCallback if libcurl is not used
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 15:49:06 +01:00
Egor Pugin
db7a977eab Use smart pointers.
2021-03-20 16:04:45 +03:00
Egor Pugin
69ab5bbf65 Misc.
2021-03-20 16:04:00 +03:00
Stefan Weil
f176e7c274 Fix double free caused by commit f33e80e (fixes issue #3348)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 12:37:56 +01:00
Stefan Weil
87b0a4de97 Rename GenericVector::get
The new name GenericVector::at is compatible with standard containers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:42:19 +01:00
Stefan Weil
2c1c09bd6a Rename UnicityTable::get, UnicityTable::get_mutable
The new name UnicityTable::at is compatible with standard containers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:40:00 +01:00
Stefan Weil
883353df63 Replace std::array by std::vector to avoid stack overflow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:39:16 +01:00
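Why this replacement helps can be shown with a small sketch. A large `std::array` declared as a local variable lives on the stack, while `std::vector` keeps its elements on the heap. The size `kTableSize` and the function `SumTable` are hypothetical; the commit does not say which array was affected.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical size: 1 Mi doubles is about 8 MiB, which can exceed the
// default stack limit on common platforms.
constexpr std::size_t kTableSize = std::size_t{1} << 20;

// Hypothetical function for illustration only.
double SumTable() {
  // Before: std::array<double, kTableSize> table;  // whole array on the stack
  // After:  only the small vector object is on the stack; the elements are
  //         allocated on the heap.
  std::vector<double> table(kTableSize, 1.0);
  double sum = 0.0;
  for (double value : table) {
    sum += value;
  }
  return sum;
}
```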
Stefan Weil
ec2c989d00 Modernize code in src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-20 09:06:40 +01:00
Stefan Weil
54aec32586 Replace remaining PointerVector by std::vector for src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 22:22:04 +01:00
Stefan Weil
0d739530a5 Remove unused PointerVector::DeSerialize, PointerVector::DeSerializeElement
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 21:53:17 +01:00
Stefan Weil
7207cf13d7 Replace more PointerVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 21:53:08 +01:00
Stefan Weil
aa64d83c2f Replace more PointerVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 15:22:29 +01:00
Stefan Weil
79477dc2fe Replace more PointerVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 14:46:25 +01:00
Stefan Weil
752779aaed Replace more PointerVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
cac116dd11 Replace more PointerVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
dae5accceb Replace remaining PointerVector by std::vector for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
9e006a8bbc Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
65d882f96e Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
8ed6dee8e9 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
abc22976e4 Replace remaining PointerVector by std::vector for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 12:27:48 +01:00
Stefan Weil
7f11261076 Suppress resolution warning if no resolution was given
Tesseract reported confusing information for images without resolution:

    Warning: Invalid resolution 0 dpi. Using 70 instead.
    Estimating resolution as 642

The warning is also shown when the resolution is not used at all
when preparing data for training.

It is now suppressed when there is no resolution information
(resolution == 0).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-19 10:45:54 +01:00
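The behaviour described in this commit message can be sketched roughly as below. This is not the actual Tesseract code: the lower bound of 70 dpi comes from the quoted warning, while the upper bound, the struct, and the function name are assumptions made for illustration.

```cpp
// Lower bound taken from the quoted message "Using 70 instead".
constexpr int kMinCredibleResolution = 70;
// Assumed upper bound; the commit message does not state one.
constexpr int kMaxCredibleResolution = 2400;

// Hypothetical result type: the resolution to use and whether to warn.
struct ResolutionCheck {
  int resolution;
  bool warn;
};

// Hypothetical function for illustration only.
ResolutionCheck CheckResolution(int resolution) {
  const bool out_of_range = resolution < kMinCredibleResolution ||
                            resolution > kMaxCredibleResolution;
  // resolution == 0 means "no resolution information", so stay silent.
  const bool warn = out_of_range && resolution != 0;
  return {out_of_range ? kMinCredibleResolution : resolution, warn};
}
```

In this sketch an image without resolution information still falls back to 70 dpi, but no "Invalid resolution" warning would be printed for it.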
Stefan Weil
52a82b4356 Fix new alert reported by LGTM
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 23:27:17 +01:00
Stefan Weil
f33e80e2fb Replace remaining PointerVector by std::vector for src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 20:14:40 +01:00
Stefan Weil
07d147d4a6 Replace more PointerVector by std::vector for src/textord
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 19:04:00 +01:00
Stefan Weil
b0e30bd247 Replace remaining PointerVector by std::vector for src/wordrec
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 18:56:08 +01:00
Stefan Weil
b62a86a93f Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 17:16:43 +01:00
Stefan Weil
177703c562 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
9e566de0f2 Remove unused classes WordFeature, FloatWordFeature
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
7b92614efa Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 16:46:56 +01:00
Stefan Weil
a584ee5ac0 Add missing include statement (fix CI build)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 15:59:58 +01:00
Stefan Weil
9eab1d60c1 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 15:04:56 +01:00
Stefan Weil
f8d55f30d8 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:31:13 +01:00
Stefan Weil
d9739ba459 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:27:37 +01:00
Stefan Weil
4b428df131 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:18:49 +01:00
Stefan Weil
92e98a30e1 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 12:04:22 +01:00
Stefan Weil
573e7d6bb9 Replace more GenericVector by std::vector
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 11:58:13 +01:00
Stefan Weil
a80689559b Partially revert "Replace more GenericVector by std::vector for src/ccutil"
This partially reverts and cleans commit 96d72298b12f744a72e5c3cea67924779e859e42
which had broken intfeaturemap_test.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 11:43:32 +01:00
Stefan Weil
576d8d6c63 Partially revert "Replace remaining GenericVector by std::vector for src/training"
This partially reverts commit 7df1cb0bab
which had broken lstm_squashed_test.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 10:59:07 +01:00
Stefan Weil
77dbd3ee02 Remove two type casts
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 09:04:39 +01:00
Stefan Weil
7fdf79aff4 Move function ExtractFontName to baseapi.cpp
It is only used there, so now a local function.
This also allows removing blobclass.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
a847e0f9b5 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
7df1cb0bab Replace remaining GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
4d8e9dc659 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
37c9cf4940 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:36 +01:00
Stefan Weil
a00e7bc2bb Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
1609014525 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
cb207ce645 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
b0b6bbf019 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
699f727f3e Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 08:57:35 +01:00
Stefan Weil
edab5ddee8 Replace remaining choose_nth_item by std::nth_element
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-18 07:23:40 +01:00
Stefan Weil
94a3a70fda Fix new alerts reported by LGTM
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
f5a10618bf Add missing reference & for loop iterator
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
5dc3f25aca Make only locally used functions row_y_order and row_spacing_order static
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 21:51:40 +01:00
Stefan Weil
edd599fa7b Replace more GenericVector by std::vector and remove GenericVector::choose_nth_item
KDVector is now derived from std::vector.

This requires an update for unittest nthitem_test because
std::nth_element does not handle all corner cases of choose_nth_item.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
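A minimal sketch of selecting the n-th smallest element with `std::nth_element`, the standard algorithm named in this commit. The helper `NthSmallest` is hypothetical. The commit notes that `std::nth_element` does not cover every corner case of `choose_nth_item`; the sketch does not try to reproduce those cases and instead treats an out-of-range index as a precondition violation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not from the Tesseract sources).
// Returns the value that would be at index n if the values were sorted.
// Takes the vector by value because std::nth_element reorders its input.
int NthSmallest(std::vector<int> values, std::size_t n) {
  // Precondition: n must be a valid index. std::nth_element itself does
  // not check this.
  assert(n < values.size());
  std::nth_element(values.begin(), values.begin() + n, values.end());
  return values[n];
}
```

`std::nth_element` only partially orders the range, so it is typically cheaper than a full sort when just one rank (for example the median) is needed.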
Stefan Weil
4779615679 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
4103c40a29 Replace more GenericVector by std::vector for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
e0b1093249 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
71dfb82065 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
dcef5a5df1 Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
314933823a Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
6c589e044f Replace more GenericVector by std::vector for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
9728bbc596 Replace more GenericVector by std::vector for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 20:28:04 +01:00
Stefan Weil
415d9aa2da Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 17:31:12 +01:00
Stefan Weil
ef39692451 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 17:31:12 +01:00
Stefan Weil
2fb6f9eb72 Replace remaining GenericVector by std::vector for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-03-17 13:45:54 +01:00