Commit Graph

6065 Commits

Author SHA1 Message Date
Amit D
c8bb526afb
Merge pull request #3510 from stweil/enable-float32
Add new configure option --enable-float32 for faster LSTM with float
2021-07-29 18:01:21 +03:00
Stefan Weil
0d0f203509 Add new configure option --enable-float32 for faster LSTM with float
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-29 06:49:08 +02:00
Stefan Weil
553ab64d8d Rename UnicityTable<T>::get_id to UnicityTable<T>::get_index
This prepares replacing UnicityTable<FontInfo> by FontInfoTable.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-26 07:59:58 +02:00
Stefan Weil
c9f42ce62b
Add unittest for static TessBaseAPI object (#3509)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-25 14:34:43 +03:00
Stefan Weil
df1295ea6b
Simplify *_VAR_H macros (#3508)
This avoids duplicate (and potentially inconsistent) code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-25 12:09:07 +03:00
Amit D
e538cd7152
Merge pull request #3486 from stweil/tfloat
Add TFloat data type for neural network
2021-07-25 00:03:56 +03:00
Ger Hobbelt
27597883db Implement DotProductSSE() for FAST_FLOAT
[sw] Formatted commit message
2021-07-24 15:14:17 +02:00
Ger Hobbelt
79e8b4f344 bugfixing the AVX2 Extract8+16 codes
There's lines like `__m256d scale01234567 = _mm256_loadu_ps(scales)`,
i.e. loading float vectors into double vector types.

[sw] Formatted commit message
2021-07-24 15:14:17 +02:00
Ger Hobbelt
24a29b79e5 bugfix of FMA port to FAST_FLOAT
8 float FPs fit in a single 256bit vector (8x32)
(contrasting 4 double FPs: 4*64)

[sw] Format commit message and use float instead of TFloat
2021-07-24 15:14:17 +02:00
Stefan Weil
472f5d9020 Add TFloat data type for neural network
Up to now Tesseract used double for training and recognition
with "best" models.

This commit replaces double by a new data type TFloat which
is double by default, but float if FAST_FLOAT is defined.

Ideally this should allow faster training.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-24 15:14:17 +02:00
Stefan Weil
66b77e6639 Prepare using float instead of double for LSTM calculations
The new header file ccutils/tesstypes.h also prepares support
for larger images by introducing a new data type for image
size and coordinates (still unused).

FloatToDouble is now a local function.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-24 13:59:37 +02:00
Stefan Weil
c3fb050daa Remove TODO comment which is no longer open
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-24 11:20:29 +02:00
Stefan Weil
4df822a3fc
Revert "Merge pull request #3330 from Sintun/master" (#3505)
This reverts commit 122daf1d64, reversing
changes made to 4cd56dc5f5.

Those changes caused two regressions which resulted in an assertion
or a segmentation fault.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-22 09:04:23 +03:00
Stefan Weil
e176169a90 Remove stray spaces at line endings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:59:15 +02:00
Ger Hobbelt
444fe14273 Fix a couple of 'shadowed local variables' compiler warnings
These fixes got through while I manually extracted the template work
from my mainline (warnings due to running MSVC at Level 4)

[sw]: Format commit message and use different fix for blamer.cpp

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:49:03 +02:00
Stefan Weil
0fc6d8d7f0 Add missing hint for dotproduct parameter value "fma"
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:44:29 +02:00
Ger Hobbelt
f72d4b1fe7 NEON arch: dead ref cycle fix
When neon_available_ is ON, the DotProduct was set to point to DotProduct,
which should have been DotProductNative, as dotProduct is the *target* global itself:
see simddetect.h --> effectively making that part of the SetDotProduct() call
identical to this (no-op) statement: `DotProduct = DotProduct;`

Also added the Neon check in the Update() API, where it exists together
with the other checks (for AVX/SSE/etc.)

[sw: formatted commit message and merged into main branch]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:40:16 +02:00
Stefan Weil
dff7312aed Modernize code in SIMDDetect::Update
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 20:16:49 +02:00
Stefan Weil
3ab8dcbf72 Use Apple Accelerate framework for training and best models
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-20 19:27:54 +02:00
Johannes Künsebeck
3be11f12a9 Removed unused parameters declarations and definitions 2021-07-20 15:08:10 +02:00
zdenop
8dd7936475
Solve clang reporting unused variable in ExtractMicros function (#3501)
* mark attribute as unused for compiler
* try c++17 standard https://en.cppreference.com/w/cpp/language/attributes/maybe_unused
2021-07-18 01:59:49 +02:00
SpaceIm
b2fea77a27 fix cross-build to iOS/tvOS/watchOS
On these OS, executables are bundle, and they need a BUNDLE DESTINATION, otherwise CMake configuration fails.
See https://cmake.org/cmake/help/latest/policy/CMP0006.html
2021-07-17 09:29:34 +02:00
nagadomi
7fe0624838
Fix spec string of convolution layer (#3499) 2021-07-16 18:21:52 +03:00
Stefan Weil
88d4028a5a Enable pragma for SIMD also when _OPENMP is defined
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-15 16:03:43 +02:00
Stefan Weil
f0fb6809e3 Use SIMD instructions for DotProductNative
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-14 19:13:01 +02:00
Tadahito Yao
12e0fb4e01
Fix deadlock in lstmtraing. (#3488) 2021-07-10 10:59:10 +03:00
Stefan Weil
767fb5a177 Fix LSTMTrainerTest.BidiTest
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-04 18:41:19 +02:00
Stefan Weil
915c29e3c8 Fix IntSimdMatrixTest.AVX2
Fixes: 872816897a
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-04 09:07:35 +02:00
Stefan Weil
e0af8d12e6 Fix check for NEON on 32 bit ARM
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-03 15:43:10 +02:00
Stefan Weil
158c845228 Catch another FP division by 0 (fixes issue #3483)
Rewriting the code avoids FP operations (so makes it potentially faster)
and fixes the division by 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-03 15:37:24 +02:00
Stefan Weil
4333b2cea3
Use CMAKE_SYSTEM_PROCESSOR to check for SIMD support options (#3484)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-02 16:23:58 +03:00
Stefan Weil
4b630a8813 Catch FP division by 0 (fixes issue #3483)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-07-02 15:04:31 +02:00
Łukasz Nocuń
38f0fdcd88
Fix CMake Linux build (#3478) 2021-06-30 00:16:51 +03:00
OgreTransporter
4d0f027f58
Bugfix OpenMP with Visual Studio (#3475)
* Bugfix OpenMP with Visual Studio

* Test for working VS2019 update instead of first version of VS2019
2021-06-29 21:06:38 +03:00
Stefan Weil
a701454ae5
Fix vector resize with init for all elements (issue #3473) (#3474)
Fixes: c8b8d266d6
Fixes: 9710bc0465
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-06-29 21:05:29 +03:00
nagadomi
ff1062d39d
Add --reset_learning_rate option to lstmtraining (#3470)
When the --reset_learning_rate option is specified,
it resets the learning rate stored in each layer of the network
loaded with --continue_from to the value specified by the --learning_rate option.
If checkpoint is available, it does nothing.
2021-06-28 11:48:07 +03:00
nagadomi
d8bd78f8e2
Fix missing reset of best_error_history_ in LSTMTrainer::InitIterations() (#3469) 2021-06-27 09:26:32 +03:00
Stefan Weil
29e842df46 CI: Replace g++-8 by g++-11 for MacOS
g++-8 is no longer installed, therefore CI fails for that compiler.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-06-26 14:56:07 +02:00
nagadomi
b2fa77f8f0
Show layer specified learning rates with combine_tessdata -l (#3468) 2021-06-26 08:08:54 +03:00
Łukasz Nocuń
c583ecef29
Fix permanently disabled optimizations in CMake (#3467) 2021-06-24 21:16:23 +03:00
MonkeybreadSoftware
75e6c3ea4c
Null check for GetSourceYResolution (#3457)
* Null check for GetSourceYResolution

Added missing NULL check to avoid crash when we read property in our tesseract wrapper.

* Added missing return value.

added -1 to return if undefined.
2021-06-16 16:35:24 +03:00
Egor Pugin
7a308edcb1
Merge pull request #3439 from amitdo/remove-var
Remove unused variable
2021-05-21 23:07:26 +03:00
Amit Dovev
bf979c801a Remove unused variable 2021-05-21 20:34:09 +03:00
Egor Pugin
a72408fdef
Merge pull request #3438 from amitdo/pango
Raise Minimum required Pango version to 1.38.0
2021-05-21 20:09:27 +03:00
Egor Pugin
ef69805298
Merge pull request #3437 from amitdo/sauvola
ThresholdMethod::TiledSauvola -> ThresholdMethod::Sauvola
2021-05-21 20:08:59 +03:00
Amit Dovev
8615f65cc4 Raise Minimum required Pango version to 1.38.0 2021-05-21 19:56:37 +03:00
Amit Dovev
c24538518c ThresholdMethod::TiledSauvola -> ThresholdMethod::Sauvola
The fact that this method uses tiles is implementation detail. It does not change the result compared to Sauvola without tiles. The use of tiles minimize memory consumption.
2021-05-21 18:15:30 +03:00
Stefan Weil
93348a83a3 Remove scripts for training
They were replaced by Python3 scripts (part of the tesstrain repository).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-18 10:47:44 +02:00
Stefan Weil
5eb2e86635 Fix some typos (found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-05-17 15:18:43 +02:00
nagadomi
42e4b91132 Refactor ObjectCache::DeleteUnusedObjects with reverse iterator 2021-05-17 14:50:30 +02:00