tesseract

mirror of https://github.com/tesseract-ocr/tesseract.git synced 2024-12-30 12:28:19 +08:00

Author	SHA1	Message	Date
Amit D	c8bb526afb	Merge pull request #3510 from stweil/enable-float32 Add new configure option --enable-float32 for faster LSTM with float	2021-07-29 18:01:21 +03:00
Stefan Weil	0d0f203509	Add new configure option --enable-float32 for faster LSTM with float Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-29 06:49:08 +02:00
Stefan Weil	553ab64d8d	Rename UnicityTable<T>::get_id to UnicityTable<T>::get_index This prepares replacing UnicityTable<FontInfo> by FontInfoTable. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-26 07:59:58 +02:00
Stefan Weil	c9f42ce62b	Add unittest for static TessBaseAPI object (#3509) Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-25 14:34:43 +03:00
Stefan Weil	df1295ea6b	Simplify *_VAR_H macros (#3508) This avoids duplicate (and potentially inconsistent) code. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-25 12:09:07 +03:00
Amit D	e538cd7152	Merge pull request #3486 from stweil/tfloat Add TFloat data type for neural network	2021-07-25 00:03:56 +03:00
Ger Hobbelt	27597883db	Implement DotProductSSE() for FAST_FLOAT [sw] Formatted commit message	2021-07-24 15:14:17 +02:00
Ger Hobbelt	79e8b4f344	bugfixing the AVX2 Extract8+16 codes There's lines like `__m256d scale01234567 = _mm256_loadu_ps(scales)`, i.e. loading float vectors into double vector types. [sw] Formatted commit message	2021-07-24 15:14:17 +02:00
Ger Hobbelt	24a29b79e5	bugfix of FMA port to FAST_FLOAT 8 float FPs fit in a single 256bit vector (8x32) (contrasting 4 double FPs: 4*64) [sw] Format commit message and use float instead of TFloat	2021-07-24 15:14:17 +02:00
Stefan Weil	472f5d9020	Add TFloat data type for neural network Up to now Tesseract used double for training and recognition with "best" models. This commit replaces double by a new data type TFloat which is double by default, but float if FAST_FLOAT is defined. Ideally this should allow faster training. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-24 15:14:17 +02:00
Stefan Weil	66b77e6639	Prepare using float instead of double for LSTM calculations The new header file ccutils/tesstypes.h also prepares support for larger images by introducing a new data type for image size and coordinates (still unused). FloatToDouble is now a local function. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-24 13:59:37 +02:00
Stefan Weil	c3fb050daa	Remove TODO comment which is no longer open Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-24 11:20:29 +02:00
Stefan Weil	4df822a3fc	Revert "Merge pull request #3330 from Sintun/master" (#3505) This reverts commit `122daf1d64`, reversing changes made to `4cd56dc5f5`. Those changes caused two regressions which resulted in an assertion or a segmentation fault. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-22 09:04:23 +03:00
Stefan Weil	e176169a90	Remove stray spaces at line endings Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-20 20:59:15 +02:00
Ger Hobbelt	444fe14273	Fix a couple of 'shadowed local variables' compiler warnings These fixes got through while I manually extracted the template work from my mainline (warnings due to running MSVC at Level 4) [sw]: Format commit message and use different fix for blamer.cpp Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-20 20:49:03 +02:00
Stefan Weil	0fc6d8d7f0	Add missing hint for dotproduct parameter value "fma" Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-20 20:44:29 +02:00
Ger Hobbelt	f72d4b1fe7	NEON arch: dead ref cycle fix When neon_available_ is ON, the DotProduct was set to point to DotProduct, which should have been DotProductNative, as dotProduct is the target global itself: see simddetect.h --> effectively making that part of the SetDotProduct() call identical to this (no-op) statement: `DotProduct = DotProduct;` Also added the Neon check in the Update() API, where it exists together with the other checks (for AVX/SSE/etc.) [sw: formatted commit message and merged into main branch] Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-20 20:40:16 +02:00
Stefan Weil	dff7312aed	Modernize code in SIMDDetect::Update Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-20 20:16:49 +02:00
Stefan Weil	3ab8dcbf72	Use Apple Accelerate framework for training and best models Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-20 19:27:54 +02:00
Johannes Künsebeck	3be11f12a9	Removed unused parameters declarations and definitions	2021-07-20 15:08:10 +02:00
zdenop	8dd7936475	Solve clang reporting unused variable in ExtractMicros function (#3501) * mark attribute as unused for compiler * try c++17 standard https://en.cppreference.com/w/cpp/language/attributes/maybe_unused	2021-07-18 01:59:49 +02:00
SpaceIm	b2fea77a27	fix cross-build to iOS/tvOS/watchOS On these OS, executables are bundle, and they need a BUNDLE DESTINATION, otherwise CMake configuration fails. See https://cmake.org/cmake/help/latest/policy/CMP0006.html	2021-07-17 09:29:34 +02:00
nagadomi	7fe0624838	Fix spec string of convolution layer (#3499)	2021-07-16 18:21:52 +03:00
Stefan Weil	88d4028a5a	Enable pragma for SIMD also when _OPENMP is defined Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-15 16:03:43 +02:00
Stefan Weil	f0fb6809e3	Use SIMD instructions for DotProductNative Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-14 19:13:01 +02:00
Tadahito Yao	12e0fb4e01	Fix deadlock in lstmtraing. (#3488)	2021-07-10 10:59:10 +03:00
Stefan Weil	767fb5a177	Fix LSTMTrainerTest.BidiTest Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-04 18:41:19 +02:00
Stefan Weil	915c29e3c8	Fix IntSimdMatrixTest.AVX2 Fixes: `872816897a` Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-04 09:07:35 +02:00
Stefan Weil	e0af8d12e6	Fix check for NEON on 32 bit ARM Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-03 15:43:10 +02:00
Stefan Weil	158c845228	Catch another FP division by 0 (fixes issue #3483) Rewriting the code avoids FP operations (so makes it potentially faster) and fixes the division by 0. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-03 15:37:24 +02:00
Stefan Weil	4333b2cea3	Use CMAKE_SYSTEM_PROCESSOR to check for SIMD support options (#3484) Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-02 16:23:58 +03:00
Stefan Weil	4b630a8813	Catch FP division by 0 (fixes issue #3483) Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-07-02 15:04:31 +02:00
Łukasz Nocuń	38f0fdcd88	Fix CMake Linux build (#3478)	2021-06-30 00:16:51 +03:00
OgreTransporter	4d0f027f58	Bugfix OpenMP with Visual Studio (#3475) * Bugfix OpenMP with Visual Studio * Test for working VS2019 update instead of first version of VS2019	2021-06-29 21:06:38 +03:00
Stefan Weil	a701454ae5	Fix vector resize with init for all elements (issue #3473) (#3474) Fixes: `c8b8d266d6` Fixes: `9710bc0465` Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-06-29 21:05:29 +03:00
nagadomi	ff1062d39d	Add --reset_learning_rate option to lstmtraining (#3470) When the --reset_learning_rate option is specified, it resets the learning rate stored in each layer of the network loaded with --continue_from to the value specified by the --learning_rate option. If checkpoint is available, it does nothing.	2021-06-28 11:48:07 +03:00
nagadomi	d8bd78f8e2	Fix missing reset of best_error_history_ in LSTMTrainer::InitIterations() (#3469)	2021-06-27 09:26:32 +03:00
Stefan Weil	29e842df46	CI: Replace g++-8 by g++-11 for MacOS g++-8 is no longer installed, therefore CI fails for that compiler. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-06-26 14:56:07 +02:00
nagadomi	b2fa77f8f0	Show layer specified learning rates with combine_tessdata -l (#3468)	2021-06-26 08:08:54 +03:00
Łukasz Nocuń	c583ecef29	Fix permanently disabled optimizations in CMake (#3467)	2021-06-24 21:16:23 +03:00
MonkeybreadSoftware	75e6c3ea4c	Null check for GetSourceYResolution (#3457) * Null check for GetSourceYResolution Added missing NULL check to avoid crash when we read property in our tesseract wrapper. * Added missing return value. added -1 to return if undefined.	2021-06-16 16:35:24 +03:00
Egor Pugin	7a308edcb1	Merge pull request #3439 from amitdo/remove-var Remove unused variable	2021-05-21 23:07:26 +03:00
Amit Dovev	bf979c801a	Remove unused variable	2021-05-21 20:34:09 +03:00
Egor Pugin	a72408fdef	Merge pull request #3438 from amitdo/pango Raise Minimum required Pango version to 1.38.0	2021-05-21 20:09:27 +03:00
Egor Pugin	ef69805298	Merge pull request #3437 from amitdo/sauvola ThresholdMethod::TiledSauvola -> ThresholdMethod::Sauvola	2021-05-21 20:08:59 +03:00
Amit Dovev	8615f65cc4	Raise Minimum required Pango version to 1.38.0	2021-05-21 19:56:37 +03:00
Amit Dovev	c24538518c	ThresholdMethod::TiledSauvola -> ThresholdMethod::Sauvola The fact that this method uses tiles is implementation detail. It does not change the result compared to Sauvola without tiles. The use of tiles minimize memory consumption.	2021-05-21 18:15:30 +03:00
Stefan Weil	93348a83a3	Remove scripts for training They were replaced by Python3 scripts (part of the tesstrain repository). Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-05-18 10:47:44 +02:00
Stefan Weil	5eb2e86635	Fix some typos (found by codespell) Signed-off-by: Stefan Weil <sw@weilnetz.de>	2021-05-17 15:18:43 +02:00
nagadomi	42e4b91132	Refactor ObjectCache::DeleteUnusedObjects with reverse iterator	2021-05-17 14:50:30 +02:00
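
The TFloat commits in this log (`472f5d9020`, `24a29b79e5`) describe a data type that is double by default and float if FAST_FLOAT is defined, and note that 8 floats or 4 doubles fit in one 256-bit vector. A minimal sketch of that idea follows. It is not the code from those commits: `DotProductSketch` and `kLanesPer256Bit` are hypothetical names used only for illustration, and the lane count assumes the usual 4-byte float and 8-byte double.

```cpp
#include <cstddef>

// As described in the log: TFloat is double by default,
// but float if FAST_FLOAT is defined at build time.
#ifdef FAST_FLOAT
using TFloat = float;
#else
using TFloat = double;
#endif

// Hypothetical helper (not Tesseract's DotProductNative):
// a plain scalar dot product written against TFloat, so the same
// source compiles for either precision.
inline TFloat DotProductSketch(const TFloat *u, const TFloat *v, std::size_t n) {
  TFloat total = 0;
  for (std::size_t k = 0; k < n; ++k) {
    total += u[k] * v[k];
  }
  return total;
}

// Hypothetical constant: how many TFloat values fit in one 256-bit
// register. With 4-byte float this is 8, with 8-byte double it is 4.
constexpr std::size_t kLanesPer256Bit = 256 / (8 * sizeof(TFloat));
```

Writing the arithmetic against a single alias keeps the precision choice in one place. SIMD code still has to pick vector types and lane counts that match the alias, which is the kind of mismatch the AVX2 and FMA bugfix commits above address.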

6065 Commits