* api: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* ccmain: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* ccstruct: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* classify: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* cutil: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* dict: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* textord: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* training: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* wordrec: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* ccutil: Replace Tesseract data types by POSIX data types
Now all Tesseract data types which are no longer needed can be removed
from ccutil/host.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* ccmain: Replace Tesseract's MIN_*INT, MAX_*INT* by POSIX *INT*_MIN, *INT*_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* ccstruct: Replace Tesseract's MIN_*INT, MAX_*INT* by POSIX *INT*_MIN, *INT*_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* classify: Replace Tesseract's MIN_*INT, MAX_*INT* by POSIX *INT*_MIN, *INT*_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* dict: Replace Tesseract's MIN_*INT, MAX_*INT* by POSIX *INT*_MIN, *INT*_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* lstm: Replace Tesseract's MIN_*INT, MAX_*INT* by POSIX *INT*_MIN, *INT*_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* textord: Replace Tesseract's MIN_*INT, MAX_*INT* by POSIX *INT*_MIN, *INT*_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* wordrec: Replace Tesseract's MIN_*INT, MAX_*INT* by POSIX *INT*_MIN, *INT*_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* ccutil: Replace Tesseract's MIN_*INT, MAX_*INT* by POSIX *INT*_MIN, *INT*_MAX
Remove the macros which are now unused from ccutil/host.h.
Remove also the obsolete history comments.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Fix build error caused by ambiguous ClipToRange
Error message vom Appveyor CI:
C:\projects\tesseract\ccstruct\coutln.cpp(818): error C2672: 'ClipToRange': no matching overloaded function found [C:\projects\tesseract\build\libtesseract.vcxproj]
C:\projects\tesseract\ccstruct\coutln.cpp(818): error C2782: 'T ClipToRange(const T &,const T &,const T &)': template parameter 'T' is ambiguous [C:\projects\tesseract\build\libtesseract.vcxproj]
c:\projects\tesseract\ccutil\helpers.h(122): note: see declaration of 'ClipToRange'
C:\projects\tesseract\ccstruct\coutln.cpp(818): note: could be 'char'
C:\projects\tesseract\ccstruct\coutln.cpp(818): note: or 'int'
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* unittest: Replace Tesseract's MAX_INT8 by POSIX INT8_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* arch: Replace Tesseract's MAX_INT8 by POSIX INT8_MAX
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Replace the Tesseract specific data types in header files which are
part of Debian package libtesseract-dev by POSIX data types.
Update also matching cpp files.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The related code in training/util.h now uses the GOOGLE_TESSERACT macro
to enable Google specific code to disable heap checking.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Remove old code for string class (no longer needed)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Add std namespace to string class
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Replace log2(n) by faster local function
This also adds support for environments without a log2 function (Android).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Provide local log2 function on platforms without log2 function
The existing implementation in wordrec/language_model.cpp is modified
to use a local inline function in the tesseract namespace and copied
to lstm/weightmatrix.cpp, too.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The new test in LSTMTrainer::UpdateErrorGraph fixes an assertion
(see issues #644, #792).
The new test in LSTMTrainer::ReadTrainingDump was added to improve
the robustness of the code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The constant value kNumThreads is not only used to configure the number
of threads but also to allocate vectors used in those threads.
There is only a single thread without OpenMP, so it is sufficient to
allocate vectors with only one element in that case.
Replace also the upper limit in the for loops by the known vector size.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The code in ccutil/hashfn.h was needed for some old compilers. Now that we support MSVC >= 2010 and compilers that has good support for C++11, we can drop this code.
As a result of this file removal, we now use:
std::unordered_map
std::unordered_set
std::unique_ptr
directly in the codebase with '#include' for the needed headers.
Modify also the code to use a singleton. This simplifies the code as
no locking is needed. It also slightly improves the performance because
no check whether the architecture was tested is needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Fix compiler warning (see below)
* Use Linux code for Mingw-w64, too
* Simplify conditional code by using X86_BUILD instead of NONX86_BUILD
* Remove unneeded call of __get_cpuid_max (already called by __get_cpuid)
* Remove unneeded #undef statement
gcc report:
lstm/weightmatrix.cpp: In static member function
'static double tesseract::WeightMatrix::DotProduct(const double*, const double*, int)':
weightmatrix.cpp:67:29: warning:
'ecx' may be used uninitialized in this function [-Wmaybe-uninitialized]
avx_available_ = (ecx & 0x10000000) != 0;
^
lstm/weightmatrix.cpp:64:30: note: 'ecx' was declared here
unsigned int eax, ebx, ecx, edx;
^
Signed-off-by: Stefan Weil <sw@weilnetz.de>
cpuid.h is only available for x86 builds. There are lots of non x86
architectures, so simply checking for PowerPC is not enough.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Builds without support for OpenMP failed with the old code. Fix this:
* Add OPENMP_CXXFLAGS for ccmain.
* Replace unconditional -fopenmp by OPENMP_CXXFLAGS for lstm.
* Always use _OPENMP for conditional compilation.
* Remove OPENMP as there is already _OPENMP.
* Include omp.h conditionally.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Coverity report:
CID 1366441 (#1 of 1): Division or modulo by float zero (DIVIDE_BY_ZERO)
5. divide_by_zero: In expression
static_cast<double>(char_errors) / truth_size, division by expression
truth_size which may be zero has undefined behavior.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Coverity report:
CID 1366443 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
3. var_deref_model: Passing null pointer this->sub_trainer_ to
training_iteration, which dereferences it.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Coverity report:
CID 1366448 (#1 of 1): Big parameter passed by value (PASS_BY_VALUE)
pass_by_value: Passing parameter recoder of type
tesseract::UnicharCompress const (size 240 bytes) by value.
Signed-off-by: Stefan Weil <sw@weilnetz.de>