Fix 306 warnings from MS C:
tesseract\ccutil\unicharset.h(242): warning C4267:
'argument': conversion from 'size_t' to 'int', possible loss of data
The change also avoids some type conversions.
The related code in training/util.h now uses the GOOGLE_TESSERACT macro
to enable the Google-specific code that disables heap checking.
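To illustrate the class of warning, here is a minimal sketch. The function names are hypothetical and not taken from unicharset.h. It shows how keeping size_t end to end avoids the narrowing conversion behind C4267, and how an unavoidable narrowing can be made explicit:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical example, not the actual Tesseract code.
// Passing names.size() (a size_t) to an int parameter triggers C4267 on
// 64-bit MSVC builds. Keeping the index as size_t avoids the conversion.
inline bool IdInRange(std::size_t id, const std::vector<std::string>& names) {
  return id < names.size();
}

// Where an existing interface requires int, make the narrowing explicit.
inline int SizeAsInt(const std::vector<std::string>& names) {
  return static_cast<int>(names.size());
}
```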
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Remove old code for string class (no longer needed)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Add std namespace to string class
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Replace log2(n) by faster local function
This also adds support for environments without a log2 function (Android).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Provide local log2 function on platforms without log2 function
The existing implementation in wordrec/language_model.cpp is modified
to use a local inline function in the tesseract namespace and copied
to lstm/weightmatrix.cpp, too.
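A minimal sketch of such a local function, assuming it is built on the natural logarithm. The name local_log2 and the namespace are hypothetical, and the code in the tree may differ:

```cpp
#include <cmath>

namespace sketch {
// Portable replacement for log2() on platforms whose C library lacks it.
// Uses the identity log2(n) = ln(n) / ln(2).
inline double local_log2(double n) {
  return std::log(n) / std::log(2.0);
}
}  // namespace sketch
```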
Signed-off-by: Stefan Weil <sw@weilnetz.de>
If equ_detect_ can be NULL, we must catch that case and show a warning
instead of crashing in method SetEquationDetect.
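The guard can be sketched as follows. The types below are simplified, hypothetical stand-ins and not the real Tesseract classes, and the warning text is an assumption:

```cpp
#include <cstdio>

// Hypothetical stand-in for the equation detector.
struct EquationDetectorSketch {
  bool attached = false;
};

// Hypothetical stand-in for the class that owns equ_detect_.
class EngineSketch {
 public:
  explicit EngineSketch(EquationDetectorSketch* detector)
      : equ_detect_(detector) {}

  // Returns false and prints a warning instead of dereferencing NULL.
  bool SetEquationDetect() {
    if (equ_detect_ == nullptr) {
      std::fprintf(stderr, "Warning: equation detection is not available\n");
      return false;
    }
    equ_detect_->attached = true;
    return true;
  }

 private:
  EquationDetectorSketch* equ_detect_;
};
```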
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It looks like those files were added accidentally
in commit fc6a390c6c.
Add them to .gitignore to avoid that from now on.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Restore support for the legacy engine
It is still needed to get text attributes which are unsupported by the
LSTM engine, and it also has better recognition rates for some texts.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* tesseractmain: Add missing 'static' attributes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Support different help texts for normal and advanced users
The old option --help now shows a very basic help text.
The new option --help-extra shows the full help information.
It now also includes a hint that Tesseract supports lists of images.
Also fix the indentation in the PSM help and
use a more neutral text in the OEM help.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Add missing line feed in error message
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Dereference pointer after NULL check (CID 1385638)
Move the statement which dereferences the pointer variable "current"
after the NULL check.
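The pattern behind this class of Coverity report can be sketched like this. Node and ValueOrDefault are hypothetical names used only for illustration:

```cpp
// Hypothetical node type for illustration only.
struct Node {
  int value;
  Node* next;
};

// Problematic order flagged by static analysis:
//   int v = current->value;       // dereference first
//   if (current == nullptr) ...   // check comes too late
//
// Fixed order: check first, dereference afterwards.
inline int ValueOrDefault(const Node* current, int fallback) {
  if (current == nullptr) {
    return fallback;
  }
  int v = current->value;  // safe: current is known to be non-null here
  return v;
}
```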
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Dereference pointer after NULL check (CID 1385635)
Move the statement which dereferences the pointer variable "current"
after the NULL check.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Dereference pointer after NULL check (CID 1385634)
Move the statement which dereferences the pointer variable "current"
after the NULL check.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Fix CID 1164527 'Constant' variable guards dead code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Since commit cdc35338c5 Tesseract checks
the value passed for `--oem NUM`.
That only works as expected when the old (now unused) engine mode values
for cube are removed.
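A sketch of why stale enum values defeat a range check, assuming the check compares against a trailing count value. The enum below is a simplified, hypothetical stand-in, and the real check may be implemented differently:

```cpp
// Simplified, hypothetical stand-in for an engine mode enum.
enum EngineModeSketch {
  kLegacyOnly = 0,
  kLstmOnly,
  kLegacyLstmCombined,
  kDefault,
  // Unused entries for a removed engine would sit here and inflate kCount,
  // so the range check below would wrongly accept their numeric values.
  kCount
};

// Accepts only values that map to a mode which still exists.
inline bool IsValidEngineMode(int value) {
  return value >= 0 && value < kCount;
}
```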
Signed-off-by: Stefan Weil <sw@weilnetz.de>
ccstruct/seam.cpp:66:26: warning:
array subscript has type 'char' [-Wchar-subscripts]
Fix it by using an unsigned index and use the same type for related values.
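A minimal sketch of the idea. The names are hypothetical and not taken from ccstruct/seam.cpp, and the fixed-width unsigned type is an assumption:

```cpp
#include <cstdint>

// Plain 'char' may be signed, so an index above 127 could turn into a
// negative subscript. An unsigned 8-bit index cannot go negative, and
// using the same type for the bound avoids mixed-sign comparisons.
inline int SumFirst(const int* values, std::uint8_t count) {
  int sum = 0;
  for (std::uint8_t i = 0; i < count; ++i) {
    sum += values[i];
  }
  return sum;
}
```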
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Fix build errors with Visual Studio 2015 RTM:
Error C2039: 'back_inserter': is not a member of 'std'
Error C3861: 'back_inserter': identifier not found
Visual Studio 2015 (vc14) needs the <iterator> header:
#include <iterator>
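For reference, a small self-contained use of std::back_inserter with the header included explicitly. CopyAll is a hypothetical helper and not code from the tree:

```cpp
#include <algorithm>
#include <iterator>  // declares std::back_inserter
#include <vector>

// Appends every element of src to a new vector through a back_inserter.
inline std::vector<int> CopyAll(const std::vector<int>& src) {
  std::vector<int> dst;
  std::copy(src.begin(), src.end(), std::back_inserter(dst));
  return dst;
}
```

Other standard library implementations may pull in <iterator> indirectly through other headers, which is why the missing include can go unnoticed until a stricter toolchain is used.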
ftell returns a long value which can be negative when an error occurs.
It returns LONG_MAX for directories.
Both cases were not handled by the old code.
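The two cases can be handled roughly like this. FileSizeOrError is a hypothetical helper, and the return value -1 for "no usable size" is an assumption of this sketch:

```cpp
#include <climits>
#include <cstdio>

// Returns the size of the open file in bytes, or -1 if it cannot be
// determined (seek failure, ftell error, or the LONG_MAX result that
// this message describes for directories).
inline long FileSizeOrError(std::FILE* fp) {
  if (fp == nullptr) return -1;
  if (std::fseek(fp, 0, SEEK_END) != 0) return -1;
  const long size = std::ftell(fp);
  std::rewind(fp);  // restore the read position for the caller
  if (size < 0 || size == LONG_MAX) return -1;
  return size;
}
```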
Signed-off-by: Stefan Weil <sw@weilnetz.de>
If unicharset_extractor was called without any argument,
a help message was printed by tesseract::ParseCommandLineFlags.
Replace it with the local help message, which is better.
Signed-off-by: Stefan Weil <sw@weilnetz.de>