* Add Abseil sources to build process.
* Add copyright comment.
* InitConfigOnlyTest no longer tests
hin.traineddata because it is LSTM only.
* Fix std::string.
* Deactivate tests with missing test data.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* 'master' of https://github.com/tesseract-ocr/tesseract:
Remove code for _MSC_VER < 1900
keep API compatibility with #1265
Update googletest submodule to release v1.8.1
Update test submodule
Always use isascii() with isspace()
Avoid crash with --psm 0 and LSTM traineddata
SVPaint: Remove empty block
Classify: Don't hide debug parameter
UNICHARMAP: Remove comparison which is always false
svpaint: Change a variable from global to local
pgedit: remove unused declaration of display_bln_lines
Plumbing: Remove comparison which is always false
Release candidate 2
use pdf L_FLATE_ENCODE only for png input; fixes #1961
isspace() must only be used with an unsigned char or EOF argument,
and even then its result can depend on the current locale settings.
While this is not a problem for C/C++ executables which use the default
"C" locale, it becomes a problem when the Tesseract API is called from
languages like Python or Java which don't use the "C" locale.
By calling isascii() before calling isspace() this uncertainty can be
avoided, because any locale will hopefully give identical results for
the basic ASCII character set.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
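
The isascii()/isspace() guard described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper names IsAsciiSpace and CountLeadingAsciiSpaces are hypothetical, and the explicit range check (< 0x80) stands in for isascii(), which is a POSIX extension rather than standard C++.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper: returns true only for whitespace in the ASCII range.
// The range check plays the role of isascii(), so bytes >= 0x80 never reach
// isspace(), where the result could depend on the current locale.
inline bool IsAsciiSpace(char c) {
  // Convert to unsigned char first: passing a negative char value to
  // isspace() is undefined behavior.
  const unsigned char uc = static_cast<unsigned char>(c);
  return uc < 0x80 && std::isspace(uc) != 0;
}

// Hypothetical usage: count leading whitespace in a NUL-terminated string.
inline std::size_t CountLeadingAsciiSpaces(const char* s) {
  assert(s != nullptr);
  std::size_t n = 0;
  while (s[n] != '\0' && IsAsciiSpace(s[n])) ++n;
  return n;
}
```

The cast and the range check are kept together in one helper so that callers cannot forget either step.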
This fixes a warning from LGTM:
Poor global variable name 'rgb'. Prefer longer, descriptive
names for globals (eg. kMyGlobalConstant, not foo).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes a warning from LGTM:
This parameter of type ScrollView is 144 bytes
- consider passing a pointer/reference instead.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
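
The effect of that warning can be illustrated with a stand-in type. The sketch below does not use the real ScrollView class: LargeObject is a hypothetical 144-byte struct with a copy counter, chosen only to show that passing by value copies the object while passing by const reference does not.

```cpp
// Hypothetical stand-in for a large parameter type (144 bytes of payload).
struct LargeObject {
  static int copies;  // number of copy constructions so far
  char payload[144] = {};

  LargeObject() = default;
  LargeObject(const LargeObject& other) {
    for (int i = 0; i < 144; ++i) payload[i] = other.payload[i];
    ++copies;
  }
};
int LargeObject::copies = 0;

static_assert(sizeof(LargeObject) == 144, "stand-in should be 144 bytes");

// Pass by value: the caller's object is copied for every call.
inline int FirstByteByValue(LargeObject obj) { return obj.payload[0]; }

// Pass by const reference: no copy, same read-only access.
inline int FirstByteByRef(const LargeObject& obj) { return obj.payload[0]; }
```

Calling FirstByteByValue on an existing object increments LargeObject::copies, while FirstByteByRef leaves it unchanged.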
* 'master' of https://github.com/tesseract-ocr/tesseract: (27 commits)
Rework check for readable input file
fix "mktemp -d --tmpdir" on Mac OS; see #1453
pgedit: Change some variables from global to local ones
improve description of min_characters_to_try variable
WERD_RES: Remove comparisons which are constant
GENERIC_2D_ARRAY: Pass parameters by reference
genericvector: Pass parameters by reference
chop: Use more efficient float calculations for sqrt
rect: Use more efficient float calculations for ceil, floor
intproto: Use more efficient float calculations for floor
genericvector: Rewrite code to satisfy static code analyzer
Fix constructor for class Dict (uninitialized member variables)
Fix use of wrong UNICHARSET
lstmtraining: Remove dead code for purified model name
combine_tessdata: Handle failures when extracting
lstmtraining: Check write permission for output model
implement parameter min_characters_to_try for minimum characters to try to skip page entirely. fixes #1729
Merge and enhance documentation on language and script models
Document some more config options for tesseract
Add Makefile rule to build HTML manpages
...
This fixes compiler warnings and a warning from LGTM:
Poor global variable name 'pe'. Prefer longer, descriptive names [...]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes warnings from LGTM:
Comparison is always false because id >= 0.
Comparison is always true because mirrored >= 1.
Comparison is always false because id >= 0.
INVALID_UNICHAR_ID is -1, so the warnings are correct.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
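
Why such a comparison is constant can be shown with a small sketch. The functions below are hypothetical and are not taken from the Tesseract sources; the only fact assumed from the message above is that INVALID_UNICHAR_ID is -1, mirrored here as kInvalidUnicharId.

```cpp
#include <cassert>

// Assumed to mirror INVALID_UNICHAR_ID (-1) as stated in the commit message.
constexpr int kInvalidUnicharId = -1;

// Before: inside the "id >= 0" branch, id can never equal -1, so the inner
// comparison is always false. This is the pattern the analyzer reports.
inline bool IsValidIdRedundant(int id, int size) {
  assert(size >= 0);
  if (id >= 0) {
    if (id == kInvalidUnicharId) return false;  // unreachable: id >= 0 here
    return id < size;
  }
  return false;
}

// After: the redundant comparison is removed; behavior is unchanged.
inline bool IsValidId(int id, int size) {
  assert(size >= 0);
  return id >= 0 && id < size;
}
```

Both versions return the same result for every input, which is why removing the comparison is a pure cleanup.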