This fixes a compiler warning:
globaloc.cpp:33:6: warning: no previous extern declaration for
non-static variable 'global_crash_pixes'
[-Wmissing-variable-declarations]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes some compiler warnings:
mainblk.cpp:28:9: warning: macro is not used [-Wunused-macros]
mainblk.cpp:29:9: warning: macro is not used [-Wunused-macros]
[...]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* 'master' of https://github.com/tesseract-ocr/tesseract:
Remove code for _MSC_VER < 1900
keep API compatibility with #1265
Update googletest submodule to release v1.8.1
Update test submodule
Always use isascii() with isspace()
Avoid crash with --psm 0 and LSTM traineddata
SVPaint: Remove empty block
Classify: Don't hide debug parameter
UNICHARMAP: Remove comparison which is always false
svpaint: Change a variable from global to local
pgedit: remove unused declaration of display_bln_lines
Plumbing: Remove comparison which is always false
Release candidate 2
use pdf L_FLATE_ENCODE only for png input; fixes#1961
isspace() must only used with an unsigned char or EOF argument,
and even then its result can depend on the current locale settings.
While this is not a problem for C/C++ executables which use the default
"C" locale, it becomes a problem when the Tesseract API is called from
languages like Python or Java which don't use the "C" locale.
By calling isasci() before calling isspace() this uncertainty can be
avoided, because any locale will hopefully give identical results for
the basic ASCII character set.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* 'master' of https://github.com/tesseract-ocr/tesseract: (27 commits)
Rework check for readable input file
fix "mktemp -d --tmpdir" on Mac OS; see #1453
pgedit: Change some variables from global to local ones
improve description of min_characters_to_try variable
WERD_RES: Remove comparisons which are constant
GENERIC_2D_ARRAY: Pass parameters by reference
genericvector: Pass parameters by reference
chop: Use more efficient float calculations for sqrt
rect: Use more efficient float calculations for ceil, floor
intproto: Use more efficient float calculations for floor
genericvector: Rewrite code to satisfy static code analyzer
Fix constructor for class Dict (uninitialized member variables)
Fix use of wrong UNICHARSET
lstmtraining: Remove dead code for purified model name
combine_tessdata: Handle failures when extracting
lstmtraining: Check write permission for output model
implement parameter min_characters_to_try for minimum characters to try to skip page entirely. fixes#1729
Merge and enhance documentation on language and script models
Document some more config options for tesseract
Add Makefile rule to build HTML manpages
...
This fixes warnings like the following one from LGTM:
This parameter of type ParamsTrainingHypothesis is 112 bytes
- consider passing a pointer/reference instead.
Most parameters can also get the const attribute.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Warning from LGTM:
Resource data_ is acquired by class GenericVector<FontSpacingInfo *>
but not released in the destructor.
LGTM complains about data_ not being deleted in the destructor.
The destructor calls the clear() method, but the delete there
was conditional which confuses the static code analyzer.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* 'master' of https://github.com/tesseract-ocr/tesseract:
Fix CID 1164579 (Explicit null dereferenced)
print help for tesstrain.sh; fixes#1469
Fix CID 1395882 (Uninitialized scalar variable)
Fix comments
Move content of ipoints.h to points.h and remove ipoints.h
remove duplicate help from combine_lang_model
Fix typo.
use tprintf instead of printf to be able disable messages by quiet option (issue #1240)
add "sudo ldconfig" to install instruction. fixes#1212
unittest: Replace NULL by nullptr
unittest: Format code
tesseract app: check if input file exists; fixes#1023
Format code (replace ( xxx ) by (xxx))
Simplify boolean expressions
Win32: use the ISO C and C++ conformant name "_putenv" instead of deprecated "putenv"
The report from Coverity Scan is a false positive.
Nevertheless the code can be rewritten and optimized
a little bit to fix that report.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The error message "segmentation fault" confuses most users,
so enforce a segmentation fault only in debug code.
Release code simply calls the abort function.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Wrong or old parameters in traineddata files should not terminate
the program, so make that a warning instead of a fatal error.
This fixes issue #1520.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccutil/indexmapbidi.h:102:7: warning:
'IndexMapBiDi' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccutil/indexmapbidi.h:102:7: warning:
'IndexMapBiDi' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccutil/ccutil.h:51:7: warning:
'CCUtil' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Either it was not needed, or it could be replaced by checking
for not _WIN32.
This fixes a compiler warning from clang:
src/ccutil/platform.h:41:9: warning:
macro name is a reserved identifier [-Wreserved-id-macro]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It is only used in textord/topitch.cpp, so move it into that file.
Remove also the inline attribute as it has not effect here and
update the type casts to fix some compiler warnings from clang.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
One of the checks was too restrictive, as lstmeval deserializes
char arrays with 14000000 elements, so raise the limit to 30000000.
That check was added in commit 992031e824.
Add also assertions which help finding such problems in debug mode.
Signed-off-by: Stefan Weil <stweil@ub-backup.bib.uni-mannheim.de>
Commit 4d514d5a60 introduced tprintf_internal
with an additional argument "level" which was removed again in commit
7dc5296fe9.
So we can now restore the original state without tprintf_internal.
Remove also the declaration of debug_window_on (it does not exist since
commit 030aae9896) and make the
configuration parameter debug_file local as it is only used by tprintf.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Wrong file data could give a large value for the number of vector elements
resulting in very large memory allocations.
Limit the allowed data range to UINT16_MAX (65535) elements
which hopefully should be sufficient for all use cases.
Changing the data type of the related member variables from int to
uint32_t allowed removing several type casts.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Add missing include statements, add missing "static" qualifiers or
remove functions which are not used at all.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
On most systems float is the IEEE 754 single-precision binary
floating-point format (32 bits). Tesseract does not support other systems.
Signed-off-by: Stefan Weil <sw@weilnetz.de>