This fixes compiler warnings from clang:
src/textord/equationdetectbase.h:32:7: warning:
'EquationDetectBase' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
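
One common way to silence this warning is to define at least one virtual
member function, for example the destructor, outside of the class definition.
The sketch below uses hypothetical classes (Shape and Circle are not taken
from the Tesseract sources) and keeps everything in one file for brevity; in
a real project the out-of-line definitions would live in exactly one .cpp file.

```cpp
#include <string>

// Hypothetical base class, used only for illustration.
class Shape {
 public:
  // Declared here, defined out-of-line below. The vtable is then emitted
  // only in the translation unit which contains that definition.
  virtual ~Shape();
  virtual std::string Name() const { return "shape"; }
};

// Out-of-line definition (would be placed in a single .cpp file).
Shape::~Shape() = default;

// Hypothetical derived class with the same pattern.
class Circle : public Shape {
 public:
  ~Circle() override;
  std::string Name() const override { return "circle"; }
};

Circle::~Circle() = default;
```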
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/textord/blobgrid.h:33:7: warning:
'BlobGrid' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/textord/bbgrid.h:53:7: warning:
'GridBase' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/textord/alignedblob.h:81:7: warning:
'AlignedBlob' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/lstm/weightmatrix.h:33:7: warning:
'TransposedArray' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccutil/indexmapbidi.h:102:7: warning:
'IndexMapBiDi' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/training/icuerrorcode.h:44:7: warning:
'IcuErrorCode' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/training/validator.h:72:7: warning:
'Validator' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/dict/dawg.h:119:7: warning:
'Dawg' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/cutil/cutil_class.h:27:7: warning:
'CUtil' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccutil/ccutil.h:51:7: warning:
'CCUtil' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccstruct/matrix.h:575:7: warning:
'MATRIX' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccstruct/ccstruct.h:25:7: warning:
'CCStruct' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/viewer/scrollview.h:86:7: warning:
'SVEventHandler' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccmain/mutableiterator.h:44:7: warning:
'MutableIterator' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes compiler warnings from clang:
src/ccmain/ltrresultiterator.h:48:16: warning:
'LTRResultIterator' has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Either it was not needed, or it could be replaced by checking
for not _WIN32.
This fixes a compiler warning from clang:
src/ccutil/platform.h:41:9: warning:
macro name is a reserved identifier [-Wreserved-id-macro]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Compiler warning from clang:
src/api/pdfrenderer.cpp:848:28: warning:
cast from 'const char *' to 'char *' drops const qualifier [-Wcast-qual]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
size_t would require a different format string. Here an unsigned int
is sufficient in both cases, so use that.
This error was found by lgtm, see
https://lgtm.com/projects/g/tesseract-ocr/tesseract/.
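
A minimal sketch of the difference between the two format strings. The helper
names FormatCount and FormatSize are hypothetical and only illustrate which
conversion specifier matches which type.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a count which is known to fit into an unsigned int.
std::string FormatCount(unsigned int count) {
  char buffer[32];
  std::snprintf(buffer, sizeof(buffer), "%u", count);
  return buffer;
}

// Formats a size_t, which needs the z length modifier.
std::string FormatSize(std::size_t size) {
  char buffer[32];
  std::snprintf(buffer, sizeof(buffer), "%zu", size);
  return buffer;
}
```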
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Compiler warnings from clang:
src/textord/makerow.cpp:2579:36: warning:
cast from 'const void *' to 'BLOBNBOX **' drops const qualifier [-Wcast-qual]
src/textord/makerow.cpp:2581:36: warning:
cast from 'const void *' to 'BLOBNBOX **' drops const qualifier [-Wcast-qual]
src/textord/makerow.cpp:2601:31: warning:
cast from 'const void *' to 'TO_ROW **' drops const qualifier [-Wcast-qual]
src/textord/makerow.cpp:2603:31: warning:
cast from 'const void *' to 'TO_ROW **' drops const qualifier [-Wcast-qual]
src/textord/makerow.cpp:2623:31: warning:
cast from 'const void *' to 'TO_ROW **' drops const qualifier [-Wcast-qual]
src/textord/makerow.cpp:2625:31: warning:
cast from 'const void *' to 'TO_ROW **' drops const qualifier [-Wcast-qual]
Warning from lgtm:
Local variable 'blob' hides a parameter of the same name.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Compiler warnings from clang:
src/ccstruct/werd.cpp:128:4: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/werd.cpp:394:18: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/werd.cpp:394:27: warning:
cast from 'const void *' to 'WERD **' drops const qualifier [-Wcast-qual]
src/ccstruct/werd.cpp:395:18: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/werd.cpp:395:27: warning:
cast from 'const void *' to 'WERD **' drops const qualifier [-Wcast-qual]
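
Both kinds of warnings typically come from qsort comparators. A sketch of a
comparator which avoids them, assuming a hypothetical element type Item in
place of classes like WERD:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical element type, used only for illustration.
struct Item {
  int left;
};

// qsort comparator for an array of Item pointers. The casts keep the const
// qualifier and use static_cast instead of an old-style cast.
static int CompareItems(const void* arg1, const void* arg2) {
  const Item* a = *static_cast<const Item* const*>(arg1);
  const Item* b = *static_cast<const Item* const*>(arg2);
  return (a->left > b->left) - (a->left < b->left);
}

// Sorts an array of Item pointers by their left coordinate.
void SortByLeft(Item** items, std::size_t count) {
  std::qsort(items, count, sizeof(*items), CompareItems);
}
```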
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Compiler warnings from clang:
src/ccstruct/polyblk.cpp:194:16: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/polyblk.cpp:195:16: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/polyblk.cpp:292:45: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/polyblk.cpp:30:9: warning:
macro is not used [-Wunused-macros]
src/ccstruct/polyblk.cpp:348:8: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/polyblk.cpp:358:12: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/polyblk.cpp:362:26: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/polyblk.cpp:383:21: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/polyblk.cpp:383:36: warning:
cast from 'const void *' to 'ICOORDELT **' drops const qualifier [-Wcast-qual]
src/ccstruct/polyblk.cpp:384:21: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/polyblk.cpp:384:36: warning:
cast from 'const void *' to 'ICOORDELT **' drops const qualifier [-Wcast-qual]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Compiler warnings from clang:
src/ccstruct/ocrblock.cpp:74:12: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/ocrblock.cpp:74:21: warning:
cast from 'const void *' to 'ROW **' drops const qualifier [-Wcast-qual]
src/ccstruct/ocrblock.cpp:75:16: warning:
cast from 'const void *' to 'ROW **' drops const qualifier [-Wcast-qual]
src/ccstruct/ocrblock.cpp:75:7: warning:
use of old-style cast [-Wold-style-cast]
Make also the function decreasing_top_order a local function as it is
only used locally and remove its global declarations (2 locations).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Compiler warnings from clang:
src/ccstruct/mod128.cpp:57:15: warning:
no previous extern declaration for non-static variable 'dirtab' [-Wmissing-variable-declarations]
src/ccstruct/mod128.cpp:57:24: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/mod128.cpp:57:35: warning:
cast from 'const short *' to 'ICOORD *' drops const qualifier [-Wcast-qual]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Compiler warnings from clang:
src/ccstruct/genblob.cpp:34:20: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/genblob.cpp:34:32: warning:
cast from 'const void *' to 'C_BLOB **' drops const qualifier [-Wcast-qual]
src/ccstruct/genblob.cpp:35:20: warning:
use of old-style cast [-Wold-style-cast]
src/ccstruct/genblob.cpp:35:32: warning:
cast from 'const void *' to 'C_BLOB **' drops const qualifier [-Wcast-qual]
The function c_blob_comparator is only used in fixspace.cpp,
so move it to that file, make it a local function, and remove
genblob.cpp and genblob.h which are no longer needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It is only used in textord/topitch.cpp, so move it into that file.
Also remove the inline attribute as it has no effect here and
update the type casts to fix some compiler warnings from clang.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
- add linefeed after last line
- remove blanks at line endings
This fixes some warnings from clang:
src/training/validate_javanese.h:63:51: warning:
no newline at end of file [-Wnewline-eof]
src/training/validate_javanese.cpp:269:26: warning:
no newline at end of file [-Wnewline-eof]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Instead of adding an empty TBOX at the end of the box list,
that corner case is now handled by passing a nullptr (like
it was already done for the first box in the list).
This avoids the calls of BoxMissMetric with a TBOX
which raises an assertion there (b == 0).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It looks like the check cblob_ptr != nullptr is not needed.
If cblob_ptr were NULL, we would have seen crashes in compute_bounding_box.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Let's hope that word->best_choice is never NULL.
Otherwise both the old and the new code would abort with SIGSEGV.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The parameter glyph_confidences is changed from bool to int.
An execution with value 1 outputs the hOCR file enriched with glyph confidences
for every timestep like before. An execution with value 2 outputs the timesteps
accumulated over the recognized characters.
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
Page segmentation mode "OSD only" requires osd.traineddata,
so use it automatically.
Report a warning if the user specified a different language.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
By default, that script creates two new temporary directories with random
names in /tmp.
The new command line flag --workspace_dir PATH uses the given path as
a base directory for all temporary files.
That makes training results more reproducible (no random directory
names in log files).
Signed-off-by: Stefan Weil <stweil@ub-backup.bib.uni-mannheim.de>
By using the parameter -c glyph_confidences=true the user is able to enrich
the hOCR output with additional information. Tesseract then lists additionally
the timesteps with all glyphs that were considered with their confidence
for every timestep of the LSTM.
The format of the hOCR output is slightly changed: There is now a linebreak
after every word for better readability by humans.
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
One of the checks was too restrictive, as lstmeval deserializes
char arrays with 14000000 elements, so raise the limit to 30000000.
That check was added in commit 992031e824.
Also add assertions which help to find such problems in debug mode.
Signed-off-by: Stefan Weil <stweil@ub-backup.bib.uni-mannheim.de>
It is needed for running the training tutorial on Linux.
The correct mode was lost when moving the files in
commit 104fe7931c.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The Serialize method is used indirectly by MasterTrainer::Serialize,
but there is no corresponding MasterTrainer::DeSerialize.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
OpenclDevice::getDeviceSelection crashed when outdated information
was read from file and device.score was not set.
Change also the struct definitions from C to C++ and
eliminate some type casts.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Commit 4d514d5a60 introduced tprintf_internal
with an additional argument "level" which was removed again in commit
7dc5296fe9.
So we can now restore the original state without tprintf_internal.
Remove also the declaration of debug_window_on (it has not existed since
commit 030aae9896) and make the
configuration parameter debug_file local as it is only used by tprintf.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
`int depth = strtol(*str + 1, str, 10);`
`**str` holds the words in the VGSL specification, and `*str` holds a single word, let's say `Fr64`. The `strtol` function assumes that `*str + 1` points to a number in valid integer format (it skips leading white space automatically, but no other characters) and updates `*str` to point to the first character that is not a digit. In reality, `*str + 1` points to the `r` in `Fr64`. This is a bad argument, which results in `strtol` returning 0.
`strtol(*str + 2, str, 10)` should be used instead.
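
The effect of the offset can be reproduced with a small sketch. `ParseDepth` is a hypothetical helper written for this illustration, not the function from the network builder; it uses a local end pointer instead of passing `str` directly.

```cpp
#include <cstdlib>

// Parses the number which follows a prefix of `offset` characters,
// e.g. 64 in "Fr64" for offset 2. On return, *str points to the first
// character after the parsed number, or to *str + offset if no digits
// were found at that position.
int ParseDepth(const char** str, int offset) {
  char* end = nullptr;
  long value = std::strtol(*str + offset, &end, 10);
  *str = end;
  return static_cast<int>(value);
}
```

With the word `Fr64`, an offset of 1 makes `strtol` start at `r`, so no conversion takes place and the result is 0. An offset of 2 starts at the first digit and gives 64.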
Limit the matrix to UINT16_MAX x UINT16_MAX.
Larger dimensions could also result in an arithmetic overflow
when multiplying the two dimensions.
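
A sketch of such a limit check, using hypothetical helper names. It assumes
unsigned 32 bit dimensions; with that limit the product is at most
65535 * 65535 = 4294836225, which still fits into 32 bits.

```cpp
#include <cstdint>

// Returns true if both dimensions are within the allowed range.
bool DimensionsAreValid(uint32_t dim1, uint32_t dim2) {
  return dim1 <= UINT16_MAX && dim2 <= UINT16_MAX;
}

// Number of matrix elements. Only call this after DimensionsAreValid
// returned true, otherwise the multiplication can wrap around.
uint32_t ElementCount(uint32_t dim1, uint32_t dim2) {
  return dim1 * dim2;
}
```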
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Wrong file data could give a large value for the number of vector elements
resulting in very large memory allocations.
Limit the allowed data range to UINT16_MAX (65535) elements
which hopefully should be sufficient for all use cases.
Changing the data type of the related member variables from int to
uint32_t allowed removing several type casts.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Add missing include statements, add missing "static" qualifiers or
remove functions which are not used at all.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Add break in default case to avoid potential problems with
future case statements following the default case.
* Remove empty statement.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang warnings:
src/ccstruct/coutln.cpp:231:15: warning:
variable 'destindex' may be uninitialized when used here [-Wconditional-uninitialized]
src/wordrec/language_model.cpp:1170:27: warning:
variable 'expected_gap' may be uninitialized when used here [-Wconditional-uninitialized]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang warnings:
src/api/baseapi.cpp:1642:18: warning:
possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1642:31: warning:
possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1642:45: warning:
possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1652:16: warning:
possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1652:30: warning:
possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1662:17: warning:
possible misuse of comma operator here [-Wcomma]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang warning:
src/ccstruct/polyblk.cpp:48:36: warning:
constructor parameter 'box' shadows the field 'box' of 'POLY_BLOCK'
[-Wshadow-field-in-constructor]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang warning:
src/lstm/networkio.cpp:56:15: warning:
'this' pointer cannot be null in well-defined C++ code;
comparison may be assumed to always evaluate to true [-Wtautological-undefined-compare]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang warning:
src/lstm/lstmrecognizer.cpp:411:13: warning:
unused function 'NullIsBest' [-Wunused-function]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang warning:
src/lstm/network.cpp:249:7: warning:
'break' will never be executed [-Wunreachable-code-break]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The functions TessBaseAPIInitLangMod, TessBaseAPIClearAdaptiveClassifier
and TessBaseAPIDetectOrientationScript need conditional compilation.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Instead of defining the DISABLED_LEGACY_ENGINE macro in config_auto.h
(which is not included by all source files), define it as a preprocessor
option for those parts of the code which require it.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
On most systems float is the IEEE 754 single-precision binary
floating-point format (32 bits). Tesseract does not support other systems.
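
Such an assumption can be checked at compile time. The following is only a
sketch of one possible check, not necessarily the code used in Tesseract:

```cpp
#include <climits>
#include <limits>

// Fail the build on systems where float is not IEEE 754 single precision.
static_assert(std::numeric_limits<float>::is_iec559,
              "float must use an IEEE 754 format");
static_assert(sizeof(float) * CHAR_BIT == 32, "float must have 32 bits");
```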
Signed-off-by: Stefan Weil <sw@weilnetz.de>
On most systems double is the IEEE 754 double-precision binary
floating-point format (64 bits). Tesseract does not support other systems.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It did not cause a problem as both arguments were 0.
Update also the function prototype of HistogramRectOCL to
accept a void pointer which allows removing a type cast.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The division was made with integers, giving a wrong result.
* Avoid division and use pure integer operations.
* Add missing "static" attribute.
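
The general problem can be shown with a hypothetical example (the functions
below are not the code which was changed). A fraction written with integer
operands is truncated, while the same comparison can be done exactly by
multiplying both sides:

```cpp
// Truncating version: 2 / 3 is an integer division and yields 0,
// so the threshold is always 0.
bool ExceedsTwoThirdsTruncating(int part, int total) {
  return part > total * (2 / 3);
}

// Pure integer version: part / total > 2 / 3 is equivalent to
// 3 * part > 2 * total for positive total, without any division.
bool ExceedsTwoThirds(int part, int total) {
  return 3 * part > 2 * total;
}
```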
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Remove unneeded assignments and a wrong comment in the destructor.
Fix wrong data type for local variable xstarts.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The changes are based on an analysis done with include-what-you-use.
Replace also some standard C header files by the corresponding
standard C++ header files.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Remove unneeded include statements, remove conditional statements and
replace the remaining assert.h includes by the standard C++ variant cassert.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
genericvector.h used a mix of assert and ASSERT_HOST.
By using assert only, it no longer depends on errcode.h
which defines the ASSERT_HOST macro.
Other files which still use ASSERT_HOST now need an explicit
include statement for errcode.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Coverity Scan does not like incrementing of a null pointer,
so increment an index value instead of a pointer.
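
A sketch of the pattern with a hypothetical function: the loop advances an
index, so no arithmetic is done on the pointer itself, which may be null
when there are no elements.

```cpp
#include <cstddef>

// Sums `count` values. `values` may be nullptr if `count` is 0.
int Sum(const int* values, std::size_t count) {
  int total = 0;
  // Increment the index instead of the pointer.
  for (std::size_t i = 0; i < count; ++i) {
    total += values[i];
  }
  return total;
}
```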
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The tesseract/ subdirectory is no longer automatically added to the
include path of the compiler. Therefore old code which used code like
#include "capi.h"
must now change that to
#include "tesseract/capi.h"
This avoids name conflicts with header files from other projects.
Signed-off-by: Stefan Weil <sw@weilnetz.de>