This allows fixing two compiler warnings from clang++:
src/ccutil/universalambigs.cpp:23:19: warning: no previous extern declaration for non-static variable 'kUniversalAmbigsFile' [-Wmissing-variable-declarations]
src/ccutil/universalambigs.cpp:19019:18: warning: no previous extern declaration for non-static variable 'ksizeofUniversalAmbigsFile' [-Wmissing-variable-declarations]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Remove unused macros ultoa, SIGNED.
* Move macros NOMINMAX and WIN32_LEAN_AND_MEAN to host.h
because they are used when including windows.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This partially reverts commit c150b9832d.
Now params.cpp includes host.h which also gets the definition for MAX_PATH.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang-tidy added it in commit ac0b191f6b.
The "e" flag is an extension for glibc which sets the O_CLOEXEC flag,
so the file handle is not leaked to child processes. It is not needed
here.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It was only defined for Windows builds.
Use also false instead of 0 to set the default value of
two boolean config variables.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Remove unneeded include statements for host.h, add required ones and
update the comments for the remaining include statements.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The modifications were done using this command:
run-clang-tidy-8.py -header-filter='.*' -checks='-*,modernize-loop-convert' -fix
Then the resulting code was cleaned manually.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The modifications were done using this command:
run-clang-tidy-8.py -header-filter='.*' -checks='-*,modernize-use-auto' -fix
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The modifications were done using this command:
run-clang-tidy-8.py -header-filter='.*' -checks='-*,modernize-use-override' -fix
Signed-off-by: Stefan Weil <sw@weilnetz.de>
clang warnings:
src/ccutil/unicharcompress.cpp:172:27: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
src/lstm/recodebeam.cpp:129:29: warning: comparison of integers of different signs: 'std::__cxx1998::vector::size_type' (aka 'unsigned long') and 'int' [-Wsign-compare]
src/lstm/recodebeam.cpp:276:48: warning: comparison of integers of different signs: 'std::__cxx1998::vector::size_type' (aka 'unsigned long') and 'int' [-Wsign-compare]
unittest/imagedata_test.cc:101:21: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
unittest/linlsq_test.cc:33:23: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
unittest/linlsq_test.cc:44:23: warning: comparison of integers of different signs: 'int' and 'std::__cxx1998::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
unittest/nthitem_test.cc:27:23: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
unittest/nthitem_test.cc:68:21: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
unittest/stats_test.cc:26:23: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The modified definition avoids warnings caused by redundant semicolons.
Now a semicolon is required when using the macro, so a few code locations
had to be updated.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
- pass in ParamsVectors from Tesseract
(carrying values from langdata/config/api)
into LSTMRecognizer::Load and LoadDictionary
- after LSTMRecognizer's Dict is initialised
(with default values), reset the variables
user_{words,patterns}_{suffix,file} from the
corresponding entries in the passed vector
This requires libarchive-dev.
Tesseract can now load traineddata files in any of the archive formats
which are supported by libarchive. Example of a zipped BagIt archive:
$ unzip -l /usr/local/share/tessdata/zip.traineddata
Archive: /usr/local/share/tessdata/zip.traineddata
Length Date Time Name
--------- ---------- ----- ----
55 2019-03-05 15:27 bagit.txt
0 2019-03-05 15:25 data/
1557 2019-03-05 15:28 manifest-sha256.txt
1082890 2019-03-05 15:25 data/eng.word-dawg
1487588 2019-03-05 15:25 data/eng.lstm
7477 2019-03-05 15:25 data/eng.unicharset
63346 2019-03-05 15:25 data/eng.shapetable
976552 2019-03-05 15:25 data/eng.inttemp
13408 2019-03-05 15:25 data/eng.normproto
4322 2019-03-05 15:25 data/eng.punc-dawg
4738 2019-03-05 15:25 data/eng.lstm-number-dawg
1410 2019-03-05 15:25 data/eng.freq-dawg
844 2019-03-05 15:25 data/eng.pffmtable
6360 2019-03-05 15:25 data/eng.lstm-unicharset
1012 2019-03-05 15:25 data/eng.lstm-recoder
1047 2019-03-05 15:25 data/eng.unicharambigs
4322 2019-03-05 15:25 data/eng.lstm-punc-dawg
16109842 2019-03-05 15:25 data/eng.bigram-dawg
80 2019-03-05 15:25 data/eng.version
6426 2019-03-05 15:25 data/eng.number-dawg
3694794 2019-03-05 15:25 data/eng.lstm-word-dawg
--------- -------
23468070 21 files
`combine_tessdata -d` and `combine_tessdata -u` also work.
The traineddata files in the new format can be generated with
standard tools like zip or tar.
More work is needed for other training tools and big endian support.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This fixes a warning from Apple's clang compiler:
[ 34%] Building CXX object CMakeFiles/libtesseract.dir/src/ccutil/errcode.cpp.o
/Users/travis/build/stweil/tesseract/src/ccutil/errcode.cpp:83:7: warning: indirection of non-volatile null pointer will be deleted, not trap [-Wnull-dereference]
*reinterpret_cast<int*>(0) = 0;
^~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/travis/build/stweil/tesseract/src/ccutil/errcode.cpp:83:7: note: consider using __builtin_trap() or qualifying pointer with 'volatile'
Signed-off-by: Stefan Weil <sw@weilnetz.de>
gcc warnings:
src/ccmain/docqual.cpp:734:26: warning: this statement may fall through [-Wimplicit-fallthrough=]
src/ccmain/docqual.cpp:764:26: warning: this statement may fall through [-Wimplicit-fallthrough=]
src/ccmain/docqual.cpp:782:26: warning: this statement may fall through [-Wimplicit-fallthrough=]
[...]
Signed-off-by: Stefan Weil <sw@weilnetz.de>