Commit Graph

4860 Commits

Author SHA1 Message Date
Stefan Weil
15b3596ec4 Optimize LSTM code for builds without OpenMP
The constant value kNumThreads is not only used to configure the number
of threads but also to allocate vectors used in those threads.

There is only a single thread without OpenMP, so it is sufficient to
allocate vectors with only one element in that case.

Also replace the upper limit in the for loops with the known vector size.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-22 10:13:53 +02:00
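
A minimal sketch of the idea described in this commit, not the actual Tesseract code. The thread count of 4 for OpenMP builds and the helper name `CountScratchBuffers` are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Assumption: 4 threads with OpenMP; the real value in Tesseract may differ.
#ifdef _OPENMP
const int kNumThreads = 4;
#else
const int kNumThreads = 1;  // Single thread, so one scratch vector is enough.
#endif

// Hypothetical helper: allocates one scratch buffer per thread and loops
// up to the known vector size instead of repeating the constant.
int CountScratchBuffers(int buffer_size) {
  std::vector<std::vector<double>> scratch(kNumThreads);
  int count = 0;
  for (std::size_t i = 0; i < scratch.size(); ++i) {
    scratch[i].resize(buffer_size);
    ++count;
  }
  return count;
}
```

Bounding the loop by `scratch.size()` keeps the loop and the allocation in step if the constant changes later.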
zdenop
5a06417eb2 Merge pull request #937 from stweil/fix
UNICHARSET: Add missing initialization
2017-05-19 21:16:44 +02:00
Stefan Weil
fb863c97a9 UNICHARSET: Add missing initialization
The member variable default_sid_ was used without being initialized.

Valgrind report for `tesseract --oem 1 hello.png hello`:

    Conditional jump or move depends on uninitialised value(s)
       at 0x14352E: BITS16::set_bit(unsigned char, unsigned char) (bits16.h:50)
       by 0x143E27: WERD::set_flag(WERD_FLAGS, unsigned char) (werd.h:129)
       by 0x27D053: WERD_RES::SetupWordScript(UNICHARSET const&) (pageres.cpp:381)
       by 0x27CAFD: WERD_RES::SetupForRecognition(UNICHARSET const&, tesseract::Tesseract*, Pix*, int, TBOX const*, bool, bool, bool, ROW*, BLOCK const*) (pageres.cpp:316)
       by 0x145903: tesseract::Tesseract::SetupWordPassN(int, tesseract::WordData*) (control.cpp:182)
       by 0x145780: tesseract::Tesseract::SetupAllWordsPassN(int, TBOX const*, char const*, PAGE_RES*, GenericVector<tesseract::WordData>*) (control.cpp:168)
       by 0x146293: tesseract::Tesseract::recog_all_words(PAGE_RES*, ETEXT_DESC*, TBOX const*, char const*, int) (control.cpp:336)
       by 0x12F356: tesseract::TessBaseAPI::Recognize(ETEXT_DESC*) (baseapi.cpp:878)
       by 0x13036D: tesseract::TessBaseAPI::ProcessPage(Pix*, int, char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1184)
       by 0x13014A: tesseract::TessBaseAPI::ProcessPagesInternal(char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1140)
       by 0x12FBCE: tesseract::TessBaseAPI::ProcessPages(char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1040)
       by 0x12C3DF: main (tesseractmain.cpp:515)
     Uninitialised value was created by a heap allocation
       at 0x4C2C21F: operator new(unsigned long) (vg_replace_malloc.c:334)
       by 0x12D88B: tesseract::TessBaseAPI::Init(char const*, int, char const*, tesseract::OcrEngineMode, char**, int, GenericVector<STRING> const*, GenericVector<STRING> const*, bool, bool (*)(STRING const&, GenericVector<char>*)) (baseapi.cpp:320)
       by 0x12D6DA: tesseract::TessBaseAPI::Init(char const*, char const*, tesseract::OcrEngineMode, char**, int, GenericVector<STRING> const*, GenericVector<STRING> const*, bool) (baseapi.cpp:284)
       by 0x12C088: main (tesseractmain.cpp:440)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-19 20:57:39 +02:00
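
The kind of fix described here can be sketched as follows. `ScriptTable` is a hypothetical stand-in for UNICHARSET, and the initial value 0 is an assumption; only the member name `default_sid_` comes from the commit message.

```cpp
// Hypothetical stand-in for a class that owns a default script id.
class ScriptTable {
 public:
  // Initializing the member in the constructor avoids the read of an
  // indeterminate value that Valgrind reports above.
  ScriptTable() : default_sid_(0) {}  // Assumption: 0 is a valid default.

  int default_sid() const { return default_sid_; }

 private:
  int default_sid_;
};
```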
zdenop
da1254dd58 Merge pull request #936 from stweil/opt
Reduce number of new / delete operations
2017-05-19 20:36:58 +02:00
Stefan Weil
e6d683923c Reduce number of new / delete operations for class LanguageModel
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-19 18:31:37 +02:00
Stefan Weil
562de89728 Reduce number of new / delete operations for class KDTreeSearch
Also add several TODO comments because it is not clear why expensive
FLOAT64 calculations are used instead of cheaper FLOAT32 ones.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-19 18:30:11 +02:00
Egor Pugin
95bf30def1 Update README.md
2017-05-19 17:10:52 +03:00
Egor Pugin
baf6cfe9ec Merge pull request #935 from stweil/coverity
README: Add Coverity badge
2017-05-19 17:10:23 +03:00
Stefan Weil
edeb0a4502 README: Add Coverity badge
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-19 16:08:42 +02:00
zdenop
84db453d3a Merge pull request #934 from stweil/opencl
opencl: Remove more unused code
2017-05-19 11:24:49 +02:00
Stefan Weil
df36c85f26 opencl: Remove more unused code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-19 10:11:14 +02:00
zdenop
f5814974b0 Merge pull request #932 from stweil/dockerfile
Fix and improve Dockerfile
2017-05-19 09:21:22 +02:00
Stefan Weil
defb399657 Fix and improve Dockerfile
* Add comment.
* Add missing packages cmake, curl.
* Update bundler (fixes warning).
* Only clone the latest release of travis-build (save time and disk space).
* Remove empty line at end of file.

Signed-off-by: Stefan Weil <stefan@v2201612906741603.powersrv.de>
2017-05-18 21:35:53 +02:00
zdenop
482cd82ca6 Merge pull request #930 from stweil/opt
EquationDetect: Remove unneeded new / delete operations
2017-05-18 08:59:20 +02:00
Stefan Weil
fef5972d23 EquationDetect: Remove unneeded new / delete operations
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-18 07:39:36 +02:00
zdenop
8bd2fa7a4b Merge pull request #927 from stweil/inttypes
Fix and clean ccutil/host.h
2017-05-17 19:52:28 +02:00
Stefan Weil
e05f4c677d Remove obsolete comments and unused code from ccutil/host.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-17 11:55:00 +02:00
Stefan Weil
3a6a8d70fc Replace Standard C library header files with C++ header files
Replacing inttypes.h with cinttypes fixes a problem with glibc < 2.18:
In older inttypes.h, the standard C format macros are only defined for
C++ when the macro __STDC_FORMAT_MACROS is set.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-17 11:49:43 +02:00
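
As an illustration of the header change, the sketch below uses `<cinttypes>` to get the format macros without defining `__STDC_FORMAT_MACROS`. The function name `FormatInt64` is hypothetical and not part of Tesseract.

```cpp
#include <cassert>
#include <cinttypes>  // Provides PRId64 in C++ without extra macros.
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit integer with the portable
// PRId64 format macro.
std::string FormatInt64(std::int64_t value) {
  char buffer[32];
  int written = std::snprintf(buffer, sizeof(buffer), "%" PRId64, value);
  assert(written > 0 && written < static_cast<int>(sizeof(buffer)));
  return std::string(buffer);
}
```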
Egor Pugin
697f842317 Merge pull request #925 from stweil/opt
genericvector: Small optimizations
2017-05-17 00:40:23 +03:00
Stefan Weil
0ba202f6ed Remove unneeded null pointer check
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-16 22:58:10 +02:00
Stefan Weil
46ca83071e genericvector: Add overloaded LoadDataFromFile
Several code locations call that method with a normal C string,
so overload it to accept that without a conversion to a STRING
object. This saves unneeded new / memcpy / delete operations.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-16 22:57:46 +02:00
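
A rough sketch of the overloading pattern, assuming `std::string` as a stand-in for Tesseract's STRING class and `std::vector<char>` for GenericVector. The signatures are illustrative and may not match the real ones.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Overload for plain C strings: callers with a string literal or char
// pointer reach the file code without building a temporary string object.
bool LoadDataFromFile(const char* filename, std::vector<char>* data) {
  std::FILE* fp = std::fopen(filename, "rb");
  if (fp == nullptr) return false;
  data->clear();
  char buffer[4096];
  std::size_t n;
  while ((n = std::fread(buffer, 1, sizeof(buffer), fp)) > 0) {
    data->insert(data->end(), buffer, buffer + n);
  }
  std::fclose(fp);
  return true;
}

// Overload for callers that already hold a string object; it forwards
// to the C string version.
bool LoadDataFromFile(const std::string& filename, std::vector<char>* data) {
  return LoadDataFromFile(filename.c_str(), data);
}
```

Overload resolution prefers the `const char*` version for string literals, so no conversion (and no allocation) takes place on that path.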
Egor Pugin
852b678314 Merge pull request #922 from stweil/automake
automake: Enable all warnings and fix a warning
2017-05-16 00:52:34 +03:00
Stefan Weil
5d60444f40 automake: Enable all warnings and fix a warning
Fix this automake warning for java/Makefile.am:

    java/Makefile.am:67: warning: user target 'clean' defined here ...
    automake: ... overrides Automake target 'clean' defined here
    java/Makefile.am:67: consider using clean-local instead of clean

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-15 20:08:54 +02:00
Egor Pugin
2a4483da4c Merge pull request #920 from stweil/fix
Improve robustness of TessdataManager
2017-05-14 23:38:12 +03:00
Stefan Weil
079d6b9161 Improve robustness of TessdataManager
Tesseract crashes with an unhandled exception (std::bad_alloc) if it gets
a bad tessdata file where the numEntries data field is very large (also
after swapping), for example 0x77777777.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-14 21:33:56 +02:00
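
One way to guard against such a file is to validate the entry count before allocating anything. This is a sketch under assumptions: the limit `kMaxEntries`, the function `ReadEntryCount`, and the 4-byte header layout are hypothetical and not taken from the Tesseract source.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: an upper bound on plausible entry counts.
const std::uint32_t kMaxEntries = 64;

// Hypothetical helper: reads a 4-byte entry count from a header buffer,
// optionally swaps the byte order, and rejects values above the limit
// so that no huge allocation is attempted.
bool ReadEntryCount(const unsigned char* header, bool swap,
                    std::uint32_t* num_entries) {
  std::uint32_t value;
  std::memcpy(&value, header, sizeof(value));
  if (swap) {
    value = (value >> 24) | ((value >> 8) & 0x0000FF00u) |
            ((value << 8) & 0x00FF0000u) | (value << 24);
  }
  if (value > kMaxEntries) return false;  // Bad or corrupted file.
  *num_entries = value;
  return true;
}
```

A count of 0x77777777 reads the same in either byte order, so it is rejected with or without swapping.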
zdenop
ffb1ec3535 Merge pull request #918 from rfschtkt/issue529
Issue529
2017-05-13 19:33:46 +02:00
zdenop
5c8f88b964 Merge pull request #917 from stweil/crashfix
Fix crash if output file could not be opened
2017-05-13 19:30:37 +02:00
zdenop
8b939b0cef Merge pull request #915 from stweil/tessdatamanager
Remove unused methods from Tessdatamanager
2017-05-13 19:29:42 +02:00
Raf Schietekat
b4cf46697f Issue #529: inside main() use return rather than exit
2017-05-13 18:02:00 +02:00
Raf Schietekat
9a5ed19cf6 Issue #529: cleanup
2017-05-13 18:01:45 +02:00
Stefan Weil
84396707a8 Fix crash if output file could not be opened
This error case results in fout_ == nullptr.
Closing a nullptr file is not allowed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 17:27:07 +02:00
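
The guard described here can be sketched as below. `OutputFile` is a hypothetical wrapper class; only the member name `fout_` comes from the commit message.

```cpp
#include <cstdio>

// Hypothetical wrapper around an output file handle.
class OutputFile {
 public:
  explicit OutputFile(const char* path) : fout_(std::fopen(path, "wb")) {}

  // Passing a null pointer to fclose is undefined behavior, so the
  // handle is closed only if opening succeeded.
  ~OutputFile() {
    if (fout_ != nullptr) std::fclose(fout_);
  }

  OutputFile(const OutputFile&) = delete;
  OutputFile& operator=(const OutputFile&) = delete;

  bool is_open() const { return fout_ != nullptr; }

 private:
  std::FILE* fout_;
};
```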
Stefan Weil
db8750e94e Remove unused method TessdataManager::LoadFileLater
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 13:14:47 +02:00
Stefan Weil
65b839e1aa Remove unused method TessdataManager::OverwriteEntry
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 13:14:47 +02:00
zdenop
6bebe71749 Merge pull request #910 from stweil/opt
Fix GenericVector and optimize some code which used GenericVector::init_to_size
2017-05-13 12:53:40 +02:00
zdenop
29f3de9be1 Merge pull request #914 from stweil/clean
Clean code
2017-05-13 12:45:57 +02:00
zdenop
4e93259a80 Merge pull request #912 from stweil/leak
main: Fix two memory leaks and fix order of destructor calls
2017-05-13 12:44:38 +02:00
zdenop
81ad09ba97 Merge pull request #913 from stephengroat/patch-1
test brew HEAD installs
2017-05-13 12:42:09 +02:00
Stefan Weil
5dc4af62fb baseapi: Simplify code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 12:14:29 +02:00
Stefan Weil
69296f8d18 Clean method UNICHARSET::add_script
It increased the script_table too early, so the last element was never
used.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 11:53:43 +02:00
Stefan Weil
78142593d2 Fix order of destructor calls for DawgCache and TessBaseAPI
TessBaseAPI must release its cache use before DawgCache is destroyed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 11:35:30 +02:00
Stephen
b4b14061ff shouldn't rely on different install
2017-05-12 14:14:08 -07:00
Stephen
14baca38b1 test brew installs but allow failures
2017-05-12 14:02:39 -07:00
Stefan Weil
f37f858c99 main: Fix two memory leaks
When Tesseract terminates by calling the exit function,
the destructor of any local auto variable is not called.

Fix two cases by using static variables.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 21:15:52 +02:00
zdenop
af212af89f Merge pull request #909 from stweil/include
ccutil: Remove unneeded include statement
2017-05-12 16:22:24 +02:00
Stefan Weil
3a67ff930e Optimize code by replacing init_to_size with resize_no_init
There is no need to initialize memory with a fixed value which is
overwritten in the next step.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 14:34:55 +02:00
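
The general point can be illustrated with plain arrays, since `init_to_size` and `resize_no_init` are GenericVector methods that are not reproduced here. `MakeSquares` is a hypothetical function used only for this sketch.

```cpp
#include <memory>

// Hypothetical example: every element is written in the loop, so the
// buffer is allocated without initialization. `new int[n]()` would zero
// the memory first, which is wasted work in this case.
std::unique_ptr<int[]> MakeSquares(int n) {
  std::unique_ptr<int[]> values(new int[n]);
  for (int i = 0; i < n; ++i) {
    values[i] = i * i;
  }
  return values;
}
```

Skipping the initialization is only safe when every element is overwritten before it is read.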
Stefan Weil
bb2348bbbe genericvector: Fix and optimize function LoadDataFromFile
It's not necessary to initialize the vector with 0,
because the initial values are read from file.

Also fix an assertion when trying to read an empty file.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 14:15:54 +02:00
Stefan Weil
80f51c3758 ccutil: Remove unneeded include statement
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 14:11:21 +02:00
zdenop
21e739ca2e Merge pull request #907 from stweil/no-tiff
Remove most libtiff dependencies
2017-05-12 12:49:36 +02:00
Stefan Weil
5e3665c6ae Remove most libtiff dependencies
libtiff is no longer needed for OpenCL, so remove that dependency.

It is still suggested for Windows to redirect warning messages
from the tesseract executable to the console.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 10:15:35 +02:00
zdenop
2b373d1cca Merge pull request #896 from rfschtkt/toomanywarnings
Too many warnings!
2017-05-12 08:46:12 +02:00