Commit Graph

4218 Commits

Author SHA1 Message Date
Egor Pugin
8ebcea2926 Use pangocairo-1.43 for the moment. Remove private pango header. 2019-11-01 12:59:04 +01:00
Egor Pugin
49ce908e4b Try to fix #2599 2019-11-01 12:58:57 +01:00
Egor Pugin
f522b51b90 [sw] Install tess headers. 2019-11-01 12:58:49 +01:00
Stefan Weil
7fcad19286 cmake: Add missing pthread library
It is needed for C++ threads since commit 85068be405.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:58:42 +01:00
Stefan Weil
d6a1e2ddb9 cmake: Add missing include directory for LibArchive
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:58:36 +01:00
Egor Pugin
a2dd6bf35b [appveyor] Disable VS2019 image because it's too slow. 2019-11-01 12:58:26 +01:00
Egor Pugin
5541a3d502 Update appveyor.yml 2019-11-01 12:56:19 +01:00
Stefan Weil
b21779d699 Improve formatting of hOCR output with character boxes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:55:49 +01:00
Stefan Weil
d338681758 Use auto data type for results of std::ftell
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:44 +01:00
Stefan Weil
47c8710ac2 Remove unused filesize_ from class InputBuffer
This also simplifies the constructors.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:36 +01:00
Stefan Weil
e34acfeb46 Simplify shell code (fixes warning from Codacy)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:28 +01:00
Stefan Weil
8baf817192 Use long instead of off_t for result from ftell
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:21 +01:00
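As an illustration of the point made in the commit above: `std::ftell` returns `long`, so its result is best stored in a `long` (or `auto`) variable rather than `off_t`. The sketch below is not taken from the Tesseract source; the helper name `file_size` is hypothetical.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not from the Tesseract source): returns the size of an
// open file in bytes, or -1 on error. The read position is restored on success.
long file_size(std::FILE* fp) {
  assert(fp != nullptr);
  // std::ftell returns long, so long (or auto) is the matching type.
  const long current = std::ftell(fp);
  if (current < 0) return -1;
  if (std::fseek(fp, 0, SEEK_END) != 0) return -1;
  const long size = std::ftell(fp);
  // Go back to where the caller was before reporting the size.
  if (std::fseek(fp, current, SEEK_SET) != 0) return -1;
  return size;
}
```

Note that `long` is only 32 bits wide on some platforms, so this approach cannot report sizes of very large files there.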
Stefan Weil
055f32d422 Fix training script for macOS (issue #2578)
Bash on macOS does not support "|&":

    tesstrain_utils.sh: line 80: syntax error near unexpected token `&'

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:14 +01:00
Stefan Weil
a469224ec1 Fix some compiler warnings (unused local variables)
gcc warnings:

    src/classify/protos.cpp:85:7: warning: unused variable ‘i’ [-Wunused-variable]
    src/classify/protos.cpp:86:7: warning: unused variable ‘Bit’ [-Wunused-variable]
    src/classify/protos.cpp:89:14: warning: unused variable ‘Config’ [-Wunused-variable]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:06 +01:00
zdenop
5775cf0535 Implemented improved bounding box algorithm
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>

# Conflicts:
#	src/lstm/recodebeam.cpp
2019-11-01 12:52:47 +01:00
Stefan Weil
25b1a4b951 classify: Use fixed size bit vector
The vector was already limited to MAX_NUM_PROTOS (512) entries or 64 bytes
in the old code. Now it uses that size right from the start which avoids
reallocating it later when entries are added.

The old code which reallocated the vector to expand it was buggy because
the realloc function can return a different pointer, but the code still
used the original pointer to reset the new bits.

Function ExpandBitVector is now unused and therefore removed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:46:44 +01:00
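The realloc problem described in the commit above can be sketched as follows. This is a minimal illustration, not the Tesseract code: the names `kMaxProtos`, `kWords`, `grow_bits` and `new_fixed_bits` are assumptions made for the example. The first function shows the safe way to grow a zero-filled array (always continue with the pointer that `realloc` returns); the second shows the approach the commit describes, allocating the maximum size once so that no reallocation is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Illustrative constants; kMaxProtos mirrors MAX_NUM_PROTOS (512) from the
// commit message, the names themselves are hypothetical.
constexpr std::size_t kMaxProtos = 512;
constexpr std::size_t kWords = kMaxProtos / 32;  // 16 words = 64 bytes

// Grows a word array with realloc. The block may move, so the new words must
// be cleared through the returned pointer, never through the old one.
std::uint32_t* grow_bits(std::uint32_t* bits, std::size_t old_words,
                         std::size_t new_words) {
  assert(new_words >= old_words);
  auto* grown = static_cast<std::uint32_t*>(
      std::realloc(bits, new_words * sizeof(std::uint32_t)));
  if (grown == nullptr) return nullptr;  // the old block is still valid here
  std::memset(grown + old_words, 0,
              (new_words - old_words) * sizeof(std::uint32_t));
  return grown;
}

// Fixed-size alternative: allocate all 64 bytes up front, zero-initialised.
std::uint32_t* new_fixed_bits() {
  return static_cast<std::uint32_t*>(
      std::calloc(kWords, sizeof(std::uint32_t)));
}
```

With the fixed-size variant, the caller frees the vector with `std::free` and never needs a function like `grow_bits` at all.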
Robert Pösel
c01d230c10 Give word's bounds to callback also during second pass 2019-11-01 12:46:37 +01:00
Egor Pugin
574586a8d0 Update appveyor.yml 2019-11-01 12:46:19 +01:00
Stefan Weil
59659ddc6e Remove structures.*
It only provided the functions new_cell and free_cell, which could be replaced by new and delete.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:46:02 +01:00
Egor Pugin
5912204d62 [appveyor] Enable artifacts.
Though they will include some sw artifacts.
2019-11-01 12:44:49 +01:00
zhuangzhuang1988
4bc94da148 fix cmake warning. 2019-11-01 12:44:36 +01:00
Stefan Weil
40b69539ff Remove unused functions reverse16, reverse32
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:44:29 +01:00
Stefan Weil
ae6eddcc12 Replace non-portable sleep with std::this_thread::sleep_for
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:44:22 +01:00
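A minimal sketch of the portable sleep named in the commit above, using only the C++11 standard library. The wrapper `sleep_ms` is a hypothetical name for this example and does not come from the Tesseract source.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical wrapper: sleeps for the given number of milliseconds using the
// standard library instead of platform calls such as sleep() or Sleep().
// Returns the elapsed time in milliseconds, measured with a steady clock.
long long sleep_ms(int ms) {
  assert(ms >= 0);
  const auto start = std::chrono::steady_clock::now();
  std::this_thread::sleep_for(std::chrono::milliseconds(ms));
  const auto elapsed = std::chrono::steady_clock::now() - start;
  return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

The same call compiles unchanged on POSIX systems and on Windows, which is the portability gain over the platform-specific functions.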
Egor Pugin
09837a60dc [appveyor] Print sw version for reference. 2019-11-01 12:30:24 +01:00
Zdenko Podobný
5e3772cad8 fix #2101 2019-11-01 12:30:15 +01:00
Egor Pugin
e4936adfa3 Update appveyor.yml 2019-11-01 12:30:08 +01:00
Egor Pugin
3cf4895737 [build][sw] Disable FMA dotproduct. 2019-11-01 12:30:01 +01:00
Stefan Weil
25a6fe7ba9 arch: Reduce number of include files for dot product functions
dotproductavx.h and dotproductsse.h declared only two functions.
Move those declarations to dotproduct.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:29:51 +01:00
Stefan Weil
2e1cd1d448 Add dot product implementation for Intel FMA (double = tessdata_best)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:29:39 +01:00
zdenop
27af9e883d use Ubuntu Xenial for travis 2019-11-01 12:29:32 +01:00
zdenop
838b6476f9 Give info about expected leptonica dependencies (fix #2333) 2019-11-01 12:29:24 +01:00
Stefan Weil
ba8e870f85 Optimize tprintf implementation
It no longer uses a local buffer, so it needs less memory
and no mutex.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:28:19 +01:00
Stefan Weil
75a9926f01 FPRow: Add missing initialisation for scalar (CID 1402754)
Also modernize the code a little.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:28:11 +01:00
Stefan Weil
cad3433dc8 Fix format strings for size_t arguments (CID 1402762, 1402767)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:28:03 +01:00
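For reference, the portable printf conversion for a `size_t` argument is `%zu`. The sketch below shows it with `std::snprintf`; the function `describe_size` is a hypothetical example and not the code changed by the commit above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical example: formats a size_t value. "%zu" matches size_t on every
// platform, whereas "%d" or "%lu" is only correct where the widths happen to
// agree.
std::string describe_size(std::size_t n) {
  char buf[64];
  const int len = std::snprintf(buf, sizeof(buf), "size=%zu", n);
  assert(len > 0 && static_cast<std::size_t>(len) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(len));
}
```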
Stefan Weil
c2839ecfd6 Fix format string for 64 bit integer (CID 1402986)
Commit c1264c189e was not the right fix.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:26:28 +01:00
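One common way to write a portable format string for a 64 bit integer is the `PRId64` macro from `<cinttypes>`. Whether the commit above uses this macro or a cast is not stated in its message, so the following is only a sketch under that assumption; `describe_offset` is a hypothetical name.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical example: formats a 64 bit signed integer. PRId64 expands to
// the conversion that matches int64_t on the current platform ("ld", "lld",
// and so on), so the format string needs no platform checks.
std::string describe_offset(std::int64_t value) {
  char buf[64];
  const int len = std::snprintf(buf, sizeof(buf), "offset=%" PRId64, value);
  assert(len > 0 && static_cast<std::size_t>(len) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(len));
}
```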
Stefan Weil
595e263ceb tfnetwork: Add missing return statement (CID 1402992)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:26:21 +01:00
Egor Pugin
cc1486d270 [cmake] Hide unnecessary find_package. 2019-11-01 12:26:15 +01:00
Egor Pugin
3afc185ad4 Implement CMake+SW build.
Currently only Windows is supported.
You can try it as follows:

    mkdir build_sw && cd build_sw && cmake .. -DSW_BUILD=1
2019-11-01 12:26:09 +01:00
theirix
5688c26b03 Avoid using experimental C++14/17 support in CMake
This commit points CMAKE_CXX_STANDARD to the latest non-experimental standard.

CMake announces C++14 and C++17 support even if the
compiler supports it only experimentally (c++1y and c++1z).
It breaks cmake standard detection and requires workarounds
for old compilers.
2019-11-01 12:26:03 +01:00
zhuangzhuang1988
4b4e1f1e8d fix tesstrain.py error 2019-11-01 12:25:57 +01:00
zhuangzhuang
b8014ee1c1 fix garbled Windows stdout output (#2546)
* fix garbled Windows stdout output

* fix type name error

* remove unnecessary codepoint check.
2019-11-01 12:25:48 +01:00
zdenop
d93346ffef cmake: do not report unused-command-line-argument for clang release target 2019-11-01 12:25:36 +01:00
Zdenko Podobný
5280bbcade 4.1.0 Release 2019-07-07 14:34:08 +02:00
Stefan Weil
22fb70cb85 Fix handling of single pages from multipage TIFF files (issue #2537)
That case now uses Leptonica to deliver the desired image instead of
using an inefficient loop in the Tesseract code.

See commit 54fafc4e2e which used similar
code in the past.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 10:00:46 +02:00
Stefan Weil
08ca7b8416 Fix linker error with disabled legacy engine (issue #2532)
Commit 3871caae86 introduced a build
regression when the legacy engine was disabled.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 10:00:46 +02:00
Stefan Weil
48641b0791 Remove outdated build information for Android
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 10:00:46 +02:00
Stefan Weil
e53e10503a genericvector: Remove redundant declarations
tesseract::FileReader and tesseract::FileWriter are already declared
in serialis.h which is included by genericvector.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 09:53:00 +02:00
Stefan Weil
f4698154b3 Revert "Replace callback by direct function calls in TessBaseAPI::GetComponentImages"
This reverts commit 1a44ce3178.
It removed global symbols, so the binary API was incompatible.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 07:54:15 +02:00
Stefan Weil
792b39d5c8 Revert "Move LSTMTrainer from libtesseract to libtesseract_training"
This reverts commit a30d433356.

That commit removed LSTMTrainer also from libtesseract.so which breaks
the ABI compatibility.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 07:41:22 +02:00
zdenop
b101d58621
Merge pull request #2543 from db4/4.1
Fix crash in Tesseract::classify_word_and_language()
2019-07-05 12:35:10 +02:00