zdenop
f15e2cc174
fix typo
2019-11-01 14:00:22 +01:00
Stefan Weil
7e980df016
simd: Check whether the OS supports FMA, AVX, ...
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 14:00:00 +01:00
Stefan Weil
e413b9318b
classify/Makefile: Fix inconsistent style
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 13:59:33 +01:00
Egor Pugin
55b4099ad1
Export some classify vars.
2019-11-01 13:59:14 +01:00
zdenop
0d8be252cc
Remove more code for builds with disabled legacy engine
...
Now the Tesseract library no longer includes unused code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
# Conflicts:
# src/cutil/Makefile.am
# unittest/Makefile.am
2019-11-01 13:58:37 +01:00
zdenop
c9ecab8854
Move source files which are used for training only to src/training
2019-11-01 13:50:26 +01:00
Stefan Weil
b80acd81ba
OpenCL: Add static attribute for kernel_src
...
It is only used in openclwrapper.cpp.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 13:36:22 +01:00
Stefan Weil
14665dfa2c
Remove unused functions create_edges_window, draw_raw_edge
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 13:36:15 +01:00
Stefan Weil
91f0de94bc
Remove unused function truncate_path and related files
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 13:36:07 +01:00
Stefan Weil
c3d4742af6
Remove global array kPolyBlockNames from Tesseract library
...
It is only used in unittest/layout_test.cc after moving a test from
baseapi_test.cc to that file, so it can be made local.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 13:35:55 +01:00
Stefan Weil
92b460010e
cmake: Don't link pthread on Windows
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 13:00:03 +01:00
Stefan Weil
5d2265478f
universalambigs: Add hack to fix builds with Microsoft compiler
...
The MS compiler only accepts string constants up to 65535 characters,
so shorten the string for that compiler to fix the compilation.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:59:44 +01:00
Zdenko Podobný
9dd392d8b2
Move fileio.cpp and fileio.h to training (this fixes the Android build)
2019-11-01 12:59:31 +01:00
Stefan Weil
ea34763fea
universalambigs: Replace octal characters by UTF-8 string
...
This improves readability and reduces the file size.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:59:20 +01:00
Stefan Weil
a473283482
Clean ambigs.h
...
* Remove unused kUnigramAmbigsBufferSize and kAmbigNgramSeparator
* Move some declarations to ambigs.cpp
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:59:12 +01:00
Egor Pugin
8ebcea2926
Use pangocairo-1.43 for the moment. Remove private pango header.
2019-11-01 12:59:04 +01:00
Egor Pugin
49ce908e4b
Try to fix #2599
2019-11-01 12:58:57 +01:00
Stefan Weil
7fcad19286
cmake: Add missing pthread library
...
It is needed for C++ threads since commit 85068be405.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:58:42 +01:00
Stefan Weil
b21779d699
Improve formatting of hOCR output with character boxes
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:55:49 +01:00
Stefan Weil
d338681758
Use auto data type for results of std::ftell
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:44 +01:00
Stefan Weil
47c8710ac2
Remove unused filesize_ from class InputBuffer
...
This also simplifies the constructors.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:36 +01:00
Stefan Weil
e34acfeb46
Simplify shell code (fixes warning from Codacy)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:28 +01:00
Stefan Weil
8baf817192
Use long instead of off_t for result from ftell
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:21 +01:00
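Note on 8baf817192: `std::ftell` is declared to return `long`, so storing its result in a `long` avoids depending on the POSIX-only `off_t`. The following is a minimal sketch of that idea, not the code changed by the commit; the helper name `stream_size` is hypothetical.

```cpp
#include <cstdio>

// Hypothetical helper (not from the Tesseract sources): returns the size in
// bytes of an open binary stream, or -1 on error. The current position is
// restored before returning.
long stream_size(std::FILE* fp) {
  const long pos = std::ftell(fp);  // std::ftell returns long, not off_t
  if (pos < 0) return -1;
  if (std::fseek(fp, 0, SEEK_END) != 0) return -1;
  const long size = std::ftell(fp);
  if (std::fseek(fp, pos, SEEK_SET) != 0) return -1;
  return size;
}
```

On platforms where `long` is 32 bits wide this approach cannot represent sizes above 2 GiB; the sketch does not try to handle that case.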
Stefan Weil
055f32d422
Fix training script for macOS (issue #2578)
...
Bash on macOS does not support "|&":
tesstrain_utils.sh: line 80: syntax error near unexpected token `&'
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:14 +01:00
Stefan Weil
a469224ec1
Fix some compiler warnings (unused local variables)
...
gcc warnings:
src/classify/protos.cpp:85:7: warning: unused variable ‘i’ [-Wunused-variable]
src/classify/protos.cpp:86:7: warning: unused variable ‘Bit’ [-Wunused-variable]
src/classify/protos.cpp:89:14: warning: unused variable ‘Config’ [-Wunused-variable]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:53:06 +01:00
zdenop
5775cf0535
Implemented improved bounding box algorithm
...
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
# Conflicts:
# src/lstm/recodebeam.cpp
2019-11-01 12:52:47 +01:00
Stefan Weil
25b1a4b951
classify: Use fixed size bit vector
...
The vector was already limited to MAX_NUM_PROTOS (512) entries or 64 bytes
in the old code. Now it uses that size right from the start which avoids
reallocating it later when entries are added.
The old code which reallocated the vector to expand it was buggy because
the realloc function can return a different pointer, but the code still
used the original pointer to reset the new bits.
Function ExpandBitVector is now unused and therefore removed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:46:44 +01:00
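Note on 25b1a4b951: the realloc pitfall described in that commit message, and the fixed-size alternative, can be sketched as follows. This is an illustration under assumptions, not the Tesseract code: `grow_and_clear`, `FixedBits`, `kMaxProtos` and `kWords` are hypothetical names, and only the figure of 512 entries (64 bytes) is taken from the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: grows a word array from old_words to new_words entries
// and zeroes the new tail. Returns the (possibly moved) pointer, or nullptr on
// allocation failure, in which case the original block is still valid.
std::uint32_t* grow_and_clear(std::uint32_t* words, std::size_t old_words,
                              std::size_t new_words) {
  assert(new_words >= old_words);
  auto* grown = static_cast<std::uint32_t*>(
      std::realloc(words, new_words * sizeof(std::uint32_t)));
  if (grown == nullptr) return nullptr;
  // Clear the new bits through the pointer returned by realloc. Clearing them
  // through `words` is the bug pattern: realloc may have moved the block.
  std::memset(grown + old_words, 0,
              (new_words - old_words) * sizeof(std::uint32_t));
  return grown;
}

// Fixed-size alternative: allocate the maximum up front, so no reallocation
// is ever needed. 512 bits in 32-bit words is 16 words, i.e. 64 bytes.
constexpr std::size_t kMaxProtos = 512;
constexpr std::size_t kWords = kMaxProtos / 32;

struct FixedBits {
  std::uint32_t words[kWords] = {};
  void set(std::size_t i) { words[i / 32] |= (1u << (i % 32)); }
  bool test(std::size_t i) const { return (words[i / 32] >> (i % 32)) & 1u; }
};
```

The fixed-size variant trades a small, constant amount of memory for the removal of the whole class of stale-pointer errors.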
Robert Pösel
c01d230c10
Give word's bounds to callback also during second pass
2019-11-01 12:46:37 +01:00
Stefan Weil
59659ddc6e
Remove structures.*
...
It only provided the functions new_cell, free_cell which could be replaced by new, delete.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:46:02 +01:00
Stefan Weil
40b69539ff
Remove unused functions reverse16, reverse32
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:44:29 +01:00
Stefan Weil
ae6eddcc12
Replace non-portable sleep with std::this_thread::sleep_for
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:44:22 +01:00
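Note on ae6eddcc12: `std::this_thread::sleep_for` is part of the C++11 standard library, so it works without platform-specific calls. A minimal sketch of its use follows; the helper `pause_ms` is hypothetical and does not appear in the Tesseract sources.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: sleeps for roughly `ms` milliseconds using only the
// C++ standard library and returns the elapsed time that was measured.
long long pause_ms(int ms) {
  const auto start = std::chrono::steady_clock::now();
  std::this_thread::sleep_for(std::chrono::milliseconds(ms));
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(end - start)
      .count();
}
```

The measured time can be longer than requested because of scheduling delays.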
Stefan Weil
25a6fe7ba9
arch: Reduce number of include files for dot product functions
...
dotproductavx.h and dotproductsse.h declared only two functions.
Move those declarations to dotproduct.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:29:51 +01:00
Stefan Weil
2e1cd1d448
Add dot product implementation for Intel FMA (double = tessdata_best)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:29:39 +01:00
Stefan Weil
ba8e870f85
Optimize tprintf implementation
...
It no longer uses a local buffer, so it needs less memory
and no mutex.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:28:19 +01:00
Stefan Weil
75a9926f01
FPRow: Add missing initialisation for scalar (CID 1402754)
...
Also modernize the code a little.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:28:11 +01:00
Stefan Weil
cad3433dc8
Fix format strings for size_t arguments (CID 1402762, 1402767)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:28:03 +01:00
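Note on cad3433dc8 and c2839ecfd6: the diffs are not shown in this log, so the exact fixes are unknown here. One common portable approach to this class of warning is `%zu` for `size_t` and the `PRId64` macro for 64-bit integers, sketched below with a hypothetical helper `format_sizes`.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a size_t and a 64-bit integer with conversion
// specifiers that match their types on every platform.
std::string format_sizes(std::size_t count, std::int64_t offset) {
  char buf[64];
  // %zu matches size_t; PRId64 expands to the right specifier for int64_t.
  std::snprintf(buf, sizeof(buf), "count=%zu offset=%" PRId64, count, offset);
  return buf;
}
```

Using `%d` or `%ld` for these types happens to work on some platforms and fails on others, which is why static analysers flag it.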
Stefan Weil
c2839ecfd6
Fix format string for 64 bit integer (CID 1402986)
...
Commit c1264c189e was not the right fix.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:26:28 +01:00
Stefan Weil
595e263ceb
tfnetwork: Add missing return statement (CID 1402992)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 12:26:21 +01:00
Egor Pugin
3afc185ad4
Implement CMake+SW build.
...
Currently only Windows is supported.
You can try it as follows:
mkdir build_sw && cd build_sw && cmake .. -DSW_BUILD=1
2019-11-01 12:26:09 +01:00
zhuangzhuang1988
4b4e1f1e8d
fix tesstrain.py error
2019-11-01 12:25:57 +01:00
zhuangzhuang
b8014ee1c1
Fix garbled stdout output on Windows (#2546)
...
* fix garbled stdout output on Windows
* fix type name error
* remove unnecessary codepoint check.
2019-11-01 12:25:48 +01:00
Stefan Weil
22fb70cb85
Fix handling of single pages from multipage TIFF files (issue #2537)
...
That case now uses Leptonica to deliver the desired image instead of
using an inefficient loop in the Tesseract code.
See commit 54fafc4e2e which used similar code in the past.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 10:00:46 +02:00
Stefan Weil
08ca7b8416
Fix linker error with disabled legacy engine (issue #2532)
...
Commit 3871caae86 introduced a build regression when the legacy engine was disabled.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 10:00:46 +02:00
Stefan Weil
e53e10503a
genericvector: Remove redundant declarations
...
tesseract::FileReader and tesseract::FileWriter are already declared
in serialis.h which is included by genericvector.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 09:53:00 +02:00
Stefan Weil
f4698154b3
Revert "Replace callback by direct function calls in TessBaseAPI::GetComponentImages"
...
This reverts commit 1a44ce3178.
It removed global symbols, so the binary API was incompatible.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 07:54:15 +02:00
Stefan Weil
792b39d5c8
Revert "Move LSTMTrainer from libtesseract to libtesseract_training"
...
This reverts commit a30d433356.
That commit removed LSTMTrainer also from libtesseract.so which breaks
the ABI compatibility.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-06 07:41:22 +02:00
Dmitry Bely
c310fef8f0
Fix crash in Tesseract::classify_word_and_language() when tessedit_timing_debug is enabled
2019-07-05 10:00:48 +03:00
Stefan Weil
d8494f3215
Revert "Simplify indirect call of LMPainPoints::GeneratePainPoint"
...
This reverts commit 6a0fc4f89f.
It removed global symbols, so the binary API was incompatible.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-02 06:39:53 +02:00
Stefan Weil
1d5a320d4a
Revert "Simplify class LSTMTrainer"
...
This reverts commit 563a1717d4.
It removed global symbols, so the binary API was incompatible.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-02 06:38:19 +02:00
Stefan Weil
4535e4605b
Update enum from unicode/uchar.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-25 14:55:03 +02:00