Commit Graph

4667 Commits

Author SHA1 Message Date
zdenop
12847d58ad
Merge pull request #2455 from bact/master
Unittest: Fix Thai valid text and add Thai illegal sequences
2019-05-25 18:36:17 +02:00
Raphael Graf
dacba02cd8 Do not link librt on OpenBSD 2019-05-25 18:08:55 +02:00
Stefan Weil
334d9b4633 unicodes: Optimize code by using constexpr and removing unused globals
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-25 14:51:28 +02:00
Stefan Weil
23d05a5e1b featdefs: Optimize code by using constexpr
This also fixes some warnings from clang++:

    src/classify/featdefs.cpp:47:15: warning: declaration requires a global constructor [-Wglobal-constructors]
    src/classify/featdefs.cpp:57:15: warning: declaration requires a global constructor [-Wglobal-constructors]
    src/classify/featdefs.cpp:66:15: warning: declaration requires a global constructor [-Wglobal-constructors]
    src/classify/featdefs.cpp:75:15: warning: declaration requires a global constructor [-Wglobal-constructors]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-25 14:46:36 +02:00
Stefan Weil
7628112273 Fix broken build for Leptonica < 1.77
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-25 14:23:43 +02:00
Stefan Weil
55901a480f Remove classify/cutoffs.h
It only defined CLASS_CUTOFF_ARRAY and some unused definitions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-25 13:54:44 +02:00
zdenop
82458db630 Merge branch 'master' of https://github.com/tesseract-ocr/tesseract 2019-05-25 11:14:28 +02:00
zdenop
539673b503 fix '--enable-visibility' build 2019-05-25 11:13:33 +02:00
zdenop
8de022ab1c
Merge pull request #2461 from stweil/tensorflow
Support build with Tensorflow
2019-05-25 10:52:37 +02:00
zdenop
e44c60c3b2 cmake: respect -DTESSDATA_PREFIX=/path (on linux) 2019-05-25 08:31:26 +02:00
Stefan Weil
32dcfd06ba Replace Tensorflow by TensorFlow
The name is written in camel case, see https://www.tensorflow.org/.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 17:14:28 +02:00
Stefan Weil
1ba8c97cac Fix linking of unittest with Tensorflow
This does not add Tensorflow tests. It only fixes the linker errors.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 17:08:48 +02:00
Stefan Weil
2441e4d8ac Implement check for Tensorflow header file
This looks for one of the header files which are included by Tesseract.
It currently uses a hard coded path which works for Debian / Ubuntu.

Simplify also the rules for linking Tensorflow.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 16:52:14 +02:00
Stefan Weil
9cdf041448 Remove "third_party/" in comments and update path names
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 14:12:52 +02:00
Stefan Weil
4382ab1a34 Support build with Tensorflow
It expects include files in /usr/include/tensorflow.

* Add configure option --with-tensorflow (disabled by default)
* Fix data type tensorflow::int64
* Remove "third_party/" in include statements
* Add dummy implementations for Backward and DebugWeights in TFNetwork
* Add files generated with protoc from tfnetwork.proto
  (so the Tensorflow sources are not needed for the build)
* Update Makefiles

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 14:11:31 +02:00
Zdenko Podobný
c69ee9af24 cmake: fix tiff linking to executable if tiffio.h is found 2019-05-24 11:12:39 +02:00
Zdenko Podobný
0f1e13a859 cmake: fix warning 2019-05-24 10:59:59 +02:00
Zdenko Podobný
294f548ac1 fix missing tiff format 2019-05-24 10:39:17 +02:00
Stefan Weil
3f74da5da9 lstmtrainer: Set constant kLearningRateDecay at compile time
sqrt(0.5) = 1 / sqrt(2) can be replaced by the macro M_SQRT1_2.

This also fixes a compiler warning:

    src/lstm/lstmtrainer.cpp:51:14: warning: declaration requires a global constructor [-Wglobal-constructors]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-23 15:01:23 +02:00
zdenop
4bab7dd83d
Merge pull request #2451 from Bharat123rox/lgtm
Some LGTM alert fixes and potential bugfixes
2019-05-22 12:19:44 +02:00
Egor Pugin
fea1f3970b
Merge pull request #2452 from stweil/tprintf
tprintf: Make code reentrant and use less memory
2019-05-22 12:31:34 +03:00
Egor Pugin
8f99880a7a
Merge pull request #2453 from stweil/crashcode
Remove SavePixForCrash and related code
2019-05-22 12:30:29 +03:00
bact
aac6f593f3
Update normstrngs_test.cc 2019-05-22 15:21:16 +07:00
bact
e05c5ecfcc
Fix Thai valid text and add Thai illegal sequences
- Fix a invalid sequence in "valid text" `kScriptText`
- Add two illegal sequence in `kBadlyFormedThaiWords`
2019-05-22 15:19:49 +07:00
Bharat123rox
bc3ea622a6 Fix bug in max_max_dist 2019-05-22 08:21:30 +02:00
Bharat123rox
0bf45e81e7 Fix LGTM and revert bugfix for later PR 2019-05-22 11:23:27 +05:30
Bharat123rox
945ccac85a Fix syntax error 2019-05-22 10:10:12 +05:30
Stefan Weil
6514479e8f Remove SavePixForCrash and related code
That debugging code uses very much memory and is no longer useful.

    text	   data	    bss	    dec	    hex	filename
     815	      0	 262144	 262959	  4032f	src/ccutil/globaloc.o

Remove also the function err_exit which was only used in ccmain/reject.cpp.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-21 20:25:58 +02:00
Stefan Weil
078a129674 tprintf: Make code reentrant and use less memory
Reduce the maximum message size from 64 KiB to 2 KiB which still should
be large enought for trace messages.

Create the smaller message on the stack instead of using a global
array to allow reentrancy and to reduce the memory use of Tesseract.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-21 20:22:58 +02:00
Stefan Weil
c926bdb265 configure: Use a hopefully more robust way to fix AX_CHECK_COMPILE_FLAG
The check for -Wno-extra-semi-stmt failed on Linux with clang++-7.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-21 20:21:05 +02:00
Bharat123rox
7f31a0634d Some LGTM fixes and potential bugfixes 2019-05-21 23:24:50 +05:30
zdenop
b96df3a33a
Merge pull request #2448 from stweil/pi
Remove local definition of M_PI
2019-05-21 11:47:51 +02:00
Stefan Weil
d2ca81e794 Remove local definition of M_PI
It is defined for all platforms when math.h or cmath is included
after defining the macro _USE_MATH_DEFINES.

Define _USE_MATH_DEFINES before any include statement to make sure
that M_PI gets defined. It is not necessary to define it conditionally
only for Windows.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-20 21:18:52 +02:00
Stefan Weil
d6c1fa766c configure: Fix for clang++-8 and newer
AX_CHECK_COMPILE_FLAG fails if it is used with -Werror and the compiler
raises error -Wextra-semi-stmt:

    configure:4224: checking whether C++ compiler accepts -mavx
    configure:4243: clang++-8 -c -g -O2 -Wall -Wextra -Wpedantic -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -march=native -Werror -Wno-unused-macros -mavx  conftest.cpp >&5
    conftest.cpp:20:3: error: empty expression statement has no effect; remove unnecessary ';' to silence this warning [-Werror,-Wextra-semi-stmt]
      ;
      ^
    1 error generated.

Add -Wno-extra-semi-stmt to disable those errors if possible.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-20 10:52:39 +02:00
zdenop
b753ff62ee
Merge pull request #2445 from stweil/errcode
Fix compiler warnings
2019-05-20 09:31:28 +02:00
Stefan Weil
64bdceee69 Fix compiler warnings
This fixes lots of warnings related to ERRCODE like the following one:

    src/ccutil/errcode.h:81:15: warning:
      declaration requires a global constructor [-Wglobal-constructors]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-19 22:10:22 +02:00
Stefan Weil
09edd1a604 Fix out-of-bounds writes in Classify::ReadNewCutoffs
The function did not correctly read Chinese unichars into the local
Class variable if the locale was set to de_DE.UTF-8 (or other
incompatible locales). That resulted in a wrong ClassId which was
used to write into the Cutoffs array without checking for valid bounds.

On macOS the result was a runtime error in baseapi_test (see GitHub
issue #1250):

    [ RUN      ] TesseractTest.InitConfigOnlyTest
    baseapi_test(21845,0x1134c45c0) malloc: *** error for object 0x927f96c28005e0: pointer being freed was not allocated
    baseapi_test(21845,0x1134c45c0) malloc: *** set a breakpoint in malloc_error_break to debug

Replacing sscanf by std::istringstream fixes that.
Add also an assertion to catch future out-of-bounds writes.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-18 13:39:55 +02:00
Stefan Weil
639781b5c8 stringrenderer_test: Get system locale only once
This fixes a runtime exception on macOS.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-18 13:24:13 +02:00
Stefan Weil
bb226c19ab Update abseil submodule to HEAD
Abseil suggests to use the latest code:
https://abseil.io/about/philosophy#we-recommend-that-you-choose-to-live-at-head

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-17 15:03:43 +02:00
zdenop
2308cbf87f
Merge pull request #2444 from zdenop/fix_travis
fix typo
2019-05-17 11:26:40 +02:00
zdenop
a54e345c9b fix typo 2019-05-17 11:19:07 +02:00
Zdenko Podobný
5282cdf7be another improvement for ca0be2fb72 2019-05-17 11:04:42 +02:00
Zdenko Podobný
e92a424efa try to fix ca0be2fb72 2019-05-17 10:51:06 +02:00
zdenop
af3dd1af06 Merge branch 'master' of https://github.com/tesseract-ocr/tesseract 2019-05-16 23:19:42 +02:00
zdenop
ca0be2fb72 cmake: fix travis build 2019-05-16 23:18:13 +02:00
Stefan Weil
68d7a679e4 Replace CR-LF line endings by LF
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-16 20:49:01 +02:00
Stefan Weil
cc754ed1e0 Remove space at line endings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-16 20:49:01 +02:00
zdenop
198bbe3df5
Merge pull request #2441 from stweil/linkfix
Fix unittest build without legacy code and use locale for most unittests
2019-05-16 19:12:15 +02:00
Stefan Weil
8e7b1119b5 Run more unittests with the user's locale
Hopefully this improves the test coverage.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-16 18:12:55 +02:00
Stefan Weil
59e31e958b Fix more build error for compilation without legacy engine
Skip the tests which need the legacy code.
Add also code to those tests to use the user's locale to test that, too.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-16 18:12:55 +02:00