224 Commits

Author SHA1 Message Date
Stefan Weil
fd30c86674 Remove endianness test (WORDS_BIGENDIAN is unused)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-20 10:47:00 +02:00
Stefan Weil
c1494fb710 Don't check for stdbool.h (only used in capi.h)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-20 10:42:28 +02:00
Stefan Weil
d4cf77c92b Don't check for limits.h (now unused)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-20 10:39:13 +02:00
Stefan Weil
a1d161326e Don't check for unused malloc.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-19 07:30:00 +02:00
Stefan Weil
ff0a7a38f7 Check compiler options depending on host cpu
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-05-27 06:52:36 +02:00
Robin Watts
f79e52a7cc NEON SIMD code.
In tests on my pi3b+, a release build of my ghostscript integration
takes 2 minutes 27 seconds to render a PDF and OCR it with the
vanilla sources. With this NEON code added the time drops to 37
seconds.

I have not tested the configure/Makefile changes as I'm not using
them.
2020-05-20 18:54:42 +01:00
Stefan Weil
7f16162745 Fix previous commit 688f6490bb
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-12-31 09:50:07 +01:00
Stefan Weil
688f6490bb Fix broken build for pango_font_info_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-12-28 09:35:40 +01:00
amitdo
502ebe8ca9 Autotools: Pango, Cairo and ICU only required by training tools
2019-12-16 17:23:06 +02:00
Stefan Weil
39cc7b5808 automake: Improve build rules
- Use less noinst_LTLIBRARIES (saves build time and disk space)
- Move DISABLED_LEGACY_ENGINE from compiler flags to config_auto.h

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-12-03 12:22:47 +01:00
Stefan Weil
a1a139cbd2 Replace AVX_OPT, ..., AVX macros by HAVE_AVX, ... and clean related code
- Replace AVX_OPT, AVX2_OPT, FMA_OPT, SSE41_OPT
- Replace AVX, AVX2, FMA, SSE4_1
- Write new HAVE_AVX, HAVE_AVX2, HAVE_FMA, HAVE_SSE4_1 into config_auto.h
- Put related conditionals in Makefile.am in one place

This makes the code clearer and fixes a log message in
IntSimdMatrixTest.AVX2.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-28 17:51:37 +01:00
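As a rough illustration of how code might consume the HAVE_AVX, HAVE_AVX2, HAVE_FMA and HAVE_SSE4_1 macros that the commit above writes into config_auto.h, the sketch below picks a label for the configured SIMD variant at compile time. The function name `BestSimdName` and the priority order are assumptions made for this example, not taken from the Tesseract sources. In a real build the macros would come from config_auto.h; here they are simply undefined unless passed on the compiler command line.

```cpp
#include <string>

// Hypothetical helper (not part of Tesseract): reports which SIMD variant
// the build was configured for. The HAVE_* macros would normally be
// defined in config_auto.h by configure.
std::string BestSimdName() {
#if defined(HAVE_AVX2)
  return "avx2";
#elif defined(HAVE_FMA)
  return "fma";
#elif defined(HAVE_AVX)
  return "avx";
#elif defined(HAVE_SSE4_1)
  return "sse4.1";
#else
  // None of the macros is defined: fall back to portable code.
  return "generic";
#endif
}
```

Compiling with, for example, `-DHAVE_AVX2` makes the function return "avx2". Such a macro only says that the compiler can build the code; whether the CPU supports it at run time would still need a separate check.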
Stefan Weil
074844ce46 Show libcurl version
`tesseract --version` now also shows the version of libcurl and related
libraries if it was built with libcurl.

The preprocessor macro HAVE_LIBCURL is now defined in config_auto.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-28 16:34:52 +01:00
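A minimal sketch of how a version string could be guarded by the HAVE_LIBCURL macro mentioned in the commit above. The helper name `CurlVersionInfo` is hypothetical and the real Tesseract code may look different; `curl_version()` is the libcurl function that returns a string naming libcurl and the libraries it was built with.

```cpp
#include <string>
#if defined(HAVE_LIBCURL)
#include <curl/curl.h>
#endif

// Hypothetical helper (not part of Tesseract): returns the libcurl version
// string when the build has libcurl, otherwise a short notice.
std::string CurlVersionInfo() {
#if defined(HAVE_LIBCURL)
  // curl_version() also lists related libraries such as the TLS backend.
  return curl_version();
#else
  return "libcurl support not compiled in";
#endif
}
```

Without HAVE_LIBCURL the function compiles with no libcurl headers installed, which is the point of routing the decision through config_auto.h.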
Stefan Weil
9ed526625a Remove compiler flag which had no effect
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-27 09:43:11 +01:00
Stefan Weil
cbd3a21cb2 automake: Flat build for src/viewer and src/wordrec
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
0cd2bdbd2b automake: Flat build for src/textord
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
558462358a automake: Flat build for src/opencl
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
6eeb486b77 automake: Flat build for src/lstm
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
7ebcc77e3b automake: Flat build for src/dict
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
6181acf367 automake: Flat build for src/cutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
159160518b automake: Flat build for src/classify
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
9730c7e167 automake: Flat build for src/ccutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
b1d449315e automake: Flat build for src/ccstruct
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
9745a9d111 automake: Flat build for src/ccmain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
a166efaad6 automake: Flat build for src/arch
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
cafb1bbfd7 automake: Flat build for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
7ef20bb0e6 Use flat make for include/tesseract
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-10-29 12:01:51 +01:00
Stefan Weil
061eccd6ae Rename tesseract/tess_version.h -> tesseract/version.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-10-29 11:47:31 +01:00
amitdo
2f8884a64e Fix autotools build
2019-10-28 21:23:58 +02:00
Stefan Weil
94651e65ce Simplify configure.ac
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-10-01 12:32:08 +02:00
Stefan Weil
286d8275c7 Add support for image or image list by URL
This allows OCR of images from the internet without downloading them first:

    tesseract http://IMAGE_URL OUTPUT ...

It uses libcurl.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-10-01 12:10:45 +02:00
Zdenko Podobný
fef64d795c fix #2101
2019-07-13 20:11:03 +02:00
Stefan Weil
2d5b166876 Add dot product implementation for Intel FMA (double = tessdata_best)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-12 23:18:00 +02:00
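To give an idea of what a double-precision dot product using Intel FMA can look like, here is a small sketch. It is not the implementation added by the commit above; the function name `DotProductSketch` is made up for this example. It assumes the compiler defines `__FMA__` when FMA code generation is enabled (as gcc and clang do with `-mfma`) and falls back to a scalar loop otherwise.

```cpp
#include <cstddef>
#if defined(__FMA__)
#include <immintrin.h>
#endif

// Hypothetical sketch (not Tesseract code): dot product of two double arrays.
// Uses fused multiply-add on 4 doubles at a time when FMA is available,
// and a plain scalar loop for the remainder or when FMA is not enabled.
double DotProductSketch(const double* u, const double* v, std::size_t n) {
  double total = 0.0;
  std::size_t i = 0;
#if defined(__FMA__)
  __m256d acc = _mm256_setzero_pd();
  for (; i + 4 <= n; i += 4) {
    __m256d a = _mm256_loadu_pd(u + i);
    __m256d b = _mm256_loadu_pd(v + i);
    acc = _mm256_fmadd_pd(a, b, acc);  // acc = a * b + acc in one step
  }
  // Sum the four lanes of the accumulator.
  double lanes[4];
  _mm256_storeu_pd(lanes, acc);
  total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
  // Scalar tail (or the whole array when FMA is not enabled).
  for (; i < n; ++i) {
    total += u[i] * v[i];
  }
  return total;
}
```

Unaligned loads are used so that the caller does not need to align the input arrays.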
Stefan Weil
676b18834c Fix check for icu 52.1 or newer
It detected old versions but did not disable the training build.
This completes commit 66da4df11d.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-25 10:55:33 +02:00
Stefan Weil
674d6a90d8 Remove code for embedded build
That code is unrelated to Tesseract and can be easily implemented
by external projects which require it.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-17 09:55:33 +02:00
Stefan Weil
ca885da5d3 Use C++17 compiler if possible
This allows using new features of C++17 conditionally.
Also simplify the code which checks and sets the C++ version.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-31 10:40:56 +02:00
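Using C++17 features conditionally, as the commit above describes, usually means testing the `__cplusplus` macro. The sketch below is an assumed example, not code from Tesseract: the names `TextRef` and `CountSpaces` are hypothetical. It takes a `std::string_view` when the compiler runs in C++17 mode and a `const std::string&` otherwise.

```cpp
#include <string>
#if __cplusplus >= 201703L
#include <string_view>
#endif

// Hypothetical example (not Tesseract code): pick the parameter type
// depending on the language version the compiler reports.
#if __cplusplus >= 201703L
using TextRef = std::string_view;   // C++17: no copy for string literals
#else
using TextRef = const std::string&; // older standards: plain reference
#endif

// Counts the space characters in the given text.
int CountSpaces(TextRef text) {
  int count = 0;
  for (char c : text) {
    if (c == ' ') {
      ++count;
    }
  }
  return count;
}
```

The same source builds with either an older or a newer compiler, so the build system is free to select the newest standard it finds.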
Stefan Weil
3c9691f286 configure: Fix cross builds (check for TensorFlow header)
AC_CHECK_FILE does not work in cross builds. Such builds aborted.
Replace it with AC_CHECK_HEADERS. This fixes cross builds.

To enable TensorFlow in cross builds, more work is needed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-26 17:46:44 +02:00
Raphael Graf
dacba02cd8 Do not link librt on OpenBSD
2019-05-25 18:08:55 +02:00
Stefan Weil
32dcfd06ba Replace Tensorflow by TensorFlow
The name is written in camel case, see https://www.tensorflow.org/.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 17:14:28 +02:00
Stefan Weil
2441e4d8ac Implement check for Tensorflow header file
This looks for one of the header files which are included by Tesseract.
It currently uses a hard-coded path which works for Debian / Ubuntu.

Also simplify the rules for linking Tensorflow.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 16:52:14 +02:00
Stefan Weil
4382ab1a34 Support build with Tensorflow
It expects include files in /usr/include/tensorflow.

* Add configure option --with-tensorflow (disabled by default)
* Fix data type tensorflow::int64
* Remove "third_party/" in include statements
* Add dummy implementations for Backward and DebugWeights in TFNetwork
* Add files generated with protoc from tfnetwork.proto
  (so the Tensorflow sources are not needed for the build)
* Update Makefiles

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-24 14:11:31 +02:00
Stefan Weil
c926bdb265 configure: Use a hopefully more robust way to fix AX_CHECK_COMPILE_FLAG
The check for -Wno-extra-semi-stmt failed on Linux with clang++-7.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-21 20:21:05 +02:00
Stefan Weil
d6c1fa766c configure: Fix for clang++-8 and newer
AX_CHECK_COMPILE_FLAG fails if it is used with -Werror and the compiler
raises error -Wextra-semi-stmt:

    configure:4224: checking whether C++ compiler accepts -mavx
    configure:4243: clang++-8 -c -g -O2 -Wall -Wextra -Wpedantic -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -march=native -Werror -Wno-unused-macros -mavx  conftest.cpp >&5
    conftest.cpp:20:3: error: empty expression statement has no effect; remove unnecessary ';' to silence this warning [-Werror,-Wextra-semi-stmt]
      ;
      ^
    1 error generated.

Add -Wno-extra-semi-stmt to disable those errors if possible.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-20 10:52:39 +02:00
Stefan Weil
7917ffb6c2 configure: Fix for latest developer tools on macOS
AX_CHECK_COMPILE_FLAG fails if it is used with -Werror and the compiler
raises error -Wunused-macros. Add -Wno-unused-macros to disable those
errors if possible.

Also simplify the setting of several conditionals (AVX, AVX2, ...).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-14 22:31:23 +02:00
James R. Barlow
403361701a Fix CPPFLAGS configuration for icu4c and libarchive missing from configure.ac
2019-05-07 02:01:20 -07:00
Zdenko Podobný
3bbe4327c0 fix #2344 libpthread under-linking on FreeBSD
2019-03-27 15:37:14 +01:00
Stefan Weil
4ccbb9f830 configure: Check support of compile flags with -Werror
gcc fails if an unsupported compile flag is given, but clang and clang++
normally only emit a warning "argument unused during compilation".

The old test accepted flags like -mavx for clang++ on non-Intel hosts.
This resulted in build failures because Intel code was included.

Now the check runs with -Werror, and unsupported flags are detected as
an error. This fixes the build problem with clang++ on non-Intel hosts.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-26 16:41:52 +01:00
Stefan Weil
1c7e00611b Add initial support for traineddata files in standard archive formats
This requires libarchive-dev.

Tesseract can now load traineddata files in any of the archive formats
which are supported by libarchive. Example of a zipped BagIt archive:

    $ unzip -l /usr/local/share/tessdata/zip.traineddata
    Archive:  /usr/local/share/tessdata/zip.traineddata
      Length      Date    Time    Name
    ---------  ---------- -----   ----
           55  2019-03-05 15:27   bagit.txt
            0  2019-03-05 15:25   data/
         1557  2019-03-05 15:28   manifest-sha256.txt
      1082890  2019-03-05 15:25   data/eng.word-dawg
      1487588  2019-03-05 15:25   data/eng.lstm
         7477  2019-03-05 15:25   data/eng.unicharset
        63346  2019-03-05 15:25   data/eng.shapetable
       976552  2019-03-05 15:25   data/eng.inttemp
        13408  2019-03-05 15:25   data/eng.normproto
         4322  2019-03-05 15:25   data/eng.punc-dawg
         4738  2019-03-05 15:25   data/eng.lstm-number-dawg
         1410  2019-03-05 15:25   data/eng.freq-dawg
          844  2019-03-05 15:25   data/eng.pffmtable
         6360  2019-03-05 15:25   data/eng.lstm-unicharset
         1012  2019-03-05 15:25   data/eng.lstm-recoder
         1047  2019-03-05 15:25   data/eng.unicharambigs
         4322  2019-03-05 15:25   data/eng.lstm-punc-dawg
     16109842  2019-03-05 15:25   data/eng.bigram-dawg
           80  2019-03-05 15:25   data/eng.version
         6426  2019-03-05 15:25   data/eng.number-dawg
      3694794  2019-03-05 15:25   data/eng.lstm-word-dawg
    ---------                     -------
     23468070                     21 files

`combine_tessdata -d` and `combine_tessdata -u` also work.

The traineddata files in the new format can be generated with
standard tools like zip or tar.

More work is needed for other training tools and big endian support.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-05 17:18:48 +01:00
Stefan Weil
42ea432418 configure: Check for xsltproc (needed to generate manpages)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-02-15 22:19:52 +01:00
Stefan Weil
fd6e281c61 Use C++14 compiler if possible
This allows using new features of C++14 conditionally.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-02-13 11:05:34 +01:00
Stefan Weil
b3327f4e90 Remove unneeded checks for snprintf
snprintf is a standard function which should be available
on all relevant platforms, so those checks are unnecessary.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-02-13 08:04:52 +01:00