182 Commits

Author SHA1 Message Date
Stefan Weil
7917ffb6c2 configure: Fix for latest developer tools on macOS
AX_CHECK_COMPILE_FLAG fails if it is used with -Werror and the compiler
raises error -Wunused-macros. Add -Wno-unused-macros to disable those
errors if possible.

Also simplify the setting of several conditionals (AVX, AVX2, ...).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-14 22:31:23 +02:00
James R. Barlow
403361701a Fix CPPFLAGS configuration for icu4c and libarchive missing from configure.ac 2019-05-07 02:01:20 -07:00
Zdenko Podobný
3bbe4327c0 fix #2344 libpthread under-linking on FreeBSD 2019-03-27 15:37:14 +01:00
Stefan Weil
4ccbb9f830 configure: Check support of compile flags with -Werror
gcc fails if an unsupported compile flag is given, but clang and clang++
normally only emit a warning "argument unused during compilation".

The old test accepted flags like -mavx for clang++ on non-Intel hosts.
This resulted in build failures because Intel code was included.

Now the check runs with -Werror, and unsupported flags are detected as
an error. This fixes the build problem with clang++ on non-Intel hosts.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-26 16:41:52 +01:00
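The idea behind the commit above can be sketched outside of autoconf. The following Python snippet is only an illustration, not the code in configure.ac: the helper names `probe_command` and `flag_is_supported` are hypothetical, and the probe source is a trivial placeholder program. It compiles a tiny file with the candidate flag plus -Werror and treats a non-zero exit status as "unsupported".

```python
import os
import shutil
import subprocess
import tempfile


def probe_command(compiler, flag, source_path, object_path):
    """Build the command line for a compile-only probe of one flag."""
    # With -Werror, a warning such as clang's "argument unused during
    # compilation" becomes an error, so the compiler exits with a failure.
    return [compiler, "-Werror", flag, "-c", source_path, "-o", object_path]


def flag_is_supported(compiler, flag):
    """Return True if `compiler` accepts `flag` when warnings are errors."""
    if shutil.which(compiler) is None:
        return False  # compiler not installed: report the flag as unsupported
    with tempfile.TemporaryDirectory() as tmp:
        source_path = os.path.join(tmp, "probe.cpp")
        object_path = os.path.join(tmp, "probe.o")
        with open(source_path, "w") as handle:
            handle.write("int main() { return 0; }\n")
        result = subprocess.run(
            probe_command(compiler, flag, source_path, object_path),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    return result.returncode == 0
```

A call such as `flag_is_supported("clang++", "-mavx")` would then be expected to fail on a host where the flag is unsupported, which is the behaviour the commit message describes for the new check.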
Stefan Weil
1c7e00611b Add initial support for traineddata files in standard archive formats
This requires libarchive-dev.

Tesseract can now load traineddata files in any of the archive formats
which are supported by libarchive. Example of a zipped BagIt archive:

    $ unzip -l /usr/local/share/tessdata/zip.traineddata
    Archive:  /usr/local/share/tessdata/zip.traineddata
      Length      Date    Time    Name
    ---------  ---------- -----   ----
           55  2019-03-05 15:27   bagit.txt
            0  2019-03-05 15:25   data/
         1557  2019-03-05 15:28   manifest-sha256.txt
      1082890  2019-03-05 15:25   data/eng.word-dawg
      1487588  2019-03-05 15:25   data/eng.lstm
         7477  2019-03-05 15:25   data/eng.unicharset
        63346  2019-03-05 15:25   data/eng.shapetable
       976552  2019-03-05 15:25   data/eng.inttemp
        13408  2019-03-05 15:25   data/eng.normproto
         4322  2019-03-05 15:25   data/eng.punc-dawg
         4738  2019-03-05 15:25   data/eng.lstm-number-dawg
         1410  2019-03-05 15:25   data/eng.freq-dawg
          844  2019-03-05 15:25   data/eng.pffmtable
         6360  2019-03-05 15:25   data/eng.lstm-unicharset
         1012  2019-03-05 15:25   data/eng.lstm-recoder
         1047  2019-03-05 15:25   data/eng.unicharambigs
         4322  2019-03-05 15:25   data/eng.lstm-punc-dawg
     16109842  2019-03-05 15:25   data/eng.bigram-dawg
           80  2019-03-05 15:25   data/eng.version
         6426  2019-03-05 15:25   data/eng.number-dawg
      3694794  2019-03-05 15:25   data/eng.lstm-word-dawg
    ---------                     -------
     23468070                     21 files

`combine_tessdata -d` and `combine_tessdata -u` also work.

The traineddata files in the new format can be generated with
standard tools like zip or tar.

More work is needed for other training tools and big endian support.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-05 17:18:48 +01:00
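To make the archive layout above more concrete, here is a minimal Python sketch that lists the components of one language inside a zipped traineddata file. It uses Python's standard zipfile module purely for illustration; Tesseract itself reads the archive through libarchive. The function names and the contents of the demo archive are made up, and only the `data/<lang>.<component>` naming is taken from the listing above.

```python
import io
import zipfile


def list_components(archive_bytes, lang):
    """Map component names (e.g. "lstm") to uncompressed sizes for one language."""
    prefix = "data/" + lang + "."
    components = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            # Keep only members such as data/eng.lstm or data/eng.version.
            if info.filename.startswith(prefix):
                components[info.filename[len(prefix):]] = info.file_size
    return components


def build_demo_archive():
    """Create a tiny in-memory archive with placeholder contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("bagit.txt", "placeholder\n")
        archive.writestr("data/eng.version", "demo\n")
        archive.writestr("data/eng.unicharset", "0\n")
    return buffer.getvalue()


if __name__ == "__main__":
    print(list_components(build_demo_archive(), "eng"))
```

Files outside `data/`, such as `bagit.txt` and the manifest, are skipped by the prefix filter.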
Stefan Weil
42ea432418 configure: Check for xsltproc (needed to generate manpages)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-02-15 22:19:52 +01:00
Stefan Weil
fd6e281c61 Use C++14 compiler if possible
This allows using new features of C++14 conditionally.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-02-13 11:05:34 +01:00
Stefan Weil
b3327f4e90 Remove unneeded checks for snprintf
snprintf is a standard function which should be available
on all relevant platforms, so those checks are unnecessary.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-02-13 08:04:52 +01:00
Stefan Weil
66da4df11d configure: Remove header check for ICU
It wrongly detects old versions of ICU as valid.
Checking with pkg-config is sufficient and also sets ICU_UC_LIBS.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-02-01 10:06:34 +01:00
Stefan Weil
2ccc5810f3 Add check whether compiler supports -march=native flag
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-12-05 20:13:28 +01:00
Guillaume Gigaud
92b8833838 fix(configure) Don't add rt on Android
Library rt is included in the libc on Android: https://developer.android.com/ndk/guides/stable_apis#a3
2018-11-15 13:56:28 +01:00
zdenop
cdfb768010 move langtests and unlvtests from tesseract-ocr repository to test repository 2018-11-08 22:31:32 +01:00
zdenop
51316994cc 4.0.0 Release 2018-10-29 09:53:12 +01:00
Marco Atzeri
ebbd4e3efc fixes #426; define NOUNDEFINED for cygwin 2018-10-20 11:25:28 +02:00
zdenop
d9372662ec add "sudo ldconfig" to install instruction. fixes #1212 2018-09-29 13:33:36 +02:00
Stefan Weil
be1393b1e8 Replace macro MINGW by __MINGW32__
MINGW is no longer used and now removed from configure.ac.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-04 16:05:27 +02:00
Shree Devi Kumar
92922b421c Add langtests framework with frk example 2018-08-30 14:28:34 +00:00
Stefan Weil
b15624eb2f Fix regression (shared libraries no longer supported)
The first usage of AC_CHECK_HEADERS must be unconditional,
otherwise configure fails to detect support for shared libraries.

This fixes a regression introduced by commit a07025c993.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-21 11:06:38 +02:00
Stefan Weil
58208522f0 configure: Clean code for --enable-visibility
* Remove unneeded arguments for AC_ARG_ENABLE
* Use [] instead of () for default in help text

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
a07025c993 configure: Clean code for --enable-opencl
* Remove unneeded arguments for AC_ARG_ENABLE
* Use AS_HELP_STRING
* Use [] instead of () for default in help text
* Run AC_CHECK_HEADERS, AC_CHECK_LIB only if OpenCL support is enabled

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
0ad6e3e77f configure: Clean code for --enable-legacy
* Remove unneeded arguments for AC_ARG_ENABLE
* Fix formatting of help text
* Remove help text for --enable-legacy

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
e47a9272d7 configure: Clean code for --enable-graphics
* Remove unneeded arguments for AC_ARG_ENABLE
* Remove help text for --enable-graphics

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
cfc5ef65a2 configure: Clean code for --enable-embedded
* Remove unneeded arguments for AC_ARG_ENABLE
* Use AS_HELP_STRING
* Use [] instead of () for default in help text

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
11cafd7673 configure: Clean code for --enable-debug
* Remove unneeded arguments for AC_ARG_ENABLE (needs renaming of macro)
* Use [] instead of () for default in help text

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
11d9d8e59a configure: Remove macro AC_SYS_INTERPRETER
The macro sets interpval which is not used by Tesseract.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:19:58 +02:00
Stefan Weil
0a4edf618a configure: Remove large file support
Tesseract does not handle large files (more than 2 GiB).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:19:58 +02:00
Stefan Weil
4bbebd3f7e Remove tests for function getline
The Tesseract code does not use getline.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:19:58 +02:00
Stefan Weil
081793ff48 Fix build with legacy engine disabled
Instead of defining the DISABLED_LEGACY_ENGINE macro in config_auto.h
(which is not included by all source files), define it as a preprocessor
option for those parts of the code which require it.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-04 17:56:42 +02:00
amitdo
aa9f4b4861 Add an option to compile tesseract without the code of the legacy OCR engine 2018-07-03 18:49:42 +03:00
Stefan Weil
c1c87d73ee Require tesseract/ for API header files (fixes potential name conflicts)
The tesseract/ subdirectory is no longer automatically added to the
include path of the compiler. Therefore old code which used code like

    #include "capi.h"

must now change that to

    #include "tesseract/capi.h"

This avoids name conflicts with header files from other projects.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-17 22:01:19 +02:00
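For projects with many affected source files, the include change described above could be applied mechanically. The following Python sketch is one possible way to do that and is not part of the commit; the function name `prefix_includes` is hypothetical, and the caller has to supply the list of Tesseract API header names (only capi.h is named in the commit message). It handles only the quoted include form shown above.

```python
import re


def prefix_includes(source, headers, prefix="tesseract/"):
    """Rewrite `#include "name.h"` to `#include "tesseract/name.h"`."""
    if not headers:
        return source  # nothing to rewrite
    names = "|".join(re.escape(name) for name in headers)
    pattern = re.compile(
        r'^(\s*#\s*include\s*")(' + names + r')(")', re.MULTILINE
    )
    # Includes that already carry the prefix do not match, because the
    # header name must follow the opening quote directly.
    return pattern.sub(
        lambda m: m.group(1) + prefix + m.group(2) + m.group(3), source
    )
```

Running the function twice over the same text leaves it unchanged after the first pass, so it is safe to apply to a tree that has been partly converted already.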
Shree Devi Kumar
2563380d51 move testing and testdata to test, add unlvtests 2018-06-06 12:20:14 +00:00
Egor Pugin
104fe7931c Move training to src. 2018-04-25 11:35:26 +03:00
Egor Pugin
e95ff1159e Move sources into src dir. Update build scripts. 2018-04-25 11:02:54 +03:00
Eric Platon
4ded0d066e Revert failed attempt to support MacPorts g++
The support will require more work and is postponed for now.
2018-04-24 08:38:17 +09:00
Eric Platon
54b048fa0d Fix wrong environment test that breaks clang++ builds.
g++ builds require extra flags rejected by clang++. The bug is that the
flags are actually added unconditionally. This commit fixes the
condition.

See https://github.com/tesseract-ocr/tesseract/pull/1474
2018-04-23 16:11:24 +09:00
Stefan Weil
3b3216e883 Support version names starting with non-numeric characters
Not only version names like 4.0.0, but also version names like
v4.0.0 or tesseract-4.0.0 are now supported and give the same
GENERIC_MAJOR_VERSION = 4.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-04-22 16:43:35 +02:00
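The behaviour described above can be illustrated with a short Python sketch. It is not the logic used in configure.ac, which is written in m4 and shell; the function name `split_version` and the regular expression are assumptions chosen to reproduce the three examples from the commit message.

```python
import re


def split_version(name):
    """Return (major, minor, micro) from names like "v4.0.0"."""
    # Search for the first x.y.z group, so any non-numeric prefix is skipped.
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", name)
    if match is None:
        raise ValueError("no x.y.z version found in %r" % name)
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    for name in ("4.0.0", "v4.0.0", "tesseract-4.0.0"):
        print(name, split_version(name)[0])
```

All three names yield the same major version, matching GENERIC_MAJOR_VERSION = 4 in the commit message.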
Amit D
822082eeba configure.ac: Remove obsolete macros
The newer macros that replace the obsolete ones are already present in configure.ac.

  * AC_PROG_LIBTOOL -> LT_INIT
  * AC_LANG_CPLUSPLUS -> AC_LANG([C++])
2018-04-20 03:31:08 +03:00
Amit D
20254ae5b5 configure.ac: Update minimum required autoconf version to 2.63
This is the autoconf version shipped in RHEL/CentOS 6.
2018-04-19 19:11:43 +03:00
Amit D
cf7c88dc93 configure.ac: Check for the presence of pango 1.22.0 or higher
Tesseract's training tool text2image uses these two functions:
pango_glyph_item_iter_init_start
pango_glyph_item_iter_next_cluster

That means it requires Pango >=1.22.0:
https://developer.gnome.org/pango/stable/api-index-1-22.html
https://developer.gnome.org/pango/stable/pango-Glyph-Storage.html#pango-glyph-item-iter-init-start
https://developer.gnome.org/pango/stable/pango-Glyph-Storage.html#pango-glyph-item-iter-next-cluster
2018-04-17 14:46:16 +03:00
Amit D
98747b37ea configure.ac - check for the presence of icu 52.1 or higher 2018-04-17 12:05:50 +03:00
Stefan Weil
c89b1129d1 configure: Remove optimize option for preprocessor
It is only used by the compiler.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-04-15 08:28:11 +02:00
Eric Platon
1642d882a7 Remove useless echo statement. 2018-04-13 10:04:00 +09:00
Eric Platon
708f55423b Add flag to build compiler options with G++ on macOS.
Building with G++ on Darwin breaks when any of the AVX, AVX2, or SSE4.1
compiler options is set, unless G++ is actually clang.

This commit allows building with G++ by asking G++ to delegate assembly
to the clang integrated assembler instead of the GNU one.
2018-04-13 09:39:40 +09:00
Stefan Weil
ef31eaa7d7 Don't try to build manpages if asciidoc is missing
Commit f9157fd91d changed the rules for the documentation. Since that
commit, make always tried to build it and failed if asciidoc was
missing.

Now configure tests whether asciidoc is available and builds the
documentation conditionally. It also reports that to the user.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-04-09 19:07:54 +02:00
Stefan Weil
f9157fd91d configure: Don't use AM_MAINTAINER_MODE by default
That macro disables automated updates when configure.ac or a Makefile.am
changes. Normally those updates are wanted because users typically
forget to run ./autogen.sh.

See also the GNU documentation on why AM_MAINTAINER_MODE should not be used:
https://www.gnu.org/software/automake/manual/html_node/maintainer_002dmode.html

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-04-08 14:44:14 +02:00
Zdenko Podobný
af037c27e7 rename version.h.in because the filename is too general for distribution 2018-04-02 19:11:02 +02:00
Stefan Weil
6bbfc3b5fc Create version.h from available version information (#1432)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-27 14:32:30 +02:00
Stefan Weil
53a25713ca autoconf: Get version components from PACKAGE_VERSION (#1431)
AX_SPLIT_VERSION only works after AM_INIT_AUTOMAKE, so that macro had
to be moved.

GENERIC_MAJOR_VERSION, GENERIC_MINOR_VERSION and GENERIC_MICRO_VERSION
are now set automatically and can be used in further processing.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-27 09:46:08 +02:00
Stefan Weil
d087f20212 configure: Remove GIT_REV which is no longer used (#1416)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-25 17:20:13 +02:00
Stefan Weil
81c47288a2 configure: Use m4_esyscmd_s to suppress linefeed (fix needed for macOS) (#1401)
While "echo -n" works on Debian GNU/Linux, it fails to produce a valid
configure file on macOS, so try a different, shorter solution.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-18 20:15:14 +01:00
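The underlying issue in the commit above is the trailing linefeed that a command appends to its output. The Python sketch below shows the same idea in a different setting and is not the m4 code from configure.ac: the helper `command_output` is hypothetical, and a child Python process stands in for whatever command prints the version. Stripping the newline from the captured output avoids relying on "echo -n", whose behaviour differs between shells.

```python
import subprocess
import sys


def command_output(argv, strip_newlines):
    """Run a command and return its stdout, optionally without trailing newlines."""
    text = subprocess.run(
        argv, capture_output=True, text=True, check=True
    ).stdout
    return text.rstrip("\n") if strip_newlines else text


# A child Python process stands in for a version-printing command.
child = [sys.executable, "-c", "print('4.0.0')"]

if __name__ == "__main__":
    print(repr(command_output(child, strip_newlines=False)))
    print(repr(command_output(child, strip_newlines=True)))
```

Removing the newline on the consuming side keeps the command itself portable, which is the design choice the commit makes by switching to m4_esyscmd_s.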