AX_CHECK_COMPILE_FLAG fails if it is used with -Werror and the compiler
raises error -Wunused-macros. Add -Wno-unused-macros to disable those
errors if possible.
Simplify also the setting of several conditionals (AVX, AVX2, ...).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
gcc fails if an unsupported compile flag is given, but clang and clang++
normally only emit a warning "argument unused during compilation".
The old test had accepted flags like -mavx for clang++ on non Intel hosts.
This resulted in build failures because Intel code was included.
Now the check runs with -Werror, and unsupported flags are detected as
an error. This fixes the build problem with clang++ on non Intel hosts.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
This requires libarchive-dev.
Tesseract can now load traineddata files in any of the archive formats
which are supported by libarchive. Example of a zipped BagIt archive:
$ unzip -l /usr/local/share/tessdata/zip.traineddata
Archive: /usr/local/share/tessdata/zip.traineddata
Length Date Time Name
--------- ---------- ----- ----
55 2019-03-05 15:27 bagit.txt
0 2019-03-05 15:25 data/
1557 2019-03-05 15:28 manifest-sha256.txt
1082890 2019-03-05 15:25 data/eng.word-dawg
1487588 2019-03-05 15:25 data/eng.lstm
7477 2019-03-05 15:25 data/eng.unicharset
63346 2019-03-05 15:25 data/eng.shapetable
976552 2019-03-05 15:25 data/eng.inttemp
13408 2019-03-05 15:25 data/eng.normproto
4322 2019-03-05 15:25 data/eng.punc-dawg
4738 2019-03-05 15:25 data/eng.lstm-number-dawg
1410 2019-03-05 15:25 data/eng.freq-dawg
844 2019-03-05 15:25 data/eng.pffmtable
6360 2019-03-05 15:25 data/eng.lstm-unicharset
1012 2019-03-05 15:25 data/eng.lstm-recoder
1047 2019-03-05 15:25 data/eng.unicharambigs
4322 2019-03-05 15:25 data/eng.lstm-punc-dawg
16109842 2019-03-05 15:25 data/eng.bigram-dawg
80 2019-03-05 15:25 data/eng.version
6426 2019-03-05 15:25 data/eng.number-dawg
3694794 2019-03-05 15:25 data/eng.lstm-word-dawg
--------- -------
23468070 21 files
`combine_tessdata -d` and `combine_tessdata -u` also work.
The traineddata files in the new format can be generated with
standard tools like zip or tar.
More work is needed for other training tools and big endian support.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
snprintf is a standard function which should be available
on all relevant platforms, so those checks are unnecessary.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It wrongly detects old versions of ICU as valid.
Checking with pkg-config is sufficient and also sets ICU_UC_LIBS.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The first usage of AC_CHECK_HEADERS must be unconditional,
otherwise configure fails to detect support for shared libraries.
This fixes a regression introduced by commit a07025c993.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Remove unneeded arguments for AC_ARG_ENABLE
* Use AS_HELP_STRING
* Use [] instead of () for default in help text
* Run AC_CHECK_HEADERS, AC_CHECK_LIB only if OpenCL support is enabled
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Remove unneeded arguments for AC_ARG_ENABLE
* Fix formatting of help text
* Remove help text for --enable-legacy
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Remove unneeded arguments for AC_ARG_ENABLE
* Use AS_HELP_STRING
* Use [] instead of () for default in help text
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Remove unneeded arguments for AC_ARG_ENABLE (needs renaming of macro)
* Use [] instead of () for default in help text
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Instead of defining the DISABLED_LEGACY_ENGINE macro in config_auto.h
(which is not included by all source files), define it as a preprocessor
option for those parts of the code which require it.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The tesseract/ subdirectory is no longer automatically added to the
include path of the compiler. Therefore old code which used code like
#include "capi.h"
must now change that to
#include "tesseract/capi.h"
This avoids name conflicts with header files from other projects.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Not only version names like 4.0.0, but also version names like
v4.0.0 or tesseract-4.0.0 are now supported and give the same
GENERIC_MAJOR_VERSION = 4.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The newer macros that replace the obsolete ones are already present in configure.ac.
* AC_PROG_LIBTOOL -> LT_INIT
* AC_LANG_CPLUSPLUS -> AC_LANG([C++])
Building with G++ on Darwin breaks when either AVX, AVX2, or SSE4.1
compiler option is set, unless G++ is actually CLANG.
This commit allows to build with G++, by asking G++ to delegate assembly
to the clang integrated assembler, instead of the GNU one.
Commit f9157fd91d changed the rules for
the documentation, so make always tried to build it and failed if
asciidoc was missing since that commit.
Now configure tests whether asciidoc is available and builds the
documentation conditionally. It also reports that to the user.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
That macro disables automated updates when configure.ac or a Makefile.am
changes. Normally those updates are wanted because users typically
forget running ./autogen.sh.
See also the GNU documentation why AM_MAINTAINER_MODE should not be used:
https://www.gnu.org/software/automake/manual/html_node/maintainer_002dmode.html
Signed-off-by: Stefan Weil <sw@weilnetz.de>
AX_SPLIT_VERSION only works after AM_INIT_AUTOMAKE, so that macro had
to be moved.
GENERIC_MAJOR_VERSION, GENERIC_MINOR_VERSION and GENERIC_MICRO_VERSION
are now set automatically and can be used in further processing.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
While "echo -n" works on Debian GNU Linux, it fails to produce a valid
configure file on macOS, so try a different shorter solution.
Signed-off-by: Stefan Weil <sw@weilnetz.de>