Commit 65504c8cd2 misplaced the #endif.
The definition of _GNU_SOURCE is only needed for Cygwin.
Defining _GNU_SOURCE on Linux results in compiler warnings because this
macro is already defined by the compiler.
Fix this by moving the #endif to the right place. In addition, the code
for Cygwin is made more robust: if a future Cygwin compiler also defines
_GNU_SOURCE, the code will still work.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
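
A minimal sketch of the guard pattern described above, assuming the
definition is wrapped in a __CYGWIN__ check (the real file and the code
it protects are not shown here). gnu_source_defined() is a hypothetical
helper added only to make the effect observable.

```cpp
// Sketch only: the surrounding source file is an assumption.
#if defined(__CYGWIN__)
// Only Cygwin needs the explicit definition. The inner #ifndef keeps
// this safe if a future Cygwin compiler predefines the macro itself.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#endif  // __CYGWIN__ (the #endif closes the Cygwin block, nothing more)

// Hypothetical helper: reports whether _GNU_SOURCE is visible in this
// translation unit, whether predefined by the compiler or set above.
constexpr bool gnu_source_defined() {
#ifdef _GNU_SOURCE
  return true;
#else
  return false;
#endif
}
```

On Linux the block defines nothing, so no redefinition warning can occur.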
On msys2, Pango seems to always return an empty string for the suggested
font. It's a good idea to check that the string is not empty before
printing it, on all platforms.
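
The check could look roughly like the following sketch. SuggestionMessage
and PrintSuggestion are hypothetical helpers, and the message wording is
made up for illustration; this is not the actual Tesseract code.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds a hint only when the suggested font name
// is non-null and non-empty, otherwise returns an empty string.
std::string SuggestionMessage(const char* suggested_font) {
  if (suggested_font == nullptr || suggested_font[0] == '\0') {
    return "";
  }
  return std::string("Did you mean '") + suggested_font + "'?";
}

// Prints the hint to stderr, and prints nothing when there is no hint.
void PrintSuggestion(const char* suggested_font) {
  const std::string message = SuggestionMessage(suggested_font);
  if (!message.empty()) {
    std::fprintf(stderr, "%s\n", message.c_str());
  }
}
```

The same guard applies on every platform, so no msys2-specific branch is
needed.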
Quoting macro arguments with [] is not strictly necessary, but is
recommended in the GNU autoconf manual.
Arguments like true or false were left unquoted.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The individual checks each set ENABLE_TRAINING unconditionally,
thus overwriting the value from the preceding checks.
So if pango and cairo were available but icu was missing,
the build would still offer the training tools.
The changes for icu and has_cpp11 are not strictly necessary,
but are made here to have uniform code patterns.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
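
The real checks live in the configure script. The following only models
the logic in C++ to show why the last check used to win. The Deps struct,
both function names, and the order of the checks are assumptions made
for illustration.

```cpp
// Hypothetical model of the dependency checks (not the configure script).
struct Deps {
  bool pango;
  bool cairo;
  bool icu;
};

// Old pattern: every check assigns the flag unconditionally,
// so the result depends only on the last check.
bool EnableTrainingBuggy(const Deps& d) {
  bool enable = true;
  enable = d.icu;    // overwritten below
  enable = d.pango;  // overwritten below
  enable = d.cairo;  // last assignment wins
  return enable;
}

// Fixed pattern: a check may only clear the flag, never set it again.
bool EnableTrainingFixed(const Deps& d) {
  bool enable = true;
  if (!d.icu) enable = false;
  if (!d.pango) enable = false;
  if (!d.cairo) enable = false;
  return enable;
}
```

With pango and cairo present and icu missing, the first function still
returns true, while the second returns false.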
It is common practice for command line programs to print
user-requested information on stdout.
This seems to be reasonable for Tesseract, too.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
It is common practice for command line programs to show help text
on stdout. This seems to be reasonable for Tesseract, too.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
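
A minimal sketch of this convention, assuming a simplified option
handler: requested output such as help or version text goes to stdout,
diagnostics go to stderr. StreamFor, HandleArgument, and the program
name example-tool are hypothetical.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: picks the stream for a given argument.
// Output the user asked for goes to stdout, everything else to stderr.
FILE* StreamFor(const char* arg) {
  if (std::strcmp(arg, "--help") == 0 || std::strcmp(arg, "--version") == 0) {
    return stdout;
  }
  return stderr;
}

// Hypothetical handler: returns 0 for requested output, 1 for an error.
int HandleArgument(const char* arg) {
  FILE* out = StreamFor(arg);
  if (out == stdout) {
    std::fprintf(out, "example-tool: output requested by %s\n", arg);
    return 0;
  }
  std::fprintf(out, "example-tool: unknown option %s\n", arg);
  return 1;
}
```

Writing help text to stdout lets users pipe it into a pager or grep
without redirecting stderr.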
With hocr_char_boxes enabled in hocr output, each grapheme now gets
its own span tag, which holds the character confidence and box
coordinates. Using x_bboxes at the ocrx_word level was
inappropriate, as it was impossible to find which grapheme was
represented by each bounding box.
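
As a rough illustration of one span per grapheme, the sketch below
formats a single character element. The GraphemeBox struct, the CharSpan
function, and the class name ocrx_cinfo with its x_bboxes and x_conf
properties are assumptions for illustration and may not match the
renderer's exact output. HTML escaping of the text is omitted for
brevity.

```cpp
#include <sstream>
#include <string>

// Hypothetical record for one grapheme; field names are illustrative.
struct GraphemeBox {
  std::string text;  // the grapheme itself, assumed already HTML-safe
  int left;
  int top;
  int right;
  int bottom;
  int confidence;    // character confidence, rounded to an integer here
};

// Formats one span that carries the box and confidence of one grapheme.
std::string CharSpan(const GraphemeBox& g) {
  std::ostringstream out;
  out << "<span class='ocrx_cinfo' title='x_bboxes " << g.left << ' '
      << g.top << ' ' << g.right << ' ' << g.bottom << "; x_conf "
      << g.confidence << "'>" << g.text << "</span>";
  return out.str();
}
```

Because the text sits inside the span that holds its coordinates, a
consumer no longer has to guess which box belongs to which grapheme.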
Add the 'hocr_char_boxes' configuration option (off by default),
which enables printing the bounding boxes of each character in the
x_bboxes property of an ocrx_word element in hocr output.
As pointed out by Stefan Weil, conditionally defining off_t using a
macro isn't a valid approach. off_t does not have a fixed size and is
used in ABI definitions (e.g. syscalls), so silently guessing its size
risks breaking the build. Additionally, all sane and modern platforms
will have off_t.
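
A small sketch of the alternative: include the POSIX header that
declares off_t and let the build fail loudly if it is missing, rather
than guessing a size. OffTBits is a hypothetical helper used only to
inspect the platform's choice.

```cpp
#include <climits>      // CHAR_BIT
#include <sys/types.h>  // POSIX header that declares off_t
#include <type_traits>

// Use the platform's off_t directly. If the header or the type were
// missing, compilation would stop here instead of silently continuing
// with a guessed size.
static_assert(std::is_signed<off_t>::value,
              "POSIX requires off_t to be a signed integer type");

// Hypothetical helper: width of off_t in bits on this platform.
constexpr int OffTBits() {
  return static_cast<int>(sizeof(off_t) * CHAR_BIT);
}
```

The width differs between platforms and build flags, which is why a
fixed fallback definition cannot be correct everywhere.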