It looks like those files were added accidentally
in commit fc6a390c6c.
Add them to .gitignore to avoid that from now on.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Restore support for the legacy engine
It is still needed to get text attributes which are unsupported by the
LSTM engine, and it also has better recognition rates for some texts.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* tesseractmain: Add missing 'static' attributes
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Support different help texts for normal and advanced users
The old option --help now shows a very basic help text.
The new option --help-extra shows the full help information.
It now also includes a hint that Tesseract supports lists of images.
Also fix the indentation in the PSM help and
use more neutral wording in the OEM help.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Add missing line feed in error message
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Dereference pointer after NULL check (CID 1385638)
Move the statement which dereferences the pointer variable "current"
after the NULL check.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Dereference pointer after NULL check (CID 1385635)
Move the statement which dereferences the pointer variable "current"
after the NULL check.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
* Dereference pointer after NULL check (CID 1385634)
Move the statement which dereferences the pointer variable "current"
after the NULL check.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
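
The pattern behind the three Coverity fixes above can be sketched as follows. This is a minimal illustration, not the Tesseract code: the `Node` type and the `ValueOrDefault` function are hypothetical, and only the ordering of the NULL check and the dereference mirrors the change described.

```cpp
#include <cstddef>

// Hypothetical list node, used only for illustration.
struct Node {
  int value;
  Node* next;
};

// Returns the value stored in `current`, or `fallback` if `current` is NULL.
int ValueOrDefault(const Node* current, int fallback) {
  // Problematic order (the kind of code Coverity reports):
  //   int value = current->value;        // dereference first
  //   if (current == nullptr) { ... }    // check comes too late
  if (current == nullptr) {
    return fallback;
  }
  // Safe: the dereference happens only after the NULL check.
  return current->value;
}
```

Calling `ValueOrDefault(nullptr, -1)` returns the fallback instead of dereferencing a NULL pointer.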
* Fix CID 1164527 'Constant' variable guards dead code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Since commit cdc35338c5 Tesseract checks
the value passed for `--oem NUM`.
That only works as expected when the old (now unused) engine mode values
for cube are removed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
ccstruct/seam.cpp:66:26: warning:
array subscript has type 'char' [-Wchar-subscripts]
Fix it by using an unsigned index and use the same type for related values.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
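
A short sketch of why a 'char' subscript is risky and how an unsigned index avoids the problem. The function `CountBytes` is hypothetical and is not the code from ccstruct/seam.cpp; it only demonstrates the conversion to an unsigned type before indexing.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Counts how often each byte value occurs in `text`.
std::array<int, 256> CountBytes(const std::string& text) {
  std::array<int, 256> counts{};  // value-initialized to zero
  for (char c : text) {
    // Indexing with `c` directly is the pattern -Wchar-subscripts warns
    // about for built-in arrays: where char is signed, bytes >= 0x80
    // become negative indices.
    const unsigned char index = static_cast<unsigned char>(c);
    ++counts[index];
  }
  return counts;
}
```

Converting to `unsigned char` first keeps every index in the range 0..255 regardless of whether `char` is signed on the target platform.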
Compilation fails with Visual Studio 2015 RTM:
Error C2039: 'back_inserter': is not a member of 'std'
Error C3861: 'back_inserter': identifier not found
Visual Studio 2015 (vc14) needs the <iterator> header:
#include <iterator>
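
For illustration, a minimal use of std::back_inserter with the required header included explicitly. The function `CopyPositive` is a hypothetical example and not taken from the Tesseract sources. Some standard library implementations pull in <iterator> indirectly through other headers, which is presumably why the missing include went unnoticed elsewhere.

```cpp
#include <algorithm>
#include <iterator>  // declares std::back_inserter
#include <vector>

// Returns a copy of all strictly positive elements of `input`.
std::vector<int> CopyPositive(const std::vector<int>& input) {
  std::vector<int> output;
  // back_inserter appends to `output` through push_back.
  std::copy_if(input.begin(), input.end(), std::back_inserter(output),
               [](int v) { return v > 0; });
  return output;
}
```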
ftell returns a long value which is negative if an error occurred.
It returns LONG_MAX for directories.
Both cases were not handled by the old code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
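
A hedged sketch of how both cases can be handled. The helper `GetFileSize` is hypothetical and not the function changed by this commit; it assumes that a result of LONG_MAX should be treated as "not a regular file", as described above.

```cpp
#include <climits>
#include <cstdio>

// Stores the size of the open file `fp` in `*size` and returns true,
// or returns false if the size cannot be determined.
bool GetFileSize(std::FILE* fp, long* size) {
  if (fp == nullptr || std::fseek(fp, 0, SEEK_END) != 0) {
    return false;
  }
  const long pos = std::ftell(fp);
  std::rewind(fp);  // restore the read position for the caller
  // A negative value signals an error; LONG_MAX is what ftell was
  // observed to return for directories (assumption from the text above).
  if (pos < 0 || pos == LONG_MAX) {
    return false;
  }
  *size = pos;
  return true;
}
```

Callers can then reject the input before allocating a buffer of a bogus size.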
If unicharset_extractor was called without any argument,
a help message was printed by tesseract::ParseCommandLineFlags.
Replace it with the local help message, which is better.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Now Tesseract adds a page break (normally form feed) by default.
It is still possible to suppress page breaks by setting an empty
page_separator.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The test expects to find phototest.tif and phototest.txt
in directory ../testing. Create symbolic links if those
files don't exist there.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
We cannot assume that the locale "en_US.UTF-8" is always available.
Using the "C" locale should work better.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
The library is provided in the build path (which is not
the same as the source path for out of tree builds).
Signed-off-by: Stefan Weil <sw@weilnetz.de>