Commit Graph

19 Commits

Author SHA1 Message Date
Zdenko Podobný
5d22fdfeed replace deprecated C++ headers (reported by clang-tidy) - partially supersedes PR #1605 2018-09-18 18:51:11 +02:00
Stefan Weil
be1393b1e8 Replace macro MINGW by __MINGW32__
MINGW is no longer used and now removed from configure.ac.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-04 16:05:27 +02:00
Noah Metzger
663be426f6 Added the option for character accumulated glyph confidences.
The parameter glyph_confidences is changed from bool to int.
An execution with value 1 outputs the hOCR file enriched with glyph confidences
for every timestep like before. An execution with value 2 outputs the timesteps
accumulated over the recognized characters.

Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-08-20 10:43:58 +02:00
Noah Metzger
91c7504a35 Added a feature to enrich the hOCR output with glyph confidences
By using the parameter -c glyph_confidences=true the user is able to enrich
the hOCR output with additional information. Tesseract then lists additionally
the timesteps with all glyphs that were considered with their confidence
for every timestep of the LSTM.

The format of the hOCR output is slightly changed: There is now a linebreak
after every word for better readability by humans.

Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-07-25 18:18:58 +02:00
Stefan Weil
55f0ca5842 Add missing include statements and clean some include statements
The changes are based on an analysis done with include-what-you-use.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-07 16:24:53 +02:00
Stefan Weil
d2febafdcd Fix compiler warnings [-Wmissing-prototypes]
Add missing include statements, add missing "static" qualifiers or
remove functions which are not used at all.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-05 16:03:02 +02:00
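
The two remedies named in this commit (a declaration via an include, or a "static" qualifier) can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the tesseract sources. A non-static function defined without a prior declaration is what -Wmissing-prototypes reports.

```cpp
#include <algorithm>

// Declaration that would normally come from an included header
// (hypothetical name). With it in scope, the definition below has a
// prior prototype.
int ScaleToByte(int value, int factor);

// Helper used only in this file. "static" gives it internal linkage, so
// no separate declaration is needed.
static int ClampToByte(int value) {
  return std::min(255, std::max(0, value));
}

// Externally visible function, matching the declaration above.
int ScaleToByte(int value, int factor) {
  return ClampToByte(value * factor);
}
```

Marking file-local helpers "static" also prevents accidental name clashes between translation units.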
Stefan Weil
a74d467e90 Fix compiler warnings [-Wcomma]
clang warnings:

src/api/baseapi.cpp:1642:18: warning:
 possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1642:31: warning:
 possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1642:45: warning:
 possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1652:16: warning:
 possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1652:30: warning:
 possible misuse of comma operator here [-Wcomma]
src/api/baseapi.cpp:1662:17: warning:
 possible misuse of comma operator here [-Wcomma]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-05 12:07:04 +02:00
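
For readers unfamiliar with this warning, the following is a minimal sketch of the general pattern and of one common fix. It is an illustration only and does not reproduce the code in src/api/baseapi.cpp. The function names are hypothetical.

```cpp
// Comma form: legal C++, but two separate actions are joined into one
// expression by the comma operator. This is the kind of code that
// clang's -Wcomma is meant to point out.
int SumWithComma(int a, int b) {
  int x = 0;
  int y = 0;
  x = a, y = b;
  return x + y;
}

// Statement form: same behavior, one action per statement.
int SumWithStatements(int a, int b) {
  int x = 0;
  int y = 0;
  x = a;
  y = b;
  return x + y;
}
```

Both functions return the same value. Only the second keeps each assignment in its own statement.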
Amit D
62c7b796da Merge branch 'master' into disable-legacy 2018-07-04 11:14:33 +03:00
amitdo
aa9f4b4861 Add an option to compile tesseract without the code of the legacy OCR engine 2018-07-03 18:49:42 +03:00
Stefan Weil
f7b61891bc Replace macro PI by macro M_PI
One definition for pi is sufficient.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-02 21:26:53 +02:00
Stefan Weil
e8e94d372c Fix CID 1340287 (Unchecked return value)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-01 07:54:11 +02:00
Stefan Weil
a49b8f1d21 Fix CID 1297960 (Dereference after null check)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-01 07:54:11 +02:00
Stefan Weil
86eb4dfcdc Fix CID 1164646 (Uninitialized pointer field)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-01 07:54:11 +02:00
Stefan Weil
a32d24fa65 Remove empty tessbox.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-24 19:45:12 +02:00
Stefan Weil
1371980f9f Replace string.h by standard C++ cstring
Remove the unneeded include statement in platform.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-21 20:40:26 +02:00
Stefan Weil
27a5908a55 Fix CID 1393239 (Dereference null return value)
Add also some error handling if fopen fails.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 21:17:02 +02:00
Stefan Weil
3292484f67 Test for correct locale settings
Normal C++ programs like those which are built for tesseract automatically
set the locale "C".

There can be different locale settings if the tesseract library is used
in other software.

A wrong locale can cause wrong results from sscanf which is used at
different places in the tesseract code, so make sure that we have the
right locale settings and fail if that is not the case.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-08 17:40:10 +02:00
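
A minimal sketch of such a check, assuming that it is enough to query the LC_NUMERIC and LC_CTYPE categories with std::setlocale. The helper names are hypothetical, and the actual test in tesseract may look different.

```cpp
#include <clocale>
#include <cstdio>
#include <cstring>

// Hypothetical helper: true if the given locale category is "C" (or its
// alias "POSIX"). Passing nullptr only queries the current setting.
static bool CategoryIsC(int category) {
  const char* name = std::setlocale(category, nullptr);
  return name != nullptr &&
         (std::strcmp(name, "C") == 0 || std::strcmp(name, "POSIX") == 0);
}

// sscanf("%lf") uses the decimal separator of LC_NUMERIC, so "3.5" is
// only parsed as expected if that category is "C".
bool LocaleIsSafeForSscanf() {
  return CategoryIsC(LC_NUMERIC) && CategoryIsC(LC_CTYPE);
}

// Refuses to parse when the locale is not the expected one, instead of
// silently returning a wrong number.
bool ParseDouble(const char* text, double* value) {
  if (!LocaleIsSafeForSscanf()) {
    return false;
  }
  return std::sscanf(text, "%lf", value) == 1;
}
```

A host application that changed the locale could restore it with std::setlocale(LC_ALL, "C") before calling into the library.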
Alexander Zaitsev
d54d7486b4 Use std::max/std::min instead of MAX/MIN macros. 2018-05-20 17:49:48 +03:00
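
One common reason for this kind of replacement is that a function-like macro may evaluate an argument twice. The sketch below assumes a typical macro definition. The macros that were removed may have been written differently, and MAX_MACRO and the function names are hypothetical.

```cpp
#include <algorithm>

// Typical function-like macro (assumed form, for illustration only).
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// Returns how often the counter is incremented when the macro is used:
// the winning argument is evaluated in the comparison and again as the
// result.
int IncrementsWithMacro() {
  int counter = 0;
  int result = MAX_MACRO(++counter, 0);
  (void)result;
  return counter;
}

// With std::max every argument is evaluated exactly once.
int IncrementsWithStdMax() {
  int counter = 0;
  int result = std::max(++counter, 0);
  (void)result;
  return counter;
}
```

std::max and std::min are also type checked, so a call with mismatched argument types fails to compile instead of converting silently.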
Egor Pugin
e95ff1159e Move sources into src dir. Update build scripts. 2018-04-25 11:02:54 +03:00