Stefan Weil
a32d24fa65
Remove empty tessbox.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-24 19:45:12 +02:00
Stefan Weil
91522dfba5
Remove memry.h from public API
...
It is no longer needed by genericvector.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-23 21:15:54 +02:00
Stefan Weil
1a151781ea
Clean some include statements
...
The changes are based on an analysis done with include-what-you-use.
Also replace some standard C header files with the corresponding
standard C++ header files.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-23 21:15:54 +02:00
Egor Pugin
15f64e0232
Remove recursive header.
2018-06-23 17:32:42 +03:00
Stefan Weil
484a1be98a
Remove unneeded include statements for scanutils.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-22 19:16:08 +02:00
Stefan Weil
11f2b12fda
Remove arch header files from public API
...
The arch header files are only used in the Tesseract code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-21 21:46:48 +02:00
Stefan Weil
2bafff4c64
Remove LSTM header files from public API
...
The LSTM header files are only used in the Tesseract code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-21 21:46:48 +02:00
Stefan Weil
1371980f9f
Replace string.h by standard C++ cstring
...
Remove the unneeded include statement in platform.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-21 20:40:26 +02:00
Stefan Weil
112aeb9826
Clean usage of assert.h
...
Remove unneeded include statements, remove conditional statements and
replace the remaining assert.h includes with the standard C++ variant cassert.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-21 19:31:05 +02:00
Stefan Weil
a9e2574eff
Remove public API file ndminx.h
...
It is not needed for the Tesseract code, and the Tesseract API
should not provide MIN / MAX macros.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-21 08:33:30 +02:00
Stefan Weil
0cb128d56b
Remove errcode.h from public API
...
It is no longer needed by genericvector.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-21 06:20:26 +02:00
Stefan Weil
44450094c3
Replace ASSERT_HOST in genericvector.h
...
genericvector.h used a mix of assert and ASSERT_HOST.
By using assert only, it no longer depends on errcode.h
which defines the ASSERT_HOST macro.
Other files which still use ASSERT_HOST now need an explicit
include statement for errcode.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 22:32:17 +02:00
Stefan Weil
2a5a092469
Fix CID 1393241 (Dereference null return value)
...
Also add some error handling if fopen fails.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 21:17:02 +02:00
Stefan Weil
09976e6125
Fix CID 1393238 (Dereference null return value)
...
Also add some error handling if fopen fails.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 21:17:02 +02:00
Stefan Weil
27a5908a55
Fix CID 1393239 (Dereference null return value)
...
Also add some error handling if fopen fails.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 21:17:02 +02:00
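The three "Dereference null return value" fixes above all concern an unchecked fopen result. A minimal sketch of that kind of check follows; `WriteGreeting` is a hypothetical helper for illustration, not a function from the Tesseract sources, and the error text is an assumption.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not Tesseract code): writes one line to `filename`.
// Returns false and reports the reason instead of using a null FILE pointer.
bool WriteGreeting(const char* filename) {
  std::FILE* fp = std::fopen(filename, "w");
  if (fp == nullptr) {
    // fopen failed, so fp must not be passed to fputs or fclose.
    std::fprintf(stderr, "Error, could not open %s: %s\n", filename,
                 std::strerror(errno));
    return false;
  }
  const bool written = std::fputs("hello\n", fp) >= 0;
  // fclose can also fail (for example on a full disk), so check it too.
  return (std::fclose(fp) == 0) && written;
}
```

A caller would test the return value and stop early on failure rather than continue with a file that was never opened.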
Stefan Weil
f482ebdca1
Fix CID 1393243 (Uninitialized scalar field)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 20:06:28 +02:00
Stefan Weil
2ceb200186
Fix CID 1393244 and CID 1393244 (Uninitialized scalar variable)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 19:28:04 +02:00
Stefan Weil
d6391ee811
Fix CID 1393540 (Explicit null dereferenced)
...
Coverity Scan does not like incrementing a null pointer,
so increment an index value instead of a pointer.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 17:32:02 +02:00
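A minimal sketch of the pattern described in d6391ee811, assuming a hypothetical helper `SumWithIndex` (not a Tesseract function): the loop advances an index instead of the pointer, so a null `data` combined with `count == 0` is never incremented or dereferenced.

```cpp
#include <cstddef>

// Hypothetical helper (not Tesseract code): sums `count` values from `data`.
// `data` may be nullptr as long as `count` is 0.
int SumWithIndex(const int* data, std::size_t count) {
  int sum = 0;
  // Increment the index, not the pointer: `data` itself is never modified,
  // and it is only dereferenced when i < count.
  for (std::size_t i = 0; i < count; ++i) {
    sum += data[i];
  }
  return sum;
}
```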
Stefan Weil
e87e8967d7
Remove more header files from public API
...
Install only those headers which are needed by third party applications.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-20 11:54:38 +02:00
Stefan Weil
c1c87d73ee
Require tesseract/ for API header files (fixes potential name conflicts)
...
The tesseract/ subdirectory is no longer automatically added to the
include path of the compiler. Therefore old code which used code like
#include "capi.h"
must now change that to
#include "tesseract/capi.h"
This avoids name conflicts with header files from other projects.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-17 22:01:19 +02:00
Amit D
6f85de22bc
WordFontAttributes: Check that word != nullptr earlier. Fix #1665
2018-06-13 23:38:27 +03:00
Egor Pugin
8b64602a86
Merge pull request #1660 from Shreeshrii/master
...
Change default width for images output by text2image
2018-06-11 14:23:22 +03:00
Shreeshrii
a27e91c4f9
Update tesstrain_utils.sh
2018-06-11 09:35:14 +05:30
Shreeshrii
fdc243b363
Change default width for images output by text2image
...
Fixes the following error:
Image too large to learn!! Size = 2594x48
Image not trainable
See https://github.com/tesseract-ocr/tesseract/issues/590#issuecomment-271244655
for related discussion
2018-06-11 09:34:07 +05:30
Stefan Weil
fcdcba70f4
Remove some header files from public API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-10 16:19:58 +02:00
Stefan Weil
5812972775
block_edges: Add assertions for block coordinates
...
Check whether the top right point of the block is inside of the
thresholded image t_pix. Otherwise the following code would make
illegal memory accesses.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-09 14:06:33 +02:00
Egor Pugin
cd58a861d9
Merge pull request #1653 from stweil/typo
...
scanutils: Fix typos in comments
2018-06-09 11:00:22 +03:00
Stefan Weil
a709018e94
capi: Fix regression caused by use of bool data type
...
Commit 87d33b6c9e added code which uses bool.
Therefore stdbool.h must be included for compilations with a C compiler.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-09 08:45:45 +02:00
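A sketch of how a header shared between C and C++ can make `bool` available to a C compiler, as a709018e94 describes. The function `ExampleIsPositive` is hypothetical and only illustrates the pattern; it is not part of capi.h.

```cpp
#ifdef __cplusplus
extern "C" {
#else
#include <stdbool.h>  /* C99: provides bool, true and false */
#endif

/* Hypothetical C API function, for illustration only. */
bool ExampleIsPositive(int value);

#ifdef __cplusplus
}
#endif

/* The definition would normally live in a separate source file. */
bool ExampleIsPositive(int value) { return value > 0; }
```

In C++ `bool` is a built-in type, so the include is only needed on the C branch.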
Stefan Weil
02277bed34
scanutils: Fix typos in comments
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-09 07:53:20 +02:00
zdenop
e7c1e0739c
Merge pull request #1649 from stweil/locale
...
Test for correct locale settings
2018-06-08 19:02:38 +02:00
Stefan Weil
3292484f67
Test for correct locale settings
...
Normal C++ programs like those which are built for tesseract automatically
set the locale "C".
There can be different locale settings if the tesseract library is used
in other software.
A wrong locale can cause wrong results from sscanf which is used at
different places in the tesseract code, so make sure that we have the
right locale settings and fail if that is not the case.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-08 17:40:10 +02:00
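A minimal sketch of a locale check in the spirit of 3292484f67. The helper names and the choice of categories (LC_NUMERIC and LC_CTYPE) are assumptions made for illustration; the commit message does not say which categories Tesseract actually tests.

```cpp
#include <clocale>
#include <cstring>

// Hypothetical helper: true if the given locale category is currently "C".
// setlocale with a null locale argument only queries the current setting.
bool CategoryIsC(int category) {
  const char* name = std::setlocale(category, nullptr);
  return name != nullptr && std::strcmp(name, "C") == 0;
}

// Hypothetical helper: true if number parsing with sscanf can be expected
// to use '.' as the decimal point and plain ASCII character classes.
bool LocaleIsSafeForScanf() {
  return CategoryIsC(LC_NUMERIC) && CategoryIsC(LC_CTYPE);
}
```

A library entry point could call `LocaleIsSafeForScanf()` and report an error when it returns false, because a host application may have changed the locale before calling into the library.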
Stefan Weil
280db06bbf
scanutils: Fix illegal memory access
...
Format strings which contain "%*s" show this error in Valgrind:
==32503== Conditional jump or move depends on uninitialised value(s)
==32503== at 0x2B8BB0: tvfscanf(_IO_FILE*, char const*, __va_list_tag*) (scanutils.cpp:486)
==32503== by 0x2B825A: tfscanf(_IO_FILE*, char const*, ...) (scanutils.cpp:234)
==32503== by 0x272B01: read_unlv_file(STRING, int, int, BLOCK_LIST*) (blread.cpp:54)
==32503== by 0x1753CD: tesseract::Tesseract::SegmentPage(STRING const*, BLOCK_LIST*, tesseract::Tesseract*, OSResults*) (pagesegmain.cpp:115)
==32503== by 0x1363CD: tesseract::TessBaseAPI::FindLines() (baseapi.cpp:2291)
==32503== by 0x130CF1: tesseract::TessBaseAPI::Recognize(ETEXT_DESC*) (baseapi.cpp:802)
==32503== by 0x1322D3: tesseract::TessBaseAPI::ProcessPage(Pix*, int, char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1176)
==32503== by 0x131A84: tesseract::TessBaseAPI::ProcessPagesMultipageTiff(unsigned char const*, unsigned long, char const*, char const*, int, tesseract::TessResultRenderer*, int) (baseapi.cpp:1013)
==32503== by 0x132052: tesseract::TessBaseAPI::ProcessPagesInternal(char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1129)
==32503== by 0x131B1E: tesseract::TessBaseAPI::ProcessPages(char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1032)
==32503== by 0x12E00C: main (tesseractmain.cpp:537)
==32503== Uninitialised value was created by a stack allocation
==32503== at 0x272A60: read_unlv_file(STRING, int, int, BLOCK_LIST*) (blread.cpp:41)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-08 15:28:30 +02:00
zdenop
d47cebcdc8
Merge pull request #1641 from stweil/fix
...
training: Add missing linefeed to error message
2018-06-06 22:13:26 +02:00
Stefan Weil
0215d91f45
training: Add missing linefeed to error message
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-06 21:32:16 +02:00
zdenop
ee2ab73224
Merge pull request #1637 from paulk124/master
...
Reserve extra byte in LoadDataFromFile() in case caller wants to appe…
2018-06-05 16:57:40 +02:00
Paul Kitchen
805fb7699d
Reserve extra byte in LoadDataFromFile() in case caller wants to append '\0'
2018-06-05 08:19:41 -06:00
Stefan Weil
52fddc3ca9
TFile: Relax assertion and allow FRead, FWrite with count == 0
...
The assertions introduced by commit 8bea6bcc12 were too strict.
The first one failed in osd_test, the second one failed
in `tesseract IMAGE BASE --psm 13 lstm.train`.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-04 22:42:19 +02:00
Egor Pugin
83ae900549
Merge pull request #1629 from stweil/bool
...
src/training: Replace more proprietary BOOL8 by standard bool data type
2018-06-04 18:54:31 +03:00
Stefan Weil
4f3b266efe
src/training: Replace more proprietary BOOL8 by standard bool data type
...
Also update callers of the modified functions to use
false / true instead of 0 / 1.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-04 16:08:03 +02:00
Stefan Weil
b292013bdc
cntraining: Replace proprietary BOOL8 by standard bool data type
...
Also add "static" attribute to local functions and remove an old comment.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-04 16:08:03 +02:00
Stefan Weil
8bea6bcc12
TFile: Improve handling of potential integer overflow
...
Raise an assertion for unexpected arguments and use size_t instead of int
for the size argument which is typically sizeof(some_datatype).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-04 13:53:36 +02:00
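A sketch of an fread-style function with `size_t` arguments, an assertion on unexpected arguments, and an explicit overflow guard, loosely following the description in 8bea6bcc12. `MemReader` and `ReadItems` are hypothetical names for this example and do not reproduce the real TFile class; treating `count == 0` as a valid request that reads nothing is also an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <limits>

// Hypothetical in-memory reader; the names are illustrative, not Tesseract's.
struct MemReader {
  const char* data;    // start of the buffer
  std::size_t size;    // total number of bytes in the buffer
  std::size_t offset;  // current read position
};

// Reads up to `count` items of `size` bytes each into `buffer` and returns
// the number of complete items that were copied.
std::size_t ReadItems(MemReader* reader, void* buffer, std::size_t size,
                      std::size_t count) {
  assert(reader != nullptr && size > 0);
  assert(reader->offset <= reader->size);
  if (count == 0) return 0;  // nothing requested, nothing read
  // Refuse requests where size * count would overflow size_t.
  if (count > std::numeric_limits<std::size_t>::max() / size) return 0;
  const std::size_t available = reader->size - reader->offset;
  std::size_t items = count;
  if (size * count > available) items = available / size;
  const std::size_t bytes = items * size;
  if (bytes > 0) {
    assert(buffer != nullptr);
    std::memcpy(buffer, reader->data + reader->offset, bytes);
    reader->offset += bytes;
  }
  return items;
}
```

Dividing the maximum value by `size` before multiplying keeps the check itself free of overflow.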
Stefan Weil
f2698c256d
src/training: Replace proprietary BOOL8 by standard bool data type
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-03 21:13:40 +02:00
Stefan Weil
629ded223c
tesseractmain: Allow combinations of the different help options
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-02 09:03:56 +02:00
Stefan Weil
724a72a278
tesseractmain: Always use EXIT_SUCCESS and EXIT_FAILURE macros for exit status
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-02 09:03:56 +02:00
Stefan Weil
b5ac8502bc
tesseractmain: EXIT_FAILURE if tesseract is called without arguments
...
When Tesseract is called without any argument, the help message is still
printed, but the exit status no longer indicates success (EXIT_SUCCESS).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-02 09:03:56 +02:00
Stefan Weil
6dba34dd8c
tesseractmain: No command line options between image and outputbase
...
The image name and the outputbase should not be separated by
command line options.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-02 09:03:56 +02:00
zdenop
e313ed1bb9
Merge pull request #1614 from j-kubik/master
...
Recognition progress in C API
2018-06-02 08:54:21 +02:00
Stefan Weil
6f7206f574
tesseractmain: Remove unneeded duplicate code
...
The --list-langs option is already handled by other code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-01 20:45:53 +02:00
Stefan Weil
d4ed0f841a
tesseractmain: Fail if bad command line option is given
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-01 20:04:35 +02:00
Jaroslaw Kubik
e6c9967b83
Fixed a typo in progress monitor C API
...
TessMonitorcDelete -> TessMonitorDelete
2018-06-01 19:42:28 +02:00