Commit Graph

3161 Commits

Author SHA1 Message Date
Stefan Weil
3292484f67 Test for correct locale settings
Normal C++ programs like those which are built for tesseract automatically
set the locale "C".

There can be different locale settings if the tesseract library is used
in other software.

A wrong locale can cause wrong results from sscanf which is used at
different places in the tesseract code, so make sure that we have the
right locale settings and fail if that is not the case.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-08 17:40:10 +02:00
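A minimal sketch of the kind of check this commit describes, not the actual Tesseract code. `LocaleIsC` and `ParseFloatChecked` are hypothetical helpers; the only library behavior relied on is that the standard `setlocale(category, nullptr)` call queries the current locale without changing it.

```cpp
#include <clocale>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not part of Tesseract): true if the given locale
// category is currently "C" or "POSIX".
bool LocaleIsC(int category) {
  // Passing nullptr queries the current locale without changing it.
  const char* name = std::setlocale(category, nullptr);
  return name != nullptr &&
         (std::strcmp(name, "C") == 0 || std::strcmp(name, "POSIX") == 0);
}

// Hypothetical helper: parses a decimal number with sscanf, but refuses to
// run under a locale whose decimal separator might not be '.'.
bool ParseFloatChecked(const char* text, float* value) {
  if (!LocaleIsC(LC_NUMERIC)) {
    std::fprintf(stderr, "Error: LC_NUMERIC must be \"C\"\n");
    return false;
  }
  return std::sscanf(text, "%f", value) == 1;
}
```

Failing early is the point of the sketch: a host application that has switched to a locale with a comma as decimal separator would otherwise get silently wrong numbers from `sscanf`.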
Shree Devi Kumar
ea7f4801ed add option for UNLV tests for spa 2018-06-08 14:28:50 +00:00
Stefan Weil
280db06bbf scanutils: Fix illegal memory access
Format strings which contain "%*s" show this error in Valgrind:

==32503== Conditional jump or move depends on uninitialised value(s)
==32503==    at 0x2B8BB0: tvfscanf(_IO_FILE*, char const*, __va_list_tag*) (scanutils.cpp:486)
==32503==    by 0x2B825A: tfscanf(_IO_FILE*, char const*, ...) (scanutils.cpp:234)
==32503==    by 0x272B01: read_unlv_file(STRING, int, int, BLOCK_LIST*) (blread.cpp:54)
==32503==    by 0x1753CD: tesseract::Tesseract::SegmentPage(STRING const*, BLOCK_LIST*, tesseract::Tesseract*, OSResults*) (pagesegmain.cpp:115)
==32503==    by 0x1363CD: tesseract::TessBaseAPI::FindLines() (baseapi.cpp:2291)
==32503==    by 0x130CF1: tesseract::TessBaseAPI::Recognize(ETEXT_DESC*) (baseapi.cpp:802)
==32503==    by 0x1322D3: tesseract::TessBaseAPI::ProcessPage(Pix*, int, char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1176)
==32503==    by 0x131A84: tesseract::TessBaseAPI::ProcessPagesMultipageTiff(unsigned char const*, unsigned long, char const*, char const*, int, tesseract::TessResultRenderer*, int) (baseapi.cpp:1013)
==32503==    by 0x132052: tesseract::TessBaseAPI::ProcessPagesInternal(char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1129)
==32503==    by 0x131B1E: tesseract::TessBaseAPI::ProcessPages(char const*, char const*, int, tesseract::TessResultRenderer*) (baseapi.cpp:1032)
==32503==    by 0x12E00C: main (tesseractmain.cpp:537)
==32503==  Uninitialised value was created by a stack allocation
==32503==    at 0x272A60: read_unlv_file(STRING, int, int, BLOCK_LIST*) (blread.cpp:41)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-08 15:28:30 +02:00
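For readers unfamiliar with the format string involved, here is a small sketch of what "%*s" means in the standard scanf family. It uses the C library's `sscanf`, not Tesseract's own `tfscanf`, and `ReadIntAfterWord` is a hypothetical name chosen for illustration.

```cpp
#include <cstdio>

// Hypothetical helper: skips the first whitespace-delimited word and reads
// the integer that follows it.
bool ReadIntAfterWord(const char* line, int* value) {
  // "%*s" consumes a word but stores nothing and takes no argument, so only
  // the "%d" conversion counts toward the return value.
  return std::sscanf(line, "%*s %d", value) == 1;
}
```

Because a suppressed conversion has no destination argument, an implementation of the scanf family must not read or write an argument for it.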
zdenop
a6623065fe
Merge pull request #1645 from Shreeshrii/unlvtests
reformat EXTRA_DIST in makefile
2018-06-08 12:59:33 +02:00
zdenop
417893db71
Merge pull request #1646 from Shreeshrii/master
remove generated files committed by error
2018-06-08 12:58:31 +02:00
zdenop
a514765865
Merge pull request #1644 from Shreeshrii/gitignore
add /src/ to api, training and vs2010 in gitignore
2018-06-08 12:57:02 +02:00
Shreeshrii
68c6b42853
modify to avoid line continuations 2018-06-08 15:17:05 +05:30
Shreeshrii
df09d0db28
delete lines relating to vs2010 2018-06-08 15:12:27 +05:30
Shree Devi Kumar
1b6815364a remove generated files committed by error 2018-06-08 08:59:20 +00:00
Shree Devi Kumar
7a9fef9685 reformat EXTRA_DIST in makefile 2018-06-08 08:32:47 +00:00
zdenop
29304a4173
Merge pull request #1642 from stweil/unlvtests
Remove some files which are generated by the UNLV test
2018-06-08 08:42:14 +02:00
zdenop
1309749a15
Merge pull request #1643 from Shreeshrii/unlv
move unlvtests ignored files to .gitignore in root dir
2018-06-08 08:40:40 +02:00
Shree Devi Kumar
f2caeb43b4 add /src/ to api, training and vs2010 2018-06-08 05:34:47 +00:00
Shree Devi Kumar
828d727135 move unlvtests ignored files to .gitignore in root dir 2018-06-08 05:29:44 +00:00
Stefan Weil
fcaf192ea3 Remove some files which are generated by the UNLV test
They don't contain useful information.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-07 20:22:08 +02:00
zdenop
51ebf8a21e
Merge pull request #1640 from stweil/unlvtests
Fix script for UNLV tests
2018-06-07 07:20:54 +02:00
Stefan Weil
bbb4658733 Fix log message of UNLV tests
We must filter unwanted output from tesseract.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-07 07:18:55 +02:00
Stefan Weil
ff3b263c5b Fix script for UNLV tests
Commit 934e612a3e added too many quotes.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-07 07:12:08 +02:00
zdenop
d47cebcdc8
Merge pull request #1641 from stweil/fix
training: Add missing linefeed to error message
2018-06-06 22:13:26 +02:00
Stefan Weil
0215d91f45 training: Add missing linefeed to error message
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-06 21:32:16 +02:00
zdenop
8b3501e54f
Merge pull request #1638 from Shreeshrii/master
remove testing and testdata, use from submodule test, add unlvtests
2018-06-06 14:51:59 +02:00
Shree Devi Kumar
deea045e4a Merge branch 'master' of https://github.com/tesseract-ocr/tesseract 2018-06-06 12:26:20 +00:00
Shree Devi Kumar
2563380d51 move testing and testdata to test, add unlvtests 2018-06-06 12:20:14 +00:00
zdenop
ee2ab73224
Merge pull request #1637 from paulk124/master
Reserve extra byte in LoadDataFromFile() in case caller wants to appe…
2018-06-05 16:57:40 +02:00
Paul Kitchen
805fb7699d Reserve extra byte in LoadDataFromFile() in case caller wants to append '\0' 2018-06-05 08:19:41 -06:00
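The idea of reserving one spare byte can be sketched as follows. This is an illustration under assumptions, not the real `LoadDataFromFile()`: the function name `LoadFileWithSpareByte` and the use of `std::vector<char>` are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for a file loader. Reads the whole file into *data
// and keeps capacity for one extra byte, so a caller can append '\0'
// without triggering a reallocation.
bool LoadFileWithSpareByte(const char* path, std::vector<char>* data) {
  FILE* fp = std::fopen(path, "rb");
  if (fp == nullptr) return false;
  bool ok = false;
  if (std::fseek(fp, 0, SEEK_END) == 0) {
    long size = std::ftell(fp);
    if (size >= 0 && std::fseek(fp, 0, SEEK_SET) == 0) {
      data->clear();
      // Reserve the file size plus one byte, then size the vector to the
      // file contents only; resize() never shrinks the capacity.
      data->reserve(static_cast<std::size_t>(size) + 1);
      data->resize(static_cast<std::size_t>(size));
      ok = size == 0 ||
           std::fread(data->data(), 1, data->size(), fp) == data->size();
    }
  }
  std::fclose(fp);
  return ok;
}
```

With the spare capacity in place, a later `data->push_back('\0')` leaves existing pointers into the buffer valid.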
zdenop
f8b689f85f
Merge pull request #1632 from Shreeshrii/master
add test as submodule
2018-06-05 11:06:14 +02:00
zdenop
4a008d5f03
Merge pull request #1633 from stweil/fix
TFile: Relax assertion and allow FRead, FWrite with count == 0
2018-06-05 11:01:16 +02:00
Stefan Weil
52fddc3ca9 TFile: Relax assertion and allow FRead, FWrite with count == 0
The assertions introduced by commit 8bea6bcc12
were too strict. The first one failed in osd_test, the second one failed
in `tesseract IMAGE BASE --psm 13 lstm.train`.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-04 22:42:19 +02:00
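A sketch of what a relaxed check of this kind might look like. `ReadItems` is a hypothetical in-memory reader that only borrows the size/count convention of `fread`; it is not the `TFile::FRead` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical reader: copies up to `count` items of `size` bytes each from
// `src` and returns the number of whole items copied.
std::size_t ReadItems(const char* src, std::size_t src_len, void* dst,
                      std::size_t size, std::size_t count) {
  assert(size > 0);                  // size is typically sizeof(some_datatype)
  if (count == 0) return 0;          // reading zero items is valid
  assert(count <= SIZE_MAX / size);  // size * count must not overflow
  std::size_t available = src_len / size;
  std::size_t items = count < available ? count : available;
  if (items > 0) std::memcpy(dst, src, items * size);
  return items;
}
```

The overflow guard stays in place, while a call with `count == 0` returns normally instead of tripping an assertion.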
Shree Devi Kumar
c8c9807be0 Added test repo as a submodule 2018-06-04 19:20:00 +00:00
Shree Devi Kumar
c09f13b820 Added test repo as a submodule 2018-06-04 19:19:48 +00:00
Egor Pugin
83ae900549
Merge pull request #1629 from stweil/bool
src/training: Replace more proprietary BOOL8 by standard bool data type
2018-06-04 18:54:31 +03:00
Egor Pugin
e700500d4a
Merge pull request #1628 from stweil/fix
TFile: Improve handling of potential integer overflow
2018-06-04 18:53:23 +03:00
Stefan Weil
4f3b266efe src/training: Replace more proprietary BOOL8 by standard bool data type
Update also callers of the modified functions to use
false / true instead of 0 / 1.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-04 16:08:03 +02:00
Stefan Weil
b292013bdc cntraining: Replace proprietary BOOL8 by standard bool data type
Add also "static" attribute to local functions and remove an old comment.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-04 16:08:03 +02:00
Stefan Weil
8bea6bcc12 TFile: Improve handling of potential integer overflow
Raise an assertion for unexpected arguments and use size_t instead of int
for the size argument which is typically sizeof(some_datatype).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-04 13:53:36 +02:00
Egor Pugin
45b11cd93f
Reset appveyor cache. 2018-06-04 01:59:35 +03:00
Egor Pugin
0f29cb411c
Clear appveyor cache. 2018-06-04 01:59:05 +03:00
zdenop
aa35e1ce4c
Merge pull request #1626 from stweil/bool
src/training: Replace proprietary BOOL8 by standard bool data type
2018-06-03 21:44:00 +02:00
Stefan Weil
f2698c256d src/training: Replace proprietary BOOL8 by standard bool data type
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-03 21:13:40 +02:00
zdenop
3aa9b37c81
Merge pull request #1621 from stweil/cmdline
More cleanup for handling of command line arguments
2018-06-02 09:31:10 +02:00
Stefan Weil
629ded223c tesseractmain: Allow combinations of the different help options
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-02 09:03:56 +02:00
Stefan Weil
724a72a278 tesseractmain: Always use EXIT_SUCCESS and EXIT_FAILURE macros for exit status
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-02 09:03:56 +02:00
Stefan Weil
b5ac8502bc tesseractmain: EXIT_FAILURE if tesseract is called without arguments
When Tesseract is called without any argument, the help message is still
printed, but the exit status no longer indicates success (EXIT_SUCCESS).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-02 09:03:56 +02:00
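The behavior described in this commit can be sketched as below. `RunCli` and its usage text are hypothetical and do not come from tesseractmain.cpp; only the `EXIT_SUCCESS` and `EXIT_FAILURE` macros from `<cstdlib>` are standard.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical entry-point logic: prints a usage line and reports failure
// when no arguments are given.
int RunCli(int argc, char** argv) {
  if (argc < 2) {
    // The help text is still printed, but the caller sees a failure status.
    std::fprintf(stderr, "Usage: %s IMAGE OUTPUTBASE [OPTIONS]\n",
                 argc > 0 ? argv[0] : "program");
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}
```

Returning a failure status lets shell scripts detect a call that was made without arguments.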
Stefan Weil
6dba34dd8c tesseractmain: No command line options between image and outputbase
The image name and the outputbase should not be separated by
command line options.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-02 09:03:56 +02:00
zdenop
e313ed1bb9
Merge pull request #1614 from j-kubik/master
Recognition progress in C API
2018-06-02 08:54:21 +02:00
Jaroslaw Kubik
67ceae8385 Fix the progress increase test
The progress increase test must compare the input value against
the variable that contains the previous value, not against its
initial value.
2018-06-02 02:29:42 +02:00
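The difference between comparing against the previous value and comparing against the initial value can be sketched with a small checker. `ProgressChecker` is a hypothetical class, not the test code from the pull request.

```cpp
// Hypothetical checker for monotonically increasing progress values.
class ProgressChecker {
 public:
  // Returns true if `progress` is not lower than the previously seen value,
  // then remembers it for the next call.
  bool Update(int progress) {
    bool ok = progress >= last_;
    last_ = progress;
    return ok;
  }

 private:
  int last_ = 0;  // previous value, updated on every call
};
```

If `last_` were never updated, a sequence such as 10, 50, 30 would pass, because every value is at least the initial 0; updating it makes the drop from 50 to 30 detectable.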
Jaroslaw Kubik
e05d333378 Added tests for progress reporting API
The progress reporting API is now tested using googlemock tools.
2018-06-02 00:47:34 +02:00
Jaroslaw Kubik
92168c5e8b Added googlemock building instructions
The googlemock tools are already present, so why not make use of
them. They can be useful for testing callbacks.
2018-06-02 00:42:37 +02:00
Egor Pugin
f227f84231
Merge pull request #1619 from stweil/cmdline
tesseractmain: Fail if bad command line option is given
2018-06-01 22:01:52 +03:00
Stefan Weil
6f7206f574 tesseractmain: Remove unneeded duplicate code
The --list-langs option is already handled by other code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-06-01 20:45:53 +02:00