366 Commits

Author SHA1 Message Date
Ray Smith
dc8745e6fd Move LSTM unicharset and recoder to traineddata with version string part1. Backwards compatible - maybe. 2017-07-14 11:14:23 -07:00
Ray Smith
7588540296 Removed changes from last commit that didn't belong 2017-07-14 11:08:26 -07:00
Ray Smith
3ec11bd37a Deleted some dead LSTM code, making everything use the recoder 2017-07-14 10:58:21 -07:00
Ray Smith
da03e4e910 Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion 2017-07-14 09:30:14 -07:00
Justin Hotchkiss Palermo
f057938069 fix filenames in comments 2017-07-02 17:35:47 -04:00
Justin Hotchkiss Palermo
1d862a54bd Add new line to a few error messages. 2017-07-01 08:40:57 -04:00
Stefan Weil
1cf8fe51a0 Remove mathfix.h
It was only needed for MS Visual Studio 2012 and older.
Those compilers are not supported for Tesseract.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-05 20:26:25 +02:00
zdenop
ffb1ec3535 Merge pull request #918 from rfschtkt/issue529
Issue529
2017-05-13 19:33:46 +02:00
Raf Schietekat
b4cf46697f Issue #529: inside main() use return rather than exit 2017-05-13 18:02:00 +02:00
Stefan Weil
84396707a8 Fix crash if output file could not be opened
This error case results in fout_ == nullptr.
Closing a nullptr file is not allowed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 17:27:07 +02:00
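
Note on 84396707a8: passing a null pointer to `fclose` is undefined behavior in C and C++, so a failed open has to be checked before the file is closed. A minimal sketch of that guard follows. Only the member name `fout_` comes from the commit message; the `OutputFile` class and its interface are hypothetical and are not the Tesseract code.

```cpp
#include <cstdio>

// Hypothetical writer class; only the name fout_ is taken from the commit.
class OutputFile {
 public:
  // fopen returns nullptr when the file cannot be opened.
  explicit OutputFile(const char* path) : fout_(std::fopen(path, "wb")) {}

  ~OutputFile() {
    // fclose(nullptr) is undefined behavior, so close only a valid stream.
    if (fout_ != nullptr) std::fclose(fout_);
  }

  // The object owns the stream, so copying is disabled.
  OutputFile(const OutputFile&) = delete;
  OutputFile& operator=(const OutputFile&) = delete;

  bool is_open() const { return fout_ != nullptr; }

 private:
  std::FILE* fout_;
};
```

A caller can test `is_open()` and report the error. The destructor is safe in both the success and the failure case.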
zdenop
29f3de9be1 Merge pull request #914 from stweil/clean
Clean code
2017-05-13 12:45:57 +02:00
Stefan Weil
5dc4af62fb baseapi: Simplify code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 12:14:29 +02:00
Stefan Weil
78142593d2 Fix order of destructor calls for DawgCache and TessBaseAPI
TessBaseAPI must release its cache use before DawgCache is destroyed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 11:35:30 +02:00
Stefan Weil
f37f858c99 main: Fix two memory leaks
When Tesseract terminates by calling the exit function,
the destructor of any local auto variable is not called.

Fix two cases by using static variables.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 21:15:52 +02:00
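
Note on f37f858c99: the leak follows from a C++ rule. `std::exit` runs destructors for objects with static storage duration but not for automatic (local) objects, whereas a normal `return` unwinds locals first. The sketch below illustrates both paths. The `Resource` type, the counter, and the function names are hypothetical and are not taken from the Tesseract source.

```cpp
#include <cstdlib>

// Hypothetical counter, incremented each time a Resource is destroyed.
static int g_destroyed = 0;

// Hypothetical stand-in for an object that owns memory.
struct Resource {
  ~Resource() { ++g_destroyed; }
};

// Returning normally unwinds the stack, so `local` is destroyed.
int finish_with_return() {
  Resource local;
  return EXIT_SUCCESS;
}

// std::exit terminates the process without unwinding the stack:
// ~Resource() is never run for `local`. Objects with static storage
// duration are still destroyed, which is why making the variable
// static avoids the leak.
[[noreturn]] void finish_with_exit() {
  Resource local;
  std::exit(EXIT_SUCCESS);
}
```

Calling `finish_with_return()` increments the counter once. `finish_with_exit()` would end the process with the counter unchanged.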
Stefan Weil
5e3665c6ae Remove most libtiff dependencies
libtiff is no longer needed for OpenCL, so remove that dependency.

It is still suggested for Windows to redirect warning messages
from the tesseract executable to the console.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 10:15:35 +02:00
Raf Schietekat
c335508e84 Fewer g++ -Wsign-compare warnings 2017-05-11 23:14:52 +02:00
zdenop
64994a2707 Merge pull request #900 from rfschtkt/cast
Reviewed uses of reinterpret_cast
2017-05-11 16:08:12 +02:00
Raf Schietekat
8aa0a2dd48 RAII: *::GetUNLVText() 2017-05-11 02:02:37 +02:00
Raf Schietekat
1dab23916f RAII: *::GetBoxText() 2017-05-11 02:02:37 +02:00
Raf Schietekat
b7b68a65dd RAII: *::GetTSVText() 2017-05-11 02:02:37 +02:00
Raf Schietekat
a1fff874b4 RAII: *::GetHOCRText() 2017-05-11 02:02:37 +02:00
Raf Schietekat
986970d6ca RAII: pdfrenderer.cpp: pdftext 2017-05-11 02:02:37 +02:00
Raf Schietekat
3c6e18ecf9 RAII: pdfrenderer.cpp: buffer 2017-05-11 02:02:37 +02:00
Raf Schietekat
936ca00c44 RAII: pdfrenderer.cpp: cidtogidmap 2017-05-11 02:02:37 +02:00
Raf Schietekat
2772f78170 RAII: LTRResultIterator::GetUTF8Text 2017-05-11 02:02:37 +02:00
Raf Schietekat
f75665c34f RAII: TessBaseAPI::GetUTF8Text() 2017-05-11 02:02:37 +02:00
Raf Schietekat
4840c65bf0 RAII: ResultIterator::GetUTF8Text(): was leaked inside TessBaseAPI::GetUTF8Text() 2017-05-11 02:02:37 +02:00
Raf Schietekat
3983d2f76a Reviewed uses of reinterpret_cast 2017-05-11 01:58:40 +02:00
Egor Pugin
0afd5939b1 Use NDEBUG macro instead of DEBUG. 2017-05-08 13:01:22 +03:00
Ray Smith
6ac31dcbdd Fixed DetectOS so it doesn't crash with a big image 2017-05-03 15:50:31 -07:00
Stefan Weil
c1d649ebbc api: Replace Tesseract data types by POSIX data types
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-02 18:21:44 +02:00
Stefan Weil
aea0d9a8d5 api: Remove unneeded NULL checks
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-04-30 19:23:24 +02:00
Stefan Weil
1c59914b61 Use Leptonica struct names L_Compressed_Data, Pix
The Tesseract project prefers those names, so fix the remaining exceptions.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-04-30 10:50:12 +02:00
Ray Smith
7a116ce8bb More formatting fixes from clang tidy 2017-04-28 13:38:32 -07:00
Ray Smith
77015526fa Jeff's fixes to pdf rendering 2017-04-28 13:38:13 -07:00
zdenop
13b7900ebf Merge pull request #778 from cjmayo/singleopts
tidy tesseract(1) adding missing options
2017-04-28 18:58:40 +02:00
Ray Smith
1cc511188d Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here. 2017-04-27 15:48:23 -07:00
James R. Barlow
f54577e6be Fix #786 - 3.05 linkage fails on macOS Sierra with --enable-opencl
Also needed for 4.00.
2017-04-10 22:22:49 -07:00
Jeff Breidenbach
9038faf436 Better escaping for PDF title; fixes #636 2017-04-02 19:01:16 +02:00
Igor Pylypiv
cea24b7e44 Remove redundant condition from TessBaseAPI::AdaptToWordStr()
Expression (wordstr[w] != '\0') is always true if (wordstr[w] == ' ') is true.
2017-03-23 22:55:40 -07:00
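
Note on cea24b7e44: a character that equals `' '` cannot also equal `'\0'`, so the first test adds nothing when both are joined with `&&`. The sketch below shows the equivalence on a space-skipping loop. The helper names and the loop shape are assumptions made for illustration; only the two expressions on `wordstr[w]` come from the commit message.

```cpp
#include <cstddef>

// Hypothetical helper with the redundant test kept in place.
std::size_t skip_spaces_redundant(const char* wordstr, std::size_t w) {
  while (wordstr[w] != '\0' && wordstr[w] == ' ') ++w;
  return w;
}

// Simplified form: the loop still stops at the terminator,
// because '\0' is not equal to ' '.
std::size_t skip_spaces(const char* wordstr, std::size_t w) {
  while (wordstr[w] == ' ') ++w;
  return w;
}
```

Both helpers return the same index for any null-terminated input, including an empty string and a string made only of spaces.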
Chris Mayo
b231aee212 tidy tesseract(1) adding missing options
Together with:
- fix "C\++"
- align executable --print-parameters message
2017-03-23 20:02:50 +00:00
Stefan Weil
7b33dad059 api: Remove unused variables
This fixes a compiler warning:

api/baseapi.cpp:1621:17: warning:
 variable 'font_name' set but not used [-Wunused-but-set-variable]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-03-08 07:38:46 +01:00
Stefan Weil
cd925fd812 Fix indentation after conditional [-Wmisleading-indentation]
The indentation is wrong since commit
fd0683f9e0 and results in a gcc warning:

api/baseapi.cpp: In member function 'bool tesseract::TessBaseAPI::ProcessPagesMultipageTiff(const l_uint8*, size_t, const char*, const char*, int, tesseract::TessResultRenderer*, int)':
api/baseapi.cpp:986:5: warning: this 'if' clause does not guard... [-Wmisleading-indentation]
     if (tessedit_page_number >= 0)
     ^~
api/baseapi.cpp:988:7: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'if'
       pix = (data) ? pixReadMemFromMultipageTiff(data, size, &offset)
       ^~~

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-03-07 19:05:40 +01:00
Jeff Breidenbach
fd0683f9e0 remove obsolete OpenCl code from TessBaseAPI::ProcessPagesMultipageTiff; fixes #635 2017-01-29 16:43:10 +01:00
Ray Smith
f566a45b30 clang-tidy changes from sync 2017-01-25 16:20:19 -08:00
Ray Smith
a1c22fb0d0 Fixed issue #557 2017-01-25 16:05:59 -08:00
Ray Smith
ca16a08c10 Removed dead TODO 2017-01-25 15:54:11 -08:00
Zdenko Podobný
8ce58ac458 Fix C-API 2017-01-21 07:40:54 +01:00
James R. Barlow
bf638b9202 Fix PDF syntax error: "XObject" instead of "/XObject" when textonly_pdf=false 2017-01-20 13:36:38 -08:00
Zdenko Podobný
effa5741e6 Implement invisible text only for PDF 2017-01-20 21:26:34 +01:00