Commit Graph

67 Commits

Author SHA1 Message Date
Ray Smith
dc8745e6fd Move LSTM unicharset and recoder to traineddata with version string part1. Backwards compatible - maybe. 2017-07-14 11:14:23 -07:00
Ray Smith
7588540296 Removed changes from last commit that didn't belong 2017-07-14 11:08:26 -07:00
Ray Smith
3ec11bd37a Deleted some dead LSTM code, making everything use the recoder 2017-07-14 10:58:21 -07:00
Raf Schietekat
8aa0a2dd48 RAII: *::GetUNLVText() 2017-05-11 02:02:37 +02:00
Raf Schietekat
1dab23916f RAII: *::GetBoxText() 2017-05-11 02:02:37 +02:00
Raf Schietekat
b7b68a65dd RAII: *::GetTSVText() 2017-05-11 02:02:37 +02:00
Raf Schietekat
a1fff874b4 RAII: *::GetHOCRText() 2017-05-11 02:02:37 +02:00
Ray Smith
6ac31dcbdd Fixed DetectOS so it doesn't crash with a big image 2017-05-03 15:50:31 -07:00
Ray Smith
1cc511188d Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here. 2017-04-27 15:48:23 -07:00
zdenop
da4c064c2e Merge pull request #531 from stweil/guards
Fix header file guards and replace reserved identifiers
2016-12-15 08:29:32 +01:00
Ray Smith
9f5ba9105f Removed dependency on cube from the code 2016-12-14 10:55:15 -08:00
James R. Barlow
56b6f061cd Revise after code review 2016-12-08 15:08:48 -08:00
James R. Barlow
bc95798e01 Implement a new orientation and script detection API for C and C++
See issue #424.

The existing C API for TessBaseAPIDetectOS requires a C caller to successfully allocate struct OSResults which is actually a C++ class.  Generally it won't
be possible for a regular C compiler to do this properly.

It's also assumed that most API level users of Tesseract are only interested in Tesseract's best guess as to script and orientation, not the individual values for all possible scripts.

This introduces a new API with a better name that is more closely aligned with the output of 'tesseract -psm 0'.  Both tesseract -psm 0 and this API now share the same code in baseapi.cpp.
2016-12-07 13:21:05 -08:00
Ray Smith
d55f462c9c More clang-tidy from previous commits 2016-12-06 13:45:49 -08:00
Stefan Weil
4897796d57 Replace reserved identifiers used in #define guards header files
Use macro names as suggested by the Google C++ Style Guide
(https://google.github.io/styleguide/cppguide.html#The__define_Guard).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-04 15:43:03 +01:00
Stefan Weil
cefc420ddb Remove extra semicolons after member function definitions
clang++ report:
api/baseapi.h:852:4: warning:
 extra ';' after member function definition [-Wextra-semi]
[...]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-04 14:54:52 +01:00
Stefan Weil
85e37798cb Simplify delete operations
It is not necessary to check for null pointers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-11-24 17:59:13 +01:00
Ray Smith
c1c1e426b3 Added new LSTM-based neural network line recognizer 2016-11-07 15:38:07 -08:00
Ray Smith
2c837dffc3 Result of clang tidy on recent merge 2016-11-07 10:46:33 -08:00
Stefan Weil
caffb3133b Remove unneeded 'struct' from TessBaseAPI::GetHOCRText (issue #414)
It conflicts with a previous 'class' declaration for ETEXT_DESC:

include/tesseract/baseapi.h:594:21:
 Struct 'ETEXT_DESC' was previously declared as a class

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-09-05 13:17:13 +02:00
Philip Rinn
7461b61743 Fix ABI break introduced in 3.04.00, fixes #254 2016-03-08 11:35:24 +01:00
Tom Morris
6700edd8bc Cleanup TSV renderer
Remove all references to hocr, hocr.tsv, etc. Remove dead code for font
info, input filename, HTML escapes. Improved comments. Fixed
indentation.
2016-03-01 13:41:19 -05:00
Sundar M. Vaidya
d04e3259af Adds char* GetHOCRTSVText(int) as placeholder. Copy of char* GetHOCRText(int). 2016-03-01 12:13:42 -05:00
zdenop
c53add706e Merge pull request #27 from tesseract-ocr/monitor
Monitor
2016-01-05 16:28:42 +01:00
Stefan Weil
edf765b952 Remove unneeded const qualifiers
This fixes compiler warnings like this one:

api/baseapi.h:739:32: warning:
 type qualifiers ignored on function return type [-Wignored-qualifiers]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-05 06:36:42 +01:00
amitdo
6bbcb50dd9 Added osd renderer for psm 0.
Works for single page and multi-page.
2015-10-30 20:09:00 +02:00
Zdenko Podobný
545a0634da improve NO_CUBE_BUILD 2015-08-09 18:09:52 +02:00
Zdenko Podobný
71e226c44f increase version number 2015-07-21 22:46:52 +02:00
Jim O'Regan
4a6195202c fix typo 2015-05-18 12:32:36 +01:00
Zdenko Podobný
ed6ae9b974 Add monitor to GetHOCRText 2015-05-17 21:55:50 +02:00
Ray Smith
53fc4456cc Fixed issue 1252: Refactored LearnBlob and its call hierarchy to make it a member of Classify.
Eliminated the flexfx scheme for calling global feature extractor functions
through an array of function pointers.
Deleted dead code I found as a by-product.
This CL does not change BlobToTrainingSample or ExtractFeatures to be full
members of Classify (the eventual goal) as that would make it even bigger,
since there are a lot of callers to these functions.
When ExtractFeatures and BlobToTrainingSample are members of Classify they
will be able to access control parameters in Classify, which will greatly
simplify developing variations to the feature extraction process.
2015-05-12 15:22:34 -07:00
Ray Smith
0256529c1f Fixed issue 1243 2014-09-17 18:09:45 -07:00
Ray Smith
cd2653c167 Cleanup from previous changes 2014-08-12 16:12:46 -07:00
theraysmith@gmail.com
dbf6197471 Major refactor of control.cpp to enable line recognition
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1147 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:23:06 +00:00
zdenop
905e6162b9 put info about (API) version; fix typo
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1117 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-22 18:31:42 +00:00
theraysmith@gmail.com
25a8c7b720 Enabled streaming input and output of multi-page documents
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1105 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:46:21 +00:00
zdenop
44b0d0e28e addition to r1100
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1101 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 21:24:54 +00:00
zdenop
6051e40212 fix issue 1197
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1100 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 21:20:38 +00:00
zdenop
bdb912c186 escape input_file name in hOCR output - fix issue 1154
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1098 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-09 22:19:30 +00:00
theraysmith@gmail.com
2fcea93846 Fixed issues 1081-1090
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1046 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 02:23:18 +00:00
zdenop
790a3da22f remove 'class IMAGE;'
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1045 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 23:32:23 +00:00
theraysmith@gmail.com
d11dc049e3 Fixed a lot of compiler/clang warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
zdenop@gmail.com
94d08567e1 fix vs2010 (and maybe vs2008) build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@983 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 20:13:55 +00:00
theraysmith@gmail.com
f2ec85d1e1 Added PDF renderer
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@962 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:58:55 +00:00
rajesh.katikam@gmail.com
b8d7a1d139 Fixed all the crashes observed on 24 bit and 8 bit images.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@919 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-10 10:52:54 +00:00
zdenop@gmail.com
b15c710385 fix declaration of ClearResults() (VC++)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@891 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 09:06:22 +00:00
zdenop@gmail.com
ee08f623ce fix issue 967
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@886 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-29 20:48:06 +00:00
theraysmith@gmail.com
88ea81c89e Added renderer to API
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@869 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:39:59 +00:00
zdenop@gmail.com
8a0878af3a fix mingw build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@856 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-05 08:46:57 +00:00
zdenop@gmail.com
37fb755d47 Add a command-line option (--print-parameters) to dump the parameters to stdout
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@814 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:54:14 +00:00