Stefan Weil
83588bc7a1
Classify: Avoid unneeded new / delete operations
...
Both class variables BaselineCutoffs and CharNormCutoffs were pointers
to fixed size arrays which were allocated in the constructor and
deallocated in the destructor. These two extra allocations and two
extra deallocations can be avoided.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-04-30 19:45:50 +02:00
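
The change described in this commit can be illustrated with a minimal sketch. It is not the actual Tesseract code: the class names, the `int` element type, the array length `kNumCutoffs`, and the helpers are assumptions made for illustration. Only the member names BaselineCutoffs and CharNormCutoffs come from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical array length; the commit message does not state the real size.
constexpr std::size_t kNumCutoffs = 8;

// Before: pointer members that need new[] in the constructor and delete[]
// in the destructor, i.e. two allocations and two deallocations per object.
class ClassifyWithHeapArrays {
 public:
  ClassifyWithHeapArrays()
      : BaselineCutoffs(new int[kNumCutoffs]()),
        CharNormCutoffs(new int[kNumCutoffs]()) {}
  ~ClassifyWithHeapArrays() {
    delete[] BaselineCutoffs;
    delete[] CharNormCutoffs;
  }
  // Copying is disabled because a copy would delete the same arrays twice.
  ClassifyWithHeapArrays(const ClassifyWithHeapArrays&) = delete;
  ClassifyWithHeapArrays& operator=(const ClassifyWithHeapArrays&) = delete;

  int* BaselineCutoffs;
  int* CharNormCutoffs;
};

// After: fixed size arrays stored directly in the object, so no extra
// allocation or deallocation is needed and the default destructor suffices.
class ClassifyWithMemberArrays {
 public:
  // Bounds-checked setter (hypothetical helper, for illustration only).
  void SetBaselineCutoff(std::size_t i, int value) {
    assert(i < kNumCutoffs);
    BaselineCutoffs[i] = value;
  }

  int BaselineCutoffs[kNumCutoffs] = {};
  int CharNormCutoffs[kNumCutoffs] = {};
};

// Both layouts are indexed the same way by calling code.
template <typename T>
int SumBaselineCutoffs(const T& classify) {
  int sum = 0;
  for (std::size_t i = 0; i < kNumCutoffs; ++i) {
    sum += classify.BaselineCutoffs[i];
  }
  return sum;
}
```

Because both variants are indexed identically, code that reads or writes the cutoffs does not need to change when the pointers become member arrays.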
Ray Smith
2c837dffc3
Result of clang tidy on recent merge
2016-11-07 10:46:33 -08:00
Stefan Weil
55fde61a8f
classify: Fix typos in comments and strings
...
All of them were found by codespell.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-09-14 22:12:06 +02:00
Ray Smith
b1d99dfe23
Added a backup adaptive classifier to take over from primary when it fills on a large document
2015-06-12 11:10:53 -07:00
Ray Smith
d74c625e52
Fixed blob division params to fix CJK training speed.
2015-06-12 10:59:26 -07:00
Ray Smith
84920b92b3
Font and classifier output structure cleanup.
...
Font recognition was poor because a 1st and 2nd choice were forced at
the character level, even though the total score often picks out the
correct font at the word level. The full set of fonts and scores is now
propagated to the word recognizer, which can decide word-level fonts
using the scores instead of simple votes.
This change precipitated a cleanup of the output data structures for
classifier results, eliminating ScoredClass and INT_RESULT_STRUCT, with a
few extra elements going into UnicharRating, which is now used wherever
possible. That added the extra complexity of 1-rating, due to a flip
between "0 is good" and "0 is bad" for the internal classifier scores
before they are converted to rating and certainty.
2015-05-12 17:24:34 -07:00
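
The difference between deciding a word-level font by votes and by scores can be sketched as follows. This is an illustration only, not the code from this commit: the `FontScore` struct, the function names, and the "higher score is better" convention are all assumptions.

```cpp
#include <map>
#include <vector>

// Hypothetical per-character font candidate; higher score means better here.
struct FontScore {
  int font_id;
  double score;
};

// One entry per character, each holding that character's font candidates.
using WordFonts = std::vector<std::vector<FontScore>>;

// Returns the key with the largest value, or -1 if the map is empty.
// Ties go to the smallest font id because std::map iterates in key order.
int BestFont(const std::map<int, double>& totals) {
  int best = -1;
  double best_value = 0.0;
  for (const auto& entry : totals) {
    if (best == -1 || entry.second > best_value) {
      best = entry.first;
      best_value = entry.second;
    }
  }
  return best;
}

// Simple voting: each character casts one vote for its top-scoring font.
int FontByVotes(const WordFonts& word) {
  std::map<int, double> votes;
  for (const auto& candidates : word) {
    if (candidates.empty()) continue;
    const FontScore* top = &candidates[0];
    for (const auto& candidate : candidates) {
      if (candidate.score > top->score) top = &candidate;
    }
    votes[top->font_id] += 1.0;
  }
  return BestFont(votes);
}

// Score-based: every candidate's score is accumulated across the word.
int FontByScores(const WordFonts& word) {
  std::map<int, double> totals;
  for (const auto& candidates : word) {
    for (const auto& candidate : candidates) {
      totals[candidate.font_id] += candidate.score;
    }
  }
  return BestFont(totals);
}
```

With three characters scored {font 0: 0.51, font 1: 0.49}, {font 0: 0.51, font 1: 0.49}, and {font 0: 0.1, font 1: 0.9}, voting selects font 0 (two narrow wins), while summing scores selects font 1 (1.88 against 1.12).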
theraysmith@gmail.com
28c00478c6
Removed dependence on IMAGE class
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@947 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:33:44 +00:00
theraysmith@gmail.com
7ec4fd7a56
Refactored control functions to enable parallel blob classification
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@904 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-08 20:30:56 +00:00
theraysmith@gmail.com
99edf4ccbd
Refactored classifier to make it easier to add new ones and generalized feature extractor to allow fx from grey
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@873 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:15:06 +00:00
theraysmith@gmail.com
59d244b06e
More fixes for GRAPHICS_DISABLED from Zdenko and Ray
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@757 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-22 00:59:31 +00:00
theraysmith@gmail.com
5bc5e2a0b4
Added simultaneous multi-language capability, Added support for ShapeTable in classifier and training, Refactored class pruner, Added new uniform classifier API, Added new training error counter
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@650 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:57:42 +00:00
theraysmith
c86a0f6892
Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@570 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:45:36 +00:00
theraysmith
eba04e7c5b
Fixed debug display, training on fragments
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@533 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 01:00:17 +00:00
zdenop@gmail.com
4523ce9f7d
3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
theraysmith
694d3f2c20
Changes to classify for 3.00
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@291 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:17:36 +00:00