tesseract

mirror of https://github.com/tesseract-ocr/tesseract.git synced 2024-11-29 06:09:04 +08:00

Author	SHA1	Message	Date
Ray Smith	84920b92b3	Font and classifier output structure cleanup. Font recognition was poor because it forced a 1st and 2nd choice at the character level, whereas the total score for the correct font is often right at the word level. The full set of fonts and scores is now propagated to the word recognizer, which can decide word-level fonts using the scores instead of simple votes. The change precipitated a cleanup of the output data structures for classifier results, eliminating ScoredClass and INT_RESULT_STRUCT, with a few extra elements going into UnicharRating, which is now used wherever possible. That added the extra complexity of 1-rating, due to a flip between 0 is good and 0 is bad for the internal classifier scores before they are converted to rating and certainty.	2015-05-12 17:24:34 -07:00
theraysmith@gmail.com	99edf4ccbd	Refactored classifier to make it easier to add new ones and generalized feature extractor to allow fx from grey. (git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@873 d0cd1f9f-072b-0410-8dd7-cf729c803f20)	2013-09-23 15:15:06 +00:00
theraysmith@gmail.com	c7cef53ee3	Fixed issue 669. (git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@743 d0cd1f9f-072b-0410-8dd7-cf729c803f20)	2012-09-21 15:20:35 +00:00
david.eger@gmail.com	71b3200625	Fix a shapetable serialization issue: sizeof(bool) is not portable. See http://code.google.com/p/tesseract-ocr/issues/detail?id=669 (git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@720 d0cd1f9f-072b-0410-8dd7-cf729c803f20)	2012-04-17 00:00:26 +00:00
theraysmith@gmail.com	5bc5e2a0b4	Added simultaneous multi-language capability. Added support for ShapeTable in classifier and training. Refactored class pruner. Added new uniform classifier API. Added new training error counter. (git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@650 d0cd1f9f-072b-0410-8dd7-cf729c803f20)	2012-02-02 02:57:42 +00:00
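
Commit 71b3200625 notes that sizeof(bool) is not portable. The size of bool is implementation-defined in C++, so writing raw bool values to a file can give a layout that another platform reads differently. The sketch below shows one common way around this: store each flag as exactly one byte. The names WriteBools and ReadBools are hypothetical and are not taken from the Tesseract source; the actual fix in that commit may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical helper (not Tesseract's API): write each flag as exactly one
// byte (0 or 1), so the on-disk layout does not depend on sizeof(bool).
bool WriteBools(std::FILE* fp, const std::vector<bool>& flags) {
  assert(fp != nullptr);
  for (bool flag : flags) {
    const std::uint8_t byte = flag ? 1 : 0;
    if (std::fwrite(&byte, sizeof(byte), 1, fp) != 1) return false;
  }
  return true;
}

// Hypothetical helper: read `count` one-byte flags back into `flags`.
// Any non-zero byte is treated as true.
bool ReadBools(std::FILE* fp, std::size_t count, std::vector<bool>* flags) {
  assert(fp != nullptr && flags != nullptr);
  flags->clear();
  for (std::size_t i = 0; i < count; ++i) {
    std::uint8_t byte = 0;
    if (std::fread(&byte, sizeof(byte), 1, fp) != 1) return false;
    flags->push_back(byte != 0);
  }
  return true;
}
```

Using a fixed-width type such as std::uint8_t pins the file format to one byte per flag regardless of compiler or platform.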

5 Commits
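
Commit 84920b92b3 describes moving from per-character font votes to a word-level decision based on the full set of font scores. The sketch below illustrates why the two can disagree. It is a toy model only: FontScore, FontByVotes, and FontByTotalScore are hypothetical names, the "higher score is better" convention is an assumption, and none of this reflects the actual UnicharRating layout or the word recognizer's logic.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical type: one candidate font and its score for a single character.
// Assumes font ids are non-negative and that a higher score is better.
struct FontScore {
  int font_id;
  float score;
};

// Returns the key with the largest value, or -1 if the map is empty.
// Ties go to the lowest font id, because std::map iterates in key order.
template <typename T>
int ArgMax(const std::map<int, T>& totals) {
  int best_id = -1;
  T best_value{};
  for (const auto& entry : totals) {
    if (best_id < 0 || entry.second > best_value) {
      best_id = entry.first;
      best_value = entry.second;
    }
  }
  return best_id;
}

// Simple voting: each character casts one vote for its top-scoring font.
int FontByVotes(const std::vector<std::vector<FontScore>>& word) {
  std::map<int, int> votes;
  for (const auto& character : word) {
    const FontScore* best = nullptr;
    for (const auto& fs : character) {
      assert(fs.font_id >= 0);
      if (best == nullptr || fs.score > best->score) best = &fs;
    }
    if (best != nullptr) ++votes[best->font_id];
  }
  return ArgMax(votes);
}

// Word-level scoring: sum every font's score over all characters.
int FontByTotalScore(const std::vector<std::vector<FontScore>>& word) {
  std::map<int, float> totals;
  for (const auto& character : word) {
    for (const auto& fs : character) {
      assert(fs.font_id >= 0);
      totals[fs.font_id] += fs.score;
    }
  }
  return ArgMax(totals);
}
```

For example, if font 1 narrowly beats font 2 on two characters (0.51 vs 0.50) but loses badly on a third (0.10 vs 0.90), voting selects font 1 while the summed scores select font 2.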