Mirror of https://github.com/tesseract-ocr/tesseract.git, synced 2024-11-24 02:59:07 +08:00
84920b92b3
Font recognition was poor because a 1st and 2nd font choice were forced at the character level, even though the total score often identifies the correct font only at the word level. The full set of fonts and scores is now propagated to the word recognizer, which can decide word-level fonts from the scores instead of simple votes. The change precipitated a cleanup of the output data structures for classifier results: ScoredClass and INT_RESULT_STRUCT are eliminated, a few extra elements go into UnicharRating, and UnicharRating is used wherever possible. That added the extra complexity of 1-rating, due to a flip between "0 is good" and "0 is bad" for the internal classifier scores before they are converted to rating and certainty. | ||
---|---|---|
context.cpp | ||
dawg_cache.cpp | ||
dawg_cache.h | ||
dawg.cpp | ||
dawg.h | ||
dict.cpp | ||
dict.h | ||
hyphen.cpp | ||
Makefile.am | ||
matchdefs.h | ||
permdawg.cpp | ||
stopper.cpp | ||
stopper.h | ||
trie.cpp | ||
trie.h |
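
The commit message above contrasts per-character font votes with word-level font scores. The following is a minimal sketch of that difference, not the Tesseract implementation: `FontScore`, `CharFonts`, `FontByVotes`, `FontByTotalScore`, and `FlipScore` are hypothetical names introduced for illustration, and scores are assumed to lie in [0, 1] with higher meaning better.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for one candidate font of a classified character.
struct FontScore {
  int font_id;
  float score;  // Assumption: in [0, 1], higher is better.
};

// All candidate fonts (with scores) for a single character.
using CharFonts = std::vector<FontScore>;

// Character-level decision: each character votes for its single best font,
// and the font with the most votes wins. Score margins are discarded.
int FontByVotes(const std::vector<CharFonts>& word) {
  std::map<int, int> votes;
  for (const CharFonts& ch : word) {
    if (ch.empty()) continue;
    const FontScore* best = &ch[0];
    for (const FontScore& fs : ch) {
      if (fs.score > best->score) best = &fs;
    }
    ++votes[best->font_id];
  }
  int winner = -1;
  int max_votes = 0;
  for (const auto& kv : votes) {
    if (kv.second > max_votes) {
      winner = kv.first;
      max_votes = kv.second;
    }
  }
  return winner;  // -1 if the word has no font information.
}

// Word-level decision: sum every font's score over all characters and
// pick the font with the highest total.
int FontByTotalScore(const std::vector<CharFonts>& word) {
  std::map<int, float> totals;
  for (const CharFonts& ch : word) {
    for (const FontScore& fs : ch) totals[fs.font_id] += fs.score;
  }
  int winner = -1;
  float best_total = 0.0f;
  for (const auto& kv : totals) {
    if (winner == -1 || kv.second > best_total) {
      winner = kv.first;
      best_total = kv.second;
    }
  }
  return winner;  // -1 if the word has no font information.
}

// Converts between a "0 is good" and a "0 is bad" convention for a score
// in [0, 1], which is the kind of 1-rating flip the commit message mentions.
inline float FlipScore(float score) {
  assert(score >= 0.0f && score <= 1.0f);
  return 1.0f - score;
}
```

With two characters that narrowly prefer one font and a third that strongly prefers another, `FontByVotes` and `FontByTotalScore` can disagree. That disagreement is the case the commit message describes.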