tesseract

mirror of https://github.com/tesseract-ocr/tesseract.git synced 2024-12-12 15:39:04 +08:00

Author	SHA1	Message	Date
Ray Smith	8e79297dce	Final part of endian improvement. Adds big-endian support to lstm and fixes issue 518	2017-05-03 16:09:44 -07:00
Ray Smith	1cc511188d	Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here.	2017-04-27 15:48:23 -07:00
Ray Smith	84920b92b3	Font and classifier output structure cleanup. Font recognition was poor, due to forcing a 1st and 2nd choice at a character level, when the total score for the correct font is often correct at the word level, so allowed the propagation of a full set of fonts and scores to the word recognizer, which can now decide word level fonts using the scores instead of simple votes. Change precipitated a cleanup of output data structures for classifier results, eliminating ScoredClass and INT_RESULT_STRUCT, with a few extra elements going in UnicharRating, and using that wherever possible. That added the extra complexity of 1-rating due to a flip between 0 is good and 0 is bad for the internal classifier scores before they are converted to rating and certainty.	2015-05-12 17:24:34 -07:00
theraysmith@gmail.com	dfc1a92628	Refactored classifier to make it easier to add new ones git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@874 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2013-09-23 15:16:01 +00:00
theraysmith@gmail.com	9206e92b0d	Added simultaneous multi-language capability, Refactored top-level word recognition module, Blamer module added for error analysis, Tidied up constraints on control parameters, Added UNICHARSET to WERD_CHOICE to make multi-language handling easier, Added word bigram correction git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@655 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2012-02-02 03:06:39 +00:00

5 Commits