tesseract

mirror of https://github.com/tesseract-ocr/tesseract.git synced 2024-12-23 23:17:49 +08:00

Author	SHA1	Message	Date
Ray Smith	b0ead95d64	Changed the way unicharsets are handled to allow support for the ™ character. Can find the issue where it was requested.	2017-07-24 11:45:57 -07:00
Stefan Weil	9929587f36	Remove extra semicolons This fixes these compiler warnings: ccmain/equationdetect.cpp:1519:2: warning: extra ‘;’ [-Wpedantic] ccstruct/blobs.cpp:65:17: warning: extra ‘;’ [-Wpedantic] ccstruct/blobs.h:178:18: warning: extra ‘;’ [-Wpedantic] ccstruct/ratngs.cpp:36:22: warning: extra ‘;’ [-Wpedantic] ccstruct/ratngs.cpp:37:22: warning: extra ‘;’ [-Wpedantic] ccutil/ambigs.cpp:46:20: warning: extra ‘;’ [-Wpedantic] ccutil/ambigs.h:137:21: warning: extra ‘;’ [-Wpedantic] cutil/structures.cpp:36:45: warning: extra ‘;’ [-Wpedantic] textord/equationdetectbase.cpp:65:2: warning: extra ‘;’ [-Wpedantic] textord/equationdetectbase.h:57:2: warning: extra ‘;’ [-Wpedantic] wordrec/lm_state.cpp:25:28: warning: extra ‘;’ [-Wpedantic] wordrec/lm_state.h:190:29: warning: extra ‘;’ [-Wpedantic] Signed-off-by: Stefan Weil <sw@weilnetz.de>	2017-07-15 12:40:34 +02:00
Stefan Weil	4a92ff5862	Fix compiler warnings for copy constructors gcc reports these warnings with -Wextra: ccstruct/pageres.h:330:3: warning: base class 'class ELIST_LINK' should be explicitly initialized in the copy constructor [-Wextra] ccstruct/ratngs.cpp:115:1: warning: base class 'class ELIST_LINK' should be explicitly initialized in the copy constructor [-Wextra] ccstruct/ratngs.h:291:3: warning: base class 'class ELIST_LINK' should be explicitly initialized in the copy constructor [-Wextra] ccutil/genericvector.h:435:3: warning: base class 'class GenericVector<WERD_RES*>' should be explicitly initialized in the copy constructor [-Wextra] Signed-off-by: Stefan Weil <sw@weilnetz.de>	2015-11-05 09:19:37 +01:00
Ray Smith	84920b92b3	Font and classifier output structure cleanup. Font recognition was poor, due to forcing a 1st and 2nd choice at a character level, when the total score for the correct font is often correct at the word level, so allowed the propagation of a full set of fonts and scores to the word recognizer, which can now decide word level fonts using the scores instead of simple votes. Change precipitated a cleanup of output data structures for classifier results, eliminating ScoredClass and INT_RESULT_STRUCT, with a few extra elements going in UnicharRating, and using that wherever possible. That added the extra complexity of 1-rating due to a flip between 0 is good and 0 is bad for the internal classifier scores before they are converted to rating and certainty.	2015-05-12 17:24:34 -07:00
zdenop	9cf08ca8d3	fix build with -DGRAPHICS_DISABLED git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@981 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2014-01-11 23:08:54 +00:00
theraysmith@gmail.com	7ec4fd7a56	Refactored control functions to enable parallel blob classification git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@904 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2013-11-08 20:30:56 +00:00
theraysmith@gmail.com	4d514d5a60	Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc. git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2013-09-23 15:26:50 +00:00
zdenop@gmail.com	10c1169d98	remove unused code (Windows related) git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@860 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2013-07-08 18:21:10 +00:00
zdenop@gmail.com	cd8de9157c	change comments to doxygen block comments (api) git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@716 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2012-03-30 21:24:12 +00:00
david.eger@gmail.com	018f192fc2	Abolish populate_unichars(), fixing seg fault reported in Debian: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=658634 git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@675 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2012-02-15 01:37:00 +00:00
theraysmith@gmail.com	9206e92b0d	Added simultaneous multi-language capability, Refactored top-level word recognition module, Blamer module added for error analysis, Tidied up constraints on control parameters, Added UNICHARSET to WERD_CHOICE to make multi-language handling easier, Added word bigram correction git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@655 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2012-02-02 03:06:39 +00:00
theraysmith	82b1b201fc	Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@568 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2011-03-21 21:44:45 +00:00
zdenop@gmail.com	4523ce9f7d	3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2010-11-23 18:34:14 +00:00
joregan	a18816f839	partial merge of doxygen branch (stuff without conflicts, basically) git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@441 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2010-07-27 13:23:23 +00:00
theraysmith	903a4ffe9d	Changes to ccstruct for 3.00 git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@289 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2009-07-11 02:14:57 +00:00
theraysmith	51ed03368d	Fixes to lists so an empty constructor is not needed + reenable debugging git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@207 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2008-12-30 18:15:44 +00:00
theraysmith	c4f4840fbe	Fixed name collision with jpeg library git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@163 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2008-04-22 00:41:37 +00:00
theraysmith	ac4e0cffa2	Updated graphics output for new java-based display git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@138 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2008-02-01 00:36:18 +00:00
theraysmith	570af48b8b	Remaining changes for Unicodeization project git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@87 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2007-07-18 01:15:07 +00:00
tmbdev	425d593ebe	top-skimming import from sf.net git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk/trunk@2 d0cd1f9f-072b-0410-8dd7-cf729c803f20	2007-03-07 20:03:40 +00:00

20 Commits