Commit Graph

172 Commits

Author SHA1 Message Date
Ray Smith
941d87057e Fixed training build 2015-05-13 17:46:58 -07:00
Jim O'Regan
b13691fda0 Merge conflict: going with Ray's version 2015-05-13 08:54:28 +01:00
Ray Smith
164897210a Improved newlines and spaces in a box file so it works better with RTL languages. 2015-05-12 17:51:03 -07:00
Ray Smith
84920b92b3 Font and classifier output structure cleanup.
Font recognition was poor because a 1st and 2nd choice was forced at
the character level, when the total score for the correct font is often
the best one at the word level. The full set of fonts and scores is now
propagated to the word recognizer, which can decide word-level fonts
using the scores instead of simple votes.

This change precipitated a cleanup of the output data structures for
classifier results, eliminating ScoredClass and INT_RESULT_STRUCT, with
a few extra elements going into UnicharRating, which is now used
wherever possible. That added the extra complexity of 1-rating, due to
a flip between "0 is good" and "0 is bad" for the internal classifier
scores before they are converted to rating and certainty.
2015-05-12 17:24:34 -07:00
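
The difference between per-character votes and word-level score totals, as described in the commit message above, can be illustrated with a minimal sketch. This is not Tesseract's code: the function names, font names, and score values are all hypothetical, and scores are assumed to follow a "higher is better" convention.

```python
from collections import Counter, defaultdict

# Hypothetical per-character font scores for one word (higher is better).
# Each dict maps a font name to the classifier score for that character.
word = [
    {"Arial": 0.50, "Times": 0.52},
    {"Arial": 0.55, "Times": 0.57},
    {"Arial": 0.90, "Times": 0.20},
]

def font_by_votes(char_results):
    """Each character votes only for its single top-scoring font."""
    votes = Counter(max(scores, key=scores.get) for scores in char_results)
    return votes.most_common(1)[0][0]

def font_by_scores(char_results):
    """Sum the full set of font scores over the word, then pick the best total."""
    totals = defaultdict(float)
    for scores in char_results:
        for font, score in scores.items():
            totals[font] += score
    return max(totals, key=totals.get)

print(font_by_votes(word))
print(font_by_scores(word))
```

In this made-up data, two characters narrowly prefer one font while the third strongly prefers the other, so the vote and the score total disagree. That is the kind of case where keeping the full set of scores until the word level can change the decision.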
Zdenko Podobný
53eab2ee92 fix issue 1354 2015-04-15 22:37:58 +02:00
Ray Smith
f927728169 Fixed issue 1207 2014-10-09 13:28:03 -07:00
Ray Smith
f77d01eb7b Fixed issue 1302 2014-10-07 09:25:53 -07:00
Ray Smith
bfd2cb83d5 Fixed issue 1303 2014-10-07 09:21:17 -07:00
Zdenko Podobný
c0640a4bef fix cygwin build (issue 1289) 2014-09-28 23:19:52 +02:00
Ray Smith
d3448c37ab Fixed issue 1264 2014-09-17 18:29:32 -07:00
theraysmith@gmail.com
dbf6197471 Major refactor of control.cpp to enable line recognition
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1147 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:23:06 +00:00
theraysmith@gmail.com
36b55f7710 Removed unused variable
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1140 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:10:06 +00:00
theraysmith@gmail.com
c86fe22a62 Started TFile conversion to remove fmemopen
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1139 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:09:25 +00:00
zdenop
c51691fdeb add parameter info to ParamUtils::PrintParams
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1137 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-10 19:08:20 +00:00
zdenop
7239cec2b4 fix off_t issue on OSX
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1136 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-10 16:42:45 +00:00
zdenop@gmail.com
780183226c Accept Windows EOL in config file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1115 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-03 06:57:52 +00:00
theraysmith@gmail.com
97080412fd Bunch of minor bug fixes/cleanups
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1106 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:48:48 +00:00
theraysmith@gmail.com
484b47bc5d Fixed tfscanf return value with * modifier
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1087 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-29 04:30:47 +00:00
theraysmith@gmail.com
c8e27cb8f8 Fixed segfault due to partial support of * modifier in tfscanf
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1086 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-29 04:03:41 +00:00
theraysmith@gmail.com
d7b089fbcf Fixed some clang errors about explicit constructors and more formatting.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1085 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-28 23:10:48 +00:00
theraysmith@gmail.com
d748d94aae Fixed bugs in scanutils that were causing accuracy degradation
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1084 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-28 23:06:41 +00:00
theraysmith@gmail.com
cda8e748b1 Fixed some formatting issues
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1083 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-25 01:25:42 +00:00
theraysmith@gmail.com
42bfdc21d8 Fixed issue 1134
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1082 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-25 01:07:26 +00:00
theraysmith@gmail.com
61d45d2f34 Fixed issue 1133
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1080 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-24 21:18:00 +00:00
theraysmith@gmail.com
3a5f699013 Applied patch to refix issue 331
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1064 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-23 23:12:53 +00:00
theraysmith@gmail.com
7f5e5264d3 Fixed issues 1093-1097
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1048 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 23:36:24 +00:00
theraysmith@gmail.com
bb2e46830f Fixed issue 1075
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1029 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-29 20:28:42 +00:00
theraysmith@gmail.com
6a10aa7985 More cleanup changes from patches
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1024 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-29 02:22:14 +00:00
zdenop@gmail.com
ac5a8a871b fix windows builds (mingw and VS2010)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1017 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-26 22:39:20 +00:00
theraysmith@gmail.com
d11dc049e3 Fixed a lot of compiler/clang warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
0d93bb7cfa More code cleanup from patches and fixing warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1011 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:09:59 +00:00
theraysmith@gmail.com
5b9a7e06eb Turned on pdfrenderer functionality that needs leptonica 1.70
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1009 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-23 23:01:10 +00:00
zdenop@gmail.com
adfac4144b amend r995
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@996 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-18 09:04:35 +00:00
zdenop@gmail.com
ef3b1d936e fix mingw build issues
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@995 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-18 09:00:54 +00:00
zdenop
26f8f58042 fix android issues
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@990 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-15 22:47:37 +00:00
zdenop@gmail.com
244731fd51 revert dll-interface for class 'GenericVector<T>'
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@988 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-14 09:25:45 +00:00
zdenop@gmail.com
94d08567e1 fix vs2010 (and maybe vs2008) build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@983 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 20:13:55 +00:00
zdenop
8299e2a605 fix linux build, remove not used folder and spec file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@979 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 23:52:04 +00:00
theraysmith@gmail.com
da20cff7ae Fixed issue 1056
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@975 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 17:24:07 +00:00
theraysmith@gmail.com
91d2265429 More minor fixes from issues and cleanup
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@974 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 01:38:00 +00:00
theraysmith@gmail.com
f297e5d909 Misc fixes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@942 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:27:40 +00:00
theraysmith@gmail.com
086c8d50a8 Better utf8/32 conversion
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@941 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:27:17 +00:00
theraysmith@gmail.com
7dc5296fe9 Moved -v to training
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@940 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:25:12 +00:00
theraysmith@gmail.com
d09013bcbc Made params more like Google flags
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@939 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:24:06 +00:00
theraysmith@gmail.com
cae0b9392e Added swap
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@938 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:23:21 +00:00
theraysmith@gmail.com
0ea49874ee Removed redundant hashfn.cpp and repurposed hashfn.h as stl compatibility
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@937 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:22:11 +00:00
zdenop@gmail.com
e66d433907 fix issue 938: change tessdata-dir/datadir rules; implement --tessdata-dir option
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@907 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 20:59:11 +00:00
theraysmith@gmail.com
fdb1669cda Fixed srand cast
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@892 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-11 04:57:54 +00:00
theraysmith@gmail.com
4c3475ad2e Fixed fmemopen portability problem
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
zdenop@gmail.com
867149578c fix VC++ compatibility for variadic macros
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@884 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-25 14:02:43 +00:00