Commit Graph

3171 Commits

Author SHA1 Message Date
Ray Smith
3adb03b5c8 Merge branch 'master' of https://code.google.com/p/tesseract-ocr
Why? Isn't git easier? Just updating from remote.
2014-08-13 13:36:36 -07:00
Ray Smith
09b439b05a Fixed issue 1241, but disabled due to making accuracy worse 2014-08-13 13:33:10 -07:00
Zdenko Podobný
769fef8c96 fix training tools build 2014-08-13 22:07:44 +02:00
Zdenko Podobný
3295dc29e2 improve testing whether it is possible to build trainings tools 2014-08-13 21:18:03 +02:00
Zdenko Podobný
481276f107 add .gitignore to ignore build files 2014-08-13 21:16:37 +02:00
Ray Smith
9c58701471 Fix to baselinedetect from issue 1205 2014-08-12 16:14:19 -07:00
Ray Smith
cd2653c167 Cleanup from previous changes 2014-08-12 16:12:46 -07:00
Ray Smith
736d327473 NOP changes from static analysis in issue 1205 2014-08-12 16:09:12 -07:00
theraysmith@gmail.com
dbf6197471 Major refactor of control.cpp to enable line recognition
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1147 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:23:06 +00:00
theraysmith@gmail.com
e249d7bcb2 Added tesstrain.sh - a master training script
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1146 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:20:56 +00:00
theraysmith@gmail.com
c9385a2755 Added tesstrain.sh - a master training script
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1145 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:20:45 +00:00
theraysmith@gmail.com
1fc8898926 Fixed missing newlines in logging
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1144 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:20:08 +00:00
theraysmith@gmail.com
6fcede5c48 Fixed some leaks
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1143 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:19:24 +00:00
theraysmith@gmail.com
9f4d6fd668 Added ability to just list available fonts for text, and to underline words for training
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1142 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:19:06 +00:00
theraysmith@gmail.com
b64ad05096 Improved efficiency of image processing for PDF
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1141 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:15:25 +00:00
theraysmith@gmail.com
36b55f7710 Removed unused variable
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1140 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:10:06 +00:00
theraysmith@gmail.com
c86fe22a62 Started TFile conversion to remove fmemopen
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1139 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:09:25 +00:00
theraysmith@gmail.com
d52231cff3 Started TFile conversion to remove fmemopen
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1138 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:08:46 +00:00
zdenop
c51691fdeb add parameter info to ParamUtils::PrintParams
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1137 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-10 19:08:20 +00:00
zdenop
7239cec2b4 fix off_t issue on OSX
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1136 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-10 16:42:45 +00:00
zdenop
6941bffbd2 fix typo
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1135 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-09 17:53:57 +00:00
zdenop
bce2cd5f33 enable to select pdf compression type and jpeg quality (fix issue 1263)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1134 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-08 21:18:44 +00:00
zdenop@gmail.com
6cdf70b0cf Cleanup an unused variable in ccmain/osdetect.cpp - fix issue 1229
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1133 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-04 08:29:32 +00:00
zdenop
1156098567 Add font info to hocr output - fix issue 1219
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1132 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-03 16:22:12 +00:00
zdenop
19ddc89c44 update tesseract manpage and INSTALL.SVN
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1131 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-02 20:59:19 +00:00
zdenop
1ea387232b fix compatibility of uninstall: MacOSX rm needs -f instead of --force
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1127 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-07-24 20:39:30 +00:00
zdenop
5b779456f9 fix compatibility with leptonica 1.71 and 1.70
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1126 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-07-24 19:11:39 +00:00
zdenop
c550aee2f9 revert commit r1122 ;-)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1123 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-07-01 22:04:56 +00:00
zdenop
bcbfb93475 fix issue 1240
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1122 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-07-01 21:57:22 +00:00
zdenop
95b7783a95 fix issue 1228: bilevel pdf output - horizontal/vertical lines removed
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1118 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-23 21:04:37 +00:00
zdenop
905e6162b9 put info about (API) version; fix typo
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1117 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-22 18:31:42 +00:00
zdenop
41bd040ef5 fix issue 1043
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1116 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-08 21:26:02 +00:00
zdenop@gmail.com
780183226c Accept Windows EOL in config file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1115 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-03 06:57:52 +00:00
rajesh.katikam@gmail.com
3ff108cf45 OpenCL fix for PixMemTiff
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1114 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-02 05:42:05 +00:00
zdenop
fad9de4e1b fix issue 1217: GetThresholdedImage accesses possibly NULL thresholder_
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1113 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 21:21:37 +00:00
zdenop
e64f555567 fix Issue 1223: TessPolyBlockType enum is outdated in C-API
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1112 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 20:31:48 +00:00
zdenop
36f3f76d64 fix tiff issue on windows
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1111 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 07:27:54 +00:00
zdenop@gmail.com
84cdcb32cc fixed windows build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1110 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-26 06:48:58 +00:00
zdenop
19c4c2f0e7 fix C-API to resent C++ API changes - thanks to Nick White
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1109 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-25 21:03:11 +00:00
zdenop
ffe52737d5 check if input file exists
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1108 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-25 19:58:00 +00:00
theraysmith@gmail.com
97080412fd Bunch of minor bug fixes/cleanups
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1106 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:48:48 +00:00
theraysmith@gmail.com
25a8c7b720 Enabled streaming input and output of multi-page documents
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1105 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:46:21 +00:00
zdenop
30e5220f2e fix training build for opencl and mingw
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1103 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-16 19:36:32 +00:00
zdenop
979f9cafe5 Add word recognition language to C-API - fix issue 1200
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1102 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-16 18:35:54 +00:00
zdenop
44b0d0e28e addition to r1100
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1101 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 21:24:54 +00:00
zdenop
6051e40212 fix issue 1197
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1100 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 21:20:38 +00:00
zdenop
2e520f2fac fix hocr/pdf output when image is provided from stdin - issue 1196
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1099 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 15:59:47 +00:00
zdenop
bdb912c186 escape input_file name in hOCR output - fix issue 1154
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1098 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-09 22:19:30 +00:00
zdenop
0e08cb0080 Make default language params message conditional on debug level: issue 1152
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1097 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-09 18:17:29 +00:00
zdenop
c3b6ac7f32 skip imagedata build to fix issue 1150 on Mac OS X
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1096 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-07 21:04:42 +00:00