Commit Graph

274 Commits

Author SHA1 Message Date
Zdenko Podobný
d1aa61c110 fix issue 1285: reimplement option to select pdf compression 2014-09-06 09:32:22 +02:00
Ray Smith
09b439b05a Fixed issue 1241, but disabled due to making accuracy worse 2014-08-13 13:33:10 -07:00
Ray Smith
cd2653c167 Cleanup from previous changes 2014-08-12 16:12:46 -07:00
theraysmith@gmail.com
dbf6197471 Major refactor of control.cpp to enable line recognition
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1147 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:23:06 +00:00
zdenop
6941bffbd2 fix typo
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1135 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-09 17:53:57 +00:00
zdenop
bce2cd5f33 enable to select pdf compression type and jpeg quality (fix issue 1263)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1134 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-08 21:18:44 +00:00
zdenop@gmail.com
6cdf70b0cf Cleanup an unused variable in ccmain/osdetect.cpp - fix issue 1229
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1133 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-04 08:29:32 +00:00
zdenop
1156098567 Add font info to hocr output - fix issue 1219
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1132 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-03 16:22:12 +00:00
theraysmith@gmail.com
97080412fd Bunch of minor bug fixes/cleanups
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1106 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:48:48 +00:00
zdenop
0e08cb0080 Make default language params message conditional on debug level: issue 1152
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1097 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-09 18:17:29 +00:00
theraysmith@gmail.com
d7b089fbcf Fixed some clang errors about explicit constructors and more formatting.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1085 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-28 23:10:48 +00:00
theraysmith@gmail.com
cda8e748b1 Fixed some formatting issues
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1083 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-25 01:25:42 +00:00
theraysmith@gmail.com
5d61f46332 Fixed issue 1112
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1079 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-24 20:13:38 +00:00
theraysmith@gmail.com
9ebff1ecb0 Fixed issue 1120
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1076 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-24 00:58:19 +00:00
theraysmith@gmail.com
c3166382db Fixed issue 1102
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1069 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-24 00:10:59 +00:00
theraysmith@gmail.com
4decc0f405 Fixed issue 1099
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1067 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-24 00:06:36 +00:00
theraysmith@gmail.com
f3176c2eb5 Misc fixes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1063 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-23 22:54:50 +00:00
theraysmith@gmail.com
8364f24f4b Added ability for box files to store spaces and newlines
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1060 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-23 22:52:05 +00:00
theraysmith@gmail.com
7f5e5264d3 Fixed issues 1093-1097
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1048 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 23:36:24 +00:00
theraysmith@gmail.com
2fcea93846 Fixed issues 1081-1090
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1046 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 02:23:18 +00:00
zdenop
790a3da22f remove 'class IMAGE;'
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1045 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 23:32:23 +00:00
theraysmith@gmail.com
df80e9dc59 Fixed problems with OSD that were exposed by fix to issue 979. Fixes issue 979 properly.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1043 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 19:16:42 +00:00
theraysmith@gmail.com
2ad63776e5 Fixed issue 979
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1034 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-30 02:20:59 +00:00
theraysmith@gmail.com
ad149844f0 Added polygonal block outline output to PageIterator
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1025 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-29 02:23:28 +00:00
theraysmith@gmail.com
6a10aa7985 More cleanup changes from patches
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1024 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-29 02:22:14 +00:00
theraysmith@gmail.com
d11dc049e3 Fixed a lot of compiler/clang warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
1a487252f4 Fixed slow-down that was caused by upping MAX_NUM_CLASSES
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1013 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:12:35 +00:00
zdenop@gmail.com
71ae509354 fix for mingw32/g++ 4.8.1
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@998 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:10:15 +00:00
theraysmith@gmail.com
5857bebdc8 Minor formatting changes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@992 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-17 18:54:16 +00:00
zdenop
3d1e1cc23d fix opencl build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@986 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-13 22:41:52 +00:00
zdenop@gmail.com
94d08567e1 fix vs2010 (and maybe vs2008) build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@983 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 20:13:55 +00:00
theraysmith@gmail.com
fa69183548 Removed dependence on IMAGE class
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@966 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 18:06:45 +00:00
theraysmith@gmail.com
d8d9b390d1 misc fixes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@961 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:49:07 +00:00
theraysmith@gmail.com
372ceb8ef4 Removed dependence on IMAGE class
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@960 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:48:51 +00:00
theraysmith@gmail.com
b0d67f1b5f Removed dependence on IMAGE class
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@959 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:48:26 +00:00
theraysmith@gmail.com
4558794db7 Removed dependence on IMAGE class
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@958 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:48:00 +00:00
theraysmith@gmail.com
d2ad450502 Added PDF renderer
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@957 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:47:34 +00:00
theraysmith@gmail.com
e46510994f Removed dependence on IMAGE class
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@956 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:46:37 +00:00
theraysmith@gmail.com
d2ae81d99b Removed dependence on IMAGE class
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@955 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:45:57 +00:00
theraysmith@gmail.com
2a9171c9cc Removed dependence on IMAGE class
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@954 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:43:17 +00:00
rajesh.katikam@gmail.com
b8d7a1d139 Fixed all the crashes observed on 24 bit and 8 bit images.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@919 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-10 10:52:54 +00:00
zdenop
38b25b5777 fix issue 1018, 1031
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@918 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-06 22:07:46 +00:00
rajesh.katikam@gmail.com
bf0a83907b Cleaned up configure.ac and Makefile.am in multiple folder to use OPENCL paths
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@910 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-12 10:40:40 +00:00
rajesh.katikam@gmail.com
983aaabaae Initial version of OpenCL support added.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@909 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-11 17:43:13 +00:00
theraysmith@gmail.com
5728d6abf2 Added par_control.cpp
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@905 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-09 05:15:07 +00:00
theraysmith@gmail.com
7ec4fd7a56 Refactored control functions to enable parallel blob classification
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@904 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-08 20:30:56 +00:00
zdenop@gmail.com
77c1b41e4e fix svn:executable attribute, trailing spaces, version include
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@903 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-03 17:24:00 +00:00
zdenop@gmail.com
7e89c8d9db count lines from 1 in APPLY_BOXES error message; remove not needed file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@901 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-01 12:42:54 +00:00
theraysmith@gmail.com
4c3475ad2e Fixed fmemopen portability problem
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
zdenop@gmail.com
ee08f623ce fix issue 967
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@886 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-29 20:48:06 +00:00
zdenop@gmail.com
92c0ba06de fix issue 972
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@880 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 19:54:06 +00:00
theraysmith@gmail.com
4d514d5a60 Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
theraysmith@gmail.com
b0fb616299 Generalized feature extractor to allow fx from greyscale
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@875 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:19:50 +00:00
theraysmith@gmail.com
dfc1a92628 Refactored classifier to make it easier to add new ones
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@874 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:16:01 +00:00
theraysmith@gmail.com
2aafc9df24 Improved sub/superscript treatment
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@872 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:49:47 +00:00
zdenop@gmail.com
10c1169d98 remove unused code (Windows related)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@860 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:21:10 +00:00
zdenop@gmail.com
7e14ade10d print error/warning messages to stderr/debug file instead of stdout (fix issue 911)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@843 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-16 20:31:37 +00:00
zdenop@gmail.com
16e80c06ee Test for empty choices at ChoiceIterator (fix issue 826)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@840 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 08:13:22 +00:00
theraysmith@gmail.com
64c739c8af Added sparse text mode, also fixed issue 653.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@820 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-01-03 19:06:41 +00:00
theraysmith@gmail.com
da1047f020 Fixed typos and improved comments
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@753 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:31:20 +00:00
theraysmith@gmail.com
f23460bec4 Removed config_auto.h from .h files
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@748 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:26:10 +00:00
theraysmith@gmail.com
7b90ed28d3 Fixed problem with NULL STRINGs
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@745 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:22:22 +00:00
theraysmith@gmail.com
c2dbb28376 Fixed issues 714, 608
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@744 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:21:24 +00:00
zdenop@gmail.com
60b0d10e16 fix for issue 690
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@736 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-08-01 21:57:49 +00:00
david.eger@gmail.com
eeeb4f513c Provide better paragraph segmentation without having to run fully
automatic layout analysis.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@725 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-10 00:03:34 +00:00
zdenop@gmail.com
e606c311f5 fix issue 684: show correct line in failure message "Couldn't find a matching blob"
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@723 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-22 20:51:00 +00:00
zdenop@gmail.com
cd8de9157c change comments to doxygen block comments (api)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@716 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-30 21:24:12 +00:00
zdenop@gmail.com
5958f01f5f fix doxygen warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@715 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-30 15:42:06 +00:00
zdenop@gmail.com
d4d4b8aad8 improve autotools system (mingw+msys fix); implementation of --disable-tessdata-prefix
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@708 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-22 20:01:33 +00:00
david.eger@gmail.com
c0cd2cd605 Restore VC++ compatibility for paragraphs.cpp.
Missed a __func__ addition in the last merge.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@707 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-21 16:41:27 +00:00
david.eger@gmail.com
a91778397b Fix Issue 645, a char signed/unsigned issue in paragraphs.cpp.
When constructing our debug strings, our simple UTF-8 processing should skip all non-ASCII chars.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@706 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-20 20:19:00 +00:00
zdenop@gmail.com
2972cc426b + fix VS2008 warning about "non dll-interface class tesseract::LTRResultIterator used as base for dll-interface class tesseract::ResultIterator" by making LTRResultIterator also visible.
+ Changed Project preprocessor definition of WINDLLNAME, because stringizing operator doesn't seem to work when initializing tessedit_module_name in ccutil/ccutil.cpp (which was omitted in previous fixes).
+ Update vs2008/tesshelper.py for new public header files.
patch from Tom Powers (https://groups.google.com/group/tesseract-dev/msg/6da2799cd2cb9844)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@702 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-08 21:15:13 +00:00
zdenop@gmail.com
2f1c112640 +Remove visibility from protected members of tesseract::TessBaseAPI class by applying TESS_LOCAL macro;
+Make PageIterator & ResultIterator classes visible by applying TESS_API macro;
+Fix api/Makefile.am & training/Makefile.am to allow Parallel Build Trees;
patch from Tom Powers (https://groups.google.com/group/tesseract-dev/msg/9d00579540e44055)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@701 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-07 22:04:46 +00:00
zdenop@gmail.com
1455bf5610 set tessedit_module_name for windows;
implement 'make install LANG="eng ara deu"';
more headers need to be installed: https://groups.google.com/group/tesseract-dev/msg/a4f7424377993b2e
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@700 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-06 22:41:43 +00:00
david.eger@gmail.com
75a9a8fae7 Address "RIL_PARA doesn't work" comment in issue 622.
http://code.google.com/p/tesseract-ocr/issues/detail?id=622

The core of the problem is that in PSM_SINGLE_BLOCK mode, Tesseract
doesn't run paragraph detection, so no paragraphs get generated.  Here,
we make sure that even if run in a mode where no paragraphs get
generated, we treat each block as its own paragraph.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@696 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-06 20:02:57 +00:00
zdenop@gmail.com
97e19443a3 install only necessary headers, fix uninstall
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@692 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-03 13:22:51 +00:00
zdenop@gmail.com
30a70142a0 visibility - autotools part (./configure --enable-visibility)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@690 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 23:51:33 +00:00
zdenop@gmail.com
e216adab43 fix configure.ac; unify identifiers (WIN32 vs _WIN32)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@688 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 17:31:24 +00:00
zdenop@gmail.com
49c4ce3183 fix for GRAPHICS_DISABLED build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@686 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-01 22:43:51 +00:00
zdenop@gmail.com
6ccab83bd6 fixing issue 628 (replacing __MSW32__ with _WIN32) and issue 614 (reverting "class DLLSYM STRING" to "class CCUTIL_API STRING")
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@677 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-19 21:48:45 +00:00
david.eger@gmail.com
018f192fc2 Abolish populate_unichars(), fixing seg fault reported in Debian:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=658634
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@675 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-15 01:37:00 +00:00
david.eger@gmail.com
22331c03ec Fix issue 613: assert() fail on Windows isspace() when given non-ASCII.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@671 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-10 01:44:36 +00:00
david.eger@gmail.com
78a8356a76 Put one last bigram correction debug statement behind a debug flag.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@669 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-09 20:08:17 +00:00
zdenop@gmail.com
d0c2631ec8 VC++2008 build fix for 3.02 version
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@665 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-03 22:23:12 +00:00
david.eger@gmail.com
56bc885721 Fix some debug messaging about bigram correction -- the two lists of
alternates are not independent.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@664 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-03 19:43:25 +00:00
theraysmith@gmail.com
3a998fe7ac Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic, Added paragraph detection in layout analysis/post OCR, Fixed inconsistent xheight during training and over-chopping, Added simultaneous multi-language capability, Refactored top-level word recognition module, Fixed problems with internally scaled images
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@651 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:59:49 +00:00
theraysmith@gmail.com
ac014eb27a Added experimental equation detector
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@646 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:50:01 +00:00
theraysmith@gmail.com
ef786ad29b Moved ResultIterator/PageIterator to ccmain
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@645 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:47:59 +00:00
zdenop@gmail.com
67f47008c7 fixed "one lib" build on linux; runautoconf renamed to autogen.sh;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@631 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 19:39:54 +00:00
zdenop@gmail.com
da41b96f7f removed check for libtiff - leptonica is required; cleanup #ifdef/#ifndef HAVE_LIBLEPT
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@624 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-30 06:34:41 +00:00
joregan@gmail.com
bf4a09d72a make single/multiple libraries optional -- this needs testing!!!
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@623 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-29 21:28:28 +00:00
theraysmith@gmail.com
4575c52ff5 Removed debugwin.cpp, fixing issue 448
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@613 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:45:59 +00:00
theraysmith@gmail.com
d5d15f32d7 Deleted Makefile.in from svn
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@606 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:32:44 +00:00
zdenop@gmail.com
9b7375edd6 MinGW portability solved + some code cleanup (based on cpplint)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@605 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-15 19:28:25 +00:00
zdenop@gmail.com
7ec3dca968 show page 0 for multipage tiff;
Windows: use binary mode for fopen (issue 70);
autotools: fixed cutil/Makefile.am, improved tessdata/Makefile.am;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@604 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 21:42:13 +00:00
zdenop@gmail.com
4abdfdb8fe moved ccstruct/callcpp.cpp to cutil (to header file - see issue 414); moved vs2008/include/stdint.h to vs2008/port/stdint.h so we can use vs2008/include also for mingw; removed unused tessembedded.*
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@603 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 14:04:20 +00:00
theraysmith
3e8c0bc228 Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@567 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:44:05 +00:00
theraysmith
7121e51422 Deleted lots of dead code, including PBLOB
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@556 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 21:52:08 +00:00
theraysmith
137f4806b6 Added sub/superscript, small/dropcap detection
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@547 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-12-09 01:32:20 +00:00
theraysmith
c8465252e4 Rewrite of DENORM
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@538 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 01:05:48 +00:00
zdenop@gmail.com
4523ce9f7d 3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
zdenop@gmail.com
282aa13975 *.vcproj moved to vs2008/ (bin/ and bin.dbg/ will be in vs2008/)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@506 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-06 21:38:19 +00:00
joregan
e0b07948fc disabling gettext checks - not currently used, and something about disabling is causing subsequent autoconf checks to not run
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@492 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 16:27:39 +00:00
joregan
f2506871f9 move include of config_auto.h to not conflict with local types. Not finished
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@490 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 15:53:40 +00:00
joregan
9943e96163 fix issue 359 - patch from yukihiro.nakadaira
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@481 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 01:02:56 +00:00
zdenop@gmail.com
8e2018d9ec git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@473 d0cd1f9f-072b-0410-8dd7-cf729c803f20 2010-09-29 21:49:36 +00:00
joregan
2d7821506d small tweaks to doxygen
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@451 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-08-12 18:55:59 +00:00
joregan
08defee46e more doxygen
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@450 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-08-10 19:20:11 +00:00
joregan
575b2de48a doxygen
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@446 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-28 00:38:09 +00:00
joregan
b6e3cbea5a more doxygen
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@445 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-27 16:39:45 +00:00
joregan
924f231808 more doxygen
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@442 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-27 14:58:33 +00:00
joregan
a18816f839 partial merge of doxygen branch (stuff without conflicts, basically)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@441 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-27 13:23:23 +00:00
joregan
4acaabdb62 make some static
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@440 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-26 18:21:10 +00:00
joregan
7e8bd73aea some casts to get rid of persistent warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@435 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 21:19:53 +00:00
joregan
cd96d8ede5 more warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@434 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 18:11:00 +00:00
joregan
edf7e7694c silence more useless warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@432 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 15:11:19 +00:00
joregan
522a8ccfc4 fix issue 332
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@429 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-20 10:31:49 +00:00
joregan
54e610e7c0 mark 2 functions static (start to cut down on the export bloat)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@428 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-19 23:29:17 +00:00
joregan
7fee1ed025 this code was so illegible that I *must* replace it *now*
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@427 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-19 22:38:39 +00:00
joregan
69d6d35f28 patch for issue 304 from max.markin
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@422 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-19 02:32:21 +00:00
joregan
a301f9a5c7 start of i18n
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@418 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-19 01:59:13 +00:00
joregan
5279e34296 GRAPHICS_ENABLED means ScrollView, but the correct #define was not being set
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@407 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-06-27 16:03:29 +00:00
joregan
00f6c5d371 more
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@405 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-06-27 15:29:01 +00:00
joregan
95db341728 update comment about format
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@398 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-06-05 11:52:17 +00:00
joregan
cfcd9a1b5a make cppcheck happy
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@388 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-30 03:16:54 +00:00
joregan
5c8ad7ee72 add config_auto.h anywhere #ifndef GRAPHICS_DISABLED is used
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@384 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-28 12:03:45 +00:00
joregan
ddcb98565a update generated autoconf/make stuff
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@369 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-26 14:21:37 +00:00
joregan
34d8258049 use libtool
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@368 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-26 14:20:20 +00:00
joregan
38a6b18a5f disable MSVC warning C4244 in a number of places to cut down the noise
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@363 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-26 10:22:27 +00:00
theraysmith
8d654e7476 Fixed issue 243, ungraded helpers, genericvector
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@340 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-19 22:35:35 +00:00
theraysmith
57d669ff84 Fixed issue 229: lack of bits per sample
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@316 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-08-20 22:30:21 +00:00
theraysmith
9e67cb0773 More accidental files
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@290 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:15:47 +00:00
theraysmith
eb0ab3ed02 Deleting files from ccstruct added by mistake to ccmain
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@288 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:12:51 +00:00
theraysmith
96e8b51feb More changes to ccmain for 3.00
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@287 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:07:25 +00:00
theraysmith
109d1c8f21 Some changes in ccmain for 3.00
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@286 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:03:51 +00:00
theraysmith
2ac934453f Improved box accuracy on failed blobs
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@270 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-06-30 01:48:21 +00:00
theraysmith
bea5e04b76 Fixed compilation with GRAPHICS_DISABLED
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@250 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-06-03 17:24:08 +00:00
theraysmith
f3060abf71 Automake changes for potential RC of 2.04
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@248 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-06-03 02:50:54 +00:00
theraysmith
e4b9281726 Fixed output of tprintf for windows
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@235 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-06-02 21:59:39 +00:00
theraysmith
51ed03368d Fixes to lists so an empty constructor is not needed + reenable debugging
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@207 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-12-30 18:15:44 +00:00
theraysmith
cb3b9b492f Fixed tiffio problems with 32 bit images, issue 160 and duplicates
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@204 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-12-24 01:02:14 +00:00
tmbdev
a978ccb68f changed runautoconf instructions
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@183 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-08-18 20:18:21 +00:00
mezhirov
3f218cd158 Bugfix (usage of bounding_union() changed)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@169 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-29 16:54:43 +00:00
theraysmith
f3e67dd89b Improved autoconf to find leptonica headers if present
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@168 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 17:34:42 +00:00
theraysmith
3cf46f21d4 Fixed stupid crash error in 2.02
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@167 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 15:42:11 +00:00
mezhirov
a4d75230fc Converted 8 spaces to tabs in two Makefile.am-s.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@166 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 14:49:14 +00:00
theraysmith
7870d67c21 Fixed name collision with jpeg library
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@157 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 00:32:14 +00:00
theraysmith
10265fb9cc Updated graphics output for new java-based display
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@136 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:33:18 +00:00
theraysmith
d543e8c2bc added leptonica support and additional interfaces
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@135 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:28:18 +00:00
theraysmith
830a2f54b9 Removed some compiler warnings on operator precedence
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@131 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:13:28 +00:00