Commit Graph

2747 Commits

Author SHA1 Message Date
theraysmith@gmail.com
99edf4ccbd Refactored classifier to make it easier to add new ones and generalized feature extractor to allow fx from grey
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@873 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:15:06 +00:00
theraysmith@gmail.com
2aafc9df24 Improved sub/superscript treatment
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@872 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:49:47 +00:00
theraysmith@gmail.com
96c662ed6e Improved baseline fit
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@871 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:48:16 +00:00
theraysmith@gmail.com
42144b9698 Improved baseline fit
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@870 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:43:47 +00:00
theraysmith@gmail.com
88ea81c89e Added renderer to API
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@869 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:39:59 +00:00
zdenop@gmail.com
5b9cfaf30d fix issue 962
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@866 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-28 20:59:15 +00:00
zdenop@gmail.com
b5e16669e1 fix issue 946/reopen issue 903
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@865 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-25 15:54:30 +00:00
zdenop@gmail.com
b1fd75ccf9 amend r:862
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@863 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-14 14:11:16 +00:00
zdenop@gmail.com
c45bb08a6e check inputformat before getting number of pages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@862 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-14 13:58:23 +00:00
zdenop@gmail.com
ebd0ba8134 remove unused code (tesseractmain.h)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@861 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:23:47 +00:00
zdenop@gmail.com
10c1169d98 remove unused code (Windows related)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@860 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:21:10 +00:00
zdenop@gmail.com
b5d3d66a68 remove unused code (gettext)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@859 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-07 16:39:13 +00:00
zdenop@gmail.com
4c16ff6a1f use leptonica for getting number of pages instead of own code
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@858 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-05 16:07:25 +00:00
zdenop@gmail.com
d919bfde1e increase version number
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@857 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-05 09:35:42 +00:00
zdenop@gmail.com
8a0878af3a fix mingw build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@856 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-05 08:46:57 +00:00
zdenop@gmail.com
418a7ad16f allow to have text file with list of images as input
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@855 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-27 21:53:53 +00:00
zdenop@gmail.com
e5628e5e1a fix hOCR output - do not print empty words: issue 903
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@854 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-23 15:10:24 +00:00
theraysmith@gmail.com
4d9e544085 Fixed debian bug#704911: assert failure during training
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@851 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-13 22:53:43 +00:00
zdenop@gmail.com
74dc14ebd4 fix copying a TessResultIterator using CAPI (issue 934)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@849 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-02 21:25:41 +00:00
zdenop@gmail.com
b9abecfb34 Auto append dot in combine_tessdata (issue 932); provide more info for combine_tessdata utility
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@848 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-02 20:50:55 +00:00
zdenop@gmail.com
ad004bddf7 More info for combine_tessdata files (thanks to gmvbif - issue 917)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@845 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-21 18:34:47 +00:00
zdenop@gmail.com
e4c00773de fix typo (issue 908)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@844 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-16 20:42:02 +00:00
zdenop@gmail.com
7e14ade10d print error/warning messages to stderr/debug file instead of stdout (fix issue 911)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@843 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-16 20:31:37 +00:00
zdenop@gmail.com
642e9e7615 fix segfault for PSM_SINGLE_CHAR (issue 845)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@842 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 19:45:29 +00:00
zdenop@gmail.com
62b2e12b72 replace option -o with -c
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@841 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 17:06:14 +00:00
zdenop@gmail.com
16e80c06ee Test for empty choices at ChoiceIterator (fix issue 826)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@840 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 08:13:22 +00:00
zdenop@gmail.com
80040b834b Fix segfault at ComputeNormMatch/normmatch.cpp:118 (issue 755)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@839 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 08:00:23 +00:00
zdenop@gmail.com
16aa99315a make ocrclass.h public header (fix issue 897)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@838 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-30 18:12:11 +00:00
zdenop@gmail.com
7dcfd02c22 Allow arbitrary configuration options to be set from the command line (fix issue 893)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@837 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-29 20:43:14 +00:00
zdenop@gmail.com
1032cb1692 fix issue 881: capi.h redefines things from Leptonica, causing compilation failures
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@836 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-29 17:57:21 +00:00
zdenop@gmail.com
a04a5c1f42 Tesseract should exit with an error if ProcessPages fails (fixed issue 891)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@834 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-12 08:14:13 +00:00
zdenop@gmail.com
a6bee550e8 Add lang and dir attributes to each word in hOCR output (fix issue 878);
Unify usage of single quote in hOCR output
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@832 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-03-28 21:37:55 +00:00
zdenop@gmail.com
902d73dda4 fix download link in vs2008/doc/setup.html
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@831 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-03-12 17:43:11 +00:00
zdenop@gmail.com
db52047420 fix issue 809: invalid hOCR output file on windows when input filename has non ascii chars.
Add release date to vs2008/doc/versions.html
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@828 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-02-23 15:01:21 +00:00
zdenop@gmail.com
e8f7dc8b54 fix issue 426 - Cannot get Viewer to work on MacOS X
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@827 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-02-23 11:57:40 +00:00
zdenop@gmail.com
32d212d0c6 add new config file - get.image
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@826 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-02-23 11:56:49 +00:00
zdenop@gmail.com
6e59888b76 put back --with-extra-libraries and --with-extra-includes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@824 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-01-31 23:10:18 +00:00
zdenop@gmail.com
5afcfde428 clean up configure.ac (fix for issue 819 and 763)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@823 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-01-30 22:57:49 +00:00
theraysmith@gmail.com
00a79cb93a Fixed crash reported as bug 697544 to debian
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@822 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-01-16 00:16:12 +00:00
theraysmith@gmail.com
64c739c8af Added sparse text mode, also fixed issue 653.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@820 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-01-03 19:06:41 +00:00
theraysmith@gmail.com
d0693a6d3b Integrated patch to AUTHORS fixing issue 814 and adding more authors from the code
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@819 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-01-03 18:02:49 +00:00
zdenop@gmail.com
37cb31afc9 Sanitise pkg-config file (issue 817)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@817 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-01-02 12:53:48 +00:00
zdenop@gmail.com
5947f3da38 add NSIS script for Windows installer
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@815 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-27 19:53:43 +00:00
zdenop@gmail.com
37fb755d47 Add a command-line option (--print-parameters) to dump the parameters to stdout
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@814 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:54:14 +00:00
zdenop@gmail.com
4812fac33e Fix issue 427: print result to stdout instead to file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@813 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:52:42 +00:00
zdenop@gmail.com
8a2b5f0ead Fix issue 808: Check for output file write permissions before performing lengthy OCR operation
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@812 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:49:15 +00:00
zdenop@gmail.com
42c92c3e29 avoid multiple tesseract inits in tesseract executable
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@811 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:47:06 +00:00
zdenop@gmail.com
9b2906c67e fix issue 800: Get rid of glob() for searching available languages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@810 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-11-30 22:11:22 +00:00
zdenop@gmail.com
5d9fd5fb72 add word confidence info (x_wconf) to hocr output/fix issue 748
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@806 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-11-06 21:18:35 +00:00
zdenop@gmail.com
d54d1e0095 fix tarball portability
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@802 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-11-01 22:56:52 +00:00