zdenop
bdb912c186
escape input_file name in hOCR output - fix issue 1154
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1098 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-09 22:19:30 +00:00
zdenop
30f6ae6742
amendment to r1091
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1095 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-07 20:53:03 +00:00
zdenop
ee73e3b107
fix issue 123: user-words (and user-patterns) file specified by command line
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1093 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-04 21:11:00 +00:00
zdenop
bc09cd9040
fix formatting in C-API and add TessChoiceIteratorDelete
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1092 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-03 20:21:37 +00:00
zdenop
f86e9d83d4
add ChoiceIterator to C-API - fix issue 1149
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1091 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-03 09:29:20 +00:00
theraysmith@gmail.com
45e106820f
Fixed issue 1116
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1074 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-24 00:50:27 +00:00
zdenop@gmail.com
2367ba1f6e
fix PDF rendering for Arabic. http://ftp.de.debian.org/debian/pool/main/t/tesseract/tesseract_3.03.02-3.diff.gz
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1055 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-21 10:11:32 +00:00
zdenop
d451b28054
fix issue 1127; add unlv output to tesseract executable
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1052 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-02 14:40:21 +00:00
zdenop
f01ea0e485
C-API: remove comments
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1047 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 20:01:55 +00:00
theraysmith@gmail.com
2fcea93846
Fixed issues 1081-1090
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1046 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 02:23:18 +00:00
zdenop
790a3da22f
remove 'class IMAGE;'
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1045 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 23:32:23 +00:00
theraysmith@gmail.com
864b2f6d80
Fixed problems with selection/copy/paste in some PDF viewers
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1042 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 19:14:16 +00:00
zdenop
4e526f987e
C-API: another update API based on changes in baseapi.h
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1041 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 13:24:55 +00:00
zdenop
0e238d43ba
C-API: update API based on changes in baseapi.h; add renderer
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1040 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-02 23:50:47 +00:00
zdenop
32789291a8
provide output for -psm 0
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1037 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-01 12:56:36 +00:00
zdenop
080e0c028a
C-API: add function to set init parameter during Init with c-string array
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1036 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-01 10:36:40 +00:00
theraysmith@gmail.com
4585a4c9df
Fixed empty page with color input
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1032 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-30 02:18:51 +00:00
theraysmith@gmail.com
0ddc7bfcaf
Fixed first-word only bug in PDF output.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1022 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-27 22:40:03 +00:00
theraysmith@gmail.com
d11dc049e3
Fixed a lot of compiler/clang warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
1a487252f4
Fixed slow-down that was caused by upping MAX_NUM_CLASSES
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1013 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:12:35 +00:00
theraysmith@gmail.com
0d93bb7cfa
More code cleanup from patches and fixing warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1011 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:09:59 +00:00
theraysmith@gmail.com
5b9a7e06eb
Turned on pdfrenderer functionality that needs leptonica 1.70
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1009 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-23 23:01:10 +00:00
zdenop@gmail.com
9f2730600d
fix segfault for --list-langs
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1006 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 21:32:20 +00:00
zdenop@gmail.com
21756518e2
don't display tesseract info line if output is stdout
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@999 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:17:37 +00:00
zdenop@gmail.com
71ae509354
fix for mingw32/g++ 4.8.1
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@998 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:10:15 +00:00
zdenop@gmail.com
ef3b1d936e
fix mingw build issues
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@995 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-18 09:00:54 +00:00
zdenop
ff5fb7f105
fix issue 1044: OS X build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@994 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-17 20:53:15 +00:00
zdenop@gmail.com
94d08567e1
fix vs2010 (and maybe vs2008) build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@983 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 20:13:55 +00:00
zdenop
f2e4dba850
fix issue 995 - produce output orientation info
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@982 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 14:47:15 +00:00
theraysmith@gmail.com
91d2265429
More minor fixes from issues and cleanup
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@974 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 01:38:00 +00:00
theraysmith@gmail.com
256929ce5a
Cleaned up stdin implementation
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@969 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 18:48:43 +00:00
theraysmith@gmail.com
f2ec85d1e1
Added PDF renderer
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@962 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:58:55 +00:00
zdenop@gmail.com
9c25eda469
fix issue 813: implement input through stdin
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@936 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-07 16:48:52 +00:00
zdenop
ed28bae8d2
produce only one output file in case of hocr
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@935 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-06 14:01:32 +00:00
zdenop
11f7eea7e1
fix tiff identification
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@934 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-06 13:25:42 +00:00
zdenop
577e919215
move PERF_COUNT_START message below tesseract message; implement parameter to suppress test blob messages
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@932 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-22 21:58:52 +00:00
zdenop
fced05f419
identify all tiff versions supported by leptonica
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@931 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-22 21:47:07 +00:00
zdenop
8b3e590123
fix OpenCL build on OSX 10.9; add info about OpenCL to 'tesseract -v'
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@921 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-14 08:35:14 +00:00
zdenop
9de80e0a06
fix resource leaks - issues 1034, 1038, 1040. Thanks to Martin Ettl
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@920 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-13 22:13:52 +00:00
rajesh.katikam@gmail.com
b8d7a1d139
Fixed all the crashes observed on 24 bit and 8 bit images.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@919 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-10 10:52:54 +00:00
rajesh.katikam@gmail.com
bf0a83907b
Cleaned up configure.ac and Makefile.am in multiple folder to use OPENCL paths
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@910 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-12 10:40:40 +00:00
rajesh.katikam@gmail.com
983aaabaae
Initial version of OpenCL support added.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@909 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-11 17:43:13 +00:00
zdenop@gmail.com
c7ba981e04
fix validity of hocr output of multipage image
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@908 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 22:00:54 +00:00
zdenop@gmail.com
e66d433907
fix issue 938: change tessdata-dir/datadir rules; implement --tessdata-dir option
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@907 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 20:59:11 +00:00
zdenop@gmail.com
77c1b41e4e
fix svn:executable attribute, trailing spaces, version include
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@903 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-03 17:24:00 +00:00
zdenop@gmail.com
b15c710385
fix declaration of ClearResults() (VC++)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@891 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 09:06:22 +00:00
theraysmith@gmail.com
4c3475ad2e
Fixed fmemopen portability problem
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
zdenop@gmail.com
ee08f623ce
fix issue 967
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@886 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-29 20:48:06 +00:00
zdenop@gmail.com
af319b4d90
fix for windows build - part 1
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@883 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-25 09:56:49 +00:00
theraysmith@gmail.com
4d514d5a60
Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
theraysmith@gmail.com
88ea81c89e
Added renderer to API
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@869 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:39:59 +00:00
zdenop@gmail.com
b5e16669e1
fix issue 946/reopen issue 903
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@865 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-25 15:54:30 +00:00
zdenop@gmail.com
b1fd75ccf9
amend r:862
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@863 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-14 14:11:16 +00:00
zdenop@gmail.com
c45bb08a6e
check inputformat before getting number of pages
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@862 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-14 13:58:23 +00:00
zdenop@gmail.com
ebd0ba8134
remove unused code (tesseractmain.h)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@861 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:23:47 +00:00
zdenop@gmail.com
10c1169d98
remove unused code (Windows related)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@860 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:21:10 +00:00
zdenop@gmail.com
b5d3d66a68
remove unused code(gettext)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@859 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-07 16:39:13 +00:00
zdenop@gmail.com
4c16ff6a1f
use leptonica for getting number of pages instead of own code
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@858 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-05 16:07:25 +00:00
zdenop@gmail.com
8a0878af3a
fix mingw build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@856 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-05 08:46:57 +00:00
zdenop@gmail.com
418a7ad16f
allow a text file with a list of images as input
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@855 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-27 21:53:53 +00:00
zdenop@gmail.com
e5628e5e1a
fix hOCR output - do not print empty words: issue 903
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@854 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-23 15:10:24 +00:00
zdenop@gmail.com
74dc14ebd4
fix copying a TessResultIterator using CAPI (issue 934)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@849 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-02 21:25:41 +00:00
zdenop@gmail.com
62b2e12b72
replace option -o with -c
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@841 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 17:06:14 +00:00
zdenop@gmail.com
7dcfd02c22
Allow arbitrary configuration options to be set from the command line (fix issue 893)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@837 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-29 20:43:14 +00:00
zdenop@gmail.com
1032cb1692
fix issue 881: capi.h redefines things from Leptonica, causing compilation failures
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@836 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-29 17:57:21 +00:00
zdenop@gmail.com
a04a5c1f42
Tesseract should exit with an error if ProcessPages fails (fixed issue 891)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@834 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-12 08:14:13 +00:00
zdenop@gmail.com
a6bee550e8
Add lang and dir attributes to each word in hOCR output (fix issue 878);
...
Unify usage of single quote in hOCR output
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@832 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-03-28 21:37:55 +00:00
zdenop@gmail.com
db52047420
fix issue 809: invalid hOCR output file on windows when input filename has non ascii chars.
...
Add release date to vs2008/doc/versions.html
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@828 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-02-23 15:01:21 +00:00
zdenop@gmail.com
37fb755d47
Add a command-line option (--print-parameters) to dump the parameters to stdout
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@814 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:54:14 +00:00
zdenop@gmail.com
4812fac33e
Fix issue 427: print result to stdout instead of to a file
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@813 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:52:42 +00:00
zdenop@gmail.com
8a2b5f0ead
Fix issue 808: Check for output file write permissions before performing lengthy OCR operation
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@812 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:49:15 +00:00
zdenop@gmail.com
42c92c3e29
avoid multiple tesseract inits in tesseract executable
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@811 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:47:06 +00:00
zdenop@gmail.com
9b2906c67e
fix issue 800: Get rid of glob() for searching available languages
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@810 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-11-30 22:11:22 +00:00
zdenop@gmail.com
5d9fd5fb72
add word confidence info (x_wconf) to hocr output/fix issue 748
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@806 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-11-06 21:18:35 +00:00
theraysmith@gmail.com
af04ae882f
Made use of _ macro and stderr consistent with error messages.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@780 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-22 23:40:19 +00:00
zdenop@gmail.com
6b4970776d
Fixed tessdata_dir for tesseract executable.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@777 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-11 19:47:17 +00:00
theraysmith@gmail.com
605fd7488b
Fixed relative-to-executable tessdata location, while allowing for addition of terminating /
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@774 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-09 00:41:08 +00:00
zdenop@gmail.com
ceff3288d7
fix issue 764...
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@768 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-27 08:43:55 +00:00
zdenop@gmail.com
fb91759cdc
fix issue 764 and clean tabulators, trim trailing spaces...
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@767 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-27 08:24:46 +00:00
zdenop@gmail.com
23f1d16037
fix for issue 346 / GetAvailableLanguagesAsVector
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@760 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-24 05:20:23 +00:00
zdenop@gmail.com
dc8bd4682b
C-API (fix issue 362)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@759 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-24 05:14:11 +00:00
theraysmith@gmail.com
fbf7968490
Fixed problem with blank pages
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@750 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:27:25 +00:00
zdenop@gmail.com
2a57976c41
- fix msys build (missing -lws2_32 for library)
...
- remove old debian leptonica package
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@738 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-08-25 19:53:41 +00:00
zdenop@gmail.com
306a8216e1
fix creating box file from empty image (issue 516)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@737 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-08-03 22:32:17 +00:00
zdenop@gmail.com
c8eedb25a6
added ocr-capabilities for hocr conformity; XHTML 1.0 Transitional conformity; improved hocr output readability
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@729 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-28 20:44:23 +00:00
david.eger@gmail.com
6a9a3ddcb2
Zdeno pointed out that ocr_line (though not ocr_word) is actually in the hocr spec.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@728 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-27 23:58:09 +00:00
david.eger@gmail.com
d9d70919bb
Conform to the hocr spec: hocr doesn't have ocr_word, but instead has ocrx_word.
...
Tested with ExactImage's hocr2pdf.
$ tesseract phototest.tif phototest hocr
$ hocr2pdf -i phototest.tif -o ./phototest.pdf < ./phototest.hocr
$ evince phototest.pdf
See: https://docs.google.com/document/preview?id=1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@726 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-25 17:36:25 +00:00
david.eger@gmail.com
eeeb4f513c
Provide better paragraph segmentation without having to run fully
...
automatic layout analysis.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@725 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-10 00:03:34 +00:00
zdenop@gmail.com
aa14e8b212
fix Mingw shared build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@718 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-02 12:14:37 +00:00
zdenop@gmail.com
cd8de9157c
change comments to doxygen block comments (api)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@716 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-30 21:24:12 +00:00
zdenop@gmail.com
ee44165d3d
improve doxygen config; fix doxygen warnings for baseapi.h
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@712 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 20:38:14 +00:00
zdenop@gmail.com
3115fbfdcb
another fix MinGW+MSYS
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@709 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-24 10:14:47 +00:00
zdenop@gmail.com
d4d4b8aad8
improve autotools system (mingw+msys fix); implementation of --disable-tessdata-prefix
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@708 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-22 20:01:33 +00:00
zdenop@gmail.com
2f1c112640
+Remove visibility from protected members of tesseract::TessBaseAPI class by applying TESS_LOCAL macro;
...
+Make PageIterator & ResultIterator classes visible by applying TESS_API macro;
+Fix api/Makefile.am & training/Makefile.am to allow Parallel Build Trees;
patch from Tom Powers (https://groups.google.com/group/tesseract-dev/msg/9d00579540e44055 )
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@701 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-07 22:04:46 +00:00
david.eger@gmail.com
c2e84c4606
Fix two issues with GetHOCRText():
...
+ make it not seg-fault if called without calling SetInputName().
+ make it not leak memory (thank you valgrind)
http://code.google.com/p/tesseract-ocr/issues/detail?id=463
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@699 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-06 21:18:16 +00:00
zdenop@gmail.com
765832d449
fixes issue 573 where boolean was being compared to float;
...
tesseract prints full version info when -v arg;
removes extra includes from tesseractmain.h;
removes extra DLLEXPORT & DLLIMPORT from hosts.h;
remove CCUTIL_IMPORTS & CCUTIL_EXPORTS from vs2008 *.vcproj;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@694 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-04 22:27:16 +00:00
zdenop@gmail.com
97e19443a3
install only necessary headers, fix uninstall
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@692 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-03 13:22:51 +00:00
zdenop@gmail.com
3b326532cc
fix --enable-multiple-libraries; implement quiet mode (issue 580)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@691 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-03 11:48:59 +00:00
zdenop@gmail.com
30a70142a0
visibility - autotools part (./configure --enable-visibility)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@690 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 23:51:33 +00:00
zdenop@gmail.com
a776e0be85
TP: visibility trial - code & windows build changes (without autotools changes)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@689 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 17:48:45 +00:00
zdenop@gmail.com
e216adab43
fix configure.ac; unify identifiers (WIN32 vs _WIN32)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@688 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 17:31:24 +00:00
zdenop@gmail.com
49c4ce3183
fix for GRAPHICS_DISABLED build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@686 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-01 22:43:51 +00:00
zdenop@gmail.com
df1cbdd7d3
fix for issue 463 (GetHOCRText segfaults unless SetInputName has been called first); removed declaration of GetLastInitLanguage
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@684 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-27 17:19:20 +00:00
zdenop@gmail.com
492f9119c2
check return code of API init (issue 593)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@680 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-26 14:48:35 +00:00
zdenop@gmail.com
6ccab83bd6
fixing issue 628 (replacing __MSW32__ with _WIN32) and issue 614 (reverting "class DLLSYM STRING" to "class CCUTIL_API STRING")
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@677 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-19 21:48:45 +00:00
theraysmith@gmail.com
23dfabcab1
Cleaned up externally used namespace by removing includes from baseapi.h
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@657 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:14:16 +00:00
theraysmith@gmail.com
ef786ad29b
Moved ResultIterator/PageIterator to ccmain
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@645 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:47:59 +00:00
zdenop@gmail.com
67f47008c7
fixed "one lib" build on linux; runautoconf renamed to autogen.sh;
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@631 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 19:39:54 +00:00
max.markin@gmail.com
0fef845950
VC2010: add support for dynamic linking
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@629 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-15 22:17:19 +00:00
zdenop@gmail.com
da41b96f7f
removed check for libtiff - leptonica is required; cleanup #ifdef/#ifndef HAVE_LIBLEPT
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@624 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-30 06:34:41 +00:00
joregan@gmail.com
bf4a09d72a
make single/multiple libraries optional -- this needs testing!!!
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@623 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-29 21:28:28 +00:00
theraysmith@gmail.com
0d969b7b3a
Fixed problem of config file vs command line for pageseg mode
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@611 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:41:51 +00:00
theraysmith@gmail.com
7ab0a97180
Fixed comment re bln_numericmode
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@610 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:41:03 +00:00
theraysmith@gmail.com
d5d15f32d7
Deleted Makefile.in from svn
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@606 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:32:44 +00:00
zdenop@gmail.com
7ec3dca968
show page 0 for multipage tiff;
...
Windows: use binary mode for fopen (issue 70);
autotools: fixed cutil/Makefile.am, improved tessdata/Makefile.am;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@604 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 21:42:13 +00:00
zdenop@gmail.com
9b9efa8e4c
man pages included in install script, improved windows installer script (issue 425), output format for "tesseract -v" changed to "3.00 version", README cleanup.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@601 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-08 20:33:18 +00:00
zdenop@gmail.com
411e074b4d
fix for issues 479, 524 + tests for input image (there are no leptonica error messages on Windows console)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@597 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-07-29 21:55:49 +00:00
zdenop@gmail.com
1ad70ea8ff
fixing issues 518 and 521
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@596 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-07-27 20:56:40 +00:00
zdenop@gmail.com
505c8dbece
changed "xocr_word" to "ocrx_word" according to the hOCR spec
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@585 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-05-24 20:53:58 +00:00
zdenop@gmail.com
b54eee99ac
configure script requires liblept;
...
add '--version' option for tesseract as alternative to '-v'
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@584 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-05-24 20:17:28 +00:00
theraysmith
c81483f714
Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@566 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:43:04 +00:00
theraysmith
a3f30eb5c7
Deleted lots of dead code, including PBLOB
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@555 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 21:51:34 +00:00
theraysmith
0d81f4b649
Fixed problem that was preventing pagesegmode from being set by config file
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@554 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 21:43:38 +00:00
theraysmith
f040994f51
Fixed closing meta element in hocr output
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@549 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-12-09 06:25:20 +00:00
theraysmith
a7db6dada9
Fix for linking with leptonica on Linux.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@548 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-12-09 01:40:39 +00:00
theraysmith
137f4806b6
Added sub/superscript, small/dropcap detection
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@547 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-12-09 01:32:20 +00:00
zdenop@gmail.com
c707b26d5f
fixed VC++2008 Express build after last changes
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@543 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 12:46:41 +00:00
theraysmith
ef59841ebe
Moved multipage code to BaseAPI and tidied up command line handling
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@532 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 00:58:30 +00:00
zdenop@gmail.com
4523ce9f7d
3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
zdenop@gmail.com
7511d76315
fixed hocr to produce valid document (according to http://validator.w3.org/ ) - issue http://code.google.com/p/tesseract-ocr/issues/detail?id=401
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@525 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-17 20:03:58 +00:00
zdenop@gmail.com
a750ffed7a
fixed issue 394: The tessedit_pageseg_mode does not work; thanks sms@fritzwidmer.ch
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@517 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-02 20:52:58 +00:00
zdenop@gmail.com
fa4d4589cb
fixed hocr (escape special characters; thanks to aizvorski) + hocr config
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@515 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-29 19:03:06 +00:00
zdenop@gmail.com
346da8c1e5
missing returns in nonvoid functions (thanks to rusnakp) issue 389;
...
corrected windows installation script - tesseract should not be run as a start-up application;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@514 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-26 13:35:02 +00:00
zdenop@gmail.com
21d6ea66c2
better handling of multipage tiff (issue 380 http://code.google.com/p/tesseract-ocr/issues/detail?id=380 )
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@513 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-22 06:13:54 +00:00
joregan
e0b07948fc
disabling gettext checks - not currently used, and something about disabling is causing subsequent autoconf checks to not run
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@492 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 16:27:39 +00:00
joregan
f2506871f9
move include of config_auto.h to not conflict with local types. Not finished
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@490 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 15:53:40 +00:00
joregan
7efbd3dab7
crap
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@444 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-27 15:17:52 +00:00
joregan
a18816f839
partial merge of doxygen branch (stuff without conflicts, basically)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@441 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-27 13:23:23 +00:00
joregan
7e8bd73aea
some casts to get rid of persistent warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@435 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 21:19:53 +00:00
joregan
522a8ccfc4
fix issue 332
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@429 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-20 10:31:49 +00:00
joregan
722967201a
attempting to test this; not working so far
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@421 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-19 02:13:27 +00:00
joregan
a301f9a5c7
start of i18n
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@418 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-19 01:59:13 +00:00
joregan
5279e34296
GRAPHICS_ENABLED means ScrollView, but the correct #define was not being set
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@407 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-06-27 16:03:29 +00:00
joregan
c4118eb6cb
change define
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@404 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-06-27 15:27:10 +00:00
joregan
d2c234d5ff
fix issue 318
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@402 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-06-09 12:08:47 +00:00
joregan
43690e23c8
apply patch from issue 320
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@401 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-06-09 11:52:20 +00:00
joregan
150c1f741c
bah! fix this properly
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@399 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-06-05 14:13:31 +00:00
joregan
f81004260f
guard around delete[]
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@391 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-30 20:22:54 +00:00
joregan
cfcd9a1b5a
make cppcheck happy
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@388 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-30 03:16:54 +00:00
joregan
8cd185d49f
float casts within fabs() - partial patch from issue 304
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@374 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-27 01:45:47 +00:00