Commit Graph

378 Commits

Author SHA1 Message Date
Zdenko Podobný
67ede37b50 Fixes #74 NO_CUBE_BUILD with reverting to ANDROID_BUILD in baseapi 2015-08-09 18:09:30 +02:00
Zdenko Podobný
628de5ba3f enable pdfrender with NO_CUBE_BUILD 2015-08-07 23:20:22 +02:00
Jeff Breidenbach
9dcf2c6aa8 replace CubeUtils::UTF8ToUTF32 in pdfrenderer 2015-08-07 22:18:33 +02:00
Zdenko Podobný
66a76a9477 Revert "temporary add config/*, configure and Makefile.in for release"
This reverts commits ec9581d8f2, 1afe382c4e, 4b2cfabcc1
2015-07-31 21:44:43 +02:00
Zdenko Podobný
41478fd5a1 implement build without cube (-DNO_CUBE_BUILD) 2015-07-24 11:51:44 +02:00
Zdenko Podobný
71e226c44f increase version number 2015-07-21 22:46:52 +02:00
zdenop
e4f4893fb8 Merge pull request #52 from unbe/null-pointer-access-in-hocr
Fix null pointer dereference when writing font name into HOCR.
2015-07-20 07:40:59 +02:00
artem
2b6801eddb Fix null pointer dereference when writing font name into HOCR. 2015-07-19 22:05:02 +02:00
unbe
67ffea8877 Update capi.cpp
Make TessDeleteResultRenderer use delete, not delete[]
2015-07-19 15:15:42 +02:00
Zdenko Podobný
ec9581d8f2 temporary add configure and Makefile.in for release 2015-07-11 09:42:43 +02:00
Ray Smith
a303ab9d00 Misc fixes, mostly clang formatting, but some bug fixes in matrix, werd, and tesstrain_utils. Also updates unicharset to match traineddata files. 2015-07-09 14:28:20 -07:00
Ray Smith
b1d99dfe23 Added a backup adaptive classifier to take over from primary when it fills on a large document 2015-06-12 11:10:53 -07:00
Ray Smith
ab0f4e2c38 Clang fixes to earlier changes and build compatibility with Google environment 2015-06-12 10:53:21 -07:00
orbitcowboy
9328f0e5d4 Fix potential null pointer dereference in ccmain/paragraphs.cpp. 2015-05-19 10:17:44 +02:00
Jim O'Regan
4a6195202c fix typo 2015-05-18 12:32:36 +01:00
Zdenko Podobný
438edd6c7b added row attributes to hocr output 2015-05-17 22:13:59 +02:00
Zdenko Podobný
ed6ae9b974 Add monitor to GetHOCRText 2015-05-17 21:55:50 +02:00
Zdenko Podobný
59bcbc79b3 fix GIT_VER info in VS2010 2015-05-15 15:14:49 +02:00
Zdenko Podobný
e98849b482 print error message when pdf.ttf is not found. 2015-05-15 15:14:00 +02:00
Zdenko Podobný
035b324f0f reflect the latest commits in VS2010 build 2015-05-14 10:52:54 +02:00
Jim O'Regan
b13691fda0 Merge conflict: going with Ray's version 2015-05-13 08:54:28 +01:00
Ray Smith
03f3c9dc88 Misc fixes missed from previous commits 2015-05-12 18:13:15 -07:00
Ray Smith
6b634170c1 Significant change to invisible font system
to improve correctness and compatibility with
external programs, particularly ghostscript.
We will start mapping everything to a single glyph,
rather than allowing characters to run off the end
of the font.

A more detailed design discussion is embedded into
pdfrenderer.cpp comments. The font, source code
that produces the font, and the design comments
were contributed by Ken Sharp from Artifex Software.
2015-05-12 17:33:18 -07:00
Ray Smith
4a3caefd92 Add ability to build under android (without cube or scrollview). 2015-05-12 15:41:15 -07:00
Ray Smith
53fc4456cc Fixed issue 1252: Refactored LearnBlob and its call hierarchy to make it a member of Classify.
Eliminated the flexfx scheme for calling global feature extractor functions
through an array of function pointers.
Deleted dead code I found as a by-product.
This CL does not change BlobToTrainingSample or ExtractFeatures to be full
members of Classify (the eventual goal) as that would make it even bigger,
since there are a lot of callers to these functions.
When ExtractFeatures and BlobToTrainingSample are members of Classify they
will be able to access control parameters in Classify, which will greatly
simplify developing variations to the feature extraction process.
2015-05-12 15:22:34 -07:00
Zdenko Podobný
d508751e58 Fixed issue 1317 - git revision info used as version info for autotools & DEBUG 2015-05-02 12:15:13 +02:00
Zdenko Podobný
4c7c960bfd fix issue 1417 2015-02-07 22:22:20 +01:00
Zdenko Podobný
09b0c91fc9 fix Issue 1398 2015-02-06 23:44:58 +01:00
Zdenko Podobný
e0441d0c6b fix typo/ issue 1397 2014-12-31 22:31:50 +01:00
Zdenko Podobný
473141c1de fix bool in c-api 2014-12-28 17:55:56 +01:00
Zdenko Podobný
4da712d04d Add paragraph info to C-API(fix issue 1388) 2014-12-07 14:07:14 +01:00
Zdenko Podobný
239f350a72 remove const from C API TessResultIteratorGetChoiceIterator (issue 1342) 2014-10-14 22:46:11 +02:00
Ray Smith
242b14ae7f Reduced size of multi-renderer implementation from code review 2014-10-09 13:29:46 -07:00
Ray Smith
d9699c4099 Fixed bidi handling in PDF output 2014-10-09 13:29:01 -07:00
Zdenko Podobný
d0cb1071b2 remove parameters tessedit_pdf_jpg_quality, tessedit_pdf_compression (reasons are in i1300 and i1285) 2014-10-07 23:37:34 +02:00
Zdenko Podobný
4904afe65b fix issue 1300 - patch from #35 2014-10-06 22:43:56 +02:00
Zdenko Podobný
4c01561b0f fix issue 1300 - patch from #26 2014-10-02 21:19:17 +02:00
Zdenko Podobný
c0640a4bef fix cygwin build (issue 1289) 2014-09-28 23:19:52 +02:00
Zdenko Podobný
f8613fab22 fix issue 1300 /patches from breidenbach 2014-09-21 16:38:24 +02:00
Zdenko Podobný
9e8629d9ef allow multiple output in tesseract executable (https://groups.google.com/d/msg/tesseract-ocr/Z_WUKmJDVxc/1vc3W0xJZ2oJ) 2014-09-19 23:33:47 +02:00
Ray Smith
648e7ca311 Merge branch 'master' of https://code.google.com/p/tesseract-ocr
Usual git need to merge if local is out of date.
2014-09-17 18:10:17 -07:00
Ray Smith
0256529c1f Fixed issue 1243 2014-09-17 18:09:45 -07:00
Jim O'Regan
c0c719306a update docs for TessBaseAPI::SetProbabilityInContextFunc based on Ray's email today 2014-09-09 20:37:27 +01:00
Zdenko Podobný
d1aa61c110 fix issue 1285: reimplement option to select pdf compression 2014-09-06 09:32:22 +02:00
Ray Smith
cd2653c167 Cleanup from previous changes 2014-08-12 16:12:46 -07:00
theraysmith@gmail.com
dbf6197471 Major refactor of control.cpp to enable line recognition
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1147 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:23:06 +00:00
theraysmith@gmail.com
b64ad05096 Improved efficiency of image processing for PDF
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1141 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:15:25 +00:00
zdenop
bce2cd5f33 enable to select pdf compression type and jpeg quality (fix issue 1263)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1134 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-08 21:18:44 +00:00
zdenop
1156098567 Add font info to hocr output - fix issue 1219
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1132 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-03 16:22:12 +00:00
zdenop
5b779456f9 fix compatibility with leptonica 1.71 and 1.70
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1126 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-07-24 19:11:39 +00:00
zdenop
95b7783a95 fix issue 1228: bilevel pdf output - horizontal/vertical lines removed
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1118 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-23 21:04:37 +00:00
zdenop
905e6162b9 put info about (API) version; fix typo
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1117 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-22 18:31:42 +00:00
zdenop
fad9de4e1b fix issue 1217: GetThresholdedImage accesses possibly NULL thresholder_
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1113 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 21:21:37 +00:00
zdenop
e64f555567 fix Issue 1223: TessPolyBlockType enum is outdated in C-API
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1112 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 20:31:48 +00:00
zdenop
36f3f76d64 fix tiff issue on windows
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1111 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 07:27:54 +00:00
zdenop@gmail.com
84cdcb32cc fixed windows build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1110 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-26 06:48:58 +00:00
zdenop
19c4c2f0e7 fix C-API to recent C++ API changes - thanks to Nick White
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1109 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-25 21:03:11 +00:00
zdenop
ffe52737d5 check if input file exists
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1108 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-25 19:58:00 +00:00
theraysmith@gmail.com
25a8c7b720 Enabled streaming input and output of multi-page documents
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1105 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:46:21 +00:00
zdenop
979f9cafe5 Add word recognition language to C-API - fix issue 1200
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1102 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-16 18:35:54 +00:00
zdenop
44b0d0e28e addition to r1100
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1101 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 21:24:54 +00:00
zdenop
6051e40212 fix issue 1197
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1100 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 21:20:38 +00:00
zdenop
2e520f2fac fix hocr/pdf output when image is provided from stdin - issue 1196
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1099 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 15:59:47 +00:00
zdenop
bdb912c186 escape input_file name in hOCR output - fix issue 1154
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1098 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-09 22:19:30 +00:00
zdenop
30f6ae6742 amendment to r1091
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1095 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-07 20:53:03 +00:00
zdenop
ee73e3b107 fix issue 123: user-words (and user-patterns) file specified by command line
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1093 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-04 21:11:00 +00:00
zdenop
bc09cd9040 fix formatting in C-API and add TessChoiceIteratorDelete
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1092 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-03 20:21:37 +00:00
zdenop
f86e9d83d4 add ChoiceIterator to C-API - fix issue 1149
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1091 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-03 09:29:20 +00:00
theraysmith@gmail.com
45e106820f Fixed issue 1116
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1074 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-24 00:50:27 +00:00
zdenop@gmail.com
2367ba1f6e fix PDF rendering for Arabic. http://ftp.de.debian.org/debian/pool/main/t/tesseract/tesseract_3.03.02-3.diff.gz
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1055 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-21 10:11:32 +00:00
zdenop
d451b28054 fix issue 1127; add unlv output to tesseract executable
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1052 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-02 14:40:21 +00:00
zdenop
f01ea0e485 C-API: remove comments
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1047 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 20:01:55 +00:00
theraysmith@gmail.com
2fcea93846 Fixed issues 1081-1090
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1046 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 02:23:18 +00:00
zdenop
790a3da22f remove 'class IMAGE;'
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1045 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 23:32:23 +00:00
theraysmith@gmail.com
864b2f6d80 Fixed problems with selection/copy/paste in some PDF viewers
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1042 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 19:14:16 +00:00
zdenop
4e526f987e C-API: another update API based on changes in baseapi.h
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1041 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 13:24:55 +00:00
zdenop
0e238d43ba C-API: update API based on changes in baseapi.h; add renderer
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1040 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-02 23:50:47 +00:00
zdenop
32789291a8 provide output for -psm 0
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1037 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-01 12:56:36 +00:00
zdenop
080e0c028a C-API: add function to set init parameter during Init with c-string array
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1036 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-01 10:36:40 +00:00
theraysmith@gmail.com
4585a4c9df Fixed empty page with color input
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1032 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-30 02:18:51 +00:00
theraysmith@gmail.com
0ddc7bfcaf Fixed first-word only bug in PDF output.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1022 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-27 22:40:03 +00:00
theraysmith@gmail.com
d11dc049e3 Fixed a lot of compiler/clang warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
1a487252f4 Fixed slow-down that was caused by upping MAX_NUM_CLASSES
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1013 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:12:35 +00:00
theraysmith@gmail.com
0d93bb7cfa More code cleanup from patches and fixing warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1011 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:09:59 +00:00
theraysmith@gmail.com
5b9a7e06eb Turned on pdfrenderer functionality that needs leptonica 1.70
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1009 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-23 23:01:10 +00:00
zdenop@gmail.com
9f2730600d fix segfault for --list-langs
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1006 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 21:32:20 +00:00
zdenop@gmail.com
21756518e2 don't display tesseract info line if output is stdout
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@999 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:17:37 +00:00
zdenop@gmail.com
71ae509354 fix for mingw32/g++ 4.8.1
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@998 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:10:15 +00:00
zdenop@gmail.com
ef3b1d936e fix mingw build issues
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@995 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-18 09:00:54 +00:00
zdenop
ff5fb7f105 fix issue 1044: OS X build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@994 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-17 20:53:15 +00:00
zdenop@gmail.com
94d08567e1 fix vs2010 (and maybe vs2008) build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@983 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 20:13:55 +00:00
zdenop
f2e4dba850 fix issue 995 - produce output orientation info
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@982 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 14:47:15 +00:00
theraysmith@gmail.com
91d2265429 More minor fixes from issues and cleanup
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@974 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 01:38:00 +00:00
theraysmith@gmail.com
256929ce5a Cleaned up stdin implementation
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@969 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 18:48:43 +00:00
theraysmith@gmail.com
f2ec85d1e1 Added PDF renderer
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@962 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:58:55 +00:00
zdenop@gmail.com
9c25eda469 fix issue 813: implement input through stdin
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@936 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-07 16:48:52 +00:00
zdenop
ed28bae8d2 produce only one output file in case of hocr
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@935 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-06 14:01:32 +00:00
zdenop
11f7eea7e1 fix tiff identification
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@934 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-06 13:25:42 +00:00
zdenop
577e919215 move PERF_COUNT_START message below tesseract message; implement parameter to suppress test blob messages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@932 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-22 21:58:52 +00:00
zdenop
fced05f419 identify all supported tiff version by leptonica
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@931 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-22 21:47:07 +00:00
zdenop
8b3e590123 fix OpenCL build on OSX 10.9; add info about OpenCL to 'tesseract -v'
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@921 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-14 08:35:14 +00:00
zdenop
9de80e0a06 fix resource leaks - issues 1034, 1038, 1040. Thanks to Martin Ettl
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@920 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-13 22:13:52 +00:00
rajesh.katikam@gmail.com
b8d7a1d139 Fixed all the crashes observed on 24 bit and 8 bit images.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@919 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-10 10:52:54 +00:00
rajesh.katikam@gmail.com
bf0a83907b Cleaned up configure.ac and Makefile.am in multiple folder to use OPENCL paths
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@910 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-12 10:40:40 +00:00
rajesh.katikam@gmail.com
983aaabaae Initial version of OpenCL support added.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@909 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-11 17:43:13 +00:00
zdenop@gmail.com
c7ba981e04 fix validity of hocr output of multipage image
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@908 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 22:00:54 +00:00
zdenop@gmail.com
e66d433907 fix issue 938: change tessdata-dir/datadir rules; implement --tessdata-dir option
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@907 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 20:59:11 +00:00
zdenop@gmail.com
77c1b41e4e fix svn:executable attribute, trailing spaces, version include
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@903 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-03 17:24:00 +00:00
zdenop@gmail.com
b15c710385 fix declaration of ClearResults() (VC++)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@891 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 09:06:22 +00:00
theraysmith@gmail.com
4c3475ad2e Fixed fmemopen portability problem
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
zdenop@gmail.com
ee08f623ce fix issue 967
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@886 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-29 20:48:06 +00:00
zdenop@gmail.com
af319b4d90 fix for windows build - part 1
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@883 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-25 09:56:49 +00:00
theraysmith@gmail.com
4d514d5a60 Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
theraysmith@gmail.com
88ea81c89e Added renderer to API
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@869 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:39:59 +00:00
zdenop@gmail.com
b5e16669e1 fix issue 946/reopen issue 903
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@865 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-25 15:54:30 +00:00
zdenop@gmail.com
b1fd75ccf9 amend r:862
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@863 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-14 14:11:16 +00:00
zdenop@gmail.com
c45bb08a6e check inputformat before getting number of pages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@862 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-14 13:58:23 +00:00
zdenop@gmail.com
ebd0ba8134 remove unused code (tesseractmain.h)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@861 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:23:47 +00:00
zdenop@gmail.com
10c1169d98 remove unused code (Windows related)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@860 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:21:10 +00:00
zdenop@gmail.com
b5d3d66a68 remove unused code(gettext)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@859 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-07 16:39:13 +00:00
zdenop@gmail.com
4c16ff6a1f use leptonica for getting number of pages instead of own code
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@858 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-05 16:07:25 +00:00
zdenop@gmail.com
8a0878af3a fix mingw build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@856 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-05 08:46:57 +00:00
zdenop@gmail.com
418a7ad16f allow to have text file with list of images as input
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@855 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-27 21:53:53 +00:00
zdenop@gmail.com
e5628e5e1a fix hOCR output - do not print empty words: issue 903
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@854 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-23 15:10:24 +00:00
zdenop@gmail.com
74dc14ebd4 fix copying a TessResultIterator using CAPI (issue 934)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@849 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-02 21:25:41 +00:00
zdenop@gmail.com
62b2e12b72 replace option -o with -c
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@841 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 17:06:14 +00:00
zdenop@gmail.com
7dcfd02c22 Allow arbitrary configuration options to be set from the command line (fix issue 893)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@837 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-29 20:43:14 +00:00
zdenop@gmail.com
1032cb1692 fix issue 881: capi.h redefines things from Leptonica, causing compilation failures
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@836 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-29 17:57:21 +00:00
zdenop@gmail.com
a04a5c1f42 Tesseract should exit with an error if ProcessPages fails (fixed issue 891)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@834 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-12 08:14:13 +00:00
zdenop@gmail.com
a6bee550e8 Add lang and dir attributes to each word in hOCR output (fix issue 878);
Unify usage of single quote in hOCR output
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@832 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-03-28 21:37:55 +00:00
zdenop@gmail.com
db52047420 fix issue 809: invalid hOCR output file on windows when input filename has non ascii chars.
Add release date to vs2008/doc/versions.html
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@828 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-02-23 15:01:21 +00:00
zdenop@gmail.com
37fb755d47 Add a command-line option (--print-parameters) to dump the parameters to stdout
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@814 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:54:14 +00:00
zdenop@gmail.com
4812fac33e Fix issue 427: print result to stdout instead to file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@813 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:52:42 +00:00
zdenop@gmail.com
8a2b5f0ead Fix issue 808: Check for output file write permissions before performing lengthy OCR operation
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@812 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:49:15 +00:00
zdenop@gmail.com
42c92c3e29 avoid multiple tesseract inits in tesseract executable
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@811 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:47:06 +00:00
zdenop@gmail.com
9b2906c67e fix issue 800: Get rid of glob() for searching available languages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@810 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-11-30 22:11:22 +00:00
zdenop@gmail.com
5d9fd5fb72 add word confidence info (x_wconf) to hocr output/fix issue 748
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@806 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-11-06 21:18:35 +00:00
theraysmith@gmail.com
af04ae882f Made use of _ macro and stderr consistent with error messages.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@780 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-22 23:40:19 +00:00
zdenop@gmail.com
6b4970776d Fixed tessdata_dir for tesseract executable.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@777 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-11 19:47:17 +00:00
theraysmith@gmail.com
605fd7488b Fixed relative-to-executable tessdata location, while allowing for addition of terminating /
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@774 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-09 00:41:08 +00:00
zdenop@gmail.com
ceff3288d7 fix issue 764...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@768 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-27 08:43:55 +00:00
zdenop@gmail.com
fb91759cdc fix issue 764 and clean tabulators, trim trailing spaces...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@767 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-27 08:24:46 +00:00
zdenop@gmail.com
23f1d16037 fix for issue 346 / GetAvailableLanguagesAsVector
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@760 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-24 05:20:23 +00:00
zdenop@gmail.com
dc8bd4682b C-API (fix issue 362)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@759 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-24 05:14:11 +00:00
theraysmith@gmail.com
fbf7968490 Fixed problem with blank pages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@750 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:27:25 +00:00
zdenop@gmail.com
2a57976c41 - fix msys build (missing -lws2_32 for library)
- remove old debian leptonica package
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@738 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-08-25 19:53:41 +00:00
zdenop@gmail.com
306a8216e1 fix creating box file from empty image (issue 516)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@737 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-08-03 22:32:17 +00:00
zdenop@gmail.com
c8eedb25a6 added ocr-capabilities for hocr conformity; XHTML 1.0 Transitional conformity; improved hocr output readability
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@729 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-28 20:44:23 +00:00
david.eger@gmail.com
6a9a3ddcb2 Zdeno pointed out that ocr_line (though not ocr_word) is actually in the hocr spec.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@728 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-27 23:58:09 +00:00
david.eger@gmail.com
d9d70919bb Conform to the hocr spec: hocr doesn't have ocr_word, but instead has ocrx_word.
Tested with ExactImage's hocr2pdf.
$ tesseract phototest.tif phototest hocr
$ hocr2pdf -i phototest.tif -o ./phototest.pdf < ./phototest.hocr
$ evince phototest.pdf

See: https://docs.google.com/document/preview?id=1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@726 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-25 17:36:25 +00:00
david.eger@gmail.com
eeeb4f513c Provide better paragraph segmentation without having to run fully
automatic layout analysis.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@725 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-10 00:03:34 +00:00
zdenop@gmail.com
aa14e8b212 fix Mingw shared build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@718 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-02 12:14:37 +00:00
zdenop@gmail.com
cd8de9157c change comments to doxygen block comments (api)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@716 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-30 21:24:12 +00:00
zdenop@gmail.com
ee44165d3d improve doxygen config; fix doxygen warnings for baseapi.h
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@712 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 20:38:14 +00:00
zdenop@gmail.com
3115fbfdcb another fix MinGW+MSYS
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@709 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-24 10:14:47 +00:00
zdenop@gmail.com
d4d4b8aad8 improve autotools system (mingw+msys fix); implementation of --disable-tessdata-prefix
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@708 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-22 20:01:33 +00:00
zdenop@gmail.com
2f1c112640 +Remove visibility from protected members of tesseract::TessBaseAPI class by applying TESS_LOCAL macro;
+Make PageIterator & ResultIterator classes visible by applying TESS_API macro;
+Fix api/Makefile.am & training/Makefile.am to allow Parallel Build Trees;
patch from Tom Powers (https://groups.google.com/group/tesseract-dev/msg/9d00579540e44055)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@701 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-07 22:04:46 +00:00
david.eger@gmail.com
c2e84c4606 Fix two issues with GetHOCRText():
+ make it not seg-fault if called without calling SetInputName().
+ make it not leak memory (thank you valgrind)

http://code.google.com/p/tesseract-ocr/issues/detail?id=463
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@699 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-06 21:18:16 +00:00
zdenop@gmail.com
765832d449 fixes issue 573 where boolean was being compared to float;
tesseract prints full version info when -v arg;
removes extra includes from tesseractmain.h;
removes extra DLLEXPORT & DLLIMPORT from hosts.h;
remove CCUTIL_IMPORTS & CCUTIL_EXPORTS from vs2008 *.vcproj;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@694 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-04 22:27:16 +00:00
zdenop@gmail.com
97e19443a3 install only necessary headers, fix uninstall
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@692 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-03 13:22:51 +00:00
zdenop@gmail.com
3b326532cc fix --enable-multiple-libraries; implement quiet mode (issue 580)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@691 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-03 11:48:59 +00:00
zdenop@gmail.com
30a70142a0 visibility - autotools part (./configure --enable-visibility)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@690 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 23:51:33 +00:00
zdenop@gmail.com
a776e0be85 TP: visibility trial - code & windows build changes (without autotools changes)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@689 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 17:48:45 +00:00
zdenop@gmail.com
e216adab43 fix configure.ac; unify identifiers (WIN32 vs _WIN32)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@688 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 17:31:24 +00:00
zdenop@gmail.com
49c4ce3183 fix for GRAPHICS_DISABLED build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@686 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-01 22:43:51 +00:00
zdenop@gmail.com
df1cbdd7d3 fix for issue 463 (GetHOCRText segfaults unless SetInputName has been called first); removed declaration of GetLastInitLanguage
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@684 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-27 17:19:20 +00:00
zdenop@gmail.com
492f9119c2 check return code of API init (issue 593)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@680 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-26 14:48:35 +00:00
zdenop@gmail.com
6ccab83bd6 fixing issue 628 (replacing __MSW32__ with _WIN32) and issue 614 (reverting "class DLLSYM STRING" to "class CCUTIL_API STRING")
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@677 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-19 21:48:45 +00:00
theraysmith@gmail.com
23dfabcab1 Cleaned up externally used namespace by removing includes from baseapi.h
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@657 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:14:16 +00:00
theraysmith@gmail.com
ef786ad29b Moved ResultIterator/PageIterator to ccmain
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@645 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:47:59 +00:00
zdenop@gmail.com
67f47008c7 fixed "one lib" build on linux; runautoconf renamed to autogen.sh;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@631 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 19:39:54 +00:00
max.markin@gmail.com
0fef845950 VC2010: add support for dynamic linking
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@629 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-15 22:17:19 +00:00
zdenop@gmail.com
da41b96f7f removed check for libtiff - leptonica is required; cleanup #ifdef/#ifndef HAVE_LIBLEPT
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@624 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-30 06:34:41 +00:00
joregan@gmail.com
bf4a09d72a make single/multiple libraries optional -- this needs testing!!!
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@623 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-29 21:28:28 +00:00
theraysmith@gmail.com
0d969b7b3a Fixed problem of config file vs command line for pageseg mode
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@611 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:41:51 +00:00
theraysmith@gmail.com
7ab0a97180 Fixed comment re bln_numericmode
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@610 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:41:03 +00:00
theraysmith@gmail.com
d5d15f32d7 Deleted Makefile.in from svn
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@606 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:32:44 +00:00
zdenop@gmail.com
7ec3dca968 show page 0 for multipage tiff;
Windows: use binary mode for fopen (issue 70);
autotools: fixed cutil/Makefile.am, improved tessdata/Makefile.am;

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@604 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 21:42:13 +00:00
zdenop@gmail.com
9b9efa8e4c man pages included in install script, improved windows installer script (issue 425), output format for "tesseract -v" changed to "3.00 version", README cleanup.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@601 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-08 20:33:18 +00:00
zdenop@gmail.com
411e074b4d fix for issues 479, 524 + tests for input image (there are no leptonica error messages on Windows console)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@597 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-07-29 21:55:49 +00:00
zdenop@gmail.com
1ad70ea8ff fixing issues 518 and 521
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@596 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-07-27 20:56:40 +00:00
zdenop@gmail.com
505c8dbece changed "xocr_word" to "ocrx_word" according to the hOCR spec
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@585 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-05-24 20:53:58 +00:00
zdenop@gmail.com
b54eee99ac configure script requires liblept;
add '--version' option for tesseract as alternative to '-v'

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@584 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-05-24 20:17:28 +00:00
theraysmith
c81483f714 Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@566 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:43:04 +00:00
theraysmith
a3f30eb5c7 Deleted lots of dead code, including PBLOB
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@555 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 21:51:34 +00:00
theraysmith
0d81f4b649 Fixed problem that was preventing pagesegmode from being set by config file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@554 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 21:43:38 +00:00
theraysmith
f040994f51 Fixed closing meta element in hocr output
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@549 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-12-09 06:25:20 +00:00
theraysmith
a7db6dada9 Fix for linking with leptonica on Linux.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@548 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-12-09 01:40:39 +00:00
theraysmith
137f4806b6 Added sub/superscript, small/dropcap detection
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@547 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-12-09 01:32:20 +00:00
zdenop@gmail.com
c707b26d5f fixed VC++2008 Express build after last changes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@543 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 12:46:41 +00:00
theraysmith
ef59841ebe Moved multipage code to BaseAPI and tidied up command line handling
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@532 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 00:58:30 +00:00
zdenop@gmail.com
4523ce9f7d 3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to the Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
zdenop@gmail.com
7511d76315 fixed hocr to produce a valid document (according to http://validator.w3.org/) - issue http://code.google.com/p/tesseract-ocr/issues/detail?id=401
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@525 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-17 20:03:58 +00:00
zdenop@gmail.com
a750ffed7a fixed issue 394: The tessedit_pageseg_mode does not work; thanks sms@fritzwidmer.ch
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@517 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-02 20:52:58 +00:00
zdenop@gmail.com
fa4d4589cb fixed hocr (escape special characters; thanks to aizvorski) + hocr config
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@515 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-29 19:03:06 +00:00
zdenop@gmail.com
346da8c1e5 missing returns in nonvoid functions (thanks to rusnakp) issue 389;
corrected windows installation script - tesseract should not be run as a start-up application;

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@514 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-26 13:35:02 +00:00
zdenop@gmail.com
21d6ea66c2 better handling of multipage tiff (issue 380 http://code.google.com/p/tesseract-ocr/issues/detail?id=380)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@513 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-22 06:13:54 +00:00
joregan
e0b07948fc disabling gettext checks - not currently used, and something about disabling is causing subsequent autoconf checks to not run
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@492 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 16:27:39 +00:00
joregan
f2506871f9 move include of config_auto.h to not conflict with local types. Not finished
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@490 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 15:53:40 +00:00
joregan
7efbd3dab7 crap
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@444 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-27 15:17:52 +00:00