Ray Smith
ab0f4e2c38
Clang fixes to earlier changes and build compatibility with Google environment
2015-06-12 10:53:21 -07:00
orbitcowboy
9328f0e5d4
Fix potential null pointer dereference in ccmain/paragraphs.cpp.
2015-05-19 10:17:44 +02:00
Jim O'Regan
4a6195202c
fix typo
2015-05-18 12:32:36 +01:00
Zdenko Podobný
438edd6c7b
added row attributes to hocr output
2015-05-17 22:13:59 +02:00
Zdenko Podobný
ed6ae9b974
Add monitor to GetHOCRText
2015-05-17 21:55:50 +02:00
Zdenko Podobný
59bcbc79b3
fix GIT_VER info in VS2010
2015-05-15 15:14:49 +02:00
Zdenko Podobný
e98849b482
print error message when pdf.ttf is not found.
2015-05-15 15:14:00 +02:00
Zdenko Podobný
035b324f0f
reflect the latest commits in VS2010 build
2015-05-14 10:52:54 +02:00
Jim O'Regan
b13691fda0
Merge conflict: going with Ray's version
2015-05-13 08:54:28 +01:00
Ray Smith
03f3c9dc88
Misc fixes missed from previous commits
2015-05-12 18:13:15 -07:00
Ray Smith
6b634170c1
Significant change to invisible font system
...
to improve correctness and compatibility with
external programs, particularly ghostscript.
We will start mapping everything to a single glyph,
rather than allowing characters to run off the end
of the font.
A more detailed design discussion is embedded into
pdfrenderer.cpp comments. The font, source code
that produces the font, and the design comments
were contributed by Ken Sharp from Artifex Software.
2015-05-12 17:33:18 -07:00
Ray Smith
4a3caefd92
Add ability to build under android (without cube or scrollview).
2015-05-12 15:41:15 -07:00
Ray Smith
53fc4456cc
Fixed issue 1252: Refactored LearnBlob and its call hierarchy to make it a member of Classify.
...
Eliminated the flexfx scheme for calling global feature extractor functions
through an array of function pointers.
Deleted dead code I found as a by-product.
This CL does not change BlobToTrainingSample or ExtractFeatures to be full
members of Classify (the eventual goal) as that would make it even bigger,
since there are a lot of callers to these functions.
When ExtractFeatures and BlobToTrainingSample are members of Classify they
will be able to access control parameters in Classify, which will greatly
simplify developing variations to the feature extraction process.
2015-05-12 15:22:34 -07:00
Zdenko Podobný
d508751e58
Fixed issue 1317 - git revision info used as version info for autotools & DEBUG
2015-05-02 12:15:13 +02:00
Zdenko Podobný
4c7c960bfd
fix issue 1417
2015-02-07 22:22:20 +01:00
Zdenko Podobný
09b0c91fc9
fix Issue 1398
2015-02-06 23:44:58 +01:00
Zdenko Podobný
e0441d0c6b
fix typo/ issue 1397
2014-12-31 22:31:50 +01:00
Zdenko Podobný
473141c1de
fix bool in c-api
2014-12-28 17:55:56 +01:00
Zdenko Podobný
4da712d04d
Add paragraph info to C-API(fix issue 1388)
2014-12-07 14:07:14 +01:00
Zdenko Podobný
239f350a72
remove const from C API TessResultIteratorGetChoiceIterator (issue 1342)
2014-10-14 22:46:11 +02:00
Ray Smith
242b14ae7f
Reduced size of multi-renderer implementation from code review
2014-10-09 13:29:46 -07:00
Ray Smith
d9699c4099
Fixed bidi handling in PDF output
2014-10-09 13:29:01 -07:00
Zdenko Podobný
d0cb1071b2
remove parameters tessedit_pdf_jpg_quality, tessedit_pdf_compression (reasons are in i1300 and i1285)
2014-10-07 23:37:34 +02:00
Zdenko Podobný
4904afe65b
fix issue 1300 - patch from #35
2014-10-06 22:43:56 +02:00
Zdenko Podobný
4c01561b0f
fix issue 1300 - patch from #26
2014-10-02 21:19:17 +02:00
Zdenko Podobný
c0640a4bef
fix cygwin build (issue 1289)
2014-09-28 23:19:52 +02:00
Zdenko Podobný
f8613fab22
fix issue 1300 /patches from breidenbach
2014-09-21 16:38:24 +02:00
Zdenko Podobný
9e8629d9ef
allow multiple output in tesseract executable ( https://groups.google.com/d/msg/tesseract-ocr/Z_WUKmJDVxc/1vc3W0xJZ2oJ )
2014-09-19 23:33:47 +02:00
Ray Smith
648e7ca311
Merge branch 'master' of https://code.google.com/p/tesseract-ocr
...
Usual git need to merge if local is out of date.
2014-09-17 18:10:17 -07:00
Ray Smith
0256529c1f
Fixed issue 1243
2014-09-17 18:09:45 -07:00
Jim O'Regan
c0c719306a
update docs for TessBaseAPI::SetProbabilityInContextFunc based on Ray's email today
2014-09-09 20:37:27 +01:00
Zdenko Podobný
d1aa61c110
fix issue 1285: reimplement option to select pdf compression
2014-09-06 09:32:22 +02:00
Ray Smith
cd2653c167
Cleanup from previous changes
2014-08-12 16:12:46 -07:00
theraysmith@gmail.com
dbf6197471
Major refactor of control.cpp to enable line recognition
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1147 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:23:06 +00:00
theraysmith@gmail.com
b64ad05096
Improved efficiency of image processing for PDF
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1141 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:15:25 +00:00
zdenop
bce2cd5f33
enable to select pdf compression type and jpeg quality (fix issue 1263)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1134 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-08 21:18:44 +00:00
zdenop
1156098567
Add font info to hocr output - fix issue 1219
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1132 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-03 16:22:12 +00:00
zdenop
5b779456f9
fix compatibility with leptonica 1.71 and 1.70
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1126 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-07-24 19:11:39 +00:00
zdenop
95b7783a95
fix issue 1228: bilevel pdf output - horizontal/vertical lines removed
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1118 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-23 21:04:37 +00:00
zdenop
905e6162b9
put info about (API) version; fix typo
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1117 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-06-22 18:31:42 +00:00
zdenop
fad9de4e1b
fix issue 1217: GetThresholdedImage accesses possibly NULL thresholder_
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1113 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 21:21:37 +00:00
zdenop
e64f555567
fix Issue 1223: TessPolyBlockType enum is outdated in C-API
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1112 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 20:31:48 +00:00
zdenop
36f3f76d64
fix tiff issue on windows
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1111 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-31 07:27:54 +00:00
zdenop@gmail.com
84cdcb32cc
fixed windows build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1110 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-26 06:48:58 +00:00
zdenop
19c4c2f0e7
fix C-API to recent C++ API changes - thanks to Nick White
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1109 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-25 21:03:11 +00:00
zdenop
ffe52737d5
check if input file exists
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1108 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-25 19:58:00 +00:00
theraysmith@gmail.com
25a8c7b720
Enabled streaming input and output of multi-page documents
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1105 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:46:21 +00:00
zdenop
979f9cafe5
Add word recognition language to C-API - fix issue 1200
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1102 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-16 18:35:54 +00:00
zdenop
44b0d0e28e
addition to r1100
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1101 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 21:24:54 +00:00
zdenop
6051e40212
fix issue 1197
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1100 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 21:20:38 +00:00
zdenop
2e520f2fac
fix hocr/pdf output when image is provided from stdin - issue 1196
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1099 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 15:59:47 +00:00
zdenop
bdb912c186
escape input_file name in hOCR output - fix issue 1154
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1098 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-09 22:19:30 +00:00
zdenop
30f6ae6742
amendment to r1091
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1095 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-07 20:53:03 +00:00
zdenop
ee73e3b107
fix issue 123: user-words (and user-patterns) file specified by command line
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1093 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-04 21:11:00 +00:00
zdenop
bc09cd9040
fix formatting in C-API and add TessChoiceIteratorDelete
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1092 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-03 20:21:37 +00:00
zdenop
f86e9d83d4
add ChoiceIterator to C-API - fix issue 1149
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1091 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-03 09:29:20 +00:00
theraysmith@gmail.com
45e106820f
Fixed issue 1116
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1074 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-04-24 00:50:27 +00:00
zdenop@gmail.com
2367ba1f6e
fix PDF rendering for Arabic. http://ftp.de.debian.org/debian/pool/main/t/tesseract/tesseract_3.03.02-3.diff.gz
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1055 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-21 10:11:32 +00:00
zdenop
d451b28054
fix issue 1127; add unlv output to tesseract executable
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1052 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-02 14:40:21 +00:00
zdenop
f01ea0e485
C-API: remove comments
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1047 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 20:01:55 +00:00
theraysmith@gmail.com
2fcea93846
Fixed issues 1081-1090
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1046 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-04 02:23:18 +00:00
zdenop
790a3da22f
remove 'class IMAGE;'
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1045 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 23:32:23 +00:00
theraysmith@gmail.com
864b2f6d80
Fixed problems with selection/copy/paste in some PDF viewers
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1042 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 19:14:16 +00:00
zdenop
4e526f987e
C-API: another update API based on changes in baseapi.h
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1041 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 13:24:55 +00:00
zdenop
0e238d43ba
C-API: update API based on changes in baseapi.h; add renderer
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1040 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-02 23:50:47 +00:00
zdenop
32789291a8
provide output for -psm 0
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1037 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-01 12:56:36 +00:00
zdenop
080e0c028a
C-API: add function to set init parameter during Init with c-string array
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1036 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-01 10:36:40 +00:00
theraysmith@gmail.com
4585a4c9df
Fixed empty page with color input
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1032 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-30 02:18:51 +00:00
theraysmith@gmail.com
0ddc7bfcaf
Fixed first-word only bug in PDF output.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1022 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-27 22:40:03 +00:00
theraysmith@gmail.com
d11dc049e3
Fixed a lot of compiler/clang warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
1a487252f4
Fixed slow-down that was caused by upping MAX_NUM_CLASSES
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1013 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:12:35 +00:00
theraysmith@gmail.com
0d93bb7cfa
More code cleanup from patches and fixing warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1011 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:09:59 +00:00
theraysmith@gmail.com
5b9a7e06eb
Turned on pdfrenderer functionality that needs leptonica 1.70
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1009 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-23 23:01:10 +00:00
zdenop@gmail.com
9f2730600d
fix segfault for --list-langs
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1006 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 21:32:20 +00:00
zdenop@gmail.com
21756518e2
don't display tesseract info line if output is stdout
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@999 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:17:37 +00:00
zdenop@gmail.com
71ae509354
fix for mingw32/g++ 4.8.1
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@998 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:10:15 +00:00
zdenop@gmail.com
ef3b1d936e
fix mingw build issues
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@995 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-18 09:00:54 +00:00
zdenop
ff5fb7f105
fix issue 1044: OS X build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@994 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-17 20:53:15 +00:00
zdenop@gmail.com
94d08567e1
fix vs2010 (and maybe vs2008) build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@983 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 20:13:55 +00:00
zdenop
f2e4dba850
fix issue 995 - produce output orientation info
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@982 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 14:47:15 +00:00
theraysmith@gmail.com
91d2265429
More minor fixes from issues and cleanup
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@974 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 01:38:00 +00:00
theraysmith@gmail.com
256929ce5a
Cleaned up stdin implementation
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@969 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 18:48:43 +00:00
theraysmith@gmail.com
f2ec85d1e1
Added PDF renderer
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@962 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:58:55 +00:00
zdenop@gmail.com
9c25eda469
fix issue 813: implement input through stdin
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@936 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-07 16:48:52 +00:00
zdenop
ed28bae8d2
produce only one output file in case of hocr
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@935 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-06 14:01:32 +00:00
zdenop
11f7eea7e1
fix tiff identification
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@934 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-06 13:25:42 +00:00
zdenop
577e919215
move PERF_COUNT_START message below tesseract message; implement parameter to suppress test blob messages
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@932 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-22 21:58:52 +00:00
zdenop
fced05f419
identify all tiff versions supported by leptonica
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@931 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-22 21:47:07 +00:00
zdenop
8b3e590123
fix OpenCL build on OSX 10.9; add info about OpenCL to 'tesseract -v'
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@921 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-14 08:35:14 +00:00
zdenop
9de80e0a06
fix resource leaks - issues 1034, 1038, 1040. Thanks to Martin Ettl
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@920 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-13 22:13:52 +00:00
rajesh.katikam@gmail.com
b8d7a1d139
Fixed all the crashes observed on 24 bit and 8 bit images.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@919 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-10 10:52:54 +00:00
rajesh.katikam@gmail.com
bf0a83907b
Cleaned up configure.ac and Makefile.am in multiple folders to use OPENCL paths
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@910 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-12 10:40:40 +00:00
rajesh.katikam@gmail.com
983aaabaae
Initial version of OpenCL support added.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@909 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-11 17:43:13 +00:00
zdenop@gmail.com
c7ba981e04
fix validity of hocr output of multipage image
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@908 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 22:00:54 +00:00
zdenop@gmail.com
e66d433907
fix issue 938: change tessdata-dir/datadir rules; implement --tessdata-dir option
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@907 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 20:59:11 +00:00
zdenop@gmail.com
77c1b41e4e
fix svn:executable attribute, trailing spaces, version include
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@903 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-03 17:24:00 +00:00
zdenop@gmail.com
b15c710385
fix declaration of ClearResults() (VC++)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@891 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 09:06:22 +00:00
theraysmith@gmail.com
4c3475ad2e
Fixed fmemopen portability problem
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
zdenop@gmail.com
ee08f623ce
fix issue 967
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@886 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-29 20:48:06 +00:00
zdenop@gmail.com
af319b4d90
fix for windows build - part 1
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@883 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-25 09:56:49 +00:00