Sundar M. Vaidya
59d593d796
Calls TessHOcrTsvRenderer if tessedit_create_hocrtsv is true.
2016-03-01 12:23:12 -05:00
amitdo
6be9d7a5f8
Fix #64. Make box training work
...
This commit is better than 06fc0533c. Hopefully, this is the last fix to the box training issue.
2016-01-29 03:37:34 +02:00
amitdo
06fc0533c8
Fix #184. Training should work now
2016-01-17 14:27:35 +02:00
amitdo
a20156fc67
Add missing ')' to make the code compile
2015-12-11 19:42:16 +02:00
amitdo
c2f5e9b849
If there is no explicit renderer(s), default to TessTextRenderer
...
Revert fd429c32, 43834da7, 05de195e.
See #49, #59.
The code in this commit solves the issue in a more elegant way, IMHO.
Now you can use:
* `tesseract eurotext.tif eurotext txt pdf`
* `tesseract eurotext.tif eurotext txt hocr`
* `tesseract eurotext.tif eurotext txt hocr pdf`
NOTE:
With `tesseract eurotext.tif eurotext`
or `tesseract eurotext.tif eurotext txt`
the psm will be set to '3', but...
With `tesseract eurotext.tif eurotext txt pdf`
or `tesseract eurotext.tif eurotext txt hocr`
the psm will be set to '1'.
2015-12-11 19:06:49 +02:00
Stefan Weil
71c9e028f7
tesseractmain: Prettify help message
...
Commit 99110df757 improved the help text in several aspects, but also
introduced new inconsistencies which this patch tries to fix.
* Align columns (this needed replacing tabs by spaces).
* Start explaining text with uppercase.
* Replace "the stdout" by "stdout".
* Small changes in help text for page segmentation modes.
* Split options into OCR options and single options
(partially revert commit 99110df757).
In addition, whitespace characters at end of lines were removed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-29 10:26:40 +01:00
amitdo
99110df757
tesseractmain.cpp: Split huge main() to sub functions
...
Add these functions to api/tesseractmain.cpp:
PrintVersionInfo()
PrintUsage()
PrintHelpForPSM()
PrintHelpMessage()
SetVariablesFromCLArgs()
PrintLangsList()
FixPageSegMode()
ParseArgs()
PreloadRenderers()
2015-11-26 11:36:16 +02:00
Stefan Weil
03f37c0cdc
tesseractmain: Fix unterminated string
...
Coverity bug report: CID 1270421 "Buffer not null terminated".
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-24 17:17:17 +01:00
amitdo
6bbcb50dd9
Added osd renderer for psm 0.
...
Works for single page and multi-page.
2015-10-30 20:09:00 +02:00
amitdo
dcfdd5c035
OSD: Print script name instead of meaningless script id
2015-10-28 09:50:28 +02:00
Stefan Weil
11b2a4d9af
api: Fix typos in comments (all found by codespell)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-09-14 21:54:27 +02:00
Zdenko Podobný
628de5ba3f
enable pdfrender with NO_CUBE_BUILD
2015-08-07 23:20:22 +02:00
Zdenko Podobný
41478fd5a1
implement build without cube (-DNO_CUBE_BUILD)
2015-07-24 11:51:44 +02:00
Ray Smith
242b14ae7f
Reduced size of multi-renderer implementation from code review
2014-10-09 13:29:46 -07:00
Zdenko Podobný
9e8629d9ef
allow multiple outputs in tesseract executable (https://groups.google.com/d/msg/tesseract-ocr/Z_WUKmJDVxc/1vc3W0xJZ2oJ)
2014-09-19 23:33:47 +02:00
theraysmith@gmail.com
b64ad05096
Improved efficiency of image processing for PDF
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1141 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:15:25 +00:00
zdenop
1156098567
Add font info to hocr output - fix issue 1219
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1132 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-03 16:22:12 +00:00
zdenop@gmail.com
84cdcb32cc
fixed windows build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1110 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-26 06:48:58 +00:00
theraysmith@gmail.com
25a8c7b720
Enabled streaming input and output of multi-page documents
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1105 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:46:21 +00:00
zdenop
2e520f2fac
fix hocr/pdf output when image is provided from stdin - issue 1196
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1099 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 15:59:47 +00:00
zdenop
ee73e3b107
fix issue 123: user-words (and user-patterns) file specified by command line
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1093 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-04 21:11:00 +00:00
zdenop
d451b28054
fix issue 1127; add unlv output to tesseract executable
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1052 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-02 14:40:21 +00:00
zdenop
32789291a8
provide output for -psm 0
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1037 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-01 12:56:36 +00:00
theraysmith@gmail.com
d11dc049e3
Fixed a lot of compiler/clang warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
0d93bb7cfa
More code cleanup from patches and fixing warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1011 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:09:59 +00:00
zdenop@gmail.com
9f2730600d
fix segfault for --list-langs
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1006 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 21:32:20 +00:00
zdenop@gmail.com
21756518e2
don't display tesseract info line if output is stdout
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@999 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:17:37 +00:00
zdenop
f2e4dba850
fix issue 995 - produce output orientation info
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@982 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 14:47:15 +00:00
theraysmith@gmail.com
91d2265429
More minor fixes from issues and cleanup
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@974 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 01:38:00 +00:00
theraysmith@gmail.com
256929ce5a
Cleaned up stdin implementation
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@969 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 18:48:43 +00:00
theraysmith@gmail.com
f2ec85d1e1
Added PDF renderer
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@962 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:58:55 +00:00
zdenop@gmail.com
9c25eda469
fix issue 813: implement input through stdin
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@936 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-07 16:48:52 +00:00
zdenop
ed28bae8d2
produce only one output file in case of hocr
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@935 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-06 14:01:32 +00:00
zdenop
577e919215
move PERF_COUNT_START message below tesseract message; implement parameter to suppress test blob messages
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@932 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-22 21:58:52 +00:00
zdenop
8b3e590123
fix OpenCL build on OSX 10.9; add info about OpenCL to 'tesseract -v'
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@921 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-14 08:35:14 +00:00
rajesh.katikam@gmail.com
b8d7a1d139
Fixed all the crashes observed on 24 bit and 8 bit images.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@919 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-10 10:52:54 +00:00
zdenop@gmail.com
e66d433907
fix issue 938: change tessdata-dir/datadir rules; implement --tessdata-dir option
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@907 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 20:59:11 +00:00
theraysmith@gmail.com
88ea81c89e
Added renderer to API
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@869 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:39:59 +00:00
zdenop@gmail.com
10c1169d98
remove unused code (Windows related)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@860 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:21:10 +00:00
zdenop@gmail.com
b5d3d66a68
remove unused code (gettext)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@859 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-07 16:39:13 +00:00
zdenop@gmail.com
418a7ad16f
allow a text file with a list of images as input
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@855 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-27 21:53:53 +00:00
zdenop@gmail.com
62b2e12b72
replace option -o with -c
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@841 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 17:06:14 +00:00
zdenop@gmail.com
7dcfd02c22
Allow arbitrary configuration options to be set from the command line (fix issue 893)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@837 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-29 20:43:14 +00:00
zdenop@gmail.com
a04a5c1f42
Tesseract should exit with an error if ProcessPages fails (fixed issue 891)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@834 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-12 08:14:13 +00:00
zdenop@gmail.com
37fb755d47
Add a command-line option (--print-parameters) to dump the parameters to stdout
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@814 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:54:14 +00:00
zdenop@gmail.com
4812fac33e
Fix issue 427: print result to stdout instead of to a file
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@813 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:52:42 +00:00
zdenop@gmail.com
8a2b5f0ead
Fix issue 808: Check for output file write permissions before performing lengthy OCR operation
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@812 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:49:15 +00:00
zdenop@gmail.com
42c92c3e29
avoid multiple tesseract inits in tesseract executable
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@811 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:47:06 +00:00
theraysmith@gmail.com
af04ae882f
Made use of _ macro and stderr consistent with error messages.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@780 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-22 23:40:19 +00:00
zdenop@gmail.com
6b4970776d
Fixed tessdata_dir for tesseract executable.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@777 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-10-11 19:47:17 +00:00