99 Commits

Author SHA1 Message Date
Stefan Weil
076f21c1f2 Print version to stdout instead of stderr
Most command line programs print the version to stdout.
This seems to be reasonable for Tesseract, too.

Now a shell statement like "VERSION=$(tesseract --version)" works
without I/O redirection.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-03-16 12:10:27 +01:00
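The effect of 076f21c1f2 can be illustrated from a script's point of view: once the version goes to stdout, plain output capture is enough. The following is a minimal Python sketch, not part of Tesseract. The helper name `read_version` and the stand-in command are assumptions, chosen so the example runs without Tesseract installed. It prefers stdout and falls back to stderr, which is the case a caller had to handle before this commit.

```python
import subprocess
import sys


def read_version(cmd):
    """Run cmd and return its version text.

    Prefers stdout; falls back to stderr for programs that still
    print their version there.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    # Stand-in for ["tesseract", "--version"], so the sketch runs anywhere.
    demo_cmd = [sys.executable, "-c", "print('demo 1.0')"]
    print(read_version(demo_cmd))
```

With Tesseract installed, `read_version(["tesseract", "--version"])` would correspond to the shell statement quoted in the commit message.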
amitdo
bf5345f6a1 Don't display tesseract's banner when quiet mode is active 2016-03-07 19:25:09 +02:00
Tom Morris
6700edd8bc Cleanup TSV renderer
Remove all references to hocr, hocr.tsv, etc. Remove dead code for font
info, input filename, HTML escapes. Improved comments. Fixed
indentation.
2016-03-01 13:41:19 -05:00
Sundar M. Vaidya
59d593d796 Calls TessHOcrTsvRenderer if tessedit_create_hocrtsv is true. 2016-03-01 12:23:12 -05:00
amitdo
6be9d7a5f8 Fix #64. Make box training work
This commit is better than 06fc0533c. Hopefully, this is the last fix for the box training issue.
2016-01-29 03:37:34 +02:00
amitdo
06fc0533c8 Fix #184. Training should work now 2016-01-17 14:27:35 +02:00
amitdo
a20156fc67 Add missing ')' to make the code compile 2015-12-11 19:42:16 +02:00
amitdo
c2f5e9b849 If there is no explicit renderer(s), default to TessTextRenderer
Revert fd429c32, 43834da7, 05de195e.

See #49, #59.

The code in this commit solves the issue in a more elegant way, IMHO.

Now you can use:
  * `tesseract eurotext.tif eurotext txt pdf`
  * `tesseract eurotext.tif eurotext txt hocr`
  * `tesseract eurotext.tif eurotext txt hocr pdf`

NOTE:
  With `tesseract eurotext.tif eurotext`
  or `tesseract eurotext.tif eurotext txt`
  the psm will be set to '3', but...
  With `tesseract eurotext.tif eurotext txt pdf`
  or `tesseract eurotext.tif eurotext txt hocr`
  the psm will be set to '1'.
2015-12-11 19:06:49 +02:00
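The behaviour described in c2f5e9b849 can be modelled in a few lines. This is a toy Python sketch based only on the commit message, not the code in tesseractmain.cpp. The function name `plan_outputs` is hypothetical, and the rule that picks the psm is an assumption generalized from the NOTE in that message (which does not state the psm for `txt hocr pdf`).

```python
def plan_outputs(configs):
    """Return (renderers, psm) for the config names given after the output base.

    Assumption: any 'hocr' or 'pdf' config switches the psm to 1;
    otherwise it stays at 3, as in the commit's NOTE.
    """
    known = ("txt", "hocr", "pdf")
    renderers = [name for name in known if name in configs]
    if not renderers:
        # No explicit renderer: default to plain text output.
        renderers = ["txt"]
    psm = 1 if ("hocr" in renderers or "pdf" in renderers) else 3
    return renderers, psm


if __name__ == "__main__":
    for configs in ([], ["txt"], ["txt", "pdf"], ["txt", "hocr"]):
        print(configs, "->", plan_outputs(configs))
```

The four calls in the demo correspond to the four command lines listed in the NOTE.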
Stefan Weil
71c9e028f7 tesseractmain: Prettify help message
Commit 99110df757 improved the help text
in several aspects, but also introduced new inconsistencies which this
patch tries to fix.

* Align columns (this needed replacing tabs by spaces).
* Start explanatory text with an uppercase letter.
* Replace "the stdout" by "stdout".
* Small changes in help text for page segmentation modes.
* Split options into OCR options and single options
  (partially revert commit 99110df757).

In addition, whitespace characters at end of lines were removed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-29 10:26:40 +01:00
amitdo
99110df757 tesseractmain.cpp: Split huge main() to sub functions
Add these functions to api/tesseractmain.cpp:
PrintVersionInfo()
PrintUsage()
PrintHelpForPSM()
PrintHelpMessage()
SetVariablesFromCLArgs()
PrintLangsList()
FixPageSegMode()
ParseArgs()
PreloadRenderers()
2015-11-26 11:36:16 +02:00
Stefan Weil
03f37c0cdc tesseractmain: Fix unterminated string
Coverity bug report: CID 1270421 "Buffer not null terminated".

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-24 17:17:17 +01:00
amitdo
6bbcb50dd9 Added osd renderer for psm 0.
Works for single page and multi-page.
2015-10-30 20:09:00 +02:00
amitdo
dcfdd5c035 OSD: Print script name instead of meaningless script id 2015-10-28 09:50:28 +02:00
Stefan Weil
11b2a4d9af api: Fix typos in comments (all found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-09-14 21:54:27 +02:00
Zdenko Podobný
628de5ba3f enable pdfrender with NO_CUBE_BUILD 2015-08-07 23:20:22 +02:00
Zdenko Podobný
41478fd5a1 implement build without cube (-DNO_CUBE_BUILD) 2015-07-24 11:51:44 +02:00
Ray Smith
242b14ae7f Reduced size of multi-renderer implementation from code review 2014-10-09 13:29:46 -07:00
Zdenko Podobný
9e8629d9ef allow multiple output in tesseract executable (https://groups.google.com/d/msg/tesseract-ocr/Z_WUKmJDVxc/1vc3W0xJZ2oJ) 2014-09-19 23:33:47 +02:00
theraysmith@gmail.com
b64ad05096 Improved efficiency of image processing for PDF
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1141 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:15:25 +00:00
zdenop
1156098567 Add font info to hocr output - fix issue 1219
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1132 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-03 16:22:12 +00:00
zdenop@gmail.com
84cdcb32cc fixed windows build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1110 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-26 06:48:58 +00:00
theraysmith@gmail.com
25a8c7b720 Enabled streaming input and output of multi-page documents
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1105 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-21 15:46:21 +00:00
zdenop
2e520f2fac fix hocr/pdf output when image is provided from stdin - issue 1196
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1099 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-11 15:59:47 +00:00
zdenop
ee73e3b107 fix issue 123: user-words (and user-patterns) file specified by command line
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1093 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-04 21:11:00 +00:00
zdenop
d451b28054 fix issue 1127; add unlv output to tesseract executable
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1052 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-03-02 14:40:21 +00:00
zdenop
32789291a8 provide output for -psm 0
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1037 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-01 12:56:36 +00:00
theraysmith@gmail.com
d11dc049e3 Fixed a lot of compiler/clang warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
0d93bb7cfa More code cleanup from patches and fixing warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1011 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-24 21:09:59 +00:00
zdenop@gmail.com
9f2730600d fix segfault for --list-langs
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1006 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 21:32:20 +00:00
zdenop@gmail.com
21756518e2 don't display tesseract info line if output is stdout
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@999 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-22 08:17:37 +00:00
zdenop
f2e4dba850 fix issue 995 - produce output orientation info
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@982 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-12 14:47:15 +00:00
theraysmith@gmail.com
91d2265429 More minor fixes from issues and cleanup
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@974 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 01:38:00 +00:00
theraysmith@gmail.com
256929ce5a Cleaned up stdin implementation
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@969 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 18:48:43 +00:00
theraysmith@gmail.com
f2ec85d1e1 Added PDF renderer
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@962 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:58:55 +00:00
zdenop@gmail.com
9c25eda469 fix issue 813: implement input through stdin
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@936 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-07 16:48:52 +00:00
zdenop
ed28bae8d2 produce only one output file in case of hocr
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@935 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-06 14:01:32 +00:00
zdenop
577e919215 move PERF_COUNT_START message below tesseract message; implement parameter to suppress test blob messages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@932 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-22 21:58:52 +00:00
zdenop
8b3e590123 fix OpenCL build on OSX 10.9; add info about OpenCL to 'tesseract -v'
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@921 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-14 08:35:14 +00:00
rajesh.katikam@gmail.com
b8d7a1d139 Fixed all the crashes observed on 24 bit and 8 bit images.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@919 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-12-10 10:52:54 +00:00
zdenop@gmail.com
e66d433907 fix issue 938: change tessdata-dir/datadir rules; implement --tessdata-dir option
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@907 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-10 20:59:11 +00:00
theraysmith@gmail.com
88ea81c89e Added renderer to API
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@869 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-20 19:39:59 +00:00
zdenop@gmail.com
10c1169d98 remove unused code (Windows related)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@860 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:21:10 +00:00
zdenop@gmail.com
b5d3d66a68 remove unused code(gettext)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@859 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-07 16:39:13 +00:00
zdenop@gmail.com
418a7ad16f allow a text file with a list of images as input
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@855 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-27 21:53:53 +00:00
zdenop@gmail.com
62b2e12b72 replace option -o with -c
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@841 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-02 17:06:14 +00:00
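After 62b2e12b72 the flag for setting configuration variables is `-c`, which Tesseract's help text documents as `-c VAR=VALUE`. The following Python sketch shows one way such arguments could be separated from the rest of the command line. It is an illustration only: the function name `parse_config_args` and the error handling are assumptions, not Tesseract's implementation.

```python
def parse_config_args(argv):
    """Collect NAME=VALUE pairs that follow each -c flag.

    Returns (settings, remaining_args). Raises ValueError when -c is
    not followed by an argument containing '='.
    """
    settings = {}
    rest = []
    i = 0
    while i < len(argv):
        if argv[i] == "-c":
            if i + 1 >= len(argv) or "=" not in argv[i + 1]:
                raise ValueError("-c expects NAME=VALUE")
            # Split on the first '=' only, so values may contain '='.
            name, value = argv[i + 1].split("=", 1)
            settings[name] = value
            i += 2
        else:
            rest.append(argv[i])
            i += 1
    return settings, rest


if __name__ == "__main__":
    args = ["in.tif", "out", "-c", "tessedit_char_whitelist=0123456789"]
    print(parse_config_args(args))
```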
zdenop@gmail.com
7dcfd02c22 Allow arbitrary configuration options to be set from the command line (fix issue 893)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@837 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-29 20:43:14 +00:00
zdenop@gmail.com
a04a5c1f42 Tesseract should exit with an error if ProcessPages fails (fixed issue 891)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@834 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-04-12 08:14:13 +00:00
zdenop@gmail.com
37fb755d47 Add a command-line option (--print-parameters) to dump the parameters to stdout
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@814 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:54:14 +00:00
zdenop@gmail.com
4812fac33e Fix issue 427: print result to stdout instead of to a file
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@813 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:52:42 +00:00
zdenop@gmail.com
8a2b5f0ead Fix issue 808: Check for output file write permissions before performing lengthy OCR operation
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@812 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-12-23 17:49:15 +00:00