44 Commits

Author SHA1 Message Date
Stefan Weil
386dd8a0c0 Update (master branch was renamed to main)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-13 07:42:46 +02:00
Stefan Weil
7fc9a34f79 Rename processed TIFF output file and add page number if needed (fixes issue #3544)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-09-01 14:16:05 +02:00
Stefan Weil
b7e8134dea Update URLs for Google groups
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-04-11 10:43:28 +02:00
Stefan Weil
3f2892bc04 Update description for fry language to match Wikipedia
2020-12-08 05:59:17 +01:00
Merlijn Wajer
5ff273675c tesseract.1.asc: sync with languages available in tessdata-fast
cos, div, fao, fyr, gla, hye are available in Ubuntu's 'tesseract-ocr-*'
packages but not mentioned in the manpage.
2020-12-04 18:16:45 +01:00
Merlijn Wajer
58f7a72f00 Remove references to "kur" and "tgl", add "fil" to man page
"kur" no longer exists; it might now be named "kur_ara" (the old "kur_ara"
is now "kmr", which is actually Latin), but "kur" is not present in
tessdata_fast nor in tessdata_best. [1] [2]

"tgl" (Tagalog) is now named "fil" (Filipino) [3]

[1] https://github.com/tesseract-ocr/langdata/issues/124
[2] https://github.com/tesseract-ocr/tessdata_best/issues/23
[3] https://github.com/tesseract-ocr/langdata/issues/84
2020-12-01 23:43:50 +01:00
Stefan Weil
16553014e0 Replace references to the old wiki by new URLs
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-02-03 11:37:41 +01:00
Stefan Weil
5f76a8495b Sort options alphabetically in tesseract man page
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-16 10:19:00 +01:00
Stefan Weil
b55984fb88 Add description for new --dpi option in tesseract man page
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-16 09:33:41 +01:00
Stefan Weil
26b4457b86 Add description for new --psm values in tesseract man page
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-16 09:24:40 +01:00
Stefan Weil
a6981ae548 Improve man page for tesseract
Format it like the example
https://github.com/asciidoc/asciidoc/blob/master/doc/asciidoc.1.txt.

Replace tab characters by blanks.

Add also a chapter on environment variables.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-16 08:54:28 +01:00
Stefan Weil
e14797563b Update documentation for supported languages
kur_ara.traineddata was renamed to kmr.traineddata.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-15 11:07:54 +01:00
Stefan Weil
85d7feebf7 Add missing documentation for --help-extra
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-03-15 09:36:10 +01:00
Chris Mayo
a9d3efb6e3 Document that configfile can be a file path
Useful for custom config or when pointing tessdata to alternate
traineddata.
2019-03-05 19:47:54 +00:00
Chris Mayo
c3b18cfd27 Improve description of configs and parameters in tesseract(1)
Try to make the relationship between configs, -c and --print-parameters
clearer by always using parameter and not variable.

Include the filenames created by each config.
2019-02-06 20:03:51 +00:00
Chris Mayo
da279e4216 Tidy tesseract(1)
A typo and missing full stops.
2019-02-05 19:58:40 +00:00
Stefan Weil
a0e6586e63 Fix documentation for page segmentation mode 2
It never worked, so add a comment that the implementation is missing.
Add also a to-do comment.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-01-09 13:51:44 +01:00
Jake Sebright
e398601bf5 Include ALTO in list of supported output formats
2018-12-15 10:41:24 +01:00
Stefan Weil
3315931859 Merge and enhance documentation on language and script models
Add also links to the user forum and to the Wiki and update the
history text.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-05 16:55:21 +02:00
Stefan Weil
383dcf70b5 Document some more config options for tesseract
Clarify also the name(s) of the generated OCR result file(s):
Tesseract does not create a file named outbase.txt by default.

Fix also a sentence in the language section.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-05 16:03:51 +02:00
Stefan Weil
3e9b0acc5c Update tesseract man page
- move Tesseract 4 release note to other release notes
- format command line options in text
- add link to release notes (wiki)
- add link to contributors (GitHub)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-04 22:10:22 +02:00
Shree Devi Kumar
0c39d3446b Update tesseract man page about both OCR engines in tesseract 4
2018-10-04 04:01:26 +00:00
Stefan Weil
a387e1f71e Add documentation for lists of images to the tesseract man page
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-19 09:32:02 +02:00
Stefan Weil
6a28cce96b Fix whitespace issues
* Remove whitespace (blanks, tabs, cr) at line endings

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-01 13:19:52 +02:00
Charles Li
84f315db6c Update tesseract.1.asc
Minor typo in options section for --user-patterns
2018-07-02 13:27:45 -07:00
Stefan Weil
365611f24a doc: Fix asciidoc escapes for C++ (#1427)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-25 21:02:33 +02:00
Stefan Weil
15638a5ce4 doc: Add missing language to list (#1368)
tessdata_fast includes bre.traineddata.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-13 18:58:53 +01:00
Stefan Weil
bdf6629722 Update version in README and manpages (#1381)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-12 21:39:29 +01:00
Stefan Weil
08ef815fe5 doc: Remove unsupported traineddata from list (#1367)
The languages dan_frak, deu_frak and slk_frak were contributions.
They are not part of tessdata_fast.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-10 08:41:58 +01:00
Shreeshrii
40f43111e0 Add list of scripts to manpage for tesseract (#1347)
2018-02-24 09:37:25 +01:00
zdenop
44588a3c7c add commas to language list
2018-02-23 11:27:55 +01:00
Zdenko Podobný
035325dfd0 Update language list based on tessdata_fast; fix #1343
2018-02-23 11:19:18 +01:00
Chris Mayo
b231aee212 tidy tesseract(1) adding missing options
Together with:
- fix "C\++"
- align executable --print-parameters message
2017-03-23 20:02:50 +00:00
Stefan Weil
61d0e8f0ff doc: Fix line endings
Remove spaces at line endings and replace CRLF by LF.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-04 20:41:37 +01:00
Stefan Weil
92d981b93a Change tesseract parameter -psm to --psm
For compatibility reasons the old variant is still supported.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-11-30 22:23:46 +01:00
Cristian Ciupitu
71e217f8a9 Fix a typo in tesseract(1) man page
C++ needs to be escaped as C\+\+ in the AsciiDoc source code.
2016-11-08 23:20:48 +02:00
Zdenko Podobný
dcc457cc05 add new lang info
2015-06-28 22:26:39 +02:00
Zdenko Podobný
9b7f2527f1 fix links in doc; autotools requires README
2015-06-13 00:08:05 +02:00
zdenop
19ddc89c44 update tesseract manpage and INSTALL.SVN
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1131 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-02 20:59:19 +00:00
david.eger@gmail.com
a253ea224a Add some documentation on how to use config files and user dictionaries.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@719 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-09 19:41:06 +00:00
david.eger@gmail.com
58e06c8c45 Update man pages for Tesseract 3.02.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@670 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-09 22:55:47 +00:00
joregan
ceb787c0a4 change table to horizontal list, because the table stuff looks awful in the generated man page
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@471 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-29 17:59:10 +00:00
joregan
2c76f06155 fix the damned escaping on C++
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@470 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-29 17:56:39 +00:00
joregan
ec7bc49cc1 using asciidoc source probably makes more sense
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@469 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-29 16:46:04 +00:00