Update version in README and manpages (#1381)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Stefan Weil 2018-03-12 21:39:29 +01:00 committed by zdenop
parent 8fb68746fb
commit bdf6629722
3 changed files with 22 additions and 22 deletions

@@ -33,7 +33,7 @@ In 2005 Tesseract was open sourced by HP. Since 2006 it is developed by Google.
 The latest stable version is **[3.05.01](https://github.com/tesseract-ocr/tesseract/releases/tag/3.05.01)**, released on June 1, 2017. Latest source code for 3.05 is available from [3.05 branch on GitHub](https://github.com/tesseract-ocr/tesseract/tree/3.05).
-Source code for the new **[LSTM based 4.00.00alpha version](https://github.com/tesseract-ocr/tesseract)** is available from the master branch on GitHub. Please note this branch is under active development.
+Source code for the new **[LSTM based 4.0 version](https://github.com/tesseract-ocr/tesseract)** is available from the master branch on GitHub. Please note this branch is under active development.
 See **[Release Notes](https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes)** and **[Change Log](https://github.com/tesseract-ocr/tesseract/blob/master/ChangeLog)** for more details of the releases.

@@ -11,7 +11,7 @@ SYNOPSIS
 DESCRIPTION
 -----------
 combine_tessdata(1) is the main program to combine/extract/overwrite/list/compact
 tessdata components in [lang].traineddata files.
 To combine all the individual tessdata components (unicharset, DAWGs,
@@ -59,10 +59,10 @@ OPTIONS
 *-c* '.traineddata' 'FILE'...:
 Compacts the LSTM component in the .traineddata file to int.
 *-d* '.traineddata' 'FILE'...:
 Lists directory of components from the .traineddata file.
 *-e* '.traineddata' 'FILE'...:
 Extracts the specified components from the .traineddata file
@@ -81,7 +81,7 @@ CAVEATS
 COMPONENTS
 ----------
 The components in a Tesseract lang.traineddata file as of
-Tesseract 4.00alpha are briefly described below; For more information on
+Tesseract 4.0 are briefly described below; For more information on
 many of these files, see
 <https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract>
 and
@@ -89,7 +89,7 @@ and
 lang.config::
 (Optional) Language-specific overrides to default config variables.
-For 4.00alpha traineddata files, lang.config provides control parameters which
+For 4.0 traineddata files, lang.config provides control parameters which
 can affect layout analysis, and sub-languages.
 lang.unicharset::
@@ -148,36 +148,36 @@ lang.params-model::
 (Optional - 3.0x legacy tesseract) .
 lang.lstm::
-(Required - 4.00alpha LSTM) Neural net trained recognition model generated by lstmtraining.
+(Required - 4.0 LSTM) Neural net trained recognition model generated by lstmtraining.
 lang.lstm-punc-dawg::
-(Optional - 4.00alpha LSTM) A dawg made from punctuation patterns found around words.
+(Optional - 4.0 LSTM) A dawg made from punctuation patterns found around words.
 The "word" part is replaced by a single space. Uses lang.lstm-unicharset.
 lang.lstm-word-dawg::
-(Optional - 4.00alpha LSTM) A dawg made from dictionary words from the language.
+(Optional - 4.0 LSTM) A dawg made from dictionary words from the language.
 Uses lang.lstm-unicharset.
 lang.lstm-number-dawg::
-(Optional - 4.00alpha LSTM) A dawg made from tokens which originally contained digits.
+(Optional - 4.0 LSTM) A dawg made from tokens which originally contained digits.
 Each digit is replaced by a space character. Uses lang.lstm-unicharset.
 lang.lstm-unicharset::
-(Required - 4.00alpha LSTM) The unicode character set that Tesseract recognizes, with properties.
+(Required - 4.0 LSTM) The unicode character set that Tesseract recognizes, with properties.
 Same unicharset must be used to train the LSTM and build the lstm-*-dawgs files.
 lang.lstm-recoder::
-(Required - 4.00alpha LSTM) Unicharcompress, aka the recoder, which maps the unicharset
+(Required - 4.0 LSTM) Unicharcompress, aka the recoder, which maps the unicharset
 further to the codes actually used by the neural network recognizer. This is created as
 part of the starter traineddata by combine_lang_model.
 lang.version::
 (Optional) Version string for the traineddata file.
-First appeared in version 4.00alpha of Tesseract.
+First appeared in version 4.0 of Tesseract.
 Old version of traineddata files will report Version string:Pre-4.0.0.
-4.00alpha version of traineddata files may include the network spec
+4.0 version of traineddata files may include the network spec
 used for LSTM training as part of version string.
 HISTORY
 -------
 combine_tessdata(1) first appeared in version 3.00 of Tesseract

@@ -115,7 +115,7 @@ SINGLE OPTIONS
 LANGUAGES
 ---------
-The currently available traineddata files for tesseract 4.00
+The currently available traineddata files for tesseract 4.0
 for the following languages are in
 (in https://github.com/tesseract-ocr/tessdata_fast):
@@ -244,7 +244,7 @@ argument '-l foo'.
 SCRIPTS
 -------
-The traineddata files for the following scripts for tesseract 4.00
+The traineddata files for the following scripts for tesseract 4.0
 are also in https://github.com/tesseract-ocr/tessdata_fast.
 In most cases, each of these contains all the languages that use that script PLUS English.