Update version in README and manpages (#1381)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Stefan Weil 2018-03-12 21:39:29 +01:00 committed by zdenop
parent 8fb68746fb
commit bdf6629722
3 changed files with 22 additions and 22 deletions

@@ -33,7 +33,7 @@ In 2005 Tesseract was open sourced by HP. Since 2006 it is developed by Google.
The latest stable version is **[3.05.01](https://github.com/tesseract-ocr/tesseract/releases/tag/3.05.01)**, released on June 1, 2017. Latest source code for 3.05 is available from [3.05 branch on GitHub](https://github.com/tesseract-ocr/tesseract/tree/3.05).
-Source code for the new **[LSTM based 4.00.00alpha version](https://github.com/tesseract-ocr/tesseract)** is available from the master branch on GitHub. Please note this branch is under active development.
+Source code for the new **[LSTM based 4.0 version](https://github.com/tesseract-ocr/tesseract)** is available from the master branch on GitHub. Please note this branch is under active development.
See **[Release Notes](https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes)** and **[Change Log](https://github.com/tesseract-ocr/tesseract/blob/master/ChangeLog)** for more details of the releases.

@@ -81,7 +81,7 @@ CAVEATS
COMPONENTS
----------
The components in a Tesseract lang.traineddata file as of
-Tesseract 4.00alpha are briefly described below; For more information on
+Tesseract 4.0 are briefly described below; For more information on
many of these files, see
<https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract>
and
@@ -89,7 +89,7 @@ and
lang.config::
(Optional) Language-specific overrides to default config variables.
-For 4.00alpha traineddata files, lang.config provides control parameters which
+For 4.0 traineddata files, lang.config provides control parameters which
can affect layout analysis, and sub-languages.
lang.unicharset::
@@ -148,34 +148,34 @@ lang.params-model::
(Optional - 3.0x legacy tesseract) .
lang.lstm::
-(Required - 4.00alpha LSTM) Neural net trained recognition model generated by lstmtraining.
+(Required - 4.0 LSTM) Neural net trained recognition model generated by lstmtraining.
lang.lstm-punc-dawg::
-(Optional - 4.00alpha LSTM) A dawg made from punctuation patterns found around words.
+(Optional - 4.0 LSTM) A dawg made from punctuation patterns found around words.
The "word" part is replaced by a single space. Uses lang.lstm-unicharset.
lang.lstm-word-dawg::
-(Optional - 4.00alpha LSTM) A dawg made from dictionary words from the language.
+(Optional - 4.0 LSTM) A dawg made from dictionary words from the language.
Uses lang.lstm-unicharset.
lang.lstm-number-dawg::
-(Optional - 4.00alpha LSTM) A dawg made from tokens which originally contained digits.
+(Optional - 4.0 LSTM) A dawg made from tokens which originally contained digits.
Each digit is replaced by a space character. Uses lang.lstm-unicharset.
lang.lstm-unicharset::
-(Required - 4.00alpha LSTM) The unicode character set that Tesseract recognizes, with properties.
+(Required - 4.0 LSTM) The unicode character set that Tesseract recognizes, with properties.
Same unicharset must be used to train the LSTM and build the lstm-*-dawgs files.
lang.lstm-recoder::
-(Required - 4.00alpha LSTM) Unicharcompress, aka the recoder, which maps the unicharset
+(Required - 4.0 LSTM) Unicharcompress, aka the recoder, which maps the unicharset
further to the codes actually used by the neural network recognizer. This is created as
part of the starter traineddata by combine_lang_model.
lang.version::
(Optional) Version string for the traineddata file.
-First appeared in version 4.00alpha of Tesseract.
+First appeared in version 4.0 of Tesseract.
Old version of traineddata files will report Version string:Pre-4.0.0.
-4.00alpha version of traineddata files may include the network spec
+4.0 version of traineddata files may include the network spec
used for LSTM training as part of version string.
HISTORY

@@ -115,7 +115,7 @@ SINGLE OPTIONS
LANGUAGES
---------
-The currently available traineddata files for tesseract 4.00
+The currently available traineddata files for tesseract 4.0
for the following languages are in
(in https://github.com/tesseract-ocr/tessdata_fast):
@@ -244,7 +244,7 @@ argument '-l foo'.
SCRIPTS
-------
-The traineddata files for the following scripts for tesseract 4.00
+The traineddata files for the following scripts for tesseract 4.0
are also in https://github.com/tesseract-ocr/tessdata_fast.
In most cases, each of these contains all the languages that use that script PLUS English.