Updated 4.0 with LSTM (markdown)

Amit D 2016-11-30 10:43:41 +02:00
parent 2fed661380
commit 6600b098fd

@ -3,12 +3,14 @@
The upcoming 4.0 release of Tesseract-ocr will be based on LSTM technology, as per [the slides in the DAS 2016 tutorial](https://github.com/tesseract-ocr/docs/tree/master/das_tutorial2016).
[Status - Nov 21, 2016] Initial alpha of source code and documentation added.
Model data for 100 languages is still under test and coming soon.
Fixes to the training process and training documentation will follow.
Please see [NeuralNetsInTesseract4.00](NeuralNetsInTesseract4.00) for more
information, and note that there are still some important omissions, which are
documented there. The initial code works (well) on x86/Linux.
[Nov 28, 2016]
Model data for 101 languages was uploaded to the [tessdata repository](https://github.com/tesseract-ocr/tessdata).
[Update - Nov 30, 2016] See comments on
https://github.com/tesseract-ocr/tesseract/issues/40