Updated Data Files in tessdata_fast (markdown)

2025-07-21 11:36:15 +08:00 · 2019-08-07 14:03:40 +05:30 · 2c1a8015e9
commit 2c1a8015e9
parent 13c011b7e8
1 changed files with 6 additions and 1 deletions
--- a/Data-Files-in-tessdata_fast.md
+++ b/Data-Files-in-tessdata_fast.md
@@ -1,6 +1,6 @@
 ## Traineddata Files for Version 4.00 +

-We have three sets of .traineddata files for `tesseract` versions 4.00 and above on GitHub in three separate repositories. 
+We have three sets of official .traineddata files trained at Google, for `tesseract` versions 4.00 and above, in three separate repositories. 

 * [tessdata_fast](https://github.com/tesseract-ocr/tessdata_fast) (Sep 2017) best "value for money" in speed vs accuracy, Integer models
 * [tessdata_best](https://github.com/tesseract-ocr/tessdata_best) (Sep 2017) best results on Google's eval data, slower, Float models. These are the only models that can be used as base for finetune training
@@ -8,6 +8,11 @@ We have three sets of .traineddata files for `tesseract` versions 4.00 and above

 When using the traineddata files from the **`tessdata_best`** and **`tessdata_fast`** repositories, only the new LSTM-based OCR engine is supported. The legacy tesseract engine is NOT supported with these files, so Tesseract's oem modes '0' and '2' won't work with them. 

+Community contributed traineddata files can be found at:
+
+* [tessdata_contrib](https://github.com/tesseract-ocr/tessdata_contrib) repo
+* [Wiki page with links to external repos](https://github.com/tesseract-ocr/tesseract/wiki/Data-Files-Contributions)
+
 ## Information specific to tessdata_fast

 First, `fast` is trained with a spec that produces a smaller net than `best`. Because the model is smaller, prediction is faster.
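
The engine-mode restriction described in the diff above (only the LSTM engine works with `tessdata_fast` and `tessdata_best`, so oem modes '0' and '2' fail) can be sketched as a small command builder. This is a minimal illustration, not part of the wiki page: the helper name `build_tesseract_cmd`, the file names, and the `./tessdata_fast` directory are hypothetical, and the only tesseract options assumed are the standard `--tessdata-dir`, `-l`, and `--oem` flags.

```python
import shlex

# Engine modes that depend on the legacy engine (0 = legacy only,
# 2 = legacy + LSTM). Files from tessdata_fast and tessdata_best carry
# only LSTM models, so these modes cannot work with them.
LEGACY_DEPENDENT_OEMS = {0, 2}


def build_tesseract_cmd(image, out_base, lang="eng",
                        tessdata_dir="./tessdata_fast", oem=1):
    """Return an argv list for a tesseract run against LSTM-only data.

    Hypothetical helper: it only assembles the command line and does
    not check that tesseract or the traineddata files are installed.
    """
    if oem in LEGACY_DEPENDENT_OEMS:
        raise ValueError(
            f"--oem {oem} requires the legacy engine, which the "
            "tessdata_fast/tessdata_best files do not support"
        )
    return [
        "tesseract", image, out_base,
        "--tessdata-dir", tessdata_dir,
        "-l", lang,
        "--oem", str(oem),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # without a local tesseract installation.
    print(shlex.join(build_tesseract_cmd("page.png", "page")))
```

Passing the returned list to `subprocess.run` would execute the OCR run. Rejecting modes 0 and 2 up front makes the restriction visible before tesseract is invoked at all.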