mirror of https://github.com/tesseract-ocr/tesseract.git (synced 2024-11-24 02:59:07 +08:00)
Update language list based on tessdata_fast; fix #1343
parent 6f80c35b3f
commit 035325dfd0
@ -115,8 +115,9 @@ SINGLE OPTIONS
LANGUAGES
---------

There are currently language packs available for the following languages
(in https://github.com/tesseract-ocr/tessdata):
The currently available traineddata files for tesseract 4.00
for the following languages are in
(in https://github.com/tesseract-ocr/tessdata_fast):

*afr* (Afrikaans)
*amh* (Amharic)
@ -176,26 +177,33 @@ There are currently language packs available for the following languages
*khm* (Central Khmer)
*kir* (Kirghiz; Kyrgyz)
*kor* (Korean)
*kor_vert* (Korean (vertical))
*kur* (Kurdish)
*kur_ara* (Kurdish (Arabic))
*lao* (Lao)
*lat* (Latin)
*lav* (Latvian)
*lit* (Lithuanian)
*ltz* (Luxembourgish)
*mal* (Malayalam)
*mar* (Marathi)
*mkd* (Macedonian)
*mlt* (Maltese)
*mon* (Mongolian)
*mri* (Maori)
*msa* (Malay)
*mya* (Burmese)
*nep* (Nepali)
*nld* (Dutch; Flemish)
*nor* (Norwegian)
*oci* (Occitan (post 1500))
*ori* (Oriya)
*osd* (Orientation and script detection module)
*pan* (Panjabi; Punjabi)
*pol* (Polish)
*por* (Portuguese)
*pus* (Pushto; Pashto)
*que* (Quechua)
*ron* (Romanian; Moldavian; Moldovan)
*rus* (Russian)
*san* (Sanskrit)
@ -203,20 +211,24 @@ There are currently language packs available for the following languages
*slk* (Slovak)
*slk_frak* (Slovak - Fraktur)
*slv* (Slovenian)
*snd* (Sindhi)
*spa* (Spanish; Castilian)
*spa_old* (Spanish; Castilian - Old)
*sqi* (Albanian)
*srp* (Serbian)
*srp_latn* (Serbian - Latin)
*sun* (Sundanese)
*swa* (Swahili)
*swe* (Swedish)
*syr* (Syriac)
*tam* (Tamil)
*tat* (Tatar)
*tel* (Telugu)
*tgk* (Tajik)
*tgl* (Tagalog)
*tha* (Thai)
*tir* (Tigrinya)
*ton* (Tonga)
*tur* (Turkish)
*uig* (Uighur; Uyghur)
*ukr* (Ukrainian)
@ -225,6 +237,7 @@ There are currently language packs available for the following languages
*uzb_cyrl* (Uzbek - Cyrillic)
*vie* (Vietnamese)
*yid* (Yiddish)
*yor* (Yoruba)

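The three-letter codes in the list above are the values that tesseract's `-l` option expects, and several codes can be combined with `+`. The following is a minimal sketch of how such a command line might be assembled, assuming the `tesseract` binary is on `PATH` and the matching traineddata files are installed. The helper names `lang_arg` and `tesseract_command` and the file names `page.png` and `out` are hypothetical and used only for illustration.

```python
def lang_arg(codes):
    """Join language codes (e.g. 'kor', 'kor_vert') into a value for -l.

    Hypothetical helper, not part of tesseract itself.
    """
    codes = list(codes)
    if not codes:
        raise ValueError("at least one language code is required")
    for code in codes:
        # '+' is the separator, so it cannot appear inside a single code.
        if not code or "+" in code or any(ch.isspace() for ch in code):
            raise ValueError(f"unexpected language code: {code!r}")
    return "+".join(codes)


def tesseract_command(image, output_base, codes):
    """Build the argument list only; nothing is executed here."""
    return ["tesseract", image, output_base, "-l", lang_arg(codes)]


if __name__ == "__main__":
    # Illustrative file names; pass the list to subprocess.run to execute it.
    print(tesseract_command("page.png", "out", ["kor", "kor_vert"]))
```

Building the arguments as a list rather than a single string avoids shell quoting problems if the result is later handed to `subprocess.run`.
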
To use a non-standard language pack named *foo.traineddata*, set the
*TESSDATA_PREFIX* environment variable so the file can be found at