From 59555e6b1ac94b05a1178463bb80e02ac8df309c Mon Sep 17 00:00:00 2001
From: Shreeshrii
Date: Fri, 18 Mar 2016 15:43:21 +0530
Subject: [PATCH] Updated Command Line Usage (markdown)

---
 Command-Line-Usage.md | 130 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 119 insertions(+), 11 deletions(-)

diff --git a/Command-Line-Usage.md b/Command-Line-Usage.md
index 18e6cbf..5c09dc8 100644
--- a/Command-Line-Usage.md
+++ b/Command-Line-Usage.md
@@ -49,31 +49,139 @@ osd.traineddata, for Orientation and Segmentation and eng.traineddata and other
 The following command would give the same result as above, if eng.traineddata and osd.traineddata files are in /usr/share/tessdata directory.
 
     tesseract --tessdata-dir /usr/share imagename outputbase -l eng psm 3
+____________________________________
 
-## Using One Language
-
-    tesseract --tessdata-dir /usr/share ./testing/phototest.tif ./testing/phototest -l eng
-
-## Using Multiple Languages
-
-    tesseract --tessdata-dir /usr/share ./testing/eurotext.png ./testing/eurotext-engdeu -l eng+deu
+Following examples use this image which has text in multiple languages.
 
 ![eurotext.png](http://dev.blog.fairway.ne.jp/wp-content/uploads/2014/04/eurotext.png)
 
+## Using One Language
+
+    tesseract --tessdata-dir ./ ./testing/eurotext.png ./testing/eurotext-eng -l eng
+
+Output
+
+    The (quick) [brown] {fox} jumps!
+    Over the $43,456.78 #90 dog
+    & duck/goose, as 12.5% of E-mail
+    from aspammer@website.com is spam.
+    Der ,,schnelle” braune Fuchs springt
+    fiber den faulen Hund. Le renard brun
+    «rapide» saute par-dessus le chien
+    paresseux. La volpe marrone rapida
+    salta sopra i] cane pigro. El zorro
+    marrén répido salta sobre el perro
+    perezoso. A raposa marrom répida
+    salta sobre 0 C50 preguieoso.
+
+## Using Multiple Languages
+
+    tesseract --tessdata-dir ./ ./testing/eurotext.png ./testing/eurotext-engdeu -l eng+deu
+
+Output
+
+    The (quick) [brown] {fox} jumps!
+    Over the $43,456.78 #90 dog
+    & duck/goose, as 12.5% of E-mail
+    from aspammer@website.com is spam.
+    Der „schnelle” braune Fuchs springt
+    über den faulen Hund. Le renard brun
+    «rapide» saute par-dessus le chien
+    paresseux. La volpe marrone rapida
+    salta sopra il cane pigro. El zorro
+    marrön räpido salta sobre el perro
+    perezoso. A raposa marrom räpida
+    salta sobre o cäo preguieoso.
+
 The output can be different based on the order of languages, so -l eng+deu can give different result than -l deu+eng.
 
 ## Using different Page Segmentation Modes
 
-    tesseract --tessdata-dir /usr/share testing/san002.tif testing/san002-psm3 -l san
+The following examples are using this image with text in Devanagari script and Sanskrit language.
 
-    tesseract --tessdata-dir /usr/share testing/san002.tif testing/san002-psm6 -l san -psm 6
+![san002.png] (https://cloud.githubusercontent.com/assets/82178/13678011/81953684-e6ba-11e5-91e8-5c40518e94a6.png)
 
-## Searchable pdf ouptput
+    tesseract --tessdata-dir /usr/share testing/san002.png testing/san002-psm6 -l san -psm 6
+
+Output
+
+    विर्व्य 16
+    ज्यालत्रुखीसह्स्रनामक्तोव्रम्- नामाकळिट्. 191
+    दुर्गासहस्रनामस्तीत्रम्- १ नामांक्ळिन्नू ॰213
+    द्रुर्गासहस्रनत्मस्तीन्रम्- २ नामावळिऽ 238
+    द्दुगसिद्द्स्रनत्मक्तोत्रम्दकाराद्दि(३) नामाव'ळिऽ 263
+    ट्टुगसिहस्रनामक्तोत्रम्- ४ नामावळिइं 300
+    पार्वतीं ह्यो) सहस्रनामातोत्रम्- नामावळिऽ’ 329
+    द्दुर्गानवाक्षरीन्निशतींनत्माव'क्ति 355
+    द्बुर्गाष्टोत्तरङ्प्तनत्मरतोव्रम्- नामावक्ति 360
+    र्व्यत्मामस्वोत्रम्- नामाक्ळिऽ 363
+    अन्नपूण्स्सिहस्रनत्मस्तीत्रम्- नामावक्ति 365
+    अन्नघूर्गाष्टोत्तस्यातनामस्तीन्रम्- नामावक्ति 394
+    क्रुलकुर्व्यसहस्रनत्मक्तोत्रम्- कवचम्… नामावळिथ् 397-
+    कुमारींसहृस्रनामक्तोन्नम्- नामावळिय् 432
+    गङ्ग’म्यासद्वृस्रनप्मक्तोव्रम्- नाम।वक्ति` 457
+    गङ्ग’म्याष्टोत्तराप्तनामप्तोत्रम्- नामावळिऽ 488
+    गङ्गादातनप्तास्तोत्रम्- नामावक्ति 491
+    यमुनासहस्रनामरतोव्रम्- नम्पावळिय् 493
+    'शिवगङ्गासद्दृस्रनत्माव'ळि 517
+    गम्पत्रीसह्स्रनत्मक्तोत्रम्- नाम।व'ळिऽ (१) 531
+
+## Searchable pdf output
+
+    tesseract --tessdata-dir ./ ./testing/eurotext.png ./testing/eurotext-eng -l eng pdf
 
 ## HOCR output
 
+    tesseract --tessdata-dir ./ ./testing/eurotext.png ./testing/eurotext-eng -l eng hocr
+
+Output
+
+
+
+
+
+
+
+
+
+
+
+
+

+    The (quick) [brown] {fox} jumps!
+
+    Over the $43,456.78 <lazy> #90 dog
+
+    & duck/goose, as 12.5% of E-mail
+
+    from aspammer@website.com is spam.
+
+    Der ,,schnelle” braune Fuchs springt
+
+    fiber den faulen Hund. Le renard brun
+
+    «rapide» saute par-dessus le chien
+
+    paresseux. La volpe marrone rapida
+
+    salta sopra i] cane pigro. El zorro
+
+    marrén répido salta sobre el perro
+
+    perezoso. A raposa marrom répida
+
+    salta sobre 0 C50 preguieoso.
+
+

+
+
+
+
+
+
 ## TSV output (only available in 3.05-dev in master branch)
 
-