diff --git a/Command-Line-Usage.md b/Command-Line-Usage.md
index e4c2583..4c5f92a 100644
--- a/Command-Line-Usage.md
+++ b/Command-Line-Usage.md
@@ -42,7 +42,9 @@ tesseract imagename outputbase
 
 
 
-This uses English as the default language and 3 as the Page Segmentation Mode. The default output format is text. osd.traineddata, for Orientation and Segmentation and eng.traineddata and other language data files for English should be in the tessdata directory. TESSDATA_PREFIX environment variable should be set to the parent directory of your "tessdata" directory.
+This uses **English **as the default language and 3 as the Page Segmentation Mode. The default output format is **text**.
+
+osd.traineddata, for Orientation and Segmentation and eng.traineddata and other language data files for English should be in the "tessdata" directory. TESSDATA_PREFIX environment variable should be set to the parent directory of "tessdata" directory.
 
 The following command would give the same result as above, if eng.traineddata and osd.traineddata files are in /usr/share/tessdata directory.
 
@@ -50,14 +52,22 @@ The following command would give the same result as above, if eng.traineddata an
 
 ## Using One Language
 
-    tesseract --tessdata-dir /usr/share ./testing/phototest.tif ./testing/phototest -l eng -psm 3
+    tesseract --tessdata-dir /usr/share ./testing/phototest.tif ./testing/phototest -l eng
 
-![phototest.tif](https://github.com/tesseract-ocr/tesseract/blob/master/testing/phototest.tif?raw=true)
+ ![phototest.tif](https://github.com/tesseract-ocr/tesseract/blob/master/testing/phototest.tif?raw=true)
 
 ## Using Multiple Languages
 
+    tesseract --tessdata-dir /usr/share ./testing/eurotext.tif ./testing/eurotext-engdeu -l eng+deu
+
+The output can be different based on the order of languages, so -l eng+deu can give different result than -l deu+eng.
+
 ## Using different Page Segmentation Modes
 
+    tesseract --tessdata-dir /usr/share testing/san002.tif testing/san002-psm3 -l san
+
+    tesseract --tessdata-dir /usr/share testing/san002.tif testing/san002-psm6 -l san -psm 6
+
 ## Searchable pdf ouptput
 
 ## HOCR output
@@ -67,4 +77,3 @@ The following command would give the same result as above, if eng.traineddata an
 
 
 
-