Minor edits to Readme

2024-11-24 02:59:07 +08:00 · 2015-05-21 19:23:42 -07:00
commit a36a5f96d0
parent f8ebff262e
1 changed file with 35 additions and 31 deletions
--- a/README.md
+++ b/README.md
@@ -1,32 +1,35 @@
 Note that this is a text-only and possibly out-of-date version of the 
 wiki ReadMe, which is located at:
-  https://github.com/tesseract-ocr/tesseract/blob/master/README
+  https://github.com/tesseract-ocr/tesseract/blob/master/README.md
 Introduction
 ============
 This package contains the Tesseract Open Source OCR Engine.
-Originally developed at Hewlett Packard Laboratories Bristol and
+Originally developed at Hewlett-Packard Laboratories Bristol and
-at Hewlett Packard Co, Greeley Colorado, all the code
+at Hewlett-Packard Co, Greeley Colorado, all the code
 in this distribution is now licensed under the Apache License:
- * Licensed under the Apache License, Version 2.0 (the "License");
+    Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
+    you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
+    You may obtain a copy of the License at
- * http://www.apache.org/licenses/LICENSE-2.0
+
- * Unless required by applicable law or agreed to in writing, software
+       http://www.apache.org/licenses/LICENSE-2.0
- * distributed under the License is distributed on an "AS IS" BASIS,
+
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    Unless required by applicable law or agreed to in writing, software
- * See the License for the specific language governing permissions and
+    distributed under the License is distributed on an "AS IS" BASIS,
- * limitations under the License.
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
 Dependencies and Licenses
 =========================
-Leptonica is required. (www.leptonica.com). Tesseract no longer compiles
+[Leptonica](http://www.leptonica.com) is required. Tesseract no longer 
-without Leptonica.
+compiles without Leptonica.
 Libtiff is no longer required as a direct dependency.
@@ -34,15 +37,16 @@ Installing and Running Tesseract
 --------------------------------
 All Users Do NOT Ignore!
 The tarballs are split into pieces.
 tesseract-x.xx.tar.gz contains all the source code.
-tesseract-x.xx.<lang>.tar.gz contains the language data files for <lang>.
+tesseract-x.xx.`<lang>`.tar.gz contains the language data files for `<lang>`.
 You need at least one of these or Tesseract will not work.
 Note that tesseract-x.xx.tar.gz unpacks to the tesseract-ocr directory.
-tesseract-x.xx.<lang>.tar.gz unpacks to the tessdata directory which 
+tesseract-x.xx.`<lang>`.tar.gz unpacks to the tessdata directory which 
 belongs inside your tesseract-ocr directory. It is therefore best to 
 download them into your tesseract-x.xx directory, so you can use unpack 
 here or equivalent. You can unpack as many of the language packs as you 
@@ -52,7 +56,7 @@ before you run make install. If you unpack them as root to the
 destination directory of make install, then the user ids and access
 permissions might be messed up.
-boxtiff-2.xx.<lang>.tar.gz contains data that was used in training for 
+boxtiff-2.xx.`<lang>`.tar.gz contains data that was used in training for 
 those that want to do their own training. Most users should NOT download
 these files.
@@ -63,8 +67,8 @@ Tesseract wiki https://github.com/tesseract-ocr/tesseract/wiki
 Windows
 -------
-Please use installer (for 3.00 and above). Tesseract is library with 
+Please use the installer (for 3.00 and above). Tesseract is a library with a 
-command line interface. If you need GUI, please check AddOns wiki page
+command line interface. If you need a GUI, please check the AddOns wiki page.
 TODO-UPDATE-WIKI-LINKS
@@ -74,7 +78,7 @@ If you are building from the sources, the recommended build platform is
 VC++ Express 2008 (optionally 2010).
 The executables are built with static linking, so they stand more chance
-of working out of the box on more windows systems.
+of working out of the box on more Windows systems.
 The executable must reside in the same directory as the tessdata 
 directory or you need to set up environment variable TESSDATA_PREFIX.
@@ -82,7 +86,7 @@ Installer will set it up for you.
 The command line is:
-tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfiles...]
+    tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfiles...]
 If you need interface to other applications, please check wrapper section
 on AddOns wiki page:
@@ -98,19 +102,19 @@ Non-Windows (or Cygwin)
 You have to tell Tesseract through a standard unix mechanism where to 
 find its data directory. You must either:
-./autogen.sh
+    ./autogen.sh
-./configure
+    ./configure
-make
+    make
-make install
+    make install
-sudo ldconfig
+    sudo ldconfig
 to move the data files to the standard place, or:
-export TESSDATA_PREFIX="directory in which your tessdata resides/"
+    export TESSDATA_PREFIX="directory in which your tessdata resides/"
 In either case the command line is:
-tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfiles...]
+    tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfiles...]
 New there is a tesseract.spec for making rpms. (Thanks to Andrew Ziem for
 the help.) It might work with your OS if you know how to do that.
@@ -126,8 +130,8 @@ instead of `./configure` above.
 History
 =======
-The engine was developed at Hewlett Packard Laboratories Bristol and
+The engine was developed at Hewlett-Packard Laboratories Bristol and
-at Hewlett Packard Co, Greeley Colorado between 1985 and 1994, with some
+at Hewlett-Packard Co, Greeley Colorado between 1985 and 1994, with some
 more changes made in 1996 to port to Windows, and some C++izing in 1998.
 A lot of the code was written in C, and then some more was written in C++.
 Since then all the code has been converted to at least compile with a C++
@@ -138,7 +142,7 @@ lists, but has the big negative that if you do get a segmentation violation,
 it is hard to debug.
 The most recent change is that Tesseract can now recognize 39 languages,
-including Arabic, Hindi, Vietnamese, plus 3 Fraktur variants 
+including Arabic, Hindi, Vietnamese, plus 3 Fraktur variants, 
 is fully UTF8 capable, and is fully trainable. See TrainingTesseract for
 more information on training.
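
Note on the usage line touched by this diff: the README gives the command line as `tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfiles...]` and says that `TESSDATA_PREFIX` should name the directory in which `tessdata` resides. The following is a minimal Python sketch of how a script might assemble that command and environment. The helper names `build_tesseract_command` and `build_environment`, and the file names in the usage example, are hypothetical and not part of Tesseract; the `-psm` spelling follows this README and may differ in other releases.

```python
import os
import shlex


def build_tesseract_command(imagename, outputbase, lang=None, psm=None,
                            configfiles=()):
    """Assemble an argv list that mirrors the README usage line.

    Hypothetical helper: it only builds the argument list and does not
    check that tesseract is installed or that the image exists.
    """
    argv = ["tesseract", imagename, outputbase]
    if lang is not None:
        argv += ["-l", lang]          # optional language, e.g. "eng"
    if psm is not None:
        argv += ["-psm", str(psm)]    # optional page segmentation mode
    argv += list(configfiles)         # optional trailing config files
    return argv


def build_environment(tessdata_parent=None, base_env=None):
    """Return a copy of the environment, optionally with TESSDATA_PREFIX set.

    tessdata_parent is the directory that contains the tessdata directory.
    A trailing slash is added, matching the export example in the README.
    """
    env = dict(os.environ if base_env is None else base_env)
    if tessdata_parent is not None:
        env["TESSDATA_PREFIX"] = tessdata_parent.rstrip("/") + "/"
    return env


if __name__ == "__main__":
    # Example values are placeholders, not files shipped with Tesseract.
    cmd = build_tesseract_command("page.tif", "page_out", lang="eng", psm=3)
    print(shlex.join(cmd))
```

The resulting list and environment could be passed to `subprocess.run(cmd, env=env)` on a machine where Tesseract is installed; the sketch stops short of running it so that it works without Tesseract present.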