# autotools (LINUX/UNIX, msys...)
If you have cloned Tesseract from GitHub, you must generate
the configure script.

If you have a Tesseract 4.0x installation on your system, please remove it
before starting a new build.

You need Leptonica 1.74.2 (minimum) for Tesseract 4.0x.
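
The Leptonica requirement can be checked before configuring. The following is only a minimal sketch: it assumes `pkg-config` is installed, that Leptonica registers itself under the module name `lept`, and that `sort` supports `-V`. `version_ge` is a hypothetical helper, not part of Tesseract.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V` (version sort), which is an assumption about your system.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Ask pkg-config for the installed Leptonica version; fall back to 0 if unknown.
# The module name `lept` is an assumption about how Leptonica was packaged.
lept_version=$(pkg-config --modversion lept 2>/dev/null || echo 0)

if version_ge "$lept_version" 1.74.2; then
    echo "Leptonica $lept_version is new enough"
else
    echo "Leptonica 1.74.2 or newer not found (detected: $lept_version)"
fi
```

If Leptonica was installed to a non-standard prefix, `pkg-config` may need `PKG_CONFIG_PATH` set before it can find the module.
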

Known dependencies for training tools (excluding leptonica):

* compiler with C++11 support
* automake
* pkg-config
* pango-devel
* cairo-devel
* icu-devel
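
A quick way to see which of the dependencies above are missing is sketched below. `missing_tools` is a hypothetical helper, and the `pkg-config` module names (`pango`, `cairo`, `icu-uc`) are assumptions about how your distribution packages the development libraries.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that is
# not available in PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the build tools named in the dependency list.
missing=$(missing_tools automake pkg-config)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi

# Check the development libraries; the module names are assumptions.
for module in pango cairo icu-uc; do
    pkg-config --exists "$module" 2>/dev/null ||
        echo "pkg-config module not found: $module"
done
```
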

The steps for building Tesseract are:

    $ ./autogen.sh
    $ ./configure
    $ make
    $ sudo make install
    $ sudo ldconfig
    $ make training
    $ sudo make training-install

You need to install at least the English language and OSD traineddata files in the
`TESSDATA_PREFIX` directory.

You can retrieve a single file with tools like [wget](https://www.gnu.org/software/wget/), [curl](https://curl.haxx.se/), [GithubDownloader](https://github.com/intezer/GithubDownloader) or a browser.
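
As an illustration of fetching single files with curl, here is a minimal sketch. The helper names are hypothetical, and the URL layout (raw files on a `master` branch of the tessdata repository) and the example `TESSDATA_PREFIX` path are assumptions; adjust them to your setup.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL for one traineddata file.
# The URL layout and the branch name `master` are assumptions.
traineddata_url() {
    echo "https://github.com/tesseract-ocr/tessdata/raw/master/$1.traineddata"
}

# Hypothetical helper: download one language into the TESSDATA_PREFIX directory.
fetch_traineddata() {
    mkdir -p "$TESSDATA_PREFIX" &&
        curl -L --fail -o "$TESSDATA_PREFIX/$1.traineddata" "$(traineddata_url "$1")"
}

# Example (commented out because it needs network access and write permission;
# the path is an assumption):
# TESSDATA_PREFIX=/usr/local/share/tessdata
# fetch_traineddata eng
# fetch_traineddata osd
```

`--fail` makes curl exit with an error instead of saving an HTML error page under the `.traineddata` name.
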

All language data files can be retrieved from the git repository (useful only for packagers!).
(The repository is huge, more than 1.2 GB. You do NOT need to download traineddata files for
all languages.)

    $ git clone https://github.com/tesseract-ocr/tessdata.git tesseract-ocr.tessdata

You need an Internet connection and [curl](https://curl.haxx.se/) to compile `ScrollView.jar`
because the build will automatically download
[piccolo2d-core-3.0.jar](http://search.maven.org/remotecontent?filepath=org/piccolo2d/piccolo2d-core/3.0/piccolo2d-core-3.0.jar),
[piccolo2d-extras-3.0.jar](http://search.maven.org/remotecontent?filepath=org/piccolo2d/piccolo2d-extras/3.0/piccolo2d-extras-3.0.jar) and
[jaxb-api-2.3.1.jar](http://search.maven.org/remotecontent?filepath=javax/xml/bind/jaxb-api/2.3.1/jaxb-api-2.3.1.jar) and place them in `tesseract/java`.

Just run:

    $ make ScrollView.jar

and follow the instructions on the [Viewer Debugging wiki](https://github.com/tesseract-ocr/tesseract/wiki/ViewerDebugging).

# CMAKE
There is an alternative build system based on the multiplatform [cmake](https://cmake.org/).
## LINUX

    $ mkdir build
    $ cd build && cmake .. && make
    $ sudo make install

## WINDOWS
You need to use leptonica with cmake patch:

    git clone https://github.com/DanBloomberg/leptonica.git
    cd leptonica
    mkdir build
    cd build
    cmake ..
    cmake --build .
    cd ..\..

    git clone https://github.com/tesseract-ocr/tesseract.git
    cd tesseract
    mkdir build
    cd build
    cmake .. -DLeptonica_BUILD_DIR=\abs\path\to\leptonica\build
    cmake --build .

# WINDOWS Visual Studio
Please read http://vorba.ch/2014/tesseract-3.03-vs2013.html