Merge remote-tracking branch 'origin/master'

Shree Devi Kumar 2018-06-09 08:15:31 +00:00
commit 481d7775c6
3 changed files with 29 additions and 13 deletions


@@ -30,6 +30,7 @@
# include "renderer.h" # include "renderer.h"
#else #else
# include "platform.h" # include "platform.h"
# include <stdbool.h>
# include <stdio.h> # include <stdio.h>
#endif #endif


@@ -45,12 +45,12 @@ uintmax_t streamtoumax(FILE* s, int base);
// Parse a file stream according to the given format. See the fscanf manpage // Parse a file stream according to the given format. See the fscanf manpage
// for more information, as this function attempts to mimic its behavior. // for more information, as this function attempts to mimic its behavior.
// Note that scientific loating-point notation is not supported. // Note that scientific floating-point notation is not supported.
int fscanf(FILE* stream, const char *format, ...); int fscanf(FILE* stream, const char *format, ...);
// Parse a file stream according to the given format. See the fscanf manpage // Parse a file stream according to the given format. See the fscanf manpage
// for more information, as this function attempts to mimic its behavior. // for more information, as this function attempts to mimic its behavior.
// Note that scientific loating-point notation is not supported. // Note that scientific floating-point notation is not supported.
int vfscanf(FILE* stream, const char *format, va_list ap); int vfscanf(FILE* stream, const char *format, va_list ap);
// Create a file at the specified path. See the creat manpage for more // Create a file at the specified path. See the creat manpage for more


@@ -1,15 +1,16 @@
How to run UNLV tests. ## How to run UNLV tests.
The scripts in this directory make it possible to duplicate the tests The scripts in this directory make it possible to duplicate the tests
published in the Fourth Annual Test of OCR Accuracy. published in the Fourth Annual Test of OCR Accuracy.
See http://www.isri.unlv.edu/downloads/AT-1995.pdf See http://www.expervision.com/wp-content/uploads/2012/12/1995.The_Fourth_Annual_Test_of_OCR_Accuracy.pdf
but first you have to get the tools and data used by UNLV: but first you have to get the tools and data used by UNLV:
Step 1: to download the images go to ### Step 1: to download the images go to
https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/ https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/
and get doe3.3B.tar.gz, bus.3B.tar.gz, mag.3B.tar.gz and news.3B.tar.gz and get doe3.3B.tar.gz, bus.3B.tar.gz, mag.3B.tar.gz and news.3B.tar.gz
spn.3B.tar.gz is incorrect in this repo, so get it from code.google spn.3B.tar.gz is incorrect in this repo, so get it from code.google
```
mkdir -p ~/isri-downloads mkdir -p ~/isri-downloads
cd ~/isri-downloads cd ~/isri-downloads
curl -L https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/bus.3B.tar.gz > bus.3B.tar.gz curl -L https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/bus.3B.tar.gz > bus.3B.tar.gz
@@ -17,12 +18,15 @@ curl -L https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/do
curl -L https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/mag.3B.tar.gz > mag.3B.tar.gz curl -L https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/mag.3B.tar.gz > mag.3B.tar.gz
curl -L https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/news.3B.tar.gz > news.3B.tar.gz curl -L https://sourceforge.net/projects/isri-ocr-evaluation-tools-alt/files/news.3B.tar.gz > news.3B.tar.gz
curl -L https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/isri-ocr-evaluation-tools/spn.3B.tar.gz > spn.3B.tar.gz curl -L https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/isri-ocr-evaluation-tools/spn.3B.tar.gz > spn.3B.tar.gz
```
Step 2: extract the files. It doesn't really matter where ### Step 2: extract the files.
It doesn't really matter where
in your filesystem you put them, but they must go under a common in your filesystem you put them, but they must go under a common
root so you have directories doe3.3B, bus.3B, mag.3B and news.3B. in, for example, root so you have directories doe3.3B, bus.3B, mag.3B and news.3B. in, for example,
~/ISRI-OCRtk. ~/ISRI-OCRtk.
```
mkdir -p ~/ISRI-OCRtk mkdir -p ~/ISRI-OCRtk
cd ~/ISRI-OCRtk cd ~/ISRI-OCRtk
tar xzvf ~/isri-downloads/bus.3B.tar.gz tar xzvf ~/isri-downloads/bus.3B.tar.gz
@@ -30,26 +34,37 @@ tar xzvf ~/isri-downloads/doe3.3B.tar.gz
tar xzvf ~/isri-downloads/mag.3B.tar.gz tar xzvf ~/isri-downloads/mag.3B.tar.gz
tar xzvf ~/isri-downloads/news.3B.tar.gz tar xzvf ~/isri-downloads/news.3B.tar.gz
tar xzvf ~/isri-downloads/spn.3B.tar.gz tar xzvf ~/isri-downloads/spn.3B.tar.gz
```
**** Edit ~/ISRI-OCRtk/spn.3B/pages Edit *~/ISRI-OCRtk/spn.3B/pages*
delete the line containing the following imagename as it crashes tesseract. delete the line containing the following imagename as it crashes tesseract.
7733_005.3B.tif 7733_005.3B.tif
Step 4: Download the modified ISRI toolkit and make and install the tools : ### Step 3: Download the modified ISRI toolkit, make and install the tools :
These will be installed in /usr/local/bin.
```
git clone https://github.com/Shreeshrii/ocr-evaluation-tools.git git clone https://github.com/Shreeshrii/ocr-evaluation-tools.git
cd ~/ocr-evaluation-tools cd ~/ocr-evaluation-tools
sudo make install sudo make install
```
Step 6: cd back to your main tesseract-ocr dir and Build tesseract. ### Step 4: cd back to your main tesseract-ocr dir and Build tesseract.
Step 7: run unlvtests/runalltests.sh with the root ISRI data dir and testname, tessdata-dir and language: ### Step 5: run unlvtests/runalltests.sh with the root ISRI data dir, testname, tessdata-dir and language:
```
unlvtests/runalltests.sh ~/ISRI-OCRtk 4_fast_eng ../tessdata_fast eng unlvtests/runalltests.sh ~/ISRI-OCRtk 4_fast_eng ../tessdata_fast eng
and go to the gym, have lunch etc. ```
and go to the gym, have lunch etc. It takes a while to run.
Step 8: There should be a file ### Step 6: There should be a RELEASE.summary file
unlvtests/reports/4-beta_fast.summary that contains the final summarized accuracy *unlvtests/reports/4-beta_fast.summary* that contains the final summarized accuracy
report and comparison with the 1995 results. report and comparison with the 1995 results.
### Step 7: run the test for Spanish.
```
unlvtests/runalltests.sh ~/ISRI-OCRtk 4_fast_spa ../tessdata_fast spa unlvtests/runalltests.sh ~/ISRI-OCRtk 4_fast_spa ../tessdata_fast spa
```
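
The README asks for a manual edit of *~/ISRI-OCRtk/spn.3B/pages* to delete the line containing 7733_005.3B.tif. A minimal sketch of how that edit might be scripted follows. It assumes the pages file is plain text with one image entry per line, which the README does not state, and `remove_page` is a hypothetical helper name, not part of the ISRI toolkit or tesseract.

```shell
# Hypothetical helper (not part of the ISRI toolkit): drop every line that
# contains the given image name from a pages file, keeping a .bak copy.
# Assumption: the pages file is plain text with one entry per line.
remove_page() {
    pages_file="$1"
    image_name="$2"
    cp "$pages_file" "$pages_file.bak"
    # -v inverts the match, -F treats the name as a fixed string (no regex).
    # grep exits non-zero when no line survives, so that status is ignored.
    grep -vF -- "$image_name" "$pages_file.bak" > "$pages_file" || true
}

# Apply it to the path used in the README, only if that file exists.
PAGES="$HOME/ISRI-OCRtk/spn.3B/pages"
if [ -f "$PAGES" ]; then
    remove_page "$PAGES" "7733_005.3B.tif"
fi
```

The backup copy makes it easy to compare the result with `diff "$PAGES.bak" "$PAGES"` before running the Spanish test.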