tesseract/testing
Stefan Weil dabf3c299f Fix file endings
Text files should end with a LF, but not additional empty lines.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-04-25 19:35:33 +02:00

How to run UNLV tests.

The scripts in this directory make it possible to duplicate the tests
published in the Fourth Annual Test of OCR Accuracy
(see http://www.isri.unlv.edu/downloads/AT-1995.pdf),
but first you have to get the tools and data from UNLV:

Step 1: To download the images, go to
http://www.isri.unlv.edu/ISRI/OCRtk
and get 3b.tgz, Bb.tgz, Mb.tgz and Nb.tgz.

Step 2: extract the files. It doesn't really matter where
in your filesystem you put them, but they must go under a common
root so you have directories 3, B, M and N in, for example,
/users/me/ISRI-OCRtk.
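
As a rough sketch of Step 2, the four archives could be unpacked under one
root with a small shell function. The function name extract_unlv_archives
and its two arguments are illustrative and not part of the test scripts; it
assumes the archives are already downloaded and that each one unpacks to its
own top-level directory (3, B, M or N).

```shell
# Hypothetical helper, not part of the Tesseract test scripts.
# Usage: extract_unlv_archives DOWNLOAD_DIR ROOT_DIR
# Assumes 3b.tgz, Bb.tgz, Mb.tgz and Nb.tgz are already in DOWNLOAD_DIR.
extract_unlv_archives() {
    download_dir=$1
    root_dir=$2
    mkdir -p "$root_dir" || return 1
    for archive in 3b Bb Mb Nb; do
        # Unpack each archive under the common root.
        tar -xzf "$download_dir/$archive.tgz" -C "$root_dir" || return 1
    done
}
```

For example: extract_unlv_archives ~/Downloads /users/me/ISRI-OCRtk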

Step 3: Reorganize the files.
The lack of tif extensions on the images is inconvenient, so there
is a script to reorganize the data to match the rest of the test
scripts.
cd to /users/me/ISRI-OCRtk or wherever 3, B, M and N ended up and run
/blah/blah/tesseract-ocr/testing/reorgdata.sh 3B
This makes directories doe3.3B, bus.3B, mag.3B and news.3B.
You can now get rid of 3, B, M, and N unless you want to get some of the
other scanning resolutions out of them.
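
Before deleting 3, B, M and N, it may be worth confirming that the four
reorganized directories exist. The sketch below is one way to do that; the
function name check_reorg_dirs is hypothetical, and it only checks for the
directory names listed above without inspecting their contents.

```shell
# Hypothetical helper, not part of the Tesseract test scripts.
# Usage: check_reorg_dirs ROOT_DIR [SUFFIX]
# Prints each expected directory that is missing and returns non-zero
# if any of them is absent. SUFFIX defaults to 3B.
check_reorg_dirs() {
    root_dir=$1
    suffix=${2:-3B}
    missing=0
    for prefix in doe3 bus mag news; do
        if [ ! -d "$root_dir/$prefix.$suffix" ]; then
            echo "missing: $root_dir/$prefix.$suffix"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_reorg_dirs /users/me/ISRI-OCRtk 3B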

Step 4: Download the ISRI toolkit from:
http://www.isri.unlv.edu/downloads/ftk-1.0.tgz

Step 5: If the prebuilt binaries in the toolkit's bin directory work for
you, put them in tesseract-ocr/testing/unlv;
otherwise build the tools yourself and put them there.

Step 6: cd back to your main tesseract-ocr directory and build Tesseract.

Step 7: Run testing/runalltests.sh with the root data directory and a test name:
testing/runalltests.sh /users/me/ISRI-OCRtk tess2.0
and go to the gym, have lunch, etc. while the tests run.

Step 8: There should be a file
testing/reports/tess2.0.summary that contains the final summarized accuracy
report and comparison with the 1995 results.
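
A small sketch for checking the result of Step 8 from the main tesseract-ocr
directory. The function name show_summary is hypothetical; it only relies on
the testing/reports/TESTNAME.summary location described above.

```shell
# Hypothetical helper, not part of the Tesseract test scripts.
# Usage: show_summary TESTNAME   (run from the main tesseract-ocr directory)
show_summary() {
    summary="testing/reports/$1.summary"
    if [ -f "$summary" ]; then
        # Print the summarized accuracy report.
        cat "$summary"
    else
        echo "no summary found at $summary" >&2
        return 1
    fi
}
```

For example: show_summary tess2.0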