zdenop@gmail.com
23f1d16037
fix for issue 346 / GetAvailableLanguagesAsVector
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@760 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-24 05:20:23 +00:00
zdenop@gmail.com
dc8bd4682b
C-API (fix issue 362)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@759 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-24 05:14:11 +00:00
theraysmith@gmail.com
59d244b06e
More fixes for GRAPHICS_DISABLED from Zdenko and Ray
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@757 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-22 00:59:31 +00:00
zdenop@gmail.com
0ed5c67070
fix issue 757 (Solaris needs -lrt for sem_init)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@756 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 20:00:01 +00:00
zdenop@gmail.com
a877d8a5a7
fix 'make distclean' in configure.ac too
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@755 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 17:56:53 +00:00
zdenop@gmail.com
503f68966e
fix 'make distclean'
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@754 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 17:36:27 +00:00
theraysmith@gmail.com
da1047f020
Fixed typos and improved comments
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@753 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:31:20 +00:00
theraysmith@gmail.com
5e79160afb
Fixed new bug in error message
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@752 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:28:45 +00:00
theraysmith@gmail.com
32badf1c1b
fixed pageseg mode
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@751 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:28:11 +00:00
theraysmith@gmail.com
fbf7968490
Fixed problem with blank pages
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@750 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:27:25 +00:00
theraysmith@gmail.com
b301d39e2b
Added 16 bit as a valid image depth
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@749 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:26:59 +00:00
theraysmith@gmail.com
f23460bec4
Removed config_auto.h from .h files
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@748 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:26:10 +00:00
theraysmith@gmail.com
751f2ce173
Whitespace
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@747 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:24:56 +00:00
theraysmith@gmail.com
441abd35ca
Fixed bug that was introduced with GRAPHICS_DISABLED changes
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@746 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:24:15 +00:00
theraysmith@gmail.com
7b90ed28d3
Fixed problem with NULL STRINGs
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@745 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:22:22 +00:00
theraysmith@gmail.com
c2dbb28376
Fixed issues 714, 608
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@744 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:21:24 +00:00
theraysmith@gmail.com
c7cef53ee3
Fixed issue 669
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@743 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:20:35 +00:00
theraysmith@gmail.com
d71045fa3a
Fixed issue 736
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@742 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-21 15:19:44 +00:00
david.eger@gmail.com
0aadbd0169
Save BLOB_CHOICE s for alternate choices saved during segmentation
search so we have them when trying to replace words with alternates in
the bigram correction pass.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@739 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-01 00:33:46 +00:00
zdenop@gmail.com
2a57976c41
- fix msys build (missing -lws2_32 for library)
...
- remove old debian leptonica package
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@738 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-08-25 19:53:41 +00:00
zdenop@gmail.com
306a8216e1
fix creating box file from empty image (issue 516)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@737 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-08-03 22:32:17 +00:00
zdenop@gmail.com
60b0d10e16
fix for issue 690
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@736 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-08-01 21:57:49 +00:00
zdenop@gmail.com
b064cf511d
revert back tesseract.sln from r734
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@735 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-07-31 09:32:29 +00:00
zdenop
937aab009f
fix issue 636
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@734 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-07-31 09:22:26 +00:00
zdenop@gmail.com
eaf9d63626
Provide pkgconfig file (issue 451), improve configure.ac and INSTALL.SVN
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@733 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-07-28 21:17:20 +00:00
zdenop@gmail.com
8708102883
implement '--enable-debug' for ./configure; small cleanup of autogen.sh and configure.ac
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@732 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-07-22 21:36:20 +00:00
zdenop
1131e5dd2f
addition to Issue 724
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@731 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-07-04 15:35:26 +00:00
zdenop@gmail.com
d72a318c5c
fix Issue 724: DESTDIR not supported with make install-langs
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@730 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-07-03 20:33:28 +00:00
zdenop@gmail.com
c8eedb25a6
added ocr-capabilities for hocr conformity; XHTML 1.0 Transitional conformity; improved hocr output readability
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@729 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-28 20:44:23 +00:00
david.eger@gmail.com
6a9a3ddcb2
Zdeno pointed out that ocr_line (though not ocr_word) is actually in the hocr spec.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@728 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-27 23:58:09 +00:00
david.eger@gmail.com
d9d70919bb
Conform to the hocr spec: hocr doesn't have ocr_word, but instead has ocrx_word.
...
Tested with ExactImage's hocr2pdf.
$ tesseract phototest.tif phototest hocr
$ hocr2pdf -i phototest.tif -o ./phototest.pdf < ./phototest.hocr
$ evince phototest.pdf
See: https://docs.google.com/document/preview?id=1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@726 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-25 17:36:25 +00:00
david.eger@gmail.com
eeeb4f513c
Provide better paragraph segmentation without having to run fully
automatic layout analysis.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@725 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-05-10 00:03:34 +00:00
zdenop@gmail.com
e606c311f5
fix Issue 684: show correct line in failure message "Couldn't find a matching blob"
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@723 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-22 20:51:00 +00:00
zdenop@gmail.com
d39cb38ab8
Fix Issue 678
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@722 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-17 17:32:42 +00:00
david.eger@gmail.com
56403c6dc3
Fix an issue where we sometimes leave a dangling outline->loop pointer
during chopping.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@721 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-17 00:02:52 +00:00
david.eger@gmail.com
71b3200625
Fix a shapetable serialization issue -- sizeof(bool) is not portable.
...
See http://code.google.com/p/tesseract-ocr/issues/detail?id=669
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@720 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-17 00:00:26 +00:00
david.eger@gmail.com
a253ea224a
Add some documentation on how to use config files and user dictionaries.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@719 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-09 19:41:06 +00:00
zdenop@gmail.com
aa14e8b212
fix Mingw shared build
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@718 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-02 12:14:37 +00:00
zdenop@gmail.com
c2d5616a7e
add Doxyfile (doxygen config) to distribution
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@717 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-04-02 10:52:13 +00:00
zdenop@gmail.com
cd8de9157c
change comments to doxygen block comments (api)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@716 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-30 21:24:12 +00:00
zdenop@gmail.com
5958f01f5f
fix doxygen warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@715 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-30 15:42:06 +00:00
david.eger@gmail.com
4f0ff358a7
Missing close bracket.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@714 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-29 06:15:33 +00:00
david.eger@gmail.com
4ddb3e5941
Good moming, Good aftemoon.
...
During our initial chopping for each word, pay attention to whether a
dangerous ambiguity (like rn <-> m) would lead us to a dictionary word.
If so, make sure that blob gets chopped so that we can evaluate said
dictionary word during the segmentation search.
Large accuracy improvement, especially on English printed books (~9%).
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@713 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 21:02:54 +00:00
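The idea in the commit above can be illustrated with a much simplified sketch. It only checks, at the string level, whether substituting one side of an ambiguity (for example m -> rn) anywhere in a word produces a dictionary word. The function name, the single from/to pair, and the std::unordered_set dictionary are assumptions made for illustration; the real change works on blobs during chopping, not on strings.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical, string-level illustration (not the actual chopper code).
// Returns true if replacing some occurrence of `from` with `to` in `word`
// yields a word contained in `dict`.
bool AmbiguityLeadsToDictWord(const std::string& word,
                              const std::string& from,
                              const std::string& to,
                              const std::unordered_set<std::string>& dict) {
  if (from.empty()) return false;
  for (std::string::size_type pos = word.find(from);
       pos != std::string::npos; pos = word.find(from, pos + 1)) {
    // Try the substitution at this single position only.
    std::string candidate = word;
    candidate.replace(pos, from.size(), to);
    if (dict.count(candidate) > 0) return true;
  }
  return false;
}
```

Under these assumptions, "moming" with the pair m -> rn reaches "morning", which is the kind of case where the commit forces the blob to be chopped so the dictionary word can be evaluated.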
zdenop@gmail.com
ee44165d3d
improve doxygen config; fix doxygen warnings for baseapi.h
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@712 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 20:38:14 +00:00
david.eger@gmail.com
0d5e8b5cb6
Recording segmentation state for a choice at LogNewChoice() time was a
bad idea -- a VIABLE_CHOICE's Blob->NumChunks can be modified as we go
by a call from Dict::LogNewSplit(). Relying on the auxiliary
segmentation_state makes alt choices sometimes reference the wrong
blobs.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@711 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 20:11:57 +00:00
zdenop@gmail.com
3f9032ef0c
fix 'make dist' for MinGW+MSYS
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@710 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-24 16:33:11 +00:00
zdenop@gmail.com
3115fbfdcb
another fix for MinGW+MSYS
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@709 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-24 10:14:47 +00:00
zdenop@gmail.com
d4d4b8aad8
improve autotools system (mingw+msys fix); implementation of --disable-tessdata-prefix
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@708 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-22 20:01:33 +00:00
david.eger@gmail.com
c0cd2cd605
Restore VC++ compatibility for paragraphs.cpp.
...
Missed a __func__ addition in the last merge.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@707 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-21 16:41:27 +00:00
david.eger@gmail.com
a91778397b
Fix Issue 645, a char signed/unsigned issue in paragraphs.cpp.
...
When constructing our debug strings, our simple UTF-8 processing should skip all non-ASCII chars.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@706 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-20 20:19:00 +00:00
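A short sketch of the signed/unsigned pitfall named in the commit above, assuming a hypothetical helper AsciiOnly that is not part of paragraphs.cpp. Whether plain char is signed is implementation-defined, so a test such as c < 0x80 is true for every byte when char is signed, because bytes with the high bit set compare as negative. Converting to unsigned char first makes the check behave the same on all platforms.

```cpp
#include <string>

// Hypothetical helper for illustration (not the actual paragraphs.cpp code).
// Keeps only 7-bit ASCII bytes of a UTF-8 string, dropping every byte that
// belongs to a multi-byte (non-ASCII) sequence.
std::string AsciiOnly(const std::string& utf8) {
  std::string out;
  for (char c : utf8) {
    // Compare as unsigned: with a signed char, bytes >= 0x80 are negative
    // and would wrongly pass a plain `c < 0x80` test.
    const unsigned char byte = static_cast<unsigned char>(c);
    if (byte < 0x80) out.push_back(c);
  }
  return out;
}
```

All bytes of a multi-byte UTF-8 sequence have the high bit set, so this drops whole non-ASCII characters and never leaves a partial sequence behind.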