Commit Graph

6000 Commits

Author SHA1 Message Date
david.eger@gmail.com
018f192fc2 Abolish populate_unichars(), fixing seg fault reported in Debian:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=658634
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@675 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-15 01:37:00 +00:00
zdenop@gmail.com
53d133d83a fixed cntraining thanks to Wil Hadden; fixed installation of new manpages
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@674 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-12 16:03:05 +00:00
zdenop@gmail.com
3c4fd30bb5 Fix isinf for VC++
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@673 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-12 14:51:28 +00:00
david.eger@gmail.com
22331c03ec Fix issue 613: assert() fail on Windows isspace() when given non-ASCII.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@671 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-10 01:44:36 +00:00
david.eger@gmail.com
58e06c8c45 Update man pages for Tesseract 3.02.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@670 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-09 22:55:47 +00:00
david.eger@gmail.com
78a8356a76 Put one last bigram correction debug statement behind a debug flag.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@669 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-09 20:08:17 +00:00
zdenop@gmail.com
1355cabe7e VS2008 - fix include path for release*
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@668 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-07 13:52:33 +00:00
zdenop@gmail.com
425c2b8205 install data files; small fix of INSTALL, README; removed ABOUT-NLS (NLS not used at the moment)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@667 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-05 16:25:40 +00:00
zdenop@gmail.com
0a50c9ca5c More VS2008 fixes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@666 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-04 22:06:40 +00:00
zdenop@gmail.com
d0c2631ec8 VC++2008 build fix for 3.02 version
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@665 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-03 22:23:12 +00:00
david.eger@gmail.com
56bc885721 Fix some debug messaging about bigram correction -- the two lists of alternates are not independent.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@664 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-03 19:43:25 +00:00
theraysmith@gmail.com
09e41d32c2 Renamed RGB to ComposeRGB to fix windows macro problem
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@663 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-03 16:52:25 +00:00
theraysmith@gmail.com
d581ab7e12 New config for testing bigram correction.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@661 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 18:46:19 +00:00
david.eger@gmail.com
ad53f34e7c Added a missing header file for the 3.02 release.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@659 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 18:01:17 +00:00
theraysmith@gmail.com
e0d735b122 Remaining misc changes for 3.02
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@658 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:14:43 +00:00
theraysmith@gmail.com
23dfabcab1 Cleaned up externally used namespace by removing includes from baseapi.h
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@657 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:14:16 +00:00
theraysmith@gmail.com
6e273b71bd Cube trained data for fra, ita, rus, spa
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@656 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:08:26 +00:00
theraysmith@gmail.com
9206e92b0d Added simultaneous multi-language capability, Refactored top-level word recognition module, Blamer module added for error analysis, Tidied up constraints on control parameters, Added UNICHARSET to WERD_CHOICE to make multi-language handling easier, Added word bigram correction
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@655 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:06:39 +00:00
theraysmith@gmail.com
73adf693d5 Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic, Refactored top-level word recognition module, Added simultaneous multi-language capability.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@654 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:03:56 +00:00
theraysmith@gmail.com
e33ae59f4d Fixed training leaks and randomness
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@653 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:02:16 +00:00
theraysmith@gmail.com
01026af5a2 Refactored top-level word recognition module, Blamer module added for error analysis, Added word bigram correction
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@652 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:01:38 +00:00
theraysmith@gmail.com
3a998fe7ac Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic, Added paragraph detection in layout analysis/post OCR, Fixed inconsistent xheight during training and over-chopping, Added simultaneous multi-language capability, Refactored top-level word recognition module, Fixed problems with internally scaled images
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@651 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:59:49 +00:00
theraysmith@gmail.com
5bc5e2a0b4 Added simultaneous multi-language capability, Added support for ShapeTable in classifier and training, Refactored class pruner, Added new uniform classifier API, Added new training error counter
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@650 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:57:42 +00:00
theraysmith@gmail.com
fdd4ffe85e Fixed endian bug in dawg reader, Added word bigram correction
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@649 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:56:18 +00:00
theraysmith@gmail.com
6e3d810c1d Major improvements to layout analysis for better image detection, diacritic detection, better textline finding, better tabstop finding
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@648 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:53:04 +00:00
theraysmith@gmail.com
04068c7055 Removed dead memory management code
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@647 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:51:56 +00:00
theraysmith@gmail.com
ac014eb27a Added experimental equation detector
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@646 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:50:01 +00:00
theraysmith@gmail.com
ef786ad29b Moved ResultIterator/PageIterator to ccmain
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@645 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:47:59 +00:00
zdenop@gmail.com
8225f5b846 removed BOM from strngs.h, updated NSIS script and COPYING
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@639 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-22 18:27:31 +00:00
theraysmith@gmail.com
aae3da5bf1 Last minute fixes for making the tarball
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@636 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-22 05:28:44 +00:00
zdenop@gmail.com
db2aa4e73f svpaint.cpp moved from include to source
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@632 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 20:23:49 +00:00
zdenop@gmail.com
67f47008c7 fixed "one lib" build on linux; runautoconf renamed to autogen.sh
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@631 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 19:39:54 +00:00
max.markin@gmail.com
bf3ae643e5 Fixed some warnings to make the VC2010 compiler happy:
C4355: 'this' : used in base member initializer list
C4099: type name first seen using 'class' now seen using 'struct'

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@630 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-15 22:26:34 +00:00
max.markin@gmail.com
0fef845950 VC2010: add support for dynamic linking
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@629 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-15 22:17:19 +00:00
max.markin@gmail.com
cfc7de1420 fixed debug build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@628 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-07 03:49:25 +00:00
zdenop@gmail.com
ab234da926 fix for issue 540
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@627 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-09-29 21:25:57 +00:00
max.markin@gmail.com
7c4461316a fixed comment
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@626 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-09-18 05:12:37 +00:00
zdenop@gmail.com
22fc3e80be VS2010 build fix
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@625 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-09-05 10:05:36 +00:00
zdenop@gmail.com
da41b96f7f removed check for libtiff - leptonica is required; cleanup #ifdef/#ifndef HAVE_LIBLEPT
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@624 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-30 06:34:41 +00:00
joregan@gmail.com
bf4a09d72a make single/multiple libraries optional -- this needs testing!!!
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@623 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-29 21:28:28 +00:00
joregan@gmail.com
fbab153409 readd m4 stuff
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@622 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 22:06:01 +00:00
joregan@gmail.com
e7d0029b65 macports installs libtoolize as glibtoolize. A cleaner solution can wait for some later date
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@621 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 22:02:29 +00:00
zdenop@gmail.com
2ded50b4d0 'make dist' improvement; removed debugwin.* from vs2008 and vs2010; decreased warning level in vs2008 project files for Release* build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@620 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 21:33:28 +00:00
theraysmith@gmail.com
f33ac09829 Fixed make dist
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@619 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 19:57:34 +00:00
joregan@gmail.com
323ee5af7a more Makefile.in
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@618 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 18:40:33 +00:00
joregan@gmail.com
b69d9b90b7 there is no m4 directory
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@617 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 18:09:29 +00:00
joregan@gmail.com
c209583793 rm reference to config.rpath
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@616 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 18:00:22 +00:00
theraysmith@gmail.com
47f032b0e9 Removed remaining Makefile.in
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@615 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:52:53 +00:00
theraysmith@gmail.com
42acb3b9a8 Updated ReleaseNotes, README, more minor cleanup
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@614 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:51:11 +00:00
theraysmith@gmail.com
4575c52ff5 Removed debugwin.cpp, fixing issue 448
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@613 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:45:59 +00:00