Commit Graph

3884 Commits

Author SHA1 Message Date
theraysmith@gmail.com
6e273b71bd Cube trained data for fra, ita, rus, spa
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@656 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:08:26 +00:00
theraysmith@gmail.com
9206e92b0d Added simultaneous multi-language capability, Refactored top-level word recognition module, Blamer module added for error analysis, Tidied up constraints on control parameters, Added UNICHARSET to WERD_CHOICE to make multi-language handling easier, Added word bigram correction
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@655 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:06:39 +00:00
theraysmith@gmail.com
73adf693d5 Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic, Refactored top-level word recognition module, Added simultaneous multi-language capability.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@654 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:03:56 +00:00
theraysmith@gmail.com
e33ae59f4d Fixed training leaks and randomness
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@653 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:02:16 +00:00
theraysmith@gmail.com
01026af5a2 Refactored top-level word recognition module, Blamer module added for error analysis, Added word bigram correction
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@652 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:01:38 +00:00
theraysmith@gmail.com
3a998fe7ac Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic, Added paragraph detection in layout analysis/post OCR, Fixed inconsistent xheight during training and over-chopping, Added simultaneous multi-language capability, Refactored top-level word recognition module, Fixed problems with internally scaled images
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@651 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:59:49 +00:00
theraysmith@gmail.com
5bc5e2a0b4 Added simultaneous multi-language capability, Added support for ShapeTable in classifier and training, Refactored class pruner, Added new uniform classifier API, Added new training error counter
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@650 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:57:42 +00:00
theraysmith@gmail.com
fdd4ffe85e Fixed endian bug in dawg reader, Added word bigram correction
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@649 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:56:18 +00:00
theraysmith@gmail.com
6e3d810c1d Major improvements to layout analysis for better image detection, diacritic detection, better textline finding, better tabstop finding
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@648 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:53:04 +00:00
theraysmith@gmail.com
04068c7055 Removed dead memory management code
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@647 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:51:56 +00:00
theraysmith@gmail.com
ac014eb27a Added experimental equation detector
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@646 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:50:01 +00:00
theraysmith@gmail.com
ef786ad29b Moved ResultIterator/PageIterator to ccmain
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@645 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:47:59 +00:00
zdenop@gmail.com
8225f5b846 removed BOM from strngs.h, updated NSIS script and COPYING
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@639 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-22 18:27:31 +00:00
theraysmith@gmail.com
aae3da5bf1 Last minute fixes for making the tarball
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@636 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-22 05:28:44 +00:00
zdenop@gmail.com
db2aa4e73f svpaint.cpp moved from include to source
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@632 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 20:23:49 +00:00
zdenop@gmail.com
67f47008c7 fixed "one lib" build on linux; runautoconf renamed to autogen.sh;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@631 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 19:39:54 +00:00
max.markin@gmail.com
bf3ae643e5 Fixed some warnings to make the VC2010 compiler happy:
C4355: 'this' : used in base member initializer list
C4099: type name first seen using 'class' now seen using 'struct'

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@630 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-15 22:26:34 +00:00
max.markin@gmail.com
0fef845950 VC2010: add support for dynamic linking
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@629 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-15 22:17:19 +00:00
max.markin@gmail.com
cfc7de1420 fixed debug build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@628 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-07 03:49:25 +00:00
zdenop@gmail.com
ab234da926 fix for issue 540
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@627 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-09-29 21:25:57 +00:00
max.markin@gmail.com
7c4461316a fixed comment
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@626 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-09-18 05:12:37 +00:00
zdenop@gmail.com
22fc3e80be VS2010 build fix
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@625 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-09-05 10:05:36 +00:00
zdenop@gmail.com
da41b96f7f removed check for libtiff - leptonica is required; cleanup #ifdef/#ifndef HAVE_LIBLEPT
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@624 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-30 06:34:41 +00:00
joregan@gmail.com
bf4a09d72a make single/multiple libraries optional -- this needs testing!!!
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@623 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-29 21:28:28 +00:00
joregan@gmail.com
fbab153409 readd m4 stuff
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@622 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 22:06:01 +00:00
joregan@gmail.com
e7d0029b65 macports installs libtoolize as glibtoolize. A cleaner solution can wait for some later date
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@621 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 22:02:29 +00:00
zdenop@gmail.com
2ded50b4d0 'make dist' improvement; removed debugwin.* from vs2008 and vs2010; decreased warning level in vs2008 project files for Release* build
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@620 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 21:33:28 +00:00
theraysmith@gmail.com
f33ac09829 Fixed make dist
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@619 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 19:57:34 +00:00
joregan@gmail.com
323ee5af7a more Makefile.in
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@618 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 18:40:33 +00:00
joregan@gmail.com
b69d9b90b7 there is no m4 directory
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@617 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 18:09:29 +00:00
joregan@gmail.com
c209583793 rm reference to config.rpath
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@616 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 18:00:22 +00:00
theraysmith@gmail.com
47f032b0e9 Removed remaining Makefile.in
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@615 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:52:53 +00:00
theraysmith@gmail.com
42acb3b9a8 Updated ReleaseNotes, README, more minor cleanup
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@614 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:51:11 +00:00
theraysmith@gmail.com
4575c52ff5 Removed debugwin.cpp, fixing issue 448
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@613 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:45:59 +00:00
theraysmith@gmail.com
030aae9896 Removed debugwin.cpp, fixing issue 448
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@612 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:45:01 +00:00
theraysmith@gmail.com
0d969b7b3a Fixed problem of config file vs command line for pageseg mode
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@611 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:41:51 +00:00
theraysmith@gmail.com
7ab0a97180 Fixed comment re bln_numericmode
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@610 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:41:03 +00:00
theraysmith@gmail.com
7720f84fbd Moved definition of Config to commontraining, fixing issue 414
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@609 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:39:51 +00:00
theraysmith@gmail.com
360f5e4c8b Removed assert from FindBestMatch, fixing issue 504
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@608 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:38:58 +00:00
theraysmith@gmail.com
ea5bc16f38 Removed EXIT, fixing issue 144
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@607 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:38:09 +00:00
theraysmith@gmail.com
d5d15f32d7 Deleted Makefile.in from svn
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@606 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:32:44 +00:00
zdenop@gmail.com
9b7375edd6 MinGW portability solved + some code cleanup (based on cpplint)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@605 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-15 19:28:25 +00:00
zdenop@gmail.com
7ec3dca968 show page 0 for multipage tiff;
Windows: use binary mode for fopen (issue 70);
autotools: fixed cutil/Makefile.am, improved tessdata/Makefile.am;

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@604 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 21:42:13 +00:00
zdenop@gmail.com
4abdfdb8fe moved ccstruct/callcpp.cpp to cutil (to header file - see issue 414); moved vs2008/include/stdint.h to vs2008/port/stdint.h so we can use vs2008/include also for mingw; removed unused tessembedded.*
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@603 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 14:04:20 +00:00
zdenop@gmail.com
16f0481f5c renamed duplicate cube/const.h to cube/cube_const.h (issue 525)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@602 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 06:51:30 +00:00
zdenop@gmail.com
9b9efa8e4c man pages included to install script, improved windows installer script (issue 425), output format for "tesseract -v" changed to "3.00 version", README cleanup.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@601 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-08 20:33:18 +00:00
zdenop@gmail.com
411e074b4d fix for issues 479, 524 + tests for input image (there are no leptonica error messages on Windows console)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@597 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-07-29 21:55:49 +00:00
zdenop@gmail.com
1ad70ea8ff fixing issues 518 and 521
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@596 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-07-27 20:56:40 +00:00
zdenop@gmail.com
568cbe6d3f removed unnecessary directory
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@594 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-07-27 20:16:37 +00:00
zdenop@gmail.com
4085f1b5e9 removed unnecessary directory
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@593 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-07-27 20:16:13 +00:00