theraysmith@gmail.com
4c3475ad2e
Fixed fmemopen portability problem
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
theraysmith@gmail.com
4d514d5a60
Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
david.eger@gmail.com
0aadbd0169
Save BLOB_CHOICEs for alternate choices saved during segmentation
search so we have them when trying to replace words with alternates in
the bigram correction pass.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@739 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-01 00:33:46 +00:00
david.eger@gmail.com
4f0ff358a7
Missing close bracket.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@714 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-29 06:15:33 +00:00
david.eger@gmail.com
4ddb3e5941
Good moming, Good aftemoon.
...
During our initial chopping for each word, pay attention to whether a
dangerous ambiguity (like rn <-> m) would lead us to a dictionary word.
If so, make sure that blob gets chopped so that we can evaluate said
dictionary word during the segmentation search.
Large accuracy improvement, especially on English printed books (~9%).
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@713 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 21:02:54 +00:00
david.eger@gmail.com
0d5e8b5cb6
Recording segmentation state for a choice at LogNewChoice() time was a
bad idea -- a VIABLE_CHOICE's Blob->NumChunks can be modified as we go
by a call from Dict::LogNewSplit(). Relying on the auxiliary
segmentation_state makes alt choices sometimes reference the wrong
blobs.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@711 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 20:11:57 +00:00
zdenop@gmail.com
d4d4b8aad8
improve autotools system (mingw+msys fix); implementation of --disable-tessdata-prefix
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@708 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-22 20:01:33 +00:00
zdenop@gmail.com
1009a6e2f0
fopen() should use binary mode (issue 70)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@704 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-11 12:41:17 +00:00
zdenop@gmail.com
97e19443a3
install only necessary headers, fix uninstall
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@692 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-03 13:22:51 +00:00
zdenop@gmail.com
30a70142a0
visibility - autotools part (./configure --enable-visibility)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@690 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 23:51:33 +00:00
zdenop@gmail.com
6ccab83bd6
fixing issue 628 (replacing __MSW32__ with _WIN32) and issue 614 (reverting "class DLLSYM STRING" to "class CCUTIL_API STRING")
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@677 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-19 21:48:45 +00:00
david.eger@gmail.com
018f192fc2
Abolish populate_unichars(), fixing seg fault reported in Debian:
...
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=658634
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@675 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-15 01:37:00 +00:00
theraysmith@gmail.com
fdd4ffe85e
Fixed endian bug in dawg reader, Added word bigram correction,
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@649 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:56:18 +00:00
zdenop@gmail.com
67f47008c7
fixed "one lib" build on linux; runautoconf renamed to autogen.sh;
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@631 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 19:39:54 +00:00
joregan@gmail.com
bf4a09d72a
make single/multiple libraries optional -- this needs testing!!!
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@623 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-29 21:28:28 +00:00
theraysmith@gmail.com
d5d15f32d7
Deleted Makefile.in from svn
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@606 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:32:44 +00:00
zdenop@gmail.com
9b7375edd6
MinGW portability solved + some code cleanup (based on cpplint)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@605 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-15 19:28:25 +00:00
theraysmith
664b84b3c8
Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@571 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:46:35 +00:00
theraysmith
96ca745384
Deleted lots of dead code, including PBLOB
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@565 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 22:14:53 +00:00
theraysmith
7cd3c74419
Deleted lots of dead code, including PBLOB
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@560 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 21:53:35 +00:00
theraysmith
b98c922391
Fixed problem with empty dawgs
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@537 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 01:04:02 +00:00
zdenop@gmail.com
4523ce9f7d
3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
zdenop@gmail.com
282aa13975
*.vcproj moved to vs2008/ (bin/ and bin.dbg/ will be in vs2008/)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@506 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-06 21:38:19 +00:00
zdenop@gmail.com
3964660093
update of VC++ project file to recent changes
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@495 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 21:20:00 +00:00
joregan
e0b07948fc
disabling gettext checks - not currently used, and something about disabling is causing subsequent autoconf checks to not run
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@492 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 16:27:39 +00:00
joregan
9c53d54fe3
max.markin's patch for issue 345
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@477 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-29 23:54:18 +00:00
joregan
69f39d4bf5
fix for issue 341, thanks to max.markin
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@454 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-08-19 19:17:06 +00:00
joregan
75676cd644
doxygen
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@449 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-08-02 00:05:57 +00:00
joregan
d7924dd824
http://groups.google.com/group/tesseract-ocr/msg/16597e4f7725dfe1
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@448 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-08-01 16:24:31 +00:00
joregan
a18816f839
partial merge of doxygen branch (stuff without conflicts, basically)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@441 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-27 13:23:23 +00:00
joregan
7e8bd73aea
some casts to get rid of persistent warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@435 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 21:19:53 +00:00
joregan
cd96d8ede5
more warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@434 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 18:11:00 +00:00
joregan
edf7e7694c
silence more useless warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@432 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 15:11:19 +00:00
joregan
69d6d35f28
patch for issue 304 from max.markin
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@422 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-19 02:32:21 +00:00
joregan
a301f9a5c7
start of i18n
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@418 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-19 01:59:13 +00:00
joregan
ddcb98565a
update generated autoconf/make stuff
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@369 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-26 14:21:37 +00:00
joregan
34d8258049
use libtool
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@368 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-26 14:20:20 +00:00
theraysmith
aea5be1995
Fixed issue 272
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@335 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-19 18:48:59 +00:00
theraysmith
f01a33ae96
Fixed issue 260
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@326 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-17 21:19:34 +00:00
theraysmith
3a13d80d24
Changes to dict for 3.00
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@293 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:20:33 +00:00
theraysmith
bea5e04b76
Fixed compilation with GRAPHICS_DISABLED
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@250 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-06-03 17:24:08 +00:00
theraysmith
f3060abf71
Automake changes for potential RC of 2.04
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@248 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-06-03 02:50:54 +00:00
theraysmith
55891a3cdc
Fixed issue 63
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@210 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-12-30 18:29:42 +00:00
theraysmith
04c462007f
Fixed the dawg crash (edge_char_of/letter_is_okay) issue 128 and duplicates
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@205 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-12-24 01:08:34 +00:00
theraysmith
3adf29c25c
Increased max edges in squished dawg
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@194 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-11-14 04:33:18 +00:00
theraysmith
0aa4861116
Further fixes to dictionary generation that was losing words
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@184 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-08-20 17:47:05 +00:00
tmbdev
a978ccb68f
changed runautoconf instructions
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@183 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-08-18 20:18:21 +00:00
theraysmith
b950752818
Fixes to wordlist2dawg to create correct dawgs on windows
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@179 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-08-14 22:44:46 +00:00
theraysmith
520077bd41
Fixed name collision with jpeg library
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@164 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 00:42:51 +00:00
theraysmith
d020d91255
Added new files
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@142 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:41:51 +00:00
theraysmith
0371d16fe1
Fixed compiler warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@141 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:41:25 +00:00
theraysmith
2a678305c6
Major internationalization improvements
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@133 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:21:49 +00:00
theraysmith
aa55810b6b
Misc improvements
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@132 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:18:33 +00:00
theraysmith
166c867d84
Removed some compiler warnings on operator precedence
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@129 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:05:57 +00:00
theraysmith
eaef4c989f
Fixed crash
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@114 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-08-30 18:31:57 +00:00
theraysmith
f382fb56f5
Fixed various internationalization issues, mostly for training
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@106 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-08-30 18:18:35 +00:00
theraysmith
100942d7ed
Fixed dawg table too full error
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@105 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-08-30 18:16:00 +00:00
theraysmith
570af48b8b
Remaining changes for Unicodeization project
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@87 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:15:07 +00:00
theraysmith
eeaca1beba
Fixed problems with signed characters.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@85 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:05:40 +00:00
theraysmith
4df1016692
Automake changes for version 2.00.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@84 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:04:56 +00:00
theraysmith
0d9fa6a040
Fixed portability problems with VC++ 6 and VC++ express.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@83 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:01:50 +00:00
theraysmith
02d760759f
Making release 1.04
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@62 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-17 00:48:27 +00:00
theraysmith
a59e5dc791
Preparations for unicodization
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@56 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 01:46:09 +00:00
theraysmith
c7e9ec8f41
Misc improvements
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@55 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 01:44:53 +00:00
theraysmith
bc769e29b2
Preparations for unicodization
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@32 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 00:44:44 +00:00
tmbdev
6da5fdb8d0
Added Makefile.in files back in to permit building from Subversion without installed autoconf/automake tools.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@29 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-04-10 23:15:48 +00:00
tmbdev
7fa676659b
changed configuration to install header files in $(includedir)/tesseract
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@18 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-03-31 00:37:26 +00:00
tmbdev
9f2b3b7154
changed autoconf/automake system to use standard install paths; removed auto-generated files from repository (use runautoconf instead)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@16 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-03-30 23:53:34 +00:00
tmbdev
37b9f1244c
added compilation option TESSDATA_PREFIX to put the data files in an absolute location
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@14 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-03-30 19:43:30 +00:00
tmbdev
425d593ebe
top-skimming import from sf.net
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk/trunk@2 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-03-07 20:03:40 +00:00