Commit Graph

27 Commits

Author SHA1 Message Date
Ray Smith
2633fef0b6 Part 2 of separating out the unicharset from the LSTM model, fixing command line for training 2017-08-02 13:29:23 -07:00
Justin Hotchkiss Palermo
f057938069 fix filenames in comments 2017-07-02 17:35:47 -04:00
Ray Smith
8e79297dce Final part of endian improvement. Adds big-endian support to lstm and fixes issue 518 2017-05-03 16:09:44 -07:00
Stefan Weil
300841f9a7 Replace memalloc / memfree by C++ new / delete
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-01 17:26:23 +02:00
Ray Smith
1cc511188d Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here. 2017-04-27 15:48:23 -07:00
Ray Smith
3c21c14949 Fixed issue 1245 2014-08-13 18:51:28 -07:00
Ray Smith
736d327473 NOP changes from static analysis in issue 1205 2014-08-12 16:09:12 -07:00
theraysmith@gmail.com
4c3475ad2e Fixed fmemopen portability problem
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
theraysmith@gmail.com
4d514d5a60 Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
theraysmith@gmail.com
fdd4ffe85e Fixed endian bug in dawg reader, Added word bigram correction,
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@649 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:56:18 +00:00
theraysmith
b98c922391 Fixed problem with empty dawgs
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@537 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 01:04:02 +00:00
zdenop@gmail.com
4523ce9f7d 3.01 code from http://github.com/jimregan/tesseract-ocr with addaptions related to Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
joregan
cd96d8ede5 more warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@434 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 18:11:00 +00:00
joregan
edf7e7694c silence more useless warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@432 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 15:11:19 +00:00
theraysmith
f01a33ae96 Fixed issue 260
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@326 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-17 21:19:34 +00:00
theraysmith
3a13d80d24 Changes to dict for 3.00
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@293 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:20:33 +00:00
theraysmith
55891a3cdc Fixed issue 63
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@210 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-12-30 18:29:42 +00:00
theraysmith
04c462007f Fixed the dawg crash (edge_char_of/letter_is_okay) issue 128 and duplicates
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@205 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-12-24 01:08:34 +00:00
theraysmith
0aa4861116 Further fixes to dictionary generation that was losing words
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@184 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-08-20 17:47:05 +00:00
theraysmith
b950752818 Fixes to wordlist2dawg to create correct dawgs on windows
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@179 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-08-14 22:44:46 +00:00
theraysmith
520077bd41 Fixed name collision with jpeg library
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@164 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 00:42:51 +00:00
theraysmith
2a678305c6 Major internationalization improvements
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@133 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:21:49 +00:00
theraysmith
f382fb56f5 Fixed various internationalization issues, mostly for training
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@106 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-08-30 18:18:35 +00:00
theraysmith
570af48b8b Remaining changes for Unicodeization project
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@87 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:15:07 +00:00
theraysmith
a59e5dc791 Preparations for unicodization
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@56 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 01:46:09 +00:00
tmbdev
37b9f1244c added compilation option TESSDATA_PREFIX to put the data files in an absolute location
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@14 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-03-30 19:43:30 +00:00
tmbdev
425d593ebe top-skimming import from sf.net
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk/trunk@2 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-03-07 20:03:40 +00:00