32 Commits

Author SHA1 Message Date
Ray Smith
8e79297dce Final part of endian improvement. Adds big-endian support to lstm and fixes issue 518 2017-05-03 16:09:44 -07:00
Ray Smith
1cc511188d Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here. 2017-04-27 15:48:23 -07:00
Stefan Weil
becec34057 Fix some typos in comments (found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-03-10 19:50:17 +01:00
Ray Smith
9f5ba9105f Removed dependency on cube from the code 2016-12-14 10:55:15 -08:00
Ray Smith
13e46ae1c4 Made LSTM the default engine, pushed cube out 2016-12-13 14:37:40 -08:00
Ray Smith
7744da9b7d Fixed Android build breakage 2016-12-06 13:37:10 -08:00
Ray Smith
5deebe6c27 Fixed multilang for LSTM, pushed cube to one side without actually deleting it 2016-12-05 14:41:43 -08:00
Ray Smith
f24ef67df4 Limited max height to 48 even in variable height input, enabled neural nets via ocr engine mode 2016-11-08 14:01:04 -08:00
Ray Smith
c1c1e426b3 Added new LSTM-based neural network line recognizer 2016-11-07 15:38:07 -08:00
Ray Smith
2c837dffc3 Result of clang tidy on recent merge 2016-11-07 10:46:33 -08:00
Zdenko Podobný
41478fd5a1 implement build without cube (-DNO_CUBE_BUILD) 2015-07-24 11:51:44 +02:00
Ray Smith
a303ab9d00 Misc fixes, mostly clang formatting, but some bug fixes in matrix, werd, and tesstrain_utils. Also updates unicharset to match traineddata files. 2015-07-09 14:28:20 -07:00
Ray Smith
4a3caefd92 Add ability to build under android (without cube or scrollview). 2015-05-12 15:41:15 -07:00
zdenop
0e08cb0080 Make default language params message conditional on debug level: issue 1152
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1097 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-09 18:17:29 +00:00
theraysmith@gmail.com
372ceb8ef4 Removed dependence on IMAGE class
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@960 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:48:51 +00:00
theraysmith@gmail.com
4c3475ad2e Fixed fmemopen portability problem
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
theraysmith@gmail.com
4d514d5a60 Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
zdenop@gmail.com
10c1169d98 remove unused code (Windows related)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@860 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-07-08 18:21:10 +00:00
theraysmith@gmail.com
3a998fe7ac Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic, Added paragraph detection in layout analysis/post OCR, Fixed inconsistent xheight during training and over-chopping, Added simultaneous multi-language capability, Refactored top-level word recognition module, Fixed problems with internally scaled images
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@651 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:59:49 +00:00
zdenop@gmail.com
7ec3dca968 show page 0 for multipage tiff;
Windows: use binary mode for fopen (issue 70);
autotools: fixed cutil/Makefile.am, improved tessdata/Makefile.am;

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@604 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 21:42:13 +00:00
theraysmith
3e8c0bc228 Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@567 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:44:05 +00:00
theraysmith
c8465252e4 Rewrite of DENORM
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@538 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 01:05:48 +00:00
zdenop@gmail.com
4523ce9f7d 3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
theraysmith
57d669ff84 Fixed issue 229: lack of bits per sample
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@316 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-08-20 22:30:21 +00:00
theraysmith
109d1c8f21 Some changes in ccmain for 3.00
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@286 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:03:51 +00:00
theraysmith
cb3b9b492f Fixed tiffio problems with 32 bit images, issue 160 and duplicates
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@204 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-12-24 01:02:14 +00:00
theraysmith
7870d67c21 Fixed name collision with jpeg library
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@157 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 00:32:14 +00:00
theraysmith
dd18aea052 Added multi-page tiff capability
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@128 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:00:46 +00:00
theraysmith
b60c6065e3 Autoconf changes for 2.01
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@110 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-08-30 18:25:18 +00:00
theraysmith
570af48b8b Remaining changes for Unicodeization project
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@87 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:15:07 +00:00
theraysmith
0a53f8c5bf Preparations for unicodization
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@34 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 01:18:59 +00:00
tmbdev
425d593ebe top-skimming import from sf.net
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk/trunk@2 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-03-07 20:03:40 +00:00