Commit Graph

22 Commits

Author SHA1 Message Date
Stefan Weil
85e37798cb Simplify delete operations
It is not necessary to check for null pointers.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-11-24 17:59:13 +01:00
Ray Smith
c1c1e426b3 Added new LSTM-based neural network line recognizer 2016-11-07 15:38:07 -08:00
Ray Smith
d74c625e52 Fixed blob division params to fix CJK training speed. 2015-06-12 10:59:26 -07:00
Ray Smith
25d0968d09 Major refactor to improve speed on difficult images, especially when running
a heap checker.
SEAM and SPLIT have been begging for a refactor for a *LONG* time.
This change does most of the work of turning them into proper classes:
  Moved relevant code into SEAM/SPLIT/TBLOB/EDGEPT etc from global helper functions.
  Made the splits full data members of SEAM in an array instead of 3 separate pointers.
    This greatly reduces the amount of new/delete happening in the chopper, which is the main goal.
  Deleted redundant files: olutil.*,  makechop.*
  Brought other code into SEAM in order to keep its data members private with only priority having accessors.
2015-05-12 14:59:14 -07:00
theraysmith@gmail.com
d11dc049e3 Fixed a lot of compiler/clang warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
2622cbd80e misc fixes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@952 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:40:32 +00:00
theraysmith@gmail.com
7ec4fd7a56 Refactored control functions to enable parallel blob classification
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@904 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-08 20:30:56 +00:00
theraysmith@gmail.com
4d514d5a60 Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
david.eger@gmail.com
4ddb3e5941 Good moming, Good aftemoon.
During our initial chopping for each word, pay attention to whether a
dangerous ambiguity (like rn <-> m) would lead us to a dictionary word.
If so, make sure that blob gets chopped so that we can evaluate said
dictionary word during the segmentation search.

Large accuracy improvement, especially on English printed books (~9%).

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@713 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 21:02:54 +00:00
theraysmith@gmail.com
01026af5a2 Refactored top-level word recognition module, Blamer module added for error analysis, Added word bigram correction
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@652 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:01:38 +00:00
theraysmith
f7445867f9 Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@575 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:49:31 +00:00
zdenop@gmail.com
4523ce9f7d 3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to the Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
joregan
f2506871f9 move include of config_auto.h to not conflict with local types. Not finished
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@490 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 15:53:40 +00:00
joregan
a18816f839 partial merge of doxygen branch (stuff without conflicts, basically)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@441 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-27 13:23:23 +00:00
joregan
5c8ad7ee72 add config_auto.h anywhere #ifndef GRAPHICS_DISABLED is used
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@384 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-28 12:03:45 +00:00
theraysmith
b47efd2cc4 Changes to wordrec for 3.00
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@304 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:46:01 +00:00
theraysmith
f04ff6145c Fixed name collision with jpeg library
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@159 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 00:35:16 +00:00
theraysmith
166c867d84 Removed some compiler warnings on operator precedence
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@129 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:05:57 +00:00
theraysmith
6ae6c0a042 Made some preliminary changes for improving xheights
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@107 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-08-30 18:20:10 +00:00
theraysmith
2f4a43b419 Improved consistency of results from floating point calculations
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@79 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 00:55:02 +00:00
theraysmith
bfd79a970e Fixed name collisions mostly with stl
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@37 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 01:23:42 +00:00
tmbdev
425d593ebe top-skimming import from sf.net
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk/trunk@2 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-03-07 20:03:40 +00:00
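
Note on commit 85e37798cb ("Simplify delete operations"): the message relies on the C++ rule that applying `delete` to a null pointer is a no-op, so a guarding `if` adds nothing. The following is a minimal sketch of that simplification, not the actual diff. `Node`, `Holder`, `clear_checked`, `clear_simple`, and `demo` are hypothetical names invented for illustration and do not come from the Tesseract sources.

```cpp
#include <cassert>

// Hypothetical type used only for illustration; counts live instances.
struct Node {
  static int live;  // number of Node objects currently alive
  Node() { ++live; }
  ~Node() { --live; }
};
int Node::live = 0;

// Hypothetical owner of an optional Node held through a raw pointer.
struct Holder {
  Node* node = nullptr;

  // Redundant style: the delete is guarded by a null check.
  void clear_checked() {
    if (node != nullptr) {
      delete node;
    }
    node = nullptr;
  }

  // Simplified style: delete on a null pointer does nothing,
  // so the check can be dropped without changing behavior.
  void clear_simple() {
    delete node;
    node = nullptr;
  }
};

// Usage sketch: both styles behave the same with and without an owned Node.
inline void demo() {
  Holder h;
  h.clear_simple();            // nothing owned: safe no-op
  assert(Node::live == 0);

  h.node = new Node();
  h.clear_simple();            // owned Node is destroyed
  assert(Node::live == 0 && h.node == nullptr);

  h.node = new Node();
  h.clear_checked();           // same result with the explicit check
  assert(Node::live == 0 && h.node == nullptr);
}
```

Resetting the pointer to `nullptr` after the delete is a choice made in this sketch so that repeated calls stay safe; the commit message itself only states that the null check is unnecessary.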