Ray Smith
c1c1e426b3
Added new LSTM-based neural network line recognizer
2016-11-07 15:38:07 -08:00
Ray Smith
2c837dffc3
Result of clang tidy on recent merge
2016-11-07 10:46:33 -08:00
Stefan Weil
6fad5fc0a9
dict/dict: Fix memory leaks at program termination
...
Avoid dynamic memory allocation for the static variable 'cache'.
Now the destructor for that variable is called automatically
when Tesseract terminates and releases all associated memory.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-10-25 17:25:55 +02:00
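The leak fix in 6fad5fc0a9 above replaces a heap-allocated static with a static object. A minimal sketch of the two patterns follows; `WordCache`, `LeakyCache` and `StaticCache` are hypothetical names chosen for illustration and are not taken from the Tesseract source.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the cached data; not Tesseract's actual type.
struct WordCache {
  std::map<std::string, int> entries;
};

// Leaky pattern: the object is allocated once and never deleted, so its
// destructor never runs and leak checkers report it at program exit.
WordCache& LeakyCache() {
  static WordCache* cache = new WordCache();
  return *cache;
}

// Static-object pattern: the object has static storage duration, so its
// destructor is called automatically during normal program termination.
WordCache& StaticCache() {
  static WordCache cache;
  return cache;
}
```

Both accessors return the same object on every call; only the second one releases its memory when the program terminates.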
zdenop
a1a4575f4b
Merge pull request #6 from jimregan/gcode_issue1316
...
Issue 1316: The traineddata file must be closed after it was opened
2016-08-29 17:31:48 +02:00
Stefan Weil
e6c0d263db
Add missing argument for tprintf
...
The format string expects an int argument.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-03-17 09:30:25 +01:00
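The class of bug fixed in e6c0d263db above (a printf-style format string with a conversion but no matching argument) can be sketched with the standard `std::snprintf`, assuming `tprintf` follows the same format conventions. `FormatCount` is a hypothetical helper, not a Tesseract function.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper, not part of Tesseract: formats a message that
// contains one integer conversion and supplies the matching int argument.
std::string FormatCount(int count) {
  char buffer[64];
  // "%d" consumes exactly one int. Omitting `count` here would be
  // undefined behavior and is usually reported by -Wformat.
  std::snprintf(buffer, sizeof(buffer), "loaded %d entries", count);
  return std::string(buffer);
}
```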
Stefan Weil
edf765b952
Remove unneeded const qualifiers
...
This fixes compiler warnings like this one:
api/baseapi.h:739:32: warning:
type qualifiers ignored on function return type [-Wignored-qualifiers]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-05 06:36:42 +01:00
Stefan Weil
97d47a406d
dict: Fix typos in comments and strings
...
All of them were found by codespell.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-09-14 22:16:42 +02:00
Zdenko Podobný
66a76a9477
Revert "temporary add config/*, configure and Makefile.in for release"
...
This reverts commits ec9581d8f2, 1afe382c4e, 4b2cfabcc1
2015-07-31 21:44:43 +02:00
Jim O'Regan
524a61452d
Doxygen
...
Squashed commit from https://github.com/tesseract-ocr/tesseract/tree/more-doxygen
closes #14
Commits:
6317305
doxygen
9f42f69
doxygen
0fc4d52
doxygen
37b4b55
fix typo
bded8f1
some more doxy
020eb00
slight tweak
524666d
doxygenify
2a36a3e
doxygenify
229d218
doxygenify
7fd28ae
doxygenify
a8c64bc
doxygenify
f5d21b6
fix
5d8ede8
doxygenify
a58a4e0
language_model.cpp
fa85709
lm_pain_points.cpp lm_state.cpp
6418da3
merge
06190ba
Merge branch 'old_doxygen_merge' into more-doxygen
84acf08
Merge branch 'master' into more-doxygen
50fe1ff
pagewalk.cpp cube_reco_context.cpp
2982583
change to relative
192a24a
applybox.cpp, take one
8eeb053
delete docs for obsolete params
52e4c77
modernise classify/ocrfeatures.cpp
2a1cba6
modernise cutil/emalloc.cpp
773e006
silence doxygen warning
aeb1731
silence doxygen warning
f18387f
silence doxygen; new params are unused?
15ad6bd
doxygenify cutil/efio.cpp
c8b5dad
doxygenify cutil/danerror.cpp
784450f
the globals and exceptions parts are obsolete; remove
8bca324
doxygen classify/normfeat.cpp
9bcbe16
doxygen classify/normmatch.cpp
aa9a971
doxygen ccmain/cube_control.cpp
c083ff2
doxygen ccmain/cube_reco_context.cpp
f842850
params changed
5c94f12
doxygen ccmain/cubeclassifier.cpp
15ba750
case sensitive
f5c71d4
case sensitive
f85655b
doxygen classify/intproto.cpp
4bbc7aa
partial doxygen classify/mfx.cpp
dbb6041
partial doxygen classify/intproto.cpp
2aa72db
finish doxygen classify/intproto.cpp
0b8de99
doxygen training/mftraining.cpp
0b5b35c
partial doxygen ccstruct/coutln.cpp
b81c766
partial doxygen ccstruct/coutln.cpp
40fc415
finished? doxygen ccstruct/coutln.cpp
6e4165c
doxygen classify/clusttool.cpp
0267dec
doxygen classify/cutoffs.cpp
7f0c70c
doxygen classify/fpoint.cpp
512f3bd
ignore ~ files
5668a52
doxygen classify/intmatcher.cpp
84788d4
doxygen classify/kdtree.cpp
29f36ca
doxygen classify/mfoutline.cpp
40b94b1
silence doxygen warnings
6c511b9
doxygen classify/mfx.cpp
f9b4080
doxygen classify/outfeat.cpp
aa1df05
doxygen classify/picofeat.cpp
cc5f466
doxygen training/cntraining.cpp
cce044f
doxygen training/commontraining.cpp
167e216
missing param
9498383
renamed params
37eeac2
renamed param
d87b5dd
case
c8ee174
renamed params
b858db8
typo
4c2a838
h2 context?
81a2c0c
fix some param names; add some missing params, no docs
bcf8a4c
add some missing params, no docs
af77f86
add some missing params, no docs; fix some param names
01df24e
fix some params
6161056
fix some params
68508b6
fix some params
285aeb6
doxygen complains here no matter what
529bcfa
rm some missing params, typos
cd21226
rm some missing params, add some new ones
48a4bc2
fix params
c844628
missing param
312ce37
missing param; rename one
ec2fdec
missing param
05e15e0
missing params
d515858
change "<" to < to make doxygen happy
b476a28
wrong place
2015-07-20 18:48:00 +01:00
Zdenko Podobný
ec9581d8f2
temporary add configure and Makefile.in for release
2015-07-11 09:42:43 +02:00
Jim O'Regan
a94943cc1f
remove unneeded comment from commit
2015-05-13 14:59:02 +01:00
oriahulrich@microvu.com
d3252f926e
Issue 1316: The traineddata file must be closed after it was opened
2015-05-13 14:53:37 +01:00
Ray Smith
84920b92b3
Font and classifier output structure cleanup.
...
Font recognition was poor, due to forcing a 1st and 2nd choice at
a character level, when the total score for the correct font is often
correct at the word level, so allowed the propagation of a full set
of fonts and scores to the word recognizer, which can now decide word
level fonts using the scores instead of simple votes.
Change precipitated a cleanup of output data structures for classifier
results, eliminating ScoredClass and INT_RESULT_STRUCT, with a few
extra elements going in UnicharRating, and using that wherever possible.
That added the extra complexity of 1-rating due to a flip between 0 is
good and 0 is bad for the internal classifier scores before they are
converted to rating and certainty.
2015-05-12 17:24:34 -07:00
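The difference between character-level votes and word-level scores described in 84920b92b3 above can be shown with a small sketch. It assumes non-negative per-character font scores where higher is better; `FontScores`, `FontByVotes` and `FontByTotalScore` are hypothetical names and do not reflect the actual Tesseract data structures.

```cpp
#include <map>
#include <string>
#include <vector>

// Per-character font scores: font name -> score (higher is better).
using FontScores = std::map<std::string, double>;

// Character-level voting: each character votes for its top-scoring font.
std::string FontByVotes(const std::vector<FontScores>& chars) {
  std::map<std::string, int> votes;
  for (const FontScores& scores : chars) {
    std::string best;
    double best_score = -1.0;
    for (const auto& entry : scores) {
      if (entry.second > best_score) {
        best_score = entry.second;
        best = entry.first;
      }
    }
    if (!best.empty()) ++votes[best];
  }
  std::string winner;
  int most = 0;
  for (const auto& entry : votes) {
    if (entry.second > most) {
      most = entry.second;
      winner = entry.first;
    }
  }
  return winner;
}

// Word-level scoring: sum every font's score over all characters.
std::string FontByTotalScore(const std::vector<FontScores>& chars) {
  std::map<std::string, double> totals;
  for (const FontScores& scores : chars) {
    for (const auto& entry : scores) totals[entry.first] += entry.second;
  }
  std::string winner;
  double best = -1.0;
  for (const auto& entry : totals) {
    if (entry.second > best) {
      best = entry.second;
      winner = entry.first;
    }
  }
  return winner;
}
```

With a word whose first two characters narrowly prefer font A and whose third character strongly prefers font B, voting picks A while the summed scores pick B.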
Ray Smith
3c21c14949
Fixed issue 1245
2014-08-13 18:51:28 -07:00
Ray Smith
736d327473
NOP changes from static analysis in issue 1205
2014-08-12 16:09:12 -07:00
zdenop
ee73e3b107
fix issue 123: user-words (and user-patterns) file specified by command line
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1093 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-05-04 21:11:00 +00:00
theraysmith@gmail.com
07ca24aeaf
Removed upper limit on trie size, fixing issue 1020.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1044 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-02-03 19:18:23 +00:00
theraysmith@gmail.com
d11dc049e3
Fixed a lot of compiler/clang warnings
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1015 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-25 02:28:51 +00:00
theraysmith@gmail.com
60b4f8bc88
Fixed issue 743
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@978 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-10 18:25:46 +00:00
theraysmith@gmail.com
67f9af58b8
Removed dependence on IMAGE class
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@944 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:31:29 +00:00
theraysmith@gmail.com
7ec4fd7a56
Refactored control functions to enable parallel blob classification
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@904 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-11-08 20:30:56 +00:00
zdenop@gmail.com
53a3e0f88a
fix issue 755; add example config files from tesseract manpage
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@894 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-20 20:20:10 +00:00
theraysmith@gmail.com
4c3475ad2e
Fixed fmemopen portability problem
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
theraysmith@gmail.com
4d514d5a60
Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
david.eger@gmail.com
0aadbd0169
Save BLOB_CHOICE s for alternate choices saved during segmentation
...
search so we have them when trying to replace words with alternates in
the bigram correction pass.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@739 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-09-01 00:33:46 +00:00
david.eger@gmail.com
4f0ff358a7
Missing close bracket.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@714 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-29 06:15:33 +00:00
david.eger@gmail.com
4ddb3e5941
Good moming, Good aftemoon.
...
During our initial chopping for each word, pay attention to whether a
dangerous ambiguity (like rn <-> m) would lead us to a dictionary word.
If so, make sure that blob gets chopped so that we can evaluate said
dictionary word during the segmentation search.
Large accuracy improvement, especially on English printed books (~9%).
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@713 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 21:02:54 +00:00
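The check described in 4ddb3e5941 above (whether a dangerous ambiguity such as rn <-> m would lead to a dictionary word) can be sketched on plain strings. This covers only the lookup, not the chopping itself; `AmbiguityLeadsToDictWord` and the `std::set` dictionary are assumptions for illustration, not the Tesseract implementation, which works on blobs and dawgs.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Returns true if replacing one occurrence of `from` in `word` with `to`
// produces a word found in `dictionary`.
bool AmbiguityLeadsToDictWord(const std::string& word,
                              const std::string& from,
                              const std::string& to,
                              const std::set<std::string>& dictionary) {
  for (std::size_t pos = word.find(from); pos != std::string::npos;
       pos = word.find(from, pos + 1)) {
    // Try the substitution at this position only.
    std::string candidate = word;
    candidate.replace(pos, from.size(), to);
    if (dictionary.count(candidate) > 0) return true;
  }
  return false;
}
```

For the misreadings in the commit title, substituting "rn" for "m" in "moming" or "aftemoon" reaches "morning" or "afternoon" when those words are in the dictionary.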
david.eger@gmail.com
0d5e8b5cb6
Recording segmentation state for a choice at LogNewChoice() time was a
...
bad idea -- a VIABLE_CHOICE's Blob->NumChunks can be modified as we go
by a call from Dict::LogNewSplit(). Relying on the auxiliary
segmentation_state makes alt choices sometimes reference the wrong
blobs.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@711 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-28 20:11:57 +00:00
zdenop@gmail.com
d4d4b8aad8
improve autotools system (mingw+msys fix); implementation of --disable-tessdata-prefix
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@708 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-22 20:01:33 +00:00
zdenop@gmail.com
1009a6e2f0
fopen() should use binary mode (issue 70)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@704 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-11 12:41:17 +00:00
zdenop@gmail.com
97e19443a3
install only necessary headers, fix uninstall
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@692 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-03 13:22:51 +00:00
zdenop@gmail.com
30a70142a0
visibility - autotools part (./configure --enable-visibility)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@690 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-03-02 23:51:33 +00:00
zdenop@gmail.com
6ccab83bd6
fixing issue 628 (replacing __MSW32__ with _WIN32) and issue 614 (reverting "class DLLSYM STRING" to "class CCUTIL_API STRING")
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@677 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-19 21:48:45 +00:00
david.eger@gmail.com
018f192fc2
Abolish populate_unichars(), fixing seg fault reported in Debian:
...
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=658634
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@675 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-15 01:37:00 +00:00
theraysmith@gmail.com
fdd4ffe85e
Fixed endian bug in dawg reader, Added word bigram correction,
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@649 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 02:56:18 +00:00
zdenop@gmail.com
67f47008c7
fixed "one lib" build on linux; runautoconf renamed to autogen.sh;
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@631 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-10-16 19:39:54 +00:00
joregan@gmail.com
bf4a09d72a
make single/multiple libraries optional -- this needs testing!!!
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@623 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-29 21:28:28 +00:00
theraysmith@gmail.com
d5d15f32d7
Deleted Makefile.in from svn
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@606 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:32:44 +00:00
zdenop@gmail.com
9b7375edd6
MinGW portability solved + some code cleanup (based on cpplint)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@605 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-15 19:28:25 +00:00
theraysmith
664b84b3c8
Various fixes, including memory leak in fixspace, font labels on output, removed some annoying debug output, fixes to initialization of parameters, general cleanup, and added Hindi
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@571 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-21 21:46:35 +00:00
theraysmith
96ca745384
Deleted lots of dead code, including PBLOB
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@565 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 22:14:53 +00:00
theraysmith
7cd3c74419
Deleted lots of dead code, including PBLOB
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@560 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-03-18 21:53:35 +00:00
theraysmith
b98c922391
Fixed problem with empty dawgs
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@537 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 01:04:02 +00:00
zdenop@gmail.com
4523ce9f7d
3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
zdenop@gmail.com
282aa13975
*.vcproj moved to vs2008/ (bin/ and bin.dbg/ will be in vs2008/)
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@506 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-10-06 21:38:19 +00:00
zdenop@gmail.com
3964660093
update of VC++ project file to recent changes
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@495 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 21:20:00 +00:00
joregan
e0b07948fc
disabling gettext checks - not currently used, and something about disabling is causing subsequent autoconf checks to not run
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@492 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-30 16:27:39 +00:00
joregan
9c53d54fe3
max.markin's patch for issue 345
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@477 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-09-29 23:54:18 +00:00
joregan
69f39d4bf5
fix for issue 341, thanks to max.markin
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@454 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-08-19 19:17:06 +00:00
joregan
75676cd644
doxygen
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@449 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-08-02 00:05:57 +00:00