Commit Graph

28 Commits

Author SHA1 Message Date
Ray Smith
dc8745e6fd Move LSTM unicharset and recoder to traineddata with version string part1. Backwards compatible - maybe. 2017-07-14 11:14:23 -07:00
Ray Smith
da03e4e910 Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion 2017-07-14 09:30:14 -07:00
Stefan Weil
079d6b9161 Improve robustness of TessdataManager
Tesseract crashes with an unhandled exception (std::bad_alloc) if it gets
a bad tessdata file where the numEntries data field is very large (also
after swapping), for example 0x77777777.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-14 21:33:56 +02:00
Stefan Weil
db8750e94e Remove unused method TessdataManager::LoadFileLater
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 13:14:47 +02:00
Stefan Weil
65b839e1aa Remove unused method TessdataManager::OverwriteEntry
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 13:14:47 +02:00
Stefan Weil
3a67ff930e Optimize code by replacing init_to_size with resize_no_init
There is no need to initialize memory with a fixed value which is
overwritten in the next step.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-12 14:34:55 +02:00
Ray Smith
8e79297dce Final part of endian improvement. Adds big-endian support to lstm and fixes issue 518 2017-05-03 16:09:44 -07:00
Stefan Weil
048cf9d06a Remove unused local variables
This fixes some compiler warnings.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-02 09:43:29 +02:00
Ray Smith
1cc511188d Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here. 2017-04-27 15:48:23 -07:00
Ray Smith
13e46ae1c4 Made LSTM the default engine, pushed cube out 2016-12-13 14:37:40 -08:00
Stefan Weil
a351dae29b ccutil/tessdatamanager: Fix resource leak
Coverity report:

CID 1340278 (#1 of 1): Resource leak (RESOURCE_LEAK)
11. leaked_storage: Variable output_file going out of scope leaks the storage it points to.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-10-24 16:00:57 +02:00
Stefan Weil
539b7fbbab ccutil: Fix typos in comments and strings
Most of them were found by codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-09-14 22:09:18 +02:00
Ray Smith
44122698d7 Removed debug messages, forward compatability of traineddata files, further bug fix. 2015-07-09 14:50:25 -07:00
Ray Smith
bfd2cb83d5 Fixed issue 1303 2014-10-07 09:21:17 -07:00
theraysmith@gmail.com
bb2e46830f Fixed issue 1075
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1029 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-29 20:28:42 +00:00
theraysmith@gmail.com
4c3475ad2e Fixed fmemopen portability problem
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@890 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-10-10 02:07:26 +00:00
theraysmith@gmail.com
4d514d5a60 Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
zdenop@gmail.com
b9abecfb34 Auto append dot in combine_tessdata (issue 932); provide more info for combine_tessdata utility
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@848 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-06-02 20:50:55 +00:00
zdenop@gmail.com
ad004bddf7 More info for combine_tessdata files (thanks to gmvbif - issue 917)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@845 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-21 18:34:47 +00:00
zdenop@gmail.com
7e14ade10d print error/warning messages to stderr/debug file instead of stdout (fix issue 911)
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@843 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-05-16 20:31:37 +00:00
theraysmith@gmail.com
e0d735b122 Remaining misc changes for 3.02
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@658 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:14:43 +00:00
theraysmith@gmail.com
42acb3b9a8 Updated ReleaseNotes, README, more minor cleaup
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@614 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-18 16:51:11 +00:00
zdenop@gmail.com
7ec3dca968 show page 0 for multipage tiff;
Windows: use binary mode for fopen (issue 70);
autotools: fixed cutil/Makefile.am, improved tessdata/Makefile.am;

git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@604 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 21:42:13 +00:00
theraysmith
4c4d036ee4 Removed serialize and NEWDELETE macros
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@529 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 00:53:31 +00:00
zdenop@gmail.com
4523ce9f7d 3.01 code from http://github.com/jimregan/tesseract-ocr with addaptions related to Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
joregan
cd96d8ede5 more warnings
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@434 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-07-21 18:11:00 +00:00
theraysmith
45aacc077e Updated tessdatamanager/combine_tessdata to give more functionality
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@353 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-05-20 23:07:24 +00:00
theraysmith
d8b1456dd5 Changes to ccutil for 3.00
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@305 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:50:24 +00:00