Commit Graph

21 Commits

Author SHA1 Message Date
Ray Smith
a912967cc3 Rewrote unicharset_extractor to use the new string normalizer and read plain text as well as box files. 2017-09-08 11:49:57 +01:00
Ray Smith
b0ead95d64 Changed the way unicharsets are handled to allow support for the ™ character. Can find the issue where it was requested. 2017-07-24 11:45:57 -07:00
Ray Smith
da03e4e910 Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion 2017-07-14 09:30:14 -07:00
Stefan Weil
69296f8d18 Clean method UNICHARSET::add_script
It increased the script_table too early, so the last element was never
used.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 11:53:43 +02:00
Ray Smith
c1c1e426b3 Added new LSTM-based neural network line recognizer 2016-11-07 15:38:07 -08:00
Stefan Weil
edf765b952 Remove unneeded const qualifiers
This fixes compiler warnings like this one:

api/baseapi.h:739:32: warning:
 type qualifiers ignored on function return type [-Wignored-qualifiers]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-05 06:36:42 +01:00
Ray Smith
44122698d7 Removed debug messages, forward compatibility of traineddata files, further bug fix. 2015-07-09 14:50:25 -07:00
Ray Smith
a303ab9d00 Misc fixes, mostly clang formatting, but some bug fixes in matrix, werd, and tesstrain_utils. Also updates unicharset to match traineddata files. 2015-07-09 14:28:20 -07:00
Ray Smith
f927728169 Fixed issue 1207 2014-10-09 13:28:03 -07:00
Ray Smith
d3448c37ab Fixed issue 1264 2014-09-17 18:29:32 -07:00
theraysmith@gmail.com
dbf6197471 Major refactor of control.cpp to enable line recognition
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1147 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:23:06 +00:00
theraysmith@gmail.com
f297e5d909 Misc fixes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@942 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:27:40 +00:00
theraysmith@gmail.com
dfc1a92628 Refactored classifier to make it easier to add new ones
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@874 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:16:01 +00:00
theraysmith@gmail.com
e0d735b122 Remaining misc changes for 3.02
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@658 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:14:43 +00:00
theraysmith
4c4d036ee4 Removed serialize and NEWDELETE macros
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@529 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 00:53:31 +00:00
zdenop@gmail.com
4523ce9f7d 3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
theraysmith
d8b1456dd5 Changes to ccutil for 3.00
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@305 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:50:24 +00:00
theraysmith
4b5609238b Fixed name collision with jpeg library
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@156 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 00:23:41 +00:00
theraysmith
2a678305c6 Major internationalization improvements
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@133 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:21:49 +00:00
theraysmith
570af48b8b Remaining changes for Unicodeization project
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@87 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:15:07 +00:00
theraysmith
7be9e334cf Preparations for unicodization
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@39 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 01:25:41 +00:00