Ray Smith
a912967cc3
Rewrote unicharset_extractor to use the new string normalizer and read plain text as well as box files.
2017-09-08 11:49:57 +01:00
Ray Smith
b0ead95d64
Changed the way unicharsets are handled to allow support for the ™ character. Can find the issue where it was requested.
2017-07-24 11:45:57 -07:00
Ray Smith
da03e4e910
Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion
2017-07-14 09:30:14 -07:00
Stefan Weil
69296f8d18
Clean method UNICHARSET::add_script
It increased the script_table too early, so the last element was never
used.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-13 11:53:43 +02:00
Ray Smith
c1c1e426b3
Added new LSTM-based neural network line recognizer
2016-11-07 15:38:07 -08:00
Stefan Weil
edf765b952
Remove unneeded const qualifiers
This fixes compiler warnings like this one:
api/baseapi.h:739:32: warning:
type qualifiers ignored on function return type [-Wignored-qualifiers]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-11-05 06:36:42 +01:00
Ray Smith
44122698d7
Removed debug messages, forward compatibility of traineddata files, further bug fix.
2015-07-09 14:50:25 -07:00
Ray Smith
a303ab9d00
Misc fixes, mostly clang formatting, but some bug fixes in matrix, werd, and tesstrain_utils. Also updates unicharset to match traineddata files.
2015-07-09 14:28:20 -07:00
Ray Smith
f927728169
Fixed issue 1207
2014-10-09 13:28:03 -07:00
Ray Smith
d3448c37ab
Fixed issue 1264
2014-09-17 18:29:32 -07:00
theraysmith@gmail.com
dbf6197471
Major refactor of control.cpp to enable line recognition
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@1147 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-08-11 23:23:06 +00:00
theraysmith@gmail.com
f297e5d909
Misc fixes
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@942 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 17:27:40 +00:00
theraysmith@gmail.com
dfc1a92628
Refactored classifier to make it easier to add new ones
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@874 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:16:01 +00:00
theraysmith@gmail.com
e0d735b122
Remaining misc changes for 3.02
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@658 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:14:43 +00:00
theraysmith
4c4d036ee4
Removed serialize and NEWDELETE macros
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@529 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-30 00:53:31 +00:00
zdenop@gmail.com
4523ce9f7d
3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
theraysmith
d8b1456dd5
Changes to ccutil for 3.00
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@305 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:50:24 +00:00
theraysmith
4b5609238b
Fixed name collision with jpeg library
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@156 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-04-22 00:23:41 +00:00
theraysmith
2a678305c6
Major internationalization improvements
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@133 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:21:49 +00:00
theraysmith
570af48b8b
Remaining changes for Unicodeization project
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@87 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:15:07 +00:00
theraysmith
7be9e334cf
Preparations for unicodization
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@39 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 01:25:41 +00:00