Stefan Weil
f3c4b894dc
Fix help message for unicharset_extractor (#1206)
...
If unicharset_extractor was called without any argument,
a help message was printed by tesseract::ParseCommandLineFlags.
Replace that by the local help message which is better.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-11-10 15:45:35 +01:00
ivanzz1001
fb359fc981
Update unicharset_extractor.cpp (#1153)
...
* change IsWhitespace to IsUTF8Whitespace
To solve "Phase UP: Generating unicharset and unichar properties files" ERROR #1147
please reference: [#1147](https://github.com/tesseract-ocr/tesseract/issues/1147)
* Update unicharset_extractor.cpp
fix the "Phase UP: Generating unicharset and unichar properties files" ERROR
* Update unicharset_extractor.cpp
fix "Phase UP: Generating unicharset and unichar properties files" ERROR #1147
* Update unicharset_extractor.cpp
fix the invalid encoding problem and fix the comment
2017-10-13 11:46:42 +02:00
Ray Smith
a912967cc3
Rewrote unicharset_extractor to use the new string normalizer and read plain text as well as box files.
2017-09-08 11:49:57 +01:00
Ray Smith
da03e4e910
Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion
2017-07-14 09:30:14 -07:00
Stefan Weil
cb6e9e0071
training: Replace NULL by nullptr
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-14 21:08:36 +01:00
John Slade
379da1f2e0
training/unicharset_extractor.cpp: Print whether WCTYPE is included
...
Character properties are autogenerated only if wctype is found on the
system. However, it is not possible to know if a version of
unicharset_extractor was compiled with this support (especially if it
was installed as a pre-compiled binary).
This commit adds a print to the usage details to output if the binary
was compiled with wctype support.
2015-10-05 11:54:24 +01:00
theraysmith@gmail.com
0e230a9d96
New training tool text2image
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@964 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2014-01-09 18:01:34 +00:00
theraysmith@gmail.com
4d514d5a60
Major refactor of beam search, elimination of dead code, misc bug fixes, updates to Makefile.am, Changelog etc.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@878 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2013-09-23 15:26:50 +00:00
zdenop@gmail.com
6ccab83bd6
fixing issue 628 (replacing __MSW32__ with _WIN32) and issue 614 (reverting "class DLLSYM STRING" to "class CCUTIL_API STRING")
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@677 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-19 21:48:45 +00:00
theraysmith@gmail.com
e33ae59f4d
Fixed training leaks and randomness
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@653 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2012-02-02 03:02:16 +00:00
zdenop@gmail.com
7ec3dca968
show page 0 for multipage tiff;
...
Windows: use binary mode for fopen (issue 70);
autotools: fixed cutil/Makefile.am, improved tessdata/Makefile.am;
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@604 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2011-08-11 21:42:13 +00:00
zdenop@gmail.com
4523ce9f7d
3.01 code from http://github.com/jimregan/tesseract-ocr with adaptations related to Linux and Windows (VC2008) compile process
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@526 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2010-11-23 18:34:14 +00:00
theraysmith
5b79487a8e
Changes to training for 3.00
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@302 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2009-07-11 02:44:07 +00:00
theraysmith
e79302cfe4
Fixed issue 81
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@215 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-12-30 21:28:08 +00:00
theraysmith
0b50f4f4a1
Major internationalization improvements
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@153 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2008-02-01 00:57:56 +00:00
theraysmith
58ade5fce6
Fixed an error in setting char properties
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@120 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-08-30 22:13:30 +00:00
theraysmith
f382fb56f5
Fixed various internationalization issues, mostly for training
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@106 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-08-30 18:18:35 +00:00
theraysmith
1943de9aa9
Fixed the extern C mismatches properly.
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@82 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-07-18 01:00:54 +00:00
theraysmith
f9a4bc092a
Preparations for unicodization
...
git-svn-id: https://tesseract-ocr.googlecode.com/svn/trunk@38 d0cd1f9f-072b-0410-8dd7-cf729c803f20
2007-05-16 01:24:06 +00:00