Ria
d751305804
fixed missing include for std::back_inserter.
...
with Visual Studio 2015 RTM:
Error C2039: 'back_inserter': is not a member of 'std'
Error C3861: 'back_inserter': identifier not found
need "iterator" with Visual Studio 2015 (vc14).
#include <iterator>
2017-11-23 11:37:35 +03:30
Stefan Weil
f3c4b894dc
Fix help message for unicharset_extractor (#1206)
...
If unicharset_extractor was called without any argument,
a help message was printed by tesseract::ParseCommandLineFlags.
Replace that by the local help message which is better.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-11-10 15:45:35 +01:00
ivanzz1001
fb359fc981
Update unicharset_extractor.cpp (#1153)
...
* change IsWhitespace to IsUTF8Whitespace
To solve "Phase UP: Generating unicharset and unichar properties files" ERROR #1147
please reference: [#1147](https://github.com/tesseract-ocr/tesseract/issues/1147)
* Update unicharset_extractor.cpp
fix the "Phase UP: Generating unicharset and unichar properties files" ERROR
* Update unicharset_extractor.cpp
fix "Phase UP: Generating unicharset and unichar properties files" ERROR #1147
* Update unicharset_extractor.cpp
fix the encoding invalid problem and fix the comment
2017-10-13 11:46:42 +02:00
Stefan Weil
07f1400e6f
Revert "change type to UChar32 to fix IsValidCodepoint"
...
This reverts commit a404c9cdb3.
That code no longer matched the specification (see code comment).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-09-18 07:42:00 +02:00
Shree Devi Kumar
a404c9cdb3
change type to UChar32 to fix IsValidCodepoint
2017-09-16 14:10:34 +05:30
amitdo
a905548ed6
Autotools build: Remove the option 'USING_MULTIPLELIBS'
...
Libtool's convenience libraries should never be installed. Fixes #985.
2017-09-11 15:03:53 +03:00
Shree Devi Kumar
4e9c975859
fix accidental overwrite using old version
2017-09-11 14:45:25 +05:30
Shreeshrii
9a038f893a
Add merge_unicharsets to build
2017-09-10 21:51:52 +05:30
Egor Pugin
36e0d2093a
Fix windows build.
2017-09-09 21:25:25 +03:00
Ray Smith
9d258e20d3
Fixed build of unicharset_extractor
2017-09-08 15:33:03 +01:00
Ray Smith
fc6a390c6c
Added intsimdmatrix as a generic integer matrixdotvector function with AVX2 and SSE specializations
2017-09-08 15:06:19 +01:00
Ray Smith
4cf123e099
Added ability to randomly rotate images upside-down during training for training OSD
2017-09-08 12:42:57 +01:00
Ray Smith
3e63918f9d
Fixed order of characters in ligatures of RTL languages issue #648
2017-09-08 11:55:11 +01:00
Ray Smith
a912967cc3
Rewrote unicharset_extractor to use the new string normalizer and read plain text as well as box files.
2017-09-08 11:49:57 +01:00
Ray Smith
c773eb5784
Fixed rendering of Thai and units of char spacing
2017-09-08 10:29:03 +01:00
Ray Smith
e96d1df072
Fixed leaks in pango font info
2017-09-08 10:28:22 +01:00
Ray Smith
a2a72d7ca7
Clang tidy changes from sync
2017-09-08 10:13:33 +01:00
Stefan Weil
61f96981e5
training: Fix typos in comments (found by codespell)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-09-03 09:16:02 +02:00
Arkady Shapkin
d171488e21
Added CMake option to use system ICU library
2017-08-17 02:50:54 +03:00
Ray Smith
5f5e85e4a0
Fixed lack of error on non-existent traineddata
2017-08-07 09:58:43 -07:00
Ray Smith
0a91498195
Improved error message on missing optional config
2017-08-07 09:50:49 -07:00
Ray Smith
4b3c5f6c35
Added check for non-empty traineddata flag
2017-08-07 09:43:30 -07:00
Egor Pugin
c67c2e9f41
Add combine_lang_model to cmake and cppan builds.
2017-08-06 14:46:32 +03:00
Stefan Weil
cdec915e17
Fix broken build for Windows
...
Windows does not provide a mkdir function with two parameters.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-08-04 10:18:35 +02:00
Ray Smith
77c44cdecd
Added convert to int and directory listing to combine_tessdata
2017-08-02 14:53:07 -07:00
Ray Smith
39b168a0b6
Removed errors introduced by git merge
2017-08-02 14:12:45 -07:00
Ray Smith
4e9665debf
Added ADAM optimizer, unless git screwed it up, cos there is no diff
2017-08-02 14:03:50 -07:00
Ray Smith
2633fef0b6
Part 2 of separating out the unicharset from the LSTM model, fixing command line for training
2017-08-02 13:29:23 -07:00
Ray Smith
b0ead95d64
Changed the way unicharsets are handled to allow support for the ™ character. Can find the issue where it was requested.
2017-07-24 11:45:57 -07:00
Ray Smith
3f7735492f
Removed unnecessary using statements and cleaned up google/non-google distinction
2017-07-19 16:42:48 -07:00
Stefan Weil
5a7b7ed7e1
PangoFontInfo: Remove unused method is_italic
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:22:05 +02:00
Stefan Weil
0cd71c67c9
PangoFontInfo: Remove unused method is_bold
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:59 +02:00
Stefan Weil
fbfbf67cf9
PangoFontInfo: Remove unused method is_smallcaps
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:49 +02:00
Stefan Weil
500f913b51
PangoFontInfo: Remove unused method is_monospace
...
Remove also some macros which are no longer needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:35 +02:00
Stefan Weil
059e30d4cb
PangoFontInfo: Remove unused method is_fraktur
...
That restores commit 25e0c1accb and partially reverts commit 4907a23fea,
which added the now unused Shlwapi library.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:16:38 +02:00
Egor Pugin
4907a23fea
Fix windows build.
2017-07-15 15:09:00 +03:00
Ray Smith
dc8745e6fd
Move LSTM unicharset and recoder to traineddata with version string part1. Backwards compatible - maybe.
2017-07-14 11:14:23 -07:00
Ray Smith
df41eab6aa
Added script-specific validation and normalization for virama-using scripts and updated normalization for others
2017-07-14 10:05:05 -07:00
Ray Smith
da03e4e910
Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion
2017-07-14 09:30:14 -07:00
Justin Hotchkiss Palermo
f057938069
fix filenames in comments
2017-07-02 17:35:47 -04:00
zdenop
59de660386
Merge pull request #969 from stweil/clean
...
PangoFontInfo: Remove some unused methods
2017-06-03 15:30:46 +02:00
Stefan Weil
2843739843
PangoFontInfo: Remove unused method is_italic
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
e420417c85
PangoFontInfo: Remove unused method is_bold
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
0d411cb5c5
PangoFontInfo: Remove unused method is_smallcaps
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
8786e56084
PangoFontInfo: Remove unused method is_monospace
...
Remove also some macros which are no longer needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:19 +02:00
Egor Pugin
4ed4864dd6
Merge pull request #966 from rfschtkt/pen_color_
...
StringRenderer::pen_color_: int[3]->double[3]
2017-06-03 12:32:26 +03:00
Stefan Weil
8ec67a940d
Remove strcasestr which is no longer needed
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 10:11:44 +02:00
Stefan Weil
25e0c1accb
PangoFontInfo: Remove unused method is_fraktur
...
That allows removing a dirty hack which used the
non-portable function strcasestr.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 10:08:21 +02:00
Raf Schietekat
2981c6c585
StringRenderer::pen_color_: int[3]->double[3]
2017-06-02 09:58:26 +02:00
Raf Schietekat
8dad542f77
Fewer g++ -Wunused-variable warnings
2017-05-11 23:36:05 +02:00