Stefan Weil
61f96981e5
training: Fix typos in comments (found by codespell)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-09-03 09:16:02 +02:00
Arkady Shapkin
d171488e21
Added CMake option to use system ICU library
2017-08-17 02:50:54 +03:00
Ray Smith
5f5e85e4a0
Fixed lack of error on non-existent traineddata
2017-08-07 09:58:43 -07:00
Ray Smith
0a91498195
Improved error message on missing optional config
2017-08-07 09:50:49 -07:00
Ray Smith
4b3c5f6c35
Added check for non-empty traineddata flag
2017-08-07 09:43:30 -07:00
Egor Pugin
c67c2e9f41
Add combine_lang_model to cmake and cppan builds.
2017-08-06 14:46:32 +03:00
Stefan Weil
cdec915e17
Fix broken build for Windows
...
Windows does not provide a mkdir function with two parameters.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-08-04 10:18:35 +02:00
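The portability issue behind this fix can be sketched as follows. This is a minimal illustration, not the code from the commit: `MakeDirectory` is a hypothetical helper name, and the permission bits on the POSIX branch are an assumption.

```cpp
#include <string>

#if defined(_WIN32)
#include <direct.h>     // _mkdir(const char*): takes a single parameter
#else
#include <sys/stat.h>   // mkdir(const char*, mode_t): takes two parameters
#include <sys/types.h>
#endif

// Hypothetical helper (not from the Tesseract sources).
// Returns 0 on success and -1 on failure, like the underlying calls.
inline int MakeDirectory(const std::string& path) {
#if defined(_WIN32)
  return _mkdir(path.c_str());
#else
  // Assumed mode: rwx for the owner, r-x for group and others.
  return mkdir(path.c_str(), S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH);
#endif
}
```

Wrapping the call in one helper keeps the platform check in a single place instead of at every call site.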
Ray Smith
77c44cdecd
Added convert to int and directory listing to combine_tessdata
2017-08-02 14:53:07 -07:00
Ray Smith
39b168a0b6
Removed errors introduced by git merge
2017-08-02 14:12:45 -07:00
Ray Smith
4e9665debf
Added ADAM optimizer, unless git screwed it up, cos there is no diff
2017-08-02 14:03:50 -07:00
Ray Smith
2633fef0b6
Part 2 of separating out the unicharset from the LSTM model, fixing command line for training
2017-08-02 13:29:23 -07:00
Ray Smith
b0ead95d64
Changed the way unicharsets are handled to allow support for the ™ character. Can find the issue where it was requested.
2017-07-24 11:45:57 -07:00
Ray Smith
3f7735492f
Removed unnecessary using statements and cleaned up google/non-google distinction
2017-07-19 16:42:48 -07:00
Stefan Weil
5a7b7ed7e1
PangoFontInfo: Remove unused method is_italic
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:22:05 +02:00
Stefan Weil
0cd71c67c9
PangoFontInfo: Remove unused method is_bold
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:59 +02:00
Stefan Weil
fbfbf67cf9
PangoFontInfo: Remove unused method is_smallcaps
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:49 +02:00
Stefan Weil
500f913b51
PangoFontInfo: Remove unused method is_monospace
...
Remove also some macros which are no longer needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:35 +02:00
Stefan Weil
059e30d4cb
PangoFontInfo: Remove unused method is_fraktur
...
That restores commit 25e0c1accb and partially reverts commit 4907a23fea,
which added the now unused Shlwapi library.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:16:38 +02:00
Egor Pugin
4907a23fea
Fix windows build.
2017-07-15 15:09:00 +03:00
Ray Smith
dc8745e6fd
Move LSTM unicharset and recoder to traineddata with version string part1. Backwards compatible - maybe.
2017-07-14 11:14:23 -07:00
Ray Smith
df41eab6aa
Added script-specific validation and normalization for virama-using scripts and updated normalization for others
2017-07-14 10:05:05 -07:00
Ray Smith
da03e4e910
Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion
2017-07-14 09:30:14 -07:00
Justin Hotchkiss Palermo
f057938069
fix filenames in comments
2017-07-02 17:35:47 -04:00
zdenop
59de660386
Merge pull request #969 from stweil/clean
...
PangoFontInfo: Remove some unused methods
2017-06-03 15:30:46 +02:00
Stefan Weil
2843739843
PangoFontInfo: Remove unused method is_italic
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
e420417c85
PangoFontInfo: Remove unused method is_bold
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
0d411cb5c5
PangoFontInfo: Remove unused method is_smallcaps
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
8786e56084
PangoFontInfo: Remove unused method is_monospace
...
Remove also some macros which are no longer needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:19 +02:00
Egor Pugin
4ed4864dd6
Merge pull request #966 from rfschtkt/pen_color_
...
StringRenderer::pen_color_: int[3]->double[3]
2017-06-03 12:32:26 +03:00
Stefan Weil
8ec67a940d
Remove strcasestr which is no longer needed
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 10:11:44 +02:00
Stefan Weil
25e0c1accb
PangoFontInfo: Remove unused method is_fraktur
...
That allows removing a dirty hack which used the
non-portable function strcasestr.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 10:08:21 +02:00
Raf Schietekat
2981c6c585
StringRenderer::pen_color_: int[3]->double[3]
2017-06-02 09:58:26 +02:00
Raf Schietekat
8dad542f77
Fewer g++ -Wunused-variable warnings
2017-05-11 23:36:05 +02:00
Raf Schietekat
7f382df5ec
Fewer g++ -Wsign-compare warnings (cont.)
2017-05-11 23:14:52 +02:00
Raf Schietekat
c335508e84
Fewer g++ -Wsign-compare warnings
2017-05-11 23:14:52 +02:00
Stefan Weil
0c88b72909
training: Fix format error and some compiler warnings
...
The size() method returns a size_type value which is an unsigned type.
As there is no portable format string for that type, a type cast is needed.
Fix also several signed / unsigned mismatches which resulted in compiler
warnings.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-11 19:32:51 +02:00
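The type cast described above might look like the sketch below. `DescribeCount` is a hypothetical function, and `snprintf` stands in for whatever printing routine the training code actually uses.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not from the Tesseract sources).
// v.size() has the unsigned type size_type, so it is cast to a type
// that matches the "%u" conversion before being passed to a printf-style call.
std::string DescribeCount(const std::vector<int>& v) {
  char buffer[64];
  std::snprintf(buffer, sizeof(buffer), "%u items",
                static_cast<unsigned>(v.size()));
  return buffer;
}
```

The cast narrows the value on platforms where `size_type` is wider than `unsigned`, which is acceptable only when the counts are known to be small.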
Raf Schietekat
3983d2f76a
Reviewed uses of reinterpret_cast
2017-05-11 01:58:40 +02:00
Egor Pugin
2ea946d11c
Turn on building of text2image.
2017-05-07 20:05:12 +03:00
Ray Smith
8e79297dce
Final part of endian improvement. Adds big-endian support to lstm and fixes issue 518
2017-05-03 16:09:44 -07:00
Stefan Weil
1d6dd03bfc
training: Replace memfree by free
...
free also accepts a nullptr argument, so the code can be simplified.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-01 18:14:00 +02:00
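A minimal sketch of the simplification, using a hypothetical `Buffer` type rather than any structure from the training code:

```cpp
#include <cstdlib>

// Hypothetical owner of a malloc'ed block (not from the Tesseract sources).
struct Buffer {
  char* data = nullptr;
};

// free(nullptr) is defined to do nothing, so no null check is needed.
void ReleaseBuffer(Buffer* buffer) {
  std::free(buffer->data);
  buffer->data = nullptr;  // makes a second call harmless
}
```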
Stefan Weil
445befd3cb
Remove unused include statements for freelist.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-01 17:12:43 +02:00
Ray Smith
7a116ce8bb
More formatting fixes from clang tidy
2017-04-28 13:38:32 -07:00
Ray Smith
500bfaf315
Added std:: to some stl types
2017-04-27 17:15:35 -07:00
Ray Smith
1cc511188d
Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here.
2017-04-27 15:48:23 -07:00
Egor Pugin
0dcb6b3547
Rename cppan/cmake projects.
2017-02-23 15:39:58 +03:00
Ray Smith
f566a45b30
clang-tidy changes from sync
2017-01-25 16:20:19 -08:00
Mikhail Solomennik
e2974cf953
err -> err_exit
2017-01-20 18:50:47 +03:00
amitdo
5d627aacae
Remove code that is no longer needed
...
The code in ccutil/hashfn.h was needed for some old compilers. Now that we support MSVC >= 2010 and compilers that have good support for C++11, we can drop this code.
As a result of this file removal, we now use:
std::unordered_map
std::unordered_set
std::unique_ptr
directly in the codebase with '#include' for the needed headers.
2017-01-16 01:49:17 +02:00
Egor Pugin
442b5b731a
Fix building of training tools in shared configuration.
2016-12-17 16:19:35 +03:00
Zdenko Podobný
f8dffecf41
fix training build addition to 7c684be724
(Add missing linker flags for Leptonica)
2016-12-15 22:20:35 +01:00
Stefan Weil
7c684be724
Add missing linker flags for Leptonica
...
They were removed in commit d70f3c3663.
The old code implicitly added `-llept` by using the `AC_CHECK_LIB` macro.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-15 17:28:01 +01:00
zdenop
831e161066
Merge pull request #569 from stweil/nullptr
...
training: Replace NULL by nullptr
2016-12-15 09:05:20 +01:00
zdenop
a0201831c3
Merge pull request #576 from stweil/shellcheck
...
Fix some issues reported by shellcheck (SC2004, SC2006)
2016-12-15 08:30:30 +01:00
zdenop
da4c064c2e
Merge pull request #531 from stweil/guards
...
Fix header file guards and replace reserved identifiers
2016-12-15 08:29:32 +01:00
Stefan Weil
cb6e9e0071
training: Replace NULL by nullptr
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-14 21:08:36 +01:00
Ray Smith
81ebba0394
More makefile changes to remove cube
2016-12-14 11:17:06 -08:00
Ray Smith
9f5ba9105f
Removed dependency on cube from the code
2016-12-14 10:55:15 -08:00
Stefan Weil
b75beda7f9
Fix some issues reported by shellcheck (SC2004, SC2006)
...
Examples:
In training/tesstrain.sh line 64:
    if (( ${LINEDATA} )); then
          ^-- SC2004: $/${} is unnecessary on arithmetic variables.
In training/tesstrain.sh line 56:
    source `dirname $0`/language-specific.sh
           ^-- SC2006: Use $(..) instead of legacy `..`.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-14 14:11:24 +01:00
Stefan Weil
a9b300dc1d
Use pkg-config for icu compiler and linker flags
...
The old settings are used as fallback if there is no configuration.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-13 13:29:34 +01:00
Stefan Weil
7755e05e50
training: Update Makefile for current Mingw-w64
...
Mingw-w64 no longer needs special linker options,
builds with those options fail.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-06 23:02:47 +01:00
Stefan Weil
70c6f1624c
Fix #define guards in header files
...
Some guards were missing, others were not the first statement.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-04 15:43:03 +01:00
Stefan Weil
4897796d57
Replace reserved identifiers used in #define guards header files
...
Use macro names as suggested by the Google C++ Style Guide
(https://google.github.io/styleguide/cppguide.html#The__define_Guard ).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-04 15:43:03 +01:00
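The two header guard commits above can be illustrated with a small sketch. The file name and macro name are hypothetical; the macro follows the PROJECT_PATH_FILE_H_ pattern from the style guide, and the guard is the first statement in the file.

```cpp
// Hypothetical header: training/example_util.h (not a real Tesseract file).
// Names such as _EXAMPLE_UTIL_H or __EXAMPLE_UTIL_H__ are reserved for the
// implementation (leading underscore plus capital, or a double underscore),
// so the guard uses a project-prefixed name with a trailing underscore.
#ifndef TESSERACT_TRAINING_EXAMPLE_UTIL_H_
#define TESSERACT_TRAINING_EXAMPLE_UTIL_H_

inline int ExampleAnswer() { return 42; }

#endif  // TESSERACT_TRAINING_EXAMPLE_UTIL_H_
```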
Egor Pugin
afd069c219
Fix build.
2016-12-01 12:51:03 +03:00
Egor Pugin
68aa285dcc
Update CMakeLists.txt
2016-12-01 12:38:45 +03:00
Ray Smith
ce76d1c569
Fixes to training process to allow incremental training from a recognition model
2016-11-30 15:51:17 -08:00
Ray Smith
9d9056716f
Added std:: to vector
2016-11-30 15:45:36 -08:00
Ray Smith
53003f9074
Formatting changes from clang_tidy on latest pull
2016-11-30 15:44:25 -08:00
Stefan Weil
6158f7eae2
Simplify calls of free
...
It is not necessary to check for null pointers.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-11-24 17:59:13 +01:00
Egor Pugin
67deea5703
Fix unix build.
2016-11-24 17:39:16 +03:00
Egor Pugin
644469595c
Fix windows build.
2016-11-24 17:32:23 +03:00
zdenop
ac3b40de2f
Merge pull request #478 from stweil/w
...
Fix some compiler warnings
2016-11-22 08:30:57 +01:00
Ray Smith
5913d7344f
Added missing license headers
2016-11-18 15:53:11 -08:00
Stefan Weil
4f45940050
training: Fix compiler warnings (deprecated register keyword)
...
training/commontraining.cpp:824:3: warning:
'register' storage class specifier is deprecated and incompatible with C++1z [-Wdeprecated-register]
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-11-14 22:34:15 +01:00
Ray Smith
f24ef67df4
Limited max height to 48 even in variable height input, enabled neural nets via ocr engine mode
2016-11-08 14:01:04 -08:00
Ray Smith
c1c1e426b3
Added new LSTM-based neural network line recognizer
2016-11-07 15:38:07 -08:00
Ray Smith
5d21ecfad3
Rendering/hash map changes part 2
2016-11-07 11:56:07 -08:00
Ray Smith
a987e6d87c
Major bug fixes to pango renderer and resolved issue of hash_map vs unordered_map
2016-11-07 11:35:45 -08:00
Ray Smith
2c837dffc3
Result of clang tidy on recent merge
2016-11-07 10:46:33 -08:00
Stefan Weil
34af6155eb
training: Remove unnecessary const qualifiers
...
This fixes several gcc warnings:
warning:
type qualifiers ignored on function return type [-Wignored-qualifiers]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-10-08 11:28:22 +02:00
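A sketch of the kind of declaration that triggers -Wignored-qualifiers, using a hypothetical class rather than one from the training code:

```cpp
// Hypothetical class (not from the Tesseract sources).
class FontSample {
 public:
  explicit FontSample(int size) : size_(size) {}

  // Before the fix this would have read `const int size() const`.
  // The leading const on a by-value return type has no effect, which is
  // what gcc warns about; the trailing const on the method is kept.
  int size() const { return size_; }

 private:
  int size_;
};
```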
Zdenko Podobný
61032d9b14
set fonts_dir to system default font location. Fixes #409
2016-09-01 18:27:00 +02:00
Zdenko Podobný
916897da1b
print text2image info to stdout instead of stderr
2016-09-01 13:38:06 +02:00
Stefan Weil
6ec1a0a09b
fileio: Replace assert with tprintf() and exit(1)
...
Assertions are good for programming errors, but not for wrong user input.
The new code no longer needs File::ReadFileToStringOrDie, so remove that
method.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-08-30 07:13:56 +02:00
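The distinction drawn above can be sketched as follows. This is an illustration under assumptions, not the code from the commit: both function names are hypothetical, and `fprintf(stderr, ...)` stands in for Tesseract's `tprintf`.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: reads a whole file into *out. A bad file name is a
// user error, so it is reported and signalled with `false`, not an assert.
bool ReadFileToString(const std::string& filename, std::string* out) {
  std::FILE* stream = std::fopen(filename.c_str(), "rb");
  if (stream == nullptr) {
    std::fprintf(stderr, "Unable to open '%s' for reading\n", filename.c_str());
    return false;
  }
  out->clear();
  char buffer[4096];
  std::size_t count;
  while ((count = std::fread(buffer, 1, sizeof(buffer), stream)) > 0) {
    out->append(buffer, count);
  }
  std::fclose(stream);
  return true;
}

// What a command-line tool might do with the result: exit with status 1.
std::string ReadFileOrExit(const std::string& filename) {
  std::string contents;
  if (!ReadFileToString(filename, &contents)) std::exit(1);
  return contents;
}
```

Unlike an assertion, this path still works in builds compiled with NDEBUG and gives the user a readable message instead of an abort.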
Stefan Weil
1950fec7a2
tlog: Remove unused macro TLOG_FATAL
...
The implementation was also wrong because it did not use __VA_ARGS__.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-08-29 19:11:01 +02:00
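For comparison, a variadic logging macro that does forward its arguments could look like the sketch below. Both macros are hypothetical and are not the removed TLOG_FATAL; `FORMAT_TO` assumes its first argument is a char array, since it relies on `sizeof`.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical macros (not from the Tesseract sources).
// __VA_ARGS__ forwards the format string and every following argument.
#define FORMAT_TO(buffer, ...) \
  std::snprintf(buffer, sizeof(buffer), __VA_ARGS__)

// Prints the formatted message to stderr, then aborts.
#define LOG_FATAL(...)                 \
  do {                                 \
    std::fprintf(stderr, __VA_ARGS__); \
    std::fputc('\n', stderr);          \
    std::abort();                      \
  } while (0)
```

A macro that names `...` in its parameter list but never expands `__VA_ARGS__` silently drops the format string and its arguments, which is the kind of defect the commit message describes.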
Stefan Weil
3420acabe5
text2image: Add linefeed to error message
...
This changes the error message for a missing font from
Could not find font named Times New Roman.Please correct --font arg.
(missing space after first sentence) to
Could not find font named Times New Roman.
Please correct --font arg.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-08-28 21:33:52 +02:00
Stefan Weil
34ed8ddf62
stringrenderer: Fix compiler warning (-Wwrite-strings)
...
gcc reported this warning:
../training/stringrenderer.cpp:
In member function ‘void tesseract::StringRenderer::SetLayoutProperties()’:
../training/stringrenderer.cpp:211:42: warning:
ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings]
    set_features("liga, clig, dlig, hlig");
                                         ^
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-08-28 14:07:03 +02:00
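The warning arises when a string literal is passed where a non-const `char*` is expected. One common way to avoid it, shown here as an assumption rather than as the actual change in this commit, is to take the parameter as `const char*`. `CountFeatures` is a hypothetical function that only reuses the literal from the warning text.

```cpp
#include <cstddef>

// Hypothetical function (not from the Tesseract sources).
// Taking `const char*` lets callers pass string literals without a
// -Wwrite-strings warning. Returns the number of comma-separated entries.
std::size_t CountFeatures(const char* features) {
  if (features == nullptr || *features == '\0') return 0;
  std::size_t count = 1;
  for (const char* p = features; *p != '\0'; ++p) {
    if (*p == ',') ++count;
  }
  return count;
}
```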
zdenop
939023ffb9
Merge pull request #391 from vidiecan/issue_390
...
fixed #390 by introducing new rotate_image flag
2016-08-15 20:04:30 +02:00
jm
b69561c802
fixed #390 by introducing new rotate_image flag
2016-08-15 18:16:35 +02:00
jm
941e1c4c84
fixes #388 by using raw bytes utf8 encoding
2016-08-15 18:11:01 +02:00
jm
8d2d94e4ed
fixes some of the windows issue with text2image, see #380
2016-08-05 20:11:01 +02:00
zdenop
5ca73cca26
Merge pull request #355 from amitdo/pango-name-is-empty
...
Check that pango's suggested font name is not an empty string
2016-06-20 10:26:11 +02:00
Stefan Weil
ed053aab94
Fix Cygwin compatibility – part III
...
Commit 65504c8cd2 misplaced the #endif.
The definition of _GNU_SOURCE is only needed for Cygwin.
Defining _GNU_SOURCE on Linux results in compiler warnings because this
macro is already defined by the compiler.
Fix this by moving the #endif to the right place. In addition the code
for Cygwin is made more robust: If a future Cygwin compiler defines
_GNU_SOURCE, too, the code will still work.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-06-19 22:38:03 +02:00
amitdo
724fb894ac
Check that pango's suggested font name is not an empty string
...
On msys2 pango seems to always return an empty string for the suggested
font. It's a good idea to check that the string is not empty before
printing it - on all platforms.
2016-06-19 13:40:17 +03:00
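The check described above could be sketched like this. `FontNotFoundMessage` is a hypothetical function, and the wording of the suggestion line is an assumption; the other two lines reuse the text2image error message quoted elsewhere in this log.

```cpp
#include <string>

// Hypothetical helper (not from the Tesseract sources).
// The suggestion line is added only when Pango returned a non-empty name.
std::string FontNotFoundMessage(const std::string& requested,
                                const std::string& suggested) {
  std::string message = "Could not find font named " + requested + ".\n";
  if (!suggested.empty()) {
    // Assumed wording for the suggestion line.
    message += "Pango suggested font " + suggested + ".\n";
  }
  message += "Please correct --font arg.\n";
  return message;
}
```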
Amit
96720c785d
Merge pull request #351 from amitdo/cygwin-compat
...
Fix Cygwin compatibility
2016-06-19 12:43:35 +03:00
Stefan Weil
65504c8cd2
Fix Cygwin compatibility - Part II
2016-06-19 11:59:58 +03:00
Amit Dovev
13d789d4df
Merge pull request #288 from nickjwhite/opentypeligatures
...
Enable all ligatures available in a font for text2image rendering
2016-06-19 03:33:32 +03:00
Amit Dovev
034d666e7a
Replace use of TLOG_FATAL() with tprintf() and exit(1) ( #349 )
...
Asserts should not be used for missing or invalid input in the command
line! This leads to a bad UX.
2016-06-16 12:10:53 +03:00
Shreeshrii
c3a7fab349
Replace asserts with tprintf() and exit(1)
...
Asserts should not be used for missing or invalid input in the command
line! This leads to a bad UX.
2016-06-14 14:35:05 +03:00
amitdo
cd1a14450c
Training tools: Print help message when (argc == 1)
2016-05-22 11:16:42 +03:00
Zdenko Podobný
cab6de1740
remove unused GlyphLessFont files
2016-05-20 21:19:00 +02:00
Nick White
76ed9decb3
Only enable extra ligatures with recent Pango versions
...
Pango's opentype feature selection functions are only available
from version 1.38+, which is still quite new, so ensure it's just
ignored if using an older version.
2016-03-21 13:03:03 +00:00