Egor Pugin
c67c2e9f41
Add combine_lang_model to cmake and cppan builds.
2017-08-06 14:46:32 +03:00
Stefan Weil
cdec915e17
Fix broken build for Windows
...
Windows does not provide a mkdir function with two parameters.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-08-04 10:18:35 +02:00
Ray Smith
77c44cdecd
Added convert to int and directory listing to combine_tessdata
2017-08-02 14:53:07 -07:00
Ray Smith
39b168a0b6
Removed errors introduced by git merge
2017-08-02 14:12:45 -07:00
Ray Smith
4e9665debf
Added ADAM optimizer, unless git screwed it up, cos there is no diff
2017-08-02 14:03:50 -07:00
Ray Smith
2633fef0b6
Part 2 of separating out the unicharset from the LSTM model, fixing command line for training
2017-08-02 13:29:23 -07:00
Ray Smith
b0ead95d64
Changed the way unicharsets are handled to allow support for the ™ character. Can find the issue where it was requested.
2017-07-24 11:45:57 -07:00
Ray Smith
3f7735492f
Removed unnecessary using statements and cleaned up google/non-google distinction
2017-07-19 16:42:48 -07:00
Stefan Weil
5a7b7ed7e1
PangoFontInfo: Remove unused method is_italic
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:22:05 +02:00
Stefan Weil
0cd71c67c9
PangoFontInfo: Remove unused method is_bold
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:59 +02:00
Stefan Weil
fbfbf67cf9
PangoFontInfo: Remove unused method is_smallcaps
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:49 +02:00
Stefan Weil
500f913b51
PangoFontInfo: Remove unused method is_monospace
...
Remove also some macros which are no longer needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:21:35 +02:00
Stefan Weil
059e30d4cb
PangoFontInfo: Remove unused method is_fraktur
...
That restores commit 25e0c1accb and partially reverts commit 4907a23fea, which added the now unused Shlwapi library.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-07-16 07:16:38 +02:00
Egor Pugin
4907a23fea
Fix windows build.
2017-07-15 15:09:00 +03:00
Ray Smith
dc8745e6fd
Move LSTM unicharset and recoder to traineddata with version string part1. Backwards compatible - maybe.
2017-07-14 11:14:23 -07:00
Ray Smith
df41eab6aa
Added script-specific validation and normalization for virama-using scripts and updated normalization for others
2017-07-14 10:05:05 -07:00
Ray Smith
da03e4e910
Fixes from pull of cleanups: clang tidied, reviewed, fixed new bugs, undeleted needed code. Probably breaks the build, due to some inclusion of changes in utf8/32 conversion
2017-07-14 09:30:14 -07:00
Justin Hotchkiss Palermo
f057938069
fix filenames in comments
2017-07-02 17:35:47 -04:00
zdenop
59de660386
Merge pull request #969 from stweil/clean
...
PangoFontInfo: Remove some unused methods
2017-06-03 15:30:46 +02:00
Stefan Weil
2843739843
PangoFontInfo: Remove unused method is_italic
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
e420417c85
PangoFontInfo: Remove unused method is_bold
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
0d411cb5c5
PangoFontInfo: Remove unused method is_smallcaps
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:44 +02:00
Stefan Weil
8786e56084
PangoFontInfo: Remove unused method is_monospace
...
Remove also some macros which are no longer needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 11:42:19 +02:00
Egor Pugin
4ed4864dd6
Merge pull request #966 from rfschtkt/pen_color_
...
StringRenderer::pen_color_: int[3]->double[3]
2017-06-03 12:32:26 +03:00
Stefan Weil
8ec67a940d
Remove strcasestr which is no longer needed
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 10:11:44 +02:00
Stefan Weil
25e0c1accb
PangoFontInfo: Remove unused method is_fraktur
...
That allows removing a dirty hack which used the
non-portable function strcasestr.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-06-03 10:08:21 +02:00
Raf Schietekat
2981c6c585
StringRenderer::pen_color_: int[3]->double[3]
2017-06-02 09:58:26 +02:00
Raf Schietekat
8dad542f77
Fewer g++ -Wunused-variable warnings
2017-05-11 23:36:05 +02:00
Raf Schietekat
7f382df5ec
Fewer g++ -Wsign-compare warnings (cont.)
2017-05-11 23:14:52 +02:00
Raf Schietekat
c335508e84
Fewer g++ -Wsign-compare warnings
2017-05-11 23:14:52 +02:00
Stefan Weil
0c88b72909
training: Fix format error and some compiler warnings
...
The size() method returns a size_type value which is an unsigned type.
As there is no portable format string for that type, a type cast is needed.
Fix also several signed / unsigned mismatches which resulted in compiler
warnings.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-11 19:32:51 +02:00
Raf Schietekat
3983d2f76a
Reviewed uses of reinterpret_cast
2017-05-11 01:58:40 +02:00
Egor Pugin
2ea946d11c
Turn on building of text2image.
2017-05-07 20:05:12 +03:00
Ray Smith
8e79297dce
Final part of endian improvement. Adds big-endian support to lstm and fixes issue 518
2017-05-03 16:09:44 -07:00
Stefan Weil
1d6dd03bfc
training: Replace memfree by free
...
free also accepts a nullptr argument, so the code can be simplified.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-01 18:14:00 +02:00
Stefan Weil
445befd3cb
Remove unused include statements for freelist.h
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2017-05-01 17:12:43 +02:00
Ray Smith
7a116ce8bb
More formatting fixes from clang tidy
2017-04-28 13:38:32 -07:00
Ray Smith
500bfaf315
Added std:: to some stl types
2017-04-27 17:15:35 -07:00
Ray Smith
1cc511188d
Added extra Init that takes a memory buffer or a filereader function pointer to enable read of traineddata from memory or foreign file systems. Updated existing readers to use TFile API instead of FILE. This does not yet add big-endian capability to LSTM, but it is very easy from here.
2017-04-27 15:48:23 -07:00
Egor Pugin
0dcb6b3547
Rename cppan/cmake projects.
2017-02-23 15:39:58 +03:00
Ray Smith
f566a45b30
clang-tidy changes from sync
2017-01-25 16:20:19 -08:00
Mikhail Solomennik
e2974cf953
err -> err_exit
2017-01-20 18:50:47 +03:00
amitdo
5d627aacae
Remove code that is no longer needed
...
The code in ccutil/hashfn.h was needed for some old compilers. Now that we support MSVC >= 2010 and compilers that have good support for C++11, we can drop this code.
As a result of this file removal, we now use:
std::unordered_map
std::unordered_set
std::unique_ptr
directly in the codebase with '#include' for the needed headers.
2017-01-16 01:49:17 +02:00
Egor Pugin
442b5b731a
Fix building of training tools in shared configuration.
2016-12-17 16:19:35 +03:00
Zdenko Podobný
f8dffecf41
fix training build addition to 7c684be724 (Add missing linker flags for Leptonica)
2016-12-15 22:20:35 +01:00
Stefan Weil
7c684be724
Add missing linker flags for Leptonica
...
They were removed in commit d70f3c3663.
The old code implicitly added `-llept` by using the `AC_CHECK_LIB` macro.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-15 17:28:01 +01:00
zdenop
831e161066
Merge pull request #569 from stweil/nullptr
...
training: Replace NULL by nullptr
2016-12-15 09:05:20 +01:00
zdenop
a0201831c3
Merge pull request #576 from stweil/shellcheck
...
Fix some issues reported by shellcheck (SC2004, SC2006)
2016-12-15 08:30:30 +01:00
zdenop
da4c064c2e
Merge pull request #531 from stweil/guards
...
Fix header file guards and replace reserved identifiers
2016-12-15 08:29:32 +01:00
Stefan Weil
cb6e9e0071
training: Replace NULL by nullptr
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-14 21:08:36 +01:00
Ray Smith
81ebba0394
More makefile changes to remove cube
2016-12-14 11:17:06 -08:00
Ray Smith
9f5ba9105f
Removed dependency on cube from the code
2016-12-14 10:55:15 -08:00
Stefan Weil
b75beda7f9
Fix some issues reported by shellcheck (SC2004, SC2006)
...
Examples:
In training/tesstrain.sh line 64:
if (( ${LINEDATA} )); then
^-- SC2004: $/${} is unnecessary on arithmetic variables.
In training/tesstrain.sh line 56:
source `dirname $0`/language-specific.sh
^-- SC2006: Use $(..) instead of legacy `..`.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-14 14:11:24 +01:00
Stefan Weil
a9b300dc1d
Use pkg-config for icu compiler and linker flags
...
The old settings are used as fallback if there is no configuration.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-13 13:29:34 +01:00
Stefan Weil
7755e05e50
training: Update Makefile for current Mingw-w64
...
Mingw-w64 no longer needs special linker options,
builds with those options fail.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-06 23:02:47 +01:00
Stefan Weil
70c6f1624c
Fix #define guards in header files
...
Some guards were missing, others were not the first statement.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-04 15:43:03 +01:00
Stefan Weil
4897796d57
Replace reserved identifiers used in #define guards header files
...
Use macro names as suggested by the Google C++ Style Guide
(https://google.github.io/styleguide/cppguide.html#The__define_Guard).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-12-04 15:43:03 +01:00
Egor Pugin
afd069c219
Fix build.
2016-12-01 12:51:03 +03:00
Egor Pugin
68aa285dcc
Update CMakeLists.txt
2016-12-01 12:38:45 +03:00
Ray Smith
ce76d1c569
Fixes to training process to allow incremental training from a recognition model
2016-11-30 15:51:17 -08:00
Ray Smith
9d9056716f
Added std:: to vector
2016-11-30 15:45:36 -08:00
Ray Smith
53003f9074
Formatting changes from clang_tidy on latest pull
2016-11-30 15:44:25 -08:00
Stefan Weil
6158f7eae2
Simplify calls of free
...
It is not necessary to check for null pointers.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-11-24 17:59:13 +01:00
Egor Pugin
67deea5703
Fix unix build.
2016-11-24 17:39:16 +03:00
Egor Pugin
644469595c
Fix windows build.
2016-11-24 17:32:23 +03:00
zdenop
ac3b40de2f
Merge pull request #478 from stweil/w
...
Fix some compiler warnings
2016-11-22 08:30:57 +01:00
Ray Smith
5913d7344f
Added missing license headers
2016-11-18 15:53:11 -08:00
Stefan Weil
4f45940050
training: Fix compiler warnings (deprecated register keyword)
...
training/commontraining.cpp:824:3: warning: 'register' storage class specifier is deprecated and incompatible with C++1z [-Wdeprecated-register]
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-11-14 22:34:15 +01:00
Ray Smith
f24ef67df4
Limited max height to 48 even in variable height input, enabled neural nets via ocr engine mode
2016-11-08 14:01:04 -08:00
Ray Smith
c1c1e426b3
Added new LSTM-based neural network line recognizer
2016-11-07 15:38:07 -08:00
Ray Smith
5d21ecfad3
Rendering/hash map changes part 2
2016-11-07 11:56:07 -08:00
Ray Smith
a987e6d87c
Major bug fixes to pango renderer and resolved issue of hash_map vs unordered_map
2016-11-07 11:35:45 -08:00
Ray Smith
2c837dffc3
Result of clang tidy on recent merge
2016-11-07 10:46:33 -08:00
Stefan Weil
34af6155eb
training: Remove unnecessary const qualifiers
...
This fixes several gcc warnings:
warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-10-08 11:28:22 +02:00
Zdenko Podobný
61032d9b14
set fonts_dir to system default font location. Fixes #409
2016-09-01 18:27:00 +02:00
Zdenko Podobný
916897da1b
print text2image info to stdout instead of stderr
2016-09-01 13:38:06 +02:00
Stefan Weil
6ec1a0a09b
fileio: Replace assert with tprintf() and exit(1)
...
Assertions are good for programming errors, but not for wrong user input.
The new code no longer needs File::ReadFileToStringOrDie, so remove that
method.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-08-30 07:13:56 +02:00
Stefan Weil
1950fec7a2
tlog: Remove unused macro TLOG_FATAL
...
The implementation was also wrong because it did not use __VA_ARGS__.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-08-29 19:11:01 +02:00
Stefan Weil
3420acabe5
text2image: Add linefeed to error message
...
This changes the error message for a missing font from
Could not find font named Times New Roman.Please correct --font arg.
(missing space after first sentence) to
Could not find font named Times New Roman.
Please correct --font arg.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-08-28 21:33:52 +02:00
Stefan Weil
34ed8ddf62
stringrenderer: Fix compiler warning (-Wwrite-strings)
...
gcc reported this warning:
../training/stringrenderer.cpp: In member function ‘void tesseract::StringRenderer::SetLayoutProperties()’:
../training/stringrenderer.cpp:211:42: warning: ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings]
set_features("liga, clig, dlig, hlig");
^
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-08-28 14:07:03 +02:00
zdenop
939023ffb9
Merge pull request #391 from vidiecan/issue_390
...
fixed #390 by introducing new rotate_image flag
2016-08-15 20:04:30 +02:00
jm
b69561c802
fixed #390 by introducing new rotate_image flag
2016-08-15 18:16:35 +02:00
jm
941e1c4c84
fixes #388 by using raw bytes utf8 encoding
2016-08-15 18:11:01 +02:00
jm
8d2d94e4ed
fixes some of the windows issue with text2image, see #380
2016-08-05 20:11:01 +02:00
zdenop
5ca73cca26
Merge pull request #355 from amitdo/pango-name-is-empty
...
Check that pango's suggested font name is not an empty string
2016-06-20 10:26:11 +02:00
Stefan Weil
ed053aab94
Fix Cygwin compatibility – part III
...
Commit 65504c8cd2 misplaced the #endif.
The definition of _GNU_SOURCE is only needed for Cygwin.
Defining _GNU_SOURCE on Linux results in compiler warnings because this
macro is already defined by the compiler.
Fix this by moving the #endif to the right place. In addition the code
for Cygwin is made more robust: If a future Cygwin compiler defines
_GNU_SOURCE, too, the code will still work.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-06-19 22:38:03 +02:00
amitdo
724fb894ac
Check that pango's suggested font name is not an empty string
...
On msys2 pango seems to always return an empty string for the suggested
font. It's a good idea to check that the string is not empty before
printing it - on all platforms.
2016-06-19 13:40:17 +03:00
Amit
96720c785d
Merge pull request #351 from amitdo/cygwin-compat
...
Fix Cygwin compatibility
2016-06-19 12:43:35 +03:00
Stefan Weil
65504c8cd2
Fix Cygwin compatibility - Part II
2016-06-19 11:59:58 +03:00
Amit Dovev
13d789d4df
Merge pull request #288 from nickjwhite/opentypeligatures
...
Enable all ligatures available in a font for text2image rendering
2016-06-19 03:33:32 +03:00
Amit Dovev
034d666e7a
Replace use of TLOG_FATAL() with tprintf() and exit(1) (#349)
...
Asserts should not be used for missing or invalid input in the command
line! This leads to a bad UX.
2016-06-16 12:10:53 +03:00
Shreeshrii
c3a7fab349
Replace asserts with tprintf() and exit(1)
...
Asserts should not be used for missing or invalid input in the command
line! This leads to a bad UX.
2016-06-14 14:35:05 +03:00
amitdo
cd1a14450c
Training tools: Print help message when (argc == 1)
2016-05-22 11:16:42 +03:00
Zdenko Podobný
cab6de1740
remove unused GlyphLessFont files
2016-05-20 21:19:00 +02:00
Nick White
76ed9decb3
Only enable extra ligatures with recent Pango versions
...
Pango's opentype feature selection functions are only available
from version 1.38+, which is still quite new, so ensure it's just
ignored if using an older version.
2016-03-21 13:03:03 +00:00
Nick White
9100adcbde
Enable all ligatures available in a font for text2image rendering
...
This enables all OpenType ligatures for a specific font, where
available. Specifically, it explicitly enables the OpenType
features liga (standard ligatures), hlig (historical ligatures),
clig (contextual ligatures), and dlig (discretionary ligatures).
This feature requires Pango 1.38 or newer.
2016-03-21 11:41:36 +00:00
Amit Dovev
96c2f637fd
Add missing % char from format specifier in tlog()
...
- In training/pango_font_info.cpp
2016-03-17 01:09:46 +02:00
Egor Pugin
4d4bfb552c
Add inactivity timeout for icu download on windows
2016-03-04 12:34:01 +03:00
Ryan Baumann
bd5452d40c
Add Junicode to neo-Latin fonts
2016-01-13 10:15:57 -05:00
Ryan Baumann
5b40277d08
Use different font list and exposures for "lat" language training
2016-01-04 11:48:02 -05:00