246 Commits

Author SHA1 Message Date
Stefan Weil
16553014e0 Replace references to the old wiki by new URLs
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-02-03 11:37:41 +01:00
Stefan Weil
3d1f82d0e2 tesstrain.sh: Fix command line flag --help
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-01-05 10:10:55 +01:00
Stefan Weil
d2a2292f32 mftraining: Fix compiler warning
powerpc64le-linux-gnu-g++ warning:

    src/training/mftraining.cpp:209:5: warning:
        ‘%04d’ directive output may be truncated writing between 4 and 10 bytes
        into a region of size 8 [-Wformat-truncation=]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-01-03 10:13:58 +01:00
amitdo
502ebe8ca9 Autotools: Pango, Cairo and ICU only required by training tools 2019-12-16 17:23:06 +02:00
Stefan Weil
6181acf367 automake: Flat build for src/cutil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
cafb1bbfd7 automake: Flat build for src/api
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Shreeshrii
99dfa8a680 Add separator and training_iteration to checkpoint name (#2752)
* Add separator and training_iteration to checkpoint name
* specify modelname_N.NN_NN_NN.checkpoint for intermediate checkpoint
2019-11-09 12:22:40 +01:00
maungd@battelle.org
3d7afb69ea Exposed the text2image option --ptsize to tesstrain.sh. Text2image has the
option --ptsize which defaults to 12.  This option is not exposed through
tesstrain.sh; thus, you cannot use tesstrain.sh to explore training with
different font sizes.  I made a small modification to expose the --ptsize
option to tesstrain.sh.  It defaults to 12 if not specified.
2019-11-01 15:10:58 -04:00
Egor Pugin
2bcc9d8093 Remove cppan build. 2019-10-30 21:37:38 +03:00
Egor Pugin
2a37f5dd62 Update includes to use <>. 2019-10-29 14:50:11 +03:00
amitdo
2f8884a64e Fix autotools build 2019-10-28 21:23:58 +02:00
amitdo
e1bae15547 Fix #include path of public headers 2019-10-28 19:10:30 +02:00
zdenop
fc629eae3b Subject: training: show error description for open/delete file 2019-10-21 16:31:57 +02:00
zdenop
36dc2ccf75 fix memory leak at PangoFontInfo::CanRenderString 2019-10-20 16:43:04 +02:00
zdenop
1ec34378d9 test for synthesized font faces. 2019-10-19 15:05:28 +02:00
zdenop
cbbe45d94b cmake: add minimum required version for pango and icu based on autotools 2019-10-19 15:00:49 +02:00
zdenop
37c7a5dd82 text2image: show pango version 2019-10-19 14:52:06 +02:00
Stefan Weil
994ec697d8 Remove member functions STRING::string and StringParam::string
They were redundant because there exist member functions 'c_str' which do the same.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-23 08:33:08 +02:00
Stefan Weil
a730b5c4ff Remove STRING from the public Tesseract API
Removing STRING from genericvector.h allows eliminating the proprietary
STRING data type from the public Tesseract API.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-22 20:32:28 +02:00
Stefan Weil
8cb677d6a2 Replace STRING arguments for LoadDataFromFile and SaveDataToFile
This is a step to eliminate the proprietary STRING data type
from the public Tesseract API.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-22 20:32:28 +02:00
Stefan Weil
97dda3d535 Fix CID 1386099 (Uninitialized pointer field)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-14 15:43:50 +02:00
Stefan Weil
951f442303 Fix CID 1386105 (Logically dead code)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-14 15:43:50 +02:00
Stefan Weil
64fc205e78 Fix CID 1402767 (Invalid type in argument to printf format specifier)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-14 15:43:50 +02:00
Stefan Weil
43b2e9513b lstmtrainer: Fix diagnostic message
Signed character values must be converted to unsigned integers for %x.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-15 14:31:32 +02:00
Stefan Weil
100d8cd29b lstmtester: Add missing space in log messages
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-14 14:12:47 +02:00
Stefan Weil
e84cb24def Move source files which are used for training only to src/training
They are moved from src/classify and src/lstm to src/training.

This reduces the size of the Tesseract library.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-12 17:08:08 +02:00
Stefan Weil
315dd9df3f cmake: Don't link pthread on Windows
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-08-07 15:24:00 +02:00
Zdenko Podobný
c5a50b93ce move fileio.cpp and fileio.h to training (this fix android build) 2019-08-04 21:26:39 +02:00
Egor Pugin
c58efee4ba Use pangocairo-1.43 for the moment. Remove private pango header. 2019-08-01 11:55:18 +03:00
Egor Pugin
f1a567e814 Try to fix #2599 2019-08-01 11:35:15 +03:00
Stefan Weil
23ef93ac4d cmake: Add missing pthread library
It is needed for C++ threads since commit 85068be405.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-26 07:45:51 +02:00
Stefan Weil
a2b13b49ff Simplify shell code (fixes warning from Codacy)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-17 21:33:24 +02:00
Stefan Weil
467f8f4140 Fix training script for macOS (issue #2578)
Bash on macOS does not support "|&":

    tesstrain_utils.sh: line 80: syntax error near unexpected token `&'

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-17 17:18:44 +02:00
Stefan Weil
fcfdb7e56f Remove unused include statements
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-15 14:48:31 +02:00
Stefan Weil
85068be405 lstmtester: Replace SVSync::StartThread by std::thread
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-15 14:30:51 +02:00
Stefan Weil
93427391c1 Replace SVAutoLock by std::lock_guard
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-15 12:01:28 +02:00
Stefan Weil
36026e3c35 Replace SVMutex by std::mutex
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-15 12:01:28 +02:00
Stefan Weil
bdc7abf518 Fix format strings for size_t arguments (CID 1402762, 1402767)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-10 16:57:19 +02:00
Egor Pugin
3b6f071ee8 Implement CMake+SW build.
Currently only Windows is supported.
You could try it as following:

    mkdir build_sw && cd build_sw && cmake .. -DSW_BUILD=1
2019-07-08 18:50:30 +03:00
zhuangzhuang1988
18c67f4989 fix tesstrain.py error 2019-07-08 14:35:17 +08:00
Stefan Weil
1c1eb76c36 Use C++-11 code instead of TessCallback for Dawg::iterate_words
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-04 16:03:30 +02:00
Stefan Weil
eeec9c66d4 training: Use C++-11 code for TestCallback
This allows removing more code from tesscallback.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-04 16:03:30 +02:00
zhuangzhuang1988
99cb088708 close log file handle before move it. 2019-07-01 10:53:12 +08:00
zhuangzhuang1988
a3a361f73d fix logger file encoding error. 2019-06-28 18:29:52 +08:00
Stefan Weil
ea20bf0373 Remove dummy code from LSTMTrainer::InitTensorFlowNetwork
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-22 21:01:40 +02:00
Stefan Weil
41f91c96c8 cmake: Build training tools also on Linux and macOS
This enables CI tests for the code in src/training on Linux and macOS.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-22 20:27:56 +02:00
Stefan Weil
df98bb7368 Move LSTMTrainer from libtesseract to libtesseract_training
LSTMTrainer is only used for training, so the shared library for
Tesseract can be made smaller.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-22 16:23:51 +02:00
Stefan Weil
bd13069fe8 Simplify class LSTMTrainer
The function pointers and callbacks file_reader_, file_writer_,
checkpointer_reader_ and checkpoint_writer_ are always set to
the same values. Replacing them by direct function calls
simplifies the code and allows removing more code from tesscallback.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-06-22 09:18:13 +02:00
zdenop
60b4c68d31 tesstrain_utils.sh: remove redundant code 2019-06-20 18:42:29 +02:00
zdenop
60aee9f821 create OUTPUT_DIR did not exist; fixes #2497 2019-06-16 15:07:16 +02:00
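
Note on commit 43b2e9513b ("Signed character values must be converted to unsigned integers for %x"): the following is a minimal sketch of that kind of fix, not the actual change made in lstmtrainer. The helper name HexByte is hypothetical and does not appear in the Tesseract sources. It assumes a platform where plain char is signed, so a byte such as 0xe4 would otherwise be sign-extended when promoted for the %x conversion.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Tesseract): format one byte as two
// lowercase hex digits.
std::string HexByte(char c) {
  char buf[3];  // two hex digits plus the terminating NUL
  // Go through unsigned char first. Without this step a negative char
  // would be promoted to a negative int and printed as ffffffXX.
  const unsigned value = static_cast<unsigned char>(c);
  std::snprintf(buf, sizeof(buf), "%02x", value);
  return std::string(buf);
}
```

Calling HexByte on a byte with the high bit set then yields two digits rather than a sign-extended eight-digit value.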