Stefan Weil
9cf170cb7a
Revert "Change default width for images output by text2image"
...
This reverts commit fdc243b363 because it caused a regression reported in issue #1798.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-27 07:29:30 +02:00
Egor Pugin
57224bc9b5
Merge pull request #1805 from kant/patch-3
...
Minor formatting proposals
2018-07-26 20:06:02 +03:00
Egor Pugin
51c1950129
Merge pull request #1806 from stweil/training
...
training: Add new flag --workspace_dir to tesstraining_utils.sh
2018-07-26 20:05:34 +03:00
Stefan Weil
b19e69086c
training: Add new flag --workspace_dir to tesstraining_utils.sh
...
By default, that script creates two new temporary directories with random
names in /tmp.
The new command line flag --workspace_dir PATH uses the given path as
a base directory for all temporary files.
That makes training results more reproducible (no random directory
names in log files).
Signed-off-by: Stefan Weil <stweil@ub-backup.bib.uni-mannheim.de>
2018-07-26 17:14:19 +02:00
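The difference between random temporary directories and a fixed base directory can be sketched as follows. This is an illustrative Python sketch only, not the code of tesstraining_utils.sh (which is a shell script); the function name `make_workspace` and the subdirectory name `training` are hypothetical.

```python
import os
import tempfile


def make_workspace(workspace_dir=None, name="training"):
    """Return a directory for temporary files.

    Without workspace_dir, a randomly named directory is created, so the
    path differs between runs. With workspace_dir, the path is deterministic.
    (Hypothetical helper, not part of Tesseract.)
    """
    if workspace_dir is None:
        # Random suffix: the resulting path changes on every run.
        return tempfile.mkdtemp(prefix=name + ".")
    path = os.path.join(workspace_dir, name)
    os.makedirs(path, exist_ok=True)
    return path
```

Two calls with the same base directory return the same path, so log files that mention it stay identical across runs.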
Darío Hereñú
b50073ec48
Minor formatting proposals
2018-07-26 12:00:14 -03:00
Egor Pugin
fbff323d6a
Merge pull request #1802 from noahmetzger/winfix
...
Added a feature to enrich the hOCR output with glyph confidences
2018-07-26 12:29:47 +03:00
zdenop
fc6d6fb25d
Merge pull request #1803 from kant/patch-2
...
Minor formatting proposals
2018-07-26 07:51:55 +02:00
Darío Hereñú
2315fe2a77
Minor formatting proposals
2018-07-25 22:13:50 -03:00
Noah Metzger
91c7504a35
Added a feature to enrich the hOCR output with glyph confidences
...
With the parameter -c glyph_confidences=true, the user can enrich
the hOCR output with additional information. Tesseract then also lists,
for every timestep of the LSTM, all glyphs that were considered together
with their confidences.
The format of the hOCR output is slightly changed: There is now a linebreak
after every word for better readability by humans.
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-07-25 18:18:58 +02:00
zdenop
607e8fd85c
Merge pull request #1795 from stweil/fix
...
Fix regression (shared libraries no longer supported)
2018-07-21 13:01:15 +02:00
zdenop
390f9ed55b
Merge pull request #1796 from stweil/limit
...
Increase limit for deserialization of large arrays
2018-07-21 13:00:37 +02:00
Stefan Weil
132c540c85
Increase limit for deserialization of large arrays
...
The last limit was still too small.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-21 11:10:09 +02:00
Stefan Weil
b15624eb2f
Fix regression (shared libraries no longer supported)
...
The first usage of AC_CHECK_HEADERS must be unconditional,
otherwise configure fails to detect support for shared libraries.
This fixes a regression introduced by commit a07025c993.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-21 11:06:38 +02:00
Egor Pugin
0e1e68d843
Merge pull request #1794 from stweil/fix
...
Increase limit and add assertions for deserialization of large arrays
2018-07-20 14:04:33 +03:00
Stefan Weil
f577e292c2
Increase limit and add assertions for deserialization of large arrays
...
One of the checks was too restrictive, as lstmeval deserializes
char arrays with 14000000 elements, so raise the limit to 30000000.
That check was added in commit 992031e824.
Also add assertions which help to find such problems in debug mode.
Signed-off-by: Stefan Weil <stweil@ub-backup.bib.uni-mannheim.de>
2018-07-20 11:47:49 +02:00
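The kind of size check described above can be sketched as follows. This is a Python illustration, not Tesseract's C++ code; the on-disk layout (a little-endian 32-bit element count followed by the raw bytes) and the function name `read_char_array` are assumptions. Only the limit of 30000000 and the array size of 14000000 come from the commit message.

```python
import io
import struct

MAX_ELEMENTS = 30_000_000  # limit named in the commit message


def read_char_array(stream, limit=MAX_ELEMENTS):
    """Read an element count, then that many bytes (assumed layout)."""
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("truncated header")
    (count,) = struct.unpack("<I", header)
    # Reject implausible sizes before reading (and allocating) the data.
    if count > limit:
        raise ValueError(f"array too large: {count} > {limit}")
    data = stream.read(count)
    if len(data) != count:
        raise ValueError("truncated data")
    return data
```

With a limit below 14000000, a legitimate array of that size is rejected, which is the kind of failure the commit addresses by raising the limit.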
zdenop
364ffeb0ab
Merge pull request #1792 from stweil/mode
...
Add missing execute permission for script files
2018-07-19 20:53:36 +02:00
zdenop
62be158fd0
Merge pull request #1790 from stweil/configure
...
Clean configuration code
2018-07-19 20:53:19 +02:00
Stefan Weil
ca25d88538
Add missing execute permission for script files
...
It is needed for running the training tutorial on Linux.
The correct mode was lost when moving the files in commit 104fe7931c.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 20:25:41 +02:00
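At the file system level, restoring the execute permission corresponds to `chmod a+x`. A minimal Python sketch of that operation, assuming a POSIX system (the helper name `add_execute_permission` is hypothetical; the commit itself changes the file mode recorded in the repository):

```python
import os
import stat


def add_execute_permission(path):
    """Add execute bits for user, group and others, like `chmod a+x`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```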
Stefan Weil
58208522f0
configure: Clean code for --enable-visibility
...
* Remove unneeded arguments for AC_ARG_ENABLE
* Use [] instead of () for default in help text
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
a07025c993
configure: Clean code for --enable-opencl
...
* Remove unneeded arguments for AC_ARG_ENABLE
* Use AS_HELP_STRING
* Use [] instead of () for default in help text
* Run AC_CHECK_HEADERS, AC_CHECK_LIB only if OpenCL support is enabled
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
0ad6e3e77f
configure: Clean code for --enable-legacy
...
* Remove unneeded arguments for AC_ARG_ENABLE
* Fix formatting of help text
* Remove help text for --enable-legacy
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
e47a9272d7
configure: Clean code for --enable-graphics
...
* Remove unneeded arguments for AC_ARG_ENABLE
* Remove help text for --enable-graphics
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
cfc5ef65a2
configure: Clean code for --enable-embedded
...
* Remove unneeded arguments for AC_ARG_ENABLE
* Use AS_HELP_STRING
* Use [] instead of () for default in help text
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
11cafd7673
configure: Clean code for --enable-debug
...
* Remove unneeded arguments for AC_ARG_ENABLE (needs renaming of macro)
* Use [] instead of () for default in help text
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:33:28 +02:00
Stefan Weil
11d9d8e59a
configure: Remove macro AC_SYS_INTERPRETER
...
The macro sets interpval which is not used by Tesseract.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:19:58 +02:00
Stefan Weil
0a4edf618a
configure: Remove large file support
...
Tesseract does not handle large files (more than 2 GiB).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:19:58 +02:00
Stefan Weil
4bbebd3f7e
Remove tests for function getline
...
The Tesseract code does not use getline.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-19 16:19:58 +02:00
Egor Pugin
3a7f5e4de4
Merge pull request #1786 from stweil/serialize
...
Use new serialization API
2018-07-18 23:28:30 +03:00
Stefan Weil
b7b8dba5db
LSTMTrainer: Use new serialization API
...
Also improve portability by using int32_t instead of int
for a serialized member variable.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 19:28:05 +02:00
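Why a fixed-width type helps for serialized data can be illustrated in Python with the `struct` module. This is a sketch, not Tesseract code: the C standard only guarantees that `int` has at least 16 bits, while `int32_t` is always 32 bits wide, so a fixed-width field has the same size in every file. The little-endian byte order and the helper names are assumptions made for the example.

```python
import struct

# "<i" is a standard-size, little-endian, signed 32-bit integer:
# it always occupies exactly 4 bytes, independent of the platform.
INT32 = struct.Struct("<i")


def serialize_int32(value):
    """Encode value as exactly 4 bytes."""
    return INT32.pack(value)


def deserialize_int32(data):
    """Decode exactly 4 bytes back into an integer."""
    (value,) = INT32.unpack(data)
    return value
```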
Stefan Weil
1dcda1aa8a
LSTMRecognizer: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 19:28:05 +02:00
Stefan Weil
45a7ccf2d2
LSTM: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 19:28:05 +02:00
Stefan Weil
f4449ba41a
Convolve: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 19:28:05 +02:00
Stefan Weil
dfc3e9691f
SquishedDawg: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 19:28:05 +02:00
zdenop
790e115d1e
Merge pull request #1785 from stweil/serialize
...
Use new serialization API
2018-07-18 18:17:53 +02:00
Stefan Weil
6cf508960a
UnicharAndFonts, Shape: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 17:31:37 +02:00
Stefan Weil
07b363fec0
MasterTrainer: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 17:29:10 +02:00
Stefan Weil
88b3d940be
TessdataManager: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 17:28:13 +02:00
Stefan Weil
da0217fa75
STRING: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 17:17:22 +02:00
Stefan Weil
5e05f2cb84
IndexMap: Use new serialization API and optimize code
...
By changing the type of sparse_size_ from int to int32_t,
a local copy can be removed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 17:12:44 +02:00
Stefan Weil
edff1d1882
BitVector: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 17:07:03 +02:00
Stefan Weil
bb6c0123cc
ICOORD: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 17:02:12 +02:00
Stefan Weil
66bc012d27
UNICHARSET: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 16:22:02 +02:00
Stefan Weil
eb90068b5f
RecodedCharID: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 16:22:01 +02:00
Stefan Weil
0ca7cdd2c8
WordFeature, ImageData: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 16:22:01 +02:00
Stefan Weil
7133a6f43c
GENERIC_2D_ARRAY: Use new serialization API
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 16:22:01 +02:00
Stefan Weil
ea660f83a3
fontinfo: Use new serialization API and optimize code
...
Combine several calls of Serialize in write_spacing_info and in write_set.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 16:22:01 +02:00
zdenop
daba37f4d4
Merge pull request #1784 from stweil/serialize
...
Simplify API for serialization and add first users
2018-07-18 15:54:05 +02:00
Egor Pugin
252661a4d3
Merge pull request #1783 from stweil/clean
...
IntFeatureSpace: Remove unused DeSerialize method
2018-07-18 13:31:19 +03:00
Stefan Weil
6ef267c432
Use TFile::Serialize, TFile::DeSerialize
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 11:19:37 +02:00
Stefan Weil
c383b1aaca
TFile: Add helper functions for serialization of simple data types
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-18 11:19:37 +02:00
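The idea of typed helper functions for simple data types can be sketched as follows. This is a hypothetical Python analogue, not the TFile API: the class `MemFile`, its method signatures, the supported type names and the little-endian encoding are all assumptions for illustration.

```python
import struct


class MemFile:
    """Hypothetical in-memory file with typed serialization helpers."""

    # Type name -> struct format code (standard sizes).
    _CODES = {"int32": "i", "uint32": "I", "float": "f", "double": "d"}

    def __init__(self, data=b""):
        self._buf = bytearray(data)
        self._pos = 0

    def serialize(self, kind, values):
        """Append all values of one simple type in a single call."""
        fmt = f"<{len(values)}{self._CODES[kind]}"
        self._buf += struct.pack(fmt, *values)

    def deserialize(self, kind, count=1):
        """Read count values of one simple type from the current position."""
        fmt = f"<{count}{self._CODES[kind]}"
        size = struct.calcsize(fmt)
        chunk = bytes(self._buf[self._pos:self._pos + size])
        if len(chunk) != size:
            raise EOFError("not enough data")
        self._pos += size
        return list(struct.unpack(fmt, chunk))
```

A caller then writes `f.serialize("int32", [1, -2, 3])` and reads the values back with `f.deserialize("int32", 3)`, instead of passing raw pointers and element sizes around.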