Egor Pugin
1f3acca03a
Merge pull request #1850 from Shreeshrii/new-branch-name
...
add option --save_box_tiff to save box/tiff pairs with lstmf files
2018-08-20 12:39:52 +03:00
Noah Metzger
663be426f6
Added the option for character accumulated glyph confidences.
...
The parameter glyph_confidences is changed from bool to int.
An execution with value 1 outputs the hOCR file enriched with glyph confidences
for every timestep like before. An execution with value 2 outputs the timesteps
accumulated over the recognized characters.
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-08-20 10:43:58 +02:00
Shree Devi Kumar
43e3f24bb0
add variable --save_box_tiff to save box/tiff pairs along with lstmf files.
2018-08-20 08:24:09 +00:00
Egor Pugin
115fe7662c
Merge pull request #1844 from Shreeshrii/new-branch-name
...
Updates to Javanese Script Validation and Training
2018-08-17 13:24:28 +03:00
zdenop
debe3da36d
remove duplicate include
2018-08-16 20:50:28 +02:00
Shree Devi Kumar
b34cf9d424
Javanese script training
2018-08-16 12:15:10 +00:00
Stefan Weil
e1c387c9b3
Fix typo in comments and variable name
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-16 11:38:36 +00:00
Stefan Weil
bf33301114
Fix typo in function name
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-16 11:38:36 +00:00
zdenop
e731324a08
Merge pull request #1841 from stweil/typo
...
Fix some typos
2018-08-14 16:51:35 +02:00
Stefan Weil
641237495a
Fix typo in comments and variable name
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-14 16:20:27 +02:00
Stefan Weil
95ed924d81
Fix typo in function name
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-14 16:20:27 +02:00
zdenop
4671e2b0bf
Merge pull request #1840 from stweil/scrollview
...
scrollview: Clean include statements
2018-08-14 13:22:36 +02:00
Stefan Weil
ce135de37c
scrollview: Clean include statements
...
cstring was included twice (reported by Martin Strunz).
Use C++ header files and sort them alphabetically.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-14 13:12:51 +02:00
Zdenko Podobný
296309c1f1
remove duplicate include. Fixes #1837
2018-08-14 13:06:14 +02:00
zdenop
fd492062d0
Merge pull request #1825 from atuyosi/fix-tesserocr-129
...
Revert Makefile.am to beta.2
2018-08-06 18:23:53 +02:00
Atsuyoshi Suzuki
4cda775d73
Revert Makefile.am to beta.2
...
tesserocr needs `osdetect.h'.
2018-08-06 23:21:20 +09:00
Egor Pugin
3b723ba102
Merge pull request #1823 from Shreeshrii/javanese
...
Add support for Javanese script - aksara Jawa
2018-08-04 16:08:23 +03:00
Shree Devi Kumar
7957288fd5
change Javanese validation to be similar to Indic
2018-08-04 09:43:53 +00:00
Shree Devi Kumar
f93f9e8a09
fix typo re Javanese
2018-08-03 14:33:24 +00:00
Shree Devi Kumar
0eb7be1cd1
Initial commit to add Aksara Jawa - Javanese script
2018-08-03 13:59:27 +00:00
zdenop
e9b4e21e6f
Merge pull request #1822 from stweil/clean
...
ColPartition: Rename median_size_ -> median_height_
2018-08-03 10:06:03 +02:00
Stefan Weil
6a0f8e8c07
ColPartition: Rename median_size_ -> median_height_
...
This implements a TODO. Rename also some related items.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-03 08:46:38 +02:00
Egor Pugin
4370714779
Merge pull request #1819 from stweil/ocl
...
Fix ImageThresholder::OtsuThresholdRectToPix for OpenCL
2018-08-02 02:01:32 +03:00
Stefan Weil
8af80b7ba6
Fix ImageThresholder::OtsuThresholdRectToPix for OpenCL
...
The ThresholdRectToPix OpenCL kernel only supports 4 channels.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-01 22:49:28 +02:00
zdenop
c044b8c916
Merge pull request #1818 from stweil/psm
...
Fix potential crash with --psm 0 and use osd.traineddata automatically
2018-08-01 16:56:56 +02:00
zdenop
d22ca6bb06
Merge pull request #1817 from noahmetzger/winfix
...
Fix issue detected by Coverity Scan
2018-08-01 16:55:56 +02:00
Stefan Weil
27ce472666
Fix potential crash with --psm 0 and use osd.traineddata automatically
...
Page segmentation mode "OSD only" requires osd.traineddata,
so use it automatically.
Report a warning if the user specified a different language.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-01 16:52:37 +02:00
Noah Metzger
65997bed16
Fix issue detected by Coverity Scan
...
CID: 1340285 (Division or modulo by zero)
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-08-01 15:56:19 +02:00
zdenop
b23568f3d1
Merge pull request #1816 from noahmetzger/winfix
...
Fix issues detected by Coverity Scan
2018-08-01 14:45:00 +02:00
Noah Metzger
d28631a274
Fix issues detected by Coverity Scan
...
CID: 1164604 (Nesting level does not match indentation)
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-08-01 14:30:13 +02:00
Egor Pugin
8bb8b75692
Merge pull request #1815 from stweil/whitespace
...
Fix whitespace issues
2018-08-01 14:54:35 +03:00
Stefan Weil
6a28cce96b
Fix whitespace issues
...
* Remove whitespace (blanks, tabs, cr) at line endings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-01 13:19:52 +02:00
zdenop
3af2773d0e
Merge pull request #1814 from noahmetzger/winfix
...
Fix issue detected by Coverity Scan
2018-08-01 11:20:13 +02:00
Noah Metzger
2d96c66126
Fix issue detected by Coverity Scan
...
CID: 1164533 (Logically dead code)
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-08-01 10:30:52 +02:00
zdenop
10259698d8
Merge pull request #1813 from stweil/fix
...
TessPDFRenderer: Improve robustness of API (issue #1804)
2018-08-01 09:17:55 +02:00
Stefan Weil
eb69dd0201
TessPDFRenderer: Improve robustness of API (issue #1804)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-08-01 09:11:04 +02:00
Egor Pugin
9ce4d05188
Merge pull request #1812 from noahmetzger/winfix
...
Fix issue reported by Coverity Scan
2018-07-31 13:52:05 +03:00
Noah Metzger
d4490af06d
Fix issue reported by Coverity Scan
...
CID: 1375395 (Dereference after null check)
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-07-31 10:43:39 +02:00
zdenop
7d99cb4e28
Merge pull request #1811 from noahmetzger/winfix
...
Fix issue reported by Coverity Scan
2018-07-31 09:53:33 +02:00
Noah Metzger
83a4eb3b44
Fix issue reported by Coverity Scan
...
CID: 1391264 (Improper use of negative value)
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-07-31 09:43:30 +02:00
zdenop
18787ea12b
Merge pull request #1808 from stweil/fix
...
Revert "Change default width for images output by text2image"
2018-07-27 08:10:37 +02:00
Stefan Weil
9cf170cb7a
Revert "Change default width for images output by text2image"
...
This reverts commit fdc243b363 because it caused a regression
reported in issue #1798.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-07-27 07:29:30 +02:00
Egor Pugin
57224bc9b5
Merge pull request #1805 from kant/patch-3
...
Minor formatting proposals
2018-07-26 20:06:02 +03:00
Egor Pugin
51c1950129
Merge pull request #1806 from stweil/training
...
training: Add new flag --workspace_dir to tesstraining_utils.sh
2018-07-26 20:05:34 +03:00
Stefan Weil
b19e69086c
training: Add new flag --workspace_dir to tesstraining_utils.sh
...
By default, that script creates two new temporary directories with random
names in /tmp.
The new command line flag --workspace_dir PATH uses the given path as
a base directory for all temporary files.
That allows more reproducible training results (no random directory
names in log files).
Signed-off-by: Stefan Weil <stweil@ub-backup.bib.uni-mannheim.de>
2018-07-26 17:14:19 +02:00
Darío Hereñú
b50073ec48
Minor formatting proposals
2018-07-26 12:00:14 -03:00
Egor Pugin
fbff323d6a
Merge pull request #1802 from noahmetzger/winfix
...
Added a feature to enrich the hOCR output with glyph confidences
2018-07-26 12:29:47 +03:00
zdenop
fc6d6fb25d
Merge pull request #1803 from kant/patch-2
...
Minor formatting proposals
2018-07-26 07:51:55 +02:00
Darío Hereñú
2315fe2a77
Minor formatting proposals
2018-07-25 22:13:50 -03:00
Noah Metzger
91c7504a35
Added a feature to enrich the hOCR output with glyph confidences
...
By using the parameter -c glyph_confidences=true, the user can enrich
the hOCR output with additional information. Tesseract then additionally
lists, for every timestep of the LSTM, all glyphs that were considered
along with their confidences.
The format of the hOCR output is slightly changed: There is now a linebreak
after every word for better readability by humans.
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-07-25 18:18:58 +02:00