Egor Pugin
a86292b111
Merge pull request #1944 from stweil/psm
...
Allow orientation detection with any traineddata
2018-10-04 18:29:45 +03:00
Stefan Weil
26bfd2b9d3
Allow orientation detection with any traineddata
...
While orientation and script detection (OSD) normally requires
osd.traineddata to detect both, it must also be possible to do
only orientation detection with eng.traineddata or any other
traineddata.
Enforce osd.traineddata only if there was no `-l` command line option.
Commit 27ce472666 was too restrictive.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-04 17:07:14 +02:00
zdenop
6b9f1f100b
Merge pull request #1943 from stweil/psm
...
Don't set page segmentation mode for hocr, pdf and tsv configs
2018-10-04 16:24:52 +02:00
Stefan Weil
ecfee53bac
Don't set page segmentation mode for hocr, pdf and tsv configs
...
Setting the page segmentation mode in those config files gives unexpected
results: the text recognized when no config or only txt is given changes
if both txt and any of hocr, pdf or tsv is chosen.
In a test set of nearly 200 pages from historical books, using
segmentation mode 1 is typically slightly better than the default,
but there are also cases where it is much worse. Therefore the user
should be able to decide which page segmentation mode is best.
Old results for hocr, pdf or tsv now need an explicit `--psm 1` for
reproduction.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-04 12:05:49 +02:00
zdenop
b15fbf1d0f
Merge pull request #1941 from Shreeshrii/master
...
Update man page and readme regarding the two OCR engines in Tesseract 4
2018-10-04 07:49:08 +02:00
Shree Devi Kumar
d160067308
Update README about both OCR engines in tesseract 4
2018-10-04 04:17:49 +00:00
Shree Devi Kumar
0c39d3446b
Update tesseract man page about both OCR engines in tesseract 4
2018-10-04 04:01:26 +00:00
zdenop
1beeeee215
fix version info in VERSION
2018-10-03 23:51:41 +02:00
zdenop
423798722f
Merge pull request #1938 from stweil/coverity
...
Fix two reports from CoverityScan and clean related code
2018-10-02 12:34:08 +02:00
Stefan Weil
04703ca8df
Fix CID 1164579 (Explicit null dereferenced)
...
The report from Coverity Scan is a false positive.
Nevertheless the code can be rewritten and optimized
a little bit to fix that report.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-02 11:48:28 +02:00
Zdenko Podobný
7dbf5a030f
print help for tesstrain.sh; fixes #1469
2018-10-02 11:35:10 +02:00
Stefan Weil
9a1f14f2aa
Fix CID 1395882 (Uninitialized scalar variable)
...
The implementation for ICOORD only allows division by scale != 0.
Do the same for FCOORD by asserting that scale != 0.0f,
so undefined program behaviour will be caught.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-02 11:34:14 +02:00
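The guard described in the commit above can be sketched as follows. This is a minimal illustration, not the Tesseract code: `Vec2f` is a hypothetical stand-in for a float coordinate pair, and the standard `assert` macro is assumed in place of whatever assertion helper the project uses.

```cpp
#include <cassert>

// Hypothetical stand-in for a float coordinate pair (not Tesseract's FCOORD).
struct Vec2f {
  float x;
  float y;
};

// Divide both components by scale. A zero divisor is treated as a
// programming error and caught by the assertion in builds where
// assertions are enabled.
inline Vec2f& operator/=(Vec2f& v, float scale) {
  assert(scale != 0.0f);
  v.x /= scale;
  v.y /= scale;
  return v;
}
```

With assertions enabled, a call such as `v /= 0.0f` stops at the assertion instead of silently producing infinities or NaNs.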
Stefan Weil
ce6ff20939
Fix comments
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-02 11:26:36 +02:00
Stefan Weil
8c56b8f58c
Move content of ipoints.h to points.h and remove ipoints.h
...
Both include files depended on each other, so it did not make sense
to separate them.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-02 11:21:27 +02:00
zdenop
57a6f1d22e
remove duplicate help from combine_lang_model
2018-10-01 21:22:51 +02:00
Egor Pugin
6ee7f4eac2
Fix typo.
2018-09-29 17:04:25 +03:00
zdenop
14b83d3090
use tprintf instead of printf so that messages can be disabled with the quiet option
...
(issue #1240)
2018-09-29 13:49:08 +02:00
zdenop
d9372662ec
add "sudo ldconfig" to install instructions. fixes #1212
2018-09-29 13:33:36 +02:00
zdenop
d5b6222856
Merge pull request #1935 from stweil/style
...
Format code and fix some style issues
2018-09-29 09:32:56 +02:00
Stefan Weil
4ec9c86226
unittest: Replace NULL by nullptr
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-29 09:27:12 +02:00
Stefan Weil
9e66fb918f
unittest: Format code
...
It was formatted with clang-format-7 -i unittest/*.{c*,h}.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-29 09:19:13 +02:00
zdenop
1a096441d0
tesseract app: check if input file exists; fixes #1023
2018-09-29 08:51:00 +02:00
Stefan Weil
0f3206d5fe
Format code (replace ( xxx ) by (xxx))
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-29 08:21:25 +02:00
Stefan Weil
63f87cac90
Simplify boolean expressions
...
Remove "? true : false" which is not needed for boolean expressions.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-29 08:21:14 +02:00
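The simplification in the commit above amounts to the following pattern. Both helper functions are hypothetical and only illustrate the idea; they are not taken from the Tesseract sources.

```cpp
// Before: the conditional operator adds nothing, because the comparison
// already yields a bool.
inline bool is_positive_verbose(int n) { return (n > 0) ? true : false; }

// After: return the boolean expression directly.
inline bool is_positive(int n) { return n > 0; }
```

Both functions return the same value for every input, so the change is purely cosmetic.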
zdenop
abe40f17c9
Win32: use the ISO C and C++ conformant name "_putenv" instead of deprecated "putenv"
2018-09-28 20:53:57 +02:00
zdenop
a0564fd4ec
Allow user to specify dpi for input image
2018-09-28 20:28:52 +02:00
zdenop
345e5ee1f3
prefer to use FreeType for pango_cairo_font_map
2018-09-28 11:07:26 +02:00
zdenop
5fe1390748
remove alpha channel from png: issue #1914
2018-09-27 19:40:15 +02:00
zdenop
971fe50031
fixed #714: use binary mode when generating pdf to stdout on Windows
2018-09-27 18:35:15 +02:00
Zdenko Podobný
5dfce7471c
fix #1889 : part 2
2018-09-26 09:28:22 +02:00
zdenop
e1245f5c54
Merge pull request #1929 from DevelopAlex/patch-1
...
Minor: Only print "Merging rows..." in debug mode
2018-09-24 14:12:17 +02:00
DevelopAlex
f69af96dbe
Only print "Merging rows..." in debug mode
...
Only print "Merging rows..." if textord_debug_blob==true (like all the other debug messages).
Otherwise, there are a lot of "Merging rows..." messages in console output.
2018-09-24 11:43:47 +02:00
Zdenko Podobný
ea007d5b33
fix version info for cmake build
2018-09-23 20:00:28 +02:00
Zdenko Podobný
c003a60410
remove outdated scripts/contrib dir
2018-09-22 23:29:34 +02:00
Zdenko Podobný
01cf7402df
add header guard
2018-09-22 18:44:26 +02:00
zdenop
02f9d8d95e
Merge pull request #1923 from stweil/errhandling
...
Don't trigger a deliberate SIGSEGV for fatal errors in release code
2018-09-20 21:58:45 +02:00
zdenop
63674d3285
Merge branch 'master' of https://github.com/tesseract-ocr/tesseract
2018-09-20 21:58:24 +02:00
Stefan Weil
5338a5a8d5
Don't trigger a deliberate SIGSEGV for fatal errors in release code
...
The error message "segmentation fault" confuses most users,
so enforce a segmentation fault only in debug code.
Release code simply calls the abort function.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-20 21:50:13 +02:00
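One way to express the behaviour described in the commit above is sketched below. The function names `fatal_signal` and `fatal_error` are hypothetical, and the sketch assumes that the standard `NDEBUG` macro distinguishes release from debug builds. It uses `std::raise` rather than a deliberate invalid memory access, so it is not necessarily how the actual change was implemented.

```cpp
#include <csignal>
#include <cstdio>
#include <cstdlib>

// Signal that a fatal error ends with: SIGSEGV in debug builds,
// SIGABRT (via std::abort) in release builds.
inline int fatal_signal() {
#ifdef NDEBUG
  return SIGABRT;
#else
  return SIGSEGV;
#endif
}

// Print the message and terminate the process.
[[noreturn]] inline void fatal_error(const char* message) {
  std::fprintf(stderr, "Fatal error: %s\n", message);
  if (fatal_signal() == SIGSEGV) {
    // Debug build: raise SIGSEGV so a debugger or core dump shows the
    // call stack at the point of failure.
    std::raise(SIGSEGV);
  }
  // Release build, or fallback if a SIGSEGV handler returned.
  std::abort();
}
```

In a release build the user sees an abort instead of a "segmentation fault" message that looks like a memory bug.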
zdenop
4ca179d3fa
remove condition because fontsize is always > 0
2018-09-20 21:48:44 +02:00
zdenop
cefb62b644
Merge pull request #1920 from stweil/errhandling
...
Don't call exit when parameter in file is unknown
2018-09-20 10:37:38 +02:00
zdenop
5f45f73de5
Merge pull request #1919 from stweil/clean
...
Remove duplicate include statements
2018-09-20 10:36:20 +02:00
Stefan Weil
741ea00d70
Don't call exit when parameter in file is unknown
...
Wrong or old parameters in traineddata files should not terminate
the program, so make that a warning instead of a fatal error.
This fixes issue #1520.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-20 08:37:33 +02:00
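The change from a fatal error to a warning can be illustrated with a small sketch. The parameter table and the `set_param` function are hypothetical and do not mirror Tesseract's parameter handling. They only show the control flow: an unknown name is reported and skipped, and processing continues.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical parameter table: name -> value.
using ParamTable = std::map<std::string, std::string>;

// Set a known parameter and return true. For an unknown name, print a
// warning and return false instead of terminating the program.
inline bool set_param(ParamTable& params, const std::string& name,
                      const std::string& value) {
  auto it = params.find(name);
  if (it == params.end()) {
    std::fprintf(stderr, "Warning: parameter not found: %s\n", name.c_str());
    return false;
  }
  it->second = value;
  return true;
}
```

A caller reading a config file can ignore the return value and move on to the next line, so one outdated entry no longer stops the whole run.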
Stefan Weil
d586b97854
Remove duplicate include statements
...
One of them was reported in issue #1843.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-19 22:33:29 +02:00
Egor Pugin
a2612f2830
Merge pull request #1918 from stweil/doc
...
Add documentation for lists of images to the tesseract man page
2018-09-19 13:44:54 +03:00
Egor Pugin
3038d66c7f
Merge pull request #1917 from cyanfish/master
...
Fix typo in cppan.yml affecting /openmp
2018-09-19 13:44:37 +03:00
Stefan Weil
a387e1f71e
Add documentation for lists of images to the tesseract man page
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-19 09:32:02 +02:00
Ben Olden-Cooligan
d6d6f3d08e
Fix typo in cppan.yml affecting /openmp
2018-09-19 02:55:23 -04:00
Zdenko Podobný
5d22fdfeed
replace deprecated C++ headers (reported by clang-tidy) - partially supersedes PR #1605
2018-09-18 18:51:11 +02:00
zdenop
62a5e8cfc3
Merge pull request #1265 from picturae/jpg_quality_option
...
Added JPEG quality option parameter (-c jpg_quality=n)
2018-09-18 11:37:37 +02:00
Jeff Breidenbach
c98391d3d7
fix #1192 bbox as the entire page
2018-09-18 08:09:11 +02:00