542 Commits

Author SHA1 Message Date
zdenop
cbef2ebe12 implement patches vcpkg tesseract 2018-11-08 21:37:47 +01:00
zdenop
7a7f226228 ocrclass: Remove unused macros
Signed-off-by: Stefan Weil <sw@weilnetz.de>

# Conflicts:
#	src/ccutil/ocrclass.h
2018-11-08 20:23:36 +01:00
Zdenko Podobný
2dd753ee4c replace VS implementation of gettimeofday with std::chrono::steady_clock::now(); fixes #2038 2018-11-08 19:43:46 +01:00
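
The commit above swaps a Visual Studio `gettimeofday` replacement for `std::chrono::steady_clock::now()`. A minimal sketch of measuring elapsed time with that clock follows; the helper name `ElapsedMs` is hypothetical and not taken from the Tesseract sources.

```cpp
#include <chrono>

// Hypothetical helper (not from the Tesseract sources):
// returns the number of whole milliseconds elapsed since `start`.
// steady_clock is monotonic, so the result is never negative and is
// not affected by adjustments of the system wall clock.
inline long long ElapsedMs(std::chrono::steady_clock::time_point start) {
  const auto now = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(now - start)
      .count();
}
```

A caller would store `std::chrono::steady_clock::now()` before the work and pass it to `ElapsedMs` afterwards.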
chrismamo1
439dfaaf8b un-fix one of the warnings 2018-10-30 18:10:48 -06:00
chrismamo1
30be5aaaac fix a couple minor compiler warnings 2018-10-30 18:00:32 -06:00
Stefan Weil
6f8bd340d9 Remove chopper.h
It is no longer needed after some reordering of code in chopper.cpp.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-29 19:51:44 +01:00
Stefan Weil
286dfb031a Remove unused include statements
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-29 19:46:58 +01:00
Stefan Weil
2098bb6daf Remove unused function ComputeOrientation
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-29 19:43:56 +01:00
Stefan Weil
cad6ebb5ff LIST: Remove old comments
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-29 19:43:56 +01:00
zdenop
99054f10c7
Merge pull request #2027 from stweil/warn
Fix compiler warning
2018-10-24 07:31:15 +02:00
Stefan Weil
eefb8348f7 Fix compiler warning
Compiler warning on macOS:

    tesscallback.h:29:7: warning:
      'TessClosure' has no out-of-line virtual method definitions;
      its vtable will be emitted in every translation unit [-Wweak-vtables]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-23 17:01:53 +02:00
Noah Metzger
f7f5f41073 Fixed a mac compiler warning in recodebeam.cpp
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-10-23 16:57:39 +02:00
zdenop
e60318f9c0 set PANGOCAIRO_BACKEND=fc to avoid crash; fixes #736 2018-10-23 13:22:38 +02:00
Zdenko Podobný
3d508a65a7 set unlv_tilde_crunching to false; fixes #1449 #948 2018-10-23 09:26:32 +02:00
Stefan Weil
7ebbb7370a ColPartition: Fix CID 1164543 (Division or modulo by float zero)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-22 22:14:15 +02:00
Stefan Weil
eaabe4a3ce ErrorCounter: Fix CID 1164538 (Division or modulo by float zero)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-22 22:14:15 +02:00
Stefan Weil
8f615d44f1 osdetect: Fix CID 1164539 (Division or modulo by float zero)
Avoid also a conversion from int16_t to double to float.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-22 22:14:15 +02:00
Stefan Weil
be0cf03778 tesseractmain: Fix memory leak
Commit 49d7df6dc3 introduced a memory leak
when the output file could not be created.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-22 18:50:47 +02:00
Stefan Weil
9c0799314e Add parentheses in boolean expression
This fixes a compiler warning:

    scanutils.cpp:444:32: warning:
        '&&' within '||' [-Wlogical-op-parentheses]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-22 17:48:17 +02:00
Stefan Weil
0f973e1d62 Add missing 'static' keyword
This fixes a compiler warning:

    globaloc.cpp:33:6: warning: no previous extern declaration for
      non-static variable 'global_crash_pixes'
      [-Wmissing-variable-declarations]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-22 17:48:17 +02:00
Stefan Weil
a71ad455be Remove unused macros
This fixes some compiler warnings:

    mainblk.cpp:28:9: warning: macro is not used [-Wunused-macros]
    mainblk.cpp:29:9: warning: macro is not used [-Wunused-macros]
    [...]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-22 17:48:17 +02:00
zdenop
dba7f456d5
Merge pull request #2018 from stweil/sort
Get sorted list of available languages
2018-10-22 16:06:42 +02:00
Matthias Geerdsen
eac2880c24 avoid unbound variable TESSDATA_PREFIX
set TESSDATA_PREFIX as empty, if not defined in environment to avoid an
unbound variable
2018-10-22 14:28:14 +02:00
Stefan Weil
d75ef80f12 Get sorted list of available languages
TessBaseAPI::GetAvailableLanguagesAsVector returned the list of languages
without sorting, so the result was random and not user friendly.

Now `tesseract --list-langs` shows the available languages and scripts
in alphabetic order.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-22 14:07:03 +02:00
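
The effect described above (a stable, alphabetic listing instead of an order that depends on how the entries were found) can be sketched as follows. The function `SortedLanguages` is a hypothetical stand-in and does not reproduce the actual change in `TessBaseAPI::GetAvailableLanguagesAsVector`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in (not the Tesseract implementation):
// takes language names in arbitrary order, for example as returned by a
// directory scan, and returns them in ascending byte-wise order.
std::vector<std::string> SortedLanguages(std::vector<std::string> langs) {
  std::sort(langs.begin(), langs.end());
  return langs;
}
```

Taking the vector by value keeps the caller's list untouched; sorting in place would work equally well if the caller does not need the original order.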
Matthias Geerdsen
95d9c8c57a set default values for unset variables
setting default values for possibly unset variables avoids unbound
variable errors
2018-10-21 21:30:52 +02:00
Matthias Geerdsen
7b32e64564 add shebang 2018-10-21 21:30:13 +02:00
zdenop
32c1e4f433 FLAGS_webtext_prefix: unbound variable; issue #2005 2018-10-21 14:00:06 +02:00
Stefan Weil
34a89e54db Fix function ScrollViewCommand
The format string which builds the command only takes one or two
string arguments, so the function allocated too much memory and
passed too many arguments to snprintf.

This also fixes a compiler warning (clang).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-21 08:13:16 +02:00
zdenop
4d3b0bc798 use <cstdio> instead of <stdio.h> 2018-10-20 21:46:40 +02:00
zdenop
8103d17c72 use _strdup instead of strdup in MSVC 2018-10-20 21:43:38 +02:00
zdenop
a033261f63 add info about used backend in text2image 2018-10-20 21:41:09 +02:00
Stefan Weil
e232114089 Fix use of undefined macro USE_DEVICE_SELECTION
This fixes compiler warnings.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-20 13:58:12 +02:00
Zdenko Podobný
486940687c Exit training script if run command failed; fixes #2005 2018-10-20 13:00:39 +02:00
Egor Pugin
5a4288f2fc
Merge pull request #2011 from stweil/fix
Small fix and optimization
2018-10-20 13:48:51 +03:00
Zdenko Podobný
1a523006a6 install training script with autotools. 2018-10-20 12:33:07 +02:00
Stefan Weil
b0ace0e850 ScrollView: Optimize local table_colors
It is constant, and the values are in the range 0...255,
so its size can be reduced.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-20 12:05:38 +02:00
Stefan Weil
d364750cb3 Remove type cast and fix compiler warning (-Wcast-qual)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-20 12:04:46 +02:00
Zdenko Podobný
1b2bda65e0 Revert "prefer to use FreeType for pango_cairo_font_map"
This reverts commit 345e5ee1f3.
2018-10-20 11:30:07 +02:00
Zdenko Podobný
276c6845ae Revert "free PangoFontMap; fixes #1999"
This reverts commit d1d73b9888.
2018-10-20 11:28:20 +02:00
Zdenko Podobný
a03f23e05e Merge branch 'master' of https://github.com/tesseract-ocr/tesseract 2018-10-20 11:26:23 +02:00
Marco Atzeri
ebbd4e3efc fixes #426; define NOUNDEFINED for cygwin 2018-10-20 11:25:28 +02:00
Stefan Weil
b40151c200 training: Don't hide global variables
This fixes two warnings from LGTM:

    Parameter feature_defs hides a global variable with the same name.
    Parameter Config hides a global variable with the same name.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-19 22:37:37 +02:00
Stefan Weil
bb181ec8d3 Rename API function from GetBestLSTMChoices to GetBestLSTMSymbolChoices
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-19 10:50:38 +02:00
Stefan Weil
df7d1e1f97 Rename API function for getting LSTM choices
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-19 10:50:38 +02:00
Stefan Weil
830b9c715a BLOBNBOX: Declare signed bit field
This fixes a warning from LGTM:

    Bit field area of type int should have explicitly unsigned integral,
    explicitly signed integral, or enumeration type.

Maybe area should be unsigned, but that would require lots of other
changes, so for now signedness is not changed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-19 10:30:05 +02:00
Stefan Weil
d9c472b988 cluster: Fix some potential overflows
This fixes several issues reported by LGTM:

    Multiplication result may overflow 'int'
    before it is converted to 'size_type'.

    Multiplication result may overflow 'float'
    before it is converted to 'double'.

    Multiplication result may overflow 'int'
    before it is converted to 'unsigned long'.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-19 10:23:17 +02:00
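
The warnings quoted above concern products that are computed in a narrow type and only afterwards converted to a wider one. A minimal sketch of the usual remedy, widening one operand before multiplying, is shown below. Both function names are hypothetical and the code is not taken from cluster.cpp.

```cpp
#include <cstdint>

// Hypothetical helpers (not from the Tesseract sources).

// Assumes rows and cols are non-negative. Casting one operand first makes
// the whole multiplication happen in 64 bits instead of in int.
inline std::uint64_t ElementCount(int rows, int cols) {
  return static_cast<std::uint64_t>(rows) * static_cast<std::uint64_t>(cols);
}

// Casting one operand to double makes the product a double computation,
// so it cannot overflow the range of float before the conversion.
inline double WideProduct(float a, float b) {
  return static_cast<double>(a) * b;
}
```

Writing `static_cast<std::uint64_t>(rows * cols)` instead would not help, because the cast would be applied to a product that has already been evaluated in `int`.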
Zdenko Podobný
d1d73b9888 free PangoFontMap; fixes #1999 2018-10-19 00:48:20 +02:00
zdenop
bbe7a4cc10
Merge pull request #2002 from stweil/err
Show error message when output file could not be created
2018-10-18 19:27:01 +02:00
Stefan Weil
49d7df6dc3 tesseractmain: Show error message when output file could not be created
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-18 19:22:49 +02:00
Stefan Weil
b0b8dfbc81 TessResultRenderer: Extend API to access status of renderer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-18 19:22:48 +02:00
Stefan Weil
f0c9b753c6 BlamerBundle: Add declaration for copy assignment operator
It does not need an implementation as it is currently not used.

This fixes a warning from LGTM:

    No matching copy assignment operator in class BlamerBundle.
    It is good practice to match a copy constructor
    with a copy assignment operator.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-18 15:36:32 +02:00
Stefan Weil
e3658bbc78 C_OUTLINE_FRAG: Add declaration for copy constructor
It does not need an implementation as it is currently not used.

This fixes a warning from LGTM:

    No matching copy constructor in class C_OUTLINE_FRAG.
    It is good practice to match a copy assignment operator
    with a copy constructor.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-18 15:31:45 +02:00
Stefan Weil
5585ed8d85 ROW: Add declaration for copy constructor
It does not need an implementation as it is currently not used.

This fixes a warning from LGTM:

    No matching copy constructor in class ROW.
    It is good practice to match a copy assignment operator
    with a copy constructor.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-18 15:31:10 +02:00
Stefan Weil
a1f0c66be1 BLOB_CHOICE: Add copy assignment operator
This fixes a warning from LGTM:

    No matching copy assignment operator in class BLOB_CHOICE.
    It is good practice to match a copy constructor
    with a copy assignment operator.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-18 15:29:07 +02:00
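
The LGTM rule quoted in this group of commits asks that a class with a user-defined copy constructor also has a matching copy assignment operator, and vice versa. A minimal sketch of such a pair for a class that owns memory is given below; the class `Buffer` is hypothetical and unrelated to `BLOB_CHOICE` or the other Tesseract classes named above.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical example class (not from the Tesseract sources).
class Buffer {
 public:
  explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

  // Copy constructor: makes a deep copy of the other buffer.
  Buffer(const Buffer& other)
      : size_(other.size_), data_(new int[other.size_]) {
    std::copy(other.data_, other.data_ + size_, data_);
  }

  // Matching copy assignment operator: allocates and copies first, then
  // releases the old storage, and tolerates self-assignment.
  Buffer& operator=(const Buffer& other) {
    if (this != &other) {
      int* fresh = new int[other.size_];
      std::copy(other.data_, other.data_ + other.size_, fresh);
      delete[] data_;
      data_ = fresh;
      size_ = other.size_;
    }
    return *this;
  }

  ~Buffer() { delete[] data_; }

  int& operator[](std::size_t i) { return data_[i]; }
  std::size_t size() const { return size_; }

 private:
  std::size_t size_;
  int* data_;
};
```

Without the assignment operator, the compiler-generated one would copy only the pointer, and both objects would later delete the same array.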
Stefan Weil
7100a14636 ParamsTrainingHypothesis: Add copy assignment operator
This fixes a warning from LGTM:

    No matching copy assignment operator in class ParamsTrainingHypothesis.
    It is good practice to match a copy constructor
    with a copy assignment operator.

Use also a simpler expression for the size of features.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-18 15:28:12 +02:00
Stefan Weil
0bbd5c5d1c LineHypothesis: Add copy assignment operator
This fixes a warning from LGTM:

    No matching copy assignment operator in class LineHypothesis.
    It is good practice to match a copy constructor
    with a copy assignment operator.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-18 15:23:28 +02:00
Noah Metzger
c13371d6e0 Renamed GetGlyphConfidences() to GetChoices() and glyph_confidences to lstm_choice_mode
Renamed the global attribute glyph_confidences to lstm_choice_mode and the method GetGlyphConfidences() to GetChoices(). All variables and comments contained in related methods were renamed as well.

Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2018-10-17 16:43:39 +02:00
zdenop
e93e8f063f
Merge pull request #1994 from stweil/lgtm
Fix several warnings from LGTM
2018-10-16 18:18:43 +02:00
Stefan Weil
4b800ccaa7 Fix sum computation in higher precision
This also fixes two warnings from LGTM:

    Multiplication result may overflow 'float'
    before it is converted to 'double'.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 18:01:27 +02:00
Stefan Weil
fd84f7b666 LLSQ: Replace sqrt by std::sqrt
This should fix warnings from LGTM:

    Multiplication result may overflow 'float'
    before it is converted to 'double'.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 17:57:26 +02:00
Stefan Weil
7c2af45713 Fix sum computation in higher precision
This also fixes two warnings from LGTM:

    Multiplication result may overflow 'float'
    before it is converted to 'double'.

Replace also FALSE / TRUE by false / true for bool return value.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 17:50:12 +02:00
Stefan Weil
1730b8ccbe classify/cluster: Replace Emalloc by std::vector
This should fix a warning from LGTM:

    Multiplication result may overflow 'int' before it is
    converted to 'unsigned long'.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 17:14:51 +02:00
Stefan Weil
5fb461a563 SVNetwork: Handle failed socket call (CID 1164597)
This fixes a warning from Coverity Scan.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 16:53:24 +02:00
Stefan Weil
2d2b269e02 OpenclDevice: Catch negative index (CID 1395110)
This fixes a warning from Coverity Scan.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 16:53:24 +02:00
Stefan Weil
146d2caa9d Classify: Fix new resource leak (CID 1396163)
This fixes a warning from Coverity Scan.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 16:53:23 +02:00
Stefan Weil
edbd07a5f9 lstmtraining: Handle failed remove syscall (CID 1396166)
This fixes a warning from Coverity Scan.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 16:53:23 +02:00
Stefan Weil
32e1e4b6b4 TessPDFRenderer: Remove unused member variable jpg_quality_ (CID 1396172)
This fixes a warning from Coverity Scan.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 16:53:23 +02:00
Stefan Weil
d89ec15571 Revert "Fix CID 1396172 (Uninitialized members)"
This reverts commit cbd09de7fe.
The variable can be removed as it is not used.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-16 16:53:23 +02:00
Zdenko Podobný
cbd09de7fe Fix CID 1396172 (Uninitialized members) 2018-10-16 12:24:10 +02:00
Stefan Weil
d0d73da65a commontraining: Fix two comments
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-15 11:15:49 +02:00
Zdenko Podobný
10f2c45c00 fix "mkdir -dt" for bsd, mac and cygwin 2018-10-14 18:08:50 +02:00
zdenop
524c23de53
Merge pull request #1987 from tfmorris/1986_errno_include
Add missing cerrno includes - fixes #1986
2018-10-13 22:06:00 +02:00
Tom Morris
14af3f720b Add missing cerrno includes - fixes #1986 2018-10-13 16:02:48 -04:00
zdenop
83f80054f6
Merge pull request #1985 from stweil/win32
win32: Show TIFF errors on console
2018-10-13 20:51:26 +02:00
Stefan Weil
6ffb53f815 win32: Show TIFF errors on console
Showing them in a window (default) is not acceptable for a console
application like Tesseract which must be able to work in batch mode.

Such error messages can be triggered by TIFF files which include
vendor specific tags.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-13 20:42:14 +02:00
zdenop
4734317499 fixes #408 - text2image: comma in font name 2018-10-13 15:23:40 +02:00
zdenop
5f4f9372e9 revert debug message committed by mistake 2018-10-13 11:20:25 +02:00
Tom Morris
f6fd9b3a00 Handle null raw_choice - fixes #235, fixes #246 2018-10-13 11:14:26 +02:00
Stefan Weil
de6a759744 unittest: Add paragraphs_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-12 16:23:10 +02:00
Stefan Weil
d86d520fd0 Remove tab character in source files
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-12 11:31:10 +02:00
Stefan Weil
d59f14c70a Remove gradechop.h
It only defines the macro partial_split_priority which is only used in
findseam.cpp, so move it to that file.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-12 11:31:10 +02:00
Zdenko Podobný
5fac51173b Merge branch 'master' of https://github.com/tesseract-ocr/tesseract
* 'master' of https://github.com/tesseract-ocr/tesseract:
  remove insight.io badge
  Use env variable in AppVeyor configuration
  Fix integer overflow in overlap calculation
  hocr: add ocrp_wconf to unconditional ocr-capabilities; fixes #1470
  fix uninitialized variable, remove unused variable
  Remove virtual specifiers
2018-10-10 00:38:24 +02:00
Egor Pugin
d93094b397
Merge pull request #1971 from stweil/fix
Fix integer overflow in overlap calculation
2018-10-09 19:59:09 +03:00
Stefan Weil
7f911ac5e0 Fix integer overflow in overlap calculation
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-09 16:43:31 +02:00
zdenop
ca5d285a28 hocr: add ocrp_wconf to unconditional ocr-capabilities; fixes #1470 2018-10-09 16:34:50 +02:00
zdenop
956525f5a4 fix uninitialized variable, remove unused variable 2018-10-09 15:47:20 +02:00
Zdenko Podobný
67b6b02e2d Merge branch 'master' of https://github.com/tesseract-ocr/tesseract
* 'master' of https://github.com/tesseract-ocr/tesseract:
  Remove code for _MSC_VER < 1900
  keep API compatibility with #1265
  Update googletest submodule to release v1.8.1
  Update test submodule
  Always use isascii() with isspace()
  Avoid crash with --psm 0 and LSTM traineddata
  SVPaint: Remove empty block
  Classify: Don't hide debug parameter
  UNICHARMAP: Remove comparison which is always false
  svpaint: Change a variable from global to local
  pgedit: remove unused declaration of display_bln_lines
  Plumbing: Remove comparison which is always false
  Release candidate 2
  use pdf L_FLATE_ENCODE only for png input; fixes #1961
2018-10-09 15:37:40 +02:00
Stefan Weil
128422e75c Remove virtual specifiers
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-09 15:23:59 +02:00
Stefan Weil
f94b3fd9fc Remove code for _MSC_VER < 1900
Tesseract does not support Visual C++ older than Visual Studio 2015.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-09 14:05:21 +02:00
zdenop
c375f4fbf7 keep API compatibility with #1265 2018-10-09 11:22:15 +02:00
zdenop
272ebf995f
Merge pull request #1965 from stweil/isspace
Always use isascii() with isspace()
2018-10-08 18:47:39 +02:00
Stefan Weil
dcd0377bf0 Always use isascii() with isspace()
isspace() must only be used with an unsigned char or EOF argument,
and even then its result can depend on the current locale settings.

While this is not a problem for C/C++ executables which use the default
"C" locale, it becomes a problem when the Tesseract API is called from
languages like Python or Java which don't use the "C" locale.

By calling isascii() before calling isspace() this uncertainty can be
avoided, because any locale will hopefully give identical results for
the basic ASCII character set.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-08 17:25:09 +02:00
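
The reasoning in this commit message can be sketched as a small helper that rejects non-ASCII bytes before consulting `isspace()`. The name `IsAsciiSpace` is hypothetical, and the range check stands in for the POSIX `isascii()` call so that the sketch does not depend on a non-standard function; the actual Tesseract change may look different.

```cpp
#include <cctype>

// Hypothetical helper (not from the Tesseract sources).
// Returns true only for whitespace characters in the 7-bit ASCII range.
inline bool IsAsciiSpace(char c) {
  // Convert to unsigned char first: passing a negative char value to
  // std::isspace is undefined behavior.
  const unsigned char uc = static_cast<unsigned char>(c);
  // Same test as isascii(): bytes above 0x7F are never treated as spaces,
  // whatever the current locale says about them.
  if (uc > 0x7F) return false;
  return std::isspace(uc) != 0;
}
```

With this check, a byte such as 0xA0 is not classified as whitespace even under a locale that would treat it as a non-breaking space.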
Stefan Weil
32e92def49 Avoid crash with --psm 0 and LSTM traineddata
Orientation and script detect only worked with legacy models
and crashed with LSTM models.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-08 16:03:54 +02:00
Stefan Weil
1eeca175f7 SVPaint: Remove empty block
This fixes a warning from LGTM:

    Empty block without comment

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-08 14:25:05 +02:00
Stefan Weil
9c857ab962 Classify: Don't hide debug parameter
Fix a warning from LGTM:

    Local variable 'debug' hides a parameter of the same name.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-08 14:22:31 +02:00
Stefan Weil
30b75cfc05 UNICHARMAP: Remove comparison which is always false
Warning from LGTM:

    Comparison is always false because index <= 0 and 1 <= length.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-08 14:15:17 +02:00
Stefan Weil
3ae765ecca svpaint: Change a variable from global to local
This fixes a warning from LGTM:

    Poor global variable name 'rgb'. Prefer longer, descriptive
    names for globals (eg. kMyGlobalConstant, not foo).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-08 13:53:09 +02:00
Stefan Weil
7b5955920d pgedit: remove unused declaration of display_bln_lines
This fixes a warning from LGTM:

    This parameter of type ScrollView is 144 bytes
    - consider passing a pointer/reference instead.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-08 13:49:59 +02:00
Stefan Weil
ae93b65b1f Plumbing: Remove comparison which is always false
Warning from LGTM:

    Comparison is always false because index >= 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-08 13:47:16 +02:00
zdenop
f794571195 use pdf L_FLATE_ENCODE only for png input; fixes #1961 2018-10-07 20:57:19 +02:00