Commit Graph

217 Commits

Author SHA1 Message Date
Egor Pugin
4fc467a922 Inherit GenericVector from std::vector. Inherit kdpairs from std::pair. Rewrite some move ctors to modern C++ style. 2020-12-26 03:23:09 +03:00
Egor Pugin
79a86f2582 Move all tesseract symbols into tesseract namespace. Fix include order in many places. 2020-12-26 00:55:30 +03:00
amitdo
b378ebff2e Improve disabled legacy engine build 2020-10-10 04:49:52 +03:00
Stefan Weil
ac14ab32c6 Remove dummy functions from globaloc.cpp and related code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-04 12:24:26 +02:00
Le Duc Nam
eb8f1674bf Correct "NoImages" in debug pdf file
Issue:
  The debug output for "NoImages" is just the binary image;
  it does not show the developer the result of photo_mask_pix.

Fix:
  Subtract photo_mask_pix from the binary image; the result
  is the "NoImages" binary pix.
2020-09-06 23:31:30 +07:00
Robin Watts
150e2e54fe Squash some warnings in MSVC build.
In particular, "defined but not used" (caused by GRAPHICS_DISABLED),
double constants being truncated to floats, and implicit casts.
2020-07-16 10:08:40 +01:00
Stefan Weil
cb3880fb15 Disable more code and data with GRAPHICS_DISABLED
Some runtime parameters which are only relevant with graphics enabled
are now removed from builds when graphics is disabled.

TableFinder::DisplayColSegmentGrid is never used, so remove it completely.

Builds with --disable-graphics significantly reduce the code size and avoid
some function calls which might be important for certain applications:

   text	   data	    bss	    dec	    hex	filename
3219230	  41136	  13920	3274286	 31f62e	.libs/libtesseract.so (--disable-graphics, old)
3211347	  40976	  13600	3265923	 31d583	.libs/libtesseract.so (--disable-graphics, new)
3360942	  43656	  15392	3419990	 342f56	.libs/libtesseract.so (default)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-09 11:23:33 +02:00
Stefan Weil
8137cf35a6 Use const char* for filename parameters
This replaces the proprietary STRING data type
(801 instead of 838 lines remaining).

It also removes STRING from osdetect.h and serialis.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-07 14:20:09 +02:00
amitdo
efae270dea Disabled legacy build: Disable more unused code 2020-06-24 22:02:52 +03:00
Stefan Weil
62b085cb8d ScrollView: Remove C API callcpp.{cpp,h}
Use C++ class ScrollView directly instead of using an intermediate C API.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-22 09:14:26 +02:00
Stefan Weil
4a10bb68c7 Fix conversion of images with 16 bpp or 24 bpp to grey
The old code used pixConvertRGBToLuminance which only converts 32 bpp images.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-21 09:09:49 +02:00
Stefan Weil
d4cf77c92b Don't check for limits.h (now unused)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-20 10:39:13 +02:00
Stefan Weil
a06d0d8449 Add missing include statements for config_auto.h
They are required to get the macro DISABLED_LEGACY_ENGINE.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-05-22 16:34:28 +02:00
Robin Watts
43437a540b Fix OEM_DEFAULT in DISABLED_LEGACY_ENGINE builds.
If api->Init is called with OEM_DEFAULT in DISABLED_LEGACY_ENGINE
build modes, the engine mode is never set, resulting in no
words being found.
2020-05-15 14:56:41 +01:00
Julian Gilbey
ca5735efcb Destroy box before potentially exiting function 2020-05-12 15:25:16 +01:00
Robert Sachunsky
cdc8e44a20 ChoiceIterator: skip symbol without choices 2020-01-24 09:19:14 +01:00
Stefan Weil
6f2f310fdf Remove redundant method from class GenericVector
length() is not needed: it can be replaced by size().

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-01-18 11:30:14 +01:00
Robert Sachunsky
4b0c9f3373 BlockPolygon: clip to image rectangle 2019-12-18 13:29:43 +01:00
Robert Sachunsky
5751a408c9 BlockPolygon: unrotate from internal to image coordinates 2019-12-18 13:29:43 +01:00
Stefan Weil
9745a9d111 automake: Flat build for src/ccmain
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-26 16:20:46 +01:00
Stefan Weil
ac46b286a4 Fix issue #2748
Commit 94d0f77f56 tried to fix issue #2741
but created a new problem.

This commit should fix both old and new issue.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-08 17:12:20 +01:00
Stefan Weil
a306cd7370 Fail if no valid lstmf file was written (fix issue #2741)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 21:52:45 +01:00
Stefan Weil
94d0f77f56 Don't create an empty lstmf file
If Tesseract cannot find text in the input image, it should not write
an empty lstmf file. This problem was reported in issue #2741.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-11-01 21:43:26 +01:00
Egor Pugin
2a37f5dd62 Update includes to use <>. 2019-10-29 14:50:11 +03:00
Stefan Weil
629b05d978 Update README.md and other documentation for new include file structure
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-10-29 12:26:41 +01:00
amitdo
2f8884a64e Fix autotools build 2019-10-28 21:23:58 +02:00
amitdo
e1bae15547 Fix #include path of public headers 2019-10-28 19:10:30 +02:00
amitdo
dfede8ac01 Move all public headers to include/tesseract 2019-10-28 18:50:31 +02:00
Nat
52bc15acd9 Add pageseg_apply_music_mask option to allow disabling the music mask 2019-10-24 11:44:05 -05:00
wshwang
71e291bae5 Remove warning C4312 2019-10-22 13:06:44 +02:00
Stefan Weil
a209a6b4b5 Copy resolution of source image (fix issue #1702)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-10-20 20:45:35 +02:00
zdenop
a3cfd66f37 Do not exit if a nonexistent parameter is used (fixes #1334) 2019-10-15 07:56:22 +02:00
zdenop
0150fc57cc Report when tesseract legacy engine not present. (fix issue #2053) 2019-10-14 22:55:47 +02:00
jm
fb150265ef speed optimisation - add the option to disable automatic inverting of line images 2019-10-04 10:09:52 +02:00
Johannes Künsebeck
aa2ab68e29 Removed unused parameters
The following parameters are not used anywhere anymore:

 * use_definite_ambigs_for_classifier
 * max_viterbi_list_size
 * word_to_debug_lengths
 * fragments_debug
 * tessedit_redo_xheight
 * debug_acceptable_wds
 * tessedit_matcher_log
 * tessedit_test_adaption_mode
 * docqual_excuse_outline_errs
 * crunch_pot_garbage
 * suspect_space_level
 * tessedit_consistent_reps
 * wordrec_display_all_words
 * wordrec_no_block
 * wordrec_worst_state
 * fragments_guide_chopper
 * segment_adjust_debug
 * classify_adapt_feature_thresh (classify_adapt_feature_threshold still exists)
 * classify_adapt_proto_thresh (classify_adapt_proto_threshold still exists)
 * classify_min_norm_scale_x
 * classify_max_norm_scale_x
 * classify_min_norm_scale_y
 * classify_max_norm_scale_y
 * il1_adaption_test
 * textord_blob_size_bigile
 * textord_blob_size_smallile
 * editor_debug_config_file
 * textord_tabfind_show_color_fit

The list was generated by a Python script, and each parameter occurrence was
checked manually.
2019-10-03 09:18:29 +02:00
Stefan Weil
7bddad59d1 Optimize class ChoiceIterator
Re-order a class variable to avoid memory holes and
remove unused class variables.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-25 09:43:57 +02:00
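Note on 7bddad59d1: the "memory holes" are alignment padding between members. The sketch below illustrates the general effect with two hypothetical structs; the member names are invented for illustration and are not the actual ChoiceIterator fields. On a typical 64-bit ABI the first layout would likely occupy 24 bytes and the second 16, but the exact sizes depend on the platform.

```cpp
// Hypothetical layouts for illustration only (not Tesseract code).

// Small members separated by a pointer: the compiler usually inserts
// padding after each bool so that the pointer stays aligned.
struct Scattered {
  bool flag_a;
  const char* word;
  bool flag_b;
};

// Same members, largest first: the two bools can share one padded slot.
struct Grouped {
  const char* word;
  bool flag_a;
  bool flag_b;
};

// Holds on common ABIs; the C++ standard itself does not fix the sizes.
static_assert(sizeof(Grouped) <= sizeof(Scattered),
              "grouping small members should not grow the struct");
```

Printing sizeof(Scattered) and sizeof(Grouped) on the target platform shows how many bytes, if any, the re-ordering saves.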
Noah Metzger
ff4c1d204d Fixed minor bug with the Choice iterator when lstm_choice_mode is not active.
Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
2019-09-24 15:38:28 +02:00
Stefan Weil
994ec697d8 Remove member functions STRING::string and StringParam::string
They were redundant because there exist member functions 'c_str' which do the same.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-23 08:33:08 +02:00
amitdo
1e13d1d4d5 Disable legacy build: Disable more unneeded code 2019-09-22 20:55:24 +03:00
zdenop
39a63c2837
Merge pull request #2663 from bertsky/fix-lstm-user-patterns
fix langdata (user words/patterns) file suffixes for LSTMs:
2019-09-20 15:32:54 +02:00
Robert Schubert
5b976bfb55 fix langdata (user words/patterns) file suffixes for LSTMs:
- add another constructor for LSTMRecognizer
  which takes the language_data_path_prefix configured/selected
  at runtime and passes it to the internal CCUtil
- use this in Tesseract::init_tesseract_lang_data when LSTMs
  are available

(this was missing from 297d7d86ce)
2019-09-19 19:30:54 +02:00
amitdo
479a7b1ca0 Disabled legacy build: Disable more unneeded code 2019-09-19 19:00:13 +03:00
amitdo
2134cd7867 Disabled legacy engine build: Disable code related to ambigs. 2019-09-15 19:11:30 +02:00
Egor Pugin
6a9584fbc2
Merge pull request #2650 from stweil/cid
Fix several issues reported by Coverity Scan
2019-09-14 21:18:37 +03:00
Stefan Weil
6fd58d2897 Fix CID 1164659 (Uninitialized scalar field)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-14 19:20:14 +02:00
Stefan Weil
46f21a4182 Fix CID 1164633 (Uninitialized pointer field)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-14 15:43:50 +02:00
Stefan Weil
9ea579bf1b Fix CID 1164628 ff (Uninitialized pointer field) and optimize class ParamContent
Only one of bIt, dIt, iIt and sIt is used, so put all four in a union.
This fixes CID 1164628, CID 1164629, CID 1164630 and CID 1164631.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-14 15:43:50 +02:00
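Note on 9ea579bf1b: a minimal sketch of the idea of putting the four pointers in one union, assuming a tag field that records which one is valid. Only the names bIt, dIt, iIt and sIt come from the commit message; the stub parameter types, the ParamKind enum and the class name ParamContentSketch are hypothetical and do not reproduce the real ParamContent class.

```cpp
#include <string>

// Hypothetical stand-ins for the four parameter types.
struct BoolParamStub { bool value; };
struct DoubleParamStub { double value; };
struct IntParamStub { int value; };
struct StringParamStub { std::string value; };

enum class ParamKind { kBool, kDouble, kInt, kString };

class ParamContentSketch {
 public:
  // Each constructor initializes the tag and exactly one union member,
  // so no pointer field is left uninitialized.
  constexpr explicit ParamContentSketch(BoolParamStub* p)
      : kind_(ParamKind::kBool), bIt(p) {}
  constexpr explicit ParamContentSketch(DoubleParamStub* p)
      : kind_(ParamKind::kDouble), dIt(p) {}
  constexpr explicit ParamContentSketch(IntParamStub* p)
      : kind_(ParamKind::kInt), iIt(p) {}
  constexpr explicit ParamContentSketch(StringParamStub* p)
      : kind_(ParamKind::kString), sIt(p) {}

  constexpr ParamKind kind() const { return kind_; }

  // Reads only the union member selected by kind_.
  std::string ToString() const {
    switch (kind_) {
      case ParamKind::kBool: return bIt->value ? "1" : "0";
      case ParamKind::kDouble: return std::to_string(dIt->value);
      case ParamKind::kInt: return std::to_string(iIt->value);
      case ParamKind::kString: return sIt->value;
    }
    return std::string();
  }

 private:
  ParamKind kind_;
  union {  // all four pointers share the same storage
    BoolParamStub* bIt;
    DoubleParamStub* dIt;
    IntParamStub* iIt;
    StringParamStub* sIt;
  };
};
```

With four separate pointer members, three of them would stay unset in every instance, which is the kind of finding a static analyzer reports as an uninitialized pointer field. The union also shrinks the object to one tag plus one pointer.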
Stefan Weil
5b1f0dbd4b Fix CID 1164620 (Uninitialized pointer field)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-14 15:43:50 +02:00
Stefan Weil
f62a895f74 Remove unused italic, bold in class BLOCK_RES and class WORD_RES
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-14 11:53:58 +02:00
Stefan Weil
4a2d5a2e8d OSResults: Fix runtime errors detected by UndefinedBehaviorSanitizer
Fix this runtime error in osd_test and textlineprojection_test:

    src/ccmain/osdetect.cpp:109:14: runtime error: division by zero

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-10 15:56:32 +02:00
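Note on 4a2d5a2e8d: the commit message does not show how the division at osdetect.cpp:109 was changed. One common way to avoid this class of error is to test the denominator before dividing. The helper below is a hypothetical sketch of that pattern; the name SafeRatio and the fallback parameter are assumptions, not part of Tesseract.

```cpp
// Hypothetical helper, not the actual change made in osdetect.cpp.
// Returns numerator / denominator, or `fallback` when the denominator is zero.
constexpr float SafeRatio(float numerator, float denominator,
                          float fallback = 0.0f) {
  return denominator == 0.0f ? fallback : numerator / denominator;
}
```

A caller that computes a confidence value as a ratio of two scores could use SafeRatio(first, second) so that an all-zero score table yields the fallback instead of undefined behavior.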