Egor Pugin
cad8cb31bb
Add missing includes.
2020-12-31 17:58:36 +03:00
Egor Pugin
65e230f1a2
Fix linux build.
2020-12-31 17:46:49 +03:00
Egor Pugin
a4daf19dd3
Merge branch 'master' of github.com-egorpugin:tesseract-ocr/tesseract
2020-12-31 17:37:37 +03:00
Stefan Weil
96fbe776ea
Partially revert cad0eb4d26 (fix layout_test)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-31 15:36:28 +01:00
Egor Pugin
a32c8b2d93
Remove GenericVector::compare_callback. This fixes several tests after previous commit.
2020-12-31 17:26:40 +03:00
Egor Pugin
c86325e2f7
Use TESS_API for every public symbol. A public symbol is one that is exported from the library; this also applies to unit test and training symbols. Users will be limited to the public API, but the set of exported symbols will still be wider.
Remove TESS_LOCAL.
Fix several symbol issues that were made visible by these changes.
All build systems must set -fvisibility=hidden for *nix systems.
2020-12-31 16:32:29 +03:00
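Note on the commit above: a minimal sketch of how an export macro of this kind is commonly written. DEMO_API, DEMO_BUILDING_DLL, DEMO_USING_DLL and both functions are hypothetical names chosen for illustration; this is not the actual TESS_API definition.

```cpp
// Hypothetical export macro (illustrative only, not Tesseract's TESS_API).
#if defined(_WIN32)
#  if defined(DEMO_BUILDING_DLL)
#    define DEMO_API __declspec(dllexport)  // building the shared library
#  elif defined(DEMO_USING_DLL)
#    define DEMO_API __declspec(dllimport)  // consuming the shared library
#  else
#    define DEMO_API                        // static build: nothing to mark
#  endif
#else
// With -fvisibility=hidden, only symbols marked "default" are exported.
#  define DEMO_API __attribute__((visibility("default")))
#endif

// Marked public: stays visible to users of the library.
DEMO_API int exported_answer() { return 42; }

// Unmarked: hidden from the shared library's symbol table when the
// library is compiled with -fvisibility=hidden on *nix systems.
int internal_helper() { return 7; }
```

Within the same translation unit both functions are callable as usual; the difference only shows up in the exported symbol table of the shared library.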
Egor Pugin
4d817d09a5
Remove custom string hasher.
2020-12-31 14:26:23 +03:00
Egor Pugin
250fc0023e
Misc.
2020-12-31 14:24:52 +03:00
Egor Pugin
3a66282e92
Remove GOOGLE_TESSERACT ifdefs.
2020-12-31 14:23:52 +03:00
Egor Pugin
d0a730e3d0
Misc.
2020-12-31 13:25:10 +03:00
Egor Pugin
c812d9d894
Use template instead of overloads.
2020-12-31 13:20:21 +03:00
Stefan Weil
cad0eb4d26
Replace more GenericVector by std::vector
This fixes two LGTM alerts and might improve the performance:
This parameter of type GenericVector<STRING> is 80 bytes -
consider passing a const pointer/reference instead.
This parameter of type GenericVectorEqEq<const ParagraphMode*> is 80 bytes -
consider passing a const pointer/reference instead.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-31 09:28:35 +01:00
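Note on the commit above: the LGTM alerts concern large objects passed by value. A minimal sketch of the difference, using std::vector<std::string> as a stand-in; both function names are hypothetical and do not come from the Tesseract sources.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// By value: the whole vector (and every string in it) is copied per call.
std::size_t total_length_by_value(std::vector<std::string> words) {
  std::size_t total = 0;
  for (const auto& word : words) total += word.size();
  return total;
}

// By const reference: no copy is made and the callee cannot modify
// the caller's data.
std::size_t total_length(const std::vector<std::string>& words) {
  std::size_t total = 0;
  for (const auto& word : words) total += word.size();
  return total;
}
```

Both variants return the same result; only the second avoids the copy that the alert warns about.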
Stefan Weil
fc4002dda8
Remove helpers.h from public API
Remove also outdated references to apitypes.h which no longer exists.
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-31 09:06:16 +01:00
Egor Pugin
dfbd394a72
Export all simd matrices.
2020-12-31 03:27:18 +03:00
Egor Pugin
2c054b531c
Fix linux build.
2020-12-31 03:06:39 +03:00
Egor Pugin
4ddc919ed0
Correctly use DEBUG macro. C++ compilers do not define it. Instead, NDEBUG is typically defined for optimized builds.
2020-12-31 02:50:07 +03:00
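Note on the commit above: a minimal sketch of keying debug-only behaviour on the standard NDEBUG macro rather than on a DEBUG macro that nothing defines by default. The helper name is_debug_build is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: true unless NDEBUG is defined, which release
// configurations of common build systems usually do.
constexpr bool is_debug_build() {
#ifdef NDEBUG
  return false;
#else
  return true;
#endif
}

// assert() follows the same switch: it expands to nothing under NDEBUG.
inline int checked_divide(int numerator, int denominator) {
  assert(denominator != 0);
  return numerator / denominator;
}
```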
Egor Pugin
3af30419db
Move MAX_PATH def out from public header.
2020-12-31 02:35:28 +03:00
Egor Pugin
a0509b2feb
Use std::swap instead of manual function.
2020-12-31 02:17:54 +03:00
Egor Pugin
89273c915d
Remove empty DLLSYM macro.
2020-12-31 02:10:46 +03:00
Stefan Weil
4366d811d4
Fix TFile::DeSerialize, TFile::Serialize for empty vectors
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 19:15:56 +01:00
Stefan Weil
30eeb7f01a
Replace some old-style type casts
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 17:56:59 +01:00
Stefan Weil
faf0407dff
Remove RecognizeForChopTest from public API
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 17:55:40 +01:00
Stefan Weil
588ac3fed2
Remove TessTruthCallback from public API
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 15:38:11 +01:00
Stefan Weil
ebafb19a43
Replace GenericVector<ParamsTrainingHypothesis> by std::vector<ParamsTrainingHypothesis>
This fixes an LGTM alert:
This parameter of type ParamsTrainingHypothesis is 136 bytes -
consider passing a const pointer/reference instead.
It might also improve the performance.
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 13:26:44 +01:00
Stefan Weil
688ef20f62
Replace GenericVector<RowInfo> by std::vector<RowInfo>
This fixes an LGTM alert:
This parameter of type RowInfo is 144 bytes -
consider passing a const pointer/reference instead.
It might also improve the performance.
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 12:14:43 +01:00
Stefan Weil
536a676250
Replace GenericVector<WordData> by std::vector<WordData>
This fixes an LGTM alert:
This parameter of type WordData is 112 bytes -
consider passing a const pointer/reference instead.
It might also improve the performance.
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 12:14:43 +01:00
Stefan Weil
fbc807ce99
Remove unused local function CharCoverageMapToBitmap
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 12:14:43 +01:00
Stefan Weil
83d97ffc80
Remove redundant comparison
This fixes an LGTM alert:
Comparison is always true because i >= 2.
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 12:14:43 +01:00
Stefan Weil
f3acab507d
Fix arguments for tprintf
This fixes two LGTM alerts:
This argument should be of type 'int' but is of type '_Bit_reference'
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 12:14:43 +01:00
Stefan Weil
53503b34be
Fix declaration for C_BLOB
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 11:33:29 +01:00
Stefan Weil
7866677a0c
avx2: Remove unused local variables
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 11:33:29 +01:00
Stefan Weil
96e3b52936
Remove unused function CompareSTRING
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 11:33:29 +01:00
Stefan Weil
2cf70d6164
Replace more GenericVector by std::vector
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 10:51:12 +01:00
Stefan Weil
3a34f17037
Order and clean include statements
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 10:50:39 +01:00
Stefan Weil
3603c740e7
Fix ShapeTable::AddUnicharToResults (fix mastertrainer_test)
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 07:10:29 +01:00
Stefan Weil
4c94d09047
Replace more GenericVector by std::vector
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 07:10:29 +01:00
Stefan Weil
deec8ef46f
Replace std::list by std::vector
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 07:10:29 +01:00
Stefan Weil
4043204c2b
Use old genericvector.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 07:10:29 +01:00
Egor Pugin
482824c109
Fix trie's word sort comparator.
2020-12-30 02:37:53 +03:00
Egor Pugin
37e760d9c2
[test] Fix unicharset. 21->18 failed tests remaining.
2020-12-30 02:11:58 +03:00
Stefan Weil
f4e380f64a
Remove serialis.h from public API
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-29 11:28:50 +01:00
Stefan Weil
e2683e17fc
Remove unused DocumentData::SaveToBuffer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-29 10:43:00 +01:00
Egor Pugin
f190c85682
Update src/api/tesseractmain.cpp
Co-authored-by: Stefan Weil <sw@weilnetz.de>
2020-12-29 00:22:28 +03:00
Stefan Weil
c8be22f313
Fix nullptr assignment in TessBaseAPI
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
90af3e7b5c
Remove strngs.h from public API
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
03884c370c
Replace STRING by std::string in ResultIterator
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
2369aa5604
Use std::vector, std::string in baseapi.h
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
72663a9a81
Use std::vector, std::string in baseapi.h
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
fec9c11c8c
Use std::vector, std::string in baseapi.h
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
64e902ddf7
Remove genericvector.h from public API
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
f462389673
renderer for TessPDFRenderer
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
d55e5f4803
Replace more GenericVector by std::vector
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
4a28d33c58
Replace GenericVector by std::vector in strngs.h and more places
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
3ddc88cccb
Use std::vector in TessPDFRenderer
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
7c679e777d
Use std::vector for allowed_scripts
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
32d53479ae
Use std::vector for vars_vec, vars_values
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
085f6b2572
Use std::list for paragraph models
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
4ebba72919
Use std::vector for paragraph models
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
524fc67165
Fix tesseract --list-langs
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Egor Pugin
986b57dd4e
Export symbol for unit test.
2020-12-28 04:58:26 +03:00
Egor Pugin
3187f2ef08
Move doubleptr.h to unittests as it is used only there.
2020-12-28 02:32:27 +03:00
Egor Pugin
4175679da6
Revert kdpair, genericheap changes.
2020-12-28 02:31:45 +03:00
Stefan Weil
289a34a40a
Add const attribute for pdf_ttf
That moves its data into the text segment and reduces the total size
slightly:
   text    data     bss     dec     hex filename
  39788     693       0   40481    9e21 old/libtesseract_la-pdfrenderer.o
  40360      88       0   40448    9e00 new/libtesseract_la-pdfrenderer.o
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-26 17:51:56 +01:00
Stefan Weil
7dca63caf1
More fixes for namespace tesseract
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-26 17:41:53 +01:00
Stefan Weil
7188b160ae
Fix build with --disable-graphics
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-26 17:36:24 +01:00
Egor Pugin
aecbf79791
Add missing merge_unicharsets training tool to cmake and sw build.
2020-12-26 15:57:22 +03:00
Stefan Weil
317ef988a0
Add missing namespace prefix for GlobalParams() (fix build for some unit tests)
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-26 13:44:43 +01:00
Stefan Weil
418064f639
Add missing namespace prefix (fix build for merge_unicharsets)
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-26 13:09:39 +01:00
Egor Pugin
c8b8d266d6
Fix some vector<bool> cases for MSVC.
2020-12-26 04:17:13 +03:00
Egor Pugin
6b22972bc2
Fix linux build.
2020-12-26 04:15:42 +03:00
Egor Pugin
c3e04abe1e
Inherit STRING from std::string.
2020-12-26 03:48:35 +03:00
Egor Pugin
4fc467a922
Inherit GenericVector from std::vector. Inherit kdpairs from std::pair. Rewrite some move ctors to modern C++ style.
2020-12-26 03:23:09 +03:00
Egor Pugin
04d3cfcf2f
Merge branch 'master' of github.com-egorpugin:tesseract-ocr/tesseract
2020-12-26 00:55:37 +03:00
Egor Pugin
79a86f2582
Move all tesseract symbols into tesseract namespace. Fix include order in many places.
2020-12-26 00:55:30 +03:00
zdenop
ceadc4ddb8
remove inline declaration
2020-12-25 16:28:00 +01:00
Egor Pugin
14d52a79ba
Remove .rc files. No need to add them into dll/exe.
2020-12-25 18:06:35 +03:00
zdenop
044921267f
embed pdf.ttf to tesseract library #2551
2020-12-25 13:20:36 +01:00
Stefan Weil
cc133aa394
Fix text for fonts_dir parameter
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 21:32:05 +01:00
Stefan Weil
34abba8698
Add terminating linefeed to fonts.conf
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 21:32:05 +01:00
Stefan Weil
17a64eef1e
Simplify code for PangoFontInfo::HardInitFontConfig
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 21:32:05 +01:00
Stefan Weil
707ee70966
Use deprecated pango_fc_font_get_glyph for old Pango versions
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 12:02:37 +01:00
Stefan Weil
f759142c95
Remove buggy Windows implementation for getting glyph from font
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 09:07:09 +01:00
Stefan Weil
7669d36a37
Use HarfBuzz instead of deprecated pango_fc_font_get_glyph
This fixes the crash on MacOS with M1.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 09:03:05 +01:00
Stefan Weil
8c859a7329
Fix type cast from PangoFont to PangoFcFont
The original code crashes in pango_fc_font_get_glyph on MacOS with M1.
Replacing the type cast with the macro made for that conversion
gives at least an error message before crashing:
(process:12546): GLib-GObject-WARNING **: 08:38:02.472: invalid cast from 'PangoCairoCoreTextFont' to 'PangoFcFont'
zsh: segmentation fault ./pango_font_info_test
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-22 08:45:11 +01:00
Stefan Weil
3efedabda3
automake: Flat build for src/training
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-19 15:25:21 +01:00
Stefan Weil
6fcf8d23bc
Use more compiler and linker flags from pkg-config
This fixes some build issues with Homebrew on MacOS.
Signed-off-by: Stefan Weil <stefan@Sabines-Mac-mini.fritz.box>
2020-12-13 13:24:46 +01:00
Stefan Weil
490bd3ec8f
Fix build with enabled TensorFlow
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-04 10:56:23 +01:00
Stefan Weil
ac116d1b28
Fix regression in Network::Serialize (fix issue #3167)
The regression was caused by a wrong string serialization in
commit 4613738a5e.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-03 19:36:58 +01:00
zdenop
279b0b2e37
Merge pull request #3160 from stweil/string2
Replace more occurrences of STRING by std::string or char*
2020-11-27 18:24:17 +01:00
Stefan Weil
65b11a1e12
Pack class SVMenuNode
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:17:27 +01:00
Stefan Weil
a1849bc65c
Pack struct CLASS_STRUCT
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:17:27 +01:00
Stefan Weil
0bb46ac2e0
Pack struct BlamerBundle
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:17:27 +01:00
Stefan Weil
bf3774cc91
Use more const char*
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:01:17 +01:00
Stefan Weil
4613738a5e
Use const char* for filename and network_spec parameters
This replaces the proprietary STRING data type
(764 instead of 838 lines remaining).
It also removes STRING from osdetect.h and serialis.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:01:17 +01:00
Stefan Weil
fbc4c809d9
Replace STRING by std::string
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-31 14:08:39 +01:00
Stefan Weil
92b6c652f3
Use std::vector for scales_
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-29 08:00:11 +01:00
Stefan Weil
c15dd26b84
Don't pass scales_ to IntSimdMatrix::Init
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-28 20:35:53 +01:00
Stefan Weil
fe76142a3d
Remove GenericVector::scale() again
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-28 16:24:59 +01:00
Stefan Weil
eaf72ace31
Prefer result from inverted image if the mean confidence is better
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-26 20:37:47 +01:00
Stefan Weil
cfb1fb2540
Try OCR on inverted line only if mean confidence is below 50 %
The old code looked for the minimum confidence, which very often
triggered a second OCR pass without improving the result.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-26 09:32:09 +01:00
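Note on the commit above: a minimal sketch of why a mean-based trigger fires less often than a minimum-based one. The function names, the per-word confidence vector, and the 0 to 100 scale are assumptions for illustration; this is not the actual Tesseract code.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Mean of the per-word confidences (assumed to be percentages, 0..100).
double mean_confidence(const std::vector<double>& confidences) {
  if (confidences.empty()) return 0.0;
  const double sum =
      std::accumulate(confidences.begin(), confidences.end(), 0.0);
  return sum / static_cast<double>(confidences.size());
}

// Lowest per-word confidence, the criterion the old code is said to use.
double min_confidence(const std::vector<double>& confidences) {
  if (confidences.empty()) return 0.0;
  return *std::min_element(confidences.begin(), confidences.end());
}

// Retry on the inverted line only if the mean is below the threshold.
bool should_try_inverted(const std::vector<double>& confidences,
                         double threshold = 50.0) {
  return mean_confidence(confidences) < threshold;
}
```

With confidences of 90, 95 and 30, the minimum is below 50 but the mean is not, so a single weak word no longer forces a second pass.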