211 Commits

Author SHA1 Message Date
Stefan Weil
4b84a56d8d Replace STRING by std::string for function read_unlv_file
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:46:12 +01:00
Stefan Weil
71fb535427 Remove unneeded include statement for strngs.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-23 17:29:57 +01:00
Stefan Weil
209c1df599 Fix some format strings
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2021-01-08 18:49:21 +01:00
Egor Pugin
9cc7bdeaa6 Use std::bitset<16> instead of custom BITS16. 2021-01-07 14:14:27 +03:00
Egor Pugin
9710bc0465 More std::vector. 2021-01-07 13:57:57 +03:00
Egor Pugin
4ed601956e More std::vector. 2021-01-05 14:46:11 +03:00
Egor Pugin
664a718a63 Rename platform.h to export.h. 2021-01-01 00:18:36 +03:00
Stefan Weil
061f088b77 Replace C headers by C++ headers and remove old unused C code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-31 18:26:33 +01:00
Egor Pugin
cad8cb31bb Add missing includes. 2020-12-31 17:58:36 +03:00
Egor Pugin
a32c8b2d93 Remove GenericVector::compare_callback. This fixes several tests after previous commit. 2020-12-31 17:26:40 +03:00
Egor Pugin
c86325e2f7 Use TESS_API for every public symbol. A public symbol is one that is exported from the library. This also applies to unit test and training symbols. Users will still be limited to the public API, but the set of exported symbols will be wider.
Remove TESS_LOCAL.
Fix several symbol issues that were made visible by these changes.

All build systems must set -fvisibility=hidden for *nix systems.
2020-12-31 16:32:29 +03:00
Stefan Weil
fc4002dda8 Remove helpers.h from public API
Also remove outdated references to apitypes.h, which no longer exists.

Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-31 09:06:16 +01:00
Egor Pugin
2c054b531c Fix linux build. 2020-12-31 03:06:39 +03:00
Egor Pugin
a0509b2feb Use std::swap instead of manual function. 2020-12-31 02:17:54 +03:00
Egor Pugin
89273c915d Remove empty DLLSYM macro. 2020-12-31 02:10:46 +03:00
Stefan Weil
ebafb19a43 Replace GenericVector<ParamsTrainingHypothesis> by std::vector<ParamsTrainingHypothesis>
This fixes an LGTM alert:

    This parameter of type ParamsTrainingHypothesis is 136 bytes -
    consider passing a const pointer/reference instead.

It might also improve the performance.

Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-30 13:26:44 +01:00
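To illustrate the LGTM alert quoted in ebafb19a43, here is a minimal sketch that is not Tesseract code. `Hypothesis` is a hypothetical stand-in for ParamsTrainingHypothesis (its real layout is not shown in this log), and `LowestCost` is an invented helper. The container and its elements are passed by const reference, so no large object is copied per call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a large value type (about 136 bytes where a
// double is 8 bytes). The real ParamsTrainingHypothesis is not shown here.
struct Hypothesis {
  double features[16];
  double cost;
};

// The vector and each element are taken by const reference. A by-value
// parameter of type Hypothesis would copy the whole struct on every call.
const Hypothesis& LowestCost(const std::vector<Hypothesis>& hyps) {
  assert(!hyps.empty());  // precondition of this sketch
  const Hypothesis* best = &hyps.front();
  for (const Hypothesis& h : hyps) {
    if (h.cost < best->cost) best = &h;
  }
  return *best;
}
```

The returned reference points into the caller's vector, so it stays valid only as long as that vector is not modified or destroyed.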
Stefan Weil
4043204c2b Use old genericvector.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-30 07:10:29 +01:00
Stefan Weil
f4e380f64a Remove serialis.h from public API
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-29 11:28:50 +01:00
Stefan Weil
e2683e17fc Remove unused DocumentData::SaveToBuffer
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-29 10:43:00 +01:00
Stefan Weil
90af3e7b5c Remove strngs.h from public API
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
64e902ddf7 Remove genericvector.h from public API
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Stefan Weil
4a28d33c58 Replace GenericVector by std::vector in strngs.h and more places
Signed-off-by: Stefan Weil <sw@weil.de>
2020-12-28 21:03:29 +01:00
Egor Pugin
c3e04abe1e Inherit STRING from std::string. 2020-12-26 03:48:35 +03:00
Egor Pugin
4fc467a922 Inherit GenericVector from std::vector. Inherit kdpairs from std::pair. Rewrite some move ctors to modern C++ style. 2020-12-26 03:23:09 +03:00
Egor Pugin
79a86f2582 Move all tesseract symbols into tesseract namespace. Fix include order in many places. 2020-12-26 00:55:30 +03:00
Stefan Weil
0bb46ac2e0 Pack struct BlamerBundle
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:17:27 +01:00
Robin Watts
150e2e54fe Squash some warnings in MSVC build.
In particular, "defined but not used" (caused by GRAPHICS_DISABLED),
double constants being truncated to floats, and implicit casts.
2020-07-16 10:08:40 +01:00
zdenop
7fa200bfb7 Merge pull request #3064 from robinwatts/pushback12
Fix Memory leak when using TESSERACT_IMAGEDATA_AS_PIX
2020-07-15 19:08:58 +02:00
Robin Watts
7f45b719d1 Fix Memory leak when using TESSERACT_IMAGEDATA_AS_PIX
If building with TESSERACT_IMAGEDATA_AS_PIX, then tesseract
doesn't compress/decompress images, but rather holds the
data as internal Pix structures. Unfortunately, I forgot to
make the ImageData destructor free these, so memory leaked
during use. Fixed here.
2020-07-15 12:35:35 +01:00
Stefan Weil
cb3880fb15 Disable more code and data with GRAPHICS_DISABLED
Some runtime parameters which are only relevant with graphics enabled
are now removed from builds when graphics is disabled.

TableFinder::DisplayColSegmentGrid is never used, so remove it completely.

Builds with --disable-graphics significantly reduce the code size and avoid
some function calls which might be important for certain applications:

   text	   data	    bss	    dec	    hex	filename
3219230	  41136	  13920	3274286	 31f62e	.libs/libtesseract.so (--disable-graphics, old)
3211347	  40976	  13600	3265923	 31d583	.libs/libtesseract.so (--disable-graphics, new)
3360942	  43656	  15392	3419990	 342f56	.libs/libtesseract.so (default)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-09 11:23:33 +02:00
Stefan Weil
8137cf35a6 Use const char* for filename parameters
This replaces the proprietary STRING data type
(801 instead of 838 lines remaining).

It also removes STRING from osdetect.h and serialis.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-07 14:20:09 +02:00
Stefan Weil
2269a500ef Fix runtime error with null pointer argument
Runtime error reported by sanitizer:

    src/ccstruct/coutln.cpp:1018:19: runtime error: null pointer passed as argument 2, which is declared to never be null
    /usr/include/string.h:48:14: note: nonnull attribute specified here
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/ccstruct/coutln.cpp:1018:19 in

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-29 19:13:39 +02:00
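The sanitizer report in 2269a500ef concerns a null pointer passed to a string.h function whose pointer arguments are declared nonnull. The sketch below assumes a memcpy-style call that can be reached with a zero length and a null source. `CopyBytes` is a hypothetical wrapper, and this is one possible guard, not necessarily the change made in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: skip the call entirely when there is nothing to
// copy, because passing a null pointer to std::memcpy is undefined
// behavior even when the byte count is zero.
void CopyBytes(void* dst, const void* src, std::size_t count) {
  if (count == 0) return;
  assert(dst != nullptr && src != nullptr);  // required once count > 0
  std::memcpy(dst, src, count);
}
```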
zdenop
4ef709554b Update imagedata.cpp
stop PreScale if pixScale failed (fixes #3025)
2020-06-25 20:32:51 +02:00
Stefan Weil
62b085cb8d ScrollView: Remove C API callcpp.{cpp,h}
Use C++ class ScrollView directly instead of using an intermediate C API.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-22 09:14:26 +02:00
Stefan Weil
ea1f597fc1 Fix insecure call of tprintf
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-21 19:03:03 +02:00
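The message of ea1f597fc1 does not say what made the call insecure. A common cause with printf-style functions is passing non-literal text as the format string, and the sketch below assumes that case. It uses std::snprintf rather than tprintf, and `FormatSafely` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: the untrusted text is passed as an argument to a
// fixed "%s" format. Writing snprintf(buf, size, untrusted) instead would
// let any '%' sequences in the text be interpreted as conversions.
std::string FormatSafely(const char* untrusted) {
  assert(untrusted != nullptr);
  char buf[64];  // output is truncated to 63 characters in this sketch
  std::snprintf(buf, sizeof(buf), "%s", untrusted);
  return std::string(buf);
}
```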
Stefan Weil
bc61038dd4 SPLIT: Make function bounding_box inline for better performance
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-16 17:21:36 +02:00
Stefan Weil
0e7701bc3c SEAM: More inline functions for better performance
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-16 17:20:14 +02:00
Stefan Weil
e45100ebf7 TBOX: Use inline constructor for better performance
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-16 17:17:55 +02:00
Stefan Weil
c110958ffa Fix undefined shift with negative value (oss-fuzz issue 14658)
This fixes a bug reported by OSS Fuzz:
https://oss-fuzz.com/issue/5697280134348800

The old code passed a negative value (-1) as argument to step_dir
when destindex was 0.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-16 13:25:32 +02:00
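The failure mode described in c110958ffa can be sketched as follows, assuming step directions are packed as 2-bit values, four per byte, so that an index of -1 produces a negative shift count. The names `StepDirAt` and `PrevStepDir` and the wrap-around guard are hypothetical, not the code or the fix from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical accessor for 2-bit step directions packed four per byte.
// With index == -1 the shift count would be -2, which is undefined.
int StepDirAt(const std::vector<std::uint8_t>& steps, int index) {
  assert(index >= 0);  // a negative index must never reach the shift
  return (steps[index / 4] >> ((index % 4) * 2)) & 3;
}

// Direction of the step before `index` in a closed outline of `count`
// steps. Index 0 wraps to the last step instead of becoming -1.
int PrevStepDir(const std::vector<std::uint8_t>& steps, int index, int count) {
  assert(count > 0 && index >= 0 && index < count);
  const int prev = (index == 0) ? count - 1 : index - 1;
  return StepDirAt(steps, prev);
}
```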
Stefan Weil
6ee3698958 Remove old unused code from imagedata.h
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-14 16:02:27 +02:00
Stefan Weil
d8500adcf4 Fix crash caused by missing thread synchronization (issues #757, #1168 and #2191)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-06-14 15:53:17 +02:00
Stefan Weil
a06d0d8449 Add missing include statements for config_auto.h
They are required to get the macro DISABLED_LEGACY_ENGINE.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-05-22 16:34:28 +02:00
zdenop
b5d639dcc5 Merge pull request #2965 from robinwatts/pushback1
thanks.
2020-05-16 20:35:19 +02:00
zdenop
064b4403de Merge pull request #2966 from robinwatts/pushback2 2020-05-16 20:06:31 +02:00
Julian Gilbey
e7e6999d3b Move comment about swap meaning for DeSerialize to correct function 2020-05-13 07:02:59 +01:00
Robin Watts
80d4af6ecf Add a mechanism to avoid creating debug fonts.
If TESSERACT_DISABLE_DEBUG_FONTS is defined, tesseract doesn't
attempt to create any debug fonts. This not only saves memory,
but it (combined with the change to optionally use Pix as
internal storage for the ImageData) allows us to use an
embedded Leptonica library with no format handlers at all.
2020-05-05 00:22:23 +01:00
Robin Watts
6bcb941bcf Avoid tesseract writing Pix out/reading them back.
By default, when we call ImageData::SetPix, we write the data out as a
PNG, just to read it back in to get a compressed buffer of data.
We then use this to generate a new Pix.

In builds of Tesseract on systems where we don't have temp files,
writing files out is problematic.

Not only that, but compressing/uncompressing is slow, and on minimal
builds of leptonica, where we've disabled the format writers to reduce
memory footprint, we get no compression anyway.

In such cases, it'd be far nicer just to keep the original Pix as
the internal data.

Also, when recovering the pixmap from the ImageData, if we know we're
only going to read from the data, we can avoid duplicating it and
just use the original. This is exactly the case when GRAPHICS_DISABLED
is set.

So, introduce a TESSERACT_IMAGEDATA_AS_PIX predefine that we can use
to cause the internal data to be a Pix rather than a compressed
buffer.

Given we don't do compression, and they were writing to memory,
this was all just more effort than we needed.

Also, if we're using GRAPHICS_DISABLED, we might as well just
pixCopy rather than pixClone as only the scaler uses this.
2020-05-04 21:01:22 +01:00
Stefan Weil
6f2f310fdf Remove redundant method from class GenericVector
length() is not needed: it can be replaced by size().

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-01-18 11:30:14 +01:00
Stefan Weil
cfd39dc2c7 pageres: Fix compiler warnings
clang warnings:

    src/ccstruct/pageres.cpp:903:20: warning:
      implicit conversion from 'int' to 'float' changes value from
      2147483647 to 2147483648 [-Wimplicit-int-float-conversion]
    src/ccstruct/pageres.cpp:904:23:
      warning: implicit conversion from 'int' to 'float' changes value from
      -2147483647 to -2147483648 [-Wimplicit-int-float-conversion]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-01-04 09:46:10 +01:00
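The clang warnings quoted in cfd39dc2c7 arise because a 32-bit float cannot represent 2147483647, so the int limit is rounded to 2147483648 when converted. The sketch below shows one way to compare against the int range with exact float constants. `ClampToInt32` is a hypothetical helper and is not what pageres.cpp does.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: clamp a float to the int32_t range. The bounds are
// written as float literals that are exactly representable (2^31 and
// -2^31), so no implicit int-to-float rounding takes place.
std::int32_t ClampToInt32(float value) {
  assert(!std::isnan(value));  // NaN has no meaningful integer value
  constexpr float kUpper = 2147483648.0f;   // 2^31, one past INT32_MAX
  constexpr float kLower = -2147483648.0f;  // exactly INT32_MIN
  if (value >= kUpper) return std::numeric_limits<std::int32_t>::max();
  if (value <= kLower) return std::numeric_limits<std::int32_t>::min();
  return static_cast<std::int32_t>(value);  // in range, truncates toward zero
}
```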
Stefan Weil
fc84f84b5b Remove Emacs C modeline in comment line 1
Those files are C++, and the wrong modeline is not needed at all.
Also remove some empty descriptions and old history in the comments.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-12-05 13:57:50 +01:00