Commit Graph

4645 Commits

Author SHA1 Message Date
Robin Watts
db10c7b577 intsimdmatrixneon.cpp: Do biasing in SIMD. 2020-10-12 04:30:46 -07:00
Robin Watts
d1e49d6dd2 intsimdmatrixavx2: Do biasing in SIMD.
We also move to relying on both scales and output having been
padded to accommodate us writing more results than are actually
needed here. This was allowed for a few commits back.
2020-10-12 04:30:46 -07:00
Robin Watts
872816897a Rejig intsimdmatrix to reduce FP ops.
Avoid 1) floating point division by 127, 2) conversion of
bias to double, 3) FP addition, in favour of 1) integer
multiplication by 127, and 2) integer addition.

(Also costs extra work in the serialisation/deserialisation of
the scale values, and conversion of weights to int formats, but
these are all one-offs).
2020-10-12 04:30:46 -07:00
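
The arithmetic behind this change can be sketched as follows. This is a minimal illustration, not the code from the commit: the function names, the use of `double` for the scale, and the tolerance are assumptions. It shows that `(dot / 127 + bias) * scale` equals `(dot + bias * 127) * (scale / 127)`, so the division and the FP addition can be replaced by an integer multiply-add, provided the scale is divided by 127 once in advance.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Older formulation: FP division by 127, bias converted to double, FP add.
double OutputWithFpBias(int32_t dot, int8_t bias, double scale) {
  return (static_cast<double>(dot) / INT8_MAX + bias) * scale;
}

// Newer formulation: the bias is folded into the integer total as bias * 127.
// `prescaled` is assumed to be scale / 127, computed once (a one-off cost).
double OutputWithIntBias(int32_t dot, int8_t bias, double prescaled) {
  const int32_t total = dot + static_cast<int32_t>(bias) * INT8_MAX;
  return total * prescaled;
}

// Hypothetical self-check: both formulations agree up to rounding error.
void CheckEquivalence(int32_t dot, int8_t bias, double scale) {
  const double a = OutputWithFpBias(dot, bias, scale);
  const double b = OutputWithIntBias(dot, bias, scale / INT8_MAX);
  assert(std::fabs(a - b) <= 1e-9 * (1.0 + std::fabs(a)));
}
```

Calling `CheckEquivalence(1000, 5, 0.02)` compares both paths for one output; a SIMD kernel would apply the same integer multiply-add across a whole block of outputs.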
Robin Watts
aba1800f69 Round output buffers for intSimdMatrix.
In order to allow intSimdMatrix implementations to 'overwrite'
their outputs, ensure that the output buffers are always padded
to the next block size.

This doesn't make any difference yet, but it enables optimisations
further down the line, especially when the biasing is pulled into
the SIMD.
2020-10-12 11:47:16 +01:00
Robin Watts
9dfdac51c6 Tweak scales array for intSimdMatrix case.
Currently, the size of the scales array is not rounded up
in the same way as the weights are. This blocks us pushing
the scale calculations into the SIMD, as when we "overread"
the end of the scale array, we potentially get errors.

Here, we adjust the intSimdMatrix stuff to ensure that the
scales array reserves enough entries to allow such overreads
to work.

This doesn't make any difference for now, but opens the way
for future optimisations.
2020-10-12 11:47:16 +01:00
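
The padding described in this commit and the previous one can be sketched as below. The helper names and the zero fill value are assumptions for illustration, not taken from the commits; the idea is only that an array is sized up to the next multiple of the SIMD block size, so a kernel that processes a whole block at a time can read or write past the logical end safely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of block (hypothetical helper).
std::size_t RoundUpToBlock(std::size_t n, std::size_t block) {
  assert(block > 0);
  return (n + block - 1) / block * block;
}

// Copy the scales into a buffer whose size is a multiple of `block`.
// The extra entries are zero-filled here; that choice is an assumption.
std::vector<double> MakePaddedScales(const std::vector<double>& scales,
                                     std::size_t block) {
  std::vector<double> padded(RoundUpToBlock(scales.size(), block), 0.0);
  std::copy(scales.begin(), scales.end(), padded.begin());
  return padded;
}
```

With a block size of 8, a 10-entry scales array would occupy 16 entries, so a kernel loading 8 scales at a time never touches memory outside the allocation.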
Shatur95
5a377707e0 Generate imported target automatically 2020-10-12 11:47:16 +01:00
Shatur95
8dad1e24a2 Modernize CMake config files 2020-10-12 11:47:16 +01:00
amitdo
958f23453e Improve disabled legacy engine build 2020-10-12 11:47:16 +01:00
amitdo
06154e028b Improve disabled legacy engine build 2020-10-12 11:47:16 +01:00
amitdo
e81b485066 Improve disabled legacy engine build 2020-10-12 11:47:15 +01:00
amitdo
7df4918644 Improve disabled legacy engine build 2020-10-12 11:47:15 +01:00
Shatur95
ec8766ce74 Use DESTINATION instead of TYPE
For compatibility with older CMake.
2020-10-12 11:47:15 +01:00
Stefan Weil
ac14ab32c6 Remove dummy functions from globaloc.cpp and related code
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-04 12:24:26 +02:00
zdenop
0ded9f3573
Merge pull request #3113 from stweil/pango
Remove unused functions FontUtils::GetAllRenderableCharacters
2020-10-03 18:07:42 +02:00
Stefan Weil
7c4ef88dab Remove unused functions FontUtils::GetAllRenderableCharacters
They used the function pango_coverage_max which does nothing and
which has been deprecated since pango version 1.44.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-03 12:04:40 +02:00
Egor Pugin
45413e6c42
Merge pull request #3112 from Shatur95/fix-cmake-install-configs
Fix CMake install configs
2020-10-03 00:32:05 +03:00
Shatur95
72779fb185 Fix CMake install configs 2020-10-01 22:05:02 +03:00
Egor Pugin
b19e3ee63c
Update appveyor.yml 2020-09-10 15:40:07 +03:00
Egor Pugin
76ead638e3
Update sw.yml 2020-09-10 02:05:29 +03:00
zdenop
f5561c4c42
Merge pull request #3090 from nam-leduc/correct-debug-find-images
Correct "NoImages" in debug pdf file
2020-09-07 09:22:43 +02:00
Le Duc Nam
eb8f1674bf Correct "NoImages" in debug pdf file
Issue:
  The debug output for "NoImages" was just the binary image;
  it did not show the developer the result of photo_mask_pix.

Fix:
  Subtract photo_mask_pix from the binary image; the result
  is the "NoImages" binary pix.
2020-09-06 23:31:30 +07:00
Stefan Weil
162f3707e2
Merge pull request #3082 from bertsky/fix-line-detector
Fix separator line detector
2020-08-29 20:33:09 +02:00
Robert Sachunsky
640c14e080 AutoPageSeg/FindBlocks/GridRemoveUnderlinePartitions: avoid self-deletion
When checking horizontal line partitions for
possible interpretation as underline formatting,
avoid confusing the hline partition itself with
an overlapping neighbour (which would delete it).
2020-08-24 19:13:48 +02:00
Robert Sachunsky
65a077d3e9 FindAndRemoveLines/FindVerticalAlignment: decrease fixed vline min length
When detecting vertical separators, the blob aligner is used to glue
line segments (often segmented due to artificial cracks).
But (unlike LineFinder) it has many parameters that are not
relative to pixel density/resolution.
This change decreases the minimum absolute length in pixels
for vertical separators.
2020-08-24 19:13:36 +02:00
Robert Sachunsky
0228d93684 textord debugging: invert default top/bottom boundaries, improve description 2020-08-24 19:13:25 +02:00
Stefan Weil
d33edbc4b1
Merge pull request #3066 from robinwatts/pushback14
Remove unused char constant that causes a warning.
2020-07-17 15:55:51 +02:00
Robin Watts
578462109b Remove unused char constant that causes a warning.
The kDictWildcard is never actually used, so removing it makes
no difference. It causes warnings in MSVC builds as MSVC doesn't
know how to pack a unicode value into chars.
2020-07-17 14:22:37 +01:00
zdenop
749851d39d
Merge pull request #3065 from robinwatts/pushback13
Squash some warnings in MSVC build.
2020-07-16 14:41:43 +02:00
Robin Watts
150e2e54fe Squash some warnings in MSVC build.
In particular, "defined but not used" (caused by GRAPHICS_DISABLED),
double constants being truncated to floats, and implicit casts.
2020-07-16 10:08:40 +01:00
zdenop
7fa200bfb7
Merge pull request #3064 from robinwatts/pushback12
Fix Memory leak when using TESSERACT_IMAGEDATA_AS_PIX
2020-07-15 19:08:58 +02:00
Robin Watts
7f45b719d1 Fix Memory leak when using TESSERACT_IMAGEDATA_AS_PIX
If building with TESSERACT_IMAGEDATA_AS_PIX, then tesseract
doesn't compress/decompress images, but rather holds the
data as internal Pix structures. Unfortunately, I forgot to
make the ImageData destructor free these, so memory leaked
during use. Fixed here.
2020-07-15 12:35:35 +01:00
zdenop
135c8a49b5
Merge pull request #3061 from stweil/neon
Always use NEON by default for ARMv8
2020-07-11 09:11:54 +02:00
zdenop
875bd48bd5
Merge pull request #3058 from stweil/scrollview
Disable more code and data with GRAPHICS_DISABLED
2020-07-11 09:11:27 +02:00
Stefan Weil
548a832b98 Use strtok_s for MSVC in class SVNetwork
strtok_s can be used with MSVC as a replacement for strtok_r, so less
special handling is needed in the code and class SVNetwork can be
made smaller by removing member has_content.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-10 17:47:05 +02:00
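
One way the strtok_r / strtok_s substitution can look is sketched below. This is an illustration under assumptions, not the code from the commit: the macro mapping and the `SplitLines` helper are hypothetical. It relies on MSVC's `strtok_s` taking the same three arguments as POSIX `strtok_r` (the C11 Annex K `strtok_s` has a different, four-argument signature).

```cpp
#include <string.h>

#include <string>
#include <vector>

#ifdef _MSC_VER
// MSVC has no strtok_r; its strtok_s has the same (str, delim, context) form.
#define strtok_r strtok_s
#endif

// Split a mutable buffer into newline-separated tokens.
// The buffer is modified in place, and empty tokens are skipped.
std::vector<std::string> SplitLines(char* buffer) {
  std::vector<std::string> lines;
  char* save = nullptr;
  for (char* tok = strtok_r(buffer, "\n", &save); tok != nullptr;
       tok = strtok_r(nullptr, "\n", &save)) {
    lines.emplace_back(tok);
  }
  return lines;
}
```

Because the same call works on both platforms, the calling code needs no separate MSVC branch or extra bookkeeping state.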
Stefan Weil
636c37fa01
Merge pull request #3060 from edwinnyawoli/patch-1 2020-07-10 16:06:25 +02:00
Edwin Nyawoli
317495ecb8
Link 'traineddata' word to its documentation
This is to help make it clearer to users
2020-07-10 14:01:37 +00:00
Stefan Weil
2db2223b39 Always use NEON by default for ARMv8
Signed-off-by: Stefan Weil <stefan.weil@bib.uni-mannheim.de>
2020-07-10 15:27:09 +02:00
Edwin Nyawoli
1fb6c41e0f
Fix typo in README.md 2020-07-10 12:06:48 +00:00
Stefan Weil
cb3880fb15 Disable more code and data with GRAPHICS_DISABLED
Some runtime parameters which are only relevant with graphics enabled
are now removed from builds when graphics is disabled.

TableFinder::DisplayColSegmentGrid is never used, so remove it completely.

Builds with --disable-graphics significantly reduce the code size and avoid
some function calls which might be important for certain applications:

   text	   data	    bss	    dec	    hex	filename
3219230	  41136	  13920	3274286	 31f62e	.libs/libtesseract.so (--disable-graphics, old)
3211347	  40976	  13600	3265923	 31d583	.libs/libtesseract.so (--disable-graphics, new)
3360942	  43656	  15392	3419990	 342f56	.libs/libtesseract.so (default)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-09 11:23:33 +02:00
Stefan Weil
22e6c2e5a7 Fix division by 0.0 in BaselineRow::PerpDistanceFromBaseline
It was reported by oss-fuzz (issue 23962).

Add log output to find real images which trigger that issue.
Also avoid some conversions from float to double by always using float.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-08 17:59:02 +02:00
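
A guard of the kind described here can be sketched as follows. The `Vec2` type, the function name, and the choice to return 0 for a degenerate baseline are assumptions for illustration; only the idea (check the squared length before dividing, log the case, stay in float) comes from the commit message.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical 2D point/vector type, standing in for the real one.
struct Vec2 {
  float x;
  float y;
};

// Perpendicular distance from pt to the line through p1 and p2.
// All arithmetic stays in float to avoid float-to-double conversions.
float PerpDistanceFromLine(Vec2 p1, Vec2 p2, Vec2 pt) {
  const Vec2 dir{p2.x - p1.x, p2.y - p1.y};
  const Vec2 off{pt.x - p1.x, pt.y - p1.y};
  const float cross = dir.x * off.y - dir.y * off.x;
  const float sqlength = dir.x * dir.x + dir.y * dir.y;
  if (sqlength == 0.0f) {
    // p1 == p2: log the case instead of dividing by zero.
    std::fprintf(stderr, "unexpected zero-length baseline vector\n");
    return 0.0f;  // assumed fallback value
  }
  return std::sqrt(cross * cross / sqlength);
}
```

For the line from (0, 0) to (10, 0) and the point (3, 4), the function returns 4; when both line points coincide it logs a message and returns 0 rather than producing NaN or infinity.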
zdenop
b67736cd6b
Merge pull request #3055 from stweil/string
Use const char* for filename parameters
2020-07-07 18:15:04 +02:00
Stefan Weil
8137cf35a6 Use const char* for filename parameters
This replaces the proprietary STRING data type
(801 instead of 838 lines remaining).

It also removes STRING from osdetect.h and serialis.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-07 14:20:09 +02:00
Stefan Weil
d01b2e43b8 unittest: Update comments in normstrngs_test.cc
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-07 11:29:48 +02:00
zdenop
36985fcc03
Merge pull request #3052 from stweil/msvc
Fix cmake build for MSVC
2020-07-03 21:09:52 +02:00
Stefan Weil
0e79daed42 Fix cmake build for MSVC
MSVC does not support /arch:FMA or /arch:SSE4.1.
For /arch:AVX and /arch:AVX2 no check is needed because they have been supported for a long time.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-07-01 21:54:51 +02:00
Stefan Weil
e910b3c20b
Merge pull request #3050 from zdenop/cmake_AVX
Update cmake builds to set right flags for AVX, ...
2020-07-01 08:52:20 +02:00
zdenop
511189b069
Update CMakeLists.txt
thanks.

Co-authored-by: Stefan Weil <sw@weilnetz.de>
2020-07-01 08:43:55 +02:00
zdenop
2538989ef5 cmake: NEON build is not supported on Mac OS X 2020-07-01 00:12:10 +02:00
zdenop
3c3e7b913f cmake: check compiler flags for AVX,AVX2,FMA,SSE4.1 support 2020-06-30 23:09:36 +02:00
zdenop
33f1e1371b cmake: eliminate OptimizeForArchitecture 2020-06-30 22:35:05 +02:00