softana
bb832d491e
Update Dockerfile
...
Change double hyphen "--" to single hyphen "-" to prevent build errors:
Fix invalid option no-ri-no-rdoc
> ERROR: While executing gem ... (OptionParser::InvalidOption) invalid option: --no-ri
2020-12-07 11:33:09 -06:00
Stefan Weil
43e13ea6f4
Merge pull request #3171 from stweil/lsan
...
Suppress some LeakSanitizer errors in unit tests
2020-12-05 10:20:54 +01:00
Merlijn Wajer
5ff273675c
tesseract.1.asc: sync with languages available in tessdata-fast
...
cos, div, fao, fyr, gla, hye are available in Ubuntu's 'tesseract-ocr-*'
packages but not mentioned in the manpage.
2020-12-04 18:16:45 +01:00
Stefan Weil
b303dd6ac2
Add more patterns to suppress memory leaks from libfontconfig
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-04 13:30:58 +01:00
Stefan Weil
490bd3ec8f
Fix build with enabled TensorFlow
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-04 10:56:23 +01:00
Stefan Weil
5eb5e6ea23
Suppress some LeakSanitizer errors in unit tests
...
The fontconfig library has some (intentional) memory leaks which
must be suppressed for unit tests with the LeakSanitizer.
This fixes issues #3156 and #3157.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-04 07:25:49 +01:00
Stefan Weil
ac116d1b28
Fix regression in Network::Serialize (fix issue #3167)
...
The regression was caused by a wrong string serialization in
commit 4613738a5e.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-12-03 19:36:58 +01:00
Stefan Weil
69ed480a9a
Merge pull request #3165 from MerlijnWajer/master
...
Remove references to "kur" and "tgl", add "fil" to man page
2020-12-03 13:42:55 +01:00
Merlijn Wajer
58f7a72f00
Remove references to "kur" and "tgl", add "fil" to man page
...
"kur" no longer exists. It might now be named "kur_ara" (the old "kur_ara"
is now "kmr", which is actually Latin), but "kur" is present in neither
tessdata_fast nor tessdata_best. [1] [2]
"tgl" (Tagalog) is now named "fil" (Filipino). [3]
[1] https://github.com/tesseract-ocr/langdata/issues/124
[2] https://github.com/tesseract-ocr/tessdata_best/issues/23
[3] https://github.com/tesseract-ocr/langdata/issues/84
2020-12-01 23:43:50 +01:00
zdenop
a06c61cc90
Merge pull request #3128 from acoder77/patch-1
...
Create .gitattributes for cross os contributors
2020-11-27 18:27:26 +01:00
zdenop
279b0b2e37
Merge pull request #3160 from stweil/string2
...
Replace more occurrences of STRING by std::string or char*
2020-11-27 18:24:17 +01:00
zdenop
6bc42464af
Merge pull request #3159 from stweil/pack
...
Pack BlamerBundle, CLASS_STRUCT and SVMenuNode
2020-11-27 18:23:14 +01:00
Stefan Weil
65b11a1e12
Pack class SVMenuNode
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:17:27 +01:00
Stefan Weil
a1849bc65c
Pack struct CLASS_STRUCT
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:17:27 +01:00
Stefan Weil
0bb46ac2e0
Pack struct BlamerBundle
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:17:27 +01:00
Stefan Weil
bf3774cc91
Use more const char*
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:01:17 +01:00
Stefan Weil
4613738a5e
Use const char* for filename and network_spec parameters
...
This replaces the proprietary STRING data type
(764 instead of 838 lines remaining).
It also removes STRING from osdetect.h and serialis.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-26 17:01:17 +01:00
Amit D
4c35f51a5c
Merge pull request #3158 from Shreeshrii/master
...
fixes issue #3099
2020-11-25 03:48:42 +02:00
Shree Devi Kumar
31710098e3
fixes issue 3099
2020-11-23 13:30:26 +00:00
Egor Pugin
dea08c34f8
Merge pull request #3155 from Shatur95/fix-cmake-targets-path
...
Fix CMake targets path
2020-11-18 04:26:10 +03:00
Shatur95
80147735db
Fix CMake targets path
2020-11-18 02:01:55 +02:00
zdenop
e20ffdd719
Merge pull request #3153 from stweil/scale
...
Remove GenericVector::scale() again and replace more STRING by std::string
2020-11-12 20:01:46 +01:00
Stefan Weil
fbc4c809d9
Replace STRING by std::string
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-31 14:08:39 +01:00
Stefan Weil
92b6c652f3
Use std::vector for scales_
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-29 08:00:11 +01:00
Stefan Weil
c15dd26b84
Don't pass scales_ to IntSimdMatrix::Init
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-28 20:35:53 +01:00
Stefan Weil
fe76142a3d
Remove GenericVector::scale() again
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-28 16:24:59 +01:00
zdenop
5761880676
Merge pull request #3141 from stweil/invert
...
Modify OCR for inverted text
2020-10-27 08:57:21 +01:00
Stefan Weil
eaf72ace31
Prefer result from inverted image if the mean confidence is better
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-26 20:37:47 +01:00
Stefan Weil
cfb1fb2540
Try OCR on inverted line only if mean confidence is below 50 %
...
The old code looked at the minimum confidence, which very often
triggered a second OCR pass without improving the result.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-10-26 09:32:09 +01:00
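To illustrate why commit cfb1fb2540 switches from the minimum to the mean confidence, here is a minimal C++ sketch. It is not Tesseract's code: the function names, the per-word confidence vector, and the use of the same 50 % threshold for the old minimum-based check are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: mean of per-word confidences in percent (0..100).
// Returns 0.0 for an empty line so that callers need no special case.
double MeanConfidence(const std::vector<double>& conf) {
  if (conf.empty()) return 0.0;
  return std::accumulate(conf.begin(), conf.end(), 0.0) /
         static_cast<double>(conf.size());
}

// Assumed shape of the old trigger: a single weak word forces a retry.
bool RetryOnMinimum(const std::vector<double>& conf, double threshold = 50.0) {
  return !conf.empty() &&
         *std::min_element(conf.begin(), conf.end()) < threshold;
}

// Assumed shape of the new trigger: retry only if the line as a whole is weak.
bool RetryOnMean(const std::vector<double>& conf, double threshold = 50.0) {
  return !conf.empty() && MeanConfidence(conf) < threshold;
}
```

With confidences such as {95, 90, 40, 92}, the minimum-based check asks for a second pass because of one weak word, while the mean-based check does not.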
zdenop
11297c983e
Merge pull request #3130 from robinwatts/pushback15
...
Tweak SIMDDetect for ANDROID Neon.
2020-10-19 18:21:00 +02:00
Robin Watts
436008bd37
Tweak SIMDDetect for ANDROID Neon.
...
cpufeatures.h should be cpu-features.h, with the latest NDK
at least. The #if 0'd section is not required because armv8
always includes NEON.
2020-10-19 12:04:29 +01:00
zdenop
514a7893f4
Merge pull request #2994 from robinwatts/pushback11
...
Improve speed of tesseract by optimising for intSimdMatrix case
2020-10-17 17:19:49 +02:00
acoder77
ac661414b5
Create .gitattributes for cross os contributors
...
With this set, Windows users will have text files converted from Windows style line endings (\r\n) to Unix style line endings (\n) when they’re added to the repository.
https://www.edwardthomson.com/blog/git_for_windows_line_endings.html
2020-10-17 11:23:42 +05:30
Robin Watts
db10c7b577
intsimdmatrixneon.cpp: Do biasing in SIMD.
2020-10-12 04:30:46 -07:00
Robin Watts
d1e49d6dd2
intsimdmatrixavx2: Do biasing in SIMD.
...
We also move to relying on both scales and output having been
padded to accommodate us writing more results than are actually
needed here. This was allowed for a few commits back.
2020-10-12 04:30:46 -07:00
Robin Watts
872816897a
Rejig intsimdmatrix to reduce FP ops.
...
Avoid 1) floating point division by 127, 2) conversion of
bias to double, 3) FP addition, in favour of 1) integer
multiplication by 127, and 2) integer addition.
(This also costs extra work in the serialisation/deserialisation of
the scale values, and in the conversion of weights to int formats, but
these are all one-offs.)
2020-10-12 04:30:46 -07:00
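The arithmetic behind commit 872816897a can be sketched as the identity (t / 127 + b) * s = (t + 127 * b) * (s / 127). The C++ below is only an illustration of that identity, not the project's implementation: the function names and parameter types are hypothetical, and it assumes the factor s / 127 is precomputed once, which would match the one-off cost the commit message mentions.

```cpp
#include <cmath>
#include <cstdint>

// Floating-point form: an FP division by 127, a conversion of the bias
// to double, and an FP addition for every output value.
double OutputFloatingPoint(int32_t dot, int8_t bias, double scale) {
  return (static_cast<double>(dot) / 127.0 + static_cast<double>(bias)) * scale;
}

// Integer form: the bias is multiplied by 127 and added as integers.
// scale_over_127 is assumed to be scale / 127, computed once in advance.
// Assumes dot + 127 * bias does not overflow int32_t.
double OutputInteger(int32_t dot, int8_t bias, double scale_over_127) {
  const int32_t biased = dot + 127 * static_cast<int32_t>(bias);
  return static_cast<double>(biased) * scale_over_127;
}
```

Both forms agree up to floating-point rounding, so the per-output division and addition can move out of the inner loop.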
Robin Watts
aba1800f69
Round output buffers for intSimdMatrix.
...
In order to allow intSimdMatrix implementations to 'overwrite'
their outputs, ensure that the output buffers are always padded
to the next block size.
This doesn't make any difference yet, but it enables optimisations
further down the line, especially when the biasing is pulled into
the SIMD.
2020-10-12 11:47:16 +01:00
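The padding described in commit aba1800f69 amounts to rounding a buffer size up to the next multiple of the block size. A minimal C++ sketch follows; the helper names and the block size passed in are hypothetical and not taken from the Tesseract sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of block (block must be non-zero).
std::size_t RoundUpToBlock(std::size_t n, std::size_t block) {
  assert(block > 0);
  return (n + block - 1) / block * block;
}

// Allocate an output buffer with room for whole blocks, so a SIMD kernel
// may write a full block even when fewer results are actually needed.
std::vector<double> MakePaddedOutput(std::size_t num_out, std::size_t block) {
  return std::vector<double>(RoundUpToBlock(num_out, block), 0.0);
}
```

A caller would still read only the first num_out entries; the extra entries exist purely so that whole-block stores stay inside the allocation.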
Robin Watts
9dfdac51c6
Tweak scales array for intSimdMatrix case.
...
Currently, the size of the scales array is not rounded up
in the same way as the weights are. This blocks us pushing
the scale calculations into the SIMD, as when we "overread"
the end of the scale array, we potentially get errors.
Here, we adjust the intSimdMatrix stuff to ensure that the
scales array reserves enough entries to allow such overreads
to work.
This doesn't make any difference for now, but opens the way
for future optimisations.
2020-10-12 11:47:16 +01:00
Shatur95
5a377707e0
Generate imported target automatically
2020-10-12 11:47:16 +01:00
Shatur95
8dad1e24a2
Modernize CMake config files
2020-10-12 11:47:16 +01:00
amitdo
958f23453e
Improve disabled legacy engine build
2020-10-12 11:47:16 +01:00
amitdo
06154e028b
Improve disabled legacy engine build
2020-10-12 11:47:16 +01:00
amitdo
e81b485066
Improve disabled legacy engine build
2020-10-12 11:47:15 +01:00
amitdo
7df4918644
Improve disabled legacy engine build
2020-10-12 11:47:15 +01:00
Shatur95
ec8766ce74
Use DESTINATION instead of TYPE
...
For compatibility with older CMake.
2020-10-12 11:47:15 +01:00
zdenop
ec01b51a0f
Merge pull request #3119 from Shatur95/modernize-cmake-config
...
Modernize CMake Config files
2020-10-10 12:52:53 +02:00
zdenop
e5d6e90440
Merge pull request #3120 from amitdo/legacy
...
Improve disabled legacy engine build
2020-10-10 11:06:46 +02:00
amitdo
b378ebff2e
Improve disabled legacy engine build
2020-10-10 04:49:52 +03:00
amitdo
50ca49a917
Improve disabled legacy engine build
2020-10-10 02:53:38 +03:00
amitdo
f4744de78b
Improve disabled legacy engine build
2020-10-10 02:20:51 +03:00