Commit Graph

1742 Commits

Author SHA1 Message Date
Alexander Alekhin
d2cacac07a Merge pull request #15573 from alalek:build_cxx11_warnings 2019-09-24 22:08:55 +00:00
Wenzhao Xiang
c2096771cb Merge pull request #15371 from Wenzhao-Xiang:gsoc_2019
[GSoC 2019] Improve the performance of JavaScript version of OpenCV (OpenCV.js)

* [GSoC 2019]

Improve the performance of JavaScript version of OpenCV (OpenCV.js):
1. Create the base of OpenCV.js performance test:
     This perf test is based on benchmark.js(https://benchmarkjs.com). And first add `cvtColor`, `Resize`, `Threshold` into it.
2. Optimize the OpenCV.js performance by WASM threads:
     This optimization is based on Web Worker API and SharedArrayBuffer, so it can be only used in browser.
3. Optimize the OpenCV.js performance by WASM SIMD:
     Add WASM SIMD backend for OpenCV Universal Intrinsics. It's experimental as WASM SIMD is still in development.

* [GSoC2019] 

1. use short license header
2. fix documentation node issue
3. remove the unused `hasSIMD128()` api

* [GSoC2019]

1. fix emscripten define
2. use fallback function for f16

* [GSoC2019]

Fix rebase issue
2019-09-24 16:30:42 +03:00
Alexander Alekhin
3cf9185159 Merge pull request #15538 from terfendail:wui_checkany 2019-09-23 15:52:24 +00:00
Maksim Shabunin
c8abf2ad14 backport: fixed warnings produced by clang-9.0.0
ea3dc78986
83fc27cb99
2019-09-23 18:36:18 +03:00
mipsopen-fwu
b1ea91d8bd Merge pull request #15422 from mipsopen-fwu:msa-dev
* Added MSA implementations for mips platforms. Intrinsics for MSA and build scripts for MIPS platforms are added.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* Removed some unused code in mips.toolchain.cmake.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* Added comments for mips toolchain configuration and disabled compiling warnings for libpng.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* Fixed the build error of unsupported opcode 'pause' when mips isa_rev is less than 2.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* 1. Removed FP16 related item in MSA option defines in OpenCVCompilerOptimizations.cmake.
2. Use CV_CPU_COMPILE_MSA instead of __mips_msa for MSA feature check in cv_cpu_dispatch.h.
3. Removed hasSIMD128() in intrin_msa.hpp.
4. Define CPU_MSA as 150.
Signed-off-by: Fei Wu <fwu@wavecomp.com>

* 1. Removed unnecessary CV_SIMD128_64F guarding in intrin_msa.hpp.
2. Removed unnecessary CV_MSA related code block in dotProd_8u().

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* 1. Defined CPU_MSA_FLAGS_ON as "-mmsa".
2. Removed CV_SIMD128_64F guardings in intrin_msa.hpp.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* Removed unused msa_mlal_u16() and msa_mlal_s16 from msa_macros.h.

Signed-off-by: Fei Wu <fwu@wavecomp.com>
2019-09-20 19:52:48 +03:00
Vitaly Tuzov
66842f5a18 Extended v_check_any/v_check_all universal intrinsics to support 64-bit integer 2019-09-19 18:31:31 +03:00
Paul E. Murphy
b465c82696 core: workaround old gcc vec_mul{e,o} (Issue #15506)
ISA 2.07 (aka POWER8) effectively extended the expanding multiply
operation to word types. The altivec intrinsics prior to gcc 8 did
not get the update.

Workaround this deficiency similar to other fixes.

This was exposed by commit 33fb253a66
which leverages the int -> dword expanding multiply.

This fixes Issue #15506
2019-09-12 09:54:02 -05:00
Alexander Alekhin
9ef5373776 Merge pull request #15435 from alalek:update_version_3.4.8-pre 2019-09-03 12:04:23 +00:00
Alexander Alekhin
abd7d63b74 Merge pull request #15424 from mshabunin:add-cmake-docs 2019-09-03 10:50:45 +00:00
Alexander Alekhin
0fda243a05 pre: OpenCV 3.4.8 (version++) 2019-09-02 14:20:49 +03:00
Alexander Alekhin
048ddbf9ee Merge pull request #15339 from pmur:dotprod-32s-vsx 2019-08-31 11:16:04 +00:00
Maksim Shabunin
f3aab47f94 Assorted documentation fixes
* removed private flann documentation
* common tutorial images moved to doc/images
* grouping issues
2019-08-31 01:50:11 +03:00
Alexander Alekhin
f224d740a3 Merge pull request #15414 from kuzi117:instr 2019-08-30 12:03:19 +00:00
Braedy Kuzma
9bf8b496d6 Use commonly supported instruction mnemonic. 2019-08-29 10:00:40 -06:00
Braedy Kuzma
d4120dd2fe Disambiguate vecpopcnt for (u)dword2. 2019-08-29 09:54:56 -06:00
Alexander Alekhin
ca7640e10f Merge pull request #15401 from ChipKerchner:vectorReduceInt8Bug 2019-08-27 19:59:39 +00:00
ChipKerchner
70b883cfeb Fix macro bug with v_reduce_min and v_reduce_max for chars in VSX 2019-08-27 11:38:53 -05:00
Vitaly Tuzov
1b40528e1a Fix for AVX2 implementation of v_check_any(), v_check_all() intrinsics 2019-08-27 14:31:23 +03:00
Alexander Alekhin
d7409604b5 core: handle empty Mat in Mat_ assignment operators 2019-08-23 16:54:24 +03:00
Alexander Alekhin
8a0b93bc4d core: update fastmath.hpp 2019-08-22 16:43:07 +03:00
Zyrin
869ea22f34 Use std::move in Mat_<T> move constructors 2019-08-21 11:12:00 +02:00
Zyrin
8ef8088686 Fix stack overflow on gcc with c++17 (#15343) 2019-08-21 10:57:03 +02:00
Paul E. Murphy
33fb253a66 core: vectorize dotProd_32s
Use 4x FMA chains to sum on SIMD 128 FP64 targets. On
x86 this showed about 1.4x improvement.

For PPC, do a full multiply (32x32->64b), convert to DP
then accumulate. This may be slightly less precise for
some inputs. But is 1.5x faster than the above which
is about 1.5x than the FMA above for ~2.5x speedup.
2019-08-20 15:28:36 -05:00
luz.paz
fcc7d8dd4e Fix modules/ typos
Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint`

backporting of commit: ec43292e1e
2019-08-16 17:34:29 +03:00
Alexander Alekhin
13ecd5bb25 Merge pull request #15122 from pmur:fast-math-improvements 2019-08-14 19:28:05 +00:00
Alexander Alekhin
32772a5436 3.4: backported changes from 'master' branch 2019-08-14 16:36:08 +03:00
Paul E. Murphy
f38a61c66d fast_math: implement optimized PPC routines
Implement cvRound using inline asm. No compiler support
exists today to properly optimize this. This results in
about a 4x speedup over the default rounding. Likewise,
simplify the growing number of rounding function overloads.

For P9 enabled targets, utilize the classification
testing instruction to test for Inf/Nan values. Operation
speedup is about 1.2x for FP32, and 1.5x for FP64 operands.

For P8 targets, fallback to the GCC nan inline. It provides
a 1.1/1.4x improvement for FP32/FP64 arguments.
2019-08-07 15:01:18 -05:00
Paul E. Murphy
3f92bcc11a fast_math: selectively use GCC rounding builtins when available
Add a new macro definition OPENCV_USE_FASTMATH_GCC_BUILTINS to enable
usage of GCC inline math functions, if available and requested by the
user.

Likewise, enable it for POWER. This is nearly always a substantial
improvement over using integer manipulation as most operations can
be done in several instructions with no branching. The result is a
1.5-1.8x speedup in the ceil/floor operations.

1. As tested with AT 12.0-1 (GCC 8.3.1) compiler on P9 LE.
2019-08-07 15:01:18 -05:00
Alexander Alekhin
821f17d666 Merge pull request #15235 from pmur:vsx-v_signmask-vbpermq 2019-08-06 20:09:22 +00:00
Paul E. Murphy
1031b7f4bc hal: vsx: further optimize v_signmask
Use the quadword bit permutation instruction to creatively move
the sign bits to create the mask. Note that values above 127 will
result in 0.
2019-08-05 09:00:22 -05:00
Alexander Alekhin
2693ed9b22 Merge tag '3.4.7' 2019-07-25 19:19:49 +00:00
Alexander Alekhin
4a7ca5a291 OpenCV version++ (3.4.7)
OpenCV 3.4.7
2019-07-25 19:01:19 +00:00
Alexander Alekhin
6158bd2afa Merge pull request #15103 from alalek:simd_intrinsics_in_user_code 2019-07-25 11:36:36 +00:00
Hugo Lindström
2ee00e7f7d Merge pull request #15059 from hugolm84:improved-support-for-wince
* Improve support for Windows Embedded Compact

* Remove redundant set(WINCE true) and format CMake
2019-07-24 23:12:09 +03:00
Alexander Alekhin
8bac8b513c core: support SIMD intrinsics in user code 2019-07-19 20:33:32 +00:00
Alexander Alekhin
002904e445 Merge pull request #15050 from alalek:core_fix_base64_packed_struct 2019-07-18 19:07:06 +00:00
Alexander Alekhin
4ea8526e9f core(persistence): fix writeRaw() / readRaw() struct support
- writeRaw(): support structs
- readRaw(): 'len' is buffer limit in bytes (documentation is fixed)
2019-07-16 14:03:39 +03:00
Hugo Lindström
245c256b1c Support compiliation for <=VS13 2019-07-12 19:02:36 +02:00
Alexander Alekhin
69560588fe Merge pull request #14953 from alalek:core_static_analysis_eval_expr 2019-07-02 09:44:29 +00:00
Vitaly Tuzov
9befb7a1d7 Merge pull request #14916 from terfendail:wsignmask_deprecated
* Avoid using v_signmask universal intrinsic and mark it as deprecated

* Renamed v_find_negative to v_scan_forward
2019-07-01 19:53:51 +03:00
Alexander Alekhin
44836c7f78 core: evaluate CV_Error() parameters during static scans 2019-07-01 18:17:03 +03:00
Stefan Brüns
e9a2e665b2 Explicitly default operator= for Vec<T, n>
Due to the explicitly declared copy constructor Vec<T, n>::Vec(Vec <T,n>&)
GCC 9 warns if there is no assignment operator, as having one typically
requires the other (rule-of-three, constructor/desctructor/assginment).

As the values are just a plain array the default assignment operator does
the right thing. Tell the compiler explicitly to default it.

Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
2019-06-29 22:11:00 +02:00
Rostislav Vasilikhin
f2f600f807 fixed multi instrumentations 2019-06-27 01:17:26 +03:00
Alexander Alekhin
e8a703a71d core(intrin): v_load_low() workaround for aarch64+clang 2019-06-25 17:29:04 +03:00
Alexander Alekhin
779f59da6b pre: OpenCV 3.4.7 (version++) 2019-06-21 16:57:17 +03:00
Alexander Alekhin
aa6c66aa54 Merge pull request #14848 from alalek:build_warnings_avx512 2019-06-21 13:53:52 +00:00
Alexander Alekhin
5ac55fc132 core: eliminate AVX512 build warnings
from MSVS2017 and GCC8 -O1 mode
2019-06-20 20:00:09 +03:00
Alexander Alekhin
681e0323f2 core: backport toLowerCase()/toUpperCase() 2019-06-20 17:48:18 +03:00
Vitaly Tuzov
a29e59a770 Rename parameters in AVX512 implementation of v_load_deinterleave and v_store_interleave 2019-06-14 14:16:30 +03:00
Vitaly Tuzov
d2aadabc5e Merge pull request #14743 from terfendail:wui512_fixvswarn
Fix for MSVS2019 build warnings (#14743)

* AVX512 arch support for MSVS

* Fix for MSVS2019 build warnings: updated integral() AVX512 implementation

* Fix for MSVS2019 build warnings: reworked v_rotate_right AVX512 implementation

* fix indentation
2019-06-11 23:07:39 +03:00