4527 Commits

Author SHA1 Message Date
Alexander Alekhin
43940f7ffc pre: OpenCV 3.4.15 (version++) 2021-06-07 20:10:34 +00:00
Developer-Ecosystem-Engineering
814550d2a6
Merge pull request #20011 from Developer-Ecosystem-Engineering:3.4
Improve performance on Arm64

* Improve performance on Apple silicon

This patch will
- Enable dot product intrinsics for macOS arm64 builds
- Enable for macOS arm64 builds
- Improve HAL primitives
  - reduction (sum, min, max, sad)
  - signmask
  - mul_expand
  - check_any / check_all

Results on a M1 Macbook Pro

* Updates to #20011 based on feedback

  - Removes Apple Silicon specific workarounds
  - Makes #ifdef sections smaller for v_mul_expand cases
  - Moves dot product optimization to compiler optimization check
  - Adds 4x4 matrix transpose optimization

* Remove dotprod and fix v_transpose

Based on the latest, we've removed dotprod entirely and will revisit in a future PR.

Added explicit casts with v_transpose4x4()

This should resolve all opens with this PR

* Remove commented out lines

Remove two extraneous comments
2021-06-01 09:39:55 +03:00
Alexander Alekhin
450dc92452 Merge pull request #20172 from alalek:fixup_19334 2021-05-28 14:09:52 +00:00
Alexander Alekhin
3d394943e6 core(ocl): avoid limit of Image kernel args 2021-05-28 00:43:59 +00:00
berak
302c2354a3 core: add missing implementation for Mat::ptr(Vec) 2021-05-09 14:15:12 +02:00
Alexander Alekhin
f78cebfc98 Merge pull request #20014 from alalek:fix_core_tls_process_termination 2021-04-30 16:06:40 +00:00
Alexander Alekhin
d2a9ca13f1 core(tls): handle process termination / cleanup issues 2021-04-29 23:25:44 +00:00
Zhuo Zhang
bf26050f7e
Fix missing return type for unsafe CV_XADD function 2021-04-26 20:08:45 +08:00
Alexander Alekhin
0649a2fbdb Merge pull request #19886 from alalek:issue_19875 2021-04-14 16:14:44 +00:00
Alexander Alekhin
63ba9970bd Merge pull request #19851 from sturkmen72:update_documentation 2021-04-11 21:44:03 +00:00
Alexander Alekhin
222af8e7e4 core: avoid process cleanup deadlock if TlsStorage is not used 2021-04-09 16:08:08 +00:00
Suleyman TURKMEN
ec8b7c933a Update Documentation 2021-04-08 22:29:45 +03:00
Alexander Alekhin
3a8154051f Merge pull request #19810 from aarongreig:aaron/core/relaxClArithmTest 2021-04-06 19:56:46 +00:00
Aaron Greig
f3f46096d6 Relax accuracy requirements in the OpenCL sqrt perf arithmetic test.
Also bring perf_imgproc CornerMinEigenVal accuracy requirements in line with
the test_imgproc accuracy requirements on that test and fix indentation on
the latter.

Partially addresses issue #9821
2021-04-06 17:32:48 +01:00
Alexander Alekhin
2cf1a13755 Merge tag '3.4.14' 2021-04-02 09:31:32 +00:00
Alexander Alekhin
d0e3e638c3 release: OpenCV 3.4.14 2021-04-01 21:37:19 +00:00
Alexander Alekhin
b26f5b9468 core: backward compatibility for vx_store/vx_store_aligned calls 2021-04-01 02:17:47 +00:00
Alexander Alekhin
908957317f Merge pull request #19813 from alalek:issue_19506 2021-03-31 22:57:50 +00:00
Alexander Alekhin
f82303d614 Merge pull request #19811 from alalek:issue_19599 2021-03-31 22:56:48 +00:00
Alexander Alekhin
8069a6b4f8 core(IPP): disable some ippsMagnitude_32f calls 2021-03-31 13:38:57 +00:00
Alexander Alekhin
a2a92999be core(arithm_op): workaround problem with scalars handling 2021-03-31 10:35:52 +00:00
Vitaly Tuzov
aab62aa6dd
Merge pull request #18952 from terfendail:wui_doc
* Updated UI documentation to address WUI

* Added documentation for vx_ calls

* Removed vx_store operation overload

* Doxyfile updated to enable wide UI

* Enable doxygen documentation for vx_ WUI functions

* Wide intrinsics definition rework

* core: fix SIMD C++ emulator build (supports 128-bit only)
2021-03-30 16:18:03 +00:00
Alexander Alekhin
6e8022a3af Merge pull request #19773 from jondea:add-aarch64-specialised-v_expand-3.4 2021-03-26 16:54:51 +00:00
Mikhail Nikolskii
bf9f67e93f
Merge pull request #19783 from mikhail-nikolskiy:interop-perf
Performance optimization in DirectX and VAAPI interop

* optimization in OpenCL NV12<>BGR kernels

* reduce kernel work-size
2021-03-25 21:27:31 +00:00
Jonathan Deakin
29a289dfa1 Add v_expand for AArch64, fuse vmovl+vget_high into vmovl_high 2021-03-23 15:06:41 +00:00
Alexander Alekhin
a97f6f8058 js: support setLogLevel() / getLogLevel() calls 2021-03-20 18:14:10 +00:00
Alexander Alekhin
7a8e171691 Merge pull request #19720 from alalek:ocl_test_skip_spir_amd 2021-03-13 12:48:20 +00:00
Alexander Alekhin
7ca9740da5 Merge pull request #19718 from alalek:backport_19683 2021-03-13 12:46:24 +00:00
Alexander Alekhin
87e607a19b core(ocl): skip SPIR test on AMD devices if problem detected 2021-03-13 06:12:52 +00:00
Dale Phurrough
cbe236652b noexcept def construct Mat, UMat, Mat_, MatSize, MatStep
original commit: 1b0f781b7c
2021-03-12 20:26:32 +00:00
Dan Ben Yosef
d4d805cb3e Avoiding copy by passing param by reference
It is best to pass bad_value_ param by reference to avoid copy.
2021-03-12 14:17:11 -05:00
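
To illustrate the copy that the commit above avoids, here is a minimal sketch. `CopyCounter`, `checkByValue` and `checkByRef` are hypothetical names made up for illustration and are not OpenCV code; only the parameter name `bad_value_` comes from the commit message.

```cpp
#include <cassert>

// Hypothetical type that counts how many times it is copy-constructed.
struct CopyCounter {
    static int copies;
    int value = 0;
    CopyCounter() = default;
    CopyCounter(const CopyCounter& other) : value(other.value) { ++copies; }
};
int CopyCounter::copies = 0;

// Pass by value: the argument is copied on every call with an lvalue.
inline int checkByValue(CopyCounter bad_value_) { return bad_value_.value; }

// Pass by const reference: the callee reads the caller's object, no copy.
inline int checkByRef(const CopyCounter& bad_value_) { return bad_value_.value; }
```

Calling `checkByValue(c)` increments `CopyCounter::copies`, while `checkByRef(c)` leaves it unchanged. For cheap types such as `int` the difference is negligible; it matters for objects that own memory.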
Sayed Adel
f8181fbef8 core:ppc64 fix detecting CPU features when optimization is off 2021-03-12 02:02:31 +00:00
Sayed Adel
84fcc4ab9b core:ppc64 fix the build with the newer versions of Eigen on IBM/Power
It also fixes the build when universal intrinsics are disabled
   via `-DCV_ENABLE_INTRINSICS=OFF`.
2021-03-09 19:20:18 +02:00
Vitaly Tuzov
04a9ff88d8
Merge pull request #19622 from terfendail:ref_doc
* Updated cpp reference implementations for a few intrinsics to address wide universal intrinsics as well

* Updated cpp reference implementations for a few more universal intrinsics
2021-03-06 17:22:21 +00:00
Mradul Agrawal
640f188ca2
Merge pull request #19583 from theroyalpekka:patch-1
* Update polynom_solver.cpp

This pull request responds to issue #19526. It fixes the cube root calculation of 2*R. The problem was calling the pow function with negative values of R; computing the root for positive values only and then adjusting x0 according to the sign of R resolves the issue. Kindly consider it, thanks!

* add cv::cubeRoot(double)

Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2021-03-05 13:55:52 +00:00
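
The problem described in the pull request above can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: `signedCubeRoot` is a hypothetical helper name, and the code uses only the C++ standard library rather than the `cv::cubeRoot(double)` overload the commit adds.

```cpp
#include <cassert>
#include <cmath>

// Cube root that is defined for negative inputs.
// std::pow(x, 1.0 / 3.0) returns NaN for finite x < 0 because the exponent
// is not an integer, so take the root of |x| and restore the sign afterwards.
inline double signedCubeRoot(double x) {
    const double r = std::pow(std::fabs(x), 1.0 / 3.0);
    return x < 0.0 ? -r : r;
}
```

Since C++11, `std::cbrt` also handles negative arguments directly and is usually the simpler choice in new code.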
Alexander Alekhin
cbfd38bd41 core: rework code locality
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat::eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above reduce the FFmpeg wrapper size by ~1.5 MB (the initial size of the OpenCV part is about 3 MB)

backport is done to improve merge experience (fewer conflicts)
backport of commit: 65eb946756
2021-03-02 23:24:28 +00:00
Alexander Alekhin
a123c48d4d pre: OpenCV 3.4.14 (version++) 2021-03-02 20:47:29 +00:00
Federico Martinez
773262bc09 Fix UB in CopyMakeConstBorder_8u
Caused by overflow of arithmetic operators conversion rank
2021-02-26 19:15:50 +00:00
Alexander Alekhin
67b6ef4c2a Merge pull request #19503 from komakai:fix-android-putget 2021-02-24 21:07:13 +00:00
Alexander Alekhin
7ffc4b57aa Merge pull request #19535 from alalek:issue_18897 2021-02-23 22:42:51 +00:00
Alexander Alekhin
309e1e2b1d core(InputArray): replace STD_ARRAY with MATX
- remove duplicated kind
2021-02-21 19:12:21 +00:00
Dale Phurrough
4badf640bf add noexcept to default constructors of cv::ocl
- follows iso c++ guideline C.44
- enables default compiler-created constructors to
  also be noexcept

original commit: 77e26a7db3

- handled KernelArg, Image2D
2021-02-20 16:20:25 +00:00
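
The effect named in the commit above (a `noexcept` default constructor lets compiler-generated constructors be `noexcept` too) can be checked at compile time. The sketch below uses hypothetical types `Handle` and `Wrapper`; they stand in for classes such as those in `cv::ocl` and are not taken from OpenCV.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical handle type with a simple, non-throwing default constructor,
// in the spirit of C++ Core Guidelines C.44.
class Handle {
public:
    Handle() noexcept : impl_(nullptr) {}
private:
    void* impl_;
};

// The default constructor of Wrapper is generated by the compiler. It is
// noexcept only because the default constructor of every member is noexcept.
struct Wrapper {
    Handle a;
    Handle b;
};

static_assert(std::is_nothrow_default_constructible<Handle>::value,
              "Handle() must not throw");
static_assert(std::is_nothrow_default_constructible<Wrapper>::value,
              "Wrapper() is implicitly noexcept");
```

If `noexcept` were removed from `Handle()`, the second `static_assert` would fail as well, which is the propagation the commit message refers to.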
Giles Payne
5cf08b0722 Fix/optimize Android put/get functions 2021-02-19 17:10:11 +09:00
Alexander Alekhin
6d3502833f core: include version.hpp in cvdef.h, fix precomp.hpp usage 2021-02-16 11:10:45 +00:00
Zhuo Zhang
743099f9f9
Merge pull request #19521 from zchrissirhcz:3.4-fix-core-module-android-arm64-build
* fix core module android arm64 build

* fix core module android build when neon is off

When building for the Android ARM platform with the CMake option
`-D CV_DISABLE_OPTIMIZATION=ON`, the expected behavior is
to use naive computation instead of ARM NEON.

This commit fixes the unexpected compile error for NEON intrinsics.
2021-02-14 21:37:11 +03:00
Francesco Petrogalli
6ee23c9b85
Merge pull request #19486 from fpetrogalli:dotprod_fast-3.4
* [hal][neon] Optimize the v_dotprod_fast intrinsics for aarch64.

On Armv8 in AArch64 execution mode, we can skip the sequence

   v<op>_<ty>(vget_high_<ty>(x), vget_high_<ty>(y))

in favour of

   v<op>_high_<ty>(x, y)

This gives recent compilers a better chance to use fewer data movement
operations and better register allocation. See for example:

   https://godbolt.org/z/bPq7vd

* [hal][neon] Fix build failure on armv7.

* [hal][neon] Address review comments in PR.

PR: https://github.com/opencv/opencv/pull/19486

* [hal][neon] Define macro to check for the AArch64 execution state of Armv8.

* [hal][neon] Fix macro definition for AArch64.

The fix is needed to prevent warnings when building for Armv7.
2021-02-11 13:24:09 +00:00
Vincent Rabaud
847b16fb76 Disable thread sanitization when CV_USE_GLOBAL_WORKERS_COND_VAR is not set.
This fixes #19463
2021-02-09 14:12:39 +01:00
Pavel Rojtberg
6c1a433c4c python: also catch general c++ exceptions
they might be thrown from third-party code (notably Ogre in the ovis
module).
While Linux is kind enough to print them, they cause instant termination
on Windows.
Arguably, they do not originate from OpenCV itself, but this still helps
in understanding what went wrong when calling an OpenCV function.
2021-02-02 21:16:01 +01:00
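
The idea in the commit above, catching general C++ exceptions at a binding boundary instead of letting them terminate the process, can be sketched as below. `callGuarded` is a hypothetical helper written for illustration; the real Python bindings translate exceptions into Python errors, which this standalone sketch replaces with a returned message.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: runs `call` and turns any C++ exception into a
// message instead of letting it escape across the language boundary.
inline std::string callGuarded(const std::function<void()>& call) {
    try {
        call();
        return "ok";
    } catch (const std::exception& e) {
        // Covers exceptions from third-party code that derive from std::exception.
        return std::string("C++ exception: ") + e.what();
    } catch (...) {
        // Last resort for anything that does not derive from std::exception.
        return "unknown C++ exception";
    }
}
```

The order of the handlers matters: more specific types must come before `catch (...)`, which has to be last.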
Alexander Alekhin
5ab4623c2a Merge pull request #19430 from alalek:fixup_19216 2021-01-31 17:41:24 +00:00
Alexander Alekhin
cdf73f2e05 Merge pull request #19427 from alalek:issue_19426 2021-01-31 14:24:37 +00:00
Alexander Alekhin
30bef20e22 js: fix SIMD build 2021-01-31 00:12:51 +00:00
Alexander Alekhin
c5bf15e009 build: fix cv2.cpp compilation 2021-01-30 11:32:27 +00:00
Alexander Alekhin
857f339914 Merge pull request #19385 from alalek:ocl_isOpenCLActivated_update 2021-01-25 13:54:00 +00:00
Alexander Alekhin
62b60b11bb Merge pull request #19344 from VadimLevin:dev/vlevin/generic-sequence-conversion 2021-01-25 08:22:57 +00:00
Vadim Levin
1d3207d7c7 feat: common fixed size sequence conversion for Python bindings 2021-01-25 08:08:38 +03:00
Alexander Alekhin
37e656082b core(ocl): update isOpenCLActivated()
- reuse g_isOpenCLAvailable variable instead
2021-01-24 01:25:17 +00:00
Alexander Alekhin
413febf657 Merge pull request #19334 from alalek:fix_19134 2021-01-22 20:05:58 +00:00
Alexander Alekhin
6ce9bb6f7a Merge pull request #19312 from VadimLevin:dev/vlevin/clear-msg-for-failed-overload-resolution 2021-01-18 20:14:10 +00:00
Vadim Levin
a0bdb78a99 feat: add overload resolution exception for Python bindings 2021-01-18 16:29:17 +03:00
Alexander Alekhin
212815a10d core(ocl): fix lifetime handling of Image kernel args 2021-01-18 06:24:36 +00:00
Vitaly Tuzov
8f653ba8de Inlined WASM fallback intrinsics to avoid using of V_TypeTraits 2021-01-11 18:12:21 +03:00
Alexander Alekhin
68fb8dd873 Merge tag '3.4.13' 2020-12-21 14:55:54 +00:00
Alexander Alekhin
8869dc7762 release: OpenCV 3.4.13 2020-12-20 22:15:49 +00:00
Alexander Alekhin
dd276dbb59 Merge pull request #19176 from alalek:issue_19131 2020-12-20 16:40:28 +00:00
Alexander Alekhin
663bd73518 Merge pull request #19164 from fpetrogalli:tranform_16u 2020-12-20 16:38:59 +00:00
Francesco Petrogalli
c526705f4f [cv::transform] Enable CV_SIMD for the 16U case on AArch64. 2020-12-20 15:58:21 +00:00
Alexander Alekhin
3359bdc464 docs(core): fix process_video_frame() code snippet 2020-12-20 02:27:46 +00:00
Vincent Rabaud
4c75b1c102 Fix comment typos. 2020-12-19 08:22:37 +01:00
Alexander Alekhin
b2ea15da35 Merge pull request #19137 from VadimLevin:dev/vlevin/safe-string-conversion 2020-12-18 11:20:50 +00:00
Vincent Rabaud
8391a23600 Optimize calls to std::string::find() and friends for a single char.
The character literal overload is more efficient. More info at:

http://clang.llvm.org/extra/clang-tidy/checks/performance-faster-string-find.html
2020-12-17 09:39:23 +01:00
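
A small example of the pattern the commit above applies: searching for a single character with the `char` overload rather than a one-character string literal. `baseName` is a hypothetical helper for illustration and does not appear in the commit.

```cpp
#include <cassert>
#include <string>

// Returns the part of `path` after the last '/', or the whole string if
// there is no '/'.
inline std::string baseName(const std::string& path) {
    // '/' (a char) selects the single-character overload of rfind();
    // "/" would select the C-string overload, which has to handle a
    // needle of arbitrary length.
    const std::string::size_type pos = path.rfind('/');
    return pos == std::string::npos ? path : path.substr(pos + 1);
}
```

Both overloads return the same position for a one-character needle, so the change affects performance only, not behavior.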
Alexander Alekhin
d159417474 Merge pull request #19101 from alalek:issue_5209 2020-12-16 22:13:18 +00:00
Vadim Levin
7b0d7d0c9a fix: conversion to string in python bindings
If the provided `PyObject` can't be converted to a string, `TypeError` is
 reported instead of `SystemError` without any message.
2020-12-16 15:11:58 +03:00
Alexander Alekhin
7631056b8a Merge pull request #19114 from alalek:issue_18937 2020-12-15 20:47:05 +00:00
Alexander Alekhin
c240355cc6 dnn(ocl): avoid mixing FP16/FP32 in convolution layer 2020-12-15 08:51:24 +00:00
Alexander Alekhin
4b3d2c8834 dnn(ocl): fix gemm kernels with beta=0
- dst is not initialized, may include NaN values
- 0*NaN produces NaN
2020-12-15 00:58:43 +00:00
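
The two bullet points in the commit above describe a general pitfall of the GEMM update `dst = alpha * acc + beta * dst`. The sketch below is a scalar CPU illustration under that assumption, not the OpenCL kernel from the commit; `scaleAndStore` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical scalar helper for dst = alpha * acc + beta * dst.
// When beta == 0 the old dst must not be read at all: it may be
// uninitialized, and 0 * NaN is NaN under IEEE 754 arithmetic.
inline void scaleAndStore(std::vector<float>& dst, const std::vector<float>& acc,
                          float alpha, float beta) {
    assert(dst.size() == acc.size());
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] = (beta == 0.0f) ? alpha * acc[i]
                                : alpha * acc[i] + beta * dst[i];
    }
}
```

With `beta == 0` the result depends only on `acc`, even if `dst` initially holds NaN values.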
Alexander Alekhin
392991fa0b core(opencl): add version check before clCreateFromGLTexture() call 2020-12-13 20:57:26 +00:00
Alexander Alekhin
d2bc0e5fe0 js(wasm): use fallback on missing intrinsics in Emscripten 2.0.0+ 2020-12-09 04:19:53 +00:00
Alexander Alekhin
26e8048a0a core: update handling of allocator stats type
- don't use OPENCV_ALLOCATOR_STATS_COUNTER_TYPE definition in non C++11 builds
- don't use with MinGW
2020-12-05 20:54:47 +00:00
Alexander Alekhin
e309ad8465 Merge pull request #18994 from alalek:umat_drop_unavailable_methods 2020-12-02 22:54:47 +00:00
Alexander Alekhin
e958600f32 Merge pull request #18986 from alalek:fix_ipp_17453_2 2020-12-02 19:09:24 +00:00
Alexander Alekhin
484251c52b Merge pull request #18831 from rjiejie:master-opt@pipeline 2020-12-02 19:07:38 +00:00
Alexander Alekhin
6f8120cb3a core(UMat): drop unavailable methods 2020-12-02 15:02:43 +00:00
Alexander Alekhin
d35e2f5339 core(ipp): workaround getIppTopFeatures() value mismatch 2020-12-02 11:33:55 +00:00
Alexander Alekhin
91ce6ef190 core(ipp): disable SSE4.2 code path in countNonZero() 2020-12-01 14:01:42 +00:00
Alexander Alekhin
8c5b3c4150 Merge pull request #17077 from i386x:check-negative-values 2020-11-26 15:07:58 +00:00
Or Avital
5a3a915a9b Remove unnecessary condition (will never reach) 2020-11-22 14:19:20 +02:00
Jiri Kucera
ce31c9c448 core(matrix): Negative values checks
Add checks that prevent indexing an array by negative values.
2020-11-20 22:51:06 +01:00
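
A minimal sketch of the kind of check the commit above describes, assuming a signed index type as used by many `int`-based APIs. `checkedAt` is a hypothetical function; the commit itself adds checks inside OpenCV's matrix code, which is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical checked accessor that takes a signed index.
inline int& checkedAt(std::vector<int>& data, int idx) {
    // Reject negative values before indexing: a negative index used
    // directly would address memory before the start of the buffer.
    if (idx < 0 || static_cast<std::size_t>(idx) >= data.size())
        throw std::out_of_range("index out of range");
    return data[static_cast<std::size_t>(idx)];
}
```

Valid indices return a reference to the element; negative or too-large indices raise `std::out_of_range` instead of causing undefined behavior.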
Jojo R
12b8d542b7 norm.cpp(normL2Sqr_): improve performance of pipeline
Most target machines use one type of CPU execution unit
for each type of instruction, e.g.
all vx_load calls use the load/store unit
and v_muladd calls use the mul/mula unit. We interleave
vx_load and v_muladd to improve performance on most targets, such as
RISC-V or ARM.
2020-11-19 09:49:49 +08:00
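
The interleaving described in the commit above can be illustrated with a scalar sketch. This is not the SIMD code from norm.cpp: `normL2SqrSketch` is a hypothetical function, plain array reads stand in for `vx_load`, and `s += d * d` stands in for `v_muladd`. Whether the compiler keeps this exact ordering is not guaranteed; the point is the two independent dependency chains.

```cpp
#include <cassert>
#include <cstddef>

// Scalar sketch of sum((a[i] - b[i])^2) with two independent accumulators.
// The loads for the second element are issued before the multiply-add of
// the first, so load and multiply units can work in parallel.
inline float normL2SqrSketch(const float* a, const float* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    float s0 = 0.0f, s1 = 0.0f;
    std::size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        const float d0 = a[i] - b[i];          // loads for element i
        const float d1 = a[i + 1] - b[i + 1];  // loads for element i + 1
        s0 += d0 * d0;                         // multiply-add, chain 0
        s1 += d1 * d1;                         // multiply-add, chain 1
    }
    for (; i < n; ++i) {                       // tail for odd n
        const float d = a[i] - b[i];
        s0 += d * d;
    }
    return s0 + s1;
}
```

Because floating-point addition is not associative, splitting the sum across two accumulators can change the last bits of the result compared with a single accumulator.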
Alexander Alekhin
328883b6ea Merge pull request #18675 from sturkmen72:update-documentation 2020-11-18 16:50:35 +00:00
Suleyman TURKMEN
cc7f17f011 update documentation 2020-11-18 17:07:04 +03:00
Alexander Alekhin
9485113923 pre: OpenCV 3.4.13 (version++) 2020-11-17 21:50:30 +00:00
Alexander Alekhin
2b558a3787 core: fix F16C compilation check 2020-11-17 12:22:49 +00:00
Alexander Alekhin
716450ceb5 Merge pull request #18158 from legrosbuffle:3.4-vectorize-dft-radix 2020-10-30 22:05:50 +00:00
Alexander Alekhin
aac7c5465b core: move inline code from mat.inl.hpp 2020-10-21 23:06:09 +00:00
Kun Liang
c82417697a
Merge pull request #18068 from lionkunonly:gsoc_2020_simd
[GSoC] OpenCV.js: WASM SIMD optimization 2.0

* gsoc_2020_simd Add perf test for filter2d

* add perf test for kernel scharr and kernel gaussianBlur

* add perf test for blur, medianBlur, erode, dilate

* fix the errors for the opencv PR robot

fix the trailing whitespace.

* add perf tests for kernel remap, warpAffine, warpPerspective, pyrDown

* fix a bug in  modules/js/perf/perf_imgproc/perf_remap.js

* add function smoothBorder in helpfun.js and remove replicated function in perf test of warpAffine and warpPerspective

* fix the trailing white space issues

* add OpenCV.js loader

* Implement the Loader with help of WebAssembly Feature Detection, remove trailing whitespaces

* modify the explanation for loader in js_setup.markdown and fix bug in loader.js
2020-10-18 20:30:36 +00:00
Alexander Alekhin
b5717f82a0 core: fix __clang_major__ typo regression 2020-10-16 15:35:51 +00:00
Alexander Alekhin
7ed82aea38 Merge tag '3.4.12' 2020-10-10 20:18:09 +00:00
Alexander Alekhin
dc15187f1b release: OpenCV 3.4.12 2020-10-10 20:14:29 +00:00
Alexander Alekhin
1ef4b7ae5a Merge pull request #18515 from alalek:test_18473 2020-10-06 19:39:28 +00:00
Alexander Alekhin
b314cc4c23 Merge pull request #18506 from alalek:issue_18472 2020-10-06 19:37:40 +00:00