Commit Graph

710 Commits

Author SHA1 Message Date
Vincent Rabaud
c8268e65fd Fix potential NaN in cv::norm.
There can be an int overflow.
cv::norm( InputArray _src, int normType, InputArray _mask ) is fine,
not cv::norm( InputArray _src1, InputArray _src2, int normType, InputArray _mask ).
2021-06-15 14:58:11 +02:00
Developer-Ecosystem-Engineering
814550d2a6
Merge pull request #20011 from Developer-Ecosystem-Engineering:3.4
Improve performance on Arm64

* Improve performance on Apple silicon

This patch will
- Enable dot product intrinsics for macOS arm64 builds
- Enable for macOS arm64 builds
- Improve HAL primitives
  - reduction (sum, min, max, sad)
  - signmask
  - mul_expand
  - check_any / check_all

Results on a M1 Macbook Pro

* Updates to #20011 based on feedback

  - Removes Apple Silicon specific workarounds
  - Makes #ifdef sections smaller for v_mul_expand cases
  - Moves dot product optimization to compiler optimization check
  - Adds 4x4 matrix transpose optimization

* Remove dotprod and fix v_transpose

Based on the latest, we've removed dotprod entirely and will revisit in a future PR.

Added explicit cats with v_transpose4x4()

This should resolve all opens with this PR

* Remove commented out lines

Remove two extraneous comments
2021-06-01 09:39:55 +03:00
Alexander Alekhin
450dc92452 Merge pull request #20172 from alalek:fixup_19334 2021-05-28 14:09:52 +00:00
Alexander Alekhin
3d394943e6 core(ocl): avoid limit of Image kernel args 2021-05-28 00:43:59 +00:00
berak
302c2354a3 core: add missing implementation for Mat::ptr(Vec) 2021-05-09 14:15:12 +02:00
Alexander Alekhin
908957317f Merge pull request #19813 from alalek:issue_19506 2021-03-31 22:57:50 +00:00
Alexander Alekhin
f82303d614 Merge pull request #19811 from alalek:issue_19599 2021-03-31 22:56:48 +00:00
Alexander Alekhin
8069a6b4f8 core(IPP): disable some ippsMagnitude_32f calls 2021-03-31 13:38:57 +00:00
Alexander Alekhin
a2a92999be core(arithm_op): workaround problem with scalars handling 2021-03-31 10:35:52 +00:00
Vitaly Tuzov
aab62aa6dd
Merge pull request #18952 from terfendail:wui_doc
* Updated UI documentation to address WUI

* Added documentation for vx_ calls

* Removed vx_store operation overload

* Doxyfile updated to enable wide UI

* Enable doxygen documentation for vx_ WUI functions

* Wide intrinsics definition rework

* core: fix SIMD C++ emulator build (supports 128-bit only)
2021-03-30 16:18:03 +00:00
Alexander Alekhin
87e607a19b core(ocl): skip SPIR test on AMD devices if problem detected 2021-03-13 06:12:52 +00:00
Alexander Alekhin
309e1e2b1d core(InputArray): replace STD_ARRAY to MATX
- remove duplication kind
2021-02-21 19:12:21 +00:00
Vincent Rabaud
8391a23600 Optimize calls to std::string::find() and friends for a single char.
The character literal overload is more efficient. More info at:

http://clang.llvm.org/extra/clang-tidy/checks/performance-faster-string-find.html
2020-12-17 09:39:23 +01:00
Alexander Alekhin
8c5b3c4150 Merge pull request #17077 from i386x:check-negative-values 2020-11-26 15:07:58 +00:00
Jiri Kucera
ce31c9c448 core(matrix): Negative values checks
Add checks that prevents indexing an array by negative values.
2020-11-20 22:51:06 +01:00
Alexander Alekhin
2b558a3787 core: fix F16C compilation check 2020-11-17 12:22:49 +00:00
Alexander Alekhin
f30aafc3cc core(test): regression test for 18473 2020-10-05 17:14:22 +00:00
Alexander Alekhin
b3755e617c ocl: silence warning in case of async cleanup
- OpenCL kernel cleanup processing is asynchronous and can be called even after forced clFinish()
- buffers are released later in asynchronous mode
- silence these false positive cases for asynchronous cleanup
2020-08-20 19:33:37 +00:00
Maksim Shabunin
59608907b8 Added countNonZero test for big arrays and disable IPP for some cases 2020-06-03 18:58:41 +03:00
Josh Bradley
9fef09fe89
Merge pull request #17320 from jgbradley1:add-eigen-tensor-conversions
* add eigen tensor conversion functions

* add eigen tensor conversion tests

* add support for column major order

* update eigen tensor tests

* fix coding style and add conditional compilation

* fix conditional compilation checks

* remove whitespace

* rearrange functions for easier reading

* reformat function documentation and add tensormap unit test

* cleanup documentation of unit test

* remove condition duplication

* check Eigen major version, not minor version

* restrict to Eigen v3.3.0+

* add documentation note and add type checking to cv2eigen_tensormap()
2020-05-23 18:25:01 +00:00
Alexander Alekhin
d7abb641ca core(test): add InputArray(MatExpr) fetch test 2020-04-10 11:35:42 +00:00
Alexander Alekhin
c920b45fb8 core(persistence): fix resource leaks - force closing files
backporting commit 673eb2b006
2020-03-25 10:49:16 +00:00
Alexander Alekhin
377dd04224 core: fix .begin()/.end() of empty Mat 2020-03-20 14:08:45 +00:00
Alexander Alekhin
3a2f40ac6f core: don't allow reallocation in add/div/sub/bitwise aug operators 2020-03-06 13:00:40 +00:00
Alexander Alekhin
4f288a1e28
Merge pull request #16704 from alalek:core_log_once_log_if
* core(logger): add CV_LOG_ONCE_xxx() CV_LOG_IF_xxx() macros

* core(logger): keep tests disabled
2020-03-04 20:42:41 +00:00
Alexander Alekhin
4d0f13544d
Merge pull request #16700 from alalek:fix_core_matexpr_size_gemm
core: fix MatExpr::size() for gemm()

* core(test): MatExpr::size() test for gemm()

* core: fix MatExpr::size() for gemm()
2020-03-02 17:13:02 +03:00
Alexander Alekhin
c13a62ce10 Merge pull request #16638 from mshabunin:use-safe-buffers 2020-02-26 14:54:57 +00:00
Alexander Alekhin
f48c84eaee Merge pull request #16656 from alalek:issue_16655 2020-02-26 12:47:46 +00:00
Maksim Shabunin
bf96d8239d Use BufferArea in more places 2020-02-26 11:45:19 +03:00
Alexander Alekhin
d54d01ca46 core(MatExpr): fix .type() bug 2020-02-23 17:05:05 +00:00
Alexander Alekhin
966c2191cb
Merge pull request #13928 from catree:add_matx_div_operations 2020-02-21 22:35:03 +03:00
Alexander Smorkalov
c87b99e82b Added test for new MatX division. 2020-02-21 10:08:55 +03:00
Maksim Shabunin
55cdeaa6dd BufferArea: initial version, usage in StereoBM
New class BufferArea is used to hide complexity of buffers allocations and allow instrumentation with valgrind and sanitizers.
2020-02-07 14:57:36 +03:00
Chip Kerchner
301626ba26 Merge pull request #15488 from ChipKerchner:vectorizeMinMax2
Vectorize minMaxIdx functions

* Updated documentation and intrinsic tests for v_reduce

* Add other files back in from the forced push

* Prevent an constant overflow with v_reduce for int8 type

* Another alternative to fix constant overflow warning.

* Fix another compiler warning.

* Update comments and change comparison form to be consistent with other vectorized loops.

* Change return type of v_reduce_min & max for v_uint8 and v_uint16 to be same as lane type.

* Cast v_reduce functions to int to avoid overflow. Reduce number of parameters in MINMAXIDX_REDUCE macro.

* Restore cast type for v_reduce_min & max to LaneType
2020-01-17 19:37:35 +03:00
Alexander Alekhin
e180cc050b
Merge pull request #16236 from alalek:fix_core_simd_emulator
* core: fix intrin_cpp, allow to build modules with SIMD emulator

* core(arithm): fix v_zero initialization

* core(simd): 'strict' types for binary/bitwise operations

* features2d: avoid aligned load issue in GCC 5.4 with emulated SIMD

* core(simd): alignment checks in SIMD emulator
2020-01-10 21:31:02 +03:00
Alexander Alekhin
523f081923 core(check): add Size_<int> 2019-12-28 13:50:39 +00:00
RAJKIRAN NATARAJAN
e6ce752da1 Merge pull request #15966 from saskatchewancatch:issue-15760
Add checks for empty operands in Matrix expressions that don't check properly

* Starting to add checks for empty operands in Matrix expressions that
don't check properly.

* Adding checks and delcarations for checker functions

* Fix signatures and add checks for each class of Matrix Expr operation

* Make it catch the right exception

* Don't expose helper functions to public API
2019-12-12 19:23:57 +03:00
Alexander Alekhin
70146700aa Merge pull request #15839 from alalek:core_simd_v_setall_template 2019-11-27 19:19:35 +00:00
Alexander Alekhin
3266ac7667 core(kmeans): bailout if can't select cluster center 2019-11-22 14:40:02 +00:00
Everton Constantino
75315fb297 Merge pull request #15494 from everton1984:hal_vector_get_n
Improving VSX performance of integral function

* Adding support for vector get function on VSX datatypes so the
integral function gains a bit of performance.

* Removing get as a datatype member function and implementing a new HAL
instruction v_extract_n to get the n-th element of a vector register.

* Adding SSE/NEON/AVX intrinsics.

* Implement new HAL instruction v_broadcast_element on VSX/AVX/NEON/SSE.

* core(simd): add tests for v_extract_n/v_broadcast_element

- updated docs
- commented out code to repair compilation
- added WASM and MSA default implementations

* core(simd): fix compilation

- x86: avoid _mm256_extract_epi64/32/16/8 with MSVS 2015
- x86: _mm_extract_epi64 is 64-bit only

* cleanup
2019-11-20 13:41:07 +03:00
Alexander Alekhin
e07a488012
Merge pull request #15925 from alalek:core_test_simd_cpp_emulation
core(test): extending tests with SIMD C++ emulation code (intrin_cpp.hpp)

* core(test): test SIMD CPP emulation code (intrin_cpp.hpp)

* core(simd): eliminate build warnings from intrin_cpp.hpp
2019-11-19 21:08:45 +03:00
clunietp
2185bce4b7 Fix 13577 2019-11-18 07:41:34 -05:00
TH3CHARLie
2c2716de0f core(test): add test for YAML parse multiple documents
- added removal of temporary file
2019-11-05 18:39:07 +03:00
Alexander Alekhin
a893969ec9 core(simd): v_setall template 2019-11-03 12:49:25 +00:00
Alexander Alekhin
bad4e5c3eb Merge pull request #15692 from alalek:core_tls_handle_thread_termination 2019-10-29 20:40:35 +00:00
Alexander Alekhin
17e2bf5717 core(tls): implement releasing of TLS on thread termination
- move TLS & instrumentation code out of core/utility.hpp
- (*) TLSData lost .gather() method (to dispose thread data on thread termination)
- use TLSDataAccumulator for reliable collecting of thread data
- prefer using of .detachData() + .cleanupDetachedData() instead of .gather() method

(*) API is broken: replace TLSData => TLSDataAccumulator if gather required
(objects disposal on threads termination is not available in accumulator mode)
2019-10-24 06:36:18 +00:00
Chip Kerchner
5a6a49405d Merge pull request #15738 from ChipKerchner:bugInt64x2Comparison
Fixing bug with comparison of v_int64x2 or v_uint64x2

* Casting v_uint64x2 to v_float64x2 and comparing does NOT work in all cases.  Rewrite using epi64 instructions - faster too.

* Fix bad merge.

* Fix equal comparsion for non-SSE4.1. Add test cases for v_int64x2 comparisons.

* Try to fix merge conflict.

* Only test v_int64x2 comparisons if CV_SIMD_64F

* Fix compiler warning.
2019-10-22 16:37:20 +03:00
Alexander Alekhin
bce653117f Merge pull request #15700 from alalek:issue_12943 2019-10-16 11:12:49 +00:00
Alexander Alekhin
6a7d1c15d3 core(ipp): skip huge input in flip()
- IPP/SSE4.2 works well
2019-10-14 18:26:19 +03:00
Chip Kerchner
027769bf5d Merge pull request #15662 from ChipKerchner:addVReverseIntrinsic
* New v_reverse HAL intrinsic for reversing the ordering of a vector

* Fix conflict.

* Try to resolve conflict again.

* Try one more time.

* Add _MM_SHUFFLE. Remove non-vectorize code in SSE2. Fix copy and paste issue with NEON.

* Change v_uint16x8 SSE2 version to use shuffles
2019-10-11 18:34:17 +03:00