Everton Constantino
75315fb297
Merge pull request #15494 from everton1984:hal_vector_get_n
...
Improving VSX performance of integral function
* Adding support for vector get function on VSX datatypes so the
integral function gains a bit of performance.
* Removing get as a datatype member function and implementing a new HAL
instruction v_extract_n to get the n-th element of a vector register.
* Adding SSE/NEON/AVX intrinsics.
* Implement new HAL instruction v_broadcast_element on VSX/AVX/NEON/SSE.
* core(simd): add tests for v_extract_n/v_broadcast_element
- updated docs
- commented out code to repair compilation
- added WASM and MSA default implementations
* core(simd): fix compilation
- x86: avoid _mm256_extract_epi64/32/16/8 with MSVS 2015
- x86: _mm_extract_epi64 is 64-bit only
* cleanup
2019-11-20 13:41:07 +03:00
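As a rough illustration of what the two HAL instructions named in the commit above do, here is a scalar model in plain C++. It is a sketch only: `Lanes`, `extract_n` and `broadcast_element` are hypothetical stand-ins written for this note, not the OpenCV `v_extract_n`/`v_broadcast_element` signatures, and no SIMD intrinsics are involved.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a SIMD register: N lanes of type T.
template <typename T, std::size_t N>
using Lanes = std::array<T, N>;

// Scalar model of "get the n-th element of a vector register":
// returns lane I as a plain scalar. The index is a compile-time constant.
template <std::size_t I, typename T, std::size_t N>
T extract_n(const Lanes<T, N>& v) {
    static_assert(I < N, "lane index out of range");
    return v[I];
}

// Scalar model of "broadcast element": copies lane I into every lane.
template <std::size_t I, typename T, std::size_t N>
Lanes<T, N> broadcast_element(const Lanes<T, N>& v) {
    static_assert(I < N, "lane index out of range");
    Lanes<T, N> out;
    out.fill(v[I]);
    return out;
}
```

With a four-lane value holding 10, 20, 30, 40, `extract_n<2>` yields 30 and `broadcast_element<1>` yields a value whose lanes are all 20.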
Vitaly Tuzov
3b015dfc7d
Merge pull request #14210 from terfendail:wui_512
...
AVX512 wide universal intrinsics (#14210)
* Added implementation of 512-bit wide universal intrinsics(WIP)
* Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP)
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks
* Added implementation of 512-bit wide universal intrinsics(WIP): build fixes
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros
* Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask()
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines
* Added implementation of 512-bit wide universal intrinsics(WIP): Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable.
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask.
* Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512()
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build
* Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.
2019-06-03 18:05:35 +03:00
Brad Kelly
0fe17eeb68
Implementing AVX512 Support for 1 channel mats for CV_64F format
2019-03-22 09:44:23 -07:00
Brad Kelly
507f8add1c
Implementing AVX512 Support for 2 and 4 channel mats for CV_64F format
2019-02-19 11:31:20 -08:00
Brad Kelly
0165ffa90d
Implementing AVX512 support for 3 channel cv::integral for CV_64F
2019-01-14 16:11:01 -08:00
Vitaly Tuzov
6b84990620
integral() implementation updated to utilize wide universal intrinsics
2018-10-01 17:25:43 +03:00
Hamdi Sahloul
5d54def264
Add semicolons after CV_INSTRUMENT macros
2018-09-14 06:45:31 +09:00
Alexander Alekhin
b09a4a98d4
opencv: Use cv::AutoBuffer<>::data()
2018-07-04 19:11:29 +03:00
Alexander Alekhin
0ed3209b00
ocl: avoid unnecessary loading/initializing OpenCL subsystem
...
Applies when the application makes no OpenCL/UMat method calls.
The OpenCL subsystem is initialized only when:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called
Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
2017-11-28 14:02:42 +03:00
Pavel Vlasov
11c2ffaf1c
Update for IPP for OpenCV 2017u2 integration;
...
Updated integrations for:
cv::split
cv::merge
cv::insertChannel
cv::extractChannel
cv::Mat::convertTo - now with scaled conversions support
cv::LUT - disabled due to performance issues
Mat::copyTo
Mat::setTo
cv::flip
cv::copyMakeBorder - currently disabled
cv::polarToCart
cv::pow - ipp pow function was removed due to performance issues
cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance
cv::countNonZero
cv::minMaxIdx
cv::norm
cv::canny - new integration. Disabled for threaded;
cv::cornerHarris
cv::boxFilter
cv::bilateralFilter
cv::integral
2017-04-25 15:53:12 +03:00
Maksim Shabunin
3d5c0f1faf
HAL interface for cv::integral
2016-09-29 12:12:10 +03:00
Pavel Vlasov
30a6cee2fe
Instrumentation for OpenCV API regions and IPP functions;
2016-08-19 18:10:03 +03:00
Pavel Vlasov
680ca88ce0
Outdated ICV restrictions were removed;
2016-08-19 15:08:39 +03:00
Pavel Vlasov
89eee6ca99
Fixes for IPP integration:
...
dotProd_16s - disabled for IPP 9.0.0;
filter2D - fixed kernel preparation;
morphology - conditions fix and disabled FilterMin and FilterMax for IPP 9.0.0;
GaussianBlur - disabled for CV_8UC1 due to buffer overflow;
integral - disabled for IPP 9.0.0;
IppAutoBuffer class was added;
2015-10-12 10:51:28 +03:00
Pavel Vlasov
e837d69f8f
IPPInitSingelton was added to contain IPP related global variables;
...
OPENCV_IPP env var now allows selecting the IPP architecture level for IPP9+;
IPP initialization logic was unified across modules;
2015-10-01 09:58:48 +03:00
Vadim Pisarevsky
56e637d5f4
Merge pull request #4135 from lupustr3:ipp_code_refactoring
2015-06-24 16:18:55 +00:00
Dmitry Budnikov
a5a21019b2
ipp_countNonZero build fix;
...
Removed IPP port for tiny arithm.cpp functions
Additional warnings fix on various platforms.
Build without OPENCL and GCC warnings fixed
Fixed warnings, trailing spaces and removed unused secure_cpy.
IPP code refactored.
IPP code path implemented as separate static functions to simplify future work with IPP code and make it more readable.
2015-06-18 12:47:07 +03:00
Vadim Pisarevsky
5a94a95fbf
improvements in Haar CascadeClassifier: 1) use CV_32S instead of CV_32F for the integral of squares (which is more accurate and more efficient); 2) skip the window if its contrast is too low
2015-05-28 19:33:21 +03:00
Ilya Lavrenov
fc0869735d
used popcnt
2015-01-12 10:59:30 +03:00
Ilya Lavrenov
9436103ed6
SSE2 optimization of cv::integral
2014-12-29 15:51:55 +03:00
Pavel Vlasov
45958eaabc
Implementation detector and selector for IPP and OpenCL;
...
IPP can be switched on and off at runtime;
Optional implementation collector was added (switched off by default in CMake). It gathers data on the implementation used in functions and reports this info through the performance TS;
TS modifications for implementations control;
2014-10-15 14:24:41 +04:00
Adil Ibragimov
8a4a1bb018
Several types of formal refactoring:
...
1. someMatrix.data -> someMatrix.ptr()
2. someMatrix.data + someMatrix.step * lineIndex -> someMatrix.ptr( lineIndex )
3. (SomeType*) someMatrix.data -> someMatrix.ptr<SomeType>()
4. someMatrix.data -> !someMatrix.empty() ( or !someMatrix.data -> someMatrix.empty() ) in logical expressions
2014-08-13 15:21:35 +04:00
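The refactoring rules in the commit above replace manual pointer arithmetic with accessor calls. The sketch below shows why the old and new forms are equivalent, using `TinyMat`, a hypothetical and heavily simplified stand-in for a matrix type written for this note. It is not `cv::Mat`; only the member names `data`, `step`, `ptr` and `empty` are borrowed from the commit message.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified matrix: rows of bytes, each row `step` bytes apart.
struct TinyMat {
    std::size_t rows;
    std::size_t step;                // bytes per row, including any padding
    std::vector<unsigned char> buf;  // owns the storage
    unsigned char* data;             // raw pointer to the first byte

    TinyMat(std::size_t r, std::size_t s)
        : rows(r), step(s), buf(r * s), data(buf.data()) {}

    // Copying would leave `data` pointing into another object's buffer.
    TinyMat(const TinyMat&) = delete;
    TinyMat& operator=(const TinyMat&) = delete;

    // Rule 4: emptiness test instead of checking the raw pointer.
    bool empty() const { return buf.empty(); }

    // Rules 1 and 2: ptr() is data, ptr(i) is data + step * i.
    unsigned char* ptr(std::size_t row = 0) { return data + step * row; }

    // Rule 3: typed accessor instead of a C-style cast at the call site.
    template <typename T>
    T* ptr(std::size_t row = 0) { return reinterpret_cast<T*>(ptr(row)); }
};
```

Keeping the arithmetic inside one accessor means call sites no longer repeat `data + step * lineIndex` or spell out casts by hand.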
Alexander Alekhin
55188fe991
world fix
2014-08-05 20:12:35 +04:00
vbystricky
09bcc061dd
Change kernel for optimization. Remove restriction to align data
...
Fix kernel compilation errors on AMD system
Fix license information in cl file
Support CV_64F destination type
Change build options of the kernel
Optimize sum of square
Remove separate kernel for integral square
Increase epsilon for performance tests
Test double support on AMD devices
Fix some issues
Try to fix problems with AMD device
Try to solve problem with AMD device
Fix error of destination size in kernel
Fix warnings
2014-06-24 18:32:52 +04:00
vbystricky
606df0469a
Fix pointer conversion
2014-06-16 18:14:05 +04:00
vbystricky
504bc7634a
Remove pre_invalid parameter
2014-06-16 13:07:39 +04:00
Alexander Alekhin
1f638a3e5b
icv: enable functions
2014-05-12 15:38:38 +04:00
vbystricky
c9103730b6
Disable ipp integral in IPPICV version
2014-04-30 12:55:34 +04:00
Ilya Lavrenov
ce0941160e
added status check
2014-04-17 11:08:02 +04:00
Vadim Pisarevsky
f417c79d16
Merge pull request #1932 from seth-planet:master
2014-04-10 13:36:14 +04:00
vbystricky
38913bf5f6
Change all 'ippStsNoErr==' to '0<=', and all 'ippStsNoErr!=' to '0>'
2014-04-07 14:31:34 +04:00
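The commit above does not state its motivation. One plausible reading, assuming the usual IPP convention that a status of zero means no error, negative values are errors and positive values are warnings, is that `0 <=` accepts warnings as success while `ippStsNoErr ==` rejects them. The sketch below models only that difference; the enum values and helper names are hypothetical and are not real IPP status codes.

```cpp
// Hypothetical status codes following the assumed convention:
// 0 = no error, negative = error, positive = warning.
enum Status : int { kError = -5, kNoErr = 0, kWarning = 3 };

// Old style of check: only an exact "no error" counts as success.
inline bool strictOk(int status) { return status == kNoErr; }

// New style of check: warnings (positive values) also count as success.
inline bool lenientOk(int status) { return 0 <= status; }
```

Under this model the two checks agree on `kNoErr` and `kError` and differ only for `kWarning`.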
vbystricky
824ed8a3ae
Fix errors
2014-04-07 14:31:31 +04:00
vbystricky
1b3651d8ee
Undo changes ipp to ippicv prefix of function names
2014-04-07 14:30:03 +04:00
vbystricky
a9a0ea3706
Fix error: IppStatus not initialized before IPP function calls
2014-04-07 14:26:50 +04:00
vbystricky
01a66a2938
Prepare code for the ippicv library
2014-04-07 14:24:05 +04:00
sprice
75ed2f52f1
Merge branch 'master' of https://github.com/Itseez/opencv
...
Conflicts:
modules/features2d/include/opencv2/features2d.hpp
modules/features2d/src/freak.cpp
modules/features2d/src/stardetector.cpp
2014-03-06 15:39:06 -08:00
Ilya Lavrenov
78c2b3ca2a
refactored imgproc
2014-01-27 18:47:16 +04:00
Leszek Swirski
9c22d4887c
imgproc: IPP compilation fix and minor cleanup
2013-12-20 12:40:40 +00:00
sprice
1c15776cf5
Merge branch 'master' of https://github.com/Itseez/opencv
...
Conflicts:
modules/imgproc/src/sumpixels.cpp
2013-12-11 14:35:51 -08:00
Ilya Lavrenov
73aa43d2ca
fixed warnings
2013-12-07 01:05:07 +04:00
Ilya Lavrenov
3eaa8f149b
added cv::integral to T-API
2013-12-06 13:18:25 +04:00
sprice
2cc11e2c6a
Updating STAR detector and FREAK descriptor to work with large and/or 16-bit images
2013-12-04 18:13:34 -08:00
Ilya Lavrenov
036e99d03a
fixed ipp-related warnings
2013-10-05 14:54:00 +04:00
Roman Donchenko
3c137f7a04
Converted tabs to spaces.
2013-08-21 18:59:26 +04:00
kdrobnyh
f8eb806565
Add IPP support to integral function
2013-07-10 11:25:36 +04:00
Andrey Kamaev
bd0e0b5800
Merged the trunk r8589:8653 - all changes related to build warnings
2012-06-15 13:04:17 +00:00
Andrey Kamaev
95d659a3cf
Refactored Tegra related macro usage
2011-12-27 11:21:45 +00:00
Andrey Kamaev
583ceef6a5
Tegra optimization for integral_8u32s
2011-10-31 08:00:20 +00:00
Vadim Pisarevsky
2d2b8a496e
renamed "None()" to "noArray()" to avoid conflicts with X11 (ticket #1122)
2011-06-08 06:55:04 +00:00
Vadim Pisarevsky
0c877f62e9
replaced "const InputArray&" => "InputArray"; made InputArray and OutputArray references. added "None()" constant (no array()).
2011-06-06 14:51:27 +00:00