- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
Added assertios to remap and warpAffine functions
As @mshabunin said, remap and warpAffine functions do not support more than 4 channels in
Bicubic and Lanczos4 interpolation modes. Assertions were added. Appropriate test was chenged.
resolves#8272
Warping a matrix with more than 4 channels using BORDER_CONSTANT and
INTER_NEAREST, INTER_CUBIC or INTER_LANCZOS4 interpolation led to
undefined behaviour. This commit changes the behavior of these methods
to be similar to that of INTER_LINEAR. Changed the scope of some of the
variables to more local. Modified some tests to be able to detect the
error described.
- don't use undefined flag=0. It should be CONSTANT instead.
- don't allow 'UMat* m=NULL' argument (except LOCAL/CONSTANT flags).
This case is not handled well to provide NULL __global pointers.
It is better to use '-D' macro defines instead (at least for performance)
Add new OpenCL kernels for bicubic interploation, it is 20% faster
than current warp image kernel with bicubic interploation.
Signed-off-by: Li Peng <peng.li@intel.com>
Add new ocl kernels for warpAffine and warpPerspective,
The average performance improvemnt is about 30%. The new
ocl kernels require CV_8UC1 format and support nearest
neighbor and bilinear interpolation.
Signed-off-by: Li Peng <peng.li@intel.com>
Maximum depth limit var was added to the instrumentation structure;
Trace names output console output fix: improper tree formatting could happen;
Output in case of error was added;
Custom regions improvements;
Improved timing and weight calculation for parallel regions; New TC (threads counter) value to indicate how many different threads accessed particular node;
parallel_for, warnings fixes and ReturnAddress code from Alexander Alekhin;
Add OpenCL support to linearPolar & logPolar.
The OpenCL code use float instead of double, so that it does not require
cl_khr_fp64 extension, with slight precision lost.
Add explicit conversion
Add explicit conversion from double to float to eliminate warning during
compilation.
Rewrite linearPolar & logPolar so that they do not depend on the
deprecated API CvMat. Issue 6377 is resolved in this way because the two
routines do not convert UMat to CvMat anymore.
IPP_VERSION_MAJOR * 100 + IPP_VERSION_MINOR*10 + IPP_VERSION_UPDATE
to manage changes between updates more easily.
IPP_DISABLE_BLOCK was added to ease tracking of disabled IPP functions;
Removed IPP port for tiny arithm.cpp functions
Additional warnings fix on various platforms.
Build without OPENCL and GCC warnings fixed
Fixed warnings, trailing spaces and removed unused secure_cpy.
IPP code refactored.
IPP code path implemented as separate static functions to simplify future work with IPP code and make it more readable.