[GSoC] New universal intrinsic backend for RVV
* Add new rvv backend (partially implemented).
* Modify the framework of Universal Intrinsic.
* Add CV_SIMD macro guards to current UI code.
* Use vlanes() instead of nlanes.
* Modify the UI test.
* Enable the new RVV (scalable) backend.
* Remove whitespace.
* Rename and make other modifications.
* Update intrin.hpp; it still does not work on AVX/SSE
* Update conditional compilation macros.
* Use static variable for vlanes.
* Use max_nlanes for array defining.
It's not clear how the ranges argument should be used in the overload of
calcHist that accepts std::vector. The main overload uses an array of
arrays there, while the std::vector overload uses a plain array. The code
interprets the vector as a flattened array and rebuilds the array of arrays
from it. This is not an obvious interpretation, so documentation has been
added to explain the expected usage.
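A minimal sketch of the flattened interpretation described above, written
without OpenCV. The helper name splitRanges and the two-bounds-per-dimension
layout (uniform bins) are assumptions made for illustration, not the actual
calcHist code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenCV): rebuild one pointer per dimension
// from a flattened {lo0, hi0, lo1, hi1, ...} vector.
std::vector<const float*> splitRanges(const std::vector<float>& flat,
                                      std::size_t dims)
{
    // Assumed layout: exactly two bounds per histogram dimension.
    assert(flat.size() == dims * 2);
    std::vector<const float*> perDim(dims);
    for (std::size_t i = 0; i < dims; ++i)
        perDim[i] = &flat[i * 2];  // points at the pair {lo_i, hi_i}
    return perDim;
}
```

Under that assumption, a two-dimensional histogram would pass
{0, 180, 0, 256} rather than two separate two-element arrays.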
Replaced sprintf with safer snprintf
* Straightforward replacement of sprintf with safer snprintf
* Trickier replacement of sprintf with safer snprintf
Some functions were changed to take another parameter: the size of the buffer, so that they can pass that size on to snprintf.
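A sketch of the "trickier" pattern, assuming a helper that previously wrote
into a caller-owned buffer with sprintf. The function formatIndex and its
parameters are hypothetical and only illustrate passing the buffer size through:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: the caller now supplies bufSize so it can be forwarded
// to snprintf. Returns true when the whole string fit into the buffer.
bool formatIndex(char* buf, std::size_t bufSize, const char* prefix, int index)
{
    assert(buf != nullptr && bufSize > 0);
    // snprintf never writes more than bufSize bytes and always terminates.
    const int n = std::snprintf(buf, bufSize, "%s_%d", prefix, index);
    // A return value >= bufSize means the output was truncated.
    return n >= 0 && static_cast<std::size_t>(n) < bufSize;
}
```

Checking the return value lets the caller detect truncation, which sprintf
could not report.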
Fixed out-of-bounds read in parallel version of ippGaussianBlur()
* Fixed out-of-bounds read in parallel version of ippGaussianBlur()
* Fixed check
* Revert changes in CMakeLists.txt
* better accuracy of _rotatedRectangleIntersection
Instead of simply migrating to double precision (which would also work), some computations are scaled by a factor that depends on the length of the smallest vectors.
Accuracy improves even with floats, which helps in very sensitive cases.
* Update intersection.cpp
use L2SQR norm to tune the numeric scale
* Update intersection.cpp
adapt samePointEps with L2 norm
* Update intersection.cpp
move comment
* Update intersection.cpp
fix wrong numericalScalingFactor usage
* added tests
* fixed warnings returned by buildbot
* modifications suggested by reviewer
renaming numericalScaleFctor to normalizationScale
refactor some computations
more "const"
* modifications as suggested by reviewer
Optimize cv::applyColorMap() for simple case
* Optimize cv::applyColorMap() for simple case
PR for 21640
For regular cv::Mat CV_8UC1 src, applying the colormap is simpler than calling the cv::LUT() mechanism.
* add support for src as CV_8UC3
src as CV_8UC3 is handled with a BGR2GRAY conversion, the same optimized code being used afterwards
* code style
rely on cv::Mat.ptr() to index data
* Move new implementation to ColorMap::operator()
Changes as suggested by reviewer
* style
improvements suggested by reviewer
* typo
* tune parallel work
* better usage of parallel_for_
use nstripes parameter of parallel_for_
assume _lut is continuous to bring faster pixel indexing
optimize src/dst access by contiguous rows of pixels
do not copy the LUT locally any more; the copy is no longer relevant with the new optimizations
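A sketch of the simple case in plain C++, without OpenCV: for single-channel
8-bit input, each pixel value can index a 256-entry colour table directly.
The Bgr struct and applyTable are hypothetical stand-ins, and the real code
additionally splits rows across parallel_for_ stripes:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Bgr { std::uint8_t b, g, r; };

// Hypothetical stand-in for the 8-bit fast path: one table lookup per pixel
// over a contiguous buffer, with no generic LUT machinery.
std::vector<Bgr> applyTable(const std::vector<std::uint8_t>& gray,
                            const std::vector<Bgr>& table)
{
    assert(table.size() == 256);  // one entry per possible 8-bit value
    std::vector<Bgr> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i)
        out[i] = table[gray[i]];
    return out;
}
```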
Fixed threshold(THRESH_TOZERO) at imgproc(IPP)
* Fixed #16085: imgproc(IPP): wrong result from threshold(THRESH_TOZERO)
* 1. Added test cases with float where all bits of mantissa equal 1, min and max float as inputs
2. Used nextafterf instead of cast to hex
* Used float value in test instead of hex and casts
* Changed input value in test
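A sketch of how boundary inputs can be generated without hex casts. The
reference function threshToZero is a plain restatement of THRESH_TOZERO
semantics and justAbove is a hypothetical helper; neither is the actual test
code. std::nextafter on float arguments is the C++ counterpart of nextafterf:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Reference THRESH_TOZERO semantics: keep values strictly above the
// threshold, set everything else to zero.
float threshToZero(float src, float thresh)
{
    assert(!std::isnan(thresh));
    return src > thresh ? src : 0.f;
}

// Smallest float strictly greater than x, obtained without bit casts.
float justAbove(float x)
{
    return std::nextafter(x, std::numeric_limits<float>::infinity());
}
```

A value equal to the threshold should map to zero, while justAbove(thresh)
should pass through unchanged.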
When computing:
t1 = (bayer[1] + bayer[bayer_step] + bayer[bayer_step+2] + bayer[bayer_step*2+1])*G2Y;
there is a T (unsigned short or char) multiplied by an int, which can overflow.
Then again, the result is stored in t1, which is unsigned, so the overflow disappears.
Keeping everything unsigned is safer.
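A sketch of the all-unsigned computation for 16-bit samples. The weight
value 9617 is an assumption chosen for illustration, and greenTerm is a
hypothetical helper rather than the demosaicing code itself:

```cpp
#include <cassert>
#include <cstdint>

// Assumed fixed-point green weight, used only for illustration.
const std::uint32_t kG2Y = 9617u;

// Sum four green neighbours and scale, keeping every operand unsigned.
std::uint32_t greenTerm(std::uint16_t a, std::uint16_t b,
                        std::uint16_t c, std::uint16_t d)
{
    const std::uint32_t sum = static_cast<std::uint32_t>(a) + b + c + d;
    const std::uint32_t t1 = sum * kG2Y;
    // With this weight, even the largest product fits in 32 unsigned bits.
    assert(t1 == static_cast<std::uint64_t>(sum) * kG2Y);
    return t1;
}
```

With four samples of 65535 the product is 2521000380, which exceeds the
largest 32-bit signed int (2147483647) but fits in an unsigned 32-bit value.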
* Fix integer overflow in cv::Luv2RGBinteger::process.
For LL=49, uu=205, vv=23, we end up with x=7373056 and y=458
which overflows y*x.
* imgproc(test): adjust test parameters to cover SIMD code
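The Luv2RGBinteger overflow can be checked with the values quoted above
(x=7373056, y=458). The helpers wideProduct and fitsInInt are hypothetical;
widening to 64 bits is one possible remedy and not necessarily the one
applied in the commit:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Widen before multiplying so the product cannot overflow a 32-bit int.
std::int64_t wideProduct(int y, int x)
{
    return static_cast<std::int64_t>(y) * x;
}

// True when v is representable as a plain int.
bool fitsInInt(std::int64_t v)
{
    return v >= std::numeric_limits<int>::min() &&
           v <= std::numeric_limits<int>::max();
}
```

For the quoted inputs the product is 3376859648, which does not fit in a
32-bit int.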
For a very small negative h (e.g. -1e-40), the current implementation
takes the first condition, ends up with h = 6.f, and skips
the second condition.
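A sketch of the issue, assuming the wrap is written as an if / else-if pair.
Both function names are hypothetical, and wrapBoth is only one possible fix:

```cpp
#include <cassert>

// Assumed original shape: the upper bound is only tested when h was not
// negative, so a result of exactly 6.f can escape.
float wrapElse(float h)
{
    if (h < 0.f)       do h += 6.f; while (h < 0.f);
    else if (h >= 6.f) do h -= 6.f; while (h >= 6.f);
    return h;
}

// Possible fix: test both bounds independently.
float wrapBoth(float h)
{
    if (h < 0.f)  do h += 6.f; while (h < 0.f);
    if (h >= 6.f) do h -= 6.f; while (h >= 6.f);
    return h;
}
```

Adding 6.f to a tiny negative value rounds to exactly 6.f in float, which is
why the first version can return a value outside [0, 6).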
* Add RowVec_8u32f
* Fix build errors in Linux x64 Debug and armeabi-v7a
* Reformat code to make it cleaner and more conventional
* Optimise with vx_load_expand_q()
Different paddings in cvtColorTwoPlane() for biplane YUV420
* Different paddings support in cvtColorTwoPlane() for biplane YUV420
* Build fix for dispatch case.
* Restored old behaviour for y.step==uv.step to exclude perf regressions.
Co-authored-by: amir.tulegenov <amir.tulegenov@xperience.ai>
Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>
Update rotatedRectangleIntersection function to calculate near the origin
* Change type used in points function from RotatedRect
In the function that sets the points of a RotatedRect, the types
should be double in order to keep the precision when dealing with
RotatedRects that are defined far from the origin.
This commit solves the problem in some assertions from
rotatedRectangleIntersection when dealing with rectangles far from
origin.
* added proper type casts
* Update rotatedRectangleIntersection function to calculate near the origin
This commit changes the rotatedRectangleIntersection function so that it
calculates the intersection of two rectangles after shifting them
near the coordinate origin (0, 0).
This commit solves the problem in some assertions from
rotatedRectangleIntersection when dealing with rectangles far from
origin.
* Revert type changes in types.cpp and adapt code to C++98
* Revert unnecessary casts in types.cpp
Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
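A sketch of why shifting toward the origin helps with float coordinates.
The helpers cornerX and shiftedCornerX are hypothetical, and using the first
rectangle's centre as the shift is an assumption; the commit text does not
say which reference point is used:

```cpp
#include <cassert>

// Corner x-coordinate of a rectangle with the given centre and half-width.
float cornerX(float centerX, float halfWidth)
{
    return centerX + halfWidth;
}

// Same corner, computed after moving the centre by `shift` toward the origin.
float shiftedCornerX(float centerX, float halfWidth, float shift)
{
    return (centerX - shift) + halfWidth;
}
```

Near 1e7 adjacent floats are 1.0 apart, so a half-width of 0.25 vanishes in
cornerX(1e7f, 0.25f), while shiftedCornerX(1e7f, 0.25f, 1e7f) keeps it.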
* Add Neon optimised RGB2Lab conversion
* Fix compile errors, change lambda to macro
* Change NEON optimised RGB2Lab to just use HAL
* Change [] to v_extract_n in RGB2Lab
* RGB2Lab code quality, make code nlanes-agnostic
* Change RGB2Lab to use function rather than macro
* Remove whitespace
Co-authored-by: Francesco Petrogalli <25690309+fpetrogalli@users.noreply.github.com>