Resize reworked using wide universal intrinsics (#13781)
* Added wide universal intrinsics optimized implementation for 3 channel bit-exact linear resize
* Reworked linear resize using new wide LUT intrinsics
* Fix for VSX intrinsics
* LineIterator witout a Mat
cv::LineIterator can be used without being attached to any cv::Mat, it only needs the size and type of data. An alternative constructor has been defined for that.
In that case, a LineIterator can no more be dereferenced with the * operator, but pos() still returns valid pixel positions.
It can be useful when LineIterator is just used to compute positions of pixels on a line, without requiring to build a Mat just for that.
Use case : with a dataset that would represent a huge image, pixel positions can be pre-computed before querying the dataset API.
* Update imgproc.hpp
removed trailing spaces
* Update drawing.cpp
fixed warning
Due to size limit of shared memory, histogram is built on
the global memory for CV_16UC1 case.
The amount of memory needed for building histogram is:
65536 * 4byte = 256KB
and shared memory limit is 48KB typically.
Added test cases for CV_16UC1 and various clip limits.
Added perf tests for CV_16UC1 on both CPU and CUDA code.
There was also a bug in CV_8UC1 case when redistributing
"residual" clipped pixels. Adding the test case where clip
limit is 5.0 exposes this bug.
PyrDown: Fix bug #12961 (#13672)
* Force unaligned pointer and create test
* More cross-platform solution
* MSVC expects a proper order
* Remove useless clang macro
* added performance test for compareHist
* compareHist reworked to use wide universal intrinsics
* Disabled vectorization for CV_COMP_CORREL and CV_COMP_BHATTACHARYYA if f64 is unsupported
* significantly reduced OpenCV binary size by disabling IPP calls in some OpenCV functions: Sobel, Scharr, medianBlur, GaussianBlur, filter2D, mean, meanStdDev, norm, sum, minMaxIdx, sort.
* re-enable IPP in norm, since it's much faster (without adding too much space overhead)
* removed C API in the following modules: photo, video, imgcodecs, videoio
* trying to fix various compile errors and warnings on Windows and Linux
* continue to fix compile errors and warnings
* continue to fix compile errors, warnings, as well as the test failures
* trying to resolve compile warnings on Android
* Update cap_dc1394_v2.cpp
fix warning from the new GCC
* Updated boxFilter implementations to use wide universal intrinsics
* boxFilter implementation moved to separate file
* Replaced ROUNDUP macro with roundUp() function
* integrated the new C++ persistence; removed old persistence; most of OpenCV compiles fine! the tests have not been run yet
* fixed multiple bugs in the new C++ persistence
* fixed raw size of the parsed empty sequences
* [temporarily] excluded obsolete applications traincascade and createsamples from build
* fixed several compiler warnings and multiple test failures
* undo changes in cocoa window rendering (that was fixed in another PR)
* fixed more compile warnings and the remaining test failures (hopefully)
* trying to fix the last little warning
* RGB2RGB initially rewritten
* NEON impl removed
* templated version added for ushort, float
* data copying allowed for RGB2RGB
* inplace processing fixed
* fields to local vars
* no zeroupper until it's fixed
* vx_cleanup() added back
* rewrote the line segment intersection function to make the static analyzer happy
* fixed bug with improper "no intersection" detection in some of corner cases
* fixed bug with improper "no intersection" detection in some of corner cases
Exceptions caught by value incur needless cost in C++, most of them can
be caught by const-reference, especially as nearly none are actually
used. This could allow compiler generate a slightly more efficient code.
- improve cpu dispatching calls to allow more SIMD extentions
(SSE4.1, AVX2, VSX)
- wide universal intrinsics
- replace dummy v_expand with v_expand_low
- replace v_expand + v_mul_wrap with v_mul_expand for product accumulate operations
- use FMA for accumulate operations
- add mask and more types to accumulate's performance tests
* bgr2gray 8u fixed to be in conformance with IPP code
* coefficients fixed so their sum is 32768
* java test for CascadeDetect fixed: equalizeHist added
fix some errors found by static analyzer. (#12391)
* fix possible divided by zero and by negative values
* only 4 elements are used in these arrays
* fix uninitialized member
* use boolean type for semantic boolean variables
* avoid invalid array index
* to avoid exception and because base64_beg is only used in this block
* use std::atomic<bool> to avoid thread control race condition
* fix 12218
* Update test_distancetransform.cpp
marked the test as "BIGDATA_TEST" in order to skip it on low-mem platforms
* modify test
* use a smaller image in the test
* fix test code
* Bit-exact resize reworked to use wide intrinsics
* Reworked bit-exact resize row data loading
* Added bit-exact resize row data loaders for SIMD256 and SIMD512
* Fixed type punned pointer dereferencing warning
* Reworked loading of source data for SIMD256 and SIMD512 bit-exact resize
* Add HPX backend for OpenCV implementation
Adds hpx backend for cv::parallel_for_() calls respecting the nstripes chunking parameter. C++ code for the backend is added to modules/core/parallel.cpp. Also, the necessary changes to cmake files are introduced.
Backend can operate in 2 versions (selectable by cmake build option WITH_HPX_STARTSTOP): hpx (runtime always on) and hpx_startstop (start and stop the backend for each cv::parallel_for_() call)
* WIP: Conditionally include hpx_main.hpp to tests in core module
Header hpx_main.hpp is included to both core/perf/perf_main.cpp and core/test/test_main.cpp.
The changes to cmake files for linking hpx library to above mentioned test executalbles are proposed but have issues.
* Add coditional iclusion of hpx_main.hpp to cpp cpu modules
* Remove start/stop version of hpx backend
* add universal intrinsics for HSV2RGB_b
* rewritten HSV2RGB_b without using extra universal intrinsics
* removed unused variable
* undo changes in v_load_deinterleave
fixes handling of empty matrices in some functions (#11634)
* a part of PR #11416 by Yuki Takehara
* moved the empty mat check in Mat::copyTo()
* fixed some test failures
* Added accumulator value to the output of HoughLines and HoughCircles
* imgproc: refactor Hough patch
- eliminate code duplication
- fix type handling, fix OpenCL code
- fix test data generation
- re-generated test data in debug mode via plain CPU code path
* make sure that the matrix with more than INT_MAX elements is marked as non-continuous, and thus all the pixel-wise functions process it correctly (i.e. row-by-row, not as a single row, where integer overflow may occur when computing the total number of elements)
Arm: fix the test failure of OCL_Imgproc/CLAHETest.Accuracy on ODROID-XU4 (#11409)
* fix the test failure of OCL_Imgproc/CLAHETest.Accuracy on ODROID-XU4
* avoid the race condition in the reduce
* imgproc(ocl): simplify CLAHE code
* remove unused class
* model is not learned when grabcut is called with GC_EVAL
* fixed test, was writing to wrong file.
* modified patch by Iwan Paolucci; added GC_EVAL_FREEZE_MODEL in addition to GC_EVAL (which semantics is retained)
* Rewrite polar transformations
- A new wrapPolar function encapsulate both linear and semi-log remap
- Destination size is a parameter or calculated automatically to keep objects size between remapping
- linearPolar and logPolar has been deprecated
* Fix build warning and error in accuracy test
* Fix function name to warpPolar
* Explicitly specify the mapping mode, so we retain all the parameters as non-optional.
Introduces WarpPolarMode enum to specify the mapping mode in flags
* resolves performance warning on windows build
* removed duplicated logPolar and linearPolar implementations
* use universal intrinsic instead of raw intrinsic
* add 2 channels de-interleave on x86 platform
* add v_int32x4 version of v_muladd
* add accumulate version of v_dotprod based on the commit from seiko2plus on bf1852d
* remove some verify check in performance test
* avoid the out of boundary access and keep the performance
* Added custom implementation for NxN bit-exact GaussianBlur
* Reworked fixedpoint interface a bit
* Reworked horizontal line estimation for bit-exact GaussianBlur
* Reworked vertical line estimation for bit-exact GaussianBlur
* Updated range estimation for vectorized part of bit-exact GaussianBlur evaluation
* Fix#9363
* Renamed the structure and added a new function to the LineSegmentDetectorImpl class as a static member
* Added a new function to the LineSegmentDetectorImpl class as a static member
color.cpp split (#10869)
* initial split is done
* files renamed (these names are excluded during compilation)
* IPP code moved to corresponding files
* splineBuild, splineInterpolate -> color_lab.cpp
* Lab, Luv: little refactored
* it compiles (didn't check work); Lab OCL code moved to color_lab.cpp
* cvtcolor.cl: Lab/Luv part moved to color_lab.cl
* cvtcolor.cl: color_rgb.cl extracted
* cvtcolor.cl: color_yuv.cl separated
* cvtcolor.cl: color_hsv.cl extracted
* cvtcolor.cl: extracted to color_lab.cl and color_rgb.cl
* helper functions moved to hpp file
* Lab, Luv: moved to color_lab.cpp
* CPU XYZ: to color_lab.cpp
* OCL XYZ: to color_lab.cpp
* warning fixed
* CvtHelper added
* CPU YUV: to color_yuv.cpp, helpers to color.hpp
* CPU HLS/HSV: to color_hsv.cpp
* CPU BGR2BGR: to color_rgb.cpp
* CPU RGB: to color_rgb.cpp
* extra arg removed
* CPU YUV: to color_yuv.cpp
* color code decoded
* OclHelper added, some funcs rewritten
* color_lab.cpp: refactored to use OclHelper
* OCL RGB: to color_rgb.cpp
* OCL HLS/HSV: to color_hsv.cpp
* OCL YUV: to color_yuv.cpp
* OCL YUV planes: to color_yuv.cpp
* OCL: color code reduced
* licence to demosaicing.cpp
* IPP func tables to color_rgb.cpp
* code cleanup
* HAVE_OPENCL ifdefs added
* helpers made more common
* fixed two plane YUV with separate mats
* fixed warning in gcc7.2.0
* precomp header fixed
* color space classification functions fixed
* helpers fixed
* rename: isSRGB -> is_sRGB
* loosen some test threshold mainly for integer types
* use relative error for floating points result
* avoid division by zero by following the comment
* fix the indentation
* Add a new interface for hough transform
* Fixed warning code
* Fix HoughLinesUsingSetOfPoints based on HoughLinesStandard
* Delete memset
* Rename HoughLinesUsingSetOfPoints and add common function
* Fix test error
* Change static function name
* Change using CV_Assert instead of if-block and add integer test case
* I solve the conflict and delete 'std :: tr1' and changed it to use 'tuple'
* I deleted std::tr1::get and changed int to use 'get'
* Fixed sample code
* revert test_main.cpp
* Delete sample code in comment and add snippets
* Change file name
* Delete static function
* Fixed build error
- removed tr1 usage (dropped in C++17)
- moved includes of vector/map/iostream/limits into ts.hpp
- require opencv_test + anonymous namespace (added compile check)
- fixed norm() usage (must be from cvtest::norm for checks) and other conflict functions
- added missing license headers
* Fixing a bug in Canny implemetation when Sobel aperture size is 7.
* Fixing the bug in Canny accross variants and in test_canny.cpp
* Replacing a tab with white space
* Bit-exact implementation of GaussianBlur smoothing
* Added universal intrinsics based implementation for bit-exact CV_8U GaussianBlur smoothing.
* Added parallel_for to evaluation of bit-exact GaussianBlur
* Added custom implementations for 3x3 and 5x5 bit-exact GaussianBlur
Hough many circles (#10232)
* Add Hui's optimization. Merge with latest changes in OpenCV.
* Use conditional compilation instead of a runtime flag.
* Whitespace.
* Create the sequence for the nonzero edge pixels only if using that approach.
* Improve performance for finding very large numbers of circles
* Return the circles with the larger accumulator values first, as per API documentation.
Use a separate step to check distance between circles. Allows circles to be sorted by strength first. Avoids locking in EstimateRadius which was slowing it down.
Return centers only if maxRadius == 0 as per API documentation.
* Sort the circles so results are deterministic. Otherwise the order of circles with the same strength depends on parallel processing completion order.
* Add test for HoughCircles.
* Add beads test.
* Wrap the non-zero points structure in a common interface so the code can use either a vector or a matrix.
* Remove the special case for skipping the radius search if maxRadius==0.
* Add performance tests.
* Use NULL instead of nullptr.
OpenCV should compile with C++98 compiler.
* Put test suite name first.
Use different test suite names for each test to avoid an error from the test runner.
* Address build bot errors and warnings.
* Skip radius search if maxRadius < 0.
* Dynamically switch to NZPointList when it will be faster than NZPointSet.
* Fix compile error: missing 'typename' prior to dependent type name.
* Fix compile error: missing 'typename' prior to dependent type name.
This time fix it the non C++ 11 way.
* Fix compile error: no type named 'const_reference' in 'class cv::NZPointList'
* Disable ManySmallCircles tests. Failing on Mac.
* Change beads image to JPEG for smaller file size.
Try enabling the ManySmallCircles tests again.
* Remove ManySmallCircles tests. They are failing on the Mac build.
* Fix expectations to check all circles.
* Changing case on a case-insensitive file system
Step 1: remove the old file names
* Changing case on a case-insensitive file system
Step 2: add them back with the new names
* Fix cmpAccum function to be strictly weak ordered.
* Add tests for many small circles.
* imgproc(perf): fix HoughCircles tests
* imgproc(houghCircles): refactor code
- simplify NZPointList
- drop broken (de-synchronization of 'current'/'mi' fields) NZPointSet iterator
- NZPointSet iterator is replaced to direct area scan
- use SIMD intrinsics
- avoid std exceptions (build for embedded systems)
* Add test that fails
* Fix integer pointPolygonTest for large coordinate values
* Review fixes:
- change type from long long to int64
- move test code to test_contours.cpp, and make it C++98 compliant
* Hopefully fix compiler error by using push_back instead of emplace_back
* fixed OpenCL functions on Mac, so that the tests pass
* fixed compile warnings; temporarily disabled OCL branch of TV L1 optical flow on mac
* fixed other few warnings on macos
If there are no OpenCL/UMat methods calls from application.
OpenCL subsystem is initialized:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called
Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
* Error in the documentation for cv::getRectSubPix. #9788
The function name is corrected to GetRectSubPix since, it uses the notation
of src, dst and center. Also added the number of channel assertion criteria.
* Error in the documentation for cv::getRectSubPix. #9788
Replace dst with patch in the formula, reverted function name to
getRectSubPix, removed BorderTypes comment line due to no explicit call
to the function found.
* Error in the documentation for cv::getRectSubPix. #9788
Replace dst with patch in the formula, reverted function name to
getRectSubPix, removed BorderTypes comment line due to no explicit call
to the function found.
* Update OpenCVCompilerOptimizations.cmake
Neon not supported on MSVC ARM breaking build fix
* Update OpenCVCompilerOptimizations.cmake
Whitespace
* Update intrin.hpp
Many problems in MSVC ARM builds (at least on VS2017) being fixed in this PR now.
C:\Users\Gregory\DOCUME~1\MYLIBR~1\OPENCV~3\opencv\sources\modules\core\include\opencv2/core/hal/intrin.hpp(444): error C3861: '_tzcnt_u32': identifier not found
* Update hal_replacement.hpp
Passing variadic expansion in a macro to another macro does not work properly in MSVC and a famous known workaround is hereby applied. Discussion of it: https://stackoverflow.com/questions/5134523/msvc-doesnt-expand-va-args-correctly
Only needed the fix for ARM builds: TEGRA_ macros are used for cv_hal_ functions in the carotene library.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): warning C4003: not enough actual parameters for macro 'TEGRA_ADD'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2143: syntax error: missing ')' before ','
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2059: syntax error: ')'
* Update hal_replacement.hpp
All hal_replacement's using carotene\hal\tegra_hal.hpp TEGRA_ functions as macros preprocessed by variadic macros should be changed, identical as was done in core.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): warning C4003: not enough actual parameters for macro 'TEGRA_CVTBGRTOBGR'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): error C2059: syntax error: '=='
* Update OpenCVCompilerOptimizations.cmake
* Update hal_replacement.hpp
* Update hal_replacement.hpp
Adds fitEllipseDirect to imgproc: The Direct least square (Direct) method by Fitzgibbon1999.
New Tests are included for the methods.
fitEllipseAMS Tests
fitEllipseDirect Tests
Comparative examples are added to fitEllipse.cpp in Samples.
imgproc: use universal intrinsic as much as possible (#9714)
* use universal intrinsic as much as possible
* make SSE3 part as common as possible with universal intrinsic implementation
* put the reducing part out of the main loop
* follow the comment
* fix the typo
* use v_reduce_sum4
* follow the comment again
* remove all CV_SSE3 part from smooth.cpp
The non-maximum suppression in the Hough accumulator incorrectly ignores maxima that extend over more than one cell, i.e. two neighboring cells both have the same accumulator value. This maximum is dropped completely instead of picking at least one of the entries. This frequently results in obvious circles being missed.
The behavior is now changed to be the same as for hough_lines.
See also https://github.com/opencv/opencv/issues/4440