Commit Graph

1994 Commits

Author SHA1 Message Date
Philipp Hasper
6032d86e23 Added getFontScaleFromHeight() 2018-01-11 14:00:27 +01:00
Alexander Alekhin
d0b2e60edc Merge pull request #10514 from alalek:build_warnings 2018-01-05 20:54:01 +00:00
Alexander Alekhin
8acd05f12a Merge pull request #10421 from cezheng:patch-1 2018-01-05 09:24:33 +00:00
Alexander Alekhin
116c8d0ddf build: eliminate warnings
warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
2018-01-05 04:30:08 +00:00
victor-ludorum
ad23c10600
Updating rotcalipers.cpp to resolve issue #10096
Updating the documentation of the rotcalipers.cpp to resolve issue #10096
2017-12-30 10:21:46 +05:30
Tom Becker
592f8d8c1b Merge pull request #10232 from TomBecker-BD:hough-many-circles
Hough many circles (#10232)

* Add Hui's optimization. Merge with latest changes in OpenCV.

* Use conditional compilation instead of a runtime flag.

* Whitespace.

* Create the sequence for the nonzero edge pixels only if using that approach.

* Improve performance for finding very large numbers of circles

* Return the circles with the larger accumulator values first, as per API documentation.
Use a separate step to check distance between circles. Allows circles to be sorted by strength first. Avoids locking in EstimateRadius which was slowing it down.
Return centers only if maxRadius == 0 as per API documentation.

* Sort the circles so results are deterministic. Otherwise the order of circles with the same strength depends on parallel processing completion order.

* Add test for HoughCircles.

* Add beads test.

* Wrap the non-zero points structure in a common interface so the code can use either a vector or a matrix.

* Remove the special case for skipping the radius search if maxRadius==0.

* Add performance tests.

* Use NULL instead of nullptr.
OpenCV should compile with C++98 compiler.

* Put test suite name first.
Use different test suite names for each test to avoid an error from the test runner.

* Address build bot errors and warnings.

* Skip radius search if maxRadius < 0.

* Dynamically switch to NZPointList when it will be faster than NZPointSet.

* Fix compile error: missing 'typename' prior to dependent type name.

* Fix compile error: missing 'typename' prior to dependent type name.
This time fix it the non C++ 11 way.

* Fix compile error: no type named 'const_reference' in 'class cv::NZPointList'

* Disable ManySmallCircles tests. Failing on Mac.

* Change beads image to JPEG for smaller file size.
Try enabling the ManySmallCircles tests again.

* Remove ManySmallCircles tests. They are failing on the Mac build.

* Fix expectations to check all circles.

* Changing case on a case-insensitive file system
Step 1: remove the old file names

* Changing case on a case-insensitive file system
Step 2: add them back with the new names

* Fix cmpAccum function to be strictly weak ordered.

* Add tests for many small circles.

* imgproc(perf): fix HoughCircles tests

* imgproc(houghCircles): refactor code

- simplify NZPointList
- drop broken (de-synchronization of 'current'/'mi' fields) NZPointSet iterator
- NZPointSet iterator is replaced to direct area scan
- use SIMD intrinsics
- avoid std exceptions (build for embedded systems)
2017-12-28 17:23:11 +03:00
Vadim Pisarevsky
3f68d6d8a7 Merge pull request #10392 from terfendail:bitexact_fallback 2017-12-22 13:23:55 +00:00
Alexander Alekhin
83b8cd0152 Merge pull request #10375 from tomoaki0705:buildWarningMSVC 2017-12-22 13:17:12 +00:00
Vitaly Tuzov
5fdb42a7c9 Added fallback to generic linear resize in case bit-exact resize of provided matrix isn't supported 2017-12-22 14:29:50 +03:00
Ce Zheng
602b08d9c7
Update resize inline comments
Reading through the implementation, I feel this line of comment is not consistent with the actually code, so this is for correcting it.
2017-12-22 16:03:12 +08:00
Vitaly Tuzov
019162486c Disabled universal intrinsic based implementation for bit-exact resize of 3-channel images 2017-12-22 10:08:30 +03:00
Tomoaki Teshima
fe7b3f1228 clean up the code
* disable the warning in CMake, not int the code using pragma
2017-12-22 08:42:21 +09:00
Vadim Pisarevsky
a8a51db42b Merge pull request #10316 from terfendail:bitexact_c234 2017-12-21 18:56:54 +00:00
Alexander Alekhin
813ff37967 imgproc(ocl): fix RGB2RGBA kernel out of range access 2017-12-20 14:19:46 +00:00
Vitaly Tuzov
1eb2fa9efb Added universal intrinsics based implementations for CV_8UC2, CV_8UC3, CV_8UC4 bit-exact resizes. 2017-12-20 17:17:10 +03:00
elenagvo
b0e9d76ced HAL for canny 2017-12-19 11:03:10 +03:00
Maksim Shabunin
1033f2b1bd Fixed 3 issues found by static analysis 2017-12-15 17:29:26 +03:00
Vitaly Tuzov
51cb56ef2c Implementation of bit-exact resize. Internal calls to linear resize updated to use bit-exact version. (#9468) 2017-12-13 15:00:38 +03:00
Maksim Shabunin
7349b8f5ce Build for embedded systems 2017-12-11 13:27:37 +03:00
Elena Gvozdeva
6185f7209e Merge pull request #10172 from ElenaGvozdeva:eg/HAL_sobel
* add HAL for SobelFilter
* add HAL for pyrDown
* add HAL for Scharr
2017-12-08 16:36:24 +03:00
Vadim Pisarevsky
4781f0a337 Merge pull request #10024 from iago-suarez:bugfix-lsd-multiple-imgs-issue#10023 2017-12-06 09:01:37 +00:00
Vadim Pisarevsky
4b8275061e Merge pull request #10058 from ElenaGvozdeva:eg/HAL 2017-12-05 20:56:24 +00:00
Juha Reunanen
5b41599911 Fix pointPolygonTest for large coordinate values (#10222)
* Add test that fails

* Fix integer pointPolygonTest for large coordinate values

* Review fixes:
- change type from long long to int64
- move test code to test_contours.cpp, and make it C++98 compliant

* Hopefully fix compiler error by using push_back instead of emplace_back
2017-12-05 15:49:44 +03:00
Vadim Pisarevsky
5ce38e516e Merge pull request #10223 from vpisarev:ocl_mac_fixes
* fixed OpenCL functions on Mac, so that the tests pass

* fixed compile warnings; temporarily disabled OCL branch of TV L1 optical flow on mac

* fixed other few warnings on macos
2017-12-05 13:32:28 +03:00
elenagvo
7bfb38055c remove matrix release 2017-12-01 14:38:00 +03:00
elenagvo
81519537ae fix the parameters order 2017-12-01 14:38:00 +03:00
elenagvo
0f12351a41 fix accelerators order 2017-12-01 14:38:00 +03:00
elenagvo
7aadbc9607 remove complex data structs 2017-12-01 14:38:00 +03:00
elenagvo
ce65975625 call HAL for GaussianBlur is fixed 2017-12-01 14:38:00 +03:00
elenagvo
c2c7333107 add hal for GaussianBlur 2017-12-01 14:38:00 +03:00
elenagvo
cb9e110adb add HAL for BoxFilter 2017-12-01 14:38:00 +03:00
Vadim Pisarevsky
7ae19467b5 Merge pull request #10171 from ElenaGvozdeva:Threshold 2017-11-30 10:02:47 +00:00
elenagvo
73ac5321f5 fix threshold HAL 2017-11-28 17:42:29 +03:00
elenagvo
762138e77e define adaptiveMethod and thresholdType for HAL 2017-11-28 16:02:06 +03:00
Alexander Alekhin
0ed3209b00 ocl: avoid unnecessary loading/initializing OpenCL subsystem
If there are no OpenCL/UMat methods calls from application.

OpenCL subsystem is initialized:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called

Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
2017-11-28 14:02:42 +03:00
elenagvo
c95bc0c7fd add HAL for threshold 2017-11-27 15:35:02 +03:00
elenagvo
11ddb9332c add HAL for adaptiveThreshold 2017-11-27 15:35:02 +03:00
Vadim Pisarevsky
92be112388 Merge pull request #10107 from ElenaGvozdeva:medianBlur_HAL 2017-11-27 10:12:16 +00:00
Alexander Alekhin
11330b9253 Merge pull request #10095 from alalek:fix_canny_intrinsics 2017-11-25 12:31:47 +00:00
elenagvo
5d0a8d2aaf fix the parameters order 2017-11-23 13:30:00 +03:00
Maksim Shabunin
1c46034166 Other buffers 2017-11-21 17:55:23 +03:00
Maksim Shabunin
2178c5e95e init ABtoXZ 2017-11-21 17:55:23 +03:00
Maksim Shabunin
b3018ba89e LUV tables 2017-11-21 17:55:23 +03:00
Maksim Shabunin
e75056a084 static init 2017-11-21 17:55:23 +03:00
Maksim Shabunin
51fc891a5c cvtColor: fixed tables init, moved some tables to heap 2017-11-21 17:55:23 +03:00
Rostislav Vasilikhin
397b57dd72 Merge pull request #10041 from savuor:RevHoughWorks
HoughCircles rewritten (PR #7434 updated) (#10041)

* initial version of renewed HoughCircles done

* fixed compilation

* fixed SIMD ability & compilation warning

* fixed accumulator nonmax comparison

* common Mutex for all invokers

* nzLocal is std::vector

* nz is std::vector

* SSE2 -> SIMD128

* centers is now std::vector

* circles is std::vector

* estimateRadius updated

* accum calculation w/o mutex

* less deprecated code

* several bugs fixed

* back to mutex, TLS gathering doesn't work

* extra code removed

* little refactoring

* docs note updated

* a little speedup

* warning fixed
2017-11-21 14:18:47 +03:00
elenagvo
3a09da71d8 add HAL for medianBlur 2017-11-20 17:09:22 +03:00
elenagvo
20c08eab73 change border type for medianBlur to BORDER_ISOLATED 2017-11-17 15:13:04 +03:00
Alexander Alekhin
baff521b63 imgproc(Canny): eliminate unnecessary operations
drop manual loop unrolling:
- don't block compiler optimizations
- no effect on i5
2017-11-15 19:29:59 +00:00
Suleyman TURKMEN
992a81dcaa Update lsd.cpp 2017-11-11 22:26:44 +03:00
Alexander Alekhin
e89ae986fa Merge pull request #10053 from terfendail:intersectconvex_fix 2017-11-09 11:05:56 +00:00
Vitaly Tuzov
0205238dca Fix for intersectConvexConvex nested contours check 2017-11-08 20:27:07 +03:00
Alexander Alekhin
981009ac1f Merge pull request #9999 from mshabunin:fix-gcc72-warnings 2017-11-07 13:37:25 +00:00
Alexander Alekhin
96aebbe7f9 Merge pull request #9970 from mshabunin:media-sdk-convert 2017-11-07 13:37:07 +00:00
Iago Suárez
42d942e19c Clearing data into the detect method of the class cv::LineSegmentDetectorImpl 2017-11-04 17:12:20 +01:00
Maksim Shabunin
184daa155f Fixed minor issues reported by GCC 7.2 2017-11-03 18:06:39 +03:00
Alexander Alekhin
c9c759f700 ocl: fix moments global_size (should be >= local_size) 2017-11-02 13:37:57 +03:00
Maksim Shabunin
0c79f4a00f MediaSDK: fixed Linux build, improved BGR<->NV12 conversions 2017-11-01 14:14:45 +03:00
Alexander Alekhin
7b0d2d189f Merge pull request #9951 from alalek:fix_bilateral_f32_simd 2017-10-27 16:43:20 +00:00
Vadim Pisarevsky
ff190b1ef4 Merge pull request #9802 from Nickolays:Fix#9797 2017-10-27 13:00:34 +00:00
Alexander Alekhin
cc9ab7e582 imgproc: fix bilateral filter SIMD 32f optimization 2017-10-27 14:20:13 +03:00
Gregory Morse
d30a0c6f03 Merge pull request #9856 from GregoryMorse:patch-1
* Update OpenCVCompilerOptimizations.cmake

Neon not supported on MSVC ARM breaking build fix

* Update OpenCVCompilerOptimizations.cmake

Whitespace

* Update intrin.hpp

Many problems in MSVC ARM builds (at least on VS2017) being fixed in this PR now.

C:\Users\Gregory\DOCUME~1\MYLIBR~1\OPENCV~3\opencv\sources\modules\core\include\opencv2/core/hal/intrin.hpp(444): error C3861: '_tzcnt_u32': identifier not found

* Update hal_replacement.hpp

Passing variadic expansion in a macro to another macro does not work properly in MSVC and a famous known workaround is hereby applied.  Discussion of it: https://stackoverflow.com/questions/5134523/msvc-doesnt-expand-va-args-correctly
Only needed the fix for ARM builds: TEGRA_ macros are used for cv_hal_ functions in the carotene library.

C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): warning C4003: not enough actual parameters for macro 'TEGRA_ADD'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2143: syntax error: missing ')' before ','
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2059: syntax error: ')'

* Update hal_replacement.hpp

All hal_replacement's using carotene\hal\tegra_hal.hpp TEGRA_ functions as macros preprocessed by variadic macros should be changed, identical as was done in core.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): warning C4003: not enough actual parameters for macro 'TEGRA_CVTBGRTOBGR'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): error C2059: syntax error: '=='

* Update OpenCVCompilerOptimizations.cmake

* Update hal_replacement.hpp

* Update hal_replacement.hpp
2017-10-16 12:12:35 +03:00
Nickola
b2b56b6896 Fix Issue #9797 minEnclosingCircle from three points
and add test for only 3 pts
2017-10-13 19:26:54 +03:00
Vadim Pisarevsky
e955bcb872 Merge pull request #9824 from sturkmen72:upd_minEnclosingTriangle 2017-10-11 17:11:15 +00:00
Suleyman TURKMEN
29c186a022 Update min_enclosing_triangle.cpp 2017-10-11 17:48:19 +03:00
Suleyman TURKMEN
b2673a19cf Updates min_enclosing_triangle.cpp 2017-10-10 23:23:36 +03:00
Alexander Alekhin
e615fafe2d build: fix MSVS2010 2017-10-08 23:32:22 +03:00
Jasper Shemilt
800da724a3 Fix Transposed eigenvals and vecs. Didn't notice at first 2017-10-02 17:56:08 +01:00
Jasper Shemilt
0136711cf4 Adds fitEllipseAMS to imgproc: The Approximate Mean Square (AMS) proposed by Taubin 1991.
Adds fitEllipseDirect to imgproc: The Direct least square (Direct) method by Fitzgibbon1999.

New Tests are included for the methods.
fitEllipseAMS Tests
fitEllipseDirect Tests

Comparative examples are added to fitEllipse.cpp in Samples.
2017-10-02 16:38:41 +01:00
Tomoaki Teshima
139b32734e Merge pull request #9714 from tomoaki0705:universalBilateral
imgproc: use universal intrinsic as much as possible (#9714)

* use universal intrinsic as much as possible
  * make SSE3 part as common as possible with universal intrinsic implementation
  * put the reducing part out of the main loop

* follow the comment
  * fix the typo
  * use v_reduce_sum4

* follow the comment again
  * remove all CV_SSE3 part from smooth.cpp
2017-09-28 12:30:22 +03:00
Alexander Alekhin
488d4df520 Merge pull request #9710 from savuor:ovx_harris_build_fix 2017-09-28 09:27:09 +00:00
Peter Fischer
332588fcee Fix bug: non-maximum suppression for hough circle
The non-maximum suppression in the Hough accumulator incorrectly ignores maxima that extend over more than one cell, i.e. two neighboring cells both have the same accumulator value. This maximum is dropped completely instead of picking at least one of the entries. This frequently results in obvious circles being missed.

The behavior is now changed to be the same as for hough_lines.

See also https://github.com/opencv/opencv/issues/4440
2017-09-27 11:47:30 +02:00
Alexander Alekhin
7475d23fec Merge pull request #9707 from woodychow:fix_undistortrectifymap_avx2 2017-09-26 16:42:47 +00:00
woody.chow
ef8c919e3f Minor optimization of initUndistortRectifyMapLine_AVX 2017-09-26 09:12:22 +09:00
Alexander Alekhin
d9107601c3 Merge pull request #9690 from tomoaki0705:universalSmooth 2017-09-25 19:42:06 +00:00
Rostislav Vasilikhin
62c4cf3ac6 fixed build of OpenVX Harris 2017-09-25 15:58:09 +03:00
Tomoaki Teshima
e932160a8d replace raw SSE2/NEON implementation with universal intrinsic 2017-09-22 23:43:05 +09:00
vipinanand4
39e742765a Merge pull request #9618 from vipinanand4:goodFeaturesToTrack_added_gradiantSize
Added gradiantSize param into goodFeaturesToTrack API (#9618)

* Added gradiantSize param into goodFeaturesToTrack API

Removed hardcode value 3 in goodFeaturesToTrack API, and
added new param 'gradinatSize' in this API so that user can
pass any gradiant size as 3, 5 or 7.

Signed-off-by: Vipin Anand <anand.vipin@gmail.com>
Signed-off-by: Nilaykumar Patel<nilay.nilpat@gmail.com>
Signed-off-by: Prashanth Voora <prashanthx85@gmail.com>

* fixed compilation error for java test

Signed-off-by: Vipin Anand <anand.vipin@gmail.com>

* Modifying code for previous binary compatibility and fixing other warnings

fixed ABI break issue

resolved merged conflict

compilation error fix

Signed-off-by: Vipin Anand <anand.vipin@gmail.com>
Signed-off-by: Patel, Nilaykumar K <nilay.nilpat@gmail.com>
2017-09-22 14:04:43 +00:00
Vadim Pisarevsky
3f794cd951 Merge pull request #9147 from sovrasov:phase_corr_fix 2017-09-22 10:32:16 +00:00
Rostislav Vasilikhin
cc547e8260 Bit-exact version of Luv2RGB_b (#9470)
* lab_tetra squashed

* initial version is almost written

* unfinished work

* compilation fixed, to be debugged

* Lab test removed

* more fixes

* Luv2RGBinteger: channels order fixed

* Lab structs removed

* good trilinear interpolation added

* several fixes

* removed Luv2RGB interpolations, XYZ tables; 8-cell LUT added

* no_interpolate made 8-cell

* interpolations rewritten to 8-cell, minor fixes

* packed interpolation added for RGB2Luv

* tetra implemented

* removing unnecessary code

* LUT building merged

* changes ported to color.cpp

* minor fixes; try to suppress warnings

* fixed v range of Luv

* fixed incorrect src channel number

* minor fixes

* preliminary version of Luv2RGBinteger is done

* Luv2RGB_b is in progress

* XYZ color constants converted to softfloat

* Luv test: precision fixed

* Luv bit-exactness test added

* warnings fixed

* compilation fixed, error message fixed

* Luv check is limited to [0-2,0-2,0-2] by XYZ

* L->Y generation moved to LUT

* LUTs added for up and vp of Luv2RGB_b

* still works

* fixed-point is done, works at maxerr 2

* vectorized code is done, 2x slower than original

* perf improved by 10%

* extra comments removed

* code moved to color.cpp

* test_lab.cpp updated

* minor refactoring

* test added for Luv2RGB

* OCL Luv2RGB_b: XYZ are limited to [0, 2]; docs updated

* Luv2RGB_b rewritten to universal intrinsics

* test_lab.cpp moved to luv_tetra branch
2017-09-21 14:20:45 +03:00
Dmitry Kurtaev
fa109b94d9 Update 16UC thresholding 2017-09-18 18:34:45 +03:00
Arvid Piehl Lauritsen Böttiger
ca245e995a Added support for thresholding CV_16U images. 2017-09-18 15:34:37 +03:00
Vadim Pisarevsky
48cc1b351d Merge pull request #9560 from sovrasov:undistort_stop_criteria 2017-09-15 12:34:16 +00:00
Rostislav Vasilikhin
4435ec5f26 Bit-exact version of RGB2Luv_b (#9226)
* Imgproc_ColorLab_Full.accuracy test fixed

* Lab and Luv tests: rewritten, constants explained

* CV_ColorCvtBaseTest: added methods for 8u implementations

* Lab2RGB_b: bit-exactness enabled for all modes; non-vectorized code fixed to comply with vectorized

* srgb support added

* XYZ constants made softdouble

* bit-exact tests written for Lab

* ColorLab_full test fixed

* reverted: no 8u convertors for CV_ColorCvtBaseTest

* added checksum-based test for Lab bit-exactness

* extra declarations removed

* Lab test fix: stop at first mismatch

* test info output improved

* error message fixed

* lab_tetra squashed

* initial version is almost written

* unfinished work

* compilation fixed, to be debugged

* Lab test removed

* more fixes

* Luv2RGBinteger: channels order fixed

* Lab structs removed

* good trilinear interpolation added

* several fixes

* removed Luv2RGB interpolations, XYZ tables; 8-cell LUT added

* no_interpolate made 8-cell

* interpolations rewritten to 8-cell, minor fixes

* packed interpolation added for RGB2Luv

* tetra implemented

* removing unnecessary code

* LUT building merged

* changes ported to color.cpp

* minor fixes; try to suppress warnings

* fixed v range of Luv

* fixed incorrect src channel number

* minor fixes

* preliminary version of Luv2RGBinteger is done

* Luv2RGB_b is in progress

* XYZ color constants converted to softfloat

* Luv test: precision fixed

* Luv bit-exactness test added

* warnings fixed

* compilation fixed, error message fixed

* test_lab.cpp removed
2017-09-14 14:51:27 +03:00
Vladislav Sovrasov
b421ebef86 imgproc: slightly change the signature of undistortPoints overload 2017-09-14 12:19:40 +03:00
Vladislav Sovrasov
701c7e5685 imgproc: add stop criteria tuning in undistortPoints 2017-09-14 11:43:54 +03:00
Rostislav Vasilikhin
5ea1c72291 Lab2RGB_b: bit-exactness enabled for all modes; non-vectorized code fixed to comply with vectorized 2017-09-12 17:16:30 +03:00
Maksim Shabunin
248e2c7d47 Fixed some issues found by static analysis 2017-09-08 12:22:12 +03:00
Alexander Alekhin
89bb028bfc imgproc(ocl): don't use doubles to process float data 2017-09-07 12:42:20 +03:00
Vitaly Tuzov
e8caa9b5c0 removed unused interpolateLinear 2017-08-31 15:34:27 +03:00
Vitaly Tuzov
b1f46b6d69 Move resize implementation to separate file 2017-08-31 14:36:19 +03:00
Vadim Pisarevsky
d743a4c969 Merge pull request #9506 from alalek:ocl_fix_canny_ub_9496 2017-08-30 13:37:44 +00:00
Alexander Alekhin
e3b12bdb59 imgproc(ocl): eliminate div by zero in Canny 2017-08-29 19:29:53 +03:00
Vladislav Sovrasov
91e56abcf1 imgproc: disable buggy inplace processing in convexHull 2017-08-29 15:28:34 +03:00
Alexander Alekhin
9c14a2f0aa Merge pull request #9395 from lupustr3:pvlasov/icv2017u3_update 2017-08-24 11:48:53 +00:00
Pavel Vlasov
a57718e1ac ICV2017u3 package update;
- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
2017-08-23 14:24:43 +03:00
berak
e7b9cfa8f2 imgproc:fix winSize in createHanningWindow() 2017-08-16 08:53:45 +02:00
KUANG, Fangjun
4bbe67451d fix some typos in the documentation. 2017-08-08 17:32:04 +02:00
Jiri Horner
bb6496d9e5 Merge pull request #8951 from hrnr:akaze_part2
[GSOC] Speeding-up AKAZE, part #2 (#8951)

* feature2d: instrument more functions used in AKAZE

* rework Compute_Determinant_Hessian_Response

* this takes 84% of time of Feature_Detection
* run everything in parallel
* compute Scharr kernels just once
* compute sigma more efficiently
* allocate all matrices in evolution without zeroing

* features2d: add one bigger image to tests

* now test have images: 600x768, 900x600 and 1385x700 to cover different resolutions

* explicitly zero Lx and Ly

* add Lflow and Lstep to evolution as in original AKAZE code

* reworked computing keypoints orientation

integrated faster function from https://github.com/h2suzuki/fast_akaze

* use standard fastAtan2 instead of getAngle

* compute keypoints orientation in parallel

* fix visual studio warnings

* replace some wrapped functions with direct calls to OpenCV functions

* improved readability for people familiar with opencv
* do not same image twice in base level

* rework diffusity stencil

* use one pass stencil for diffusity from https://github.com/h2suzuki/fast_akaze
* improve locality in Create_Scale_Space

* always compute determinat od hessian and spacial derivatives

* this needs to be computed always as we need derivatives while computing descriptors
* fixed tests of AKAZE with KAZE descriptors which have been affected by this

Currently it computes all first and second order derivatives together and the determiant of the hessian. For descriptors it would be enough to compute just first order derivates, but it is not probably worth it optimize for scenario where descriptors and keypoints are computed separately, since it is already very inefficient. When computing keypoint and descriptors together it is faster to do it the current way (preserves locality).

* parallelize non linear diffusion computation

* do multiplication right in the nlp diffusity kernel

* rework kfactor computation

* get rid of sharing buffers when creating scale space pyramid, the performace impact is neglegible

* features2d: initialize TBB scheduler in perf tests

* ensures more stable output
* more reasonable profiles, since the first call of parallel_for_ is not getting big performace hit

* compute_kfactor: interleave finding of maximum and computing distance

* no need to go twice through the data

* start to use UMats in AKAZE to leverage OpenCl in the future

* fixed bug that prevented computing determinant for scale pyramid of size 1 (just the base image)
* all descriptors now support writing to uninitialized memory
* use InputArray and OutputArray for input image and descriptors, allows to make use UMAt that user passes to us

* enable use of all existing ocl paths in AKAZE

* all parts that uses ocl-enabled functions should use ocl by now

* imgproc: fix dispatching of IPP version when OCL is disabled

* when OCL is disabled IPP version should be always prefered (even when the dst is UMat)

* get rid of copy in DeterminantHessian response

* this slows CPU version considerably
* do no run in parallel when running with OCL

* store derivations as UMat in pyramid

* enables OCL path computing of determint hessian
* will allow to compute descriptors on GPU in the future

* port diffusivity to OCL

* diffusivity itself is not a blocker, but this saves us downloading and uploading derivations

* implement kernel for nonlinear scalar diffusion step

* download the pyramid from GPU just once

we don't want to downlaod matrices ad hoc from gpu when the function in AKAZE needs it. There is a HUGE mapping overhead and without shared memory support a LOT of unnecessary transfers.

This maps/downloads matrices just once.

* fix bug with uninitialized values in non linear diffusion

* this was causing spurious segfaults in stitching tests due to propagation of NaNs
* added new test, which checks for NaNs (added new debug asserts for NaNs)
* valgrind now says everything is ok

* add nonlinear diffusion step OCL implementation

* Lt in pyramid changed to UMat, it will be downlaoded from GPU along with Lx, Ly
* fix bug in pm_g2 kernel. OpenCV mangles dimensions passed to OpenCL, so we need to check for boundaries in each OCL kernel.

* port computing of determinant to OCL

* computing of determinant is not a blocker, but with this change we don't need to download all spatial derivatives to CPU, we only download determinant
* make Ldet in the pyramid UMat, download it from CPU together with the other parts of the pyramid
* add profiling macros

* fix visual studio warning

* instrument non_linear_diffusion

* remove changes I have made to TEvolution

* TEvolution is used only in KAZE now

* Revert "features2d: initialize TBB scheduler in perf tests"

This reverts commit ba81e2a711.
2017-08-01 12:46:01 +00:00
Alexander Alekhin
caa5e3b4c5 imgproc: fix vectorized code of accumulate 2017-07-26 17:21:46 +03:00