Commit Graph

302 Commits

Author SHA1 Message Date
Alexander Alekhin
44d473fba0 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-07-08 21:03:43 +00:00
Alexander Alekhin
2fed41dfa5 imgproc(test): test bitExact cases in OCL/sepFilter2D 2020-07-08 10:14:56 +00:00
Alexander Alekhin
593af7287b Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-05-18 17:50:16 +00:00
Alexander Alekhin
a1b09a3734 imgproc(perf): add GaussianBlur cases for SIFT 2020-05-17 10:15:31 +00:00
Alexander Alekhin
d4a17da7b2 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-03-04 20:49:09 +00:00
Alexander Alekhin
fd09413566
Merge pull request #16731 from alalek:issue_16708
* imgproc(integral): avoid OOB access

* imgproc(test): fix integral perf check

- FP32 computation is not accurate

* imgproc(integral): tune loop limits
2020-03-04 19:28:04 +00:00
Alexander Alekhin
92b9888837 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-12 13:02:19 +03:00
Paul Murphy
a011035ed6 Merge pull request #15257 from pmur:resize
* resize: HResizeLinear reduce duplicate work

There appears to be a 2x unroll of the HResizeLinear against k,
however the k value is only incremented by 1 during the unroll. This
results in k - 1 duplicate passes when k > 1.

Likewise, the final pass may not respect the work done by the vector
loop. Start it with the offset returned by the vector op if
implemented. Note, no vector ops are implemented today.

The performance is most noticable on a linear downscale. A set of
performance tests are added to characterize this.  The performance
improvement is 10-50% depending on the scaling.

* imgproc: vectorize HResizeLinear

Performance is mostly gated by the gather operations
for x inputs.

Likewise, provide a 2x unroll against k, this reduces the
number of alpha gathers by 1/2 for larger k.

While not a 4x improvement, it still performs substantially
better under P9 for a 1.4x improvement. P8 baseline is
1.05-1.10x due to reduced VSX instruction set.

For float types, this results in a more modest
1.2x improvement.

* Update U8 processing for non-bitexact linear resize

* core: hal: vsx: improve v_load_expand_q

With a little help, we can do this quickly without gprs on
all VSX enabled targets.

* resize: Fix cn == 3 step per feedback

Per feedback, ensure we don't overrun. This was caught via the
failure observed in Test_TensorFlow.inception_accuracy.
2019-12-09 14:54:06 +03:00
Alexander Alekhin
bea2c75452 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-09-05 14:29:22 +03:00
Everton Constantino
76e403cf25 Merge pull request #15440 from everton1984:new_integral_tests
* Adding all possible data type interactions to the perf tests since some
use SIMD acceleration and others do not.

* Disabling full tests by default.

* Giving proper names, removing magic numbers and sanity checks of new
performance tests for the integral function.

* Giving proper names, making array static.
2019-09-04 19:14:00 +03:00
luz.paz
fcc7d8dd4e Fix modules/ typos
Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint`

backporting of commit: ec43292e1e
2019-08-16 17:34:29 +03:00
luz.paz
ec43292e1e Fix modules/ typos
Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint`
2019-08-15 18:02:09 -04:00
Alexander Alekhin
32772a5436 3.4: backported changes from 'master' branch 2019-08-14 16:36:08 +03:00
Alexander Alekhin
174b4ce29d Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-08-05 18:11:43 +00:00
dcouwenh
d3cf0d2c06 Bayer VNG Demosaicing Fix #2 (Merge pull request #15086)
* Update demosaicing.cpp

Fixed calculation of Bs for non-green pixels.

* Fixed cvtColor perf test for bayer VNG
2019-07-30 23:49:46 +03:00
Vitaly Tuzov
e0f8bb83a6 Merge pull request #14994 from terfendail:wintr_undistort
WUI based implementation to initUndistortRectifyMap (#14994)

* Add initUndistortRectifyMap performance test

* Move cv namespace boundaries

* Add wide universal intrinsics based implementation to initUndistortRectifyMap

* Dispatch undistort
2019-07-18 19:32:51 +03:00
Alexander Alekhin
8c0b0714e7 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-03-11 19:20:22 +00:00
Alexander Alekhin
d5a2fe5180 perf: ignore _ovx tests 2019-03-06 15:52:23 +03:00
Alexander Alekhin
c3cf35ab63 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-26 17:34:42 +03:00
Brad Kelly
507f8add1c Implementing AVX512 Support for 2 and 4 channel mats for CV_64F format 2019-02-19 11:31:20 -08:00
Alexander Alekhin
f414c16c13 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-08 17:18:56 +00:00
Alexander Alekhin
757d8ac8f7 Merge pull request #13769 from savuor:cvtColor_tests_16u_32f 2019-02-08 15:29:35 +00:00
Rostislav Vasilikhin
4e679e1cc5 disabled 16u and 32f perf tests 2019-02-07 19:26:36 +03:00
Rostislav Vasilikhin
87f651c119 disabled sanity check for 32f 2019-02-07 18:20:29 +03:00
Namgoo Lee
fb8e652c3f Add CV_16UC1 support for cuda::CLAHE
Due to size limit of shared memory, histogram is built on
the global memory for CV_16UC1 case.

The amount of memory needed for building histogram is:

    65536 * 4byte = 256KB

and shared memory limit is 48KB typically.

Added test cases for CV_16UC1 and various clip limits.
Added perf tests for CV_16UC1 on both CPU and CUDA code.

There was also a bug in CV_8UC1 case when redistributing
"residual" clipped pixels. Adding the test case where clip
limit is 5.0 exposes this bug.
2019-02-06 17:21:55 +00:00
Rostislav Vasilikhin
bbedebb57c perf tests for cvtColor for 16U and 32f added 2019-02-06 17:56:44 +03:00
Alexander Alekhin
665408e57f Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-01 13:17:32 +03:00
Vitaly Tuzov
ed2e1af3e8 Added performance test for blendLinear 2019-01-25 14:16:19 +03:00
Alexander Alekhin
631b246881 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-01-22 18:00:34 +00:00
Vitaly Tuzov
78f80c35d2 Performance test for bounding rect estimation 2019-01-18 15:50:21 +03:00
Brad Kelly
0165ffa90d Implementing AVX512 support for 3 channel cv::integral for CV_64F 2019-01-14 16:11:01 -08:00
Alexander Alekhin
0c16d8f6c3 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-12-13 15:12:26 +03:00
Vitaly Tuzov
3903174f7c Merge pull request #13334 from terfendail:histogram_wintr
* added performance test for compareHist

* compareHist reworked to use wide universal intrinsics

* Disabled vectorization for CV_COMP_CORREL and CV_COMP_BHATTACHARYYA if f64 is unsupported
2018-12-13 14:20:22 +03:00
Alexander Alekhin
2e0150e601 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-12-03 18:38:27 +03:00
Vitaly Tuzov
e991e05b9b Added anonymous namespace to perf_contours 2018-11-27 11:35:40 +03:00
Alexander Alekhin
8f4e5c2fb8 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-26 15:37:45 +03:00
Vitaly Tuzov
e9e8bf4b81 Added performance tests for findContours 2018-11-21 19:57:02 +03:00
Alexander Alekhin
edacd91a27 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-10-15 20:15:42 +00:00
Sayed Adel
8965f3ae06 imgproc:simd Enable VSX and wide universal intrinsics for accumulate operations
- improve cpu dispatching calls to allow more SIMD extentions
    (SSE4.1, AVX2, VSX)
  - wide universal intrinsics
  - replace dummy v_expand with v_expand_low
  - replace v_expand + v_mul_wrap with v_mul_expand for product accumulate operations
  - use FMA for accumulate operations
  - add mask and more types to accumulate's performance tests
2018-10-11 04:37:12 +02:00
Jakub Golinowski
9f1218b00b Merge pull request #11897 from Jakub-Golinowski:hpx_backend
* Add HPX backend for OpenCV implementation
Adds hpx backend for cv::parallel_for_() calls respecting the nstripes chunking parameter. C++ code for the backend is added to modules/core/parallel.cpp. Also, the necessary changes to cmake files are introduced.
Backend can operate in 2 versions (selectable by cmake build option WITH_HPX_STARTSTOP): hpx (runtime always on) and hpx_startstop (start and stop the backend for each cv::parallel_for_() call)

* WIP: Conditionally include hpx_main.hpp to tests in core module
Header hpx_main.hpp is included to both core/perf/perf_main.cpp and core/test/test_main.cpp.
The changes to cmake files for linking hpx library to above mentioned test executalbles are proposed but have issues.

* Add coditional iclusion of hpx_main.hpp to cpp cpu modules

* Remove start/stop version of hpx backend
2018-08-31 16:23:26 +03:00
Alexander Alekhin
d044ac1c66 imgproc(getPerspectiveTransform): add solveMethod parameter
drop compatibility workaround via runtime parameter
2018-07-26 18:22:59 +03:00
Alexander Alekhin
2170811e48 imgproc(perf): update getPerspectiveTransform perf test
Function is very fast, so 0.000 ms results are useless.
1000 runs requires 25ms on i7-6700K.
2018-07-13 15:31:33 +03:00
yuki takehara
ed207d79e7 Merge pull request #11108 from take1014:hough_4303
* Added accumulator value to the output of HoughLines and HoughCircles

* imgproc: refactor Hough patch

- eliminate code duplication
- fix type handling, fix OpenCL code
- fix test data generation
- re-generated test data in debug mode via plain CPU code path
2018-05-23 20:42:12 +00:00
Vadim Pisarevsky
b8b7ca7302
Rewite polar transforms (#11323)
* Rewrite polar transformations

- A new wrapPolar function encapsulate both linear and semi-log remap
- Destination size is a parameter or calculated automatically to keep objects size between remapping
- linearPolar and logPolar has been deprecated

* Fix build warning and error in accuracy test

* Fix function name to warpPolar

* Explicitly specify the mapping mode, so we retain all the parameters as non-optional.

Introduces WarpPolarMode enum to specify the mapping mode in flags

* resolves performance warning on windows build

* removed duplicated logPolar and linearPolar implementations
2018-04-17 15:50:52 +03:00
Tomoaki Teshima
a82e70cd40 remove raw SSE2/NEON implementation from imgwarp.cpp
* use universal intrinsic instead of raw intrinsic
  * add 2 channels de-interleave on x86 platform
  * add v_int32x4 version of v_muladd
  * add accumulate version of v_dotprod based on the commit from seiko2plus on bf1852d
  * remove some verify check in performance test
  * avoid the out of boundary access and keep the performance
2018-04-13 21:19:16 +09:00
Alexander Alekhin
4a297a2443 ts: refactor OpenCV tests
- removed tr1 usage (dropped in C++17)
- moved includes of vector/map/iostream/limits into ts.hpp
- require opencv_test + anonymous namespace (added compile check)
- fixed norm() usage (must be from cvtest::norm for checks) and other conflict functions
- added missing license headers
2018-02-03 19:39:47 +00:00
Tom Becker
592f8d8c1b Merge pull request #10232 from TomBecker-BD:hough-many-circles
Hough many circles (#10232)

* Add Hui's optimization. Merge with latest changes in OpenCV.

* Use conditional compilation instead of a runtime flag.

* Whitespace.

* Create the sequence for the nonzero edge pixels only if using that approach.

* Improve performance for finding very large numbers of circles

* Return the circles with the larger accumulator values first, as per API documentation.
Use a separate step to check distance between circles. Allows circles to be sorted by strength first. Avoids locking in EstimateRadius which was slowing it down.
Return centers only if maxRadius == 0 as per API documentation.

* Sort the circles so results are deterministic. Otherwise the order of circles with the same strength depends on parallel processing completion order.

* Add test for HoughCircles.

* Add beads test.

* Wrap the non-zero points structure in a common interface so the code can use either a vector or a matrix.

* Remove the special case for skipping the radius search if maxRadius==0.

* Add performance tests.

* Use NULL instead of nullptr.
OpenCV should compile with C++98 compiler.

* Put test suite name first.
Use different test suite names for each test to avoid an error from the test runner.

* Address build bot errors and warnings.

* Skip radius search if maxRadius < 0.

* Dynamically switch to NZPointList when it will be faster than NZPointSet.

* Fix compile error: missing 'typename' prior to dependent type name.

* Fix compile error: missing 'typename' prior to dependent type name.
This time fix it the non C++ 11 way.

* Fix compile error: no type named 'const_reference' in 'class cv::NZPointList'

* Disable ManySmallCircles tests. Failing on Mac.

* Change beads image to JPEG for smaller file size.
Try enabling the ManySmallCircles tests again.

* Remove ManySmallCircles tests. They are failing on the Mac build.

* Fix expectations to check all circles.

* Changing case on a case-insensitive file system
Step 1: remove the old file names

* Changing case on a case-insensitive file system
Step 2: add them back with the new names

* Fix cmpAccum function to be strictly weak ordered.

* Add tests for many small circles.

* imgproc(perf): fix HoughCircles tests

* imgproc(houghCircles): refactor code

- simplify NZPointList
- drop broken (de-synchronization of 'current'/'mi' fields) NZPointSet iterator
- NZPointSet iterator is replaced to direct area scan
- use SIMD intrinsics
- avoid std exceptions (build for embedded systems)
2017-12-28 17:23:11 +03:00
Vadim Pisarevsky
69a6765bf7 Merge pull request #10387 from terfendail:resize23_perftest 2017-12-22 13:26:05 +00:00
Vitaly Tuzov
b6fe4cc807 Added performance tests for linear resize of 2 and 3-channel images 2017-12-20 18:11:21 +03:00
Alexander Alekhin
d5f152494b fix file names 2017-12-15 14:59:35 +03:00