Commit Graph

160 Commits

Author SHA1 Message Date
Sayed Adel
f2fe6f40c2 Merge pull request #15510 from seiko2plus:issue15506
* core: rework and optimize SIMD implementation of dotProd

  - add new universal intrinsics v_dotprod[int32], v_dotprod_expand[u&int8, u&int16, int32], v_cvt_f64(int64)
  - add a boolean param to all v_dotprod&_expand intrinsics that changes the order in which pairs are added
    on some platforms, allowing maximum optimization when only the sum across all lanes matters
  - fix clang build on ppc64le
  - support wide universal intrinsics for dotProd_32s
  - remove raw SIMD and activate universal intrinsics for dotProd_8
  - implement SIMD optimization for dotProd_s16&u16
  - extend performance test data types of dotprod
  - fix GCC VSX workaround of vec_mule and vec_mulo (in little-endian it must be swapped)
  - optimize v_mul_expand(int32) on VSX

* core: remove boolean param from v_dotprod&_expand and implement v_dotprod_fast&v_dotprod_expand_fast

  these changes are based on the review by "terfendail"
2019-10-07 22:01:35 +03:00
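The difference between the strict and the "fast" dot-product variants described above can be illustrated with a scalar model. This is only a sketch in plain C++, not OpenCV code: the names `dotprod_model`, `dotprod_fast_model` and `reduce_sum` are hypothetical, and the pairing used by the relaxed variant (lane i with lane i+4) is an assumed example of a platform-specific order.

```cpp
#include <array>
#include <cstdint>
#include <numeric>

using Lanes16 = std::array<std::int16_t, 8>;  // models an 8-lane int16 vector
using Lanes32 = std::array<std::int32_t, 4>;  // models a 4-lane int32 vector

// Strict model: output lane i is the sum of the adjacent pair (2i, 2i+1).
inline Lanes32 dotprod_model(const Lanes16& a, const Lanes16& b) {
    Lanes32 out{};
    for (int i = 0; i < 4; ++i) {
        out[i] = std::int32_t(a[2 * i]) * b[2 * i] +
                 std::int32_t(a[2 * i + 1]) * b[2 * i + 1];
    }
    return out;
}

// Relaxed model (assumed pairing): output lane i combines elements i and i+4.
// Individual lanes differ from the strict model, but the total is the same.
inline Lanes32 dotprod_fast_model(const Lanes16& a, const Lanes16& b) {
    Lanes32 out{};
    for (int i = 0; i < 4; ++i) {
        out[i] = std::int32_t(a[i]) * b[i] +
                 std::int32_t(a[i + 4]) * b[i + 4];
    }
    return out;
}

// Horizontal sum across all lanes.
inline std::int32_t reduce_sum(const Lanes32& v) {
    return std::accumulate(v.begin(), v.end(), std::int32_t{0});
}
```

A caller that only needs the reduced sum can use either model interchangeably; a caller that reads individual lanes needs the strict one.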
Paul E. Murphy
b2135be594 fast_math: add extra perf/unit tests
Add a basic sanity test to verify the rounding functions
work as expected.

Likewise, extend the rounding performance test to cover the
additional float -> int fast math functions.
2019-08-07 14:59:46 -05:00
Alexander Alekhin
f1f0f630c7 core: disable I/O perf test
- can be enabled separately if needed
- not stable (due to storage I/O processing)
2019-02-27 18:07:45 +03:00
Vitaly Tuzov
00c9ab8c23 Merge pull request #13317 from terfendail:norm_wintr
* Added performance tests for hal::norm functions

* Added sum of absolute differences intrinsic

* norm implementation updated to use wide universal intrinsics

* improve and fix v_reduce_sad on VSX
2018-11-29 19:34:14 +03:00
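For reference, a sum of absolute differences reduces two equally sized inputs to a single number. The following is a scalar sketch in plain C++, not the intrinsic itself; the name `reduce_sad_model` is hypothetical, and a SIMD version would process several lanes per step.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model: sum over i of |a[i] - b[i]| for 8-bit unsigned inputs.
// If the sizes differ, only the common prefix is compared.
inline std::uint32_t reduce_sad_model(const std::vector<std::uint8_t>& a,
                                      const std::vector<std::uint8_t>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Subtract the smaller from the larger to avoid a negative result.
        sum += a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
    }
    return sum;
}
```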
Hamdi Sahloul
a39e0daacf Utilize CV_UNUSED macro 2018-09-07 20:33:52 +09:00
Vadim Pisarevsky
80b62a41c6 Merge pull request #12411 from vpisarev:wide_convert
* rewrote Mat::convertTo() and convertScaleAbs() to wide universal intrinsics; added always-available and SIMD-optimized FP16<=>FP32 conversion

* fixed compile warnings

* fix some more compile errors

* slightly relaxed accuracy threshold for int->float conversion (since we now do it using single-precision arithmetic, not double-precision)

* fixed compile errors on iOS, Android and in the baseline C++ version (intrin_cpp.hpp)

* trying to fix ARM-neon builds
2018-09-06 19:36:59 +03:00
Vadim Pisarevsky
54279523a3 Merge pull request #12437 from vpisarev:avx2_fixes
* trying to fix the custom AVX2 builder test failures (false alarms)

* fixed compile error with CPU_BASELINE=AVX2 on x86; raised tolerance thresholds in a couple of tests

* seemingly disabled false alarm warning in surf.cpp; increased tolerance thresholds in the tests for SolvePnP and in DNN/ENet
2018-09-06 18:56:55 +03:00
Alexander Alekhin
b24fc6954d core(perf): fix addScalar test
keep the same type for passed Scalar values
2018-08-16 19:36:28 +03:00
Alexander Alekhin
b0ee5d9023 core: CV_NODISCARD macro with semantic of [[nodiscard]] attr
[[nodiscard]] is defined in C++17.
There is a fallback alias for modern GCC / Clang compilers.
2018-07-16 18:03:32 +03:00
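A macro of this kind is commonly written as a small preprocessor ladder. The sketch below shows the general pattern only; `MY_NODISCARD` and `checked_add` are hypothetical names, and the exact conditions used in OpenCV may differ.

```cpp
// Prefer the standard attribute when compiling as C++17 or later.
#if defined(__cplusplus) && __cplusplus >= 201703L
#  define MY_NODISCARD [[nodiscard]]
// Otherwise fall back to the GCC / Clang extension with similar meaning.
#elif defined(__GNUC__) || defined(__clang__)
#  define MY_NODISCARD __attribute__((warn_unused_result))
// Unknown compiler: expand to nothing so the code still builds.
#else
#  define MY_NODISCARD
#endif

// Callers that ignore the return value get a compiler warning
// on toolchains where the macro expands to an attribute.
MY_NODISCARD inline int checked_add(int a, int b) {
    return a + b;
}
```

Writing `checked_add(1, 2);` as a bare statement would then trigger a warning, while `int s = checked_add(1, 2);` would not.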
Alexander Alekhin
65726e4244 core(hal): improve v_select() SSE4.1+
v_select 'mask' is restricted to these values only: 0 or ~0 (0xff/0xffff/etc)
mask in accuracy test is updated.
2018-04-23 13:17:53 +03:00
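One plausible reason for the restriction (an assumption, not stated in the commit) is that byte-wise blend instructions inspect only the top bit of each mask byte, whereas a bitwise select uses every bit. The scalar sketch below, with hypothetical function names, shows that the two approaches agree for masks of 0 or ~0 and can disagree otherwise.

```cpp
#include <cstdint>

// Bitwise select: each result bit comes from a where the mask bit is 1, else from b.
inline std::uint8_t select_bitwise(std::uint8_t mask, std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((mask & a) | (~mask & b));
}

// Blend-style select: the whole byte comes from a if the mask's top bit is set, else from b.
inline std::uint8_t select_by_top_bit(std::uint8_t mask, std::uint8_t a, std::uint8_t b) {
    return (mask & 0x80) ? a : b;
}
```

With `mask = 0x0F`, `a = 0xAA`, `b = 0x55`, the bitwise form mixes bits from both inputs while the blend-style form returns `b` unchanged, which is why a test mask should contain only all-zero or all-one lanes.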
Vadim Pisarevsky
53661d55ae Merge pull request #10406 from seiko2plus:coreUnvintrinCopy 2018-02-20 14:50:17 +00:00
Alexander Alekhin
4a297a2443 ts: refactor OpenCV tests
- removed tr1 usage (dropped in C++17)
- moved includes of vector/map/iostream/limits into ts.hpp
- require opencv_test + anonymous namespace (added compile check)
- fixed norm() usage (must be from cvtest::norm for checks) and other conflict functions
- added missing license headers
2018-02-03 19:39:47 +00:00
Alexander Alekhin
a5cd62f7bf core(perf): refactor kmeans test
- don't use RNG for "task size" parameters (N, K, dims)
- add "good" kmeans test data (without singularities: K > unique points)
2018-01-22 14:25:29 +03:00
Sayed Adel
fd0ac962fb core: replace raw intrinsics with universal intrinsics in copy.cpp
- use universal intrinsic instead of raw intrinsic
- add performance check for Mat::copyTo/setTo with mask
2017-12-26 05:30:32 +02:00
Tomoaki Teshima
ca1a0a1108 core: remove raw SSE2/NEON implementation from convert.cpp (#9831)
* remove raw SSE2/NEON implementation from convert.cpp
  * remove raw implementation from Cvt_SIMD
  * remove raw implementation from cvtScale_SIMD
  * remove raw implementation from cvtScaleAbs_SIMD
  * remove duplicated implementation cvt_<float, short>
  * remove duplicated implementation cvtScale_<short, short, float>
  * add "from double" version of Cvt_SIMD
  * modify the condition of test ConvertScaleAbs

* Update convert.cpp

fixed crash in cvtScaleAbs(8s=>8u)

* fixed compile error on Win32

* fixed several test failures because of accuracy loss in cvtScale(int=>int)

* fixed NEON implementation of v_cvt_f64(int=>double) intrinsic

* another attempt to fix test failures

* keep trying to fix the test failures and just introduced compile warnings

* fixed one remaining test (subtractScalar)
2017-12-15 00:00:35 +03:00
Tomoaki Teshima
3cbe60cca2 Merge pull request #9753 from tomoaki0705:universalMatmul
* add accuracy test and performance check for matmul
  * add performance tests for transform and dotProduct
  * add test Core_TransformLargeTest for 8u version of transform

* remove raw SSE2/NEON implementation from matmul.cpp
  * use universal intrinsic instead of raw intrinsic
  * remove unused templated function
  * add v_matmuladd, which multiplies a 3x3 matrix by a vector and adds a 3x1 vector
  * add v_rotate_left/right in universal intrinsic
  * suppress intrinsics on some functions and platforms
  * add pure SW implementation of new universal intrinsics
  * add test for new universal intrinsics

* core: prevent memory access after the end of buffer

* fix perf tests
2017-11-20 15:56:53 +03:00
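The multiply-and-add operation mentioned for v_matmuladd can be modelled in scalar code as follows. This is a plain C++ sketch of the arithmetic as the commit message describes it (3x3 matrix times vector, plus a 3x1 vector); the names `Mat3`, `Vec3` and `matmul_add` are hypothetical and the row-major layout is an assumption.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;
using Mat3 = std::array<float, 9>;  // 3x3 matrix, row-major (assumed layout)

// Computes out = m * v + a, one output row at a time.
inline Vec3 matmul_add(const Mat3& m, const Vec3& v, const Vec3& a) {
    Vec3 out{};
    for (int r = 0; r < 3; ++r) {
        out[r] = a[r];  // start from the additive term
        for (int c = 0; c < 3; ++c) {
            out[r] += m[r * 3 + c] * v[c];
        }
    }
    return out;
}
```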
Alexander Alekhin
582bb3c311 core(perf): added Hamming tests 2017-07-01 00:49:18 +00:00
Vitaly Tuzov
2492c299f3 Extended the set of existing performance tests to execution modes suitable for the OpenVX HAL 2017-04-27 12:32:29 +03:00
Pavel Vlasov
35c7216846 IPP for OpenCV 2017u2 initial enabling patch; 2017-04-20 20:26:30 +03:00
Alexander Alekhin
a901cc542b test: fix tolerance perf check for Exp/Log/Sqrt 2016-10-20 16:54:48 +03:00
Alexander Alekhin
43c48e2ed1 test: update Div and ConvertScaleAbs perf tests 2016-10-20 16:54:46 +03:00
Alexander Alekhin
5bafc1db75 test: fix tolerance
cv::rand result is not bitexact for floating-point numbers
2016-10-20 16:54:33 +03:00
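When inputs are not bit-exact across platforms, a check usually compares against a tolerance instead of testing for equality. The helper below is a generic sketch of that idea; `nearly_equal` is a hypothetical name and is not part of the OpenCV test framework.

```cpp
#include <algorithm>
#include <cmath>

// Returns true when a and b differ by at most eps, scaled by the larger
// magnitude (with a floor of 1.0 so values near zero use an absolute bound).
inline bool nearly_equal(double a, double b, double eps) {
    const double scale = std::max({1.0, std::fabs(a), std::fabs(b)});
    return std::fabs(a - b) <= eps * scale;
}
```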
Alexander Alekhin
abad2ca76c test: fix tolerance
cv::rand result is not bitexact for floating-point numbers
2016-10-20 16:54:30 +03:00
Maksim Shabunin
dc704d77ac Fixed several GCC 5.x warnings 2016-09-01 15:44:01 +03:00
MYLS
8596e82d98 Add JSON support.
a JSON emitter, a parser, tests and some basic doc.
2016-08-11 00:53:15 +08:00
MYLS
8a65e73bfd add SANITY_CHECK_NOTHING() to perf_test 2016-07-20 20:18:16 +08:00
MYLS
27b924e99e remove CHECK from performance test 2016-07-19 22:06:21 +08:00
MYLS
cf2d6f6721 solve errors and warnings
Modified the performance test and solved a problem caused by an enum type.
2016-07-19 21:18:41 +08:00
MYLS
78ca5ddd45 solve errors and warnings 2016-07-19 19:56:57 +08:00
MYLS
0823ec0ef0 modified performance test
For faster test
2016-07-19 16:11:20 +08:00
MYLS
617df09143 Modify Base64 functions and add test and documentation
Major changes:

- modify the Base64 functions to be compatible with `cvWriteRawData` and so
on.
- add a Base64 flag for FileStorage and output raw data in Base64
automatically.
- complete all testing and documentation.
2016-07-19 15:54:38 +08:00
Alexander Alekhin
96937bac74 Merge pull request #6581 from mshabunin:hal_mag 2016-06-21 13:16:17 +00:00
Tomoaki Teshima
070e4d754e let the performance test of split pass on 64bit ARM
* loosen the threshold only under aarch64
  * fix #6610
2016-05-31 23:57:49 +09:00
Maksim Shabunin
1e667de1f3 HAL math interfaces: fastAtan2, magnitude, sqrt, invSqrt, log, exp 2016-05-31 11:54:52 +03:00
Alexander Alekhin
4ecc023219 UMat: add perf test for custom ptr 2015-09-03 10:48:07 +03:00
thebucc
421e1b237c Fix for bug #5007: moved definition of Size_MatDepth_t and Size_MatDepth from ts_perf.hpp to perf_channels.cpp. This way they are closer to where they are needed and live in a different namespace (possibly the reason why the fix works). 2015-08-17 16:09:00 +01:00
Ilya Lavrenov
6f8b3fc633 cvRound 2015-03-11 16:14:39 +03:00
Ilya Lavrenov
ef29b15c9a reciprocal 2015-01-12 10:59:30 +03:00
Ilya Lavrenov
44d89638fd divide 2015-01-12 10:59:30 +03:00
Ilya Lavrenov
63fc6ef316 convertTo from 64f 2015-01-12 10:59:29 +03:00
Ilya Lavrenov
19e77e4787 convertTo from 8u 2015-01-12 10:59:29 +03:00
Vladislav Vinogradov
1be1a28920 move CUDA core tests to core module 2014-12-23 17:42:49 +03:00
Elena Gvozdeva
46b6a095a2 changed perf test for ocl_gemm 2014-08-26 15:05:36 +04:00
Alexander Karsakov
f3dfdfef8a Fixed warning with "uninitialized local variable" 2014-08-12 10:35:15 +04:00
Alexander Alekhin
55188fe991 world fix 2014-08-05 20:12:35 +04:00
Alexander Karsakov
e51c0810b6 Added accuracy and performance tests for DFT all modes. 2014-07-24 15:17:31 +04:00
Ilya Lavrenov
c949845a6b fixed perf test 2014-07-10 16:03:31 +04:00
Ilya Lavrenov
978f7eb44a added perf test for transpose inplace 2014-06-30 18:33:26 +04:00
Ilya Lavrenov
9edd24fe51 changed power in cv::pow test to test actual kernel 2014-06-10 17:41:43 +04:00
Ilya Lavrenov
80470f9cf6 added performance test 2014-05-30 18:34:04 +04:00