Commit Graph

185 Commits

Author SHA1 Message Date
Namgoo Lee
c219f97f48 SSE2 : use _mm_cvtpd_epi32 when converting from CV_64F to CV_32S (#10987)
* SSE2 : use _mm_cvtpd_epi32 when converting from CV_64F to CV_32S

* No need to define a new universal intrinsic
2018-03-06 09:50:53 +03:00
Maksim Shabunin
221342fb25 Split convert.cpp into smaller pieces 2018-02-12 15:17:19 +03:00
Alexander Alekhin
5a791e6e06 cmake: update reporting of excluded dispatching files (#10711)
* cmake: add ocv_get_smart_file_name() macro

* cmake: avoid adding files for unavailable dispatch modes
2018-02-12 14:48:20 +03:00
Tomoaki Teshima
ca1a0a1108 core: remove raw SSE2/NEON implementation from convert.cpp (#9831)
* remove raw SSE2/NEON implementation from convert.cpp
  * remove raw implementation from Cvt_SIMD
  * remove raw implementation from cvtScale_SIMD
  * remove raw implementation from cvtScaleAbs_SIMD
  * remove duplicated implementation cvt_<float, short>
  * remove duplicated implementation cvtScale_<short, short, float>
  * add "from double" version of Cvt_SIMD
  * modify the condition of test ConvertScaleAbs

* Update convert.cpp

fixed crash in cvtScaleAbs(8s=>8u)

* fixed compile error on Win32

* fixed several test failures because of accuracy loss in cvtScale(int=>int)

* fixed NEON implementation of v_cvt_f64(int=>double) intrinsic

* another attempt to fix test failures

* keep trying to fix the test failures and just introduced compile warnings

* fixed one remaining test (subtractScalar)
2017-12-15 00:00:35 +03:00
Alexander Alekhin
0ed3209b00 ocl: avoid unnecessary loading/initializing OpenCL subsystem
If there are no OpenCL/UMat methods calls from application.

OpenCL subsystem is initialized:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called

Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
2017-11-28 14:02:42 +03:00
Pavel Vlasov
a57718e1ac ICV2017u3 package update;
- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
2017-08-23 14:24:43 +03:00
Alexander Alekhin
b4716b1d92 core: fix convertTo() AVX2 optimization 2017-07-17 15:02:14 +03:00
Vitaly Tuzov
5448d9186a AVX and SSE4.1 optimized conversion implementations migrated to separate files 2017-07-04 14:48:01 +03:00
Tomoaki Teshima
94848a3e1f suppress unreachable code warning
- fix the define condition based on the comment
2017-06-13 08:11:04 +09:00
Alexander Alekhin
125abe2fe4 Merge pull request #8838 from tomoaki0705:dispatchFp16 2017-06-06 15:31:42 +00:00
Tomoaki Teshima
e269ef96cb update convertFp16 using CV_CPU_CALL_FP16
* avoid link error (move the implementation of software version to header)
* make getConvertFuncFp16 local (move from precomp.hpp to convert.hpp)
* fix error on 32bit x86
2017-06-06 22:26:51 +09:00
Vadim Pisarevsky
ee257ffe9e Merge pull request #8455 from terfendail:ovxhal_skipsmall 2017-05-26 12:10:18 +00:00
Tomoaki Teshima
d81cdb8e1c add OpenCL version of convertFp16 and test
* disable vector operation for now
* brush up the implementation based on comment
2017-05-23 20:00:21 +09:00
Pavel Vlasov
11c2ffaf1c Update for IPP for OpenCV 2017u2 integration;
Updated integrations for:
cv::split
cv::merge
cv::insertChannel
cv::extractChannel
cv::Mat::convertTo - now with scaled conversions support
cv::LUT - disabled due to performance issues
Mat::copyTo
Mat::setTo
cv::flip
cv::copyMakeBorder - currently disabled
cv::polarToCart
cv::pow - ipp pow function was removed due to performance issues
cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance
cv::countNonZero
cv::minMaxIdx
cv::norm
cv::canny - new integration. Disabled for threaded;
cv::cornerHarris
cv::boxFilter
cv::bilateralFilter
cv::integral
2017-04-25 15:53:12 +03:00
Alexander Alekhin
f1c8094f5f Merge pull request #8575 from lupustr3:pvlasov/icv2017u2_initial_update 2017-04-21 10:55:29 +00:00
Pavel Vlasov
35c7216846 IPP for OpenCV 2017u2 initial enabling patch; 2017-04-20 20:26:30 +03:00
Arnaud Brejeon
636ab095b0 Merge pull request #8535 from arnaudbrejeon:std_array
Add support for std::array<T, N> (#8535)

* Add support for std::array<T, N>

* Add std::array<Mat, N> support

* Remove UMat constructor with std::array parameter
2017-04-19 13:13:39 +03:00
Vitaly Tuzov
bf5b7843e8 Extended set of OpenVX HAL calls disabled for small images 2017-04-06 18:17:32 +03:00
Vitaly Tuzov
9a4b5a4545 OpenVX calls updated to use single common OpenVX context per thread 2017-02-21 16:08:23 +03:00
Rostislav Vasilikhin
fcdbe16252 openvx_cvt disabled for Khronos, fixed sstep and dstep usage 2016-12-16 23:00:55 +03:00
Rostislav Vasilikhin
cf5e976fad OpenVX convert enabled 2016-12-16 15:57:21 +03:00
Rostislav Vasilikhin
8b9422a052 OpenVX wrappers rewritten with CV_OVX_RUN, VX_DbgThrow 2016-12-14 17:49:41 +03:00
apavlenko
76c38f0c80 trying to enable canny_vx adding a new test comparing canny_cv vs canny_vx 2016-12-09 14:53:06 +03:00
Alexander Alekhin
48bff3bfd3 Merge pull request #7748 from LaurentBerger:Normalize3d 2016-12-08 16:20:11 +00:00
Rostislav Vasilikhin
695b20172b Merge pull request #7794 from savuor:fix/ovx_cvt_continuous
Fixed OpenVX wrapper for Mat::convertTo() (#7794)

* fixed for cases of unrolled (w*h x 1) matrices

* more error handling
2016-12-06 18:29:44 +02:00
Vitaly Tuzov
ced81f72bc Added OpenVX based processing to LUT 2016-12-02 14:36:47 +03:00
Rostislav Vasilikhin
2b56b174e8 fixed: data types, empty input case 2016-11-29 17:52:50 +03:00
Rostislav Vasilikhin
0a6958813c added OpenVX call to Mat::convertTo() (w/o scaling) 2016-11-29 17:52:36 +03:00
LaurentBerger
c56c0e140b Solve exception for 3D Mat 2016-11-29 12:10:33 +01:00
Alexander Alekhin
a9ab629f60 build: fix aarch64 build with aarch64-linux-gnu-g++-4.8 2016-09-29 17:26:19 +03:00
Tomoaki Teshima
c7cb116dc0 check FP16 build condition correctly
* use __GNUC_MINOR__ in correct place to check the version of GCC
* check processor support of FP16 at run time
* check compiler support of FP16 and pass correct compiler option
* rely on ENABLE_AVX on gcc since AVX is generated when mf16c is passed
* guard correctly using ifdef in case of various configuration
* use v_float16x4 correctly by including the right header file
2016-09-23 11:04:22 +09:00
Tomoaki Teshima
903789f7af use universal intrinsic for FP16
* use v_float16x4 (universal intrinsic) instead of raw SSE/NEON implementation
* define v_load_f16/v_store_f16 since v_load can't be distinguished when short pointer passed
* brush up implementation on old compiler (guard correctly)
* add test for v_load_f16 and round trip conversion of v_float16x4
* fix conversion error
2016-09-05 08:13:52 +09:00
Alexander Alekhin
da5ead2c23 Merge pull request #7166 from tomoaki0705:brushUpFp16 2016-08-25 11:49:23 +00:00
Tomoaki Teshima
c5d7791b67 brush up fp16 implementation
* DRY
* switch to Cv32suf and remove fp32Int32
* add Cv16suf
2016-08-25 05:31:25 +09:00
Pavel Vlasov
30a6cee2fe Instrumentation for OpenCV API regions and IPP functions; 2016-08-19 18:10:03 +03:00
Tomoaki Teshima
3debc78a5f fix build error on JetsonTK1
* avoid using vld1_f16 and vst1_f16 on gcc 4 series (Ubuntu 14.04)
* guard correctly with #if
* use static inline
2016-08-09 17:12:22 +09:00
Tomoaki Teshima
87ca607fd4 brush up convertFp16
* raise an error when wrong bit depth passed
* raise a build error when wrong depth is specified for cvtScaleHalf_
* remove unnecessary safe check in cvtScaleHalf_
* use intrinsic instead of direct pointer access
* update the explanation
2016-08-03 17:27:45 +09:00
Tomoaki Teshima
c57f8780e9 show CPU feature correctly when FP16 is available
* make sure that CV_FP16 has the correct meaning
* check FP16 feature correctly
2016-07-29 14:10:33 +09:00
Alexander Alekhin
2ec63e4dd1 fix android pack build 2016-07-20 16:49:57 +03:00
Vadim Pisarevsky
48b747903b Merge pull request #6830 from tomoaki0705:featureSupportFp16 2016-07-18 15:56:00 +00:00
Alexander Alekhin
5f269d08b4 bigdata: add test, resolve split/merge issue 2016-07-08 18:05:53 +03:00
Tomoaki Teshima
d0a8390963 fix run time error on Mac
* integrate HW version and SW version to same function
2016-06-09 08:41:37 +09:00
Tomoaki Teshima
fd76ed5c0f fix to support a wider range of compilers
* check compiler more strictly
* use gcc version of fp16 conversion if it's possible (gcc 4.7 and later)
* use current SW implementation in other cases
2016-06-07 18:32:47 +09:00
Tomoaki Teshima
6f6eebbcb9 fix warning 2016-06-07 18:31:18 +09:00
Tomoaki Teshima
fbfd3158a7 fix corner case when number is small 2016-06-07 08:59:28 +09:00
Tomoaki Teshima
eccf2fa4c3 follow other interface
* remove useHW option
* update test
2016-06-06 08:56:37 +09:00
Tomoaki Teshima
b2ad7cd9c0 add feature to convert FP32(float) to FP16(half)
* check compiler support
* check HW support before executing
* add test doing round trip conversion from / to FP32
* treat array correctly if size is not multiple of 4
* add declaration to prevent warning
* make it possible to enable fp16 on 32bit ARM
* make the conversion possible on non-supported HW, too
* add test using both HW and SW implementation
2016-05-21 21:31:33 +09:00
Alexander Alekhin
6997d423c8 fix normalize in case of inplace operations
fixes #5876
2015-12-25 15:33:06 +03:00
Maksim Shabunin
b4bcdd10a1 HAL: improvements
- added new functions from core module: split, merge, add, sub, mul, div, ...
- added function replacement mechanism
- added example of HAL replacement library
2015-12-03 14:43:37 +03:00
hoangviet1985
e679d97100 remove redundant code 2015-11-22 14:32:18 -05:00