Commit Graph

197 Commits

Author SHA1 Message Date
Alexander Alekhin
1913482cf5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-10 20:50:26 +00:00
Alexander Alekhin
858a7da5c0 core: rework getContinuousSize() for vector-col/row support 2018-11-10 11:08:28 +00:00
Alexander Alekhin
5059523937 core: fix processing of vector-rows 2018-11-08 20:04:22 +03:00
Alexander Alekhin
808ba552c5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-09-14 23:44:35 +00:00
Hamdi Sahloul
5d54def264 Add semicolons after CV_INSTRUMENT macros 2018-09-14 06:45:31 +09:00
Vadim Pisarevsky
6d7f5871db
added basic support for CV_16F (the new datatype etc.) (#12463)
* added basic support for CV_16F (the new datatype etc.). CV_USRTYPE1 is now equal to CV_16F, which may break some [rarely used] functionality. We'll see

* fixed just introduced bug in norm; reverted errorneous changes in Torch importer (need to find a better solution)

* addressed some issues found during the PR review

* restored the patch to fix some perf test failures
2018-09-10 16:56:29 +03:00
Vadim Pisarevsky
80b62a41c6 Merge pull request #12411 from vpisarev:wide_convert
* rewrote Mat::convertTo() and convertScaleAbs() to wide universal intrinsics; added always-available and SIMD-optimized FP16<=>FP32 conversion

* fixed compile warnings

* fix some more compile errors

* slightly relaxed accuracy threshold for int->float conversion (since we now do it using single-precision arithmetics, not double-precision)

* fixed compile errors on iOS, Android and in the baseline C++ version (intrin_cpp.hpp)

* trying to fix ARM-neon builds

* trying to fix ARM-neon builds

* trying to fix ARM-neon builds

* trying to fix ARM-neon builds
2018-09-06 19:36:59 +03:00
Alexander Alekhin
acce95f446 backport fixes for static analyzer warnings
Commits:
- 09837928d9
- 10fb88d027

Excluded changes with std::atomic (C++98 requirement)
2018-09-04 16:49:42 +03:00
amatyuko
3ea2586a5a Fix for SSE2 intrinsics problem in the part of saturation arithmetic processing during 32s->16u packed conversion -
for some big negative values less than -INT_MAX+32767 the sign of the numbers is lost due to overflow that leads to
incorrect saturation to MAX value, instead of zero.
The issue is not reproduced with CV_ENABLED_INTRINSICS=OFF
2018-08-01 16:04:08 +03:00
Vadim Pisarevsky
7d19bd6c19 Merge pull request #11634 from vpisarev:empty_mat_with_types_2
fixes handling of empty matrices in some functions (#11634)

* a part of PR #11416 by Yuki Takehara

* moved the empty mat check in Mat::copyTo()

* fixed some test failures
2018-05-31 16:36:39 +00:00
Alexander Alekhin
97882d03cc core: fix FP16 conversion with CV_DISABLE_OPTIMIZATION option
Reproducer:
    cmake -DCPU_BASELINE=AVX2 -DCV_DISABLE_OPTIMIZATION=ON ...
2018-04-18 14:13:03 +03:00
Dmitry Kurtaev
73ca194313 Fix convertFp16 in JavaScript build 2018-04-07 09:44:43 +03:00
Namgoo Lee
c219f97f48 SSE2 : use _mm_cvtpd_epi32 when converting from CV_64F to CV_32S (#10987)
* SSE2 : use _mm_cvtpd_epi32 when converting from CV_64F to CV_32S

* No need to define a new universal intrinsic
2018-03-06 09:50:53 +03:00
Maksim Shabunin
221342fb25 Split convert.cpp into smaller pieces 2018-02-12 15:17:19 +03:00
Alexander Alekhin
5a791e6e06 cmake: update reporting of excluded dispatching files (#10711)
* cmake: add ocv_get_smart_file_name() macro

* cmake: avoid adding files for unavailable dispatch modes
2018-02-12 14:48:20 +03:00
Tomoaki Teshima
ca1a0a1108 core: remove raw SSE2/NEON implementation from convert.cpp (#9831)
* remove raw SSE2/NEON implementation from convert.cpp
  * remove raw implementation from Cvt_SIMD
  * remove raw implementation from cvtScale_SIMD
  * remove raw implementation from cvtScaleAbs_SIMD
  * remove duplicated implementation cvt_<float, short>
  * remove duplicated implementation cvtScale_<short, short, float>
  * add "from double" version of Cvt_SIMD
  * modify the condition of test ConvertScaleAbs

* Update convert.cpp

fixed crash in cvtScaleAbs(8s=>8u)

* fixed compile error on Win32

* fixed several test failures because of accuracy loss in cvtScale(int=>int)

* fixed NEON implementation of v_cvt_f64(int=>double) intrinsic

* another attempt to fix test failures

* keep trying to fix the test failures and just introduced compile warnings

* fixed one remaining test (subtractScalar)
2017-12-15 00:00:35 +03:00
Alexander Alekhin
0ed3209b00 ocl: avoid unnecessary loading/initializing OpenCL subsystem
If there are no OpenCL/UMat methods calls from application.

OpenCL subsystem is initialized:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called

Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
2017-11-28 14:02:42 +03:00
Pavel Vlasov
a57718e1ac ICV2017u3 package update;
- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
2017-08-23 14:24:43 +03:00
Alexander Alekhin
b4716b1d92 core: fix convertTo() AVX2 optimization 2017-07-17 15:02:14 +03:00
Vitaly Tuzov
5448d9186a AVX and SSE4.1 optimized conversion implementations migrated to separate files 2017-07-04 14:48:01 +03:00
Tomoaki Teshima
94848a3e1f suppress unreachable code warning
- fix the define condition based on the comment
2017-06-13 08:11:04 +09:00
Alexander Alekhin
125abe2fe4 Merge pull request #8838 from tomoaki0705:dispatchFp16 2017-06-06 15:31:42 +00:00
Tomoaki Teshima
e269ef96cb update convertFp16 using CV_CPU_CALL_FP16
* avoid link error (move the implementation of software version to header)
 * make getConvertFuncFp16 local (move from precomp.hpp to convert.hpp)
 * fix error on 32bit x86
2017-06-06 22:26:51 +09:00
Vadim Pisarevsky
ee257ffe9e Merge pull request #8455 from terfendail:ovxhal_skipsmall 2017-05-26 12:10:18 +00:00
Tomoaki Teshima
d81cdb8e1c add OpenCL version of convertFp16 and test
* disable vector operation for now
 * brush up the implementation based on comment
2017-05-23 20:00:21 +09:00
Pavel Vlasov
11c2ffaf1c Update for IPP for OpenCV 2017u2 integration;
Updated integrations for:
cv::split
cv::merge
cv::insertChannel
cv::extractChannel
cv::Mat::convertTo - now with scaled conversions support
cv::LUT - disabled due to performance issues
Mat::copyTo
Mat::setTo
cv::flip
cv::copyMakeBorder - currently disabled
cv::polarToCart
cv::pow - ipp pow function was removed due to performance issues
cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance
cv::countNonZero
cv::minMaxIdx
cv::norm
cv::canny - new integration. Disabled for threaded;
cv::cornerHarris
cv::boxFilter
cv::bilateralFilter
cv::integral
2017-04-25 15:53:12 +03:00
Alexander Alekhin
f1c8094f5f Merge pull request #8575 from lupustr3:pvlasov/icv2017u2_initial_update 2017-04-21 10:55:29 +00:00
Pavel Vlasov
35c7216846 IPP for OpenCV 2017u2 initial enabling patch; 2017-04-20 20:26:30 +03:00
Arnaud Brejeon
636ab095b0 Merge pull request #8535 from arnaudbrejeon:std_array
Add support for std::array<T, N> (#8535)

* Add support for std::array<T, N>

* Add std::array<Mat, N> support

* Remove UMat constructor with std::array parameter
2017-04-19 13:13:39 +03:00
Vitaly Tuzov
bf5b7843e8 Extended set of OpenVX HAL calls disabled for small images 2017-04-06 18:17:32 +03:00
Vitaly Tuzov
9a4b5a4545 OpenVX calls updated to use single common OpenVX context per thread 2017-02-21 16:08:23 +03:00
Rostislav Vasilikhin
fcdbe16252 openvx_cvt disabled for Khronos, fixed sstep and dstep usage 2016-12-16 23:00:55 +03:00
Rostislav Vasilikhin
cf5e976fad OpenVX convert enabled 2016-12-16 15:57:21 +03:00
Rostislav Vasilikhin
8b9422a052 OpenVX wrappers rewritten with CV_OVX_RUN, VX_DbgThrow 2016-12-14 17:49:41 +03:00
apavlenko
76c38f0c80 trying to enable canny_vx adding a new test comparing canny_cv vs canny_vx 2016-12-09 14:53:06 +03:00
Alexander Alekhin
48bff3bfd3 Merge pull request #7748 from LaurentBerger:Normalize3d 2016-12-08 16:20:11 +00:00
Rostislav Vasilikhin
695b20172b Merge pull request #7794 from savuor:fix/ovx_cvt_continuous
Fixed OpenVX wrapper for Mat::convertTo() (#7794)

* fixed for cases of unrolled (w*h x 1) matrices

* more error handling
2016-12-06 18:29:44 +02:00
Vitaly Tuzov
ced81f72bc Added OpenVX based processing to LUT 2016-12-02 14:36:47 +03:00
Rostislav Vasilikhin
2b56b174e8 fixed: data types, empty input case 2016-11-29 17:52:50 +03:00
Rostislav Vasilikhin
0a6958813c added OpenVX call to Mat::convertTo() (w/o scaling) 2016-11-29 17:52:36 +03:00
LaurentBerger
c56c0e140b Solve exception for 3D Mat 2016-11-29 12:10:33 +01:00
Alexander Alekhin
a9ab629f60 build: fix aarch64 build with aarch64-linux-gnu-g++-4.8 2016-09-29 17:26:19 +03:00
Tomoaki Teshima
c7cb116dc0 check FP16 build condition correctly
* use __GNUC_MINOR__ in correct place to check the version of GCC
  * check processor support of FP16 at run time
  * check compiler support of FP16 and pass correct compiler option
  * rely on ENABLE_AVX on gcc since AVX is generated when mf16c is passed
  * guard correctly using ifdef in case of various configuration
  * use v_float16x4 correctly by including the right header file
2016-09-23 11:04:22 +09:00
Tomoaki Teshima
903789f7af use universal intrinsic for FP16
* use v_float16x4 (universal intrinsic) instead of raw SSE/NEON implementation
  * define v_load_f16/v_store_f16 since v_load can't be distinguished when short pointer passed
  * brush up implementation on old compiler (guard correctly)
  * add test for v_load_f16 and round trip conversion of v_float16x4
  * fix conversion error
2016-09-05 08:13:52 +09:00
Alexander Alekhin
da5ead2c23 Merge pull request #7166 from tomoaki0705:brushUpFp16 2016-08-25 11:49:23 +00:00
Tomoaki Teshima
c5d7791b67 brush up fp16 implementation
* DRY
  * switch to Cv32suf and remove fp32Int32
  * add Cv16suf
2016-08-25 05:31:25 +09:00
Pavel Vlasov
30a6cee2fe Instrumentation for OpenCV API regions and IPP functions; 2016-08-19 18:10:03 +03:00
Tomoaki Teshima
3debc78a5f fix build error on JetsonTK1
* avoid using vld1_f16 and vst1_f16 on gcc 4 series (Ubuntu 14.04)
  * guard correctly with #if
  * use static inline
2016-08-09 17:12:22 +09:00
Tomoaki Teshima
87ca607fd4 brush up convertFp16
* raise an error when wrong bit depth passed
  * raise an build error when wrong depth is specified for cvtScaleHalf_
  * remove unnecessary safe check in cvtScaleHalf_
  * use intrinsic instead of direct pointer access
  * update the explanation
2016-08-03 17:27:45 +09:00
Tomoaki Teshima
c57f8780e9 show CPU feature correctly when FP16 is available
* make sure that CV_FP16 has the correct meaning
  * check FP16 feature correctly
2016-07-29 14:10:33 +09:00