opencv

mirror of https://github.com/opencv/opencv.git synced 2025-01-06 02:08:12 +08:00

Author	SHA1	Message	Date
HAN Liutong	0dd7769bb1	Merge pull request #23980 from hanliutong:rewrite-core
Rewrite Universal Intrinsic code by using new API: Core module. #23980
The goal of this PR is to match and modify all SIMD code blocks guarded by the `CV_SIMD` macro in the `opencv/modules/core` folder and rewrite them by using the new Universal Intrinsic API.
The patch is almost entirely auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR #23885.
Most of the files have been rewritten, but I marked this PR as draft because the `CV_SIMD` macro also exists in the following files, and the reasons why they are not rewritten are:
1. ~~code design for fixed-size SIMD (v_int16x8, v_float32x4, etc.), need to manually rewrite.~~ Rewritten
  - ./modules/core/src/stat.simd.hpp
  - ./modules/core/src/matrix_transform.cpp
  - ./modules/core/src/matmul.simd.hpp
2. Vector types are wrapped in another class/struct, which is not supported by the compiler in variable-length backends. Cannot be rewritten directly.
  - ./modules/core/src/mathfuncs_core.simd.hpp
```cpp
struct v_atan_f32
{
    explicit v_atan_f32(const float& scale) { ... }
    v_float32 compute(const v_float32& y, const v_float32& x) { ... }
    ...
    v_float32 val90; // a sizeless type cannot be used in a class
    v_float32 val180;
    v_float32 val360;
    v_float32 s;
};
```
3. The API interface does not support/does not match
  - ./modules/core/src/norm.cpp
    Use `v_popcount`, ~~waiting for #23966~~ Fixed
  - ./modules/core/src/has_non_zero.simd.hpp
    Use illegal Universal Intrinsic API: for float type, there is no logical operation `|`. Further discussion needed
```cpp
/** @brief Bitwise OR
Only for integer types.
*/
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n> operator|(const v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n>& operator|=(v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
```
```cpp
#if CV_SIMD
    typedef v_float32 v_type;
    const v_type v_zero = vx_setzero_f32();
    constexpr const int unrollCount = 8;
    int step = v_type::nlanes * unrollCount;
    int len0 = len & -step;
    const float* srcSimdEnd = src+len0;
    int countSIMD = static_cast<int>((srcSimdEnd-src)/step);
    while(!res && countSIMD--)
    {
        v_type v0 = vx_load(src);
        src += v_type::nlanes;
        v_type v1 = vx_load(src);
        src += v_type::nlanes;
        ....
        src += v_type::nlanes;
        v0 |= v1; //Illegal ?
        ....
        //res = v_check_any(((v0 | v4) != v_zero));//beware : (NaN != 0) returns "false" since != is mapped to _CMP_NEQ_OQ and not _CMP_NEQ_UQ
        res = !v_check_all(((v0 | v4) == v_zero));
    }
    v_cleanup();
#endif
```
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake	2023-08-11 08:33:33 +03:00
Alexander Alekhin	1339ebaa84	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2022-03-26 16:00:28 +00:00
Maksim Shabunin	593996216f	cartToPolar/polarToCart: disable inplace mode	2022-03-21 16:06:12 +03:00
Alexander Alekhin	cbfd38bd41	core: rework code locality - to reduce binaries size of FFmpeg Windows wrapper - MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper) - move code to improve locality with its used dependencies - move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there) - move UMat::inv() to lapack.cpp - move UMat::mul() to arithm.cpp - move UMat::eye() to matrix_operations.cpp (near setIdentity() implementation) - move normalize(): convert_scale.cpp => norm.cpp - move convertAndUnrollScalar(): arithm.cpp => copy.cpp - move scalarToRawData(): array.cpp => copy.cpp - move transpose(): matrix_operations.cpp => matrix_transform.cpp - move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose) - add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module - matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls - the steps above reduce the FFmpeg wrapper size by ~1.5Mb (initial size of OpenCV part is about 3Mb) backport is done to improve merge experience (fewer conflicts) backport of commit: `65eb946756`	2021-03-02 23:24:28 +00:00
Alexander Alekhin	65eb946756	core: rework code locality - to reduce binaries size of FFmpeg Windows wrapper - MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper) - move code to improve locality with its used dependencies - move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there) - move UMat::inv() to lapack.cpp - move UMat::mul() to arithm.cpp - move UMat::eye() to matrix_operations.cpp (near setIdentity() implementation) - move normalize(): convert_scale.cpp => norm.cpp - move convertAndUnrollScalar(): arithm.cpp => copy.cpp - move scalarToRawData(): array.cpp => copy.cpp - move transpose(): matrix_operations.cpp => matrix_transform.cpp - move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose) - add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module - matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls - the steps above reduce the FFmpeg wrapper size by ~1.5Mb (initial size of OpenCV part is about 3Mb)	2021-03-02 11:27:58 +00:00
Nikita Shulga	ec37364762	Use std::atomic in getExpTab32f and getLogTab32f Reads and writes to volatile bool are not guaranteed to be atomic.	2019-10-07 16:35:07 -07:00
Alexander Alekhin	858a7da5c0	core: rework getContinuousSize() for vector-col/row support	2018-11-10 11:08:28 +00:00
Alexander Alekhin	f185640eda	Merge pull request #12799 from alalek:update_build_js * js: update build script - support emscripten 1.38.12 (wasm is ON by default) - verbose build messages * js: use builtin Math functions * js: disable tracing code completely	2018-10-15 17:35:21 +03:00
Alexander Alekhin	72eccb7694	Merge pull request #12825 from alalek:issue_8413_3.4	2018-10-15 14:23:21 +00:00
Vitaly Tuzov	43d9256096	Replaced core module calls to universal intrinsics with wide universal intrinsics	2018-10-15 11:46:45 +03:00
Alexander Alekhin	c813ad5533	core(ocl): replace ambiguous 'depth' to 'DEPTH_dst' - always pass DEPTH_dst value to core/arithm kernel	2018-10-14 02:18:04 +00:00
Hamdi Sahloul	5d54def264	Add semicolons after `CV_INSTRUMENT` macros	2018-09-14 06:45:31 +09:00
Alexander Alekhin	acce95f446	backport fixes for static analyzer warnings Commits: - `09837928d9` - `10fb88d027` Excluded changes with std::atomic (C++98 requirement)	2018-09-04 16:49:42 +03:00
Alexander Alekhin	5b3ac112fe	core: move const tables outside of dispatched code To avoid duplicates in binaries	2018-08-08 17:54:54 +03:00
Alexander Alekhin	b09a4a98d4	opencv: Use cv::AutoBuffer<>::data()	2018-07-04 19:11:29 +03:00
Tomoaki Teshima	2a781bb616	remove raw SSE2/NEON implementation from mathfuncs.cpp * replace the implementation by universal intrinsic * make sure no degradation happens on ARM platform	2017-10-15 00:24:31 +09:00
Pavel Vlasov	11c2ffaf1c	Update for IPP for OpenCV 2017u2 integration; Updated integrations for: cv::split cv::merge cv::insertChannel cv::extractChannel cv::Mat::convertTo - now with scaled conversions support cv::LUT - disabled due to performance issues Mat::copyTo Mat::setTo cv::flip cv::copyMakeBorder - currently disabled cv::polarToCart cv::pow - ipp pow function was removed due to performance issues cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance cv::countNonZero cv::minMaxIdx cv::norm cv::canny - new integration. Disabled for threaded; cv::cornerHarris cv::boxFilter cv::bilateralFilter cv::integral	2017-04-25 15:53:12 +03:00
Pavel Vlasov	30a6cee2fe	Instrumentation for OpenCV API regions and IPP functions;	2016-08-19 18:10:03 +03:00
Pavel Vlasov	3860b8db02	IPP was enabled in mathfuncs_core; Exp and Log IPP implementations are changed to hal interface;	2016-08-12 18:16:04 +03:00
Maksim Shabunin	1e667de1f3	HAL math interfaces: fastAtan2, magnitude, sqrt, invSqrt, log, exp	2016-05-31 11:54:52 +03:00
Maksim Shabunin	84f37d352f	HAL moved back to core	2015-12-17 12:33:23 +03:00
Alexander Alekhin	b26580cc7b	checkRange fixes 1) fix multichannel support 2) remove useless bad_value, read value from original Mat directly 3) add more tests 4) fix docs for cvCeil and checkRange	2015-12-09 18:31:27 +03:00
Vadim Pisarevsky	d19897b734	Merge pull request #5651 from hoangviet1985:fix_solvePoly_3.0.0	2015-12-07 10:12:54 +00:00
Maksim Shabunin	ddf293a081	Merge pull request #5649 from hoangviet1985:solve_pow(x,3)=0_opencv300	2015-11-22 18:02:40 +00:00
hoangviet1985	3e96b724c2	squash	2015-11-20 15:03:32 -05:00
hoangviet1985	b96def885f	squash	2015-11-20 14:48:29 -05:00
Vadim Pisarevsky	3942b1f362	Merge pull request #5340 from alalek:ocl_off	2015-11-10 16:53:36 +00:00
Maksim Shabunin	6e9d0d9a0c	Visual Studio 2015 warning and test fixes	2015-10-20 12:48:37 +03:00
Alexander Alekhin	7213e5f68a	ocl: correct disabling of OpenCL code	2015-09-13 20:28:23 +03:00
Vadim Pisarevsky	85149f8686	hack solvePoly to find roots of polynomials with zero higher-order coefficients. The roots are populated in this case, which is not valid, strictly speaking, but good enough for functions like correctMatches. This solves http://code.opencv.org/issues/4330	2015-05-25 23:43:39 +03:00
Vadim Pisarevsky	73f760fdf0	some more compile warnings fixed	2015-05-05 18:03:40 +03:00
Vadim Pisarevsky	931a519969	fixed warning in mathfuncs	2015-05-05 17:49:36 +03:00
Vadim Pisarevsky	9fbd1d68ad	refactored div & pow funcs; added tests for special cases in pow() function. fixed http://code.opencv.org/issues/3935 possibly fixed http://code.opencv.org/issues/3594	2015-05-01 21:49:11 +03:00
Vadim Pisarevsky	ee11a2d266	fully implemented SSE and NEON cases of intrin.hpp; extended the HAL with some basic math functions	2015-04-16 23:00:26 +03:00
Vladislav Vinogradov	cda6fed41f	move tegra namespace out of cv to prevent conflicts	2015-02-27 12:52:11 +03:00
Vladislav Vinogradov	44e41baffe	use new functions before all tegra:: calls	2015-02-26 19:34:58 +03:00
Ilya Lavrenov	6bce6ee34a	checks	2015-01-12 10:59:31 +03:00
Ilya Lavrenov	1d3c860411	SinCos_32f	2015-01-12 10:59:31 +03:00
Ilya Lavrenov	fc0869735d	used popcnt	2015-01-12 10:59:30 +03:00
Ilya Lavrenov	3a78a22733	convertScaleAbs for s8, f64	2015-01-12 10:59:29 +03:00
Ilya Lavrenov	972ff1d0c4	polarToCart	2015-01-12 10:59:28 +03:00
Ilya Lavrenov	0a5c9cf145	magnitude 64f	2015-01-12 10:59:28 +03:00
Ilya Lavrenov	6ab928fb39	phase 64f	2015-01-12 10:59:28 +03:00
Alexander Karsakov	462c3c25a9	Removed incorrect using of rootn() and powr() in ocl_pow	2014-11-06 16:23:02 +03:00
Ilya Lavrenov	5ca25ab8f0	cv::pow (integer power)	2014-11-01 13:19:51 +03:00
Ilya Lavrenov	ccdc71286c	cv::polarToCart	2014-11-01 13:19:51 +03:00
Ilya Lavrenov	d5f006eee5	cv::magnitude; cv::corner**	2014-11-01 13:19:51 +03:00
Ilya Lavrenov	fb97273b3c	cv::phase; cv::cartToPolar	2014-11-01 13:19:51 +03:00
Pavel Vlasov	45958eaabc	Implementation detector and selector for IPP and OpenCL; IPP can be switched on and off on runtime; Optional implementation collector was added (switched off by default in CMake). Gathers data of implementation used in functions and report this info through performance TS; TS modifications for implementations control;	2014-10-15 14:24:41 +04:00
Ilya Lavrenov	00f16e9178	neon	2014-10-03 08:43:02 +00:00
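
Note on commit 0dd7769bb1: its message warns that `(NaN != 0)` evaluates to "false" when `!=` maps to an ordered comparison (`_CMP_NEQ_OQ`), which is why the quoted code uses `!v_check_all(... == v_zero)` instead of `v_check_any(... != v_zero)`. The scalar sketch below only illustrates that difference; it is not OpenCV code. The helpers `orderedNotEqual`, `anyOrderedNotEqualZero`, and `notAllEqualZero` are hypothetical names introduced here, and the ordered comparison is emulated with `std::isnan` on the assumption that an ordered predicate yields false whenever an operand is NaN.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical scalar stand-in for an ordered "not equal" predicate
// (the behaviour the commit message attributes to _CMP_NEQ_OQ):
// the result is false whenever either operand is NaN.
static bool orderedNotEqual(float a, float b)
{
    return !std::isnan(a) && !std::isnan(b) && a != b;
}

// Mirrors the commented-out form, v_check_any(x != 0):
// a NaN element is never reported as non-zero.
static bool anyOrderedNotEqualZero(const float* src, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i)
        if (orderedNotEqual(src[i], 0.0f))
            return true;
    return false;
}

// Mirrors the form kept in the commit, !v_check_all(x == 0):
// NaN == 0 is false, so a NaN element counts as non-zero.
static bool notAllEqualZero(const float* src, std::size_t len)
{
    bool allZero = true;
    for (std::size_t i = 0; i < len; ++i)
        allZero = allZero && (src[i] == 0.0f);
    return !allZero;
}
```

For an input containing only zeros and a NaN, the first helper reports no non-zero element while the second does; for inputs without NaN the two agree.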


119 Commits