opencv

mirror of https://github.com/opencv/opencv.git synced 2025-01-11 23:18:11 +08:00

Author	SHA1	Message	Date
Alexander Smorkalov	daa8f7dfc6	Partially back-port #25075 to 4.x	2024-03-05 12:15:39 +03:00
HAN Liutong	07bf9cb013	Merge pull request #24325 from hanliutong:rewrite
Rewrite Universal Intrinsic code: float related part #24325
The goal of this series of PRs is to modify the SIMD code blocks guarded by the `CV_SIMD` macro: rewrite them by using the new Universal Intrinsic API.
The series of PRs is listed below:
- #23885 First patch, an example
- #23980 Core module
- #24058 ImgProc module, part 1
- #24132 ImgProc module, part 2
- #24166 ImgProc module, part 3
- #24301 Features2d and calib3d module
- #24324 Gapi module
This patch (hopefully) is the last one in the series.
This patch mainly involves 3 parts:
1. Add some modifications related to float (CV_SIMD_64F)
2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, so that the `CV_SIMD` blocks not enabled for `CV_SIMD_SCALABLE` can be found by searching for `if CV_SIMD`
3. Summary of `CV_SIMD` blocks that remain unmodified: Updated comments
   - Some blocks cause test failures when enabled for RVV; they are marked as `TODO: enable for CV_SIMD_SCALABLE, ....`
   - Some blocks cannot be rewritten directly. (Not commented in the source code, just listed here)
     - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct)
     - ./modules/imgproc/src/color_lab.cpp (Array of vector type)
     - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type)
     - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly related to `CV_SIMD_WIDTH`)
These algorithms will need to be redesigned to accommodate scalable backends.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake	2023-10-05 17:57:25 +03:00
HAN Liutong	0dd7769bb1	Merge pull request #23980 from hanliutong:rewrite-core
Rewrite Universal Intrinsic code by using new API: Core module. #23980
The goal of this PR is to match and modify all SIMD code blocks guarded by the `CV_SIMD` macro in the `opencv/modules/core` folder and rewrite them by using the new Universal Intrinsic API.
The patch is almost auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR #23885.
Most of the files have been rewritten, but I marked this PR as draft because the `CV_SIMD` macro also exists in the following files, and the reasons why they are not rewritten are:
1. ~~code design for fixed-size SIMD (v_int16x8, v_float32x4, etc.), need to manually rewrite.~~ Rewritten
   - ./modules/core/src/stat.simd.hpp
   - ./modules/core/src/matrix_transform.cpp
   - ./modules/core/src/matmul.simd.hpp
2. Vector types are wrapped in other class/struct, which is not supported by the compiler in variable-length backends. Cannot be rewritten directly.
   - ./modules/core/src/mathfuncs_core.simd.hpp
```cpp
struct v_atan_f32
{
    explicit v_atan_f32(const float& scale) { ... }
    v_float32 compute(const v_float32& y, const v_float32& x) { ... }
    ...
    v_float32 val90; // sizeless type cannot be used in a class
    v_float32 val180;
    v_float32 val360;
    v_float32 s;
};
```
3. The API interface does not support/does not match
   - ./modules/core/src/norm.cpp Use `v_popcount`, ~~waiting for #23966~~ Fixed
   - ./modules/core/src/has_non_zero.simd.hpp Use illegal Universal Intrinsic API: For float type, there is no logical operation `|`. Further discussion needed
```cpp
/** @brief Bitwise OR
Only for integer types. */
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n> operator|(const v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n>& operator|=(v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
```
```cpp
#if CV_SIMD
    typedef v_float32 v_type;
    const v_type v_zero = vx_setzero_f32();
    constexpr const int unrollCount = 8;
    int step = v_type::nlanes * unrollCount;
    int len0 = len & -step;
    const float* srcSimdEnd = src+len0;
    int countSIMD = static_cast<int>((srcSimdEnd-src)/step);
    while(!res && countSIMD--)
    {
        v_type v0 = vx_load(src);
        src += v_type::nlanes;
        v_type v1 = vx_load(src);
        src += v_type::nlanes;
        ....
        src += v_type::nlanes;
        v0 |= v1; //Illegal ?
        ....
        //res = v_check_any(((v0 | v4) != v_zero));//beware : (NaN != 0) returns "false" since != is mapped to _CMP_NEQ_OQ and not _CMP_NEQ_UQ
        res = !v_check_all(((v0 | v4) == v_zero));
    }
    v_cleanup();
#endif
```
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake	2023-08-11 08:33:33 +03:00
yuki takehara	a6277370ca	Merge pull request #21107 from take1014:remove_assert_21038 resolves #21038 * remove C assert * revert C header * fix several points in review * fix test_ds.cpp	2021-11-27 18:34:52 +00:00
Alexander Alekhin	cbfd38bd41	core: rework code locality
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat::eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above reduce the FFmpeg wrapper size by ~1.5Mb (initial size of OpenCV part is about 3Mb)
backport is done to improve merge experience (fewer conflicts)
backport of commit: `65eb946756`	2021-03-02 23:24:28 +00:00
Alexander Alekhin	e180cc050b	Merge pull request #16236 from alalek:fix_core_simd_emulator * core: fix intrin_cpp, allow to build modules with SIMD emulator * core(arithm): fix v_zero initialization * core(simd): 'strict' types for binary/bitwise operations * features2d: avoid aligned load issue in GCC 5.4 with emulated SIMD * core(simd): alignment checks in SIMD emulator	2020-01-10 21:31:02 +03:00
Vitaly Tuzov	43d9256096	Replaced core module calls to universal intrinsics with wide universal intrinsics	2018-10-15 11:46:45 +03:00
Vitaly Tuzov	283348afc3	SSE2 code in invert() replaced with universal intrinsics	2018-10-02 12:47:07 +03:00
Hamdi Sahloul	5d54def264	Add semicolons after `CV_INSTRUMENT` macros	2018-09-14 06:45:31 +09:00
Hamdi Sahloul	03b3be0f51	MSVC: Silence external/meaningless warnings	2018-09-12 20:02:13 +09:00
Alexander Alekhin	5385086fef	core: solve(): add check for passed 'method' values	2018-07-13 15:15:48 +03:00
Alexander Alekhin	b09a4a98d4	opencv: Use cv::AutoBuffer<>::data()	2018-07-04 19:11:29 +03:00
woody.chow	611cf8d86f	Use Eigen::SelfAdjointEigenSolver in cv::eigen	2017-12-05 02:40:55 +03:00
Tomoaki Teshima	fd711219a2	use universal intrinsic in VBLAS - brush up v_reduce_sum of SSE version	2017-01-31 05:36:27 +09:00
Vladislav Sovrasov	dfe4519c07	Add QR decomposition to HAL	2016-09-05 18:20:04 +03:00
Pavel Vlasov	30a6cee2fe	Instrumentation for OpenCV API regions and IPP functions;	2016-08-19 18:10:03 +03:00
Vladislav Sovrasov	a2d0cc878c	Implement internal HAL for GEMM and matrix decompositions	2016-06-03 10:38:30 +03:00
Maksim Shabunin	84f37d352f	HAL moved back to core	2015-12-17 12:33:23 +03:00
Vadim Pisarevsky	d2aaa70e93	removed HAL calls from public OpenCV headers; put IPP calls back to hal::sqrt() and such (but they are disabled for now)	2015-05-22 16:04:10 +03:00
Vadim Pisarevsky	9fbd1d68ad	refactored div & pow funcs; added tests for special cases in pow() function. fixed http://code.opencv.org/issues/3935 possibly fixed http://code.opencv.org/issues/3594	2015-05-01 21:49:11 +03:00
Vadim Pisarevsky	7918267d02	fixed U non-orthogonality in SVD (http://code.opencv.org/issues/3801 )	2015-04-29 16:09:58 +03:00
Vadim Pisarevsky	ee11a2d266	fully implemented SSE and NEON cases of intrin.hpp; extended the HAL with some basic math functions	2015-04-16 23:00:26 +03:00
Dmitry-Me	8ed4bae4dd	Reduce variable scope, make formatting consistent with surrounding code	2015-03-14 12:50:42 +03:00
Adil Ibragimov	8a4a1bb018	Several types of formal refactoring: 1. someMatrix.data -> someMatrix.ptr() 2. someMatrix.data + someMatrix.step * lineIndex -> someMatrix.ptr( lineIndex ) 3. (SomeType*) someMatrix.data -> someMatrix.ptr<SomeType>() 4. someMatrix.data -> !someMatrix.empty() ( or !someMatrix.data -> someMatrix.empty() ) in logical expressions	2014-08-13 15:21:35 +04:00
Adil Ibragimov	98d5731ad8	some formal changes (generally adding constness)	2014-08-07 15:49:14 +04:00
Vadim Pisarevsky	ba3783d205	initial commit; ml has been refactored; it compiles and the tests run well; some other modules, apps and samples do not compile; to be fixed	2014-07-29 23:54:23 +04:00
Roman Donchenko	2c4bbb313c	Merge commit '43aec5ad' into merge-2.4 Conflicts: cmake/OpenCVConfig.cmake cmake/OpenCVLegacyOptions.cmake modules/contrib/src/retina.cpp modules/gpu/doc/camera_calibration_and_3d_reconstruction.rst modules/gpu/doc/video.rst modules/gpu/src/speckle_filtering.cpp modules/python/src2/cv2.cv.hpp modules/python/test/test2.py samples/python/watershed.py	2013-08-27 13:26:44 +04:00
Roman Donchenko	e9a28f66ee	Normalized file endings.	2013-08-21 18:59:25 +04:00
Andrey Kamaev	e27f4da9c6	Merge pull request #795 from taka-no-me:move_imgproc_utils_to_core	2013-04-11 11:35:15 +04:00
Andrey Kamaev	c98c246fc2	Move border type constants and Moments class to core module	2013-04-10 19:14:24 +04:00
Andrey Kamaev	b0e6606b98	Cleanup core module API * Drop some low level API * Remove outdated overloads * Utilize Input/OutputArray	2013-04-09 13:36:32 +04:00
Andrey Kamaev	67073daf19	Merge branch '2.4'	2013-04-05 21:11:59 +04:00
Andrey Kamaev	235a678458	SVD: always update W vector for better algorithm convergence	2013-04-04 13:55:36 +04:00
Andrey Kamaev	715fa3303e	Move cv::Mat out of core.hpp	2013-04-01 15:24:34 +04:00
Andrey Kamaev	1ca8f33b4e	Merge branch '2.4'	2013-03-21 23:11:54 +04:00
Vadim Pisarevsky	9a86245242	added test for bug #1448 and hopefully fixes the bug #2898	2013-03-20 11:58:19 +04:00
Andrey Kamaev	55698548dd	Avoid assert in lapack.cpp if findHomography fails in BestOf2NearestMatcher::match	2013-03-12 22:49:40 +04:00
Andrey Kamaev	ab221e94c0	Fix invert under MSVC	2013-02-26 11:16:57 +04:00
Vadim Pisarevsky	416432a8e5	replaced tabs with spaces	2013-02-25 23:10:38 +04:00
Vadim Pisarevsky	087537463d	attempt to make the ultimate fix for the failure in Core_Invert.small	2013-02-25 22:46:30 +04:00
Vadim Pisarevsky	b57e801c04	now invert 3x3 on "bad" matrices works well on Windows	2012-11-28 23:05:51 +04:00
Vadim Pisarevsky	9163471987	improved accuracy of 3x3 invert on poorly-conditioned matrices (bug #2525 )	2012-11-08 14:09:43 +04:00
OpenCV Buildbot	04384a71e4	Normalize line endings and whitespace	2012-10-17 15:32:23 +04:00
Vadim Pisarevsky	4b5f948307	added SSE2-optimized 3x3 invert by Grigoriy Frolov	2012-08-07 17:59:52 +04:00
Vadim Pisarevsky	fac3d9994c	integrated another portion of SSE optimizations from Grigory Frolov	2012-07-31 19:07:55 +04:00
Vadim Pisarevsky	b782d8bb53	integrated patch with some SSE2/SSE4.2 optimizations from Grigory Frolov	2012-07-24 17:24:31 +04:00
Vadim Pisarevsky	82cb2ab556	fixed bug in SVD, ticket #2027 ; fixed building highgui with ffmpeg support on MacOSX	2012-06-28 19:45:13 +00:00
Andrey Kamaev	3108423a37	Fixed assert placement in cv::invert	2012-05-23 09:28:26 +00:00
Victoria Zhislina	fbdb93ec79	CV_ENABLE_UNROLLED	2012-02-10 06:05:04 +00:00
Vadim Pisarevsky	dbfa8408d2	fixed potential bug in cv::eigen()	2012-01-26 19:41:59 +00:00

74 Commits
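
Note on the `has_non_zero.simd.hpp` item in the #23980 entry: the quoted loop ORs float vectors together even though the Universal Intrinsic `operator|` is documented as integer-only. A scalar sketch of the same idea in plain C++ may make the intent clearer. It is not OpenCV code: `float_bits` and `has_non_zero_f32` are hypothetical names, and the sketch assumes 32-bit IEEE 754 `float`. It ORs the raw bit patterns as integers, then performs a single float comparison against zero at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper (not an OpenCV function): read a float's bit pattern.
static inline std::uint32_t float_bits(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);  // well-defined way to reinterpret the bits
    return u;
}

// Hypothetical scalar stand-in for the SIMD loop quoted in #23980.
// Returns true if any element compares unequal to zero (NaN counts as non-zero).
bool has_non_zero_f32(const float* src, std::size_t len) {
    assert(src != nullptr || len == 0);
    std::uint32_t acc = 0;
    for (std::size_t i = 0; i < len; ++i)
        acc |= float_bits(src[i]);  // integer OR; float itself has no operator|
    float merged;
    std::memcpy(&merged, &acc, sizeof merged);
    // One float comparison at the end: +0.0 and -0.0 both compare equal to zero,
    // while any other accumulated bit pattern (including NaN) does not.
    // Written as !(x == 0) to mirror the quoted code; in scalar C++ this gives
    // the same result as (x != 0).
    return !(merged == 0.0f);
}
```

Calling `has_non_zero_f32` on an array containing only `0.0f` and `-0.0f` returns `false`, while an array containing any other value, or a NaN, returns `true`. The final float comparison is what keeps `-0.0f` from being reported as non-zero, which a pure integer test on the accumulated bits would get wrong.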