opencv

mirror of https://github.com/opencv/opencv.git synced 2025-01-11 23:18:11 +08:00

Author	SHA1	Message	Date
Vadim Pisarevsky	1d18aba587	Extended several core functions to support new types (#24962) * started adding support for new types (16f, 16bf, 32u, 64u, 64s) to arithmetic functions * fixed several tests; refactored and extended sum(), extended inRange(). * extended countNonZero(), mean(), meanStdDev(), minMaxIdx(), norm() and sum() to support new types (F16, BF16, U32, U64, S64) * put missing CV_DEPTH_MAX to some function dispatcher tables * extended findnonzero, hasnonzero with the new types support * extended mixChannels() to support new types * minor fix * fixed a few compile errors on Linux and a few failures in core tests * fixed a few more warnings and test failures * trying to fix the remaining warnings and test failures. The test `MulTestGPU.MathOpTest` was disabled - not clear whether to set tolerance - it's not bit-exact operation, as possibly assumed by the test, due to the use of scale and possibly limited accuracy of the intermediate floating-point calculations. * found that in the current snapshot G-API produces incorrect results in Mul, Div and AddWeighted (at least when using OpenCL on Windows x64 or MacOS x64). Disabled the respective tests.	2024-02-11 10:42:41 +03:00
Alexander Smorkalov	c739117a7c	Merge branch 4.x	2024-01-19 17:32:22 +03:00
Vincent Rabaud	0812659e92	Fix compilation on some 32-bit windows I do not have more info on the platform as it is internal. Without this fix, the error is: core/src/arithm.simd.hpp:868:1: error: too few arguments provided to function-like macro invocation 868 \| DEFINE_SIMD_ALL(cmp) \| ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:93:5: note: expanded from macro 'DEFINE_SIMD_ALL' 93 \| DEFINE_SIMD_NSAT(fun, __VA_ARGS__) \| ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:89:5: note: expanded from macro 'DEFINE_SIMD_NSAT' 89 \| DEFINE_SIMD_F64(fun, __VA_ARGS__) \| ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:77:9: note: expanded from macro 'DEFINE_SIMD_F64' 77 \| DEFINE_NOSIMD(__CV_CAT(fun, 64f), double, __VA_ARGS__) \| ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:47:56: note: expanded from macro 'DEFINE_NOSIMD' 47 \| DEFINE_NOSIMD_FUN(fun_name, c_type, __VA_ARGS__) \| ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:860:9: note: macro 'DEFINE_NOSIMD_FUN' defined here 860 \| #define DEFINE_NOSIMD_FUN(fun, _T1, _Tvec, ...) \	2023-11-29 16:27:11 +01:00
Alexander Smorkalov	97620c053f	Merge branch 4.x	2023-10-23 11:53:04 +03:00
HAN Liutong	07bf9cb013	Merge pull request #24325 from hanliutong:rewrite Rewrite Universal Intrinsic code: float related part #24325 The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API. The series of PRs is listed below: #23885 First patch, an example #23980 Core module #24058 ImgProc module, part 1 #24132 ImgProc module, part 2 #24166 ImgProc module, part 3 #24301 Features2d and calib3d module #24324 Gapi module This patch (hopefully) is the last one in the series. This patch mainly involves 3 parts 1. Add some modifications related to float (CV_SIMD_64F) 2. Use `#if (CV_SIMD \|\| CV_SIMD_SCALABLE)` instead of `#if CV_SIMD \|\| CV_SIMD_SCALABLE`, then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD` 3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....` - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here) - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct) - ./modules/imgproc/src/color_lab.cpp (Array of vector type) - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type) - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`) These algorithms will need to be redesigned to accommodate scalable backends. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-10-05 17:57:25 +03:00
Alexander Smorkalov	fdab565711	Merge branch 4.x	2023-09-13 14:49:25 +03:00
HAN Liutong	0dd7769bb1	Merge pull request #23980 from hanliutong:rewrite-core Rewrite Universal Intrinsic code by using new API: Core module. #23980 The goal of this PR is to match and modify all SIMD code blocks guarded by `CV_SIMD` macro in the `opencv/modules/core` folder and rewrite them by using the new Universal Intrinsic API. The patch is almost auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR #23885. Most of the files have been rewritten, but I marked this PR as draft because, the `CV_SIMD` macro also exists in the following files, and the reasons why they are not rewrited are: 1. ~~code design for fixed-size SIMD (v_int16x8, v_float32x4, etc.), need to manually rewrite.~~ Rewrited - ./modules/core/src/stat.simd.hpp - ./modules/core/src/matrix_transform.cpp - ./modules/core/src/matmul.simd.hpp 2. Vector types are wrapped in other class/struct, that are not supported by the compiler in variable-length backends. Can not be rewrited directly. - ./modules/core/src/mathfuncs_core.simd.hpp ```cpp struct v_atan_f32 { explicit v_atan_f32(const float& scale) { ... } v_float32 compute(const v_float32& y, const v_float32& x) { ... } ... v_float32 val90; // sizeless type can not used in a class v_float32 val180; v_float32 val360; v_float32 s; }; ``` 3. The API interface does not support/does not match - ./modules/core/src/norm.cpp Use `v_popcount`, ~~waiting for #23966~~ Fixed - ./modules/core/src/has_non_zero.simd.hpp Use illegal Universal Intrinsic API: For float type, there is no logical operation `\|`. Further discussion needed ```cpp /** @brief Bitwise OR Only for integer types. */ template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n> operator\|(const v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b); template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n>& operator\|=(v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b); ``` ```cpp #if CV_SIMD typedef v_float32 v_type; const v_type v_zero = vx_setzero_f32(); constexpr const int unrollCount = 8; int step = v_type::nlanes * unrollCount; int len0 = len & -step; const float* srcSimdEnd = src+len0; int countSIMD = static_cast<int>((srcSimdEnd-src)/step); while(!res && countSIMD--) { v_type v0 = vx_load(src); src += v_type::nlanes; v_type v1 = vx_load(src); src += v_type::nlanes; .... src += v_type::nlanes; v0 \|= v1; //Illegal ? .... //res = v_check_any(((v0 \| v4) != v_zero));//beware : (NaN != 0) returns "false" since != is mapped to _CMP_NEQ_OQ and not _CMP_NEQ_UQ res = !v_check_all(((v0 \| v4) == v_zero)); } v_cleanup(); #endif ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-08-11 08:33:33 +03:00
Vadim Pisarevsky	518486ed3d	Added new data types to cv::Mat & UMat (#23865) * started working on adding 32u, 64u, 64s, bool and 16bf types to OpenCV * core & imgproc tests seem to pass * fixed a few compile errors and test failures on macOS x86 * hopefully fixed some compile problems and test failures * fixed some more warnings and test failures * trying to fix small deviations in perf_core & perf_imgproc by revering randf_64f to exact version used before * trying to fix behavior of the new OpenCV with old plugins; there is (quite strong) assumption that video capture would give us frames with depth == CV_8U (0) or CV_16U (2). If depth is > 7 then it means that the plugin is built with the old OpenCV. It needs to be recompiled, of course and then this hack can be removed. * try to repair the case when target arch does not have FP64 SIMD * 1. fixed bug in itoa() found by alalek 2. restored ==, !=, > and < univ. intrinsics on ARM32/ARM64.	2023-08-04 10:50:03 +03:00
HAN Liutong	0ef803950b	Merge pull request #22179 from hanliutong:new-rvv [GSoC] New universal intrinsic backend for RVV * Add new rvv backend (partially implemented). * Modify the framework of Universal Intrinsic. * Add CV_SIMD macro guards to current UI code. * Use vlanes() instead of nlanes. * Modify the UI test. * Enable the new RVV (scalable) backend. * Remove whitespace. * Rename and some others modify. * Update intrin.hpp but still not work on AVX/SSE * Update conditional compilation macros. * Use static variable for vlanes. * Use max_nlanes for array defining.	2022-07-19 20:02:00 +03:00
damonyu1989	5f637e5a02	Merge pull request #19778 from damonyu1989:master-riscv-0.7.1 * Add the support for riscv64 vector 0.7.1. * fixed GCC warnings * cleaned whitespaces * Remove the worning by the use of internal API of compiler. * Update the license header. * removed trailing whitespaces Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@me.com> Co-authored-by: yulj <linjie.ylj@alibaba-inc.com> Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>	2021-05-25 20:15:12 +03:00
Alexander Alekhin	fb61f88b9c	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2020-01-12 09:35:39 +00:00
Alexander Alekhin	e180cc050b	Merge pull request #16236 from alalek:fix_core_simd_emulator * core: fix intrin_cpp, allow to build modules with SIMD emulator * core(arithm): fix v_zero initialization * core(simd): 'strict' types for binary/bitwise operations * features2d: avoid aligned load issue in GCC 5.4 with emulated SIMD * core(simd): alignment checks in SIMD emulator	2020-01-10 21:31:02 +03:00
Alexander Alekhin	a74fe2ec01	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-09-20 21:11:49 +00:00
mipsopen-fwu	b1ea91d8bd	Merge pull request #15422 from mipsopen-fwu:msa-dev * Added MSA implementations for mips platforms. Intrinsics for MSA and build scripts for MIPS platforms are added. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Removed some unused code in mips.toolchain.cmake. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Added comments for mips toolchain configuration and disabled compiling warnings for libpng. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Fixed the build error of unsupported opcode 'pause' when mips isa_rev is less than 2. Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Removed FP16 related item in MSA option defines in OpenCVCompilerOptimizations.cmake. 2. Use CV_CPU_COMPILE_MSA instead of __mips_msa for MSA feature check in cv_cpu_dispatch.h. 3. Removed hasSIMD128() in intrin_msa.hpp. 4. Define CPU_MSA as 150. Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Removed unnecessary CV_SIMD128_64F guarding in intrin_msa.hpp. 2. Removed unnecessary CV_MSA related code block in dotProd_8u(). Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Defined CPU_MSA_FLAGS_ON as "-mmsa". 2. Removed CV_SIMD128_64F guardings in intrin_msa.hpp. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Removed unused msa_mlal_u16() and msa_mlal_s16 from msa_macros.h. Signed-off-by: Fei Wu <fwu@wavecomp.com>	2019-09-20 19:52:48 +03:00
Alexander Alekhin	1913482cf5	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2018-11-10 20:50:26 +00:00
Sayed Adel	93ffebc273	core: reimplement SIMD arithmetic, logic and comparison operations into wide universal intrinsics - initialize arithmetic dispatcher - add new universal intrinsic v_absdiffs - add new universal intrinsic v_pack_b - add accumulate version of universal intrinsic v_round - fix sse/avx2:uint8 multiplication overflow - reimplement arithmetic, logic and comparison operations into wide universal intrinsics with full support for all types - reimplement IPP arithmetic, logic and comparison operations in a sperate file arithm_ipp.hpp - avoid scalar multiplication if scaling factor eq 1 and use integer multiplication - move C arithmetic operations to precomp.hpp and delete [arithm_simd\|arithm_core].hpp - add compatibility with new opencv4 divide policy	2018-10-30 12:48:31 +02:00

16 Commits