opencv

mirror of https://github.com/opencv/opencv.git synced 2025-08-06 14:36:36 +08:00

Author	SHA1	Message	Date
Alexander Alekhin	40533dbf69	Merge pull request #24918 from opencv-pushbot:gitee/alalek/core_convertfp16_replacement core(OpenCL): optimize convertTo() with CV_16F (convertFp16() replacement) #24918 relates #24909 relates #24917 relates #24892 Performance changes: - [x] 12700K (1 thread) + Intel iGPU \|Name of Test\|noOCL\|convertFp16\|convertTo BASE\|convertTo PATCH\| \|---\|:-:\|:-:\|:-:\|:-:\| \|ConvertFP16FP32MatMat::OCL_Core\|3.130\|3.152\|3.127\|3.136\| \|ConvertFP16FP32MatUMat::OCL_Core\|3.030\|3.996\|3.007\|2.671\| \|ConvertFP16FP32UMatMat::OCL_Core\|3.010\|3.101\|3.056\|2.854\| \|ConvertFP16FP32UMatUMat::OCL_Core\|3.016\|3.298\|2.072\|2.061\| \|ConvertFP32FP16MatMat::OCL_Core\|2.697\|2.652\|2.723\|2.721\| \|ConvertFP32FP16MatUMat::OCL_Core\|2.752\|4.268\|2.662\|2.947\| \|ConvertFP32FP16UMatMat::OCL_Core\|2.706\|2.601\|2.603\|2.528\| \|ConvertFP32FP16UMatUMat::OCL_Core\|2.704\|3.215\|1.999\|1.988\| Patched version is not worse than convertFp16 and convertTo baseline (except MatUMat 32->16, baseline uses CPU code+dst buffer map). There are still gaps against noOpenCL(CPU only) mode due to T-API implementation issues (unnecessary synchronization). - [x] 12700K + AMD dGPU \|Name of Test\|noOCL\|convertFp16 dGPU\|convertTo BASE dGPU\|convertTo PATCH dGPU\| \|---\|:-:\|:-:\|:-:\|:-:\| \|ConvertFP16FP32MatMat::OCL_Core\|3.130\|3.133\|3.172\|3.087\| \|ConvertFP16FP32MatUMat::OCL_Core\|3.030\|1.713\|9.559\|1.729\| \|ConvertFP16FP32UMatMat::OCL_Core\|3.010\|6.515\|6.309\|4.452\| \|ConvertFP16FP32UMatUMat::OCL_Core\|3.016\|0.242\|23.597\|0.170\| \|ConvertFP32FP16MatMat::OCL_Core\|2.697\|2.641\|2.713\|2.689\| \|ConvertFP32FP16MatUMat::OCL_Core\|2.752\|4.076\|6.483\|4.191\| \|ConvertFP32FP16UMatMat::OCL_Core\|2.706\|9.042\|16.481\|1.834\| \|ConvertFP32FP16UMatUMat::OCL_Core\|2.704\|0.229\|15.730\|0.176\| convertTo-baseline can't compile OpenCL kernel for FP16 properly - FIXED. dGPU has much more power, so results are x16-17 better than single cpu core. 
Patched version is not worse than convertFp16 and convertTo baseline. There are still gaps against noOpenCL(CPU only) mode due to T-API implementation issues (unnecessary synchronization) and required memory transfers. Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>	2024-01-26 12:56:52 +03:00
Alexander Smorkalov	ae21368eb9	Merge pull request #24832 from AryanNanda17:Aryan#22177 Resolved issue number #22177	2024-01-26 10:42:47 +03:00
Sean McBride	e64857c561	Merge pull request #23736 from seanm:c++11-simplifications Removed all pre-C++11 code, workarounds, and branches #23736 This removes a bunch of pre-C++11 workarounds that are no longer necessary as C++11 is now required. It is a nice clean up and simplification. * No longer unconditionally #include <array> in cvdef.h, include explicitly where needed * Removed deprecated CV_NODISCARD, already unused in the codebase * Removed some pre-C++11 workarounds, and simplified some backwards compat defines * Removed CV_CXX_STD_ARRAY * Removed CV_CXX_MOVE_SEMANTICS and CV_CXX_MOVE * Removed all tests of CV_CXX11, now assume it's always true. This allowed removing a lot of dead code. * Updated some documentation consequently. * Removed all tests of CV_CXX11, now assume it's always true * Fixed links. --------- Co-authored-by: Maksim Shabunin <maksim.shabunin@gmail.com> Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>	2024-01-19 16:53:08 +03:00
Maksim Shabunin	6b77f50269	RISC-V: use non-saturating 64-bit add in intrin_rvv071.hpp	2024-01-17 20:34:12 +03:00
Maksim Shabunin	224b9ee33f	RISC-V: updated intrin_rvv071.hpp to work with modern toolchain 2.8.0 - intrinsics implementation (071) reworked to use modern RVV intrinsics syntax - cmake toolchain file (071) now allows selecting from predefined configurations Co-authored-by: Fang Sun <fangsun@linux.alibaba.com>	2024-01-17 20:34:12 +03:00
Stefan Dragnev	2791bb7062	Merge pull request #24773 from tailsu:sd/pathlike python: accept path-like objects wherever file names are expected #24773 Merry Christmas, all 🎄 Implements #15731 Support is enabled for all arguments named `filename` or `filepath` (case-insensitive), or annotated with `CV_WRAP_FILE_PATH`. Support is based on `PyOS_FSPath`, which is available in Python 3.6+. When running on older Python versions the arguments must have a `str` value as before. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-01-12 16:23:05 +03:00
Aryan	9b402cfa59	Resolved issue number #22177	2024-01-09 01:23:26 +05:30
Alexander Smorkalov	22a8fa0730	Merge pull request #24798 from Rageking8:correct-invalid-error-directive Correct invalid error directive	2024-01-06 12:05:07 +03:00
cudawarped	19527d79d6	core: address clang warnings	2024-01-02 08:33:55 +02:00
Rageking8	7f2c14fc4f	Correct invalid error directive	2023-12-29 21:34:16 +08:00
Alexander Alekhin	2e3ccb4e8e	Merge tag '4.9.0'	2023-12-28 09:29:33 +00:00
Alexander Smorkalov	dad8af6b17	Release 4.9.0.	2023-12-27 19:46:55 +03:00
Alexander Alekhin	49a0877b8c	docs: exclude test entities from bindings utils	2023-12-27 06:46:20 +00:00
Alexander Smorkalov	b407c58b96	pre: OpenCV 4.9.0 (version++).	2023-12-25 15:20:10 +03:00
Kumataro	dba7186378	Merge pull request #24271 from Kumataro:fix24163 Fix to convert float32 to int32/uint32 with rounding to nearest (ties to even). #24271 Fix https://github.com/opencv/opencv/issues/24163 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake (carotene is BSD)	2023-12-25 12:17:17 +03:00
Alexander Smorkalov	3893936243	Merge pull request #24565 from CNClareChen:4.x Change the lsx to baseline features.	2023-11-30 15:27:49 +03:00
Alexander Smorkalov	e20250139a	Merge pull request #24582 from hanliutong:rvv-lut Optimize the v_lut* functions for RISC-V Vector(RVV).	2023-11-30 10:59:51 +03:00
Philip Allgaier	9bb0a8d9e9	Fix comment typo in matx.hpp	2023-11-28 08:26:40 +01:00
Liutong HAN	ce0516282a	Optimize the v_lut for RVV.	2023-11-23 15:06:04 +08:00
Hao Chen	c19adb4953	Change the lsx to baseline features. This patch changes lsx to a baseline feature and lasx to a dispatch feature. Additionally, the runtime detection methods for lasx and lsx have been modified.	2023-11-21 11:51:22 +08:00
zihaomu	b913e73d04	DNN: add the Winograd fp16 support (#23654) * add Winograd FP16 implementation * fixed dispatching of FP16 code paths in dnn; use dynamic dispatcher only when NEON_FP16 is enabled in the build and the feature is present in the host CPU at runtime * fixed some warnings * hopefully fixed winograd on x64 (and maybe other platforms) --------- Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>	2023-11-20 13:45:37 +03:00
Alexander Smorkalov	8df76fe0cb	Exclude RVV UI internals from Doxygen documentation.	2023-11-08 14:22:05 +03:00
Vincent Rabaud	832f738db0	Merge pull request #24495 from vrabaud:fast_math_compile Get the SSE2 condition match the emmintrin.h inclusion condition. #24495 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-07 09:06:28 +03:00
Alexander Smorkalov	fe4d518d85	Merge pull request #24485 from hanliutong:rvv-opt Optimize the Implementation of RVV Universal Intrinsic.	2023-11-03 12:31:10 +03:00
Rostislav Vasilikhin	ea47cb3ffe	Merge pull request #24480 from savuor:backport_patch_nans Backport to 4.x: patchNaNs() SIMD acceleration #24480 backport from #23098 connected PR in extra: [#1118@extra](https://github.com/opencv/opencv_extra/pull/1118) ### This PR contains: * new SIMD code for `patchNaNs()` * CPU perf test <details> <summary>Performance comparison</summary> Geometric mean (ms) \|Name of Test\|noopt\|sse2\|avx2\|sse2 vs noopt (x-factor)\|avx2 vs noopt (x-factor)\| \|---\|:-:\|:-:\|:-:\|:-:\|:-:\| \|PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC1)\|0.019\|0.017\|0.018\|1.11\|1.07\| \|PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC4)\|0.037\|0.037\|0.033\|1.00\|1.10\| \|PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC1)\|0.032\|0.032\|0.033\|0.99\|0.98\| \|PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC4)\|0.072\|0.072\|0.070\|1.00\|1.03\| \|PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC1)\|0.051\|0.051\|0.050\|1.00\|1.01\| \|PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC4)\|0.137\|0.138\|0.128\|0.99\|1.06\| \|PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC1)\|0.137\|0.128\|0.129\|1.07\|1.06\| \|PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC4)\|0.450\|0.450\|0.448\|1.00\|1.01\| \|PatchNaNs::PatchNaNsFixture::(640x480, 32FC1)\|0.149\|0.029\|0.020\|5.13\|7.44\| \|PatchNaNs::PatchNaNsFixture::(640x480, 32FC2)\|0.304\|0.058\|0.040\|5.25\|7.65\| \|PatchNaNs::PatchNaNsFixture::(640x480, 32FC3)\|0.448\|0.086\|0.059\|5.22\|7.55\| \|PatchNaNs::PatchNaNsFixture::(640x480, 32FC4)\|0.601\|0.133\|0.083\|4.51\|7.23\| \|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC1)\|0.451\|0.093\|0.060\|4.83\|7.52\| \|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC2)\|0.892\|0.184\|0.126\|4.85\|7.06\| \|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC3)\|1.345\|0.311\|0.230\|4.32\|5.84\| \|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC4)\|1.831\|0.546\|0.436\|3.35\|4.20\| \|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC1)\|1.017\|0.250\|0.160\|4.06\|6.35\| 
\|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC2)\|2.077\|0.646\|0.605\|3.21\|3.43\| \|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC3)\|3.134\|1.053\|0.961\|2.97\|3.26\| \|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC4)\|4.222\|1.436\|1.288\|2.94\|3.28\| \|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC1)\|4.225\|1.401\|1.277\|3.01\|3.31\| \|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC2)\|8.310\|2.953\|2.635\|2.81\|3.15\| \|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC3)\|12.396\|4.455\|4.252\|2.78\|2.92\| \|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC4)\|17.174\|5.831\|5.824\|2.95\|2.95\| </details> ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-03 08:58:07 +03:00
Liutong HAN	451ee3991e	Use local variable.	2023-11-03 10:21:13 +08:00
CNClareChen	d142a796d8	Merge pull request #23929 from CNClareChen:4.x * Optimize some function with lasx. Optimize some function with lasx. #23929 This patch optimizes some lasx functions and reduces the runtime of opencv_test_core from 662,238 ms to 633,603 ms on the 3A5000 platform. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-10-20 14:20:09 +03:00
Alexander Smorkalov	1c0ca41b6e	Merge pull request #24371 from hanliutong:clean-up Clean up the obsolete API of Universal Intrinsic	2023-10-20 12:50:26 +03:00
Vadim Pisarevsky	ba4d6c859d	added detection & dispatching of some modern NEON instructions (NEON_FP16, NEON_BF16) (#24420) * added more or less cross-platform (based on POSIX signal() semantics) method to detect various NEON extensions, such as FP16 SIMD arithmetics, BF16 SIMD arithmetics, SIMD dotprod etc. It could be propagated to other instruction sets if necessary. * hopefully fixed compile errors * continue to fix CI * another attempt to fix build on Linux aarch64 * reverted to the original method to detect special arm neon instructions without signal() * renamed FP16_SIMD & BF16_SIMD to NEON_FP16 and NEON_BF16, respectively * removed extra whitespaces	2023-10-18 22:06:20 +03:00
Liutong HAN	a287605c3e	Clean up the Universal Intrinsic API.	2023-10-13 19:23:30 +08:00
Alexander Smorkalov	7e17f01b7b	Merge pull request #24368 from mshabunin:rvv-clang-17 RISC-V: added v0.12 intrinsics compatibility header	2023-10-12 10:28:54 +03:00
Maksim Shabunin	8edf37903d	RISC-V: added v0.12 intrinsics compatibility header	2023-10-06 20:16:57 +03:00
Sean McBride	5fb3869775	Merge pull request #23109 from seanm:misc-warnings * Fixed clang -Wnewline-eof warnings * Fixed all trivial clang -Wextra-semi and -Wc++98-compat-extra-semi warnings * Removed trailing semi from various macros * Fixed various -Wunused-macros warnings * Fixed some trivial -Wdocumentation warnings * Fixed some -Wdocumentation-deprecated-sync warnings * Fixed incorrect indentation * Suppressed some clang warnings in 3rd party code * Fixed QRCodeEncoder::Params documentation. --------- Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>	2023-10-06 13:33:21 +03:00
HAN Liutong	07bf9cb013	Merge pull request #24325 from hanliutong:rewrite Rewrite Universal Intrinsic code: float related part #24325 The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API. The series of PRs is listed below: #23885 First patch, an example #23980 Core module #24058 ImgProc module, part 1 #24132 ImgProc module, part 2 #24166 ImgProc module, part 3 #24301 Features2d and calib3d module #24324 Gapi module This patch (hopefully) is the last one in the series. This patch mainly involves 3 parts 1. Add some modifications related to float (CV_SIMD_64F) 2. Use `#if (CV_SIMD \|\| CV_SIMD_SCALABLE)` instead of `#if CV_SIMD \|\| CV_SIMD_SCALABLE`, then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD` 3. Summary of `CV_SIMD` blocks that remain unmodified: Updated comments - Some blocks cause test failures when enabled for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....` - Some blocks cannot be rewritten directly. (Not commented in the source code, just listed here) - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct) - ./modules/imgproc/src/color_lab.cpp (Array of vector type) - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type) - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly related with `CV_SIMD_WIDTH`) These algorithms will need to be redesigned to accommodate scalable backends. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. 
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-10-05 17:57:25 +03:00
Kumataro	b870ad46bf	Merge pull request #24074 from Kumataro/fix24057 Python: support tuple src for cv::add()/subtract()/... #24074 fix https://github.com/opencv/opencv/issues/24057 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-09-19 10:32:47 +03:00
HAN Liutong	f617fbe166	Merge pull request #24132 from hanliutong:rewrite-imgproc2 Rewrite Universal Intrinsic code by using new API: ImgProc module Part 2 #24132 The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro in the opencv/modules/imgproc folder: rewrite them by using the new Universal Intrinsic API. This is the second part of the modification to the Imgproc module ( Part 1: #24058 ), And I tested this patch on RVV (QEMU) and AVX devices, `opencv_test_imgproc` is passed. The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter). ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-09-19 08:52:42 +03:00
Alexander Smorkalov	8f2e6640e3	Merge pull request #24288 from tailsu:sd/emscripten-3.1.45-fixes build fixes for emscripten 3.1.45	2023-09-19 08:09:18 +03:00
Stefan Dragnev	9b5a719d80	build fixes for emscripten 3.1.45	2023-09-18 15:38:31 +02:00
Yuriy Chernyshov	494d201fda	Add missing <sstream> includes	2023-09-05 22:04:26 +03:00
Kumataro	72bb8bb73c	core: arm64: v_round() works with round to nearest, ties to even.	2023-09-04 10:27:55 +03:00
Yuantao Feng	a308dfca98	core: add broadcast (#23965) * add broadcast_to with tests * change name * fix test * fix implicit type conversion * replace type of shape with InputArray * add perf test * add perf tests which take care of axis * v2 from ficus expand * rename to broadcast * use randu in place of declare * doc improvement; smaller scale in perf * capture get_index by reference	2023-08-30 09:53:59 +03:00
Alexander Smorkalov	232c67bf76	Merge pull request #24140 from sthibaul:4.x Fix GNU/Hurd build	2023-08-11 12:32:22 +03:00
HAN Liutong	0dd7769bb1	Merge pull request #23980 from hanliutong:rewrite-core Rewrite Universal Intrinsic code by using new API: Core module. #23980 The goal of this PR is to match and modify all SIMD code blocks guarded by `CV_SIMD` macro in the `opencv/modules/core` folder and rewrite them by using the new Universal Intrinsic API. The patch is almost auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR #23885. Most of the files have been rewritten, but I marked this PR as draft because the `CV_SIMD` macro also exists in the following files, and the reasons why they have not been rewritten are: 1. ~~code design for fixed-size SIMD (v_int16x8, v_float32x4, etc.), need to manually rewrite.~~ Rewritten - ./modules/core/src/stat.simd.hpp - ./modules/core/src/matrix_transform.cpp - ./modules/core/src/matmul.simd.hpp 2. Vector types are wrapped in other class/struct, that are not supported by the compiler in variable-length backends. Cannot be rewritten directly. - ./modules/core/src/mathfuncs_core.simd.hpp ```cpp struct v_atan_f32 { explicit v_atan_f32(const float& scale) { ... } v_float32 compute(const v_float32& y, const v_float32& x) { ... } ... v_float32 val90; // sizeless type cannot be used in a class v_float32 val180; v_float32 val360; v_float32 s; }; ``` 3. The API interface does not support/does not match - ./modules/core/src/norm.cpp Use `v_popcount`, ~~waiting for #23966~~ Fixed - ./modules/core/src/has_non_zero.simd.hpp Use illegal Universal Intrinsic API: For float type, there is no logical operation `\|`. Further discussion needed ```cpp /** @brief Bitwise OR Only for integer types. 
*/ template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n> operator\|(const v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b); template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n>& operator\|=(v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b); ``` ```cpp #if CV_SIMD typedef v_float32 v_type; const v_type v_zero = vx_setzero_f32(); constexpr const int unrollCount = 8; int step = v_type::nlanes * unrollCount; int len0 = len & -step; const float* srcSimdEnd = src+len0; int countSIMD = static_cast<int>((srcSimdEnd-src)/step); while(!res && countSIMD--) { v_type v0 = vx_load(src); src += v_type::nlanes; v_type v1 = vx_load(src); src += v_type::nlanes; .... src += v_type::nlanes; v0 \|= v1; //Illegal ? .... //res = v_check_any(((v0 \| v4) != v_zero));//beware : (NaN != 0) returns "false" since != is mapped to _CMP_NEQ_OQ and not _CMP_NEQ_UQ res = !v_check_all(((v0 \| v4) == v_zero)); } v_cleanup(); #endif ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-08-11 08:33:33 +03:00
Samuel Thibault	82de5b3a67	Fix GNU/Hurd build It has the usual Unix filesystem operations.	2023-08-10 22:43:46 +02:00
cudawarped	bea0c1b660	cuda: Fix GpuMat::copyTo and GpuMat::convertTo python bindings	2023-08-01 15:09:37 +03:00
Alexander Smorkalov	b22c2505a8	Disable warning C5054 in VS 2022 C++20	2023-07-26 09:23:32 +03:00
Alexander Smorkalov	12acf5603a	Merge pull request #24001 from legrosbuffle:legrosbuffle-cvround-intrinsic Use intrinsics for `cvRound` on x86_64 `__GNUC__` (clang/gcc linux) too.	2023-07-23 09:53:18 +03:00
Clement Courbet	3cce299a78	Use intrinsics for `cvRound` on x86 and x86_64 `__GNUC__` (clang/gcc linux) too. We've measured a 7x improvement in speed for `cvRound` using the intrinsic.	2023-07-21 10:57:54 +03:00
Alexander Smorkalov	1f7025f028	Merge pull request #23920 from loongson-zn:4.x Fix LoongArch Macro Definition	2023-07-14 15:00:41 +03:00
Alexander Smorkalov	bd2695f01b	Merge pull request #23966 from hanliutong:popcount Add missing "v_popcount" for RVV and enable tests.	2023-07-13 12:22:46 +03:00

2462 Commits