mirror of
https://github.com/opencv/opencv.git
synced 2024-12-15 01:39:10 +08:00
422d519703
3022 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
ryanking13
|
422d519703 | Enable file system on Emscripten | ||
Alexander Alekhin
|
40533dbf69
|
Merge pull request #24918 from opencv-pushbot:gitee/alalek/core_convertfp16_replacement
core(OpenCL): optimize convertTo() with CV_16F (convertFp16() replacement) #24918 relates #24909 relates #24917 relates #24892 Performance changes: - [x] 12700K (1 thread) + Intel iGPU |Name of Test|noOCL|convertFp16|convertTo BASE|convertTo PATCH| |---|:-:|:-:|:-:|:-:| |ConvertFP16FP32MatMat::OCL_Core|3.130|3.152|3.127|3.136| |ConvertFP16FP32MatUMat::OCL_Core|3.030|3.996|3.007|2.671| |ConvertFP16FP32UMatMat::OCL_Core|3.010|3.101|3.056|2.854| |ConvertFP16FP32UMatUMat::OCL_Core|3.016|3.298|2.072|2.061| |ConvertFP32FP16MatMat::OCL_Core|2.697|2.652|2.723|2.721| |ConvertFP32FP16MatUMat::OCL_Core|2.752|4.268|2.662|2.947| |ConvertFP32FP16UMatMat::OCL_Core|2.706|2.601|2.603|2.528| |ConvertFP32FP16UMatUMat::OCL_Core|2.704|3.215|1.999|1.988| Patched version is not worse than convertFp16 and convertTo baseline (except MatUMat 32->16, baseline uses CPU code+dst buffer map). There are still gaps against noOpenCL(CPU only) mode due to T-API implementation issues (unnecessary synchronization). - [x] 12700K + AMD dGPU |Name of Test|noOCL|convertFp16 dGPU|convertTo BASE dGPU|convertTo PATCH dGPU| |---|:-:|:-:|:-:|:-:| |ConvertFP16FP32MatMat::OCL_Core|3.130|3.133|3.172|3.087| |ConvertFP16FP32MatUMat::OCL_Core|3.030|1.713|9.559|1.729| |ConvertFP16FP32UMatMat::OCL_Core|3.010|6.515|6.309|4.452| |ConvertFP16FP32UMatUMat::OCL_Core|3.016|0.242|23.597|0.170| |ConvertFP32FP16MatMat::OCL_Core|2.697|2.641|2.713|2.689| |ConvertFP32FP16MatUMat::OCL_Core|2.752|4.076|6.483|4.191| |ConvertFP32FP16UMatMat::OCL_Core|2.706|9.042|16.481|1.834| |ConvertFP32FP16UMatUMat::OCL_Core|2.704|0.229|15.730|0.176| convertTo-baseline can't compile OpenCL kernel for FP16 properly - FIXED. dGPU has much more power, so results are x16-17 better than single cpu core. Patched version is not worse than convertFp16 and convertTo baseline. There are still gaps against noOpenCL(CPU only) mode due to T-API implementation issues (unnecessary synchronization) and required memory transfers. 
Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com> |
||
Sean McBride
|
e64857c561
|
Merge pull request #23736 from seanm:c++11-simplifications
Removed all pre-C++11 code, workarounds, and branches #23736 This removes a bunch of pre-C++11 workarounds that are no longer necessary as C++11 is now required. It is a nice clean up and simplification. * No longer unconditionally #include <array> in cvdef.h, include explicitly where needed * Removed deprecated CV_NODISCARD, already unused in the codebase * Removed some pre-C++11 workarounds, and simplified some backwards compat defines * Removed CV_CXX_STD_ARRAY * Removed CV_CXX_MOVE_SEMANTICS and CV_CXX_MOVE * Removed all tests of CV_CXX11, now assume it's always true. This allowed removing a lot of dead code. * Updated some documentation consequently. * Fixed links. --------- Co-authored-by: Maksim Shabunin <maksim.shabunin@gmail.com> Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai> |
||
Zhuo Zhang
|
37b02d170f |
fix qnx-sdp-700 build
based on https://github.com/opencv/opencv/pull/24864 |
||
Zhuo Zhang
|
b04de14fbb |
Fix QNX build
Based on https://github.com/opencv/opencv/issues/24567 |
||
Brad Smith
|
3b287770b9 |
Corrections for FreeBSD ARM support
FreeBSD does not have the /proc file system. FreeBSD was added to the code path for aarch64 before the use of the /proc file system with |
||
Brad Smith
|
34a871c855 | Fix building on OpenBSD X86 | ||
Maksim Shabunin
|
adde942e34 | OCL: fix incompatibility with Mali runtime | ||
Alexander Smorkalov
|
408730b7ab
|
Merge pull request #24618 from vrabaud:compilation
Fix compilation on some 32-bit windows |
||
Vincent Rabaud
|
0812659e92 |
Fix compilation on some 32-bit windows
I do not have more info on the platform as it is internal. Without this fix, the error is: core/src/arithm.simd.hpp:868:1: error: too few arguments provided to function-like macro invocation 868 | DEFINE_SIMD_ALL(cmp) | ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:93:5: note: expanded from macro 'DEFINE_SIMD_ALL' 93 | DEFINE_SIMD_NSAT(fun, __VA_ARGS__) | ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:89:5: note: expanded from macro 'DEFINE_SIMD_NSAT' 89 | DEFINE_SIMD_F64(fun, __VA_ARGS__) | ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:77:9: note: expanded from macro 'DEFINE_SIMD_F64' 77 | DEFINE_NOSIMD(__CV_CAT(fun, 64f), double, __VA_ARGS__) | ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:47:56: note: expanded from macro 'DEFINE_NOSIMD' 47 | DEFINE_NOSIMD_FUN(fun_name, c_type, __VA_ARGS__) | ^ ./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:860:9: note: macro 'DEFINE_NOSIMD_FUN' defined here 860 | #define DEFINE_NOSIMD_FUN(fun, _T1, _Tvec, ...) \ |
||
Hao Chen
|
c19adb4953 |
Change the lsx to baseline features.
This patch changes lsx to a baseline feature and lasx to a dispatch feature. Additionally, the runtime detection methods for lasx and lsx have been modified. |
||
Rostislav Vasilikhin
|
ea47cb3ffe
|
Merge pull request #24480 from savuor:backport_patch_nans
Backport to 4.x: patchNaNs() SIMD acceleration #24480 backport from #23098 connected PR in extra: [#1118@extra](https://github.com/opencv/opencv_extra/pull/1118) ### This PR contains: * new SIMD code for `patchNaNs()` * CPU perf test <details> <summary>Performance comparison</summary> Geometric mean (ms) |Name of Test|noopt|sse2|avx2|sse2 vs noopt (x-factor)|avx2 vs noopt (x-factor)| |---|:-:|:-:|:-:|:-:|:-:| |PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC1)|0.019|0.017|0.018|1.11|1.07| |PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC4)|0.037|0.037|0.033|1.00|1.10| |PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC1)|0.032|0.032|0.033|0.99|0.98| |PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC4)|0.072|0.072|0.070|1.00|1.03| |PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC1)|0.051|0.051|0.050|1.00|1.01| |PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC4)|0.137|0.138|0.128|0.99|1.06| |PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC1)|0.137|0.128|0.129|1.07|1.06| |PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC4)|0.450|0.450|0.448|1.00|1.01| |PatchNaNs::PatchNaNsFixture::(640x480, 32FC1)|0.149|0.029|0.020|5.13|7.44| |PatchNaNs::PatchNaNsFixture::(640x480, 32FC2)|0.304|0.058|0.040|5.25|7.65| |PatchNaNs::PatchNaNsFixture::(640x480, 32FC3)|0.448|0.086|0.059|5.22|7.55| |PatchNaNs::PatchNaNsFixture::(640x480, 32FC4)|0.601|0.133|0.083|4.51|7.23| |PatchNaNs::PatchNaNsFixture::(1280x720, 32FC1)|0.451|0.093|0.060|4.83|7.52| |PatchNaNs::PatchNaNsFixture::(1280x720, 32FC2)|0.892|0.184|0.126|4.85|7.06| |PatchNaNs::PatchNaNsFixture::(1280x720, 32FC3)|1.345|0.311|0.230|4.32|5.84| |PatchNaNs::PatchNaNsFixture::(1280x720, 32FC4)|1.831|0.546|0.436|3.35|4.20| |PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC1)|1.017|0.250|0.160|4.06|6.35| |PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC2)|2.077|0.646|0.605|3.21|3.43| |PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC3)|3.134|1.053|0.961|2.97|3.26| |PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC4)|4.222|1.436|1.288|2.94|3.28| 
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC1)|4.225|1.401|1.277|3.01|3.31| |PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC2)|8.310|2.953|2.635|2.81|3.15| |PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC3)|12.396|4.455|4.252|2.78|2.92| |PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC4)|17.174|5.831|5.824|2.95|2.95| </details> ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake |
||
CNClareChen
|
d142a796d8
|
Merge pull request #23929 from CNClareChen:4.x
Optimize some functions with lasx. #23929 This patch optimizes some lasx functions and reduces the runtime of opencv_test_core from 662,238 ms to 633,603 ms on the 3A5000 platform. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake |
||
Alexander Smorkalov
|
1c0ca41b6e
|
Merge pull request #24371 from hanliutong:clean-up
Clean up the obsolete API of Universal Intrinsic |
||
Vadim Pisarevsky
|
ba4d6c859d
|
added detection & dispatching of some modern NEON instructions (NEON_FP16, NEON_BF16) (#24420)
* added more or less cross-platform (based on POSIX signal() semantics) method to detect various NEON extensions, such as FP16 SIMD arithmetics, BF16 SIMD arithmetics, SIMD dotprod etc. It could be propagated to other instruction sets if necessary. * hopefully fixed compile errors * continue to fix CI * another attempt to fix build on Linux aarch64 * * reverted to the original method to detect special arm neon instructions without signal() * renamed FP16_SIMD & BF16_SIMD to NEON_FP16 and NEON_BF16, respectively * removed extra whitespaces |
||
Liutong HAN
|
a287605c3e | Clean up the Universal Intrinsic API. | ||
Sean McBride
|
5fb3869775
|
Merge pull request #23109 from seanm:misc-warnings
* Fixed clang -Wnewline-eof warnings * Fixed all trivial clang -Wextra-semi and -Wc++98-compat-extra-semi warnings * Removed trailing semi from various macros * Fixed various -Wunused-macros warnings * Fixed some trivial -Wdocumentation warnings * Fixed some -Wdocumentation-deprecated-sync warnings * Fixed incorrect indentation * Suppressed some clang warnings in 3rd party code * Fixed QRCodeEncoder::Params documentation. --------- Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai> |
||
jvuillaumier
|
24fd39538e
|
Merge pull request #24233 from jvuillaumier:rotate_flip_hal_hooks
Add HAL implementation hooks to cv::flip() and cv::rotate() functions from core module #24233 Hello, This change proposes the addition of HAL hooks for cv::flip() and cv::rotate() functions from OpenCV core module. Flip and rotation are functions commonly available from 2D hardware accelerators. This is convenient provision to enable custom optimized implementation of image flip/rotation on systems embedding such accelerator. Thank you ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake |
||
HAN Liutong
|
07bf9cb013
|
Merge pull request #24325 from hanliutong:rewrite
Rewrite Universal Intrinsic code: float related part #24325 The goal of this series of PRs is to modify the SIMD code blocks guarded by the CV_SIMD macro: rewrite them by using the new Universal Intrinsic API. The series of PRs is listed below: #23885 First patch, an example #23980 Core module #24058 ImgProc module, part 1 #24132 ImgProc module, part 2 #24166 ImgProc module, part 3 #24301 Features2d and calib3d module #24324 Gapi module This patch (hopefully) is the last one in the series. This patch mainly involves 3 parts 1. Add some modifications related to float (CV_SIMD_64F) 2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, so that the `CV_SIMD` blocks that are not yet enabled for `CV_SIMD_SCALABLE` can be found by searching for `if CV_SIMD` 3. Summary of `CV_SIMD` blocks that remain unmodified: Updated comments - Some blocks cause test failures when enabled for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....` - Some blocks cannot be rewritten directly. (Not commented in the source code, just listed here) - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct) - ./modules/imgproc/src/color_lab.cpp (Array of vector type) - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type) - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly related to `CV_SIMD_WIDTH`) These algorithms will need to be redesigned to accommodate scalable backends. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. 
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake |
||
casualwinds
|
7b399c4248
|
Merge pull request #24280 from casualwind:parallel_opt
Optimization for parallelization when large core number #24280 **Problem description:** When the number of cores is large, OpenCV’s thread library may reduce performance when processing parallel jobs. **The reason for this problem:** The thread pool creates as many threads as there are cores, so when the number of cores is large, the main thread spends too much time waking up unnecessary threads. When a parallel job needs to be executed, the main thread wakes up all threads in sequence, and only then waits for the job completion signal. When the number of threads is larger than the number of parallel slices of a job, the threads awakened first may already have completed the whole job while the main thread is still waking up the others. The threads woken up after that point have nothing to do, and the broadcasts made by the waking threads take a lot of time, which reduces performance. **Solution:** Reduce the time the main thread spends waking up the worker threads through the following two methods: • The number of threads awakened by the main thread is adjusted according to the number of parallel slices of the job. If the number of threads is greater than the number of job slices, fewer threads are awakened. • While waking up threads in sequence, if the main thread finds that all parallel job slices have been allocated, it leaves the loop early and waits for the job completion signal. **Performance Test:** The tests were run in the manner described by https://github.com/opencv/opencv/wiki/HowToUsePerfTests. At core number = 160, there is a big performance gain in some cases. 
Take the following cases in the video module as examples: OpticalFlowPyrLK_self::Path_Idx_Cn_NPoints_WSize_Deriv::("cv/optflow/frames/VGA_%02d.png", 2, 1, (9, 9), 11, true) Performance improves 191%: 0.185405ms -> 0.0636496ms perf::DenseOpticalFlow_VariationalRefinement::(320x240, 10, 10) Performance improves 112%: 23.88938ms -> 11.2562ms Among all the modules, the performance improvement is greatest on module video, and there are also certain improvements on other modules. At core number = 160, the times labeled below are the geometric mean of the average time of all cases for one module. The optimization is available on each module.

Overall time (ms):

|module name|gapi|dnn|features2d|objdetect|core|imgproc|stitching|video|
|---|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|original|0.185|1.586|9.998|11.846|0.205|0.215|164.409|0.803|
|optimized|0.174|1.353|9.535|11.105|0.199|0.185|153.972|0.489|
|Performance improves|6%|17%|5%|7%|3%|16%|7%|64%|

Meanwhile, it was found that adjusting the order of test cases has an impact on some test cases. For example, we used option --gtest-shuffle to run opencv_perf_gapi, and the performance of the TestPerformance::CmpWithScalarPerfTestFluid/CmpWithScalarPerfTest::(compare_f, CMP_GE, 1920x1080, 32FC1, { gapi.kernel_package }) case changed by 30% compared to the case without shuffle. I would like to ask if you have also encountered such a situation and could you share your experience? ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. 
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake |
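The two methods described in this PR can be sketched as a small simulation. This is only an illustration under stated assumptions, not the code from the patch: `countWakeups` and its parameters are hypothetical names, and `slicesPerWorker` is an assumed model of how many slices an awakened worker claims before the main thread wakes the next one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical model of the main thread's wake-up loop (not OpenCV code).
// poolSize        - number of worker threads in the pool
// jobSlices       - number of parallel slices the job was split into
// slicesPerWorker - assumed number of slices each awakened worker claims
//                   before the main thread wakes the next worker
std::size_t countWakeups(std::size_t poolSize,
                         std::size_t jobSlices,
                         std::size_t slicesPerWorker) {
    // Method 1: never wake more workers than there are slices.
    const std::size_t limit = std::min(poolSize, jobSlices);

    std::size_t claimed = 0;  // slices already handed out
    std::size_t woken = 0;    // workers the main thread has signalled
    for (std::size_t i = 0; i < limit; ++i) {
        // Method 2: leave the loop once every slice has been allocated.
        if (claimed >= jobSlices) break;
        ++woken;
        claimed += slicesPerWorker;
    }
    return woken;
}
```

With 160 workers and a job of 8 slices, this model wakes at most 8 workers, and only 2 if each awakened worker claims 4 slices before the next wake-up.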
||
Vincent Rabaud
|
3880d059b3
|
Merge pull request #24260 from vrabaud:ubsan
Fix undefined behavior arithmetic in copyMakeBorder and adjustROI. #24260 This is due to the undefined: negative int multiplied by size_t pointer increment. To test, compile with: ``` mkdir build cd build cmake ../ -DCMAKE_C_FLAGS_INIT="-fsanitize=undefined" -DCMAKE_CXX_FLAGS_INIT="-fsanitize=undefined" -DCMAKE_C_COMPILER="/usr/bin/clang" -DCMAKE_CXX_COMPILER="/usr/bin/clang++" -DCMAKE_SHARED_LINKER_FLAGS="-fsanitize=undefined -lubsan" ``` And run: ``` make -j opencv_test_core && ./bin/opencv_test_core --gtest_filter=*UndefinedBehavior* ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake |
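The arithmetic problem named above (a negative `int` multiplied by a `size_t` step before a pointer increment) can be illustrated in isolation. The helper names below are hypothetical and the cast to `std::ptrdiff_t` is one common way to keep the offset signed; it is an assumption here, not necessarily the change made in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// What the mixed expression computes: the negative int is converted to
// size_t first, so the product is a huge unsigned value (hypothetical helper).
std::size_t wrappedProduct(int rows, std::size_t step) {
    return static_cast<std::size_t>(rows) * step;
}

// Signed alternative: convert both operands to ptrdiff_t before multiplying,
// so a negative row count yields a negative byte offset.
std::ptrdiff_t signedOffset(int rows, std::size_t step) {
    return static_cast<std::ptrdiff_t>(rows) * static_cast<std::ptrdiff_t>(step);
}

// Moves a row pointer by a possibly negative number of rows.
// The caller must guarantee the result stays inside the same buffer.
const unsigned char* moveRows(const unsigned char* p, int rows, std::size_t step) {
    return p + signedOffset(rows, step);
}
```

Adding `wrappedProduct(-2, 16)` to a pointer would move it far outside the buffer, whereas `signedOffset(-2, 16)` is simply -32.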
||
Yuriy Chernyshov
|
eb20bb3b23 | Add missing sanitizer interface include | ||
Alexander Smorkalov
|
0367a12b92 | Check that cv::merge input matrices are not empty. | ||
Yuriy Chernyshov
|
494d201fda | Add missing <sstream> includes | ||
Yuantao Feng
|
a308dfca98
|
core: add broadcast (#23965)
* add broadcast_to with tests * change name * fix test * fix implicit type conversion * replace type of shape with InputArray * add perf test * add perf tests which takes care of axis * v2 from ficus expand * rename to broadcast * use randu in place of declare * doc improvement; smaller scale in perf * capture get_index by reference |
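As background for the broadcast commit, the shape rule commonly used for broadcasting (NumPy style) can be sketched as follows. That the OpenCV function follows exactly this rule is an assumption here, and `canBroadcast` is a hypothetical helper, not part of the OpenCV API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: can an array of shape `src` be broadcast to `target`?
// Shapes are aligned at the trailing dimension; each source dimension must
// either equal the target dimension or be 1.
bool canBroadcast(const std::vector<int>& src, const std::vector<int>& target) {
    if (src.size() > target.size()) return false;
    const std::size_t shift = target.size() - src.size();
    for (std::size_t i = 0; i < src.size(); ++i) {
        const int s = src[i];
        const int t = target[i + shift];
        if (s != t && s != 1) return false;
    }
    return true;
}
```

Under this rule a 1x3 array can be expanded to 4x3, while a 2x3 array cannot.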
||
Kumataro
|
81cc89a3ce
|
Merge pull request #24179 from Kumataro:fix24145
* core:add OPENCV_IPP_MEAN/MINMAX/SUM option to enable IPP optimizations * fix: to use guard HAVE_IPP and ocv_append_source_file_compile_definitions() macro. * support OPENCV_IPP_ENABLE_ALL * add document for OPENCV_IPP_ENABLE_ALL * fix OPENCV_IPP_ENABLE_ALL comment |
||
Sean McBride
|
d792ebc5d2 |
Fixed buffer overrun; removed the last two uses of sprintf
Fixed an off-by-1 buffer resize, the space for the null termination was forgotten. Prefer snprintf, which can never overflow (if given the right size). In one case I cheated and used strcpy, because I cannot figure out the buffer size at that point in the code. |
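The off-by-one described above is easy to reproduce: the length reported by `snprintf` excludes the terminating null, so the buffer needs one extra byte. A minimal sketch, with a hypothetical `formatIndex` helper and format string that are not taken from the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats "<prefix>_<index>" without overflowing.
std::string formatIndex(const char* prefix, int index) {
    // First pass: ask snprintf how many characters are needed
    // (the returned length does NOT include the terminating '\0').
    const int len = std::snprintf(nullptr, 0, "%s_%d", prefix, index);
    if (len < 0) return std::string();

    // Reserve len + 1 bytes; forgetting the +1 is the off-by-one error.
    std::vector<char> buf(static_cast<std::size_t>(len) + 1);

    // Second pass: snprintf never writes more than buf.size() bytes.
    std::snprintf(buf.data(), buf.size(), "%s_%d", prefix, index);
    return std::string(buf.data(), static_cast<std::size_t>(len));
}
```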
||
Alexander Smorkalov
|
747b7cab6c
|
Merge pull request #23734 from seanm:unaligned-copy
Fixed invalid cast and unaligned memory access |
||
Alexander Smorkalov
|
232c67bf76
|
Merge pull request #24140 from sthibaul:4.x
Fix GNU/Hurd build |
||
HAN Liutong
|
0dd7769bb1
|
Merge pull request #23980 from hanliutong:rewrite-core
Rewrite Universal Intrinsic code by using new API: Core module. #23980 The goal of this PR is to match and modify all SIMD code blocks guarded by the `CV_SIMD` macro in the `opencv/modules/core` folder and rewrite them by using the new Universal Intrinsic API. The patch is almost auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR #23885. Most of the files have been rewritten, but I marked this PR as draft because the `CV_SIMD` macro also exists in the following files, and the reasons why they are not rewritten are: 1. ~~code design for fixed-size SIMD (v_int16x8, v_float32x4, etc.), need to manually rewrite.~~ Rewritten - ./modules/core/src/stat.simd.hpp - ./modules/core/src/matrix_transform.cpp - ./modules/core/src/matmul.simd.hpp 2. Vector types are wrapped in another class/struct, which is not supported by the compiler in variable-length backends. Cannot be rewritten directly. - ./modules/core/src/mathfuncs_core.simd.hpp ```cpp struct v_atan_f32 { explicit v_atan_f32(const float& scale) { ... } v_float32 compute(const v_float32& y, const v_float32& x) { ... } ... v_float32 val90; // sizeless type cannot be used in a class v_float32 val180; v_float32 val360; v_float32 s; }; ``` 3. The API interface does not support/does not match - ./modules/core/src/norm.cpp Use `v_popcount`, ~~waiting for #23966~~ Fixed - ./modules/core/src/has_non_zero.simd.hpp Use illegal Universal Intrinsic API: For float type, there is no logical operation `|`. Further discussion needed ```cpp /** @brief Bitwise OR Only for integer types. 
*/ template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n> operator|(const v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b); template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n>& operator|=(v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b); ``` ```cpp #if CV_SIMD typedef v_float32 v_type; const v_type v_zero = vx_setzero_f32(); constexpr const int unrollCount = 8; int step = v_type::nlanes * unrollCount; int len0 = len & -step; const float* srcSimdEnd = src+len0; int countSIMD = static_cast<int>((srcSimdEnd-src)/step); while(!res && countSIMD--) { v_type v0 = vx_load(src); src += v_type::nlanes; v_type v1 = vx_load(src); src += v_type::nlanes; .... src += v_type::nlanes; v0 |= v1; //Illegal ? .... //res = v_check_any(((v0 | v4) != v_zero));//beware : (NaN != 0) returns "false" since != is mapped to _CMP_NEQ_OQ and not _CMP_NEQ_UQ res = !v_check_all(((v0 | v4) == v_zero)); } v_cleanup(); #endif ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake |
||
Samuel Thibault
|
82de5b3a67 |
Fix GNU/Hurd build
It has the usual Unix filesystem operations. |
||
Vincent Rabaud
|
423ab8ddb8 | Use void* | ||
Vincent Rabaud
|
20784d3da2 |
Fix undefined behavior with wrong function pointers called.
Details here: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58006 runtime error: call to function (unknown) through pointer to incorrect function type 'void (*)(const unsigned char **, const int *, unsigned char **, const int *, int, int)' |
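The runtime error quoted above is raised when a function is called through a pointer whose type differs from the function's real signature. One way to avoid it, sketched here under the assumption that the follow-up "Use void*" commit takes a similar direction, is to give every kernel the same `void*`-based signature and cast inside the function body. All names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical generic signature shared by every kernel, so the pointer type
// used for the call always matches the function's actual type.
typedef void (*GenericCopyFunc)(const void* src, void* dst, int count);

// Each kernel casts the void* arguments back to its element type internally.
static void copyU8(const void* src, void* dst, int count) {
    const std::uint8_t* s = static_cast<const std::uint8_t*>(src);
    std::uint8_t* d = static_cast<std::uint8_t*>(dst);
    for (int i = 0; i < count; ++i) d[i] = s[i];
}

static void copyI32(const void* src, void* dst, int count) {
    const std::int32_t* s = static_cast<const std::int32_t*>(src);
    std::int32_t* d = static_cast<std::int32_t*>(dst);
    for (int i = 0; i < count; ++i) d[i] = s[i];
}

// Dispatch table lookup: no reinterpret_cast of function pointers is needed.
GenericCopyFunc selectCopy(int elemSize) {
    if (elemSize == 1) return copyU8;
    if (elemSize == 4) return copyI32;
    return nullptr;
}
```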
||
Alexander Smorkalov
|
23f27d8dbe | Use OpenCV logging instead of std::cerr. | ||
Maksim Shabunin
|
3f0707234f | risc-v: fix unaligned loads and stores | ||
Liutong HAN
|
d17507052e | Rewrite SIMD code by using new Universal Intrinsic API. | ||
Alexander Smorkalov
|
bf06bc92aa | Merge branch '3.4' into merge-3.4 | ||
Paul Kim (김형준)
|
3b264d5877
|
Add pthread.h Inclusion if HAVE_PTHREADS_PF is defined
Single-case tested with success on Windows 11 with MinGW-w64 Standalone GCC v13.1.0 while building OpenCV 4.7.0 |
||
Dmitry Kurtaev
|
22b747eae2
|
Merge pull request #23702 from dkurt:py_rotated_rect
Python binding for RotatedRect #23702 ### Pull Request Readiness Checklist related: https://github.com/opencv/opencv/issues/23546#issuecomment-1562894602 See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake |
||
Alexander Smorkalov
|
004801f1c5 | Merge remote-tracking branch 'origin/3.4' into merge-3.4 | ||
dizcza
|
e625b32841 | [opencv 3.x] back-ported tbb support ubuntu 22.04 | ||
Sean McBride
|
57da72d444 |
Fixed invalid cast and unaligned memory access
Although acceptable to Intel CPUs, it's still undefined behaviour according to the C++ standard. It can be replaced with memcpy, which makes the code simpler, and it generates the same assembly code with gcc and clang with -O2 (verified with godbolt). Also expanded the test to include other little endian CPUs by testing for __LITTLE_ENDIAN__. |
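The memcpy replacement mentioned in this commit can be sketched as follows. `loadU32` is a hypothetical helper name; the snippet only illustrates the general idiom and is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from a possibly unaligned address.
// Casting p to const uint32_t* and dereferencing it would be undefined
// behaviour; memcpy into a properly aligned local is well defined, and
// optimizing compilers typically turn it into a single load.
std::uint32_t loadU32(const unsigned char* p) {
    std::uint32_t value;
    std::memcpy(&value, p, sizeof(value));
    return value;
}
```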
||
Pierre Chatelier
|
60b806f9b8
|
Merge pull request #22947 from chacha21:hasNonZero
Added cv::hasNonZero() #22947 `cv::hasNonZero()` is semantically equivalent to (`cv::countNonZero()>0`) but stops parsing the image when a non-zero value is found, for a performance gain - [X] I agree to contribute to the project under Apache 2 License. - [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [X] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake This pull request might be refused, but I submit it to know if further work is needed or if I just stop working on it. The idea is only a performance gain vs `countNonZero()>0` at the cost of more code. Reasons why it might be refused : - this is just more code - the execution time is "unfair"/"unpredictable" since it depends on the position of the first non-zero value - the user must be aware that default search is from first row/col to last row/col and has no way to customize that, even if his use case lets him know where a non zero could be found - the PR in its current state is using, for the ocl implementation, a mere `countNonZero()>0` ; there is not much sense in trying to break early the ocl kernel call when non-zero is encountered. So the ocl implementation does not bring any improvement. 
- there is no IPP function that can help (`countNonZero()` is based on `ippCountInRange`) - the PR in its current state might be slower than a call to `countNonZero()>0` in some cases (see "challenges" below) Reasons why it might be accepted: - the performance gain is huge on average, if we consider that "on average" means "non zero in the middle of the image" - the "missing" IPP implementation is replaced by an "OpenCV universal intrinsics" implementation - the PR in its current state is almost always faster than a call to `countNonZero()>0`, is only slightly slower in the worst cases, and not even for all matrices **Challenges** The worst case is either an all-zero matrix, or a non-zero at the very last position. In such a case, the `hasNonZero()` implementation will parse the whole matrix like `countNonZero()` would do. But we expect the performance to be the same in this case. And `ippCountInRange` is hard to beat! There is also the case of very small matrices (<=32x32...) in 8b, where the SIMD can be hard to feed. For all cases but the worst, my custom `hasNonZero()` performs better than `ippCountInRange()`. For the worst case, my custom `hasNonZero()` performs better than `ippCountInRange()` *except for large matrices of type CV_32S or CV_64F* (but surprisingly, not CV_32F). The difference is small, but it exists (and I don't understand why). For very small CV_8U matrices `ippCountInRange()` seems unbeatable. Here is the code that I use to check timings ``` //test cv::hasNonZero() vs (cv::countNonZero()>0) for different matrices sizes, types, strides... 
{
    cv::setRNGSeed(1234);
    const std::vector<cv::Size> sizes = {{32, 32}, {64, 64}, {128, 128}, {320, 240}, {512, 512}, {640, 480}, {1024, 768}, {2048, 2048}, {1031, 1000}};
    const std::vector<int> types = {CV_8U, CV_16U, CV_32S, CV_32F, CV_64F};
    const size_t iterations = 1000;
    for(const cv::Size& size : sizes)
    {
        for(const int type : types)
        {
            for(int c = 0 ; c<2 ; ++c)
            {
                const bool continuous = !c;
                for(int i = 0 ; i<4 ; ++i)
                {
                    // Non-continuous case: use the left half of a matrix that is twice as wide.
                    cv::Mat m = continuous ? cv::Mat::zeros(size, type)
                                           : cv::Mat(cv::Mat::zeros(cv::Size(2*size.width, size.height), type), cv::Rect(cv::Point(0, 0), size));
                    // i = 0, 1, 2: one non-zero pixel near the beginning, middle, or end; i = 3: all zeros.
                    const bool nz = (i <= 2);
                    const unsigned int nzOffsetRange = 10;
                    const unsigned int nzOffset = cv::randu<unsigned int>()%nzOffsetRange;
                    const cv::Point pos =
                        (i == 0) ? cv::Point(nzOffset, 0) :
                        (i == 1) ? cv::Point(size.width/2-nzOffsetRange/2+nzOffset, size.height/2) :
                        (i == 2) ? cv::Point(size.width-1-nzOffset, size.height-1) :
                        cv::Point(0, 0);
                    std::cout << "============================================================" << std::endl;
                    std::cout << "size:" << size << " type:" << type << " continuous = " << (continuous ? "true" : "false") << " iterations:" << iterations << " nz=" << (nz ? "true" : "false");
                    std::cout << " pos=" << ((i == 0) ? "begin" : (i == 1) ? "middle" : (i == 2) ? "end" : "none");
                    std::cout << std::endl;
                    cv::Mat mask = cv::Mat::zeros(size, CV_8UC1);
                    mask.at<unsigned char>(pos) = 0xFF;
                    m.setTo(cv::Scalar::all(0));
                    m.setTo(cv::Scalar::all(nz ? 1 : 0), mask);
                    std::vector<bool> results;
                    std::vector<double> timings;
                    {
                        // Time cv::hasNonZero().
                        bool res = false;
                        auto ref = cv::getTickCount();
                        for(size_t k = 0 ; k<iterations ; ++k)
                            res = cv::hasNonZero(m);
                        auto now = cv::getTickCount();
                        const bool error = (res != nz);
                        if (error)
                            printf("!!ERROR!!\r\n");
                        results.push_back(res);
                        timings.push_back(1000.*(now-ref)/cv::getTickFrequency());
                    }
                    {
                        // Time the equivalent check based on cv::countNonZero().
                        bool res = false;
                        auto ref = cv::getTickCount();
                        for(size_t k = 0 ; k<iterations ; ++k)
                            res = (cv::countNonZero(m)>0);
                        auto now = cv::getTickCount();
                        const bool error = (res != nz);
                        if (error)
                            printf("!!ERROR!!\r\n");
                        results.push_back(res);
                        timings.push_back(1000.*(now-ref)/cv::getTickFrequency());
                    }
                    // Print timings only when cv::hasNonZero() is not the fastest, or when a result is wrong.
                    const size_t bestTimingIndex = (std::min_element(timings.begin(), timings.end())-timings.begin());
                    if ((bestTimingIndex != 0) || (std::find_if_not(results.begin(), results.end(), [&](bool r) {return (r == nz);}) != results.end()))
                    {
                        std::cout << "cv::hasNonZero\t\t=>" << results[0] << ((results[0] != nz) ? " ERROR" : "") << " perf:" << timings[0] << "ms => " << (iterations/timings[0]*1000) << " im/s" << ((bestTimingIndex == 0) ? " * " : "") << std::endl;
                        std::cout << "cv::countNonZero\t=>" << results[1] << ((results[1] != nz) ? " ERROR" : "") << " perf:" << timings[1] << "ms => " << (iterations/timings[1]*1000) << " im/s" << ((bestTimingIndex == 1) ? " * " : "") << std::endl;
                    }
                }
            }
        }
    }
}
```

Here is the report from this benchmark (timings are printed only for the cases where `cv::countNonZero()` is faster than `cv::hasNonZero()`).

My CPU is an Intel Core i7-4790 @ 3.60 GHz.

```
============================================================
size:[32 x 32] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:0 continuous = false iterations:1000 nz=true pos=middle
cv::hasNonZero =>1 perf:0.353764ms => 2.82674e+06 im/s
cv::countNonZero =>1 perf:0.282044ms => 3.54555e+06 im/s *
============================================================
size:[32 x 32] type:0 continuous = false iterations:1000 nz=true pos=end
cv::hasNonZero =>1 perf:0.610478ms => 1.63806e+06 im/s
cv::countNonZero =>1 perf:0.283182ms => 3.5313e+06 im/s *
============================================================
size:[32 x 32] type:0 continuous = false iterations:1000 nz=false pos=none
cv::hasNonZero =>0 perf:0.630115ms => 1.58701e+06 im/s
cv::countNonZero =>0 perf:0.282044ms => 3.54555e+06 im/s *
============================================================
size:[32 x 32] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:5 continuous = false iterations:1000 nz=true pos=end
cv::hasNonZero =>1 perf:0.607347ms => 1.64651e+06 im/s
cv::countNonZero =>1 perf:0.467037ms => 2.14116e+06 im/s *
============================================================
size:[32 x 32] type:5 continuous = false iterations:1000 nz=false pos=none
cv::hasNonZero =>0 perf:0.618162ms => 1.6177e+06 im/s
cv::countNonZero =>0 perf:0.468175ms => 2.13595e+06 im/s *
============================================================
size:[32 x 32] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:5 continuous = false iterations:1000
nz=true pos=end ============================================================ size:[1024 x 768] type:5 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[1024 x 768] type:6 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[1024 x 768] type:6 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[1024 x 768] type:6 continuous = true iterations:1000 nz=true pos=end ============================================================ size:[1024 x 768] type:6 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[1024 x 768] type:6 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[1024 x 768] type:6 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[1024 x 768] type:6 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[1024 x 768] type:6 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[2048 x 2048] type:0 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:0 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:0 continuous = true iterations:1000 nz=true pos=end ============================================================ size:[2048 x 2048] type:0 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[2048 x 2048] type:0 continuous = false iterations:1000 nz=true pos=begin 
============================================================ size:[2048 x 2048] type:0 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:0 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[2048 x 2048] type:0 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[2048 x 2048] type:2 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:2 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:2 continuous = true iterations:1000 nz=true pos=end ============================================================ size:[2048 x 2048] type:2 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[2048 x 2048] type:2 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:2 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:2 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[2048 x 2048] type:2 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[2048 x 2048] type:4 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:4 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:4 continuous = true iterations:1000 nz=true pos=end cv::hasNonZero =>1 perf:895.381ms => 1116.84 im/s 
cv::countNonZero =>1 perf:882.569ms => 1133.06 im/s * ============================================================ size:[2048 x 2048] type:4 continuous = true iterations:1000 nz=false pos=none cv::hasNonZero =>0 perf:899.53ms => 1111.69 im/s cv::countNonZero =>0 perf:870.894ms => 1148.24 im/s * ============================================================ size:[2048 x 2048] type:4 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:4 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:4 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[2048 x 2048] type:4 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[2048 x 2048] type:5 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:5 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:5 continuous = true iterations:1000 nz=true pos=end ============================================================ size:[2048 x 2048] type:5 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[2048 x 2048] type:5 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:5 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:5 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[2048 x 2048] type:5 continuous = false iterations:1000 nz=false pos=none 
============================================================ size:[2048 x 2048] type:6 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:6 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:6 continuous = true iterations:1000 nz=true pos=end cv::hasNonZero =>1 perf:2018.92ms => 495.313 im/s cv::countNonZero =>1 perf:1966.37ms => 508.552 im/s * ============================================================ size:[2048 x 2048] type:6 continuous = true iterations:1000 nz=false pos=none cv::hasNonZero =>0 perf:2005.87ms => 498.537 im/s cv::countNonZero =>0 perf:1992.78ms => 501.812 im/s * ============================================================ size:[2048 x 2048] type:6 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[2048 x 2048] type:6 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[2048 x 2048] type:6 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[2048 x 2048] type:6 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:0 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[1031 x 1000] type:0 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:0 continuous = true iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:0 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:0 continuous = false iterations:1000 
nz=true pos=begin ============================================================ size:[1031 x 1000] type:0 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:0 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:0 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:2 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[1031 x 1000] type:2 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:2 continuous = true iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:2 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:2 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[1031 x 1000] type:2 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:2 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:2 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:4 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[1031 x 1000] type:4 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:4 continuous = true iterations:1000 nz=true pos=end 
============================================================ size:[1031 x 1000] type:4 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:4 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[1031 x 1000] type:4 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:4 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:4 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:5 continuous = true iterations:1000 nz=true pos=begin ============================================================ size:[1031 x 1000] type:5 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:5 continuous = true iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:5 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:5 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[1031 x 1000] type:5 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:5 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:5 continuous = false iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:6 continuous = true iterations:1000 nz=true pos=begin 
============================================================ size:[1031 x 1000] type:6 continuous = true iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:6 continuous = true iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:6 continuous = true iterations:1000 nz=false pos=none ============================================================ size:[1031 x 1000] type:6 continuous = false iterations:1000 nz=true pos=begin ============================================================ size:[1031 x 1000] type:6 continuous = false iterations:1000 nz=true pos=middle ============================================================ size:[1031 x 1000] type:6 continuous = false iterations:1000 nz=true pos=end ============================================================ size:[1031 x 1000] type:6 continuous = false iterations:1000 nz=false pos=none done ``` |
||
Dmitry Kurtaev
|
380caa1a87
|
Merge pull request #23691 from dkurt:pycv_float16_fixes
Import and export np.float16 in Python #23691 * Also, fixes `cv::norm` with `NORM_INF` and `CV_16F` resolves https://github.com/opencv/opencv/issues/23687 |
||
Alexander Smorkalov
|
65487946cc | Added final constraints check to solveLP to filter out floating-point numeric issues. | ||
Alexander Smorkalov
|
d4861bfd1f | Merge remote-tracking branch 'origin/3.4' into merge-3.4 | ||
cudawarped
|
7539abecdb | cuda: add python bindings to allow GpuMat and Stream objects to be initialized from raw pointers | ||
Alexander Alekhin
|
04d71da6e7 | Merge pull request #23566 from seanm:atomic-bool | ||
Sean McBride
|
27e10efa66 |
Use std::atomic<bool> as it's necessary for correct thread safety
Now that C++11 is required, we can unconditionally use std::atomic in this case, which is more correct. |
||
Pierre Chatelier
|
6dd8a9b6ad
|
Merge pull request #13879 from chacha21:REDUCE_SUM2
add REDUCE_SUM2 #13879 proposal to add REDUCE_SUM2 to cv::reduce, an operation that sums up the square of elements |
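The reduction described in this commit message (summing the squares of elements along one dimension) can be sketched in plain Python. This is only an illustration of the arithmetic, not OpenCV's implementation: the helper name `reduce_sum2` and the nested-list input are assumptions made for the example, and the `dim` convention (0 collapses to a single row, 1 collapses to a single column) is borrowed from `cv::reduce`. Unlike `cv::reduce`, the sketch returns a flat list in both cases.

```python
def reduce_sum2(matrix, dim):
    """Sum of squared elements along one dimension of a 2-D nested list.

    Hypothetical helper for illustration only.
    dim=0: collapse all rows into one row (one value per column).
    dim=1: collapse all columns into one column (one value per row).
    """
    if dim == 0:
        # zip(*matrix) iterates over columns
        return [sum(v * v for v in col) for col in zip(*matrix)]
    if dim == 1:
        return [sum(v * v for v in row) for row in matrix]
    raise ValueError("dim must be 0 or 1")


m = [[1, 2, 3],
     [4, 5, 6]]

per_column = reduce_sum2(m, 0)  # [1+16, 4+25, 9+36] = [17, 29, 45]
per_row = reduce_sum2(m, 1)     # [1+4+9, 16+25+36] = [14, 77]
```

Having this as a single reduction type avoids squaring the matrix into a temporary first and then reducing it with a plain sum.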