Commit Graph

34664 Commits

Author SHA1 Message Date
Liutong HAN
35571be570
Merge pull request #26318 from hanliutong:rvv-intrin-m2
Use LMUL=2 in the RISC-V Vector (RVV) backend of Universal Intrinsic. #26318

The modification of this patch involves the RVV backend of Universal Intrinsic, replacing `LMUL=1` with `LMUL=2`.

Now each Universal Intrinsic type actually corresponds to two RVV vector registers, and each Intrinsic function also operates two vector registers. Considering that algorithms written using Universal Intrinsic usually do not use the maximum number of registers, this can help the RVV backend utilize more register resources without modifying the algorithm implementation

This patch is generally beneficial in performance.

We compiled OpenCV with `Clang-19.1.1` and `GCC-14.2.0` , ran it on `CanMV-k230` and `Banana-Pi F3`. Then we have four scenarios on combinations of compilers and devices. In `opencv_perf_core`, there are 3363 cases, of which:
- 901 (26.8%) cases achieved more than `5%` performance improvement in all four scenarios, and the average speedup of these test cases (compared to scalar) increased from `3.35x` to `4.35x`
- 75 (2.2%) cases had more than `5%` performance loss in all four scenarios, indicating that these cases are better with `LMUL=1` instead of `LMUL=2`. This involves `Mat_Transform`, `hasNonZero`, `KMeans`, `meanStdDev`, `merge` and `norm2`. Among them, `Mat_Transform` only has performance degradation in a few cases (`8UC3`), and the actual execution time of `hasNonZero` is so short that it can be ignored. For `KMeans`, `meanStdDev`, `merge` and `norm2`, we should be able to use the HAL to optimize/restore their performance. (In fact, we have already done this for `merge`  #26216 )

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-24 10:08:43 +03:00
Alexander Smorkalov
331412dfad
Merge pull request #26357 from dkurt:dkurt/ov_out_names_from_graph
OpenVINO friendly output names from non-compiled Model
2024-10-23 13:42:01 +03:00
Dmitry Kurtaev
d193554a5f OpenVINO friendly output names from non-compiled Model 2024-10-23 09:29:05 +03:00
Alexander Smorkalov
898a2a3811
Merge pull request #26353 from asmorkalov:as/ade_1.2e
ADE update to 0.1.2e
2024-10-23 08:10:16 +03:00
Hardik Kamboj
9fc7ca8ed1
Update py_thresholding.markdown
Changed "If the pixel value is smaller than the threshold" to "If the pixel value is smaller than or equal to the threshold" to make the line align with the working of the code.
2024-10-23 09:49:23 +05:30
Alexander Smorkalov
983086411f ADE update to 0.1.2e 2024-10-22 17:45:00 +03:00
Alexander Smorkalov
57ccbee25d
Merge pull request #26245 from cudawarped:cuda_update_to_npp_stream_ctx
cuda - update npp calls to use the new NppStreamContext API if available
2024-10-22 14:44:42 +03:00
Kumataro
4398e0b62b
Merge pull request #26340 from Kumataro:wa26339
doc: fix the position of toggle button #26340 

Close #26339 

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-22 11:57:14 +03:00
Alexander Smorkalov
94d5ad09ff
Merge pull request #26284 from fzuuzf:enum_arithmetic_fixes_for_c++26
C++26 Deprecated Enum Arithmetic Conversion: Fix core/mat.inl.hpp
2024-10-21 15:47:53 +03:00
Alexander Smorkalov
e026a5ad8a
Merge pull request #26281 from kallaballa:clgl_device_discovery
Rewrote OpenCL-OpenGL-interop device discovery routine without extensions and with Apple support
2024-10-18 15:52:17 +03:00
Alexander Smorkalov
c79b72a838
Merge pull request #26335 from migueldaipre:4.x
fix: performance typo
2024-10-18 15:44:32 +03:00
Kumataro
35dbf32227
Merge pull request #26211 from Kumataro:fix26207
imgcodecs: implement imencodemulti() #26211

Close #26207
### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-10-18 14:44:55 +03:00
Miguel Daipré
888469a842
fix: performance typo 2024-10-18 08:37:32 -03:00
Septimiu Neaga
3919f33e21
Merge pull request #26293 from SeptimiuIoachimNeagaIntel:EISW-140103_optimization_flag
G-API: Introduce level optimization flag for ONNXRT backend #26293

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-10-17 10:22:08 +03:00
FantasqueX
489df18a13
Merge pull request #26313 from FantasqueX:ipp-warp-affine-border-value
Use border value in ipp version of warp affine #26313

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-17 08:50:30 +03:00
Alexander Smorkalov
d20c456ab7
Merge pull request #26320 from mshabunin:fix-cmake-in-list
build: set cmake policy for if(IN_LIST) support
2024-10-17 07:36:02 +03:00
Maksim Shabunin
8ba76e65e9 build: set cmake policy for if(IN_LIST) support 2024-10-16 22:40:47 +03:00
Suleyman TURKMEN
8e5dbc03fe
Merge pull request #26298 from sturkmen72:avif
Proposed solution for the issue 26297 #26298

closes #26297

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-14 11:23:02 +03:00
Alexander Smorkalov
1909ac8650
Merge pull request #26212 from jamacias:feature/TickMeter-lasttime
Enhance cv::TickMeter to be able to get the last elapsed time
2024-10-14 07:56:24 +03:00
Letu Ren
45b9398d68 Use generic SIMD in warpAffineBlocklineNN 2024-10-14 01:28:41 +08:00
Zach Lowry
08f7f13dfa
Merge pull request #26234 from zachlowry:apply-gcc6-fix-on-each-directory
Move the gcc6 compatibility check to occur on a per-directory basis, … #26234

Proposed fix for #26233 https://github.com/opencv/opencv/issues/26233

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-11 17:00:59 +03:00
kallaballa
3edcf410b6 more guarding 2024-10-11 02:18:14 +02:00
Alexander Smorkalov
0f234209da
Merge pull request #26278 from Quantizs:feature-create-face-recognizer-from-buffer
Added buffer-based model loading to FaceRecognizerSF
2024-10-10 17:17:00 +03:00
kallaballa
4cbb96b396 use new instead of malloc and guard it 2024-10-10 15:14:58 +02:00
kallaballa
50f6d54f87 renaming 2024-10-10 14:48:49 +02:00
Wanli
687e37e6a8
Merge pull request #25892 from WanliZhong:v_sincos
Add support for v_sin and v_cos (Sine and Cosine) #25892

This PR aims to implement `v_sincos(v_float16 x)`, `v_sincos(v_float32 x)` and `v_sincos(v_float64 x)`. 
Merged after https://github.com/opencv/opencv/pull/25891 and https://github.com/opencv/opencv/pull/26023

**NOTE:** 
Also, the patch changes already added `v_exp`, `v_log` and `v_erf` to pass parameters by reference instead of by value, to match API of other universal intrinsics.

TODO:
- [x] double and half float precision
- [x] tests for them
- [x] doc to explain the implementation

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-10 13:25:12 +03:00
Karsten Wiese
2a681bbb6b C++26 Deprecated Arithmetic Conversion: Fix core/mat.inl.hpp
Prefix enums with '+' to make clang c++26 add to them again.
2024-10-10 10:40:19 +02:00
kallaballa
63b5dee274 fixed bug: variable shadowing 2024-10-10 06:35:42 +02:00
kallaballa
8ba7389b21 properly size the devices array 2024-10-10 06:32:22 +02:00
kallaballa
885bbc643f renaming 2024-10-10 06:30:33 +02:00
kallaballa
dceeb47cd3 rewrote clgl device discovery 2024-10-10 00:02:56 +02:00
Alexander Smorkalov
69803e7b99
Merge pull request #26216 from hanliutong:rvv-hal-merge
Add the HAL implementation for the merge function on RISC-V Vector.
2024-10-09 17:07:57 +03:00
quantizs
e1b06371ad Added buffer-based model loading to FaceRecognizerSF
- Implemented a new `create` method in `FaceRecognizerSF` to allow model and configuration loading from memory buffers (std::vector<uchar>), similar to the existing functionality in `FaceDetectorYN`.
- Updated `face_recognize.cpp` with a new constructor in `FaceRecognizerSFImpl` that supports buffer-based loading for both model weights and network configuration.
- Ensured compatibility with both file-based and buffer-based model loading by maintaining consistent backend and target settings across both constructors.
- This change improves flexibility, allowing FaceRecognizerSF to be instantiated from memory buffers, which is useful for dynamic model loading scenarios such as embedded systems or applications where models are loaded in-memory.
2024-10-09 15:13:47 +02:00
Suleyman TURKMEN
e72efd0d32
Merge pull request #26260 from sturkmen72:upd_doc_4_x
Update Documentation #26260

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-09 09:09:51 +03:00
george
cefde84a76
Merge pull request #25909 from gblikas:patch-1
Update intrin_wasm.hpp #25909

See https://github.com/microsoft/vcpkg/issues/33443 for some build context when using 

```vcpkg install opencv4:wasm32-emscripten```

`__EMSCRIPTEN_major__`, `__EMSCRIPTEN_minor__` and `__EMSCRIPTEN_tiny__` in `emsdk` >= 3.1.4 are in a header, as opposed to command line. 

We could potentially be more aggressive with how I'm checking this property; let me know if I should make the change. 

It should also be suggested that `-msimd128` is auto-included in the associated portfile for opencv, but that's a separate issue. Someone let me know if I should also make that change as well. 

Special thanks to https://github.com/youar for supporting this work; please inform if applying a copyright-header is appropriate attribution.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-10-09 08:36:10 +03:00
Alexander Smorkalov
7d9014e09e
Merge pull request #26263 from mlourakis:4.x
inversion checks
2024-10-08 20:50:15 +03:00
Kumataro
40428d919d
Merge pull request #26259 from Kumataro:fix26258
core: C-API cleanup: RNG algorithms in core(4.x) #26259

- replace CV_RAND_UNI and NORMAL to cv::RNG::UNIFORM and cv::RNG::NORMAL.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-10-08 15:55:00 +03:00
Alexander Smorkalov
28efc21530
Merge pull request #26187 from inayd:26130-fixFillPolyBoundaries
Fix fillPoly drawing over boundaries
2024-10-07 17:13:03 +03:00
Alexander Smorkalov
cda9f4197e
Merge pull request #26266 from mshabunin:fix-rvv071-build
RISC-V: fix build with RVV 0.7.1
2024-10-07 16:14:01 +03:00
Maksim Shabunin
73d68f3f49 RISC-V: fix build with RVV 0.7.1 2024-10-07 12:53:23 +03:00
Manolis Lourakis
fa6d6520c7
inversion checks
Extra checks for corner cases in 3x3 matrix inversion
2024-10-06 17:24:15 +03:00
cudawarped
e375d5786b cuda - update npp calls to use the new NppStreamContext API if available 2024-10-03 15:13:04 +03:00
Alexander Smorkalov
3901426d85
Merge pull request #26241 from asmorkalov:as/kelidicv-0.2
Updated KleidiCV HAL to version 0.2. #26241

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-03 15:04:25 +03:00
Alexander Smorkalov
ae1fb8c033
Merge pull request #26224 from mshabunin:cpp-videoio-backport
C-API cleanup: backport videoio changes from 5.x
2024-10-03 14:41:20 +03:00
Wanli
783fe72756
Resolve Compilation Error for v_func Function in SIMD Emulator (#25891)
* use 2 parms for now to identify the error

* Revert "use 2 parms for now to identify the error"

This reverts commit 86faf993a7.

* replace += with =

* add v_log ref

* refactor intrin_math code

* Add include guard to `intrin_math.hpp` to prevent multiple inclusions

* rename VX to V; make fp64 impl in neon be optional

* add v_setall, v_setzero for all backends; rewrite the intrin_math

* fix error on rvv_scalable

* let v_erf use v_exp_default_32f function

* 1. replaced 'v_setzero(VecType dummy)' with 'v_setzero_<VecType>()'
2. replaced 'v_setall(LaneType x, VecType dummy)' with 'v_setall_<VecType>(LaneType x)'
3. added tests for the new v_setzero_<> and v_setall_<>.

* gcc does not seem to like static_assert in functions even when they are not used

* trying to fix compile errors in Debug mode on Linux

---------

Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
2024-10-02 21:28:48 +03:00
Alexander Smorkalov
73b3b24c56
Merge pull request #26236 from asmorkalov:as/HAL_pyrlk_hack_documentation
Added HAL documentation note for out-of-bound hack in optical flow LK.
2024-10-02 17:49:09 +03:00
Alexander Smorkalov
1aa325a460 Added HAL documentation note for out-of-bound hack in optical flow LK. 2024-10-02 12:38:25 +03:00
Alexander Smorkalov
292ee28913
Merge pull request #26230 from mshabunin:cpp-photo-4x
C-API cleanup: inpaint algorithms in photo (4.x)
2024-10-02 08:13:39 +03:00
Alexander Smorkalov
b8eed54ced
Merge pull request #26228 from mshabunin:cpp-features2d-4x
C-API cleanup: use AutoBuffer in MSER (4.x)
2024-10-02 08:11:28 +03:00
inayd
93a882d2e2 Fix fillPoly drawing over boundaries 2024-10-01 21:17:42 +02:00