opencv/modules
Liutong HAN 35571be570
Merge pull request #26318 from hanliutong:rvv-intrin-m2
Use LMUL=2 in the RISC-V Vector (RVV) backend of Universal Intrinsic. #26318

This patch changes the RVV backend of Universal Intrinsic to use `LMUL=2` instead of `LMUL=1`.

Each Universal Intrinsic type now corresponds to two RVV vector registers, and each intrinsic function operates on two vector registers. Because algorithms written with Universal Intrinsic usually do not use the maximum number of registers, this lets the RVV backend use more of the register file without modifying the algorithm implementations.
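As a rough illustration of what the register grouping means for lane counts, the sketch below models an RVV operand as `VLEN * LMUL / SEW` lanes. It is not OpenCV code: the helper names `lanes` and `register_groups` are hypothetical, and `VLEN = 128` is only an example value, not a claim about any particular device.

```cpp
#include <cstddef>

// Illustrative model only; these helpers are hypothetical and not part of OpenCV.
// vlen_bits: width of one hardware vector register in bits (example value below).
// lmul:      number of vector registers grouped into one operand.
// sew_bits:  width of one element in bits.
constexpr std::size_t lanes(std::size_t vlen_bits, std::size_t lmul, std::size_t sew_bits) {
    return vlen_bits * lmul / sew_bits;
}

// RVV defines 32 architectural vector registers, so grouping them by LMUL
// leaves 32 / LMUL independently addressable register groups.
constexpr std::size_t register_groups(std::size_t lmul) {
    return 32 / lmul;
}

// With an assumed VLEN of 128 bits, a 32-bit-element vector doubles from 4 to 8 lanes.
static_assert(lanes(128, 1, 32) == 4, "LMUL=1: one register holds 4 x 32-bit lanes");
static_assert(lanes(128, 2, 32) == 8, "LMUL=2: two registers hold 8 x 32-bit lanes");
// The number of usable register groups halves from 32 to 16.
static_assert(register_groups(1) == 32, "LMUL=1 exposes 32 groups");
static_assert(register_groups(2) == 16, "LMUL=2 exposes 16 groups");
```

The trade-off is visible in the two helpers: each operation covers twice as many elements, while code that previously fit in 32 registers now has 16 groups to work with.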

Overall, this patch improves performance.

We compiled OpenCV with `Clang-19.1.1` and `GCC-14.2.0` and ran it on `CanMV-k230` and `Banana-Pi F3`, which gives four compiler/device scenarios. `opencv_perf_core` contains 3363 cases, of which:
- 901 (26.8%) cases achieved more than `5%` performance improvement in all four scenarios, and the average speedup of these test cases (compared to scalar) increased from `3.35x` to `4.35x`
- 75 (2.2%) cases had more than `5%` performance loss in all four scenarios, indicating that these cases perform better with `LMUL=1` than with `LMUL=2`. This involves `Mat_Transform`, `hasNonZero`, `KMeans`, `meanStdDev`, `merge` and `norm2`. Among them, `Mat_Transform` degrades only in a few cases (`8UC3`), and the actual execution time of `hasNonZero` is so short that the loss can be ignored. For `KMeans`, `meanStdDev`, `merge` and `norm2`, we should be able to use the HAL to optimize/restore their performance. (In fact, we have already done this for `merge` in #26216.)
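The figures above can be sanity-checked with a few lines of arithmetic. The sketch below uses hypothetical helper names (`share_percent`, `relative_gain_percent`) and only the numbers quoted in this description. Note that it takes the ratio of the two reported average speedups, which is not necessarily the same as the average of per-case ratios.

```cpp
// Hypothetical helpers for checking the reported figures; not part of OpenCV.

// Percentage of `cases` among `total` test cases.
constexpr double share_percent(int cases, int total) {
    return 100.0 * cases / total;
}

// Relative gain, in percent, when an average speedup moves from `before` to `after`.
constexpr double relative_gain_percent(double before, double after) {
    return 100.0 * (after / before - 1.0);
}

// 901 of 3363 cases is about 26.8%; 75 of 3363 is about 2.2%.
static_assert(share_percent(901, 3363) > 26.7 && share_percent(901, 3363) < 26.9,
              "improved share is about 26.8%");
static_assert(share_percent(75, 3363) > 2.2 && share_percent(75, 3363) < 2.3,
              "regressed share is about 2.2%");
// Going from 3.35x to 4.35x over scalar is roughly a 30% gain for those cases.
static_assert(relative_gain_percent(3.35, 4.35) > 29.8 &&
              relative_gain_percent(3.35, 4.35) < 29.9,
              "ratio of the two averages is about 1.30");
```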

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-24 10:08:43 +03:00

| Module | Last commit | Date |
| --- | --- | --- |
| calib3d | inversion checks | 2024-10-06 17:24:15 +03:00 |
| core | Merge pull request #26318 from hanliutong:rvv-intrin-m2 | 2024-10-24 10:08:43 +03:00 |
| dnn | OpenVINO friendly output names from non-compiled Model | 2024-10-23 09:29:05 +03:00 |
| features2d | C-API cleanup: use AutoBuffer in MSER | 2024-10-01 18:44:22 +03:00 |
| flann | Prevent signed integer overflow in LshTable | 2024-05-24 23:47:36 +02:00 |
| gapi | ADE update to 0.1.2e | 2024-10-22 17:45:00 +03:00 |
| highgui | highgui: Make GThread mandatory with GTK | 2024-07-15 16:30:39 +02:00 |
| imgcodecs | Merge pull request #26211 from Kumataro:fix26207 | 2024-10-18 14:44:55 +03:00 |
| imgproc | Merge pull request #26313 from FantasqueX:ipp-warp-affine-border-value | 2024-10-17 08:50:30 +03:00 |
| java | Merge pull request #26009 from alexlyulkov:al/unify-build-gradle | 2024-08-21 09:24:28 +03:00 |
| js | fix: performance typo | 2024-10-18 08:37:32 -03:00 |
| ml | speedup random forest getVotes method. | 2024-08-20 15:35:59 +07:00 |
| objc | fix: resolve Swift method name conflicts by adding missing namespace | 2024-07-18 00:20:17 +08:00 |
| objdetect | Added buffer-based model loading to FaceRecognizerSF | 2024-10-09 15:13:47 +02:00 |
| photo | C-API cleanup: inpaint algorithms in photo | 2024-10-01 20:10:35 +03:00 |
| python | Merge pull request #25643 from cpoerschke:issue-25635-find-existing-file-tests | 2024-08-05 15:28:16 +03:00 |
| stitching | Merge pull request #26022 from Kumataro:fix26016 | 2024-08-23 12:35:13 +03:00 |
| ts | ts: add some missing override markers | 2024-09-13 12:48:05 +03:00 |
| video | Added HAL documentation note for out-of-bound hack in optical flow LK. | 2024-10-02 12:38:25 +03:00 |
| videoio | C-API cleanup: backport videoio changes from 5.x | 2024-10-01 17:06:08 +03:00 |
| world | cmake: use /INCREMENTAL:NO with MSVS 2015 | 2023-12-07 19:46:27 +00:00 |