opencv

mirror of https://github.com/opencv/opencv.git synced 2025-07-24 14:06:27 +08:00

Author	SHA1	Message	Date
Alexander Smorkalov	50072f8d4f	Merge pull request #27089 from amane-ame:hist_hal_rvv Add RISC-V HAL implementation for cv::equalizeHist	2025-03-20 12:21:00 +03:00
天音あめ	46fbe1895a	Merge pull request #27096 from amane-ame:moments_hal_rvv Add RISC-V HAL implementation for cv::moments #27096 This patch implements `cv_hal_imageMoments` using native intrinsics, optimizing the performance of `cv::moments` for data types `CV_16U/CV_16S/CV_32F/CV_64F`. Tested on MUSE-PI (Spacemit X60) for both gcc 14.2 and clang 20.0. ``` $ ./opencv_test_imgproc --gtest_filter="Moments" $ ./opencv_perf_imgproc --gtest_filter="Moments" --perf_min_samples=1000 --perf_force_samples=1000 ``` ![image](https://github.com/user-attachments/assets/0efbae10-c022-4f15-a81c-682514cdb372) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-20 10:50:06 +03:00
amane-ame	b902a8e792	Add equalize_hist. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-18 15:53:05 +08:00
Yuantao Feng	8207549638	Merge pull request #26991 from fengyuentau:4x/core/norm2hal_rvv core: improve norm of hal rvv #26991 Merge with https://github.com/opencv/opencv_extra/pull/1241 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-18 09:42:55 +03:00
天音あめ	0142231e4d	Merge pull request #27072 from amane-ame:thresh_hal_rvv Add RISC-V HAL implementation for cv::threshold and cv::adaptiveThreshold #27072 This patch implements `cv_hal_threshold_otsu` and `cv_hal_adaptiveThreshold` using native intrinsics, optimizing the performance of `cv::threshold(THRESH_OTSU)` and `cv::adaptiveThreshold`. Since UI is as fast as HAL `cv_hal_rvv::threshold::threshold` so `cv_hal_threshold` is not redirected, but this part of HAL is keeped because `cv_hal_threshold_otsu` depends on it. Tested on MUSE-PI (Spacemit X60) for both gcc 14.2 and clang 20.0. ``` $ ./opencv_test_imgproc --gtest_filter="thresh:Thresh" $ ./opencv_perf_imgproc --gtest_filter="otsu:adaptiveThreshold" --perf_min_samples=1000 --perf_force_samples=1000 ``` ![image](https://github.com/user-attachments/assets/4bb953f8-8589-4af1-8f1c-99e2c506be3c) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-18 09:24:00 +03:00
GenshinImpactStarts	2090407002	Merge pull request #26999 from GenshinImpactStarts:polar_to_cart [HAL RVV] unify and impl polar_to_cart \| add perf test #26999 ### Summary 1. Implement through the existing `cv_hal_polarToCart32f` and `cv_hal_polarToCart64f` interfaces. 2. Add `polarToCart` performance tests 3. Make `cv::polarToCart` use CALL_HAL in the same way as `cv::cartToPolar` 4. To achieve the 3rd point, the original implementation was moved, and some modifications were made. Tested through: ```sh opencv_test_core --gtest_filter="PolarToCart:Core_CartPolar_reverse" opencv_perf_core --gtest_filter="PolarToCart" --perf_min_samples=300 --perf_force_samples=300 ``` ### HAL performance test *UPDATE: Current implementation is no more depending on vlen. NOTE: Due to the 4th point in the summary above, the `scalar` and `ui` test is based on the modified code of this PR. The impact of this patch on `scalar` and `ui` is evaluated in the next section, `Effect of Point 4`. Vlen 256 (Muse Pi): ``` Name of Test scalar ui rvv ui rvv vs vs scalar scalar (x-factor) (x-factor) PolarToCart::PolarToCartFixture::(127x61, 32FC1) 0.315 0.110 0.034 2.85 9.34 PolarToCart::PolarToCartFixture::(127x61, 64FC1) 0.423 0.163 0.045 2.59 9.34 PolarToCart::PolarToCartFixture::(640x480, 32FC1) 13.695 4.325 1.278 3.17 10.71 PolarToCart::PolarToCartFixture::(640x480, 64FC1) 17.719 7.118 2.105 2.49 8.42 PolarToCart::PolarToCartFixture::(1280x720, 32FC1) 40.678 13.114 3.977 3.10 10.23 PolarToCart::PolarToCartFixture::(1280x720, 64FC1) 53.124 21.298 6.519 2.49 8.15 PolarToCart::PolarToCartFixture::(1920x1080, 32FC1) 95.158 29.465 8.894 3.23 10.70 PolarToCart::PolarToCartFixture::(1920x1080, 64FC1) 119.262 47.743 14.129 2.50 8.44 ``` ### Effect of Point 4 To make `cv::polarToCart` behave the same as `cv::cartToPolar`, the implementation detail of the former has been moved to the latter's location (from `mathfuncs.cpp` to `mathfuncs_core.simd.hpp`). #### Reason for Changes: This function works as follows: $y = \text{mag} \times \sin(\text{angle})$ and $x = \text{mag} \times \cos(\text{angle})$. The original implementation first calculates the values of $\sin$ and $\cos$, storing the results in the output buffers $x$ and $y$, and then multiplies the result by $\text{mag}$. However, when the function is used as an in-place operation (one of the output buffers is also an input buffer), the original implementation allocates an extra buffer to store the $\sin$ and $\cos$ values in case the $\text{mag}$ value gets overwritten. This extra buffer allocation prevents `cv::polarToCart` from functioning in the same way as `cv::cartToPolar`. Therefore, the multiplication is now performed immediately without storing intermediate values. Since the original implementation also had AVX2 optimizations, I have applied the same optimizations to the AVX2 version of this implementation. UPDATE*: UI use v_sincos from #25892 now. The original implementation has AVX2 optimizations but is slower much than current UI so it's removed, and AVX2 perf test is below. Scalar implementation isn't changed because it's faster than using UI's method. #### Test Result `scalar` and `ui` test is done on Muse PI, and AVX2 test is done on Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz. `scalar` test: ``` Name of Test orig pr pr vs orig (x-factor) PolarToCart::PolarToCartFixture::(127x61, 32FC1) 0.333 0.294 1.13 PolarToCart::PolarToCartFixture::(127x61, 64FC1) 0.385 0.403 0.96 PolarToCart::PolarToCartFixture::(640x480, 32FC1) 14.749 12.343 1.19 PolarToCart::PolarToCartFixture::(640x480, 64FC1) 19.419 16.743 1.16 PolarToCart::PolarToCartFixture::(1280x720, 32FC1) 44.155 37.822 1.17 PolarToCart::PolarToCartFixture::(1280x720, 64FC1) 62.108 50.358 1.23 PolarToCart::PolarToCartFixture::(1920x1080, 32FC1) 99.011 85.769 1.15 PolarToCart::PolarToCartFixture::(1920x1080, 64FC1) 127.740 112.874 1.13 ``` `ui` test: ``` Name of Test orig pr pr vs orig (x-factor) PolarToCart::PolarToCartFixture::(127x61, 32FC1) 0.306 0.110 2.77 PolarToCart::PolarToCartFixture::(127x61, 64FC1) 0.455 0.163 2.79 PolarToCart::PolarToCartFixture::(640x480, 32FC1) 13.381 4.325 3.09 PolarToCart::PolarToCartFixture::(640x480, 64FC1) 21.851 7.118 3.07 PolarToCart::PolarToCartFixture::(1280x720, 32FC1) 39.975 13.114 3.05 PolarToCart::PolarToCartFixture::(1280x720, 64FC1) 67.006 21.298 3.15 PolarToCart::PolarToCartFixture::(1920x1080, 32FC1) 90.362 29.465 3.07 PolarToCart::PolarToCartFixture::(1920x1080, 64FC1) 129.637 47.743 2.72 ``` AVX2 test: ``` Name of Test orig pr pr vs orig (x-factor) PolarToCart::PolarToCartFixture::(127x61, 32FC1) 0.019 0.009 2.11 PolarToCart::PolarToCartFixture::(127x61, 64FC1) 0.022 0.013 1.74 PolarToCart::PolarToCartFixture::(640x480, 32FC1) 0.788 0.355 2.22 PolarToCart::PolarToCartFixture::(640x480, 64FC1) 1.102 0.618 1.78 PolarToCart::PolarToCartFixture::(1280x720, 32FC1) 2.383 1.042 2.29 PolarToCart::PolarToCartFixture::(1280x720, 64FC1) 3.758 2.316 1.62 PolarToCart::PolarToCartFixture::(1920x1080, 32FC1) 5.577 2.559 2.18 PolarToCart::PolarToCartFixture::(1920x1080, 64FC1) 9.710 6.424 1.51 ``` A slight performance loss occurs because the check for whether $mag$ is nullptr is performed with every calculation, instead of being done once per batch. This is to reuse current `SinCos_32f` function. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-17 14:16:09 +03:00
Alexander Smorkalov	0a39f98bee	Merge pull request #27067 from amane-ame:sepfilter_optimize Optimize RISC-V HAL cv::sepFilter	2025-03-17 09:21:33 +03:00
Liutong HAN	6eaaaa410e	Merge pull request #27056 from hanliutong:rvv-hal-copyright [RVV HAL] Add copyright and replace '#pragma once'. #27056 Add copyright and in RVV HAL, since other companies or teams may join the development and add their copyright. And the '#pragma once' are replaced. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-15 17:25:31 +03:00
amane-ame	2c16f3b7d2	Optimize cv::sepFilter. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-14 18:33:57 +08:00
GenshinImpactStarts	2a8d4b8e43	Merge pull request #27000 from GenshinImpactStarts:cart_to_polar [HAL RVV] reuse atan \| impl cart_to_polar \| add perf test #27000 Implement through the existing `cv_hal_cartToPolar32f` and `cv_hal_cartToPolar64f` interfaces. Add `cartToPolar` performance tests. cv_hal_rvv::fast_atan is modified to make it more reusable because it's needed in cartToPolar. UPDATE: UI enabled. Since the vec type of RVV can't be stored in struct. UI implementation of `v_atan_f32` is modified. Both `fastAtan` and `cartToPolar` are affected so the test result for `atan` is also appended. I have tested the modified UI on RVV and AVX2 and no regressions appears. Perf test done on MUSE-PI. AVX2 test done on Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz. ```sh $ opencv_test_core --gtest_filter="CartToPolar:Core_CartPolar_reverse:Phase" $ opencv_perf_core --gtest_filter="CartToPolar:phase" --perf_min_samples=300 --perf_force_samples=300 ``` Test result between enabled UI and HAL: ``` Name of Test ui rvv rvv vs ui (x-factor) CartToPolar::CartToPolarFixture::(127x61, 32FC1) 0.106 0.059 1.80 CartToPolar::CartToPolarFixture::(127x61, 64FC1) 0.155 0.070 2.20 CartToPolar::CartToPolarFixture::(640x480, 32FC1) 4.188 2.317 1.81 CartToPolar::CartToPolarFixture::(640x480, 64FC1) 6.593 2.889 2.28 CartToPolar::CartToPolarFixture::(1280x720, 32FC1) 12.600 7.057 1.79 CartToPolar::CartToPolarFixture::(1280x720, 64FC1) 19.860 8.797 2.26 CartToPolar::CartToPolarFixture::(1920x1080, 32FC1) 28.295 15.809 1.79 CartToPolar::CartToPolarFixture::(1920x1080, 64FC1) 44.573 19.398 2.30 phase32f::VectorLength::128 0.002 0.002 1.20 phase32f::VectorLength::1000 0.008 0.006 1.32 phase32f::VectorLength::131072 1.061 0.731 1.45 phase32f::VectorLength::524288 3.997 2.976 1.34 phase32f::VectorLength::1048576 8.001 5.959 1.34 phase64f::VectorLength::128 0.002 0.002 1.33 phase64f::VectorLength::1000 0.012 0.008 1.58 phase64f::VectorLength::131072 1.648 0.931 1.77 phase64f::VectorLength::524288 6.836 3.837 1.78 phase64f::VectorLength::1048576 14.060 7.540 1.86 ``` Test result before and after enabling UI on RVV: ``` Name of Test perf perf perf ui ui ui orig pr pr vs perf ui orig (x-factor) CartToPolar::CartToPolarFixture::(127x61, 32FC1) 0.141 0.106 1.33 CartToPolar::CartToPolarFixture::(127x61, 64FC1) 0.187 0.155 1.20 CartToPolar::CartToPolarFixture::(640x480, 32FC1) 5.990 4.188 1.43 CartToPolar::CartToPolarFixture::(640x480, 64FC1) 8.370 6.593 1.27 CartToPolar::CartToPolarFixture::(1280x720, 32FC1) 18.214 12.600 1.45 CartToPolar::CartToPolarFixture::(1280x720, 64FC1) 25.365 19.860 1.28 CartToPolar::CartToPolarFixture::(1920x1080, 32FC1) 40.437 28.295 1.43 CartToPolar::CartToPolarFixture::(1920x1080, 64FC1) 56.699 44.573 1.27 phase32f::VectorLength::128 0.003 0.002 1.54 phase32f::VectorLength::1000 0.016 0.008 1.90 phase32f::VectorLength::131072 2.048 1.061 1.93 phase32f::VectorLength::524288 8.219 3.997 2.06 phase32f::VectorLength::1048576 16.426 8.001 2.05 phase64f::VectorLength::128 0.003 0.002 1.44 phase64f::VectorLength::1000 0.020 0.012 1.60 phase64f::VectorLength::131072 2.621 1.648 1.59 phase64f::VectorLength::524288 10.780 6.836 1.58 phase64f::VectorLength::1048576 22.723 14.060 1.62 ``` Test result before and after modifying UI on AVX2: ``` Name of Test perf perf perf avx2 avx2 avx2 orig pr pr vs perf avx2 orig (x-factor) CartToPolar::CartToPolarFixture::(127x61, 32FC1) 0.006 0.005 1.14 CartToPolar::CartToPolarFixture::(127x61, 64FC1) 0.010 0.009 1.08 CartToPolar::CartToPolarFixture::(640x480, 32FC1) 0.273 0.264 1.03 CartToPolar::CartToPolarFixture::(640x480, 64FC1) 0.511 0.487 1.05 CartToPolar::CartToPolarFixture::(1280x720, 32FC1) 0.760 0.723 1.05 CartToPolar::CartToPolarFixture::(1280x720, 64FC1) 2.009 1.937 1.04 CartToPolar::CartToPolarFixture::(1920x1080, 32FC1) 1.996 1.923 1.04 CartToPolar::CartToPolarFixture::(1920x1080, 64FC1) 5.721 5.509 1.04 phase32f::VectorLength::128 0.000 0.000 0.98 phase32f::VectorLength::1000 0.001 0.001 0.97 phase32f::VectorLength::131072 0.105 0.111 0.95 phase32f::VectorLength::524288 0.402 0.402 1.00 phase32f::VectorLength::1048576 0.775 0.767 1.01 phase64f::VectorLength::128 0.000 0.000 1.00 phase64f::VectorLength::1000 0.001 0.001 1.01 phase64f::VectorLength::131072 0.163 0.162 1.01 phase64f::VectorLength::524288 0.669 0.653 1.02 phase64f::VectorLength::1048576 1.660 1.634 1.02 ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-13 15:56:56 +03:00
GenshinImpactStarts	e30697fd42	Merge pull request #27002 from GenshinImpactStarts:magnitude [HAL RVV] impl magnitude \| add perf test #27002 Implement through the existing `cv_hal_magnitude32f` and `cv_hal_magnitude64f` interfaces. UPDATE: UI is enabled. The only difference between UI and HAL now is HAL use a approximate `sqrt`. Perf test done on MUSE-PI. ```sh $ opencv_test_core --gtest_filter="Magnitude" $ opencv_perf_core --gtest_filter="Magnitude" --perf_min_samples=300 --perf_force_samples=300 ``` Test result between enabled UI and HAL: ``` Name of Test ui rvv rvv vs ui (x-factor) Magnitude::MagnitudeFixture::(127x61, 32FC1) 0.029 0.016 1.75 Magnitude::MagnitudeFixture::(127x61, 64FC1) 0.057 0.036 1.57 Magnitude::MagnitudeFixture::(640x480, 32FC1) 1.063 0.648 1.64 Magnitude::MagnitudeFixture::(640x480, 64FC1) 2.261 1.530 1.48 Magnitude::MagnitudeFixture::(1280x720, 32FC1) 3.261 2.118 1.54 Magnitude::MagnitudeFixture::(1280x720, 64FC1) 6.802 4.682 1.45 Magnitude::MagnitudeFixture::(1920x1080, 32FC1) 7.287 4.738 1.54 Magnitude::MagnitudeFixture::(1920x1080, 64FC1) 15.226 10.334 1.47 ``` Test result before and after enabling UI: ``` Name of Test orig pr pr vs orig (x-factor) Magnitude::MagnitudeFixture::(127x61, 32FC1) 0.032 0.029 1.11 Magnitude::MagnitudeFixture::(127x61, 64FC1) 0.067 0.057 1.17 Magnitude::MagnitudeFixture::(640x480, 32FC1) 1.228 1.063 1.16 Magnitude::MagnitudeFixture::(640x480, 64FC1) 2.786 2.261 1.23 Magnitude::MagnitudeFixture::(1280x720, 32FC1) 3.762 3.261 1.15 Magnitude::MagnitudeFixture::(1280x720, 64FC1) 8.549 6.802 1.26 Magnitude::MagnitudeFixture::(1920x1080, 32FC1) 8.408 7.287 1.15 Magnitude::MagnitudeFixture::(1920x1080, 64FC1) 18.884 15.226 1.24 ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-13 08:34:11 +03:00
GenshinImpactStarts	60de3ff24f	Merge pull request #27015 from GenshinImpactStarts:sqrt [HAL RVV] impl sqrt and invSqrt #27015 Implement through the existing interfaces `cv_hal_sqrt32f`, `cv_hal_sqrt64f`, `cv_hal_invSqrt32f`, `cv_hal_invSqrt64f`. Perf test done on MUSE-PI and CanMV K230. Because the performance of scalar is much worse than universal intrinsic, only ui and hal rvv is compared. In RVV's UI, `invSqrt` is computed using `1 / sqrt()`. This patch first uses `frsqrt` and then applies the Newton-Raphson method to achieve higher precision. For the initial value, I tried using the famous [fast inverse square root algorithm](https://en.wikipedia.org/wiki/Fast_inverse_square_root), which involves one bit shift and one subtraction. However, on both MUSE-PI and CanMV K230, the performance was slightly lower (about 3%), so I chose to use `frsqrt` for the initial value instead. BTW, I think this patch can directly replace RVV's UI. UPDATE: Due to strange vector registers allocation strategy in clang, for `invSqrt`, clang use LMUL m4 while gcc use LMUL m8, which leads to some performance loss in clang. So the test for clang is appended. ```sh $ opencv_test_core --gtest_filter="Core_HAL/mathfuncs." $ opencv_perf_core --gtest_filter="SqrtFixture." --perf_min_samples=300 --perf_force_samples=300 ``` CanMV K230: ``` Name of Test ui rvv rvv vs ui (x-factor) Sqrt::SqrtFixture::(127x61, 5, false) 0.052 0.027 1.96 Sqrt::SqrtFixture::(127x61, 5, true) 0.101 0.026 3.80 Sqrt::SqrtFixture::(127x61, 6, false) 0.106 0.059 1.79 Sqrt::SqrtFixture::(127x61, 6, true) 0.207 0.058 3.55 Sqrt::SqrtFixture::(640x480, 5, false) 1.988 0.956 2.08 Sqrt::SqrtFixture::(640x480, 5, true) 3.920 0.948 4.13 Sqrt::SqrtFixture::(640x480, 6, false) 4.179 2.342 1.78 Sqrt::SqrtFixture::(640x480, 6, true) 8.220 2.290 3.59 Sqrt::SqrtFixture::(1280x720, 5, false) 5.969 2.881 2.07 Sqrt::SqrtFixture::(1280x720, 5, true) 11.731 2.857 4.11 Sqrt::SqrtFixture::(1280x720, 6, false) 12.533 7.031 1.78 Sqrt::SqrtFixture::(1280x720, 6, true) 24.643 6.917 3.56 Sqrt::SqrtFixture::(1920x1080, 5, false) 13.423 6.483 2.07 Sqrt::SqrtFixture::(1920x1080, 5, true) 26.379 6.436 4.10 Sqrt::SqrtFixture::(1920x1080, 6, false) 28.200 15.833 1.78 Sqrt::SqrtFixture::(1920x1080, 6, true) 55.434 15.565 3.56 ``` MUSE-PI: ``` GCC \| clang Name of Test ui rvv rvv \| ui rvv rvv vs \| vs ui \| ui (x-factor) \| (x-factor) Sqrt::SqrtFixture::(127x61, 5, false) 0.027 0.018 1.46 \| 0.027 0.016 1.65 Sqrt::SqrtFixture::(127x61, 5, true) 0.050 0.017 2.98 \| 0.050 0.017 2.99 Sqrt::SqrtFixture::(127x61, 6, false) 0.053 0.031 1.72 \| 0.052 0.032 1.64 Sqrt::SqrtFixture::(127x61, 6, true) 0.100 0.030 3.31 \| 0.101 0.035 2.86 Sqrt::SqrtFixture::(640x480, 5, false) 0.955 0.483 1.98 \| 0.959 0.499 1.92 Sqrt::SqrtFixture::(640x480, 5, true) 1.873 0.489 3.83 \| 1.873 0.520 3.60 Sqrt::SqrtFixture::(640x480, 6, false) 2.027 1.163 1.74 \| 2.037 1.218 1.67 Sqrt::SqrtFixture::(640x480, 6, true) 3.961 1.153 3.44 \| 3.961 1.341 2.95 Sqrt::SqrtFixture::(1280x720, 5, false) 2.916 1.538 1.90 \| 2.912 1.598 1.82 Sqrt::SqrtFixture::(1280x720, 5, true) 5.735 1.534 3.74 \| 5.726 1.661 3.45 Sqrt::SqrtFixture::(1280x720, 6, false) 6.121 3.585 1.71 \| 6.109 3.725 1.64 Sqrt::SqrtFixture::(1280x720, 6, true) 12.059 3.501 3.44 \| 12.053 4.080 2.95 Sqrt::SqrtFixture::(1920x1080, 5, false) 6.540 3.535 1.85 \| 6.540 3.643 1.80 Sqrt::SqrtFixture::(1920x1080, 5, true) 12.943 3.445 3.76 \| 12.908 3.706 3.48 Sqrt::SqrtFixture::(1920x1080, 6, false) 13.714 8.062 1.70 \| 13.711 8.376 1.64 Sqrt::SqrtFixture::(1920x1080, 6, true) 27.011 7.989 3.38 \| 27.115 9.245 2.93 ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-12 08:34:27 +03:00
Alexander Smorkalov	a48e78cdfc	Merge pull request #27026 from amane-ame/filter_hal_rvv Add RISC-V HAL implementation for cv::filter series	2025-03-11 16:09:45 +03:00
amane-ame	2dd72201af	Remove CV_ASSERT. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-11 18:37:58 +08:00
amane-ame	d9ec808b15	Use the macro from interface.h. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-11 17:44:55 +08:00
Alexander Smorkalov	4be88e934f	Merge pull request #27010 from GenshinImpactStarts/exp_log [HAL RVV] impl exp and log \| add log perf test	2025-03-11 10:51:03 +03:00
Alexander Smorkalov	3236436892	Merge pull request #27036 from CodeLinaro:xuezha_3rdPost Fix gaussianBlur5x5 performance regression	2025-03-10 18:21:20 +03:00
Xue Zhang	accebdecf7	Fix gaussianBlur5x5 performance regression	2025-03-10 16:16:56 +05:30
Alexander Smorkalov	316b5d7b08	Merge pull request #27031 from sturkmen72:libjpeg-turbo_ver_3.1.0 Libjpeg-turbo update to version 3.1.0	2025-03-10 13:44:00 +03:00
amane-ame	54da5c3e77	Add some algorithm comments. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-10 16:42:58 +08:00
GenshinImpactStarts	830d031213	Merge pull request #26977 from GenshinImpactStarts:helper_hal_rvv [Refactor](HAL RVV): Consolidate Helpers for Code Reusability #26977 This PR introduces a new helper file with utility types and templates to standardize function interfaces. This refactor allows us to avoid duplicate code when types differ but logic remains the same. The `flip` and `minmax` implementations have been updated to use the new generic helpers, replacing the previously defined, redundant classes. Due to the large number of functions, not all interfaces are unified yet. Future development can extend the types as needed. While the usage of function templates is currently limited, this will ease future development. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-10 10:36:48 +03:00
amane-ame	02253dd76b	Copy cv::borderInterpolate from core. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-10 15:26:41 +08:00
quic-xuezha	797068853f	Merge pull request #27033 from CodeLinaro:xuezha_3rdPost Fix assert failure in Sobel test when enable FastCV #27033 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-10 10:24:28 +03:00
Suleyman TURKMEN	6d161c25ef	Update libjpeg-turbo version:3.1.0	2025-03-09 00:02:20 +03:00
GenshinImpactStarts	0fed1fa184	fix exp, log \| enable ui for log \| strengthen test Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-07 17:11:26 +00:00
GenshinImpactStarts	524d8ae01c	impl exp and log \| add log perf test Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-07 17:11:26 +00:00
amane-ame	e06502a254	Add Morph for MORPH_ERODE and MORPH_DILATE. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-08 00:35:50 +08:00
amane-ame	a2d784b6f5	Add sepFilter. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-07 20:56:04 +08:00
天音あめ	e89e2fd7ea	Merge pull request #27007 from amane-ame:color_hal_rvv Add RISC-V HAL implementation for cv::cvtColor #27007 This patch implements the following functions in RVV_HAL using native intrinsics, optimizing the performance of `cv::cvtColor` for all possible data types and modes (except for `COLOR_Bayer`, `COLOR_YUV2GRAY_420` and `COLOR_mRGBA`, as these modes have no HAL interface): ``` cv_hal_cvtBGRtoBGR cv_hal_cvtBGRtoBGR5x5 cv_hal_cvtBGR5x5toBGR cv_hal_cvtBGRtoGray cv_hal_cvtGraytoBGR cv_hal_cvtBGR5x5toGray cv_hal_cvtGraytoBGR5x5 cv_hal_cvtBGRtoYUV cv_hal_cvtYUVtoBGR cv_hal_cvtBGRtoXYZ cv_hal_cvtXYZtoBGR cv_hal_cvtBGRtoHSV cv_hal_cvtHSVtoBGR cv_hal_cvtBGRtoLab cv_hal_cvtLabtoBGR cv_hal_cvtTwoPlaneYUVtoBGR cv_hal_cvtBGRtoTwoPlaneYUV cv_hal_cvtThreePlaneYUVtoBGR cv_hal_cvtBGRtoThreePlaneYUV cv_hal_cvtOnePlaneYUVtoBGR cv_hal_cvtOnePlaneBGRtoYUV ``` Tested on MUSE-PI (Spacemit X60) for both gcc 14.2 and clang 20.0. ``` $ ./opencv_test_imgproc --gtest_filter="Color-Bayer" $ ./opencv_perf_imgproc --gtest_filter="Color-Bayer" --gtest_also_run_disabled_tests --perf_min_samples=100 --perf_force_samples=100 ``` View the full perf table here: [hal_rvv_color.pdf](https://github.com/user-attachments/files/19055417/hal_rvv_color.pdf) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable	2025-03-07 11:24:48 +03:00
天音あめ	00956d5c15	Merge pull request #26892 from amane-ame:solve_hal_rvv Add RISC-V HAL implementation for cv::solve #26892 This patch implements `cv_hal_LU/cv_hal_Cholesky/cv_hal_SVD/cv_hal_QR` function in RVV_HAL using native intrinsics, optimizing the performance for `cv::solve` with method `DECOMP_LU/DECOMP_SVD/DECOMP_CHOLESKY/DECOMP_QR` and data types `32FC1/64FC1`. Tested on MUSE-PI (Spacemit X60) for both gcc 14.2 and clang 20.0. ``` $ ./opencv_test_core --gtest_filter="Solve:SVD:Cholesky" $ ./opencv_perf_core --gtest_filter="SolveTest" --perf_min_samples=100 --perf_force_samples=100 ``` The tail of the perf table is shown below since the table is too long. View the full perf table here: [hal_rvv_solve.pdf](https://github.com/user-attachments/files/18725067/hal_rvv_solve.pdf) <img width="1078" alt="Untitled" src="https://github.com/user-attachments/assets/c01d849c-f000-4bcc-bfe0-a302d6605d9e" /> ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-07 11:14:09 +03:00
天音あめ	bb525fe91d	Merge pull request #26865 from amane-ame:dxt_hal_rvv Add RISC-V HAL implementation for cv::dft and cv::dct #26865 This patch implements `static cv::DFT` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::dft` and `cv::dct` with data types `32FC1/64FC1/32FC2/64FC2`. The reason I chose to create a new `cv_hal_dftOcv` interface is that if I were to use the existing interfaces (`cv_hal_dftInit1D` and `cv_hal_dft1D`), it would require handling and parsing the dft flags within HAL, as well as performing preprocessing operations such as handling unit roots. Since these operations are not performance hotspots and do not require optimization, reusing the existing interfaces would result in copying approximately 300 lines of code from `core/src/dxt.cpp` into HAL, which I believe is unnecessary. Moreover, if I insert the new interface into `static cv::DFT`, both `static cv::RealDFT` and `static cv::DCT` can be optimized as well. The processing performed before and after calling `static cv::DFT` in these functions is also not a performance hotspot. Tested on MUSE-PI (Spacemit X60) for both gcc 14.2 and clang 20.0. ``` $ opencv_test_core --gtest_filter="DFT" $ opencv_perf_core --gtest_filter="dft:dct" --perf_min_samples=30 --perf_force_samples=30 ``` The head of the perf table is shown below since the table is too long. View the full perf table here: [hal_rvv_dxt.pdf](https://github.com/user-attachments/files/18622645/hal_rvv_dxt.pdf) <img width="1017" alt="Untitled" src="https://github.com/user-attachments/assets/609856e7-9c7d-4a95-9923-45c1b77eb3a2" /> ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-07 11:08:41 +03:00
GenshinImpactStarts	57a78cb9df	Merge pull request #26941 from GenshinImpactStarts:lut_hal_rvv Impl hal_rvv LUT \| Add more LUT test #26941 Implement through the existing `cv_hal_lut` interfaces. Add more LUT accuracy and performance tests: - Accuracy test: Multi-channel table tests are added, and the boundary of `randu` used for generating test data is broadened to make the test more robust. - Performance test: Multi-channel input and multi-channel table tests are added. Perf test done on - MUSE-PI (vlen=256) - Compiler: gcc 14.2 (riscv-collab/riscv-gnu-toolchain Nightly: December 16, 2024) ```sh $ opencv_test_core --gtest_filter="Core_LUT" $ opencv_perf_core --gtest_filter="SizePrm_LUT" --perf_min_samples=300 --perf_force_samples=300 ``` ```sh Geometric mean (ms) Name of Test scalar ui rvv ui rvv vs vs scalar scalar (x-factor) (x-factor) LUT::SizePrm::320x240 0.248 0.249 0.052 1.00 4.74 LUT::SizePrm::640x480 0.277 0.275 0.085 1.01 3.28 LUT::SizePrm::1920x1080 0.950 0.947 0.634 1.00 1.50 LUT_multi2::SizePrm::320x240 2.051 2.045 2.049 1.00 1.00 LUT_multi2::SizePrm::640x480 2.128 2.134 2.125 1.00 1.00 LUT_multi2::SizePrm::1920x1080 7.397 7.380 7.390 1.00 1.00 LUT_multi::SizePrm::320x240 0.715 0.747 0.154 0.96 4.64 LUT_multi::SizePrm::640x480 0.741 0.766 0.257 0.97 2.88 LUT_multi::SizePrm::1920x1080 2.766 2.765 1.925 1.00 1.44 ``` This optimization is achieved by loading the entire lookup table into vector registers. Due to register size limitations, the optimization is only effective under the following conditions: - For the U8C1 table type, the optimization works when `vlen >= 256` - For U16C1, it works when `vlen >= 512` - For U32C1, it works when `vlen >= 1024` Since I don’t have real hardware with `vlen > 256`, the corresponding accuracy tests were conducted on QEMU built from the `riscv-collab/riscv-gnu-toolchain`. This patch does not implement optimizations for multi-channel tables. Previous attempts: 1. For the U8C1 table type, when `vlen = 128`, it is possible to use four `u8m4` vectors to load the entire table, perform gathering, and merge the results. However, the performance is almost the same as the scalar version. 2. Loading part of the table and repeatedly loading the source data is faster for small sizes. But as the table size grows, the performance quickly degrades compared to the scalar version. 3. Using `vluxei8` as a general solution does not show any performance improvement. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-06 11:17:00 +03:00
amane-ame	83104bed32	Add Filter2D. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-03-06 14:10:06 +08:00
天音あめ	cbcfd772ce	Merge pull request #26958 from amane-ame:pyramids_hal_rvv Add RISC-V HAL implementation for cv::pyrDown and cv::pyrUp #26958 This patch implements `cv_hal_pyrdown/cv_hal_pyrup` function in RVV_HAL using native intrinsics, optimizing the performance for `cv::pyrDown`, `cv::pyrUp` and `cv::buildPyramids` with data types `{8U,16S,32F} x {C1,C2,C3,C4,Cn}`. Tested on MUSE-PI (Spacemit X60) for both gcc 14.2 and clang 20.0. ``` $ ./opencv_test_imgproc --gtest_filter="pyr:Pyr" $ ./opencv_perf_imgproc --gtest_filter="pyr:Pyr" --perf_min_samples=300 --perf_force_samples=300 ``` <img width="1112" alt="Untitled" src="https://github.com/user-attachments/assets/235a9fba-0d29-434e-8a10-498212bac657" /> ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-03-04 15:41:15 +03:00
GenshinImpactStarts	33d632f85e	impl hal_rvv norm_hamming Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>	2025-02-25 02:31:02 +00:00
GenshinImpactStarts	6a6a5a765d	Merge pull request #26943 from GenshinImpactStarts:flip_hal_rvv Impl RISC-V HAL for cv::flip \| Add perf test for flip #26943 Implement through the existing `cv_hal_flip` interfaces. Add perf test for `cv::flip`. The reason why select these args for testing: - size: copied from perf_lut - type: - U8C1: basic situation - U8C3: unaligned element size - U8C4: large element size Tested on - MUSE-PI (vlen=256) - Compiler: gcc 14.2 (riscv-collab/riscv-gnu-toolchain Nightly: December 16, 2024) ```sh $ opencv_test_core --gtest_filter="Core_Flip/ElemWiseTest." $ opencv_perf_core --gtest_filter="Size_MatType_FlipCode" --perf_min_samples=300 --perf_force_samples=300 ``` ``` Geometric mean (ms) Name of Test scalar ui rvv ui rvv vs vs scalar scalar (x-factor) (x-factor) flip::Size_MatType_FlipCode::(320x240, 8UC1, FLIP_X) 0.026 0.033 0.031 0.81 0.84 flip::Size_MatType_FlipCode::(320x240, 8UC1, FLIP_XY) 0.206 0.212 0.091 0.97 2.26 flip::Size_MatType_FlipCode::(320x240, 8UC1, FLIP_Y) 0.185 0.189 0.082 0.98 2.25 flip::Size_MatType_FlipCode::(320x240, 8UC3, FLIP_X) 0.070 0.084 0.084 0.83 0.83 flip::Size_MatType_FlipCode::(320x240, 8UC3, FLIP_XY) 0.616 0.612 0.235 1.01 2.62 flip::Size_MatType_FlipCode::(320x240, 8UC3, FLIP_Y) 0.587 0.603 0.204 0.97 2.88 flip::Size_MatType_FlipCode::(320x240, 8UC4, FLIP_X) 0.263 0.110 0.109 2.40 2.41 flip::Size_MatType_FlipCode::(320x240, 8UC4, FLIP_XY) 0.930 0.831 0.316 1.12 2.95 flip::Size_MatType_FlipCode::(320x240, 8UC4, FLIP_Y) 1.175 1.129 0.313 1.04 3.75 flip::Size_MatType_FlipCode::(640x480, 8UC1, FLIP_X) 0.303 0.118 0.111 2.57 2.73 flip::Size_MatType_FlipCode::(640x480, 8UC1, FLIP_XY) 0.949 0.836 0.405 1.14 2.34 flip::Size_MatType_FlipCode::(640x480, 8UC1, FLIP_Y) 0.784 0.783 0.409 1.00 1.92 flip::Size_MatType_FlipCode::(640x480, 8UC3, FLIP_X) 1.084 0.360 0.355 3.01 3.06 flip::Size_MatType_FlipCode::(640x480, 8UC3, FLIP_XY) 3.768 3.348 1.364 1.13 2.76 flip::Size_MatType_FlipCode::(640x480, 8UC3, FLIP_Y) 4.361 4.473 1.296 0.97 3.37 flip::Size_MatType_FlipCode::(640x480, 8UC4, FLIP_X) 1.252 0.469 0.451 2.67 2.78 flip::Size_MatType_FlipCode::(640x480, 8UC4, FLIP_XY) 5.732 5.220 1.303 1.10 4.40 flip::Size_MatType_FlipCode::(640x480, 8UC4, FLIP_Y) 5.041 5.105 1.203 0.99 4.19 flip::Size_MatType_FlipCode::(1920x1080, 8UC1, FLIP_X) 2.382 0.903 0.903 2.64 2.64 flip::Size_MatType_FlipCode::(1920x1080, 8UC1, FLIP_XY) 8.606 7.508 2.581 1.15 3.33 flip::Size_MatType_FlipCode::(1920x1080, 8UC1, FLIP_Y) 8.421 8.535 2.219 0.99 3.80 flip::Size_MatType_FlipCode::(1920x1080, 8UC3, FLIP_X) 6.312 2.416 2.429 2.61 2.60 flip::Size_MatType_FlipCode::(1920x1080, 8UC3, FLIP_XY) 29.174 26.055 12.761 1.12 2.29 flip::Size_MatType_FlipCode::(1920x1080, 8UC3, FLIP_Y) 25.373 25.500 13.382 1.00 1.90 flip::Size_MatType_FlipCode::(1920x1080, 8UC4, FLIP_X) 7.620 3.204 3.115 2.38 2.45 flip::Size_MatType_FlipCode::(1920x1080, 8UC4, FLIP_XY) 32.876 29.310 12.976 1.12 2.53 flip::Size_MatType_FlipCode::(1920x1080, 8UC4, FLIP_Y) 28.831 29.094 14.919 0.99 1.93 ``` The optimization for vlen <= 256 and > 256 are different, but I have no real hardware with vlen > 256. So accuracy tests for that like 512 and 1024 are conducted on QEMU built from the `riscv-collab/riscv-gnu-toolchain`. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-02-24 08:56:23 +03:00
Dmitry Kurtaev	7a2b048c92	Merge pull request #26923 from dkurt:merge_rvv_opt Further optimization of cv::merge RVV HAL for 8U and 16S #26923 ### Pull Request Readiness Checklist * Banana Pi BF3 (SpacemiT K1) RISC-V * Compiler: Syntacore Clang 18.1.4 (build 2024.12) ``` Geometric mean (ms) Name of Test baseline pr pr merge vs baseline merge (x-factor) merge::Size_SrcDepth_DstChannels::(127x61, 8UC1, 2) 0.013 0.003 3.76 merge::Size_SrcDepth_DstChannels::(127x61, 8UC1, 3) 0.020 0.006 3.46 merge::Size_SrcDepth_DstChannels::(127x61, 8UC1, 4) 0.026 0.010 2.61 merge::Size_SrcDepth_DstChannels::(127x61, 8UC1, 5) 0.043 0.028 1.56 merge::Size_SrcDepth_DstChannels::(127x61, 8UC1, 6) 0.054 0.035 1.53 merge::Size_SrcDepth_DstChannels::(127x61, 8UC1, 7) 0.065 0.050 1.30 merge::Size_SrcDepth_DstChannels::(127x61, 8UC1, 8) 0.070 0.036 1.95 merge::Size_SrcDepth_DstChannels::(127x61, 16SC1, 2) 0.015 0.008 1.82 merge::Size_SrcDepth_DstChannels::(127x61, 16SC1, 3) 0.022 0.015 1.48 merge::Size_SrcDepth_DstChannels::(127x61, 16SC1, 4) 0.029 0.018 1.63 merge::Size_SrcDepth_DstChannels::(127x61, 16SC1, 5) 0.067 0.044 1.54 merge::Size_SrcDepth_DstChannels::(127x61, 16SC1, 6) 0.088 0.056 1.58 merge::Size_SrcDepth_DstChannels::(127x61, 16SC1, 7) 0.104 0.076 1.38 merge::Size_SrcDepth_DstChannels::(127x61, 16SC1, 8) 0.116 0.065 1.79 merge::Size_SrcDepth_DstChannels::(640x480, 8UC1, 2) 0.421 0.176 2.39 merge::Size_SrcDepth_DstChannels::(640x480, 8UC1, 3) 0.792 0.284 2.79 merge::Size_SrcDepth_DstChannels::(640x480, 8UC1, 4) 1.090 0.370 2.95 merge::Size_SrcDepth_DstChannels::(640x480, 8UC1, 5) 1.835 1.399 1.31 merge::Size_SrcDepth_DstChannels::(640x480, 8UC1, 6) 2.389 1.776 1.35 merge::Size_SrcDepth_DstChannels::(640x480, 8UC1, 7) 3.000 2.471 1.21 merge::Size_SrcDepth_DstChannels::(640x480, 8UC1, 8) 3.178 2.104 1.51 merge::Size_SrcDepth_DstChannels::(640x480, 16SC1, 2) 0.490 0.377 1.30 merge::Size_SrcDepth_DstChannels::(640x480, 16SC1, 3) 1.348 0.602 2.24 merge::Size_SrcDepth_DstChannels::(640x480, 16SC1, 4) 1.827 0.813 2.25 merge::Size_SrcDepth_DstChannels::(640x480, 16SC1, 5) 3.283 2.692 1.22 merge::Size_SrcDepth_DstChannels::(640x480, 16SC1, 6) 4.922 3.334 1.48 merge::Size_SrcDepth_DstChannels::(640x480, 16SC1, 7) 5.725 4.399 1.30 merge::Size_SrcDepth_DstChannels::(640x480, 16SC1, 8) 6.278 4.748 1.32 merge::Size_SrcDepth_DstChannels::(1280x720, 8UC1, 2) 1.267 0.603 2.10 merge::Size_SrcDepth_DstChannels::(1280x720, 8UC1, 3) 2.394 0.934 2.56 merge::Size_SrcDepth_DstChannels::(1280x720, 8UC1, 4) 3.236 1.434 2.26 merge::Size_SrcDepth_DstChannels::(1280x720, 8UC1, 5) 5.398 4.345 1.24 merge::Size_SrcDepth_DstChannels::(1280x720, 8UC1, 6) 7.127 5.459 1.31 merge::Size_SrcDepth_DstChannels::(1280x720, 8UC1, 7) 8.590 7.298 1.18 merge::Size_SrcDepth_DstChannels::(1280x720, 8UC1, 8) 9.360 6.152 1.52 merge::Size_SrcDepth_DstChannels::(1280x720, 16SC1, 2) 1.482 1.242 1.19 merge::Size_SrcDepth_DstChannels::(1280x720, 16SC1, 3) 4.008 1.817 2.21 merge::Size_SrcDepth_DstChannels::(1280x720, 16SC1, 4) 6.079 2.468 2.46 merge::Size_SrcDepth_DstChannels::(1280x720, 16SC1, 5) 11.300 8.644 1.31 merge::Size_SrcDepth_DstChannels::(1280x720, 16SC1, 6) 15.125 12.126 1.25 merge::Size_SrcDepth_DstChannels::(1280x720, 16SC1, 7) 17.555 14.804 1.19 merge::Size_SrcDepth_DstChannels::(1280x720, 16SC1, 8) 18.890 14.163 1.33 merge::Size_SrcDepth_DstChannels::(1920x1080, 8UC1, 2) 2.910 1.326 2.19 merge::Size_SrcDepth_DstChannels::(1920x1080, 8UC1, 3) 5.351 1.997 2.68 merge::Size_SrcDepth_DstChannels::(1920x1080, 8UC1, 4) 7.290 2.629 2.77 merge::Size_SrcDepth_DstChannels::(1920x1080, 8UC1, 5) 12.426 9.611 1.29 merge::Size_SrcDepth_DstChannels::(1920x1080, 8UC1, 6) 16.453 12.162 1.35 merge::Size_SrcDepth_DstChannels::(1920x1080, 8UC1, 7) 19.420 16.190 1.20 merge::Size_SrcDepth_DstChannels::(1920x1080, 8UC1, 8) 20.588 13.699 1.50 merge::Size_SrcDepth_DstChannels::(1920x1080, 16SC1, 2) 3.400 2.640 1.29 merge::Size_SrcDepth_DstChannels::(1920x1080, 16SC1, 3) 8.986 3.952 2.27 merge::Size_SrcDepth_DstChannels::(1920x1080, 16SC1, 4) 11.972 5.273 2.27 merge::Size_SrcDepth_DstChannels::(1920x1080, 16SC1, 5) 20.544 17.996 1.14 merge::Size_SrcDepth_DstChannels::(1920x1080, 16SC1, 6) 28.677 22.086 1.30 merge::Size_SrcDepth_DstChannels::(1920x1080, 16SC1, 7) 32.958 27.713 1.19 merge::Size_SrcDepth_DstChannels::(1920x1080, 16SC1, 8) 36.499 27.439 1.33 ``` See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2025-02-20 17:28:28 +03:00
Vincent Rabaud	a6bfd87943	Bump openjp2 to v2.5.3 This should quiet some fuzzer bugs	2025-02-20 10:46:33 +03:00
Alexander Smorkalov	acc9084044	Move OpenVX integrations to imgproc to OpenVX HAL Covered functions: - medianBlur - Sobel - Canny - pyrDown - BoxFilter - equalizeHist - GaussianBlur - remap - threshold	2025-02-15 09:55:37 +03:00
Alexander Smorkalov	1de6e20463	Move OpenVX implementation for FAST to HAL.	2025-02-14 17:47:48 +03:00
Alexander Smorkalov	58e557d059	Merge pull request #26903 from asmorkalov:as/openvx_hal Migrate remaning OpenVX integrations to OpenVX HAL (core) #26903 Tested with OpenVX 1.2 & 1.3 sample implementation. Steps to build and test: ``` git clone git@github.com:KhronosGroup/OpenVX-sample-impl.git cd OpenVX-sample-impl python3 Build.py --os=Linux --conf=Release cd .. mkdir build cmake -DWITH_OPENVX=ON -DOPENVX_ROOT=/mnt/Projects/Projects/OpenVX-sample-impl/install/Linux/x64/Release/ ../opencv make -j8 ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-02-14 11:55:20 +03:00
Alexander Smorkalov	5921aae2b3	Switch to static instance of FastCV on Linux.	2025-02-13 15:58:25 +03:00
lve-gh	d8c2f0bcdf	Merge pull request #26884 from lve-gh:split8u_rvv_hal [HAL] split8u RVV 1.0 #26884 ### Pull Request Readiness Checklist * Banana Pi BF3 (SpacemiT K1) * Compiler: Syntacore Clang 18.1.4 (build 2024.12) ``` Geometric mean (ms) Name of Test baseline hal hal ui vs baseline ui (x-factor) split::Size_Depth_Channels::(127x61, 8UC1, 2) 0.012 0.004 3.12 split::Size_Depth_Channels::(127x61, 8UC1, 3) 0.019 0.006 2.91 split::Size_Depth_Channels::(127x61, 8UC1, 4) 0.028 0.011 2.64 split::Size_Depth_Channels::(127x61, 8UC1, 5) 0.067 0.033 2.02 split::Size_Depth_Channels::(127x61, 8UC1, 6) 0.084 0.040 2.11 split::Size_Depth_Channels::(127x61, 8UC1, 7) 0.103 0.055 1.88 split::Size_Depth_Channels::(127x61, 8UC1, 8) 0.113 0.032 3.50 split::Size_Depth_Channels::(640x480, 8UC1, 2) 0.454 0.179 2.54 split::Size_Depth_Channels::(640x480, 8UC1, 3) 0.677 0.298 2.27 split::Size_Depth_Channels::(640x480, 8UC1, 4) 0.901 0.410 2.20 split::Size_Depth_Channels::(640x480, 8UC1, 5) 3.781 3.010 1.26 split::Size_Depth_Channels::(640x480, 8UC1, 6) 4.886 4.009 1.22 split::Size_Depth_Channels::(640x480, 8UC1, 7) 5.777 4.770 1.21 split::Size_Depth_Channels::(640x480, 8UC1, 8) 4.596 1.330 3.46 split::Size_Depth_Channels::(1280x720, 8UC1, 2) 1.377 0.709 1.94 split::Size_Depth_Channels::(1280x720, 8UC1, 3) 2.091 1.034 2.02 split::Size_Depth_Channels::(1280x720, 8UC1, 4) 2.744 1.573 1.74 split::Size_Depth_Channels::(1280x720, 8UC1, 5) 9.542 6.284 1.52 split::Size_Depth_Channels::(1280x720, 8UC1, 6) 11.114 7.850 1.42 split::Size_Depth_Channels::(1280x720, 8UC1, 7) 14.083 11.879 1.19 split::Size_Depth_Channels::(1280x720, 8UC1, 8) 13.524 3.865 3.50 split::Size_Depth_Channels::(1920x1080, 8UC1, 2) 3.108 1.395 2.23 split::Size_Depth_Channels::(1920x1080, 8UC1, 3) 4.659 2.128 2.19 split::Size_Depth_Channels::(1920x1080, 8UC1, 4) 6.127 2.818 2.17 split::Size_Depth_Channels::(1920x1080, 8UC1, 5) 26.733 16.625 1.61 split::Size_Depth_Channels::(1920x1080, 8UC1, 6) 31.242 22.414 1.39 split::Size_Depth_Channels::(1920x1080, 8UC1, 7) 35.968 27.658 1.30 split::Size_Depth_Channels::(1920x1080, 8UC1, 8) 29.997 8.655 3.47 ``` See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2025-02-11 17:57:05 +03:00
天音あめ	2e909c38dc	Merge pull request #26804 from amane-ame:norm_hal_rvv Add RISC-V HAL implementation for cv::norm and cv::normalize #26804 This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`. `cv::normalize` also calls `minMaxIdx`, #26789 implements RVV_HAL for this. Tested on MUSE-PI for both gcc 14.2 and clang 20.0. ``` $ opencv_test_core --gtest_filter="Norm" $ opencv_perf_core --gtest_filter="norm" --perf_min_samples=300 --perf_force_samples=300 ``` The head of the perf table is shown below since the table is too long. View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf) <img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" /> ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-02-06 19:34:54 +03:00
Alexander Smorkalov	b7663086fb	Do not rely on cv namespace in HAL.	2025-02-06 10:00:28 +03:00
天音あめ	13b2caffe0	Merge pull request #26789 from amane-ame:minmax_hal_rvv Add RISC-V HAL implementation for minMaxIdx #26789 On the RISC-V platform, `minMaxIdx` cannot benefit from Universal Intrinsics because the UI-optimized `minMaxIdx` only supports `CV_SIMD128` (and does not accept `CV_SIMD_SCALABLE` for RVV). `1d701d1690/modules/core/src/minmax.cpp (L209-L214)` This patch implements `minMaxIdx` function in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel. Tested on MUSE-PI for both gcc 14.2 and clang 20.0. ``` $ opencv_test_core --gtest_filter="MinMaxLoc" $ opencv_perf_core --gtest_filter="minMaxLoc" ``` <img width="1122" alt="Untitled" src="https://github.com/user-attachments/assets/6a246852-87af-42c5-a50b-c349c2765f3f" /> ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-01-31 14:26:49 +03:00
Horror Proton	86241653a7	Add RISC-V HAL implementation for cv::phase	2025-01-29 12:07:59 +08:00
eplankin	ae57c54d83	Merge pull request #26463 from eplankin:icv_update_2022.0.0 Update IPP integration #26463 Please merge together with https://github.com/opencv/opencv_3rdparty/pull/88 Supported IPP version was updated to IPP 2022.0.0 for Linux and Windows. 32-bit binaries are dropped since this release. Previous update: https://github.com/opencv/opencv/pull/25935	2025-01-27 17:02:36 +03:00
Kumataro	3e1fafefbe	Merge pull request #26802 from Kumataro:fix26801 3rdparty:ittnotify: update to v3.25.4 #26802 Close https://github.com/opencv/opencv/issues/26801 See https://github.com/opencv/opencv/pull/26797 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2025-01-20 10:54:13 +03:00
Alexander Smorkalov	9c33baebbd	Merge pull request #26675 from hanliutong:rvv-hal-fix Add test cases and fix bugs in the RISC-V Vector HAL.	2024-12-29 18:09:21 +03:00

1 2 3 4 5 ...

903 Commits