mirror of
https://github.com/opencv/opencv.git
synced 2025-06-24 21:10:56 +08:00
Enable SIMD_SCALABLE for exp and sqrt #26886

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

```
CPU - Banana Pi k1, compiler - clang 18.1.4
```

```
Geometric mean (ms)

Name of Test                          baseline     hal      ui   hal vs baseline   ui vs baseline
                                                                    (x-factor)       (x-factor)
Exp::ExpFixture::(127x61, 32FC1)         0.358      --   0.033          --             10.70
Exp::ExpFixture::(640x480, 32FC1)       14.304      --   1.167          --             12.26
Exp::ExpFixture::(1280x720, 32FC1)      42.785      --   3.538          --             12.09
Exp::ExpFixture::(1920x1080, 32FC1)     96.206      --   7.927          --             12.14
Exp::ExpFixture::(127x61, 64FC1)         0.433   0.050   0.098         8.59             4.40
Exp::ExpFixture::(640x480, 64FC1)       17.315   1.935   3.813         8.95             4.54
Exp::ExpFixture::(1280x720, 64FC1)      52.181   5.877  11.519         8.88             4.53
Exp::ExpFixture::(1920x1080, 64FC1)    117.082  13.157  25.854         8.90             4.53
```

Additionally, this PR brings Sqrt optimization with UI:

```
Geometric mean (ms)

Name of Test                              baseline      ui   ui vs baseline
                                                               (x-factor)
Sqrt::SqrtFixture::(127x61, 5, false)        0.111   0.027        4.11
Sqrt::SqrtFixture::(127x61, 6, false)        0.149   0.053        2.82
Sqrt::SqrtFixture::(640x480, 5, false)       4.374   0.967        4.52
Sqrt::SqrtFixture::(640x480, 6, false)       5.885   2.046        2.88
Sqrt::SqrtFixture::(1280x720, 5, false)     12.960   2.915        4.45
Sqrt::SqrtFixture::(1280x720, 6, false)     17.648   6.107        2.89
Sqrt::SqrtFixture::(1920x1080, 5, false)    29.178   6.524        4.47
Sqrt::SqrtFixture::(1920x1080, 6, false)    39.709  13.670        2.90
```
Reference: Muller, J.-M. Elementary Functions: Algorithms and Implementation. 2nd ed. Boston: Birkhäuser, 2006. https://www.springer.com/gp/book/9780817643720
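The "x-factor" columns in the benchmark tables read as speed-up ratios: the baseline time divided by the optimized time, so values above 1 mean the optimized path is faster. The sketch below shows that arithmetic under this assumption. `x_factor` and `round2` are hypothetical helper names used only for illustration and are not part of OpenCV or its perf tooling. The reported ratios were presumably computed from unrounded timings, so recomputing them from the rounded table values can differ slightly (for example, 0.358 / 0.033 is about 10.85, while the table lists 10.70).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: speed-up of a candidate implementation relative to
// the baseline. Both arguments are timings in milliseconds.
inline double x_factor(double baseline_ms, double candidate_ms) {
    assert(baseline_ms > 0.0 && candidate_ms > 0.0);
    return baseline_ms / candidate_ms;
}

// Hypothetical helper: round to two decimals, matching the table's precision.
inline double round2(double v) {
    return std::round(v * 100.0) / 100.0;
}
```

For instance, `round2(x_factor(4.374, 0.967))` reproduces the 4.52 listed for `Sqrt::SqrtFixture::(640x480, 5, false)`.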
| Name |
|---|
| cuda |
| opencl |
| perf_abs.cpp |
| perf_addWeighted.cpp |
| perf_allocation.cpp |
| perf_arithm.cpp |
| perf_bitwise.cpp |
| perf_compare.cpp |
| perf_convertTo.cpp |
| perf_cvround.cpp |
| perf_dft.cpp |
| perf_dot.cpp |
| perf_inRange.cpp |
| perf_io_base64.cpp |
| perf_lut.cpp |
| perf_main.cpp |
| perf_mat.cpp |
| perf_math.cpp |
| perf_merge.cpp |
| perf_minmaxloc.cpp |
| perf_norm.cpp |
| perf_precomp.hpp |
| perf_reduce.cpp |
| perf_sort.cpp |
| perf_split.cpp |
| perf_stat.cpp |
| perf_umat.cpp |