mirror of
https://github.com/opencv/opencv.git
synced 2025-01-10 22:28:13 +08:00
ea0f9336e2
imgproc: fix perf regressions on the c4 kernels of warpAffine / warpPerspective / remap #26454 ## Performance Previous performance regressions on c4 kernels are mainly on A311D https://github.com/opencv/opencv/pull/26348. Regressions on c3 kernels on intel platform will be fixed in another pull request. M2 ``` Geometric mean (ms) Name of Test base patch patch vs base (x-factor) WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.338 0.163 2.08 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.310 0.107 2.90 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.344 0.162 2.13 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.313 0.111 2.83 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.676 0.333 2.03 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.640 0.240 2.66 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 1.212 0.885 1.37 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 1.153 0.756 1.53 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.950 0.475 2.00 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 1.158 0.500 2.32 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 3.441 3.106 1.11 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 3.351 2.837 1.18 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.336 0.163 2.07 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.314 0.124 2.54 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.385 0.226 1.70 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.364 0.183 1.99 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.541 0.290 1.87 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.523 0.243 2.16 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 1.540 1.239 1.24 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 1.504 1.134 1.33 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.751 0.465 1.62 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.958 0.507 1.89 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 3.785 3.487 1.09 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 3.602 3.280 1.10 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.331 0.153 2.16 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.304 0.128 2.37 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.329 0.156 2.11 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.306 0.121 2.53 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 2.046 0.930 2.20 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 2.122 1.391 1.53 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 2.035 0.954 2.13 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 2.127 1.410 1.51 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.329 0.157 2.09 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.306 0.124 2.47 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.327 0.158 2.08 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.308 0.127 2.43 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 2.039 0.948 2.15 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 2.175 1.373 1.58 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 2.065 0.956 2.16 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 2.158 1.372 1.57 ``` Intel i7-12700K: ``` Geometric mean (ms) Name of Test base patch patch vs base (x-factor) WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.140 0.051 2.77 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.140 0.054 2.57 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.140 0.050 2.78 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.143 0.054 2.64 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.297 0.118 2.51 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.296 0.130 2.28 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.481 0.304 1.58 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.470 0.309 1.52 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.381 0.184 2.07 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.811 0.781 1.04 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 1.297 1.063 1.22 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 1.275 1.171 1.09 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.135 0.057 2.36 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.134 0.062 2.16 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.155 0.076 2.04 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.150 0.079 1.90 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.229 0.114 2.02 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.227 0.120 1.89 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.560 0.444 1.26 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.529 0.442 1.20 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.326 0.192 1.70 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.805 0.762 1.06 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 1.395 1.255 1.11 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 1.381 1.306 1.06 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.138 0.049 2.81 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.134 0.053 2.53 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.137 0.049 2.79 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.134 0.053 2.51 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 1.362 1.352 1.01 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 3.124 3.038 1.03 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 1.354 1.351 1.00 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 3.142 3.049 1.03 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 0.140 0.052 2.70 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 0.136 0.056 2.43 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 0.139 0.051 2.70 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 0.135 0.056 2.41 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 1.335 1.345 0.99 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 3.117 3.024 1.03 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 1.327 1.319 1.01 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 3.126 3.026 1.03 ``` A311D ``` Geometric mean (ms) Name of Test base patch patch vs base (x-factor) WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 1.762 1.361 1.29 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 2.390 2.005 1.19 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 1.747 1.238 1.41 WarpAffine::TestWarpAffine::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 2.399 2.016 1.19 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 3.917 3.104 1.26 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 5.995 5.172 1.16 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 6.711 5.460 1.23 WarpAffine::TestWarpAffine::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 8.017 6.890 1.16 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 6.269 5.596 1.12 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 10.301 9.507 1.08 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 18.871 17.375 1.09 WarpAffine::TestWarpAffine::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 20.365 18.227 1.12 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 2.083 1.514 1.38 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 2.966 2.309 1.28 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 2.358 1.715 1.37 WarpPerspective::TestWarpPerspective::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 3.220 2.464 1.31 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 3.763 3.014 1.25 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 5.777 4.940 1.17 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 8.791 7.819 1.12 WarpPerspective::TestWarpPerspective::(1280x720, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 10.165 8.426 1.21 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 6.047 5.293 1.14 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 9.851 9.023 1.09 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 31.739 29.323 1.08 WarpPerspective::TestWarpPerspective::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 32.439 29.236 1.11 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 1.759 1.441 1.22 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 2.681 2.270 1.18 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 1.774 1.425 1.24 map1_32fc1::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 2.672 2.252 1.19 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 14.079 9.334 1.51 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 17.770 16.155 1.10 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 15.872 11.192 1.42 map1_32fc1::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 19.167 15.342 1.25 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 2.284 1.545 1.48 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 3.040 2.231 1.36 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 2.280 1.380 1.65 map1_32fc2::TestRemap::(640x480, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 2.882 2.185 1.32 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 16UC4) 15.877 11.381 1.40 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_CONSTANT, 32FC4) 19.521 16.106 1.21 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 16UC4) 15.950 11.532 1.38 map1_32fc2::TestRemap::(1920x1080, INTER_LINEAR, BORDER_REPLICATE, 32FC4) 19.699 16.276 1.21 ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake |
||
---|---|---|
.. | ||
3rdparty/SoftFloat | ||
cmake/parallel | ||
doc | ||
include/opencv2 | ||
misc | ||
perf | ||
src | ||
test | ||
CMakeLists.txt |