opencv/modules
Rostislav Vasilikhin ea47cb3ffe
Merge pull request #24480 from savuor:backport_patch_nans
Backport to 4.x: patchNaNs() SIMD acceleration #24480

Backport of #23098.
Connected PR in opencv_extra: [#1118@extra](https://github.com/opencv/opencv_extra/pull/1118)

### This PR contains:
* new SIMD code for `patchNaNs()`
* CPU perf test

<details>
<summary>Performance comparison</summary>

Geometric mean (ms)

|Name of Test|noopt|sse2|avx2|sse2 vs noopt (x-factor)|avx2 vs noopt (x-factor)|
|---|:-:|:-:|:-:|:-:|:-:|
|PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC1)|0.019|0.017|0.018|1.11|1.07|
|PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC4)|0.037|0.037|0.033|1.00|1.10|
|PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC1)|0.032|0.032|0.033|0.99|0.98|
|PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC4)|0.072|0.072|0.070|1.00|1.03|
|PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC1)|0.051|0.051|0.050|1.00|1.01|
|PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC4)|0.137|0.138|0.128|0.99|1.06|
|PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC1)|0.137|0.128|0.129|1.07|1.06|
|PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC4)|0.450|0.450|0.448|1.00|1.01|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC1)|0.149|0.029|0.020|5.13|7.44|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC2)|0.304|0.058|0.040|5.25|7.65|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC3)|0.448|0.086|0.059|5.22|7.55|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC4)|0.601|0.133|0.083|4.51|7.23|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC1)|0.451|0.093|0.060|4.83|7.52|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC2)|0.892|0.184|0.126|4.85|7.06|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC3)|1.345|0.311|0.230|4.32|5.84|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC4)|1.831|0.546|0.436|3.35|4.20|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC1)|1.017|0.250|0.160|4.06|6.35|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC2)|2.077|0.646|0.605|3.21|3.43|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC3)|3.134|1.053|0.961|2.97|3.26|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC4)|4.222|1.436|1.288|2.94|3.28|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC1)|4.225|1.401|1.277|3.01|3.31|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC2)|8.310|2.953|2.635|2.81|3.15|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC3)|12.396|4.455|4.252|2.78|2.92|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC4)|17.174|5.831|5.824|2.95|2.95|

</details>
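
The columns in the table above can be read with the following small sketch. It assumes that each timing is the geometric mean of the per-run samples and that the x-factor is the baseline (`noopt`) time divided by the optimized time, which matches the table values to within rounding. The helper names `geometricMean` and `xFactor` are hypothetical and are not part of the OpenCV perf framework.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: geometric mean of strictly positive samples,
// computed via logarithms to avoid overflow in the running product.
inline double geometricMean(const std::vector<double>& samplesMs)
{
    if (samplesMs.empty())
        return 0.0;  // convention chosen for this sketch only

    double logSum = 0.0;
    for (double s : samplesMs)
        logSum += std::log(s);
    return std::exp(logSum / static_cast<double>(samplesMs.size()));
}

// Hypothetical helper: speedup of an optimized build over the baseline.
// Values above 1.0 mean the optimized build is faster.
inline double xFactor(double baselineMs, double optimizedMs)
{
    return baselineMs / optimizedMs;
}
```

For example, `xFactor(0.601, 0.133)` for the 640x480 32FC4 CPU row gives roughly 4.5, in line with the `sse2 vs noopt` column. Small differences in the last digit are expected, since the table was presumably computed from unrounded timings.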

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is an accuracy test, a performance test, and test data in the opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-03 08:58:07 +03:00

|Module|Last commit|Date|
|---|---|---|
|calib3d|Merge pull request #23109 from seanm:misc-warnings|2023-10-06 13:33:21 +03:00|
|core|Merge pull request #24480 from savuor:backport_patch_nans|2023-11-03 08:58:07 +03:00|
|dnn|Merge pull request #24409 from fengyuentau:norm_kernel|2023-11-01 14:33:57 +03:00|
|features2d|Clean up the Universal Intrinsic API.|2023-10-13 19:23:30 +08:00|
|flann|Merge pull request #23109 from seanm:misc-warnings|2023-10-06 13:33:21 +03:00|
|gapi|fix: supress GCC13 warnings (#24434)|2023-10-26 09:00:58 +03:00|
|highgui|Fix the issue of missing imshow icons when linking OpenCV as a static library (https://github.com/opencv/opencv-python/issues/585)|2023-10-07 11:18:32 +08:00|
|imgcodecs|Merge pull request #24405 from kochanczyk:4.x|2023-10-30 11:58:08 +03:00|
|imgproc|Merge pull request #24371 from hanliutong:clean-up|2023-10-20 12:50:26 +03:00|
|java|Fail Java test suite, execution, if one of test failed.|2023-10-01 18:31:04 +03:00|
|js|Merge pull request #24288 from tailsu:sd/emscripten-3.1.45-fixes|2023-09-19 08:09:18 +03:00|
|ml|Merge pull request #23109 from seanm:misc-warnings|2023-10-06 13:33:21 +03:00|
|objc|Backport 5.x: Support for module names that start from digit in ObjC bindings generator.|2023-05-25 11:45:59 +03:00|
|objdetect|Merge pull request #23894 from kallaballa:blobFromImagesWithParams|2023-10-20 14:27:40 +03:00|
|photo|Merge pull request #23109 from seanm:misc-warnings|2023-10-06 13:33:21 +03:00|
|python|Fixed Python signatures in Doxygen documentation.|2023-10-30 17:28:03 +03:00|
|stitching|fix: supress GCC13 warnings (#24434)|2023-10-26 09:00:58 +03:00|
|ts|Merge pull request #23109 from seanm:misc-warnings|2023-10-06 13:33:21 +03:00|
|video|Merge pull request #24461 from fengyuentau:tracker_vit_backend_target|2023-10-27 14:12:44 +03:00|
|videoio|Merge pull request #24363 from cudawarped:videoio_ffmpeg_add_stream_encapsulation|2023-10-25 13:21:01 +03:00|
|world|cmake: VERSION_GREATER_EQUAL is not supported in CMake 3.5.1|2022-12-26 17:41:53 +00:00|