opencv/modules/core/src
Rostislav Vasilikhin ea47cb3ffe
Merge pull request #24480 from savuor:backport_patch_nans
Backport to 4.x: patchNaNs() SIMD acceleration #24480

backport from #23098
connected PR in extra: [#1118@extra](https://github.com/opencv/opencv_extra/pull/1118)

### This PR contains:
* new SIMD code for `patchNaNs()`
* CPU perf test

<details>
<summary>Performance comparison</summary>

Geometric mean (ms)

|Name of Test|noopt|sse2|avx2|sse2 vs noopt (x-factor)|avx2 vs noopt (x-factor)|
|---|:-:|:-:|:-:|:-:|:-:|
|PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC1)|0.019|0.017|0.018|1.11|1.07|
|PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC4)|0.037|0.037|0.033|1.00|1.10|
|PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC1)|0.032|0.032|0.033|0.99|0.98|
|PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC4)|0.072|0.072|0.070|1.00|1.03|
|PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC1)|0.051|0.051|0.050|1.00|1.01|
|PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC4)|0.137|0.138|0.128|0.99|1.06|
|PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC1)|0.137|0.128|0.129|1.07|1.06|
|PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC4)|0.450|0.450|0.448|1.00|1.01|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC1)|0.149|0.029|0.020|5.13|7.44|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC2)|0.304|0.058|0.040|5.25|7.65|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC3)|0.448|0.086|0.059|5.22|7.55|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC4)|0.601|0.133|0.083|4.51|7.23|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC1)|0.451|0.093|0.060|4.83|7.52|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC2)|0.892|0.184|0.126|4.85|7.06|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC3)|1.345|0.311|0.230|4.32|5.84|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC4)|1.831|0.546|0.436|3.35|4.20|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC1)|1.017|0.250|0.160|4.06|6.35|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC2)|2.077|0.646|0.605|3.21|3.43|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC3)|3.134|1.053|0.961|2.97|3.26|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC4)|4.222|1.436|1.288|2.94|3.28|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC1)|4.225|1.401|1.277|3.01|3.31|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC2)|8.310|2.953|2.635|2.81|3.15|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC3)|12.396|4.455|4.252|2.78|2.92|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC4)|17.174|5.831|5.824|2.95|2.95|

</details>

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under the GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There are an accuracy test, a performance test and test data in the opencv_extra repository, if applicable
      The patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-03 08:58:07 +03:00
cuda Fix GpuMat to correctly calculate dataend when using GpuMat::create(). 2022-02-02 14:25:46 +00:00
opencl Merge pull request #13879 from chacha21:REDUCE_SUM2 2023-04-28 20:42:52 +03:00
parallel Fixed most clang -Wextra-semi warnings 2022-09-27 18:06:46 -04:00
utils Fix GNU/Hurd build 2023-08-10 22:43:46 +02:00
algorithm.cpp compatibility: keep Ptr<FileStorage> stubs till OpenCV 5.0 2022-12-16 00:47:44 +00:00
alloc.cpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
arithm_ipp.hpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
arithm.cpp Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
arithm.dispatch.cpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
arithm.simd.hpp Merge pull request #24325 from hanliutong:rewrite 2023-10-05 17:57:25 +03:00
array.cpp Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-12-03 12:32:49 +00:00
async.cpp Merge pull request #19985 from fpetrogalli:disable_threads 2021-07-08 20:21:21 +00:00
batch_distance.cpp
bindings_utils.cpp Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-09-02 15:24:04 +00:00
buffer_area.cpp core: remove unnecessary pointer cleanup in BufferArea 2022-07-24 11:58:17 +03:00
bufferpool.impl.hpp
channels.cpp Use void* 2023-07-20 15:53:57 +02:00
check.cpp Add missing <sstream> includes 2023-09-05 22:04:26 +03:00
command_line_parser.cpp Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-10 20:50:26 +00:00
conjugate_gradient.cpp
convert_c.cpp core: rework code locality 2021-03-02 23:24:28 +00:00
convert_scale.dispatch.cpp Deprecated convertTypeStr and made new variant that also takes the buffer size 2023-04-26 09:48:15 -04:00
convert_scale.simd.hpp Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
convert.dispatch.cpp Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-12-17 21:05:34 +00:00
convert.hpp Clean up the Universal Intrinsic API. 2023-10-13 19:23:30 +08:00
convert.simd.hpp Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
copy.cpp Merge pull request #24260 from vrabaud:ubsan 2023-09-14 15:16:28 +03:00
count_non_zero.dispatch.cpp Merge remote-tracking branch 'origin/3.4' into merge-3.4 2023-04-21 10:55:04 +03:00
count_non_zero.simd.hpp Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
cuda_gpu_mat_nd.cpp Merge pull request #19259 from nglee:dev_gpumatnd1 2021-02-05 20:30:37 +00:00
cuda_gpu_mat.cpp
cuda_host_mem.cpp build: fix warnings 2019-03-05 14:47:04 +03:00
cuda_info.cpp
cuda_stream.cpp cuda: add python bindings to allow GpuMat and Stream objects to be initialized from raw pointers 2023-05-22 11:02:04 +03:00
datastructs.cpp Merge pull request #21937 from Kumataro:4.x-fix-21911 2022-05-13 17:32:05 +00:00
directx.cpp bugfix convertFromD3D11Texture2D 2022-03-03 07:21:53 +09:00
directx.inc.hpp Merge pull request #13972 from Mainvooid:add_cuda_support_for_D3D11_interop 2019-03-24 18:34:09 +03:00
downhill_simplex.cpp Fix spelling typos 2019-12-27 12:46:53 +00:00
dxt.cpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
gl_core_3_1.cpp
gl_core_3_1.hpp
glob.cpp [build][option] Build option to disable filesystem support. 2021-05-11 12:54:54 +00:00
hal_internal.cpp Add missing sanitizer interface include 2023-09-13 12:15:34 +03:00
hal_internal.hpp core: include version.hpp in cvdef.h, fix precomp.hpp usage 2021-02-16 11:10:45 +00:00
hal_replacement.hpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
has_non_zero.dispatch.cpp Merge pull request #22947 from chacha21:hasNonZero 2023-06-09 13:37:20 +03:00
has_non_zero.simd.hpp Merge pull request #24325 from hanliutong:rewrite 2023-10-05 17:57:25 +03:00
intel_gpu_gemm.inl.hpp core(ocl): buffer bounds in intelblas_gemm_buffer_NT 2021-09-10 12:10:41 +00:00
kmeans.cpp kmeans: assertion "There can't be more clusters than elements" 2022-01-08 23:42:21 +01:00
lapack.cpp Merge pull request #24325 from hanliutong:rewrite 2023-10-05 17:57:25 +03:00
lda.cpp Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-04-01 18:11:55 +03:00
logger.cpp core(logger): strip path prefix 2022-12-07 23:58:36 +00:00
lpsolver.cpp Merge remote-tracking branch 'origin/3.4' into merge-3.4 2023-06-20 09:56:57 +03:00
lut.cpp Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-10 20:50:26 +00:00
mathfuncs_core.dispatch.cpp Merge pull request #23443 from eplankin:3.4 2023-04-07 09:14:42 +00:00
mathfuncs_core.simd.hpp Clean up the Universal Intrinsic API. 2023-10-13 19:23:30 +08:00
mathfuncs.cpp Merge pull request #24480 from savuor:backport_patch_nans 2023-11-03 08:58:07 +03:00
mathfuncs.hpp
matmul.dispatch.cpp Deprecated convertTypeStr and made new variant that also takes the buffer size 2023-04-26 09:48:15 -04:00
matmul.simd.hpp Clean up the Universal Intrinsic API. 2023-10-13 19:23:30 +08:00
matrix_c.cpp core: rework code locality 2021-03-02 23:24:28 +00:00
matrix_decomp.cpp
matrix_expressions.cpp core(MatExpr): fix warning in case of e.s == (0, 0, 0, 0) 2020-05-01 07:29:57 +00:00
matrix_iterator.cpp core: include version.hpp in cvdef.h, fix precomp.hpp usage 2021-02-16 11:10:45 +00:00
matrix_operations.cpp Merge pull request #13879 from chacha21:REDUCE_SUM2 2023-04-28 20:42:52 +03:00
matrix_sparse.cpp Merge pull request #21107 from take1014:remove_assert_21038 2021-11-27 18:34:52 +00:00
matrix_transform.cpp Clean up the Universal Intrinsic API. 2023-10-13 19:23:30 +08:00
matrix_wrap.cpp core: rework code locality 2021-03-02 23:24:28 +00:00
matrix.cpp Merge pull request #24260 from vrabaud:ubsan 2023-09-14 15:16:28 +03:00
mean.dispatch.cpp Merge pull request #24179 from Kumataro:fix24145 2023-08-23 22:53:11 +03:00
mean.simd.hpp Merge pull request #24325 from hanliutong:rewrite 2023-10-05 17:57:25 +03:00
merge.dispatch.cpp Check that cv::merge input matrices are not empty. 2023-09-08 12:36:46 +03:00
merge.simd.hpp Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
minmax.cpp Clean up the Universal Intrinsic API. 2023-10-13 19:23:30 +08:00
norm.cpp Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
ocl_disabled.impl.hpp Merge pull request #19755 from mikhail-nikolskiy:ffmpeg-umat 2021-05-14 16:48:50 +00:00
ocl.cpp Use OpenCV logging instead of std::cerr. 2023-07-19 10:49:54 +03:00
opengl.cpp More fixes for OpenCL error reporting. 2022-11-28 09:47:51 +03:00
out.cpp Merge pull request #22149 from seanm:sprintf 2022-06-25 06:48:22 +03:00
ovx.cpp Fixed compilation on windows with openvx 2020-01-06 06:32:56 +03:00
parallel_impl.cpp Merge pull request #24280 from casualwind:parallel_opt 2023-09-27 16:21:20 +03:00
parallel_impl.hpp
parallel.cpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
pca.cpp
persistence_base64_encoding.cpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
persistence_base64_encoding.hpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
persistence_impl.hpp core(persistence): avoid NULL pointer dereference 2022-01-18 04:56:43 +00:00
persistence_json.cpp Merge pull request #22149 from seanm:sprintf 2022-06-25 06:48:22 +03:00
persistence_types.cpp Merge pull request #23055 from seanm:sprintf2 2023-04-18 09:22:59 +03:00
persistence_xml.cpp Fixed buffer overrun; removed the last two uses of sprintf 2023-08-16 20:04:17 -04:00
persistence_yml.cpp Merge pull request #23055 from seanm:sprintf2 2023-04-18 09:22:59 +03:00
persistence.cpp Fixed invalid cast and unaligned memory access 2023-06-09 18:56:49 -04:00
persistence.hpp Merge pull request #23055 from seanm:sprintf2 2023-04-18 09:22:59 +03:00
precomp.hpp Merge pull request #13879 from chacha21:REDUCE_SUM2 2023-04-28 20:42:52 +03:00
rand.cpp core: rework code locality 2021-03-02 23:24:28 +00:00
softfloat.cpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
split.dispatch.cpp Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-26 17:34:42 +03:00
split.simd.hpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
stat_c.cpp core: rework code locality 2021-03-02 23:24:28 +00:00
stat.dispatch.cpp
stat.hpp
stat.simd.hpp Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
stl.cpp
sum.dispatch.cpp Merge pull request #24179 from Kumataro:fix24145 2023-08-23 22:53:11 +03:00
sum.simd.hpp Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
system.cpp Merge pull request #23929 from CNClareChen:4.x 2023-10-20 14:20:09 +03:00
tables.cpp
trace.cpp Merge pull request #21937 from Kumataro:4.x-fix-21911 2022-05-13 17:32:05 +00:00
types.cpp Merge pull request #23702 from dkurt:py_rotated_rect 2023-06-22 15:09:53 +03:00
umatrix.cpp Deprecated convertTypeStr and made new variant that also takes the buffer size 2023-04-26 09:48:15 -04:00
umatrix.hpp
va_intel.cpp Fallback to vaCreateImage + vaPutImage/vaGetImage when vaDeriveImage fails 2022-01-31 17:12:37 -05:00
va_wrapper.impl.hpp Fix libva dynamic loading 2022-03-15 19:08:20 +03:00