opencv/modules
HAN Liutong 0dd7769bb1
Merge pull request #23980 from hanliutong:rewrite-core
Rewrite Universal Intrinsic code by using new API: Core module. #23980

The goal of this PR is to match and modify all SIMD code blocks guarded by `CV_SIMD` macro in the `opencv/modules/core` folder and rewrite them by using the new Universal Intrinsic API.

The patch is almost auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR #23885.

Most of the files have been rewritten, but I marked this PR as draft because, the `CV_SIMD` macro also exists in the following files, and the reasons why they are not rewrited are:

1. ~~code design for fixed-size SIMD (v_int16x8, v_float32x4, etc.), need to manually rewrite.~~ Rewrited
- ./modules/core/src/stat.simd.hpp
- ./modules/core/src/matrix_transform.cpp
- ./modules/core/src/matmul.simd.hpp

2. Vector types are wrapped in other class/struct, that are not supported by the compiler in variable-length backends. Can not be rewrited directly.
- ./modules/core/src/mathfuncs_core.simd.hpp 
```cpp
struct v_atan_f32
{
    explicit v_atan_f32(const float& scale)
    {
...
    }

    v_float32 compute(const v_float32& y, const v_float32& x)
    {
...
    }

...
    v_float32 val90; // sizeless type can not used in a class
    v_float32 val180;
    v_float32 val360;
    v_float32 s;
};
```

3. The API interface does not support/does not match

- ./modules/core/src/norm.cpp 
Use `v_popcount`, ~~waiting for #23966~~ Fixed
- ./modules/core/src/has_non_zero.simd.hpp
Use illegal Universal Intrinsic API: For float type, there is no logical operation `|`. Further discussion needed

```cpp
/** @brief Bitwise OR

Only for integer types. */
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n> operator|(const v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n>& operator|=(v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
```

```cpp
#if CV_SIMD
    typedef v_float32 v_type;
    const v_type v_zero = vx_setzero_f32();
    constexpr const int unrollCount = 8;
    int step = v_type::nlanes * unrollCount;
    int len0 = len & -step;
    const float* srcSimdEnd = src+len0;

    int countSIMD = static_cast<int>((srcSimdEnd-src)/step);
    while(!res && countSIMD--)
    {
        v_type v0 = vx_load(src);
        src += v_type::nlanes;
        v_type v1 = vx_load(src);
        src += v_type::nlanes;
....
        src += v_type::nlanes;
        v0 |= v1; //Illegal ?
....
        //res = v_check_any(((v0 | v4) != v_zero));//beware : (NaN != 0) returns "false" since != is mapped to _CMP_NEQ_OQ and not _CMP_NEQ_UQ
        res = !v_check_all(((v0 | v4) == v_zero));
    }

    v_cleanup();
#endif
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-08-11 08:33:33 +03:00
..
calib3d Merge pull request #24035 from vrabaud:calibration 2023-07-27 19:36:33 +03:00
core Merge pull request #23980 from hanliutong:rewrite-core 2023-08-11 08:33:33 +03:00
dnn Merge pull request #24122 from fengyuentau:remove_tengine 2023-08-09 09:26:02 +03:00
features2d Merge remote-tracking branch 'origin/3.4' into merge-3.4 2023-05-24 14:37:48 +03:00
flann Merge pull request #24028 from VadimLevin:dev/vlevin/fix-flann-python-bindings 2023-07-21 12:44:56 +03:00
gapi Merge pull request #24059 from TolyaTalamanov:at/add-onnx-cuda-execution-provider 2023-08-02 14:13:07 +03:00
highgui Use OpenCV logging instead of std::cerr. 2023-07-19 10:49:54 +03:00
imgcodecs Use OpenCV logging instead of std::cerr. 2023-07-19 10:49:54 +03:00
imgproc Merge pull request #24042 from vrabaud:circle 2023-07-26 20:00:22 +03:00
java build: w/a compiler warnings for GCC 11-12 and Clang 13, reduce build output 2023-07-10 11:27:59 +03:00
js if browser supports wasm but only asm.js path provided use asm.js as fallback 2023-06-17 09:38:57 +03:00
ml Merge remote-tracking branch 'origin/3.4' into merge-3.4 2023-04-21 10:55:04 +03:00
objc Backport 5.x: Support for module names that start from digit in ObjC bindings generator. 2023-05-25 11:45:59 +03:00
objdetect update aruco bytesList docs 2023-07-13 13:50:07 +03:00
photo Deprecated convertTypeStr and made new variant that also takes the buffer size 2023-04-26 09:48:15 -04:00
python cuda: Fix GpuMat::copyTo and GpuMat::converTo python bindings 2023-08-01 15:09:37 +03:00
stitching Merge pull request #23740 from Peekabooc:4.x 2023-06-09 13:40:02 +03:00
ts cuda: add SkipTestException handling 2023-07-17 18:03:40 +03:00
video Warning supression fix for XCode 13.1 and newer. Backport #23203 2023-02-06 11:12:05 +03:00
videoio Merge pull request #24133 from alexlyulkov:al/fixed-msmf-webcam 2023-08-10 11:48:38 +03:00
world cmake: VERSION_GREATER_EQUAL is not supported in CMake 3.5.1 2022-12-26 17:41:53 +00:00