Commit Graph

18 Commits

Author SHA1 Message Date
YooLc
f20facc60a
Merge pull request #27060 from YooLc:hal-rvv-integral
[hal_rvv] Add cv::integral implementation and more types of input for test #27060

This patch introduces an RVV-optimized implementation of `cv::integral()` in hal_rvv, along with performance and accuracy tests for all valid input/output type combinations specified in `modules/imgproc/src/hal_replacement.hpp`:
2a8d4b8e43/modules/imgproc/src/hal_replacement.hpp (L960-L974)

The vectorized prefix sum algorithm follows the approach described in [Prefix Sum with SIMD - Algorithmica](https://en.algorithmica.org/hpc/algorithms/prefix/).

I intentionally omitted support for the following cases by returning `CV_HAL_ERROR_NOT_IMPLEMENTED`, as they are harder to implement or show limited performance gains:
1. **Tilted Sum**: The data access pattern for tilted sums requires multi-row operations, making effective vectorization difficult.
2. **3-channel images (`cn == 3`)**: Current implementation requires `VLEN/SEW` (a.k.a. number of elements in a vector register) to be a multiple of channel count, which 3-channel formats typically cannot satisfy.
    - Support for 1, 2 and 4 channel images is implemented
4. **Small images (`!(width >> 8 || height >> 8)`)**: The scalar implementation demonstrates better performance for images with limited dimensions. 
    - This is the same as `3rdparty/ndsrvp/src/integral.cpp` 09c71aed14/3rdparty/ndsrvp/src/integral.cpp (L24-L26)

Test configuration:

- Platform: SpacemiT Muse Pi (K1 @ 1.60 Ghz)
- Toolchain: GCC 14.2.0
- `integral_sqsum_full` test is disabled by default, so `--gtest_also_run_disabled_tests` is needed

Test results:

```plaintext
Geometric mean (ms)

                                     Name of Test                                       imgproc-gcc-scalar imgproc-gcc-hal  imgproc-gcc-hal  
                                                                                                                                   vs        
                                                                                                                           imgproc-gcc-scalar
                                                                                                                               (x-factor)      
integral::Size_MatType_OutMatDepth::(640x480, 8UC1, CV_32F)                                   1.973             1.415             1.39       
integral::Size_MatType_OutMatDepth::(640x480, 8UC1, CV_32S)                                   1.343             1.351             0.99       
integral::Size_MatType_OutMatDepth::(640x480, 8UC1, CV_64F)                                   2.021             2.756             0.73       
integral::Size_MatType_OutMatDepth::(640x480, 8UC2, CV_32F)                                   4.695             2.874             1.63       
integral::Size_MatType_OutMatDepth::(640x480, 8UC2, CV_32S)                                   4.028             2.801             1.44       
integral::Size_MatType_OutMatDepth::(640x480, 8UC2, CV_64F)                                   5.965             4.926             1.21       
integral::Size_MatType_OutMatDepth::(640x480, 8UC4, CV_32F)                                   9.970             4.440             2.25       
integral::Size_MatType_OutMatDepth::(640x480, 8UC4, CV_32S)                                   7.934             4.244             1.87       
integral::Size_MatType_OutMatDepth::(640x480, 8UC4, CV_64F)                                   14.696            8.431             1.74       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC1, CV_32F)                                  5.949             4.108             1.45       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC1, CV_32S)                                  4.064             4.080             1.00       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC1, CV_64F)                                  6.137             7.975             0.77       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC2, CV_32F)                                  13.896            8.721             1.59       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC2, CV_32S)                                  10.948            8.513             1.29       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC2, CV_64F)                                  18.046           15.234             1.18       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC4, CV_32F)                                  35.105           13.778             2.55       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC4, CV_32S)                                  27.135           13.417             2.02       
integral::Size_MatType_OutMatDepth::(1280x720, 8UC4, CV_64F)                                  43.477           25.616             1.70       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC1, CV_32F)                                 13.386            9.281             1.44       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC1, CV_32S)                                 9.159             9.194             1.00       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC1, CV_64F)                                 13.776           17.836             0.77       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC2, CV_32F)                                 31.943           19.435             1.64       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC2, CV_32S)                                 24.747           18.946             1.31       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC2, CV_64F)                                 35.925           33.943             1.06       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC4, CV_32F)                                 66.493           29.692             2.24       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC4, CV_32S)                                 54.737           28.250             1.94       
integral::Size_MatType_OutMatDepth::(1920x1080, 8UC4, CV_64F)                                 91.880           57.495             1.60            
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC1, CV_32F)                             4.384             4.016             1.09       
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC1, CV_32S)                             3.676             3.960             0.93       
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC1, CV_64F)                             5.620             5.224             1.08       
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC2, CV_32F)                             9.971             7.696             1.30       
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC2, CV_32S)                             8.934             7.632             1.17       
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC2, CV_64F)                             9.927             9.759             1.02       
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC4, CV_32F)                             21.556           12.288             1.75       
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC4, CV_32S)                             21.261           12.089             1.76       
integral_sqsum::Size_MatType_OutMatDepth::(640x480, 8UC4, CV_64F)                             23.989           16.278             1.47       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC1, CV_32F)                            15.232           11.752             1.30       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC1, CV_32S)                            12.976           11.721             1.11       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC1, CV_64F)                            16.450           15.627             1.05       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC2, CV_32F)                            25.932           23.243             1.12       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC2, CV_32S)                            24.750           23.019             1.08       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC2, CV_64F)                            28.228           29.605             0.95       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC4, CV_32F)                            61.665           37.477             1.65       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC4, CV_32S)                            61.536           37.126             1.66       
integral_sqsum::Size_MatType_OutMatDepth::(1280x720, 8UC4, CV_64F)                            73.989           48.994             1.51       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC1, CV_32F)                           49.640           26.529             1.87       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC1, CV_32S)                           35.869           26.417             1.36       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC1, CV_64F)                           34.378           35.056             0.98       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC2, CV_32F)                           82.138           52.661             1.56       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC2, CV_32S)                           54.644           52.089             1.05       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC2, CV_64F)                           75.073           66.670             1.13       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC4, CV_32F)                          143.283           83.943             1.71       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC4, CV_32S)                          156.851           82.378             1.90       
integral_sqsum::Size_MatType_OutMatDepth::(1920x1080, 8UC4, CV_64F)                          521.594           111.375            4.68            
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC1, DEPTH_32F_32F))          3.529             2.787             1.27       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC1, DEPTH_32F_64F))          4.396             3.998             1.10       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC1, DEPTH_32S_32F))          3.229             2.774             1.16       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC1, DEPTH_32S_32S))          2.945             2.780             1.06       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC1, DEPTH_32S_64F))          3.857             3.995             0.97       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC1, DEPTH_64F_64F))          5.872             5.228             1.12       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (16UC1, DEPTH_64F_64F))         6.075             5.277             1.15       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (16SC1, DEPTH_64F_64F))         5.680             5.296             1.07       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC1, DEPTH_32F_32F))         3.355             2.896             1.16       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC1, DEPTH_32F_64F))         4.183             4.000             1.05       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC1, DEPTH_64F_64F))         6.237             5.143             1.21       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (64FC1, DEPTH_64F_64F))         4.753             4.783             0.99       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC2, DEPTH_32F_32F))          8.021             5.793             1.38       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC2, DEPTH_32F_64F))          9.963             7.704             1.29       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC2, DEPTH_32S_32F))          7.864             5.720             1.37       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC2, DEPTH_32S_32S))          7.141             5.699             1.25       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC2, DEPTH_32S_64F))          9.228             7.646             1.21       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC2, DEPTH_64F_64F))          9.940             9.759             1.02       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (16UC2, DEPTH_64F_64F))         10.606            9.716             1.09       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (16SC2, DEPTH_64F_64F))         9.933             9.751             1.02       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC2, DEPTH_32F_32F))         7.986             5.962             1.34       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC2, DEPTH_32F_64F))         9.243             7.598             1.22       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC2, DEPTH_64F_64F))         10.573            9.425             1.12       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (64FC2, DEPTH_64F_64F))         11.029            8.977             1.23       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC4, DEPTH_32F_32F))          17.236            8.881             1.94       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC4, DEPTH_32F_64F))          20.905           12.322             1.70       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC4, DEPTH_32S_32F))          16.011            8.666             1.85       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC4, DEPTH_32S_32S))          15.932            8.507             1.87       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC4, DEPTH_32S_64F))          20.713           12.115             1.71       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (8UC4, DEPTH_64F_64F))          23.953           16.284             1.47       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (16UC4, DEPTH_64F_64F))         25.127           16.341             1.54       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (16SC4, DEPTH_64F_64F))         24.950           16.441             1.52       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC4, DEPTH_32F_32F))         17.261            8.906             1.94       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC4, DEPTH_32F_64F))         21.944           12.073             1.82       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (32FC4, DEPTH_64F_64F))         25.921           15.539             1.67       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(640x480, (64FC4, DEPTH_64F_64F))         27.938           14.824             1.88       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC1, DEPTH_32F_32F))         11.156            8.260             1.35       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC1, DEPTH_32F_64F))         14.777           11.869             1.24       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC1, DEPTH_32S_32F))         9.693             8.221             1.18       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC1, DEPTH_32S_32S))         9.023             8.256             1.09       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC1, DEPTH_32S_64F))         13.276           11.821             1.12       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC1, DEPTH_64F_64F))         15.406           15.618             0.99       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (16UC1, DEPTH_64F_64F))        16.799           15.749             1.07       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (16SC1, DEPTH_64F_64F))        15.054           15.806             0.95       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC1, DEPTH_32F_32F))        10.055            7.999             1.26       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC1, DEPTH_32F_64F))        13.506           11.253             1.20       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC1, DEPTH_64F_64F))        14.952           15.021             1.00       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (64FC1, DEPTH_64F_64F))        13.761           14.002             0.98       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC2, DEPTH_32F_32F))         22.677           17.330             1.31       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC2, DEPTH_32F_64F))         26.283           23.237             1.13       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC2, DEPTH_32S_32F))         20.126           17.118             1.18       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC2, DEPTH_32S_32S))         19.337           17.041             1.13       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC2, DEPTH_32S_64F))         24.973           23.004             1.09       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC2, DEPTH_64F_64F))         29.959           29.585             1.01       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (16UC2, DEPTH_64F_64F))        33.598           29.599             1.14       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (16SC2, DEPTH_64F_64F))        46.213           29.741             1.55       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC2, DEPTH_32F_32F))        33.077           17.556             1.88       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC2, DEPTH_32F_64F))        33.960           22.991             1.48       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC2, DEPTH_64F_64F))        41.792           28.803             1.45       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (64FC2, DEPTH_64F_64F))        34.660           28.532             1.21       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC4, DEPTH_32F_32F))         52.989           27.659             1.92       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC4, DEPTH_32F_64F))         62.418           37.515             1.66       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC4, DEPTH_32S_32F))         50.902           27.310             1.86       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC4, DEPTH_32S_32S))         47.301           27.019             1.75       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC4, DEPTH_32S_64F))         61.982           37.140             1.67       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (8UC4, DEPTH_64F_64F))         79.403           49.041             1.62       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (16UC4, DEPTH_64F_64F))        86.550           49.180             1.76       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (16SC4, DEPTH_64F_64F))        85.715           49.468             1.73       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC4, DEPTH_32F_32F))        63.932           28.019             2.28       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC4, DEPTH_32F_64F))        68.180           36.858             1.85       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (32FC4, DEPTH_64F_64F))        83.063           46.483             1.79       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1280x720, (64FC4, DEPTH_64F_64F))        91.990           44.545             2.07       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC1, DEPTH_32F_32F))        25.503           18.609             1.37       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC1, DEPTH_32F_64F))        29.544           26.635             1.11       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC1, DEPTH_32S_32F))        22.581           18.514             1.22       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC1, DEPTH_32S_32S))        20.860           18.547             1.12       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC1, DEPTH_32S_64F))        26.046           26.373             0.99       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC1, DEPTH_64F_64F))        34.831           34.997             1.00       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (16UC1, DEPTH_64F_64F))       36.428           35.214             1.03       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (16SC1, DEPTH_64F_64F))       32.435           35.314             0.92       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC1, DEPTH_32F_32F))       22.548           18.845             1.20       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC1, DEPTH_32F_64F))       28.589           25.790             1.11       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC1, DEPTH_64F_64F))       32.625           33.791             0.97       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (64FC1, DEPTH_64F_64F))       30.158           31.889             0.95       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC2, DEPTH_32F_32F))        53.374           38.938             1.37       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC2, DEPTH_32F_64F))        73.892           52.747             1.40       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC2, DEPTH_32S_32F))        47.392           38.572             1.23       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC2, DEPTH_32S_32S))        45.638           38.225             1.19       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC2, DEPTH_32S_64F))        69.966           52.156             1.34       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC2, DEPTH_64F_64F))        68.560           66.963             1.02       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (16UC2, DEPTH_64F_64F))       71.487           65.420             1.09       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (16SC2, DEPTH_64F_64F))       68.127           65.718             1.04       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC2, DEPTH_32F_32F))       72.967           39.987             1.82       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC2, DEPTH_32F_64F))       63.933           51.408             1.24       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC2, DEPTH_64F_64F))       73.334           63.354             1.16       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (64FC2, DEPTH_64F_64F))       80.983           60.778             1.33       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC4, DEPTH_32F_32F))       116.981           59.908             1.95       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC4, DEPTH_32F_64F))       155.085           83.974             1.85       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC4, DEPTH_32S_32F))       109.567           58.525             1.87       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC4, DEPTH_32S_32S))       105.457           57.124             1.85       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC4, DEPTH_32S_64F))       157.325           82.485             1.91       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (8UC4, DEPTH_64F_64F))       265.776           111.577            2.38       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (16UC4, DEPTH_64F_64F))      585.218           110.583            5.29       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (16SC4, DEPTH_64F_64F))      585.418           111.302            5.26       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC4, DEPTH_32F_32F))      126.456           60.415             2.09       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC4, DEPTH_32F_64F))      169.278           81.460             2.08       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (32FC4, DEPTH_64F_64F))      281.256           104.732            2.69       
integral_sqsum_full::Size_MatType_OutMatDepthArray::(1920x1080, (64FC4, DEPTH_64F_64F))      620.885           99.953             6.21       
```

The vectorized implementation shows progressively better acceleration for larger image sizes and higher channel counts, achieving up to 6.21× speedup for 64FC4 (1920×1080) inputs with `DEPTH_64F_64F` configuration.

This is my first time proposing patch for the OpenCV Project 🥹, if there's anything that can be improved, please tell me.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2025-04-21 09:50:13 +03:00
Sean McBride
5fb3869775
Merge pull request #23109 from seanm:misc-warnings
* Fixed clang -Wnewline-eof warnings
* Fixed all trivial clang -Wextra-semi and -Wc++98-compat-extra-semi warnings
* Removed trailing semi from various macros
* Fixed various -Wunused-macros warnings
* Fixed some trivial -Wdocumentation warnings
* Fixed some -Wdocumentation-deprecated-sync warnings
* Fixed incorrect indentation
* Suppressed some clang warnings in 3rd party code
* Fixed QRCodeEncoder::Params documentation.

---------

Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>
2023-10-06 13:33:21 +03:00
Alexander Alekhin
fd09413566
Merge pull request #16731 from alalek:issue_16708
* imgproc(integral): avoid OOB access

* imgproc(test): fix integral perf check

- FP32 computation is not accurate

* imgproc(integral): tune loop limits
2020-03-04 19:28:04 +00:00
Everton Constantino
76e403cf25 Merge pull request #15440 from everton1984:new_integral_tests
* Adding all possible data type interactions to the perf tests since some
use SIMD acceleration and others do not.

* Disabling full tests by default.

* Giving proper names, removing magic numbers and sanity checks of new
performance tests for the integral function.

* Giving proper names, making array static.
2019-09-04 19:14:00 +03:00
Brad Kelly
507f8add1c Implementing AVX512 Support for 2 and 4 channel mats for CV_64F format 2019-02-19 11:31:20 -08:00
Brad Kelly
0165ffa90d Implementing AVX512 support for 3 channel cv::integral for CV_64F 2019-01-14 16:11:01 -08:00
Alexander Alekhin
4a297a2443 ts: refactor OpenCV tests
- removed tr1 usage (dropped in C++17)
- moved includes of vector/map/iostream/limits into ts.hpp
- require opencv_test + anonymous namespace (added compile check)
- fixed norm() usage (must be from cvtest::norm for checks) and other conflict functions
- added missing license headers
2018-02-03 19:39:47 +00:00
Anna Kogan
ed4c687d45 Increased time limits, eliminated extra params 2013-01-31 16:18:52 +04:00
Anna Kogan
8877066846 increased time-limits 2012-11-28 18:17:33 +04:00
OpenCV Buildbot
04384a71e4 Normalize line endings and whitespace 2012-10-17 15:32:23 +04:00
Andrey Kamaev
d46659f6cc Set sanity thresholds for several performance tests 2012-04-20 15:33:34 +00:00
Andrey Kamaev
65f5343ed5 Refactored performance tests. TEST_CYCLE macro is renamed to TEST_CYCLE_N; SIMPLE_TEST_CYCLE is renamed to TEST_CYCLE; from now 100 iterations are default for performance tests 2011-12-29 16:46:16 +00:00
Kirill Kornyakov
7113122312 integral perf tests renamed for readability 2011-11-07 13:06:48 +00:00
Kirill Kornyakov
7cd4761fc8 integral perf tests renamed for readability 2011-11-07 13:05:52 +00:00
Andrey Kamaev
583ceef6a5 Terga optimization for integral_8u32s 2011-10-31 08:00:20 +00:00
Kirill Kornyakov
20e49c8286 added perf test for cv::erode 2011-10-06 12:42:14 +00:00
Andrey Pavlenko
b58b04a31c Perf Tests: updates to cvtColor() & integral() perf tests 2011-09-07 12:50:56 +00:00
Andrey Kamaev
5cce038958 Performance testing branch is merged back into trunk 2011-09-06 15:30:28 +00:00