opencv/modules
Yuantao Feng 23b244d3a3
Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp
dnn: optimize activations with v_exp #25881

Merge with https://github.com/opencv/opencv_extra/pull/1191.

This PR optimizes the following activations:

- [x] Swish
- [x] Mish
- [x] Elu
- [x] Celu
- [x] Selu
- [x] HardSwish

### Performance (Updated on 2024-07-18)

#### AmLogic A311D2 (ARM Cortex A73 + A53)

```
Geometric mean (ms)

            Name of Test              activations activations.patch activations.patch
                                                                              vs
                                                                         activations
                                                                          (x-factor)
Celu::Layer_Elementwise::OCV/CPU        115.859          27.930              4.15
Elu::Layer_Elementwise::OCV/CPU          27.846          27.003              1.03
Gelu::Layer_Elementwise::OCV/CPU         0.657           0.602               1.09
HardSwish::Layer_Elementwise::OCV/CPU    31.885          6.781               4.70
Mish::Layer_Elementwise::OCV/CPU         35.729          32.089              1.11
Selu::Layer_Elementwise::OCV/CPU         61.955          27.850              2.22
Swish::Layer_Elementwise::OCV/CPU        30.819          26.688              1.15
```

#### Apple M1

```
Geometric mean (ms)

               Name of Test                activations activations.patch activations.patch
                                                                                   vs
                                                                              activations
                                                                               (x-factor)
Celu::Layer_Elementwise::OCV/CPU              16.184          2.118               7.64
Celu::Layer_Elementwise::OCV/CPU_FP16         16.280          2.123               7.67
Elu::Layer_Elementwise::OCV/CPU               9.123           1.878               4.86
Elu::Layer_Elementwise::OCV/CPU_FP16          9.085           1.897               4.79
Gelu::Layer_Elementwise::OCV/CPU              0.089           0.081               1.11
Gelu::Layer_Elementwise::OCV/CPU_FP16         0.086           0.074               1.17
HardSwish::Layer_Elementwise::OCV/CPU         1.560           1.555               1.00
HardSwish::Layer_Elementwise::OCV/CPU_FP16    1.536           1.523               1.01
Mish::Layer_Elementwise::OCV/CPU              6.077           2.476               2.45
Mish::Layer_Elementwise::OCV/CPU_FP16         5.990           2.496               2.40
Selu::Layer_Elementwise::OCV/CPU              11.351          1.976               5.74
Selu::Layer_Elementwise::OCV/CPU_FP16         11.533          1.985               5.81
Swish::Layer_Elementwise::OCV/CPU             4.687           1.890               2.48
Swish::Layer_Elementwise::OCV/CPU_FP16        4.715           1.873               2.52
```

#### Intel i7-12700K

```
Geometric mean (ms)

            Name of Test              activations activations.patch activations.patch
                                                                    vs
                                                               activations
                                                                (x-factor)
Celu::Layer_Elementwise::OCV/CPU        17.106       3.560         4.81
Elu::Layer_Elementwise::OCV/CPU          5.064       3.478         1.46
Gelu::Layer_Elementwise::OCV/CPU         0.036       0.035         1.04
HardSwish::Layer_Elementwise::OCV/CPU    2.914       2.893         1.01
Mish::Layer_Elementwise::OCV/CPU         3.820       3.529         1.08
Selu::Layer_Elementwise::OCV/CPU        10.799       3.593         3.01
Swish::Layer_Elementwise::OCV/CPU        3.651       3.473         1.05
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-19 16:03:19 +03:00
..
calib3d Use CV_LOG_DEBUG for debug logging in chessboard detector. 2024-07-15 16:11:27 +03:00
core Merge pull request #25883 from hanliutong:rvv-intrin-upgrade 2024-07-19 11:41:42 +03:00
dnn Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp 2024-07-19 16:03:19 +03:00
features2d Merge pull request #25424 from mshabunin:fix-features2d-test 2024-04-17 14:19:05 +03:00
flann Prevent signed integer overflow in LshTable 2024-05-24 23:47:36 +02:00
gapi Merge pull request #25817 from lamiayous:ly/extend_onnxrt_gapi_backend_handle_i32_i64_type 2024-07-12 11:38:43 +03:00
highgui highgui: Make GThread mandatory with GTK 2024-07-15 16:30:39 +02:00
imgcodecs Merge pull request #25931 from zihaomu:clean_code 2024-07-18 17:18:37 +03:00
imgproc Merge pull request #25898 from Octopus136:issue-25853 2024-07-19 09:08:19 +03:00
java Merge pull request #25856 from alexlyulkov:al/android-optional-kotlin 2024-07-04 13:26:37 +03:00
js Merge pull request #25757 from dkurt:d.kurtaev/opencv_js_tests_old_emsdk 2024-06-17 12:46:10 +03:00
ml Partially back-port #25075 to 4.x 2024-03-05 12:15:39 +03:00
objc fix: resolve Swift method name conflicts by adding missing namespace 2024-07-18 00:20:17 +08:00
objdetect Merge pull request #25722 from AleksandrPanov:update_testSeveralBoardsWithCustomIds 2024-06-06 20:01:33 +03:00
photo photo: doc: Fix window range for fastNlMeansDenoisingMulti 2024-06-21 21:04:22 +08:00
python Merge pull request #25792 from asmorkalov:as/HAL_fast_GaussianBlur 2024-07-12 15:03:33 +03:00
stitching Partially back-port #25075 to 4.x 2024-03-05 12:15:39 +03:00
ts Merge pull request #25792 from asmorkalov:as/HAL_fast_GaussianBlur 2024-07-12 15:03:33 +03:00
video Merge pull request #25771 from fengyuentau:vittrack_black_input 2024-06-18 12:48:28 +03:00
videoio throw() -> noexcept 2024-07-16 06:36:52 -07:00
world cmake: use /INCREMENTAL:NO with MSVS 2015 2023-12-07 19:46:27 +00:00