opencv/modules/dnn/perf
Wanli 6ee71fee88
Merge pull request #24547 from WanliZhong:refactor_conv_perf_test
Classify and extend convolution and depthwise performance tests #24547

This PR aims to:
1. Extend the test cases with convolutions taken from these models: `YOLOv5`, `YOLOv8`, `EfficientNet`, `YOLOX`, `YuNet`, `SFace`, `MPPalm`, `MPHand`, `MPPose`, `ViTTrack`, `PPOCRv3`, `CRNN`, `PPHumanSeg`. (371 new test cases are added.)

2. Classify the existing convolution performance tests into the following cases:
    - CONV_1x1
    - CONV_3x3_S1_D1 (Winograd)
    - CONV
    - DEPTHWISE

3. Reduce unnecessary test cases by following three rules (366 test cases are pruned):
    - (i) When tests differ only in pad- and bias-related parameters and all other parameters are the same, only one case is kept.
    - (ii) When tests differ only in the channel count of the input shape and all other parameters are the same, only one case is kept in each range `[1, 3], [4, 7], [8, 15], [16, 31], [32, 63], [64, 127], [128, 255], [256, 511], [512, 1023], [1024, 2047], [2048, 4095]`.
    - (iii) When tests differ only in the width and height of the input shape and all other parameters are the same, only one case is kept in each range `[1, 31], [32, 63], [64, 95]...`
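
The classification in item 2 and the pruning rules in item 3 can be sketched in Python as below. This is an illustrative sketch only, not the script linked from this PR: `ConvCase`, `classify`, `channel_bucket`, `spatial_bucket` and `prune` are hypothetical names, the depthwise test (one group per input channel) is an assumption, and the three rules are folded into a single key, which may prune slightly more than applying each rule on its own.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ConvCase:
    """Hypothetical record of one convolution configuration (not an OpenCV type)."""
    kernel: Tuple[int, int]
    in_channels: int
    out_channels: int
    height: int
    width: int
    groups: int = 1
    stride: Tuple[int, int] = (1, 1)
    dilation: Tuple[int, int] = (1, 1)
    pad: Tuple[int, int] = (0, 0)
    has_bias: bool = False


def classify(case: ConvCase) -> str:
    """Map a case to one of the four groups from item 2."""
    # Assumption: "depthwise" means one group per input channel.
    if case.groups > 1 and case.groups == case.in_channels:
        return "DEPTHWISE"
    if case.kernel == (1, 1):
        return "CONV_1x1"
    if case.kernel == (3, 3) and case.stride == (1, 1) and case.dilation == (1, 1):
        return "CONV_3x3_S1_D1"
    return "CONV"


def channel_bucket(channels: int) -> int:
    """Rule (ii): [1, 3] -> 1, [4, 7] -> 2, [8, 15] -> 3, ..."""
    return max(1, channels.bit_length() - 1)


def spatial_bucket(size: int) -> int:
    """Rule (iii): [1, 31] -> 0, [32, 63] -> 1, [64, 95] -> 2, ..."""
    return size // 32


def prune(cases: Iterable[ConvCase]) -> List[ConvCase]:
    """Keep the first case seen for each combined bucket key."""
    kept: Dict[tuple, ConvCase] = {}
    for case in cases:
        # Rule (i): pad and has_bias are deliberately left out of the key.
        key = (
            case.kernel, case.stride, case.dilation, case.groups,
            case.out_channels,
            channel_bucket(case.in_channels),
            spatial_bucket(case.height),
            spatial_bucket(case.width),
        )
        kept.setdefault(key, case)
    return list(kept.values())
```

Under these assumptions, two 3x3 cases with 16 and 24 input channels on 56x56 and 40x60 inputs share one key and collapse to a single case, while a case with 32 input channels falls into the next channel range and is kept.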

> **To reproduce**: For item 1, follow the steps in https://github.com/alalek/opencv/commit/dnn_dump_conv_kernels to dump all convolution cases from the new models (the declared FLOPs may not be correct and need to be checked manually). For items 2 and 3, use the Python script [classify conv.txt](https://github.com/opencv/opencv/files/13522228/classify.conv.txt).


**Performance test result on Apple M2**

**Test result details**:  [M2.md](https://github.com/opencv/opencv/files/13379189/M2.md)

**Additional test result details with FP16**:  [m2_results_with_fp16.zip](https://github.com/opencv/opencv/files/13491070/m2_results_with_fp16.zip)


**Brief summary for 4.8.1 vs 4.7.0 or 4.6.0**: 
1. `CONV_1x1_S1_D1` performance dropped significantly with small or large input shapes.
2. `DEPTHWISE_5x5` performance dropped slightly compared with 4.7.0.

---

**Performance test result on [Intel Core i7-12700K](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html)**: 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads.

**Test result details**: [INTEL.md](https://github.com/opencv/opencv/files/13374093/INTEL.md)

**Brief summary for 4.8.1 vs 4.5.5**: 
1. `CONV_5x5_S1_D1` performance dropped significantly.
2. `CONV_1x1_S1_D1`, `CONV_3x3_S1_D1`, `DEPTHWISE_3x3_S1_D1`, `DEPTHWISE_3x3_S2_D1` performance dropped with small input shapes.

---

TODO:
- [x] Perform tests on ARM with each OpenCV version
- [x] Perform tests on x86 with each OpenCV version
- [x] Split each test classification into a single test config
- [x] Test with FP16 enabled
2023-12-11 21:35:33 +03:00

| File | Last commit | Date |
| --- | --- | --- |
| perf_caffe.cpp | Merge pull request #24120 from dkurt:actualize_dnn_links | 2023-08-16 15:46:11 +03:00 |
| perf_common.cpp | cmake: fix build of dnn tests with shared common code | 2019-03-31 08:52:25 +00:00 |
| perf_convolution1d.cpp | Merge pull request #18783 from sl-sergei:fix_conv1d | 2020-11-13 22:22:10 +00:00 |
| perf_convolution3d.cpp | Merge remote-tracking branch 'upstream/3.4' into merge-3.4 | 2020-11-13 22:29:14 +00:00 |
| perf_convolution.cpp | Merge pull request #24547 from WanliZhong:refactor_conv_perf_test | 2023-12-11 21:35:33 +03:00 |
| perf_einsum.cpp | Merge pull request #24509 from Abdurrahheem:ash/dev_einsum_fast_gemm | 2023-11-16 16:20:17 +03:00 |
| perf_gemm.cpp | Merge pull request #24309 from dkurt:gemm_ov_hotfix | 2023-09-27 10:25:45 +03:00 |
| perf_layer.cpp | Merge pull request #24378 from fengyuentau:instance_norm | 2023-11-07 12:59:10 +03:00 |
| perf_main.cpp | Merge pull request #11897 from Jakub-Golinowski:hpx_backend | 2018-08-31 16:23:26 +03:00 |
| perf_net.cpp | Merge pull request #24298 from WanliZhong:extend_perf_net_test | 2023-10-04 13:05:32 +03:00 |
| perf_precomp.hpp | dnn(perf): fix and merge Convolution tests | 2018-08-31 15:02:19 +03:00 |
| perf_recurrent.cpp | Merge pull request #20658 from smbz:lstm_optimisation | 2021-11-29 21:43:00 +00:00 |