opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-03 08:19:52 +08:00

Wanli 6ee71fee88 Merge pull request #24547 from WanliZhong:refactor_conv_perf_test

Classify and extend convolution and depthwise performance tests #24547

This PR aims to:

1. Extend the test cases with layers from these models: `YOLOv5`, `YOLOv8`, `EfficientNet`, `YOLOX`, `YuNet`, `SFace`, `MPPalm`, `MPHand`, `MPPose`, `ViTTrack`, `PPOCRv3`, `CRNN`, `PPHumanSeg`. (371 new test cases are added.)
2. Classify the existing convolution performance tests into the following cases:
   - CONV_1x1
   - CONV_3x3_S1_D1 (winograd)
   - CONV
   - DEPTHWISE
3. Reduce unnecessary test cases by following three rules (366 test cases are pruned):
   - (i) When test cases are identical in every parameter except the pad- and bias-related ones, only one case is kept.
   - (ii) When the only difference is the channel count of the input shape and all other parameters are the same, only one case is kept in each range `[1, 3], [4, 7], [8, 15], [16, 31], [32, 63], [64, 127], [128, 255], [256, 511], [512, 1023], [1024, 2047], [2048, 4095]`.
   - (iii) When the only difference is the width and height of the input shape and all other parameters are the same, only one case is kept in each range `[1, 31], [32, 63], [64, 95]...`

> Reproduced:

1. Follow the steps in https://github.com/alalek/opencv/commit/dnn_dump_conv_kernels to dump all convolution cases from the new models. (The declared FLOPs may not be right and need to be checked manually.)

2 and 3. Use the Python script [classify conv.txt](https://github.com/opencv/opencv/files/13522228/classify.conv.txt).

Performance test result on Apple M2

Test result details: [M2.md](https://github.com/opencv/opencv/files/13379189/M2.md)

Additional test result details with FP16: [m2_results_with_fp16.zip](https://github.com/opencv/opencv/files/13491070/m2_results_with_fp16.zip)

Brief summary for 4.8.1 vs 4.7.0 or 4.6.0:

1. `CONV_1x1_S1_D1` performance dropped significantly with small or large input shapes.
2. `DEPTHWISE_5x5` performance dropped a little compared with 4.7.0.

---

Performance test result on [Intel Core i7-12700K](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html): 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads.

Test result details: [INTEL.md](https://github.com/opencv/opencv/files/13374093/INTEL.md)

Brief summary for 4.8.1 vs 4.5.5:

1. `CONV_5x5_S1_D1` performance dropped significantly.
2. `CONV_1x1_S1_D1`, `CONV_3x3_S1_D1`, `DEPTHWISE_3x3_S1_D1`, `DEPTHWISE_3x3_S2_D1` performance dropped with small input shapes.

---

TODO:

- [x] Perform tests on ARM with each OpenCV version
- [x] Perform tests on x86 with each OpenCV version
- [x] Split each test classification into a single test config
- [x] Test with FP16 enabled

2023-12-11 21:35:33 +03:00
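Pruning rules (i) to (iii) in the commit message above amount to grouping test cases by a key and keeping one case per group. The following is a minimal sketch of that idea, not the linked classification script: the `ConvCase` record and its field names are hypothetical, and folding all three rules into one key (with height and width bucketed independently) is an assumption made for brevity.

```python
from typing import NamedTuple, Tuple


class ConvCase(NamedTuple):
    # Hypothetical record; field names are illustrative only and are
    # not taken from perf_convolution.cpp or the linked script.
    kernel: Tuple[int, int]
    in_channels: int
    height: int
    width: int
    out_channels: int
    stride: Tuple[int, int]
    dilation: Tuple[int, int]
    groups: int
    pad: Tuple[int, int]
    has_bias: bool


def channel_bucket(c: int) -> int:
    """Index of the range [1,3], [4,7], [8,15], ... that contains c."""
    # Power-of-two buckets, with 1 merged into the [2,3] bucket.
    return max(c.bit_length() - 1, 1)


def spatial_bucket(s: int) -> int:
    """Index of the range [1,31], [32,63], [64,95], ... that contains s."""
    return s // 32


def dedup_key(case: ConvCase) -> tuple:
    # Rule (i): pad and has_bias are left out of the key.
    # Rule (ii): input channels are replaced by their bucket index.
    # Rule (iii): height and width are replaced by their bucket indices
    # (bucketing each dimension separately is an assumption).
    return (
        case.kernel,
        channel_bucket(case.in_channels),
        spatial_bucket(case.height),
        spatial_bucket(case.width),
        case.out_channels,
        case.stride,
        case.dilation,
        case.groups,
    )


def prune(cases):
    """Keep the first case seen for each key, preserving input order."""
    kept = {}
    for case in cases:
        kept.setdefault(dedup_key(case), case)
    return list(kept.values())
```

Because the key combines the three rules, two cases that differ in both padding and, say, channel count within one range are also merged. Applying each rule strictly on its own ("when the only difference is ...") could keep slightly more cases.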
perf_caffe.cpp	Merge pull request #24120 from dkurt:actualize_dnn_links	2023-08-16 15:46:11 +03:00
perf_common.cpp	cmake: fix build of dnn tests with shared common code	2019-03-31 08:52:25 +00:00
perf_convolution1d.cpp	Merge pull request #18783 from sl-sergei:fix_conv1d	2020-11-13 22:22:10 +00:00
perf_convolution3d.cpp	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2020-11-13 22:29:14 +00:00
perf_convolution.cpp	Merge pull request #24547 from WanliZhong:refactor_conv_perf_test	2023-12-11 21:35:33 +03:00
perf_einsum.cpp	Merge pull request #24509 from Abdurrahheem:ash/dev_einsum_fast_gemm	2023-11-16 16:20:17 +03:00
perf_gemm.cpp	Merge pull request #24309 from dkurt:gemm_ov_hotfix	2023-09-27 10:25:45 +03:00
perf_layer.cpp	Merge pull request #24378 from fengyuentau:instance_norm	2023-11-07 12:59:10 +03:00
perf_main.cpp	Merge pull request #11897 from Jakub-Golinowski:hpx_backend	2018-08-31 16:23:26 +03:00
perf_net.cpp	Merge pull request #24298 from WanliZhong:extend_perf_net_test	2023-10-04 13:05:32 +03:00
perf_precomp.hpp	dnn(perf): fix and merge Convolution tests	2018-08-31 15:02:19 +03:00
perf_recurrent.cpp	Merge pull request #20658 from smbz:lstm_optimisation	2021-11-29 21:43:00 +00:00