opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-15 18:09:11 +08:00

History

Yuantao Feng 23b244d3a3 Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp

dnn: optimize activations with v_exp #25881

Merge with https://github.com/opencv/opencv_extra/pull/1191.

This PR optimizes the following activations:

- [x] Swish
- [x] Mish
- [x] Elu
- [x] Celu
- [x] Selu
- [x] HardSwish

### Performance (Updated on 2024-07-18)

#### AmLogic A311D2 (ARM Cortex A73 + A53)

```
Geometric mean (ms)

Name of Test                               activations  activations.patch  activations.patch vs activations (x-factor)
Celu::Layer_Elementwise::OCV/CPU           115.859      27.930             4.15
Elu::Layer_Elementwise::OCV/CPU            27.846       27.003             1.03
Gelu::Layer_Elementwise::OCV/CPU           0.657        0.602              1.09
HardSwish::Layer_Elementwise::OCV/CPU      31.885       6.781              4.70
Mish::Layer_Elementwise::OCV/CPU           35.729       32.089             1.11
Selu::Layer_Elementwise::OCV/CPU           61.955       27.850             2.22
Swish::Layer_Elementwise::OCV/CPU          30.819       26.688             1.15
```

#### Apple M1

```
Geometric mean (ms)

Name of Test                               activations  activations.patch  activations.patch vs activations (x-factor)
Celu::Layer_Elementwise::OCV/CPU           16.184       2.118              7.64
Celu::Layer_Elementwise::OCV/CPU_FP16      16.280       2.123              7.67
Elu::Layer_Elementwise::OCV/CPU            9.123        1.878              4.86
Elu::Layer_Elementwise::OCV/CPU_FP16       9.085        1.897              4.79
Gelu::Layer_Elementwise::OCV/CPU           0.089        0.081              1.11
Gelu::Layer_Elementwise::OCV/CPU_FP16      0.086        0.074              1.17
HardSwish::Layer_Elementwise::OCV/CPU      1.560        1.555              1.00
HardSwish::Layer_Elementwise::OCV/CPU_FP16 1.536        1.523              1.01
Mish::Layer_Elementwise::OCV/CPU           6.077        2.476              2.45
Mish::Layer_Elementwise::OCV/CPU_FP16      5.990        2.496              2.40
Selu::Layer_Elementwise::OCV/CPU           11.351       1.976              5.74
Selu::Layer_Elementwise::OCV/CPU_FP16      11.533       1.985              5.81
Swish::Layer_Elementwise::OCV/CPU          4.687        1.890              2.48
Swish::Layer_Elementwise::OCV/CPU_FP16     4.715        1.873              2.52
```

#### Intel i7-12700K

```
Geometric mean (ms)

Name of Test                               activations  activations.patch  activations.patch vs activations (x-factor)
Celu::Layer_Elementwise::OCV/CPU           17.106       3.560              4.81
Elu::Layer_Elementwise::OCV/CPU            5.064        3.478              1.46
Gelu::Layer_Elementwise::OCV/CPU           0.036        0.035              1.04
HardSwish::Layer_Elementwise::OCV/CPU      2.914        2.893              1.01
Mish::Layer_Elementwise::OCV/CPU           3.820        3.529              1.08
Selu::Layer_Elementwise::OCV/CPU           10.799       3.593              3.01
Swish::Layer_Elementwise::OCV/CPU          3.651        3.473              1.05
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

2024-07-19 16:03:19 +03:00
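For readers unfamiliar with the activations named in the commit message, the sketch below gives their standard scalar definitions in C++. It is a reference illustration only and not the code from the PR: the PR vectorizes these with OpenCV's `v_exp` intrinsic, whereas this sketch uses plain `std::exp`. The function names and the default `alpha`/`gamma` values (the commonly used ONNX defaults) are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical scalar reference definitions (not OpenCV's implementation).

// Swish: x * sigmoid(x)
inline float swish(float x) { return x / (1.0f + std::exp(-x)); }

// Mish: x * tanh(softplus(x)), with softplus(x) = log(1 + exp(x))
inline float mish(float x) { return x * std::tanh(std::log1p(std::exp(x))); }

// ELU: identity for x >= 0, alpha * (exp(x) - 1) otherwise
inline float elu(float x, float alpha = 1.0f) {
    return x >= 0.0f ? x : alpha * (std::exp(x) - 1.0f);
}

// CELU: max(0, x) + min(0, alpha * (exp(x / alpha) - 1))
inline float celu(float x, float alpha = 1.0f) {
    return std::max(0.0f, x) +
           std::min(0.0f, alpha * (std::exp(x / alpha) - 1.0f));
}

// SELU: gamma * (x for x > 0, alpha * (exp(x) - 1) otherwise)
// Default constants are assumed here, following the usual ONNX values.
inline float selu(float x, float alpha = 1.67326319f, float gamma = 1.05070102f) {
    return gamma * (x > 0.0f ? x : alpha * (std::exp(x) - 1.0f));
}

// HardSwish: x * max(0, min(1, x / 6 + 0.5)); piecewise linear, no exp needed
inline float hardswish(float x) {
    return x * std::max(0.0f, std::min(1.0f, x / 6.0f + 0.5f));
}
```

All of these except HardSwish evaluate `exp` once per element, which is consistent with the benchmark tables above showing the largest gains where the exponential dominates the cost.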
..
perf_caffe.cpp	Merge pull request #24120 from dkurt:actualize_dnn_links	2023-08-16 15:46:11 +03:00
perf_common.cpp	cmake: fix build of dnn tests with shared common code	2019-03-31 08:52:25 +00:00
perf_convolution1d.cpp	Merge pull request #18783 from sl-sergei:fix_conv1d	2020-11-13 22:22:10 +00:00
perf_convolution3d.cpp	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2020-11-13 22:29:14 +00:00
perf_convolution.cpp	dnn(test): skip very long debug tests, reduce test time	2023-12-25 08:44:06 +00:00
perf_einsum.cpp	Merge pull request #24509 from Abdurrahheem:ash/dev_einsum_fast_gemm	2023-11-16 16:20:17 +03:00
perf_gemm.cpp	Merge pull request #24694 from fengyuentau:matmul_refactor	2023-12-19 19:36:41 +03:00
perf_layer.cpp	Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp	2024-07-19 16:03:19 +03:00
perf_main.cpp	Merge pull request #11897 from Jakub-Golinowski:hpx_backend	2018-08-31 16:23:26 +03:00
perf_net.cpp	Fix proto and weights mess in dnn performance tests.	2024-02-07 09:16:09 +03:00
perf_precomp.hpp	dnn(perf): fix and merge Convolution tests	2018-08-31 15:02:19 +03:00
perf_recurrent.cpp	Merge pull request #20658 from smbz:lstm_optimisation	2021-11-29 21:43:00 +00:00