opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-15 09:49:13 +08:00

Yuantao Feng 23b244d3a3 Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp

dnn: optimize activations with v_exp #25881

Merge with https://github.com/opencv/opencv_extra/pull/1191.

This PR optimizes the following activations:

- [x] Swish
- [x] Mish
- [x] Elu
- [x] Celu
- [x] Selu
- [x] HardSwish

### Performance (Updated on 2024-07-18)

#### AmLogic A311D2 (ARM Cortex A73 + A53)

```
Geometric mean (ms)

Name of Test                           activations  activations.patch  activations.patch vs activations (x-factor)
Celu::Layer_Elementwise::OCV/CPU       115.859      27.930             4.15
Elu::Layer_Elementwise::OCV/CPU        27.846       27.003             1.03
Gelu::Layer_Elementwise::OCV/CPU       0.657        0.602              1.09
HardSwish::Layer_Elementwise::OCV/CPU  31.885       6.781              4.70
Mish::Layer_Elementwise::OCV/CPU       35.729       32.089             1.11
Selu::Layer_Elementwise::OCV/CPU       61.955       27.850             2.22
Swish::Layer_Elementwise::OCV/CPU      30.819       26.688             1.15
```

#### Apple M1

```
Geometric mean (ms)

Name of Test                                activations  activations.patch  activations.patch vs activations (x-factor)
Celu::Layer_Elementwise::OCV/CPU            16.184       2.118              7.64
Celu::Layer_Elementwise::OCV/CPU_FP16       16.280       2.123              7.67
Elu::Layer_Elementwise::OCV/CPU             9.123        1.878              4.86
Elu::Layer_Elementwise::OCV/CPU_FP16        9.085        1.897              4.79
Gelu::Layer_Elementwise::OCV/CPU            0.089        0.081              1.11
Gelu::Layer_Elementwise::OCV/CPU_FP16       0.086        0.074              1.17
HardSwish::Layer_Elementwise::OCV/CPU       1.560        1.555              1.00
HardSwish::Layer_Elementwise::OCV/CPU_FP16  1.536        1.523              1.01
Mish::Layer_Elementwise::OCV/CPU            6.077        2.476              2.45
Mish::Layer_Elementwise::OCV/CPU_FP16       5.990        2.496              2.40
Selu::Layer_Elementwise::OCV/CPU            11.351       1.976              5.74
Selu::Layer_Elementwise::OCV/CPU_FP16       11.533       1.985              5.81
Swish::Layer_Elementwise::OCV/CPU           4.687        1.890              2.48
Swish::Layer_Elementwise::OCV/CPU_FP16      4.715        1.873              2.52
```

#### Intel i7-12700K

```
Geometric mean (ms)

Name of Test                           activations  activations.patch  activations.patch vs activations (x-factor)
Celu::Layer_Elementwise::OCV/CPU       17.106       3.560              4.81
Elu::Layer_Elementwise::OCV/CPU        5.064        3.478              1.46
Gelu::Layer_Elementwise::OCV/CPU       0.036        0.035              1.04
HardSwish::Layer_Elementwise::OCV/CPU  2.914        2.893              1.01
Mish::Layer_Elementwise::OCV/CPU       3.820        3.529              1.08
Selu::Layer_Elementwise::OCV/CPU       10.799       3.593              3.01
Swish::Layer_Elementwise::OCV/CPU      3.651        3.473              1.05
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

2024-07-19 16:03:19 +03:00
cityscapes_semsegm_test_enet.py	Misc. modules/ typos	2018-02-12 07:09:43 -05:00
imagenet_cls_test_alexnet.py	change fcn8s-heavy-pascal tests from caffe to onnx	2024-05-03 00:15:09 +08:00
imagenet_cls_test_googlenet.py	Misc. modules/ typos	2018-02-12 07:09:43 -05:00
imagenet_cls_test_inception.py	fix 4.x links	2021-12-22 13:24:30 +00:00
npy_blob.cpp	dnn: fix precomp.hpp usage	2018-02-28 17:06:26 +03:00
npy_blob.hpp	dnn: fix precomp.hpp usage	2018-02-28 17:06:26 +03:00
pascal_semsegm_test_fcn.py	change fcn8s-heavy-pascal tests from caffe to onnx	2024-05-03 00:15:09 +08:00
test_backends.cpp	Merge pull request #24834 from fengyuentau:cuda_naryeltwise_broadcast	2024-01-11 10:04:46 +03:00
test_caffe_importer.cpp	dnn(test): skip very long debug tests, reduce test time	2023-12-25 08:44:06 +00:00
test_common.cpp	cmake: fix build of dnn tests with shared common code	2019-03-31 08:52:25 +00:00
test_common.hpp	Merge pull request #25880 from Jamim:fix/cuda-no-fp16	2024-07-10 12:39:30 +03:00
test_common.impl.hpp	Merge pull request #25181 from dkurt:release_conv_weights	2024-03-25 09:03:28 +03:00
test_darknet_importer.cpp	dnn(test): skip very long debug tests, reduce test time	2023-12-25 08:44:06 +00:00
test_googlenet.cpp	Merge pull request #22275 from zihaomu:fp16_support_conv	2023-05-17 09:38:33 +03:00
test_graph_simplifier.cpp	Merge pull request #25271 from fengyuentau:matmul_bias	2024-03-29 17:35:23 +03:00
test_ie_models.cpp	Fix for OpenVINO 2024.0	2024-03-18 15:05:50 +04:00
test_int8_layers.cpp	Merge pull request #25779 from fengyuentau:dnn/fix_onnx_depthtospace	2024-06-21 19:28:22 +03:00
test_layers.cpp	Skip Test_Caffe_layers.Concat with Vulkan due to sporadic failures.	2024-05-17 11:54:25 +03:00
test_main.cpp	Merge pull request #23109 from seanm:misc-warnings	2023-10-06 13:33:21 +03:00
test_misc.cpp	Merge pull request #23736 from seanm:c++11-simplifications	2024-01-19 16:53:08 +03:00
test_model.cpp	Made fcn-resnet50-12.onnx model optional.	2024-05-03 16:14:22 +03:00
test_nms.cpp	batched nms impl	2022-11-29 15:32:34 +08:00
test_onnx_conformance_layer_filter__cuda_denylist.inl.hpp	Merge pull request #25630 from fengyuentau:nary-multi-thread	2024-07-03 10:09:05 +03:00
test_onnx_conformance_layer_filter__cuda_fp16_denylist.inl.hpp	Merge pull request #25630 from fengyuentau:nary-multi-thread	2024-07-03 10:09:05 +03:00
test_onnx_conformance_layer_filter__halide_denylist.inl.hpp	Merge pull request #24353 from alexlyulkov:al/fixed-cumsum-layer	2023-10-03 13:58:25 +03:00
test_onnx_conformance_layer_filter__openvino.inl.hpp	Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp	2024-07-19 16:03:19 +03:00
test_onnx_conformance_layer_filter__vulkan_denylist.inl.hpp	Merge pull request #25630 from fengyuentau:nary-multi-thread	2024-07-03 10:09:05 +03:00
test_onnx_conformance_layer_filter_opencv_all_denylist.inl.hpp	Merge pull request #25630 from fengyuentau:nary-multi-thread	2024-07-03 10:09:05 +03:00
test_onnx_conformance_layer_filter_opencv_cpu_denylist.inl.hpp	Merge pull request #21865 from rogday:nary_eltwise_layers	2022-07-19 06:14:05 +03:00
test_onnx_conformance_layer_filter_opencv_denylist.inl.hpp	move global skip out of if loop, and add opencv_deny_list	2023-03-13 22:16:51 +08:00
test_onnx_conformance_layer_filter_opencv_ocl_fp16_denylist.inl.hpp	Merge pull request #25630 from fengyuentau:nary-multi-thread	2024-07-03 10:09:05 +03:00
test_onnx_conformance_layer_filter_opencv_ocl_fp32_denylist.inl.hpp	implementation of scatter and scatternd with conformance tests enabled	2022-10-17 11:30:32 +08:00
test_onnx_conformance_layer_parser_denylist.inl.hpp	Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp	2024-07-19 16:03:19 +03:00
test_onnx_conformance.cpp	Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp	2024-07-19 16:03:19 +03:00
test_onnx_importer.cpp	Merge pull request #25147 from fengyuentau:dnn/elementwise_layers/speedup	2024-07-08 14:24:36 +03:00
test_precomp.hpp	dnn: reduce set of ignored warnings	2018-11-15 13:15:59 +03:00
test_tf_importer.cpp	dnn(test): skip very long debug tests, reduce test time	2023-12-25 08:44:06 +00:00
test_tflite_importer.cpp	Merge pull request #25613 from CNOCycle:tflite/ops	2024-05-31 19:31:21 +03:00
test_torch_importer.cpp	DNN: add the Winograd fp16 support (#23654)	2023-11-20 13:45:37 +03:00