opencv/modules/dnn/src
Yuantao Feng 0521a3a384
Merge pull request #24476 from fengyuentau:attention_layer
dnn: add attention layer #24476

Resolves #24609

Merge with: https://github.com/opencv/opencv_extra/pull/1128.

Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention.
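
For orientation, the core computation that an attention layer performs can be sketched in a few lines of plain Python. This is a minimal single-head illustration of scaled dot-product attention, not the code added by this PR. The function names (`softmax`, `matmul`, `transpose`, `single_head_attention`) are hypothetical, and the sketch leaves out the Q/K/V input projection, the bias, the masking, and the multi-head split that the linked operator spec describes.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def single_head_attention(q, k, v):
    """Compute softmax(Q K^T / sqrt(d)) V for [seq_len, head_dim] inputs."""
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    probs = [softmax([s * scale for s in row]) for row in scores]
    return matmul(probs, v)
```

With all-zero queries and keys every score is equal, so each output row is simply the average of the rows of `v`, which makes a convenient sanity check.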

TODO:
- [x] benchmark (before this PR vs. with this PR vs. ORT).
- [x] Layer fusion: Take care of Slice with end=INT64_MAX.
- [x] Layer fusion: match more potential attention (VIT) patterns.
    - [x] Single-head attention is supported.
- [x] Test AttentionSubgraph fusion.
- [x] Add accuracy tests for VIT_B_32 and VitTrack
- [x] Add perf tests for VIT_B_32 and VitTrack
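
The `end=INT64_MAX` item in the list above refers to a common ONNX export idiom: a Slice whose end index is the largest 64-bit integer, meaning "slice to the end of the axis". A fusion pass that compares slice bounds has to normalize such a value against the real axis length first. The sketch below shows one way to do that for a positive step. It is an illustration under that assumption, not the PR's implementation, and `normalize_slice_end` is a hypothetical helper name.

```python
INT64_MAX = 2**63 - 1


def normalize_slice_end(end, dim_size):
    """Map an ONNX-style Slice end index onto [0, dim_size] (positive step only)."""
    if end < 0:
        # Negative indices count back from the end of the axis.
        end += dim_size
    # Out-of-range values, including INT64_MAX, are clamped to the axis bounds.
    return max(0, min(end, dim_size))
```

For an axis of length 768, `normalize_slice_end(INT64_MAX, 768)` and `normalize_slice_end(768, 768)` give the same result, so the two spellings can be matched as the same pattern.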

## Benchmarks

Platform: MacBook Air M1.

### Attention Subgraph

Input shape: [1, 197, 768].

|                        | mean (ms) | median (ms) | min (ms) |
| ---------------------- | --------- | ----------- | -------- |
| w/ Attention (this PR) | 3.75      | 3.68        | 3.22     |
| w/o Attention          | 9.06      | 9.01        | 8.24     |
| ORT (python)           | 4.32      | 2.63        | 2.50     |

### ViTs

All timings are in milliseconds (ms).

| ViTs     | With Attention | Without Attention | ORT    |
| -------- | -------------- | ----------------- | ------ |
| vit_b_16 | 302.77         | 365.35            | 109.70 |
| vit_b_32 | 89.92          | 116.22            | 30.36  |
| vit_l_16 | 1593.32        | 1730.74           | 419.92 |
| vit_l_32 | 468.11         | 577.41            | 134.12 |
| VitTrack | 3.80           | 3.87              | 2.25   |
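
To read the tables as relative gains, the speedup from fusion is the time without the Attention layer divided by the time with it. The short sketch below does that arithmetic on the numbers reported above. The variable and function names are illustrative, and only figures from the two tables are used.

```python
# Mean latency of the attention subgraph, in ms (from the first table).
SUBGRAPH_MEAN_MS = {"with_attention": 3.75, "without_attention": 9.06}

# Whole-model latency in ms: (with Attention, without Attention), from the ViTs table.
VIT_MS = {
    "vit_b_16": (302.77, 365.35),
    "vit_b_32": (89.92, 116.22),
    "vit_l_16": (1593.32, 1730.74),
    "vit_l_32": (468.11, 577.41),
    "VitTrack": (3.80, 3.87),
}


def speedup(baseline_ms, fused_ms):
    """Return how many times faster the fused run is than the baseline."""
    return baseline_ms / fused_ms


def model_speedups(table):
    """Compute the per-model speedup of 'with Attention' over 'without Attention'."""
    return {name: speedup(without, with_) for name, (with_, without) in table.items()}


if __name__ == "__main__":
    sub = speedup(SUBGRAPH_MEAN_MS["without_attention"], SUBGRAPH_MEAN_MS["with_attention"])
    print(f"attention subgraph: {sub:.2f}x")
    for name, ratio in model_speedups(VIT_MS).items():
        print(f"{name}: {ratio:.2f}x")
```

On these figures the isolated subgraph gains about 2.4x, while the end-to-end gains are smaller (roughly 1.02x to 1.29x), which is expected because attention is only part of each model's total runtime.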

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV.
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There are accuracy tests, performance tests, and test data in the opencv_extra repository, if applicable.
      The patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-20 19:35:07 +03:00
caffe Merge pull request #24613 from WanliZhong:softmax_default_axis 2023-12-15 10:41:42 +03:00
cuda Merge pull request #24647 from fengyuentau:cuda_sub 2023-12-06 13:46:24 +03:00
cuda4dnn Merge pull request #24694 from fengyuentau:matmul_refactor 2023-12-19 19:36:41 +03:00
darknet Merge pull request #24613 from WanliZhong:softmax_default_axis 2023-12-15 10:41:42 +03:00
int8layers Clean up the Universal Intrinsic API. 2023-10-13 19:23:30 +08:00
layers Merge pull request #24476 from fengyuentau:attention_layer 2023-12-20 19:35:07 +03:00
ocl4dnn tune for opencl 2022-08-14 17:47:48 +08:00
onnx Merge pull request #24476 from fengyuentau:attention_layer 2023-12-20 19:35:07 +03:00
opencl Merge pull request #24552 from fengyuentau:layernorm_backends 2023-11-21 15:33:01 +03:00
tensorflow Merge pull request #24613 from WanliZhong:softmax_default_axis 2023-12-15 10:41:42 +03:00
tflite Merge pull request #24004 from dkurt:tflite_new_layers 2023-07-21 09:13:37 +03:00
torch Merge pull request #24613 from WanliZhong:softmax_default_axis 2023-12-15 10:41:42 +03:00
vkcom speed up vulkan dnn, and support ios and apple m1 chip. (#23349) 2023-05-18 20:02:27 +03:00
webnn Merge pull request #20406 from MarkGHX:gsoc_2021_webnn 2021-11-23 21:15:31 +00:00
backend.cpp dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
backend.hpp dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
debug_utils.cpp fix model diagnostic tool 2022-01-18 01:22:22 +03:00
dnn_common.hpp speed up vulkan dnn, and support ios and apple m1 chip. (#23349) 2023-05-18 20:02:27 +03:00
dnn_params.cpp cmake: revise OPENCV_DNN_BACKEND_DEFAULT integration 2023-09-10 13:11:36 +00:00
dnn_read.cpp Migrate Android Face Detection sample to DNN. 2023-11-29 11:02:44 +03:00
dnn_utils.cpp Merge pull request #24539 from LaurentBerger:blobrecttoimage 2023-12-19 20:00:04 +03:00
dnn.cpp dnn: fix index access 2022-03-19 06:54:07 +00:00
factory.hpp dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
graph_simplifier.cpp Merge pull request #24577 from dkurt:dnn_graph_match_stack 2023-11-24 10:40:32 +03:00
graph_simplifier.hpp Merge pull request #24483 from dkurt:dnn_fusion_commutative_ops 2023-11-08 16:26:33 +03:00
halide_scheduler.cpp Merge pull request #22656 from dkurt:halide_fixes 2022-10-21 17:49:49 +03:00
halide_scheduler.hpp
ie_ngraph.cpp Merge pull request #23987 from dkurt:openvino_int8_backend 2023-09-28 16:24:43 +03:00
ie_ngraph.hpp Merge pull request #23987 from dkurt:openvino_int8_backend 2023-09-28 16:24:43 +03:00
init.cpp Merge pull request #24476 from fengyuentau:attention_layer 2023-12-20 19:35:07 +03:00
layer_factory.cpp dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
layer_internals.hpp dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
layer.cpp speed up vulkan dnn, and support ios and apple m1 chip. (#23349) 2023-05-18 20:02:27 +03:00
legacy_backend.cpp Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
legacy_backend.hpp dnn: split dnn.cpp code 2022-03-08 19:22:46 +00:00
math_utils.hpp Implement ctc prefix beam search decode for TextRecognitionModel. 2021-08-12 20:33:31 +08:00
model.cpp DNN: add the Winograd fp16 support (#23654) 2023-11-20 13:45:37 +03:00
net_cann.cpp Merge pull request #23936 from SaltFish-T:4.x 2023-07-27 14:21:30 +03:00
net_impl_backend.cpp Merge pull request #23987 from dkurt:openvino_int8_backend 2023-09-28 16:24:43 +03:00
net_impl_fuse.cpp fix the issue in layer fused 2023-08-16 09:34:59 +08:00
net_impl.cpp Use OpenCV logging instead of std::cerr. 2023-07-19 10:49:54 +03:00
net_impl.hpp speed up vulkan dnn, and support ios and apple m1 chip. (#23349) 2023-05-18 20:02:27 +03:00
net_openvino.cpp Merge pull request #23987 from dkurt:openvino_int8_backend 2023-09-28 16:24:43 +03:00
net_quantization.cpp add enableWinograd API for Net. 2022-10-09 09:33:07 +08:00
net.cpp add enableWinograd API for Net. 2022-10-09 09:33:07 +08:00
nms.cpp batched nms impl 2022-11-29 15:32:34 +08:00
nms.inl.hpp boost NMS performance 2021-03-10 15:59:26 +00:00
op_cann.cpp Merge pull request #23319 from fengyuentau:fix_zoo_issue_136 2023-03-13 21:46:33 +03:00
op_cann.hpp Merge pull request #23936 from SaltFish-T:4.x 2023-07-27 14:21:30 +03:00
op_cuda.cpp Let part of the operators in nary_eltwise support cuda 2022-11-02 14:08:21 +08:00
op_cuda.hpp transfer output blobs in background 2020-07-04 12:55:12 +05:30
op_halide.cpp Merge pull request #24167 from autoantwort:missing-include 2023-08-17 09:34:19 +00:00
op_halide.hpp
op_inf_engine.cpp Merge pull request #22957 from dkurt:new_openvino_api 2022-12-23 16:58:41 +00:00
op_inf_engine.hpp Merge pull request #23987 from dkurt:openvino_int8_backend 2023-09-28 16:24:43 +03:00
op_timvx.cpp Merge pull request #21036 from fengyuentau:timvx_backend_support 2022-03-31 21:42:11 +00:00
op_timvx.hpp Merge pull request #21036 from fengyuentau:timvx_backend_support 2022-03-31 21:42:11 +00:00
op_vkcom.cpp speed up vulkan dnn, and support ios and apple m1 chip. (#23349) 2023-05-18 20:02:27 +03:00
op_vkcom.hpp speed up vulkan dnn, and support ios and apple m1 chip. (#23349) 2023-05-18 20:02:27 +03:00
op_webnn.cpp dnn: split dnn.cpp code 2022-03-08 19:22:46 +00:00
op_webnn.hpp Merge pull request #20406 from MarkGHX:gsoc_2021_webnn 2021-11-23 21:15:31 +00:00
plugin_api.hpp dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
plugin_wrapper.impl.hpp dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
precomp.hpp speed up vulkan dnn, and support ios and apple m1 chip. (#23349) 2023-05-18 20:02:27 +03:00
registry.cpp Merge pull request #22275 from zihaomu:fp16_support_conv 2023-05-17 09:38:33 +03:00