opencv/modules
Yuantao Feng 5aa5c39210
Merge pull request #25076 from fengyuentau:improve_attention
dnn: try improving performance of Attention layer #25076

Checklist:

- [x] Use `Mat` over `Mat::zeros` for temporary buffer in forward
- [x] Use layer internal buffer over temporary Mat buffer
- [x] Try a single fastGemmBatch on the Q/K/V calculation
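
The first two checklist items can be illustrated outside OpenCV. The sketch below is not the code from this PR: `ScratchLayer` and its members are hypothetical names, and it uses `std::vector` instead of `Mat`. It shows the general idea only: a scratch buffer that lives in the layer is allocated once and reused, and it needs no zero-fill before each call because every element is overwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a layer that keeps its scratch space between calls.
struct ScratchLayer {
    std::vector<float> scratch;   // persists across forward() calls
    std::size_t allocations = 0;  // counts how often the buffer had to grow

    // Computes A (M x K) * B (K x N), row-major, through the scratch buffer.
    std::vector<float> forward(const std::vector<float>& A,
                               const std::vector<float>& B,
                               std::size_t M, std::size_t K, std::size_t N) {
        assert(A.size() == M * K && B.size() == K * N);
        // Grow only when the buffer is too small. Note that std::vector::resize
        // value-initializes new elements; an uninitialized Mat would skip even that.
        if (scratch.size() < M * N) {
            scratch.resize(M * N);
            ++allocations;
        }
        // Every output element is assigned (not accumulated into), so stale
        // contents from a previous call are harmless and no zero-fill is needed.
        for (std::size_t i = 0; i < M; ++i) {
            for (std::size_t j = 0; j < N; ++j) {
                float acc = 0.0f;
                for (std::size_t k = 0; k < K; ++k)
                    acc += A[i * K + k] * B[k * N + j];
                scratch[i * N + j] = acc;
            }
        }
        return std::vector<float>(scratch.begin(), scratch.begin() + M * N);
    }
};
```

Calling `forward` twice with same-sized inputs leaves `allocations` at 1, which is the saving the second checklist item aims for.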

Performance:

The performance test case is `Layer_Attention.VisionTransformer/0`, with an input of shape {1, 197, 768}, a weight of shape {768, 2304}, and a bias of shape {2304}.
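
To make these shapes concrete, here is a minimal sketch of a fused Q/K/V projection in plain C++. It assumes the 2304 output columns split evenly into Q, K and V (768 each), which the shapes suggest but the description does not state. `projectQkv` and `Qkv` are hypothetical names, and the naive triple loop stands in for an optimized GEMM call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Q, K and V, each stored row-major with rows x (outDim / 3) elements.
struct Qkv {
    std::vector<float> q, k, v;
};

// Computes x (rows x inDim) * w (inDim x outDim) + b (outDim) in one pass and
// scatters the output columns into Q, K and V.
// Assumption: outDim is divisible by 3 and the columns are laid out as [Q | K | V].
Qkv projectQkv(const std::vector<float>& x, const std::vector<float>& w,
               const std::vector<float>& b,
               std::size_t rows, std::size_t inDim, std::size_t outDim) {
    assert(outDim % 3 == 0);
    assert(x.size() == rows * inDim && w.size() == inDim * outDim && b.size() == outDim);
    const std::size_t part = outDim / 3;  // 2304 / 3 = 768 for the test case above
    Qkv r;
    r.q.resize(rows * part);
    r.k.resize(rows * part);
    r.v.resize(rows * part);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < outDim; ++j) {
            float acc = b[j];  // start from the bias for this output column
            for (std::size_t k = 0; k < inDim; ++k)
                acc += x[i * inDim + k] * w[k * outDim + j];
            // Pick the destination matrix from the column index.
            std::vector<float>& dst = (j < part) ? r.q : (j < 2 * part) ? r.k : r.v;
            dst[i * part + (j % part)] = acc;
        }
    }
    return r;
}
```

With `rows = 197`, `inDim = 768` and `outDim = 2304`, each of `q`, `k` and `v` holds 197 x 768 values.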

Timings are in milliseconds (lower is better).

| Configuration | macOS 14.2.1, Apple M1 | Ubuntu 22.04.2, Intel i7 12700K |
| - | - | - |
| Current | 10.96 | 1.58 |
| w/ Mat | 6.27 | 1.41 |
| w/ Internals | 5.87 | 1.38 |
| w/ fastGemmBatch | 6.12 | 2.14 |


### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under the GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is an accuracy test, a performance test, and test data in the opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-02-28 16:47:08 +03:00
calib3d Merge pull request #25050 from akretz:fix_issue_24330 2024-02-27 08:10:21 +03:00
core Merge pull request #25059 from opencv-pushbot:gitee/alalek/core_fix_float16 2024-02-24 13:28:05 +03:00
dnn Merge pull request #25076 from fengyuentau:improve_attention 2024-02-28 16:47:08 +03:00
features2d Added Java bindings for BOWImgDescriptorExtractor constructor. 2023-10-31 11:23:47 +03:00
flann Merge pull request #25024 from vrabaud:neon 2024-02-20 11:29:23 +03:00
gapi Merge pull request #25060 from dmatveev:dm/gapi_test_time 2024-02-22 16:40:33 +03:00
highgui fix highgui qt's statusbar text got cropped 2024-01-07 06:32:29 -05:00
imgcodecs Update ios_conversions.mm 2024-02-26 19:21:52 +11:00
imgproc Merge pull request #24750 from YusukeKameda:4.x 2024-01-18 15:06:36 +03:00
java Merge pull request #24946 from alexlyulkov:al/kotlin-tests2 2024-02-22 09:30:45 +03:00
js Merge pull request #25084 from EDVTAZ:emscripten-3.1.54-compat 2024-02-26 10:30:56 +03:00
ml Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
objc Merge pull request #24136 from komakai:visionos_support 2023-12-20 15:35:10 +03:00
objdetect Merge pull request #25091 from Dhanwanth1803:scoreThresh 2024-02-27 12:10:52 +03:00
photo Merge pull request #23109 from seanm:misc-warnings 2023-10-06 13:33:21 +03:00
python fix [use hasattr("cv2", "name") ,but first param is 'character string', 2024-02-23 22:02:43 +08:00
stitching Merge pull request #23736 from seanm:c++11-simplifications 2024-01-19 16:53:08 +03:00
ts Merge pull request #23736 from seanm:c++11-simplifications 2024-01-19 16:53:08 +03:00
video Merge pull request #24852 from Octopus136:4.x 2024-01-17 10:20:03 +03:00
videoio Fixed typo. 2024-02-24 18:03:18 +09:00
world cmake: use /INCREMENTAL:NO with MSVS 2015 2023-12-07 19:46:27 +00:00