opencv/modules
Yuantao Feng fa5ed62a66
Merge pull request #24694 from fengyuentau:matmul_refactor
dnn: refactor ONNX MatMul with fastGemm #24694

Done:
- [x] add backends
    - [x] CUDA
    - [x] OpenVINO
    - [x] CANN
    - [x] OpenCL
    - [x] Vulkan
- [x] add perf tests
- [x] const B case

### Benchmark

Tests were run on an M1. All timings are in milliseconds (ms); lower is better.

| Configuration | MatMul (Prepacked) | MatMul | InnerProduct |
| - | - | - | - |
| A=[12, 197, 197], B=[12, 197, 64], trans_a=0, trans_b=0 | **0.39** | 0.41 | 1.33 |
| A=[12, 197, 64], B=[12, 64, 197], trans_a=0, trans_b=0  | **0.42** | 0.42 | 1.17 |
| A=[12, 50, 64], B=[12, 64, 50], trans_a=0, trans_b=0    | **0.13** | 0.15 | 0.33 |
| A=[12, 50, 50], B=[12, 50, 64], trans_a=0, trans_b=0    | **0.11** | 0.13 | 0.22 |
| A=[16, 197, 197], B=[16, 197, 64], trans_a=0, trans_b=0 | **0.46** | 0.54 | 1.46 |
| A=[16, 197, 64], B=[16, 64, 197], trans_a=0, trans_b=0  | **0.46** | 0.95 | 1.74 |
| A=[16, 50, 64], B=[16, 64, 50], trans_a=0, trans_b=0    | **0.18** | 0.32 | 0.43 |
| A=[16, 50, 50], B=[16, 50, 64], trans_a=0, trans_b=0    | **0.15** | 0.25 | 0.25 |
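
To make the configurations in the table easier to read, here is a minimal sketch of how the output shape and the multiply-add count of a batched MatMul follow from the shapes of `A` and `B`. It is not the code from this PR: the helper names `matmul_output_shape` and `matmul_macs` are hypothetical, the sketch assumes that `trans_a` / `trans_b` swap the last two dimensions of the corresponding input, and it assumes equal batch dimensions (no broadcasting).

```python
from math import prod


def matmul_output_shape(a_shape, b_shape, trans_a=False, trans_b=False):
    """Output shape of a batched MatMul (hypothetical helper, not OpenCV API).

    Assumption: trans_a / trans_b swap the last two dimensions of A / B.
    Assumption: batch dimensions are identical (no broadcasting).
    """
    *batch_a, m, k = a_shape
    *batch_b, k_b, n = b_shape
    if trans_a:
        m, k = k, m
    if trans_b:
        k_b, n = n, k_b
    if batch_a != batch_b:
        raise ValueError("this sketch does not handle broadcasting")
    if k != k_b:
        raise ValueError(f"inner dimensions differ: {k} vs {k_b}")
    return [*batch_a, m, n]


def matmul_macs(a_shape, b_shape, trans_a=False, trans_b=False):
    """Number of multiply-add operations: batch * M * N * K."""
    out_shape = matmul_output_shape(a_shape, b_shape, trans_a, trans_b)
    # K is the last dimension of A, or the second-to-last if A is transposed.
    k = a_shape[-2] if trans_a else a_shape[-1]
    return prod(out_shape) * k


# First benchmark row: A=[12, 197, 197], B=[12, 197, 64], no transposition.
shape = matmul_output_shape([12, 197, 197], [12, 197, 64])
macs = matmul_macs([12, 197, 197], [12, 197, 64])
print(shape, macs)
```

Under these assumptions, the first and fifth rows of the table differ only in batch size (12 vs 16), so their multiply-add counts differ by the same 4/3 ratio.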

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV.
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-19 19:36:41 +03:00

| Module | Last commit | Date |
| - | - | - |
| calib3d | Merge pull request #24713 from MaximSmolskiy:improve-icvSmoothHistogram256 | 2023-12-19 18:39:34 +03:00 |
| core | Merge pull request #24618 from vrabaud:compilation | 2023-12-01 09:10:30 +03:00 |
| dnn | Merge pull request #24694 from fengyuentau:matmul_refactor | 2023-12-19 19:36:41 +03:00 |
| features2d | Added Java bindings for BOWImgDescriptorExtractor constructor. | 2023-10-31 11:23:47 +03:00 |
| flann | Merge pull request #23109 from seanm:misc-warnings | 2023-10-06 13:33:21 +03:00 |
| gapi | Merge pull request #24068 from TolyaTalamanov:at/add-onnx-coreml-execution-provider | 2023-12-13 21:22:15 +03:00 |
| highgui | Update window_QT.cpp | 2023-11-13 12:10:52 +03:00 |
| imgcodecs | Merge pull request #24405 from kochanczyk:4.x | 2023-10-30 11:58:08 +03:00 |
| imgproc | Fix typo | 2023-12-15 09:21:23 +08:00 |
| java | Merge pull request #24685 from AleksandrPanov:fix_build_grandle | 2023-12-12 09:11:22 +03:00 |
| js | Merge pull request #24458 from laolaolulu:4.x | 2023-11-13 14:51:20 +03:00 |
| ml | Merge pull request #23109 from seanm:misc-warnings | 2023-10-06 13:33:21 +03:00 |
| objc | Backport 5.x: Support for module names that start from digit in ObjC bindings generator. | 2023-05-25 11:45:59 +03:00 |
| objdetect | Get code to compile without DNN | 2023-12-08 10:54:59 +01:00 |
| photo | Merge pull request #23109 from seanm:misc-warnings | 2023-10-06 13:33:21 +03:00 |
| python | Fixed Python signatures in Doxygen documentation. | 2023-10-30 17:28:03 +03:00 |
| stitching | fix: supress GCC13 warnings (#24434) | 2023-10-26 09:00:58 +03:00 |
| ts | Merge pull request #23109 from seanm:misc-warnings | 2023-10-06 13:33:21 +03:00 |
| video | Merge pull request #24461 from fengyuentau:tracker_vit_backend_target | 2023-10-27 14:12:44 +03:00 |
| videoio | Merge pull request #24666 from zzuliys:4.x | 2023-12-19 18:34:21 +03:00 |
| world | cmake: use /INCREMENTAL:NO with MSVS 2015 | 2023-12-07 19:46:27 +00:00 |