Open Source Computer Vision Library
Go to file
Yuantao Feng fa5ed62a66
Merge pull request #24694 from fengyuentau:matmul_refactor
dnn: refactor ONNX MatMul with fastGemm #24694

Done:
- [x] add backends
    - [x] CUDA
    - [x] OpenVINO
    - [x] CANN
    - [x] OpenCL
    - [x] Vulkan
- [x] add perf tests
- [x] const B case

### Benchmark

Tests are done on M1. All data is in milliseconds (ms).

| Configuration | MatMul (Prepacked) | MatMul | InnerProduct |
| - | - | - | - |
| A=[12, 197, 197], B=[12, 197, 64], trans_a=0, trans_b=0 | **0.39** | 0.41 | 1.33 |
| A=[12, 197, 64], B=[12, 64, 197], trans_a=0, trans_b=0  | **0.42** | 0.42 | 1.17 |
| A=[12, 50, 64], B=[12, 64, 50], trans_a=0, trans_b=0    | **0.13** | 0.15 | 0.33 |
| A=[12, 50, 50], B=[12, 50, 64], trans_a=0, trans_b=0    | **0.11** | 0.13 | 0.22 |
| A=[16, 197, 197], B=[16, 197, 64], trans_a=0, trans_b=0 | **0.46** | 0.54 | 1.46 |
| A=[16, 197, 64], B=[16, 64, 197], trans_a=0, trans_b=0  | **0.46** | 0.95 | 1.74 |
| A=[16, 50, 64], B=[16, 64, 50], trans_a=0, trans_b=0    | **0.18** | 0.32 | 0.43 |
| A=[16, 50, 50], B=[16, 50, 64], trans_a=0, trans_b=0    | **0.15** | 0.25 | 0.25 |

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-19 19:36:41 +03:00
.github CI: enable RISC-V for 4.x branch 2023-10-16 09:28:37 +03:00
3rdparty Update 3rdparty zlib to 1.3 2023-12-19 14:48:39 +08:00
apps Add missing <sstream> includes 2023-09-05 22:04:26 +03:00
cmake prepare to build for ARM64 on Windows with Visual Studio 2023-12-19 09:40:35 +09:00
data Merge pull request #22727 from su77ungr:patch-1 2022-11-17 06:54:25 +00:00
doc Merge pull request #24700 from savuor:doc_android_win_fix 2023-12-15 18:07:06 +03:00
include exclude opencv_contrib modules 2020-02-26 15:12:45 +03:00
modules Merge pull request #24694 from fengyuentau:matmul_refactor 2023-12-19 19:36:41 +03:00
platforms Merge pull request #24592 from savuor:recorder_android 2023-12-13 15:54:31 +03:00
samples Merge pull request #24666 from zzuliys:4.x 2023-12-19 18:34:21 +03:00
.editorconfig add .editorconfig 2018-10-11 17:57:51 +00:00
.gitattributes cmake: generate and install ffmpeg-download.ps1 2018-06-09 13:19:48 +03:00
.gitignore Merge pull request #17165 from komakai:objc-binding 2020-06-08 18:32:53 +00:00
CMakeLists.txt Merge pull request #24666 from zzuliys:4.x 2023-12-19 18:34:21 +03:00
CONTRIBUTING.md migration: github.com/opencv/opencv 2016-07-12 12:51:12 +03:00
COPYRIGHT copyright: 2023 (update) 2023-01-09 09:49:22 +00:00
LICENSE Merge pull request #18073 from vpisarev:apache2_license 2020-08-17 11:49:11 +00:00
README.md OpenCV.AI notification. 2023-12-05 21:43:54 +03:00
SECURITY.md Updated PGP key for security reports 2023-04-19 19:16:55 +03:00

OpenCV: Open Source Computer Vision Library

Keep OpenCV Free

OpenCV is raising funds to keep the library free for everyone, and we need the support of the entire community to do it. Donate to OpenCV on IndieGoGo before the campaign ends on December 16 to show your support.

Resources

Contributing

Please read the contribution guidelines before starting work on a pull request.

Summary of the guidelines:

  • One pull request per issue;
  • Choose the right base branch;
  • Include tests and documentation;
  • Clean up "oops" commits before submitting;
  • Follow the coding style guide.

Additional Resources