mirror of
https://github.com/opencv/opencv.git
synced 2025-06-28 15:30:49 +08:00
Open Source Computer Vision Library
![]() * first commit * turned C from input to constant; force C constant in impl; better handling 0d/1d cases * integrate with gemm from ficus nn * fix const inputs * adjust threshold for int8 tryQuantize * adjust threshold for int8 quantized 2 * support batched gemm and matmul; tune threshold for rcnn_ilsvrc13; update googlenet * add gemm perf against innerproduct * add perf tests for innerproduct with bias * fix perf * add memset * renamings for next step * add dedicated perf gemm * add innerproduct in perf_gemm * remove gemm and innerproduct perf tests from perf_layer * add perf cases for vit sizes; prepack constants * remove batched gemm; fix wrong trans; optimize KC * remove prepacking for const A; several fixes for const B prepacking * add todos and gemm expression * add optimized branch for avx/avx2 * trigger build * update macros and signature * update signature * fix macro * fix bugs for neon aarch64 & x64 * add backends: cuda, cann, inf_ngraph and vkcom * fix cuda backend * test commit for cuda * test cuda backend * remove debug message from cuda backend * use cpu dispatcher * fix neon macro undef in dispatcher * fix dispatcher * fix inner kernel for neon aarch64 * fix compiling issue on armv7; try fixing accuracy issue on other platforms * broadcast C with beta multiplied; improve func namings * fix bug for avx and avx2 * put all platform-specific kernels in dispatcher * fix typos * attempt to fix compile issues on x64 * run old gemm when neon, avx, avx2 are all not available; add kernel for armv7 neon * fix typo * quick fix: add macros for pack4 * quick fix: use vmlaq_f32 for armv7 * quick fix for missing macro of fast gemm pack f32 4 * disable conformance tests when optimized branches are not supported * disable perf tests when optimized branches are not supported * decouple cv_try_neon and cv_neon_aarch64 * drop googlenet_2023; add fastGemmBatched * fix step in fastGemmBatched * cpu: fix initialization ofb; gpu: support batch * quick followup fix for cuda * add default kernels * quick followup fix to avoid macro redef * optmized kernels for lasx * resolve mis-alignment; remove comments * tune performance for x64 platform * tune performance for neon aarch64 * tune for armv7 * comment time consuming tests * quick follow-up fix |
||
---|---|---|
.github | ||
3rdparty | ||
apps | ||
cmake | ||
data | ||
doc | ||
include | ||
modules | ||
platforms | ||
samples | ||
.editorconfig | ||
.gitattributes | ||
.gitignore | ||
CMakeLists.txt | ||
CONTRIBUTING.md | ||
COPYRIGHT | ||
LICENSE | ||
README.md | ||
SECURITY.md |
OpenCV: Open Source Computer Vision Library
Resources
- Homepage: https://opencv.org
- Courses: https://opencv.org/courses
- Docs: https://docs.opencv.org/4.x/
- Q&A forum: https://forum.opencv.org
- previous forum (read only): http://answers.opencv.org
- Issue tracking: https://github.com/opencv/opencv/issues
- Additional OpenCV functionality: https://github.com/opencv/opencv_contrib
Contributing
Please read the contribution guidelines before starting work on a pull request.
Summary of the guidelines:
- One pull request per issue;
- Choose the right base branch;
- Include tests and documentation;
- Clean up "oops" commits before submitting;
- Follow the coding style guide.