opencv

mirror of https://github.com/opencv/opencv.git synced 2025-07-24 05:39:22 +08:00

Yuantao Feng 8a96e34e33 dnn: add gemm_layer in place of fully_connected_layer for onnx models (#23897)	2023-09-20 00:53:34 +03:00

* first commit
* turned C from input to constant; force C constant in impl; better handling 0d/1d cases
* integrate with gemm from ficus nn
* fix const inputs
* adjust threshold for int8 tryQuantize
* adjust threshold for int8 quantized 2
* support batched gemm and matmul; tune threshold for rcnn_ilsvrc13; update googlenet
* add gemm perf against innerproduct
* add perf tests for innerproduct with bias
* fix perf
* add memset
* renamings for next step
* add dedicated perf gemm
* add innerproduct in perf_gemm
* remove gemm and innerproduct perf tests from perf_layer
* add perf cases for vit sizes; prepack constants
* remove batched gemm; fix wrong trans; optimize KC
* remove prepacking for const A; several fixes for const B prepacking
* add todos and gemm expression
* add optimized branch for avx/avx2
* trigger build
* update macros and signature
* update signature
* fix macro
* fix bugs for neon aarch64 & x64
* add backends: cuda, cann, inf_ngraph and vkcom
* fix cuda backend
* test commit for cuda
* test cuda backend
* remove debug message from cuda backend
* use cpu dispatcher
* fix neon macro undef in dispatcher
* fix dispatcher
* fix inner kernel for neon aarch64
* fix compiling issue on armv7; try fixing accuracy issue on other platforms
* broadcast C with beta multiplied; improve func namings
* fix bug for avx and avx2
* put all platform-specific kernels in dispatcher
* fix typos
* attempt to fix compile issues on x64
* run old gemm when neon, avx, avx2 are all not available; add kernel for armv7 neon
* fix typo
* quick fix: add macros for pack4
* quick fix: use vmlaq_f32 for armv7
* quick fix for missing macro of fast gemm pack f32 4
* disable conformance tests when optimized branches are not supported
* disable perf tests when optimized branches are not supported
* decouple cv_try_neon and cv_neon_aarch64
* drop googlenet_2023; add fastGemmBatched
* fix step in fastGemmBatched
* cpu: fix initialization ofb; gpu: support batch
* quick followup fix for cuda
* add default kernels
* quick followup fix to avoid macro redef
* optimized kernels for lasx
* resolve mis-alignment; remove comments
* tune performance for x64 platform
* tune performance for neon aarch64
* tune for armv7
* comment time consuming tests
* quick follow-up fix

..
utils	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2022-02-11 17:32:37 +00:00
all_layers.hpp	dnn: add gemm_layer in place of fully_connected_layer for onnx models (#23897)	2023-09-20 00:53:34 +03:00
dict.hpp	dnn: fix API - explicit ctors, const methods	2022-01-21 12:38:51 +00:00
dnn.hpp	Added CMake configuration OPENCV_DNN_BACKEND_DEFAULT	2023-09-05 10:05:12 +02:00
dnn.inl.hpp	build warnings	2020-12-05 20:08:29 +00:00
layer_reg.private.hpp	Merge pull request #20494 from rogday:onnx_diagnostic_fix	2021-08-20 14:43:47 +00:00
layer.details.hpp	dnn: update "guard" inline namespace	2018-09-03 20:46:57 +00:00
layer.hpp	fix model diagnostic tool	2022-01-18 01:22:22 +03:00
shape_utils.hpp	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2022-10-15 16:44:47 +00:00
version.hpp	pre: OpenCV 4.8.0 (version++)	2023-06-20 15:52:57 +03:00