opencv

mirror of https://github.com/opencv/opencv.git synced 2025-07-24 14:06:27 +08:00

Author	SHA1	Message	Date
Yuantao Feng	7fb336322d	Merge pull request #24808 from fengyuentau:fix_layernorm dnn: no layer norm fusion if axes.back() is not the axis of last dimension #24808 Merge with https://github.com/opencv/opencv_extra/pull/1137 Resolves https://github.com/opencv/opencv/issues/24797 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-01-10 13:01:00 +03:00
Yuantao Feng	c955564cb3	Merge pull request #24765 from fengyuentau:mod_operator dnn onnx: add mod #24765 Resolves https://github.com/opencv/opencv/issues/23174 TODO: - [x] enable some conformance tests - [x] add backends - [x] CANN - [x] OpenVINO - [x] CUDA ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-01-09 19:00:17 +03:00
Abduragim Shtanchaev	3b26e183cb	changed weights of yolov7	2023-12-28 23:03:47 +03:00
cudawarped	7d681cf80d	build: first class cuda support	2023-12-26 09:39:18 +03:00
Alexander Smorkalov	62f1a7410d	Merge pull request #24766 from asmorkalov:update_version_4.9.0-pre pre: OpenCV 4.9.0 (version++)	2023-12-25 16:04:53 +03:00
Alexander Smorkalov	b407c58b96	pre: OpenCV 4.9.0 (version++).	2023-12-25 15:20:10 +03:00
Yuantao Feng	f978c99523	Merge pull request #24753 from fengyuentau:einsum_importer dnn onnx: support constant inputs in einsum importer #24753 Merge with https://github.com/opencv/opencv_extra/pull/1132. Resolves https://github.com/opencv/opencv/issues/24697 Credits to @LaurentBerger. --- This is a workaround. I suggest getting input shapes and calculating the output shapes in `getMemoryShapes` to keep the best compatibility. Getting shapes during the importer stage is not always robust, and we should avoid it as much as possible. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-12-25 14:42:05 +03:00
Alexander Alekhin	f49b26182b	dnn(test): skip very long debug tests, reduce test time	2023-12-25 08:44:06 +00:00
Alexander Alekhin	96b894e0e1	Merge pull request #24761 from opencv-pushbot:gitee/alalek/test_skip_update_win32	2023-12-25 08:27:30 +00:00
Alexander Alekhin	f8502d45f9	dnn(test): skip tests on 32-bit Windows	2023-12-25 07:23:45 +00:00
Alexander Smorkalov	953dddd26b	Merge pull request #24747 from asmorkalov:as/tune_vitb_cuda Increase Vit_b test threshold a bit for CUDA FP16.	2023-12-22 17:04:46 +03:00
Dmitry Kurtaev	938bc4d503	[CUDA] Hotfix Scale with 1 parameter	2023-12-22 15:49:27 +03:00
Dhanwanth1803	027aee8ad4	Merge pull request #24384 from Dhanwanth1803:feat-crop Fixes #22747. Support [crop] configuration for DarkNet #24384 Request for comments. This is my first PR. Merge with extra: https://github.com/opencv/opencv_extra/pull/1112 resolves https://github.com/opencv/opencv/issues/22747 - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-12-22 14:55:01 +03:00
Alexander Smorkalov	53cd921ab4	Increase Vit_b test threshold a bit for CUDA FP16.	2023-12-22 13:37:44 +03:00
Vadim Pisarevsky	853e5dfcdf	Merge pull request #24709 from vpisarev:winograd_mode Try to enable Winograd by default in FP32 mode and disable it by default in FP16 mode #24709 Hopefully, it will resolve regressions since 4.8.1 (see also https://github.com/opencv/opencv/pull/24587) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-12-22 09:22:31 +03:00
Alexander Smorkalov	35e2ef8019	Merge pull request #24740 from opencv-pushbot:gitee/alalek/ocl_fix_kernel_compilation ocl: fix kernels compilation	2023-12-22 09:20:47 +03:00
Alexander Smorkalov	f5d8245801	Merge pull request #24736 from opencv-pushbot:gitee/alalek/issue_24734 dnn(ocl): don't try KERNEL_TYPE_GEMM_LIKE with kernel_w > 16	2023-12-21 20:01:01 +03:00
Alexander Alekhin	3340c71a2a	ocl: fix kernels compilation	2023-12-21 14:29:23 +00:00
Alexander Alekhin	c9bb92d58b	dnn(test): tune FP16 test tolerance	2023-12-21 13:39:05 +00:00
Alexander Alekhin	99c94d3d83	dnn(ocl): don't try KERNEL_TYPE_GEMM_LIKE with kernel_w > 16 - OpenCL kernel code doesn't support that	2023-12-21 13:30:57 +00:00
Yuantao Feng	0521a3a384	Merge pull request #24476 from fengyuentau:attention_layer dnn: add attention layer #24476 Resolves #24609 Merge with: https://github.com/opencv/opencv_extra/pull/1128. Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention. TODO: - [x] benchmark (before this PR vs. with this PR vs. ORT). - [x] Layer fusion: Take care Slice with end=INT64_MAX. - [x] Layer fusion: match more potential attention (VIT) patterns. - [x] Single-head attention is supported. - [x] Test AttentionSubgraph fusion. - [x] Add acc tests for VIT_B_32 and VitTrack - [x] Add perf tests for VIT_B_32 and VitTrack ## Benchmarks Platform: Macbook Air M1. ### Attention Subgraph Input scale: [1, 197, 768]. \| \| mean (ms) \| median (ms) \| min (ms) \| \| ---------------------- \| --------- \| ----------- \| -------- \| \| w/ Attention (this PR) \| 3.75 \| 3.68 \| 3.22 \| \| w/o Attention \| 9.06 \| 9.01 \| 8.24 \| \| ORT (python) \| 4.32 \| 2.63 \| 2.50 \| ### ViTs All data in millisecond (ms). \| ViTs \| With Attention \| Without Attention \| ORT \| \| -------- \| -------------- \| ----------------- \| ------ \| \| vit_b_16 \| 302.77 \| 365.35 \| 109.70 \| \| vit_b_32 \| 89.92 \| 116.22 \| 30.36 \| \| vit_l_16 \| 1593.32 \| 1730.74 \| 419.92 \| \| vit_l_32 \| 468.11 \| 577.41 \| 134.12 \| \| VitTrack \| 3.80 \| 3.87 \| 2.25 \| ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. 
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-12-20 19:35:07 +03:00
Laurent Berger	3e6dcdc0a4	Merge pull request #24539 from LaurentBerger:blobrecttoimage Add blobrecttoimage #24539 ### Pull Request Readiness Checklist resolves https://github.com/opencv/opencv/issues/14659 See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work #14659 - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-12-19 20:00:04 +03:00
Yuantao Feng	fa5ed62a66	Merge pull request #24694 from fengyuentau:matmul_refactor dnn: refactor ONNX MatMul with fastGemm #24694 Done: - [x] add backends - [x] CUDA - [x] OpenVINO - [x] CANN - [x] OpenCL - [x] Vulkan - [x] add perf tests - [x] const B case ### Benchmark Tests are done on M1. All data is in milliseconds (ms). \| Configuration \| MatMul (Prepacked) \| MatMul \| InnerProduct \| \| - \| - \| - \| - \| \| A=[12, 197, 197], B=[12, 197, 64], trans_a=0, trans_b=0 \| 0.39 \| 0.41 \| 1.33 \| \| A=[12, 197, 64], B=[12, 64, 197], trans_a=0, trans_b=0 \| 0.42 \| 0.42 \| 1.17 \| \| A=[12, 50, 64], B=[12, 64, 50], trans_a=0, trans_b=0 \| 0.13 \| 0.15 \| 0.33 \| \| A=[12, 50, 50], B=[12, 50, 64], trans_a=0, trans_b=0 \| 0.11 \| 0.13 \| 0.22 \| \| A=[16, 197, 197], B=[16, 197, 64], trans_a=0, trans_b=0 \| 0.46 \| 0.54 \| 1.46 \| \| A=[16, 197, 64], B=[16, 64, 197], trans_a=0, trans_b=0 \| 0.46 \| 0.95 \| 1.74 \| \| A=[16, 50, 64], B=[16, 64, 50], trans_a=0, trans_b=0 \| 0.18 \| 0.32 \| 0.43 \| \| A=[16, 50, 50], B=[16, 50, 64], trans_a=0, trans_b=0 \| 0.15 \| 0.25 \| 0.25 \| ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-12-19 19:36:41 +03:00
Wanli	6ae1709c6a	Merge pull request #24613 from WanliZhong:softmax_default_axis Make default axis of softmax in onnx "-1" without opset option #24613 Tries to solve the problem raised in https://github.com/opencv/opencv/pull/24476#discussion_r1404821158. Default softmax axis by framework: ONNX uses 1 if `opset <= 11`, else -1; TensorFlow uses -1 if `TF version = 2.x`, else 1; Darknet, Caffe and Torch use 1 by definition.	2023-12-15 10:41:42 +03:00
Wanli	9bbc890d96	Merge pull request #24681 from WanliZhong:err_armv8 Fixed armv8 compilation warnings #24681 Fixes the following warning on armv8: ``` warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] ``` Buildbot: https://pullrequest.opencv.org/buildbot/builders/4_x_ARMv8-lin	2023-12-12 15:38:07 +03:00
Wanli	6ee71fee88	Merge pull request #24547 from WanliZhong:refactor_conv_perf_test Classify and extend convolution and depthwise performance tests #24547 This PR aims to: 1. Extend the test cases with cases from these models: `YOLOv5`, `YOLOv8`, `EfficientNet`, `YOLOX`, `YuNet`, `SFace`, `MPPalm`, `MPHand`, `MPPose`, `ViTTrack`, `PPOCRv3`, `CRNN`, `PPHumanSeg`. (371 new test cases are added) 2. Classify the existing convolution performance tests into the following cases: - CONV_1x1 - CONV_3x3_S1_D1 (winograd) - CONV - DEPTHWISE 3. Reduce unnecessary test cases by following 3 rules (366 test cases are pruned): (i). When tests differ only in pad- and bias-related parameters and all other parameters are the same, only one case is kept. (ii). When the only difference is the channel count of the input shape and all other parameters are the same, only one case is kept in each range `[1, 3], [4, 7], [8, 15], [16, 31], [32, 63], [64, 127], [128, 255], [256, 511], [512, 1023], [1024, 2047], [2048, 4095]` (iii). When the only difference is the width and height of the input shape and all other parameters are the same, only one case is kept in each range `[1, 31], [32, 63], [64, 95]... ` > To reproduce: 1. Follow the steps in https://github.com/alalek/opencv/commit/dnn_dump_conv_kernels to dump all convolution cases from the new models. (The declared flops may not be right and need to be checked manually.) 2 and 3. Use the Python script [classify conv.txt](https://github.com/opencv/opencv/files/13522228/classify.conv.txt) Performance test result on Apple M2 Test result details: [M2.md](https://github.com/opencv/opencv/files/13379189/M2.md) Additional test result details with FP16: [m2_results_with_fp16.zip](https://github.com/opencv/opencv/files/13491070/m2_results_with_fp16.zip) Brief summary for 4.8.1 vs 4.7.0 or 4.6.0: 1. `CONV_1x1_S1_D1` dropped significantly with small or large input shapes. 2. `DEPTHWISE_5x5 ` dropped a little compared with 4.7.0. 
--- Performance test result on [Intel Core i7-12700K](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html): 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads. Test result details: [INTEL.md](https://github.com/opencv/opencv/files/13374093/INTEL.md) Brief summary for 4.8.1 vs 4.5.5: 1. `CONV_5x5_S1_D1` dropped significantly. 2. `CONV_1x1_S1_D1`, `CONV_3x3_S1_D1`, `DEPTHWISE_3x3_S1_D1`, `DEPTHWISE_3x3_S2_D1` dropped with small input shapes. --- TODO: - [x] Perform tests on arm with each opencv version - [x] Perform tests on x86 with each opencv version - [x] Split each test classification with single test config - [x] test enable fp16	2023-12-11 21:35:33 +03:00
Abduragim Shtanchaev	d3dd2e463c	Merge pull request #24611 from Abdurrahheem:ash/add_yolov6_test Add test for YoloX Yolo v6 and Yolo v8 #24611 This PR adds a test for the YOLOv6 model (which was absent before). The ONNX weights for the test are located in this PR: [#1126](https://github.com/opencv/opencv_extra/pull/1126) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-12-11 16:42:51 +03:00
Dmitry Kurtaev	ac4b26a561	Replace Slice optional inputs removal to adjustment	2023-12-08 23:29:52 +03:00
Yuantao Feng	a2edf4d929	Merge pull request #24647 from fengyuentau:cuda_sub dnn cuda: support Sub #24647 Related https://github.com/opencv/opencv/issues/24606#issuecomment-1837390257 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-12-06 13:46:24 +03:00
Yuantao Feng	f5ec92e4ca	Merge pull request #24655 from fengyuentau:graph_simplifier_optional_input dnn onnx graph simplifier: handle optional inputs of Slice #24655 Resolves https://github.com/opencv/opencv/issues/24609 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-12-06 13:43:54 +03:00
Alexander Smorkalov	7b1a5fb3de	Migrate Android Face Detection sample to DNN.	2023-11-29 11:02:44 +03:00
Abduragim Shtanchaev	5278560252	Merge pull request #24569 from Abdurrahheem:ash/padding_value_fix Add support for custom padding in DNN preprocessing #24569 This PR adds functionality for specifying the padding value. It is required in many DNN preprocessing pipelines, such as that of the YOLOX object detection model. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-28 11:54:09 +03:00
Dmitry Kurtaev	332748dd55	Merge pull request #24577 from dkurt:dnn_graph_match_stack Fix graph fusion with commutative ops #24577 ### Pull Request Readiness Checklist resolves https://github.com/opencv/opencv/issues/24568 Merge with extra: https://github.com/opencv/opencv_extra/pull/1125 TODO: - [x] replace recursive function to sequential See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-24 10:40:32 +03:00
skycat8	848dd12a1f	Merge pull request #24553 from skycat8:yolov5 Add yolov5n to tests #24553 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-24 10:36:06 +03:00
Yuantao Feng	d05fb709f9	Merge pull request #24552 from fengyuentau:layernorm_backends dnn: add openvino, opencl and cuda backends for layer normalization layer #24552 Merge after https://github.com/opencv/opencv/pull/24544. Todo: - [x] openvino - [x] opencl - [x] cuda ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-21 15:33:01 +03:00
zihaomu	b913e73d04	DNN: add the Winograd fp16 support (#23654) * add Winograd FP16 implementation * fixed dispatching of FP16 code paths in dnn; use dynamic dispatcher only when NEON_FP16 is enabled in the build and the feature is present in the host CPU at runtime * fixed some warnings * hopefully fixed winograd on x64 (and maybe other platforms) --------- Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>	2023-11-20 13:45:37 +03:00
Yuantao Feng	a478757483	Merge pull request #24544 from fengyuentau:layernorm_conformance dnn test: move layer norm tests into conformance tests #24544 Merge with https://github.com/opencv/opencv_extra/pull/1122 ## Motivation Some ONNX operators, such as `LayerNormalization`, `BatchNormalization` and so on, produce outputs for training (mean, stdev). So they have reference outputs of conformance tests for those training outputs as well. However, when it comes to inference, we do not need and produce those outputs for training here in dnn. Hence, output size does not match if we use dnn to infer those conformance models. This has become the barrier if we want to test these operators using their conformance tests. <!-- \| Operator \| Inference needed \| Outputs (required - total) \| Optional outputs for training? \| \| ----------------------- \| ----------------------------------- \| -------------------------- \| ------------------------------ \| \| BatchNormalization \| Yes \| 1 - 3 \| Yes \| \| Dropout \| Maybe, can be eliminated via fusion \| 1 - 2 \| Yes \| \| GRU \| Yes \| 0 - 2 \| No \| \| LSTM \| Yes \| 0 - 3 \| No \| \| LayerNormalization \| Yes \| 1 - 3 \| Yes \| \| MaxPool \| Yes \| 1 - 2 \| Yes \| \| RNN \| Yes \| 0 - 2 \| No \| \| SoftmaxCrossEntropyLoss \| No \| 1 - 2 \| -- \| --> I checked all ONNX operators with optional outputs. Turns out there are only `BatchNormalization`, `Dropout`, `LayerNormalization` and `MaxPool` has optional outputs for training. All except `LayerNormalization` have models set for training mode and eval mode. Blame ONNX for that. ## Solution In this pull request, we remove graph outputs if the graph looks like the following: ``` [X] [Scale] [Bias] [X] [Scale] [Bias] \ \| / this patch \ \| / LayerNormalization -----------> LayerNormalization / \| \ \| [Y] [Mean] [Stdev] [Y] ``` We can update conformance tests and turn on some cases as well if extending to more layers. Notes: 1. 
This workaround does not solve expanded function operators if they are fused into a single operator, such as `$onnx/onnx/backend/test/data/node/test_layer_normalization_2d_axis1_expanded`, but they can be run without fusion. Note that neither dnn nor onnxruntime fuses those expanded function operators. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-20 11:19:24 +03:00
Abduragim Shtanchaev	8c10545d3c	Merge pull request #24509 from Abdurrahheem:ash/dev_einsum_fast_gemm Fast gemm for einsum #24509 ## This PR adds performance tests for Einsum Layer with FastGemm. See below results of performance test on different inputs ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-16 16:20:17 +03:00
Yuantao Feng	024dfd54af	dnn cann backend: add hardswish, layernorm and instance norm for cann and bug fix (#24462) * add hardswish for cann * gemm cann bug fix * fix indentation * cann: add layer norm * cann: add instance norm * add supportBackend * cann: layer norm does not support axis=-1 due to 1d mat issue * disable instance norm for now * fix doc * remove tensor desc initialization for 1D tensor	2023-11-15 17:57:52 +03:00
fengyuentau	031846f2e1	remove filter	2023-11-13 14:47:40 +08:00
Alexander Smorkalov	960a926055	Merge pull request #24510 from asmorkalov:as/softmax_rvv Enable softmax layer vectorization on RISC-V RVV #24510 Related: https://github.com/opencv/opencv/pull/24466 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-11-11 09:09:14 +03:00
Dmitry Kurtaev	b7ec2ebb55	Merge pull request #24483 from dkurt:dnn_fusion_commutative_ops Commutative rules for DNN subgraphs fusion #24483 ### Pull Request Readiness Checklist related: https://github.com/opencv/opencv/pull/24463#issuecomment-1783033931 See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-08 16:26:33 +03:00
Abduragim Shtanchaev	9d0c8a9edb	Merge pull request #24445 from Abdurrahheem:ash/dev_einsum_pref Einsum Layer Performance Test #24445 ## This PR adds performance tests for Einsum Layer. See below results of performance test on different inputs Notation: - WX: windows10_x64 - MX: macos_x64 - MA: macos_arm64 - UX: ubuntu_x64 - UA: ubuntu_arm64 All data in ms (milliseconds). Gemm is backend for matrix multiplication --- Benchmarks: \| Equation \| Inputs Mat Dims \| UX (ms) \| UA (ms) \| MX (ms) \| MA (ms) \| WX (ms) \| \|-------------------------\|-----------------------------------\|----------------\|---------\|---------\|---------\|---------\| \| "ij, jk -> ik" \| [2, 3], [3,2] \| 0.04 ± 0.00 \| - \| - \| - \| - \| \| "ij, jk -> ik" \| [20, 30], [30,20] \| 0.08 ± 0.00 \| - \| - \| - \| - \| \| "ij, jk -> ik" \| [113, 127], [127,113] \| 2.41 ± 0.05 \| - \| - \| - \| - \| \| "imkj, injs -> imnks" \| [1, 4, 7, 9], [1, 5, 9, 8] \| 0.11 ± 0.00 \| - \| - \| - \| - \| \| "imkj, injs -> imnks" \| [1, 4, 70, 90], [1, 5, 90, 80] \| 15.49 ± 0.46 \| - \| - \| - \| - \| \| "imkj, injs -> imnks" \| [1, 4, 73, 91], [1, 5, 91, 57] \| 11.53 ± 0.06 \| - \| - \| - \| - \| \| "ij -> i" \| [30, 40] \| 0.03 ± 0.00 \| - \| - \| - \| - \| \| "ij -> i" \| [113, 374] \| 0.13 ± 0.00 \| - \| - \| - \| - \| \| "...ij -> ...i" \| [30, 40] \| 0.03 ± 0.00 \| - \| - \| - \| - \| \| "...ij -> ...i" \| [113, 374] \| 0.13 ± 0.00 \| - \| - \| - \| - \| \| "...ij, ...jk -> ...ik" \| [40, 50], [50,80] \| 0.37 ± 0.01 \| - \| - \| - \| - \| \| "...ij, ...jk -> ...ik" \| [47, 51], [51, 83] \| 0.43 ± 0.01 \| - \| - \| - \| - \| ----- ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. 
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-08 11:56:21 +03:00
Yuantao Feng	6079e22523	Merge pull request #24500 from fengyuentau:test_layer_fusion dnn (onnx): add subgraph fusion tests #24500 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-07 17:40:31 +03:00
Yuantao Feng	ee0822dc4d	Merge pull request #24378 from fengyuentau:instance_norm

dnn onnx: add instance norm layer #24378

Resolves https://github.com/opencv/opencv/issues/24377
Relates https://github.com/opencv/opencv/pull/24092#discussion_r1349841644

| Perf | multi-thread | single-thread |
| - | - | - |
| x: [2, 64, 180, 240] | 3.95ms | 11.12ms |

Todo:
- [x] speed up by multi-threading
- [x] add perf
- [x] add backend: OpenVINO
- [x] add backend: CUDA
- [x] add backend: OpenCL (no fp16)
- [ ] add backend: CANN (will be done via https://github.com/opencv/opencv/pull/24462)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

```
force_builders=Linux OpenCL,Win64 OpenCL,Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu:18.04
modules_filter:Custom=none
disable_ipp:Custom=ON
```
	2023-11-07 12:59:10 +03:00
Wanli	ed52f7feea	Improve and refactor softmax layer (#24466) * improve and refactor softmax layer * fix building error * compatible region layer * fix axisStep when disable SIMD * fix dynamic array * try to fix error * use nlanes from VTraits * move axisBias to srcOffset * fix bug caused by axisBias * remove macro * replace #ifdef with #if for CV_SIMD	2023-11-06 04:48:32 +03:00
Dmitry Kurtaev	fa56623458	Merge pull request #24463 from dkurt:dnn_shared_nodes_fusion DNN graph fusion with shared nodes #24463 ### Pull Request Readiness Checklist For now, nodes from matched pattern are removed during the matching process so if nodes are used in similar subgraph, they cannot be found. required for https://github.com/opencv/opencv/pull/24397 Merge with extra: https://github.com/opencv/opencv_extra/pull/1115 A part from [model_name](https://github.com/onnx/models/blob/main/vision/object_detection_segmentation/fcn/model/fcn-resnet101-11.onnx) with two Resize subgraphs with shared nodes: ![image](https://github.com/opencv/opencv/assets/25801568/611d89d9-12fb-4add-9218-13b10d2c086a) See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-03 12:34:09 +03:00
Yuantao Feng	c91af16fa7	Merge pull request #24409 from fengyuentau:norm_kernel dnn: add shared fastNorm kernel for mvn, instance norm and layer norm #24409 Relates https://github.com/opencv/opencv/pull/24378#issuecomment-1756906570 TODO: - [x] add fastNorm - [x] refactor layer norm with fastNorm - [x] refactor mvn with fastNorm - [ ] add onnx mvn in importer (in a new PR?) - [ ] refactor instance norm with fastNorm (in another PR https://github.com/opencv/opencv/pull/24378, need to merge this one first though) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-11-01 14:33:57 +03:00
Kumataro	1911c63826	fix: supress GCC13 warnings (#24434) * fix: supress GCC13 warnings * fix for review and compile-warning on MacOS	2023-10-26 09:00:58 +03:00
Abduragim Shtanchaev	a3b3a589f9	Merge pull request #24322 from Abdurrahheem:ash/dev_einsum_ellips Ellipses supported added for Einsum Layer #24322 This PR addresses issues not covered in #24037; they are listed below. The test case for this patch is in PR [#1106](https://github.com/opencv/opencv_extra/pull/1106) in opencv_extra. Added: - [x] Broadcasting reduction "...ii ->...I" - [x] Add lazy shape deduction. "...ij, ...jk->...ik" Features to add: - [ ] Add implicit output computation support. "bij,bjk ->" (output subscripts should be "bik") - [ ] Add support for CUDA backend - [ ] BatchWiseMultiply optimize - [ ] Performance test ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-10-24 16:47:00 +03:00

2213 Commits