opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-14 08:59:11 +08:00

Author	SHA1	Message	Date
Alexander Smorkalov	25c28c5da4	Merge pull request #23485 from zihaomu:add_onnx_where DNN: add ONNX where node support	2023-05-05 09:21:07 +03:00
zihaomu	0513741a85	add broadcast where node	2023-05-05 11:16:19 +08:00
Alexander Smorkalov	351589e5fb	Merge pull request #23491 from fengyuentau:patch_for_segment_anything Fixes for Segment Anything	2023-05-04 21:07:58 +03:00
Alexander Alekhin	3c76b33532	Merge pull request #22614 from zihaomu:add_std2DB_API	2023-05-01 19:37:23 +00:00
zihaomu	8be93a6de7	add scale factor to DB demo.	2023-04-30 22:03:21 +08:00
Abduragim Shtanchaev	3b1ee0549b	added test for lstm without hidden states initialization	2023-04-25 16:01:13 +03:00
Alexander Smorkalov	e3e1f704a4	Merge pull request #23528 from WanliZhong:issue23278 DNN/CUDA: make 'abcd op 1b11' broadcast eltwise operator support cuda	2023-04-24 19:31:55 +03:00
Dmitry Kurtaev	aa57833ad5	Merge pull request #23409 from dkurt:dnn_tflite_quant Import and inference INT8 quantized TFLite model #23409 ### Pull Request Readiness Checklist * Support quantized TFLite models * Enable fused activations (FP32, INT8) Merge with extra: https://github.com/opencv/opencv_extra/pull/1048 ![res](https://user-images.githubusercontent.com/25801568/231433201-566b4bd6-ccff-462c-9e74-adbdcdf3648b.png) on the image, green boxes are from TFLite and red boxes from OpenCV See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-04-24 13:44:10 +03:00
Abduragim Shtanchaev	e4e774d42b	Merge pull request #23475 from Abdurrahheem:lstm_fix_initialization Fix ONNX parser for single-layer LSTM hidden and cell states #23475 ### Fix ONNX parser for single-layer LSTM hidden and cell states ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake This PR addresses #21118 [issue](https://github.com/opencv/opencv/issues/21118). The problem is that the ONNX parser is unable to read the hidden state and cell state for single-layer LSTMs. This PR fixes the issue by updating the parser to correctly read hidden and cell states.	2023-04-24 13:39:41 +03:00
wanli	e4360294c5	make 'abcd op 1b11' broadcast support cuda	2023-04-23 17:46:50 +08:00
Alexander Alekhin	9ab0ff6cf2	Merge pull request #23511 from zihaomu:issue_23465	2023-04-22 04:01:26 +00:00
Zihao Mu	601778e0e6	Merge pull request #22750 from zihaomu:improve_blobFromImage DNN: Add New API blobFromImageParam #22750 The purpose of this PR: 1. Add a new API, `blobFromImageParam`, that extends the `blobFromImage` API. It supports different data layouts (NCHW or NHWC) and letter_box. 2. ~~`blobFromImage` can output `CV_16F`~~ ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-04-21 19:10:17 +03:00
zihaomu	54e1a8709d	fix the bug, disable the fast1x1 when padding is not 0.	2023-04-21 10:55:07 +08:00
Yuantao Feng	3c1fcd5deb	Merge pull request #23401 from fengyuentau:fix_cann_layer_support dnn: Support more operators in CANN backend #23401 This PR adds support for the following layers: - [x] Sub - [x] PRelu - [x] DeConv - [x] Also warn users when the backend is switched back to the default because some of the layers are not supported. - [ ] [Dropped] LSTM: some hacks (adding layers) were introduced, which make it even harder to build the graph for the CANN backend. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-04-20 10:18:35 +03:00
Abduragim Shtanchaev	b3a2444bcf	Merge pull request #23501 from Abdurrahheem:additional_lstm_tests Added LSTM and GRU tests for various batch and input length sizes #23501 Added tests with various sequence length and batch sizes Test data: https://github.com/opencv/opencv_extra/pull/1057 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-04-20 10:11:33 +03:00
Alexander Smorkalov	aa17f881b1	Merge pull request #23482 from zihaomu:onnx_opset13_split DNN: support the split node of onnx opset >= 13	2023-04-14 11:59:57 +03:00
fengyuentau	4f99e5ab37	allow null constant_value in Pad and ignore it when loading	2023-04-14 16:50:16 +08:00
fengyuentau	88cacd35c5	support broadcast on axis > 1 for Expand	2023-04-14 15:52:27 +08:00
Alexander Smorkalov	136121f6ee	Merge pull request #22660 from zhouzq-thu:4.x Fix objectness is not assigned in dnn::region_layer	2023-04-12 09:34:58 +03:00
Alexander Smorkalov	3f02c9d5b9	Merge pull request #23310 from hanliutong:fix_hal_compatibility Fix HAL compatibility layer	2023-04-11 12:43:54 +03:00
zihaomu	51281f8d69	support the split node of onnx opset >= 13	2023-04-11 16:18:50 +08:00
Yuantao Feng	3a83a35ab0	Merge pull request #23296 from fengyuentau:fix_identifying_constant Fix identifying initializers in ONNX graph simplification #23296 Fixes https://github.com/opencv/opencv/issues/23295 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-04-06 15:35:31 +03:00
Dmitry Kurtaev	5e1d33329b	Several fixes for ONNX importer: Expand, Gather	2023-03-27 22:15:26 +03:00
HAN Liutong	a809ae4e88	Fix HAL compatibility layer and modify use cases.	2023-03-27 21:30:47 +08:00
Dmitry Kurtaev	5df6b4a756	Merge pull request #23325 from dkurt:dnn_input_info Propagate inputs info for ONNX and TFLite models ### Pull Request Readiness Checklist Needed for generic applications such as benchmarking pipelines, so that OpenCV can report the default input shapes specified in the models. See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-03-21 14:50:53 +03:00
Alexander Smorkalov	924a65413a	Merge pull request #23357 from zihaomu:fix_winograd_error_32bit DNN : fix bug in 32 bit cpu	2023-03-15 11:24:54 +03:00
zihaomu	6bac5453d1	fix bug in 32 bit cpu	2023-03-15 08:24:55 +08:00
Alexander Smorkalov	ccbc784195	Merge pull request #23354 from zihaomu:issue_23351 DNN : fix bug in layer fusion	2023-03-14 17:23:25 +03:00
zihaomu	386be97ce2	fix bug in layer fusion	2023-03-14 19:06:06 +08:00
tingbo.liao	7d032de7e8	Fix test case failures. Four tests in opencv_test_dnn failed, listed below: * Test_Caffe_layers.Conv_Elu/0, where GetParam() = OCV/CPU * Test_ONNX_layers.ConvResizePool1d/0, where GetParam() = OCV/CPU * Test_TensorFlow_layers.tf_reshape_nhwc/0, where GetParam() = OCV/CPU * Test_Torch_layers.net_inception_block/0, where GetParam() = OCV/CPU In the winofunc_AtXA_8x8_f32 and winofunc_BtXB_8x8_f32 implementations, incorrect input parameters caused the test failures. Add four new, distinct variables for the last four input parameters of v_transpose4x4 to fix the bugs, and update the related comments. Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>	2023-03-14 17:05:19 +08:00
Alexander Smorkalov	22a52766dc	Merge pull request #23343 from zihaomu:fix_test_onnx_conf DNN Test ONNX: Fix the logic of the test case	2023-03-13 21:48:41 +03:00
Yuantao Feng	b94e13c8ae	Merge pull request #23319 from fengyuentau:fix_zoo_issue_136 Related issue: https://github.com/opencv/opencv_zoo/issues/136 Features added: - Support operators with multiple outputs: ONNX Split. - Support Slice without steps. Bugs fixed: - Wrong settings in ClipByValue (Relu6). - Wrong calculation of pads in the convolution layer (it is wrong in general, but is fixed only for CANN for now). ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-03-13 21:46:33 +03:00
zihaomu	ee3740af00	move global skip out of the if block, and add opencv_deny_list	2023-03-13 22:16:51 +08:00
Zihao Mu	e03e2e7f94	Merge pull request #23192 from zihaomu:clean_up_SIMD_code ### Purpose of this PR: - Move all dispatch and SIMD code of `convolution layer` into the `simd.hpp` file. - Support Winograd on AVX-only machines. - Rename the folder from `fast_conv` to `cpu_kernels`. In the future, we can put CPU optimizations for other layers into it, like `GEMM` or `MatMul`. ## Performance Test Since this patch focuses only on code style, performance is expected to be the same as before. Test with the following script: `./bin/opencv_perf_dnn '--gtest_filter=conv' --gtest_output="xml:../1-0th.xml" --perf_threads=1` ### Test on X86 platform Min (ms) \|Name of Test\|4.x \| patch \| 4.x vs patch (x-factor)\| \|---\|:-:\|:-:\|:-:\| \|conv1d::Conv1D::(GFLOPS=0.000, K=[3], IN={1, 2, 19}, OCN=2, G=2, S=2, P=(1, 1), BIAS, OCV/CPU)\|0.001\|0.001\|0.98\| \|conv1d::Conv1D::(GFLOPS=0.000, K=[3], IN={1, 2, 25}, OCN=2, G=2, P=(2, 2), PM=SAME, OCV/CPU)\|0.001\|0.001\|0.95\| \|conv1d::Conv1D::(GFLOPS=0.000, K=[3], IN={1, 6, 10}, OCN=6, PM=VALID, BIAS, OCV/CPU)\|0.001\|0.001\|0.97\| \|conv3d::Conv3D::(GFLOPS=0.000, K=[1 x 1 x 1], IN={1, 4, 9, 10, 10}, OCN=4, S=[1 x 1 x 2], P=(1, 1) x (1, 1) x (1, 1), PM=VALID, OCV/CPU)\|0.002\|0.002\|1.04\| \|conv3d::Conv3D::(GFLOPS=0.000, K=[1 x 1 x 1], IN={1, 8, 1, 10, 10}, OCN=8, G=8, P=(1, 1) x (1, 1) x (1, 1), BIAS, OCV/CPU)\|0.002\|0.002\|0.94\| \|conv3d::Conv3D::(GFLOPS=0.000, K=[3 x 3 x 3], IN={1, 2, 19, 19, 19}, OCN=2, G=2, S=[2 x 2 x 2], P=(1, 1) x (1, 1) x (1, 1), BIAS, OCV/CPU)\|0.040\|0.044\|0.93\| \|conv3d::Conv3D::(GFLOPS=0.000, K=[3 x 4 x 2], IN={1, 4, 8, 10, 10}, OCN=4, G=4, S=[1 x 2 x 1], BIAS, OCV/CPU)\|0.010\|0.010\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.001, K=[3 x 3 x 3], IN={1, 2, 25, 19, 19}, OCN=2, G=2, S=[1 x 2 x 2], P=(2, 2) x (2, 2) x (2, 2), PM=SAME, OCV/CPU)\|0.106\|0.103\|1.03\| \|conv3d::Conv3D::(GFLOPS=0.002, K=[3 x 1 x 4], IN={1, 14, 5, 10, 10}, OCN=14, PM=SAME, OCV/CPU)\|0.041\|0.040\|1.03\| 
\|conv3d::Conv3D::(GFLOPS=0.006, K=[5 x 5 x 5], IN={1, 4, 50, 19, 19}, OCN=4, S=[2 x 2 x 2], P=(1, 1) x (1, 1) x (1, 1), PM=VALID, OCV/CPU)\|0.340\|0.329\|1.03\| \|conv3d::Conv3D::(GFLOPS=0.027, K=[3 x 3 x 3], IN={1, 6, 10, 38, 50}, OCN=6, PM=VALID, BIAS, OCV/CPU)\|0.590\|0.567\|1.04\| \|conv3d::Conv3D::(GFLOPS=0.030, K=[5 x 5 x 5], IN={1, 6, 19, 19, 19}, OCN=6, G=2, OCV/CPU)\|1.374\|1.314\|1.05\| \|conv3d::Conv3D::(GFLOPS=0.045, K=[7 x 7 x 7], IN={1, 2, 38, 38, 38}, OCN=2, S=[1 x 2 x 1], OCV/CPU)\|3.715\|3.528\|1.05\| \|conv3d::Conv3D::(GFLOPS=0.053, K=[3 x 3 x 3], IN={1, 10, 98, 10, 10}, OCN=10, PM=SAME, OCV/CPU)\|1.181\|1.166\|1.01\| \|conv3d::Conv3D::(GFLOPS=0.071, K=[7 x 7 x 7], IN={1, 6, 15, 19, 19}, OCN=6, S=[2 x 1 x 1], P=(3, 3) x (3, 3) x (3, 3), PM=SAME, BIAS, OCV/CPU)\|2.689\|2.587\|1.04\| \|conv3d::Conv3D::(GFLOPS=0.093, K=[5 x 5 x 5], IN={1, 4, 40, 75, 75}, OCN=4, S=[2 x 2 x 2], OCV/CPU)\|4.754\|4.500\|1.06\| \|conv3d::Conv3D::(GFLOPS=0.116, K=[5 x 5 x 5], IN={1, 2, 21, 75, 100}, OCN=2, BIAS, OCV/CPU)\|9.612\|9.112\|1.05\| \|conv3d::Conv3D::(GFLOPS=1.267, K=[5 x 5 x 5], IN={1, 3, 75, 75, 100}, OCN=3, PM=SAME, BIAS, OCV/CPU)\|69.000\|64.676\|1.07\| \|conv3d::Conv3D::(GFLOPS=1.343, K=[3 x 3 x 3], IN={1, 11, 9, 150, 200}, OCN=11, PM=VALID, BIAS, OCV/CPU)\|20.248\|18.451\|1.10\| \|conv::Conv::(GFLOPS=0.177, K=[1 x 1], IN={1, 512, 26, 26}, OCN=256, OCV/CPU)\|1.395\|1.392\|1.00\| \|conv::Conv::(GFLOPS=0.177, K=[1 x 1], IN={1, 1024, 13, 13}, OCN=512, OCV/CPU)\|1.990\|1.984\|1.00\| \|conv::Conv::(GFLOPS=0.178, K=[1 x 1], IN={1, 256, 52, 52}, OCN=128, OCV/CPU)\|1.393\|1.360\|1.02\| \|conv::Conv::(GFLOPS=0.210, K=[1 x 1], IN={1, 576, 38, 50}, OCN=96, PM=SAME, BIAS, OCV/CPU)\|1.813\|1.744\|1.04\| \|conv::Conv::(GFLOPS=0.231, K=[3 x 3], IN={1, 128, 56, 56}, OCN=32, P=[1 x 1], OCV/CPU)\|1.190\|1.191\|1.00\| \|conv::Conv::(GFLOPS=0.231, K=[3 x 3], IN={1, 256, 14, 14}, OCN=256, P=[1 x 1], OCV/CPU)\|1.286\|1.284\|1.00\| \|conv::Conv::(GFLOPS=0.280, K=[1 x 1], IN={1, 
576, 38, 50}, OCN=128, PM=SAME, BIAS, OCV/CPU)\|2.295\|2.279\|1.01\| \|conv::Conv::(GFLOPS=0.302, K=[3 x 3], IN={1, 64, 64, 64}, OCN=64, PM=SAME, OCV/CPU)\|1.322\|1.331\|0.99\| \|conv::Conv::(GFLOPS=0.357, K=[1 x 1], IN={1, 64, 208, 208}, OCN=64, OCV/CPU)\|3.784\|3.533\|1.07\| \|conv::Conv::(GFLOPS=0.420, K=[3 x 3], IN={1, 96, 38, 50}, OCN=128, PM=SAME, BIAS, OCV/CPU)\|1.838\|1.844\|1.00\| \|conv::Conv::(GFLOPS=0.472, K=[3 x 3], IN={1, 128, 40, 40}, OCN=128, PM=SAME, OCV/CPU)\|1.957\|1.959\|1.00\| \|conv::Conv::(GFLOPS=0.472, K=[3 x 3], IN={1, 256, 20, 20}, OCN=256, PM=SAME, OCV/CPU)\|2.596\|2.573\|1.01\| \|conv::Conv::(GFLOPS=0.472, K=[3 x 3], IN={1, 512, 10, 10}, OCN=512, PM=SAME, OCV/CPU)\|4.183\|4.083\|1.02\| \|conv::Conv::(GFLOPS=0.561, K=[3 x 3], IN={1, 128, 38, 50}, OCN=128, PM=SAME, BIAS, OCV/CPU)\|2.413\|2.406\|1.00\| \|conv::Conv::(GFLOPS=0.624, K=[3 x 3], IN={1, 128, 46, 46}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|2.538\|2.546\|1.00\| \|conv::Conv::(GFLOPS=0.701, K=[3 x 3], IN={1, 128, 38, 50}, OCN=160, PM=SAME, BIAS, OCV/CPU)\|2.972\|2.980\|1.00\| \|conv::Conv::(GFLOPS=0.798, K=[3 x 3], IN={1, 64, 104, 104}, OCN=64, P=[1 x 1], OCV/CPU)\|3.452\|3.464\|1.00\| \|conv::Conv::(GFLOPS=0.798, K=[3 x 3], IN={1, 128, 52, 52}, OCN=128, P=[1 x 1], OCV/CPU)\|3.082\|3.105\|0.99\| \|conv::Conv::(GFLOPS=0.798, K=[3 x 3], IN={1, 256, 26, 26}, OCN=256, P=[1 x 1], OCV/CPU)\|4.043\|3.919\|1.03\| \|conv::Conv::(GFLOPS=0.798, K=[3 x 3], IN={1, 512, 13, 13}, OCN=512, P=[1 x 1], OCV/CPU)\|5.538\|5.531\|1.00\| \|conv::Conv::(GFLOPS=0.830, K=[3 x 3], IN={1, 64, 75, 100}, OCN=96, PM=SAME, BIAS, OCV/CPU)\|3.393\|3.418\|0.99\| \|conv::Conv::(GFLOPS=0.958, K=[3 x 3], IN={1, 192, 38, 38}, OCN=192, PM=SAME, OCV/CPU)\|4.325\|4.234\|1.02\| \|conv::Conv::(GFLOPS=0.958, K=[3 x 3], IN={1, 384, 19, 19}, OCN=384, PM=SAME, OCV/CPU)\|6.009\|5.908\|1.02\| \|conv::Conv::(GFLOPS=1.022, K=[3 x 3], IN={1, 576, 19, 19}, OCN=273, PM=SAME, BIAS, OCV/CPU)\|6.557\|6.376\|1.03\| 
\|conv::Conv::(GFLOPS=1.112, K=[3 x 3], IN={1, 512, 10, 10}, OCN=1206, P=[1 x 1], BIAS, OCV/CPU)\|10.114\|9.472\|1.07\| \|conv::Conv::(GFLOPS=1.181, K=[3 x 3], IN={1, 64, 160, 200}, OCN=128, S=[2 x 2], P=[1 x 1], BIAS, OCV/CPU)\|10.373\|9.879\|1.05\| \|conv::Conv::(GFLOPS=1.182, K=[3 x 3], IN={1, 32, 320, 400}, OCN=64, S=[2 x 2], P=[1 x 1], BIAS, OCV/CPU)\|12.782\|11.624\|1.10\| \|conv::Conv::(GFLOPS=1.195, K=[9 x 9], IN={1, 32, 240, 320}, OCN=3, P=[4 x 4], BIAS, OCV/CPU)\|90.931\|90.552\|1.00\| \|conv::Conv::(GFLOPS=1.196, K=[3 x 3], IN={1, 384, 26, 26}, OCN=256, P=[1 x 1], OCV/CPU)\|6.091\|5.818\|1.05\| \|conv::Conv::(GFLOPS=1.210, K=[3 x 3], IN={1, 32, 256, 256}, OCN=32, PM=SAME, OCV/CPU)\|7.083\|6.643\|1.07\| \|conv::Conv::(GFLOPS=1.245, K=[3 x 3], IN={1, 64, 75, 75}, OCN=192, PM=SAME, BIAS, OCV/CPU)\|5.054\|5.059\|1.00\| \|conv::Conv::(GFLOPS=1.245, K=[3 x 3], IN={1, 96, 75, 100}, OCN=96, PM=SAME, BIAS, OCV/CPU)\|5.005\|4.931\|1.02\| \|conv::Conv::(GFLOPS=1.248, K=[3 x 3], IN={1, 256, 46, 46}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|4.951\|5.065\|0.98\| \|conv::Conv::(GFLOPS=1.258, K=[3 x 3], IN={1, 1280, 10, 10}, OCN=546, PM=SAME, BIAS, OCV/CPU)\|11.957\|11.293\|1.06\| \|conv::Conv::(GFLOPS=1.261, K=[3 x 3], IN={1, 192, 38, 50}, OCN=192, PM=SAME, BIAS, OCV/CPU)\|5.328\|5.250\|1.01\| \|conv::Conv::(GFLOPS=1.416, K=[3 x 3], IN={1, 128, 62, 82}, OCN=128, BIAS, OCV/CPU)\|5.544\|5.292\|1.05\| \|conv::Conv::(GFLOPS=1.500, K=[3 x 3], IN={1, 128, 64, 84}, OCN=128, BIAS, OCV/CPU)\|6.186\|5.893\|1.05\| \|conv::Conv::(GFLOPS=1.586, K=[3 x 3], IN={1, 128, 66, 86}, OCN=128, BIAS, OCV/CPU)\|6.153\|5.834\|1.05\| \|conv::Conv::(GFLOPS=1.595, K=[3 x 3], IN={1, 256, 26, 26}, OCN=512, P=[1 x 1], OCV/CPU)\|8.154\|8.107\|1.01\| \|conv::Conv::(GFLOPS=1.595, K=[3 x 3], IN={1, 256, 52, 52}, OCN=512, S=[2 x 2], P=[1 x 1], OCV/CPU)\|12.699\|12.256\|1.04\| \|conv::Conv::(GFLOPS=1.595, K=[3 x 3], IN={1, 512, 13, 13}, OCN=1024, P=[1 x 1], OCV/CPU)\|11.355\|11.217\|1.01\| 
\|conv::Conv::(GFLOPS=1.595, K=[3 x 3], IN={1, 512, 26, 26}, OCN=1024, S=[2 x 2], P=[1 x 1], OCV/CPU)\|19.062\|17.814\|1.07\| \|conv::Conv::(GFLOPS=1.596, K=[3 x 3], IN={1, 64, 104, 104}, OCN=128, P=[1 x 1], OCV/CPU)\|6.820\|6.531\|1.04\| \|conv::Conv::(GFLOPS=1.596, K=[3 x 3], IN={1, 64, 208, 208}, OCN=128, S=[2 x 2], P=[1 x 1], OCV/CPU)\|14.502\|13.483\|1.08\| \|conv::Conv::(GFLOPS=1.596, K=[3 x 3], IN={1, 128, 52, 52}, OCN=256, P=[1 x 1], OCV/CPU)\|6.270\|6.123\|1.02\| \|conv::Conv::(GFLOPS=1.596, K=[3 x 3], IN={1, 128, 104, 104}, OCN=256, S=[2 x 2], P=[1 x 1], OCV/CPU)\|13.173\|12.451\|1.06\| \|conv::Conv::(GFLOPS=1.598, K=[3 x 3], IN={1, 32, 208, 208}, OCN=64, P=[1 x 1], OCV/CPU)\|8.326\|7.652\|1.09\| \|conv::Conv::(GFLOPS=1.598, K=[3 x 3], IN={1, 32, 416, 416}, OCN=64, S=[2 x 2], P=[1 x 1], OCV/CPU)\|17.605\|16.465\|1.07\| \|conv::Conv::(GFLOPS=1.659, K=[3 x 3], IN={1, 960, 10, 10}, OCN=960, PM=SAME, OCV/CPU)\|15.675\|14.771\|1.06\| \|conv::Conv::(GFLOPS=1.660, K=[3 x 3], IN={1, 128, 75, 75}, OCN=128, G=128, P=[1 x 1], BIAS, OCV/CPU)\|0.420\|0.423\|0.99\| \|conv::Conv::(GFLOPS=1.660, K=[3 x 3], IN={1, 128, 75, 75}, OCN=128, PM=SAME, OCV/CPU)\|6.788\|6.491\|1.05\| \|conv::Conv::(GFLOPS=1.675, K=[3 x 3], IN={1, 128, 68, 88}, OCN=128, BIAS, OCV/CPU)\|6.456\|6.168\|1.05\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 256, 38, 38}, OCN=256, G=256, P=[1 x 1], BIAS, OCV/CPU)\|0.263\|0.261\|1.01\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 256, 38, 38}, OCN=256, PM=SAME, OCV/CPU)\|7.690\|7.398\|1.04\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 512, 19, 19}, OCN=512, G=512, P=[1 x 1], BIAS, OCV/CPU)\|0.200\|0.202\|0.99\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 512, 19, 19}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|10.542\|10.464\|1.01\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 512, 19, 19}, OCN=512, PM=SAME, OCV/CPU)\|10.876\|10.728\|1.01\| \|conv::Conv::(GFLOPS=1.766, K=[3 x 3], IN={1, 128, 70, 90}, OCN=128, BIAS, OCV/CPU)\|7.194\|6.768\|1.06\| 
\|conv::Conv::(GFLOPS=1.859, K=[3 x 3], IN={1, 128, 72, 92}, OCN=128, BIAS, OCV/CPU)\|7.099\|6.731\|1.05\| \|conv::Conv::(GFLOPS=1.888, K=[3 x 3], IN={1, 1024, 10, 10}, OCN=1024, G=1024, P=[1 x 1], BIAS, OCV/CPU)\|0.147\|0.162\|0.91\| \|conv::Conv::(GFLOPS=1.888, K=[3 x 3], IN={1, 1024, 10, 10}, OCN=1024, PM=SAME, OCV/CPU)\|18.558\|17.141\|1.08\| \|conv::Conv::(GFLOPS=1.954, K=[3 x 3], IN={1, 128, 74, 94}, OCN=128, BIAS, OCV/CPU)\|7.641\|7.219\|1.06\| \|conv::Conv::(GFLOPS=1.995, K=[9 x 9], IN={1, 3, 320, 400}, OCN=32, P=[4 x 4], BIAS, OCV/CPU)\|22.666\|20.999\|1.08\| \|conv::Conv::(GFLOPS=2.052, K=[3 x 3], IN={1, 128, 76, 96}, OCN=128, BIAS, OCV/CPU)\|8.523\|7.921\|1.08\| \|conv::Conv::(GFLOPS=2.100, K=[3 x 3], IN={1, 144, 75, 75}, OCN=144, PM=SAME, OCV/CPU)\|8.514\|8.109\|1.05\| \|conv::Conv::(GFLOPS=2.153, K=[3 x 3], IN={1, 128, 78, 98}, OCN=128, BIAS, OCV/CPU)\|8.300\|7.878\|1.05\| \|conv::Conv::(GFLOPS=2.156, K=[3 x 3], IN={1, 576, 19, 19}, OCN=576, PM=SAME, OCV/CPU)\|13.403\|13.131\|1.02\| \|conv::Conv::(GFLOPS=2.255, K=[3 x 3], IN={1, 128, 80, 100}, OCN=128, BIAS, OCV/CPU)\|8.920\|8.357\|1.07\| \|conv::Conv::(GFLOPS=2.719, K=[3 x 3], IN={1, 96, 256, 256}, OCN=96, S=[2 x 2], PM=SAME, OCV/CPU)\|28.827\|27.616\|1.04\| \|conv::Conv::(GFLOPS=3.319, K=[3 x 3], IN={1, 128, 75, 75}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|12.895\|12.670\|1.02\| \|conv::Conv::(GFLOPS=3.321, K=[3 x 3], IN={1, 64, 150, 150}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|14.120\|13.078\|1.08\| \|conv::Conv::(GFLOPS=3.398, K=[7 x 7], IN={1, 128, 46, 46}, OCN=128, P=[3 x 3], BIAS, OCV/CPU)\|27.541\|27.582\|1.00\| \|conv::Conv::(GFLOPS=3.407, K=[3 x 3], IN={1, 512, 19, 19}, OCN=1024, D=[6 x 6], P=[6 x 6], BIAS, OCV/CPU)\|32.367\|31.140\|1.04\| \|conv::Conv::(GFLOPS=3.408, K=[3 x 3], IN={1, 256, 38, 38}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|14.934\|14.910\|1.00\| \|conv::Conv::(GFLOPS=4.247, K=[3 x 3], IN={1, 480, 32, 32}, OCN=480, PM=SAME, OCV/CPU)\|18.289\|18.491\|0.99\| \|conv::Conv::(GFLOPS=4.247, 
K=[5 x 5], IN={1, 144, 128, 128}, OCN=144, S=[2 x 2], PM=SAME, OCV/CPU)\|37.857\|36.845\|1.03\| \|conv::Conv::(GFLOPS=4.566, K=[7 x 7], IN={1, 172, 46, 46}, OCN=128, P=[3 x 3], BIAS, OCV/CPU)\|37.402\|36.566\|1.02\| \|conv::Conv::(GFLOPS=4.993, K=[3 x 3], IN={1, 256, 46, 46}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|19.031\|19.164\|0.99\| \|conv::Conv::(GFLOPS=4.993, K=[3 x 3], IN={1, 512, 46, 46}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|19.019\|19.135\|0.99\| \|conv::Conv::(GFLOPS=4.994, K=[3 x 3], IN={1, 128, 92, 92}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|20.077\|19.400\|1.03\| \|conv::Conv::(GFLOPS=4.997, K=[3 x 3], IN={1, 64, 184, 184}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|21.883\|21.302\|1.03\| \|conv::Conv::(GFLOPS=5.780, K=[5 x 5], IN={1, 672, 32, 32}, OCN=672, S=[2 x 2], PM=SAME, OCV/CPU)\|51.288\|49.851\|1.03\| \|conv::Conv::(GFLOPS=6.116, K=[3 x 3], IN={1, 1152, 16, 16}, OCN=1152, PM=SAME, OCV/CPU)\|27.349\|28.359\|0.96\| \|conv::Conv::(GFLOPS=6.118, K=[3 x 3], IN={1, 144, 128, 128}, OCN=144, PM=SAME, OCV/CPU)\|24.915\|25.130\|0.99\| \|conv::Conv::(GFLOPS=6.637, K=[3 x 3], IN={1, 256, 75, 75}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|25.488\|25.899\|0.98\| \|conv::Conv::(GFLOPS=6.638, K=[3 x 3], IN={1, 128, 150, 150}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|27.346\|27.390\|1.00\| \|conv::Conv::(GFLOPS=6.641, K=[3 x 3], IN={1, 64, 150, 200}, OCN=192, PM=SAME, BIAS, OCV/CPU)\|28.033\|28.301\|0.99\| \|conv::Conv::(GFLOPS=6.641, K=[3 x 3], IN={1, 64, 300, 300}, OCN=64, P=[1 x 1], BIAS, OCV/CPU)\|50.216\|49.970\|1.00\| \|conv::Conv::(GFLOPS=6.814, K=[3 x 3], IN={1, 512, 38, 38}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|29.670\|29.513\|1.01\| \|conv::Conv::(GFLOPS=8.025, K=[3 x 3], IN={1, 1024, 19, 19}, OCN=1206, P=[1 x 1], BIAS, OCV/CPU)\|50.565\|49.634\|1.02\| \|conv::Conv::(GFLOPS=9.986, K=[3 x 3], IN={1, 512, 46, 46}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|37.900\|37.814\|1.00\| \|conv::Conv::(GFLOPS=9.987, K=[3 x 3], IN={1, 256, 92, 92}, OCN=256, P=[1 x 1], BIAS, 
OCV/CPU)\|41.367\|39.742\|1.04\| \|conv::Conv::(GFLOPS=9.989, K=[3 x 3], IN={1, 128, 184, 184}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|49.128\|50.350\|0.98\| \|conv::Conv::(GFLOPS=9.993, K=[3 x 3], IN={1, 64, 368, 368}, OCN=64, P=[1 x 1], BIAS, OCV/CPU)\|79.643\|80.645\|0.99\| \|conv::Conv::(GFLOPS=10.087, K=[3 x 3], IN={1, 576, 38, 50}, OCN=512, PM=SAME, BIAS, OCV/CPU)\|41.439\|40.895\|1.01\| \|conv::Conv::(GFLOPS=10.701, K=[3 x 3], IN={1, 512, 38, 38}, OCN=804, P=[1 x 1], BIAS, OCV/CPU)\|46.504\|46.220\|1.01\| \|conv::Conv::(GFLOPS=11.797, K=[5 x 5], IN={1, 240, 64, 64}, OCN=240, PM=SAME, OCV/CPU)\|98.086\|96.842\|1.01\| \|conv::Conv::(GFLOPS=11.797, K=[5 x 5], IN={1, 480, 32, 32}, OCN=480, PM=SAME, OCV/CPU)\|102.447\|97.299\|1.05\| \|conv::Conv::(GFLOPS=16.987, K=[5 x 5], IN={1, 1152, 16, 16}, OCN=1152, PM=SAME, OCV/CPU)\|145.047\|144.996\|1.00\| \|conv::Conv::(GFLOPS=23.122, K=[5 x 5], IN={1, 672, 32, 32}, OCN=672, PM=SAME, OCV/CPU)\|206.104\|195.543\|1.05\| ### Test on M1(ARM) platform \|Name of Test\|4.x\|patch\|4.x vs patch (x-factor)\| \|---\|:-:\|:-:\|:-:\| \|conv1d::Conv1D::(GFLOPS=0.000, K=[3], IN={1, 2, 19}, OCN=2, G=2, S=2, P=(1, 1), BIAS, OCV/CPU)\|0.001\|0.001\|0.97\| \|conv1d::Conv1D::(GFLOPS=0.000, K=[3], IN={1, 2, 25}, OCN=2, G=2, P=(2, 2), PM=SAME, OCV/CPU)\|0.001\|0.001\|0.94\| \|conv1d::Conv1D::(GFLOPS=0.000, K=[3], IN={1, 6, 10}, OCN=6, PM=VALID, BIAS, OCV/CPU)\|0.002\|0.002\|0.92\| \|conv3d::Conv3D::(GFLOPS=0.000, K=[1 x 1 x 1], IN={1, 4, 9, 10, 10}, OCN=4, S=[1 x 1 x 2], P=(1, 1) x (1, 1) x (1, 1), PM=VALID, OCV/CPU)\|0.003\|0.003\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.000, K=[1 x 1 x 1], IN={1, 8, 1, 10, 10}, OCN=8, G=8, P=(1, 1) x (1, 1) x (1, 1), BIAS, OCV/CPU)\|0.003\|0.003\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.000, K=[3 x 3 x 3], IN={1, 2, 19, 19, 19}, OCN=2, G=2, S=[2 x 2 x 2], P=(1, 1) x (1, 1) x (1, 1), BIAS, OCV/CPU)\|0.031\|0.031\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.000, K=[3 x 4 x 2], IN={1, 4, 8, 10, 10}, OCN=4, G=4, S=[1 x 2 x 1], BIAS, 
OCV/CPU)\|0.009\|0.009\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.001, K=[3 x 3 x 3], IN={1, 2, 25, 19, 19}, OCN=2, G=2, S=[1 x 2 x 2], P=(2, 2) x (2, 2) x (2, 2), PM=SAME, OCV/CPU)\|0.066\|0.066\|1.01\| \|conv3d::Conv3D::(GFLOPS=0.002, K=[3 x 1 x 4], IN={1, 14, 5, 10, 10}, OCN=14, PM=SAME, OCV/CPU)\|0.102\|0.102\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.006, K=[5 x 5 x 5], IN={1, 4, 50, 19, 19}, OCN=4, S=[2 x 2 x 2], P=(1, 1) x (1, 1) x (1, 1), PM=VALID, OCV/CPU)\|0.328\|0.328\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.027, K=[3 x 3 x 3], IN={1, 6, 10, 38, 50}, OCN=6, PM=VALID, BIAS, OCV/CPU)\|0.693\|0.747\|0.93\| \|conv3d::Conv3D::(GFLOPS=0.030, K=[5 x 5 x 5], IN={1, 6, 19, 19, 19}, OCN=6, G=2, OCV/CPU)\|1.268\|1.266\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.045, K=[7 x 7 x 7], IN={1, 2, 38, 38, 38}, OCN=2, S=[1 x 2 x 1], OCV/CPU)\|3.530\|3.581\|0.99\| \|conv3d::Conv3D::(GFLOPS=0.053, K=[3 x 3 x 3], IN={1, 10, 98, 10, 10}, OCN=10, PM=SAME, OCV/CPU)\|1.186\|1.188\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.071, K=[7 x 7 x 7], IN={1, 6, 15, 19, 19}, OCN=6, S=[2 x 1 x 1], P=(3, 3) x (3, 3) x (3, 3), PM=SAME, BIAS, OCV/CPU)\|2.682\|2.683\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.093, K=[5 x 5 x 5], IN={1, 4, 40, 75, 75}, OCN=4, S=[2 x 2 x 2], OCV/CPU)\|4.490\|4.501\|1.00\| \|conv3d::Conv3D::(GFLOPS=0.116, K=[5 x 5 x 5], IN={1, 2, 21, 75, 100}, OCN=2, BIAS, OCV/CPU)\|8.914\|8.938\|1.00\| \|conv3d::Conv3D::(GFLOPS=1.267, K=[5 x 5 x 5], IN={1, 3, 75, 75, 100}, OCN=3, PM=SAME, BIAS, OCV/CPU)\|69.819\|69.876\|1.00\| \|conv3d::Conv3D::(GFLOPS=1.343, K=[3 x 3 x 3], IN={1, 11, 9, 150, 200}, OCN=11, PM=VALID, BIAS, OCV/CPU)\|24.058\|22.420\|1.07\| \|conv::Conv::(GFLOPS=0.177, K=[1 x 1], IN={1, 512, 26, 26}, OCN=256, OCV/CPU)\|2.240\|2.236\|1.00\| \|conv::Conv::(GFLOPS=0.177, K=[1 x 1], IN={1, 1024, 13, 13}, OCN=512, OCV/CPU)\|3.132\|3.136\|1.00\| \|conv::Conv::(GFLOPS=0.178, K=[1 x 1], IN={1, 256, 52, 52}, OCN=128, OCV/CPU)\|1.920\|1.919\|1.00\| \|conv::Conv::(GFLOPS=0.210, K=[1 x 1], IN={1, 576, 38, 50}, OCN=96, 
PM=SAME, BIAS, OCV/CPU)\|2.343\|2.346\|1.00\| \|conv::Conv::(GFLOPS=0.231, K=[3 x 3], IN={1, 128, 56, 56}, OCN=32, P=[1 x 1], OCV/CPU)\|1.234\|1.116\|1.11\| \|conv::Conv::(GFLOPS=0.231, K=[3 x 3], IN={1, 256, 14, 14}, OCN=256, P=[1 x 1], OCV/CPU)\|1.109\|1.121\|0.99\| \|conv::Conv::(GFLOPS=0.280, K=[1 x 1], IN={1, 576, 38, 50}, OCN=128, PM=SAME, BIAS, OCV/CPU)\|3.197\|3.084\|1.04\| \|conv::Conv::(GFLOPS=0.302, K=[3 x 3], IN={1, 64, 64, 64}, OCN=64, PM=SAME, OCV/CPU)\|1.123\|1.148\|0.98\| \|conv::Conv::(GFLOPS=0.357, K=[1 x 1], IN={1, 64, 208, 208}, OCN=64, OCV/CPU)\|4.836\|5.061\|0.96\| \|conv::Conv::(GFLOPS=0.420, K=[3 x 3], IN={1, 96, 38, 50}, OCN=128, PM=SAME, BIAS, OCV/CPU)\|1.535\|1.463\|1.05\| \|conv::Conv::(GFLOPS=0.472, K=[3 x 3], IN={1, 128, 40, 40}, OCN=128, PM=SAME, OCV/CPU)\|1.756\|1.584\|1.11\| \|conv::Conv::(GFLOPS=0.472, K=[3 x 3], IN={1, 256, 20, 20}, OCN=256, PM=SAME, OCV/CPU)\|1.821\|1.820\|1.00\| \|conv::Conv::(GFLOPS=0.472, K=[3 x 3], IN={1, 512, 10, 10}, OCN=512, PM=SAME, OCV/CPU)\|7.049\|6.672\|1.06\| \|conv::Conv::(GFLOPS=0.561, K=[3 x 3], IN={1, 128, 38, 50}, OCN=128, PM=SAME, BIAS, OCV/CPU)\|1.967\|1.922\|1.02\| \|conv::Conv::(GFLOPS=0.624, K=[3 x 3], IN={1, 128, 46, 46}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|1.943\|1.977\|0.98\| \|conv::Conv::(GFLOPS=0.701, K=[3 x 3], IN={1, 128, 38, 50}, OCN=160, PM=SAME, BIAS, OCV/CPU)\|2.464\|2.310\|1.07\| \|conv::Conv::(GFLOPS=0.798, K=[3 x 3], IN={1, 64, 104, 104}, OCN=64, P=[1 x 1], OCV/CPU)\|2.860\|2.904\|0.98\| \|conv::Conv::(GFLOPS=0.798, K=[3 x 3], IN={1, 128, 52, 52}, OCN=128, P=[1 x 1], OCV/CPU)\|2.428\|2.483\|0.98\| \|conv::Conv::(GFLOPS=0.798, K=[3 x 3], IN={1, 256, 26, 26}, OCN=256, P=[1 x 1], OCV/CPU)\|2.955\|2.983\|0.99\| \|conv::Conv::(GFLOPS=0.798, K=[3 x 3], IN={1, 512, 13, 13}, OCN=512, P=[1 x 1], OCV/CPU)\|4.328\|4.484\|0.97\| \|conv::Conv::(GFLOPS=0.830, K=[3 x 3], IN={1, 64, 75, 100}, OCN=96, PM=SAME, BIAS, OCV/CPU)\|2.712\|2.778\|0.98\| \|conv::Conv::(GFLOPS=0.958, K=[3 x 3], IN={1, 
192, 38, 38}, OCN=192, PM=SAME, OCV/CPU)\|3.205\|3.331\|0.96\| \|conv::Conv::(GFLOPS=0.958, K=[3 x 3], IN={1, 384, 19, 19}, OCN=384, PM=SAME, OCV/CPU)\|4.193\|4.412\|0.95\| \|conv::Conv::(GFLOPS=1.022, K=[3 x 3], IN={1, 576, 19, 19}, OCN=273, PM=SAME, BIAS, OCV/CPU)\|5.026\|4.565\|1.10\| \|conv::Conv::(GFLOPS=1.112, K=[3 x 3], IN={1, 512, 10, 10}, OCN=1206, P=[1 x 1], BIAS, OCV/CPU)\|14.490\|14.213\|1.02\| \|conv::Conv::(GFLOPS=1.181, K=[3 x 3], IN={1, 64, 160, 200}, OCN=128, S=[2 x 2], P=[1 x 1], BIAS, OCV/CPU)\|14.886\|14.003\|1.06\| \|conv::Conv::(GFLOPS=1.182, K=[3 x 3], IN={1, 32, 320, 400}, OCN=64, S=[2 x 2], P=[1 x 1], BIAS, OCV/CPU)\|15.923\|15.184\|1.05\| \|conv::Conv::(GFLOPS=1.195, K=[9 x 9], IN={1, 32, 240, 320}, OCN=3, P=[4 x 4], BIAS, OCV/CPU)\|45.136\|41.696\|1.08\| \|conv::Conv::(GFLOPS=1.196, K=[3 x 3], IN={1, 384, 26, 26}, OCN=256, P=[1 x 1], OCV/CPU)\|4.995\|4.631\|1.08\| \|conv::Conv::(GFLOPS=1.210, K=[3 x 3], IN={1, 32, 256, 256}, OCN=32, PM=SAME, OCV/CPU)\|6.402\|6.261\|1.02\| \|conv::Conv::(GFLOPS=1.245, K=[3 x 3], IN={1, 64, 75, 75}, OCN=192, PM=SAME, BIAS, OCV/CPU)\|4.478\|3.965\|1.13\| \|conv::Conv::(GFLOPS=1.245, K=[3 x 3], IN={1, 96, 75, 100}, OCN=96, PM=SAME, BIAS, OCV/CPU)\|3.908\|3.978\|0.98\| \|conv::Conv::(GFLOPS=1.248, K=[3 x 3], IN={1, 256, 46, 46}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|4.176\|4.206\|0.99\| \|conv::Conv::(GFLOPS=1.258, K=[3 x 3], IN={1, 1280, 10, 10}, OCN=546, PM=SAME, BIAS, OCV/CPU)\|21.509\|21.136\|1.02\| \|conv::Conv::(GFLOPS=1.261, K=[3 x 3], IN={1, 192, 38, 50}, OCN=192, PM=SAME, BIAS, OCV/CPU)\|4.426\|4.082\|1.08\| \|conv::Conv::(GFLOPS=1.416, K=[3 x 3], IN={1, 128, 62, 82}, OCN=128, BIAS, OCV/CPU)\|4.098\|4.289\|0.96\| \|conv::Conv::(GFLOPS=1.500, K=[3 x 3], IN={1, 128, 64, 84}, OCN=128, BIAS, OCV/CPU)\|4.646\|5.105\|0.91\| \|conv::Conv::(GFLOPS=1.586, K=[3 x 3], IN={1, 128, 66, 86}, OCN=128, BIAS, OCV/CPU)\|4.746\|4.724\|1.00\| \|conv::Conv::(GFLOPS=1.595, K=[3 x 3], IN={1, 256, 26, 26}, OCN=512, P=[1 x 1], 
OCV/CPU)\|5.614\|5.779\|0.97\| \|conv::Conv::(GFLOPS=1.595, K=[3 x 3], IN={1, 256, 52, 52}, OCN=512, S=[2 x 2], P=[1 x 1], OCV/CPU)\|21.909\|20.718\|1.06\| \|conv::Conv::(GFLOPS=1.595, K=[3 x 3], IN={1, 512, 13, 13}, OCN=1024, P=[1 x 1], OCV/CPU)\|8.256\|8.290\|1.00\| \|conv::Conv::(GFLOPS=1.595, K=[3 x 3], IN={1, 512, 26, 26}, OCN=1024, S=[2 x 2], P=[1 x 1], OCV/CPU)\|25.196\|23.267\|1.08\| \|conv::Conv::(GFLOPS=1.596, K=[3 x 3], IN={1, 64, 104, 104}, OCN=128, P=[1 x 1], OCV/CPU)\|5.721\|5.172\|1.11\| \|conv::Conv::(GFLOPS=1.596, K=[3 x 3], IN={1, 64, 208, 208}, OCN=128, S=[2 x 2], P=[1 x 1], OCV/CPU)\|20.066\|18.322\|1.10\| \|conv::Conv::(GFLOPS=1.596, K=[3 x 3], IN={1, 128, 52, 52}, OCN=256, P=[1 x 1], OCV/CPU)\|4.448\|4.542\|0.98\| \|conv::Conv::(GFLOPS=1.596, K=[3 x 3], IN={1, 128, 104, 104}, OCN=256, S=[2 x 2], P=[1 x 1], OCV/CPU)\|19.193\|19.013\|1.01\| \|conv::Conv::(GFLOPS=1.598, K=[3 x 3], IN={1, 32, 208, 208}, OCN=64, P=[1 x 1], OCV/CPU)\|6.009\|5.964\|1.01\| \|conv::Conv::(GFLOPS=1.598, K=[3 x 3], IN={1, 32, 416, 416}, OCN=64, S=[2 x 2], P=[1 x 1], OCV/CPU)\|20.169\|20.009\|1.01\| \|conv::Conv::(GFLOPS=1.659, K=[3 x 3], IN={1, 960, 10, 10}, OCN=960, PM=SAME, OCV/CPU)\|22.584\|23.423\|0.96\| \|conv::Conv::(GFLOPS=1.660, K=[3 x 3], IN={1, 128, 75, 75}, OCN=128, G=128, P=[1 x 1], BIAS, OCV/CPU)\|0.372\|0.504\|0.74\| \|conv::Conv::(GFLOPS=1.660, K=[3 x 3], IN={1, 128, 75, 75}, OCN=128, PM=SAME, OCV/CPU)\|5.426\|5.456\|0.99\| \|conv::Conv::(GFLOPS=1.675, K=[3 x 3], IN={1, 128, 68, 88}, OCN=128, BIAS, OCV/CPU)\|4.945\|5.221\|0.95\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 256, 38, 38}, OCN=256, G=256, P=[1 x 1], BIAS, OCV/CPU)\|0.210\|0.261\|0.81\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 256, 38, 38}, OCN=256, PM=SAME, OCV/CPU)\|5.720\|5.997\|0.95\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 512, 19, 19}, OCN=512, G=512, P=[1 x 1], BIAS, OCV/CPU)\|0.149\|0.161\|0.93\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 512, 19, 19}, OCN=512, P=[1 
x 1], BIAS, OCV/CPU)\|7.154\|7.225\|0.99\| \|conv::Conv::(GFLOPS=1.704, K=[3 x 3], IN={1, 512, 19, 19}, OCN=512, PM=SAME, OCV/CPU)\|7.184\|7.223\|0.99\| \|conv::Conv::(GFLOPS=1.766, K=[3 x 3], IN={1, 128, 70, 90}, OCN=128, BIAS, OCV/CPU)\|5.324\|5.343\|1.00\| \|conv::Conv::(GFLOPS=1.859, K=[3 x 3], IN={1, 128, 72, 92}, OCN=128, BIAS, OCV/CPU)\|5.114\|5.238\|0.98\| \|conv::Conv::(GFLOPS=1.888, K=[3 x 3], IN={1, 1024, 10, 10}, OCN=1024, G=1024, P=[1 x 1], BIAS, OCV/CPU)\|0.111\|0.121\|0.92\| \|conv::Conv::(GFLOPS=1.888, K=[3 x 3], IN={1, 1024, 10, 10}, OCN=1024, PM=SAME, OCV/CPU)\|25.907\|26.804\|0.97\| \|conv::Conv::(GFLOPS=1.954, K=[3 x 3], IN={1, 128, 74, 94}, OCN=128, BIAS, OCV/CPU)\|5.695\|5.654\|1.01\| \|conv::Conv::(GFLOPS=1.995, K=[9 x 9], IN={1, 3, 320, 400}, OCN=32, P=[4 x 4], BIAS, OCV/CPU)\|27.435\|27.566\|1.00\| \|conv::Conv::(GFLOPS=2.052, K=[3 x 3], IN={1, 128, 76, 96}, OCN=128, BIAS, OCV/CPU)\|6.944\|6.164\|1.13\| \|conv::Conv::(GFLOPS=2.100, K=[3 x 3], IN={1, 144, 75, 75}, OCN=144, PM=SAME, OCV/CPU)\|7.180\|6.717\|1.07\| \|conv::Conv::(GFLOPS=2.153, K=[3 x 3], IN={1, 128, 78, 98}, OCN=128, BIAS, OCV/CPU)\|6.817\|6.050\|1.13\| \|conv::Conv::(GFLOPS=2.156, K=[3 x 3], IN={1, 576, 19, 19}, OCN=576, PM=SAME, OCV/CPU)\|9.225\|8.660\|1.07\| \|conv::Conv::(GFLOPS=2.255, K=[3 x 3], IN={1, 128, 80, 100}, OCN=128, BIAS, OCV/CPU)\|7.496\|6.625\|1.13\| \|conv::Conv::(GFLOPS=2.719, K=[3 x 3], IN={1, 96, 256, 256}, OCN=96, S=[2 x 2], PM=SAME, OCV/CPU)\|35.520\|36.056\|0.99\| \|conv::Conv::(GFLOPS=3.319, K=[3 x 3], IN={1, 128, 75, 75}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|9.990\|9.702\|1.03\| \|conv::Conv::(GFLOPS=3.321, K=[3 x 3], IN={1, 64, 150, 150}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|10.517\|10.746\|0.98\| \|conv::Conv::(GFLOPS=3.398, K=[7 x 7], IN={1, 128, 46, 46}, OCN=128, P=[3 x 3], BIAS, OCV/CPU)\|36.702\|36.731\|1.00\| \|conv::Conv::(GFLOPS=3.407, K=[3 x 3], IN={1, 512, 19, 19}, OCN=1024, D=[6 x 6], P=[6 x 6], BIAS, OCV/CPU)\|41.035\|38.280\|1.07\| 
\|conv::Conv::(GFLOPS=3.408, K=[3 x 3], IN={1, 256, 38, 38}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|10.981\|10.573\|1.04\| \|conv::Conv::(GFLOPS=4.247, K=[3 x 3], IN={1, 480, 32, 32}, OCN=480, PM=SAME, OCV/CPU)\|12.863\|12.384\|1.04\| \|conv::Conv::(GFLOPS=4.247, K=[5 x 5], IN={1, 144, 128, 128}, OCN=144, S=[2 x 2], PM=SAME, OCV/CPU)\|50.437\|54.088\|0.93\| \|conv::Conv::(GFLOPS=4.566, K=[7 x 7], IN={1, 172, 46, 46}, OCN=128, P=[3 x 3], BIAS, OCV/CPU)\|50.650\|50.635\|1.00\| \|conv::Conv::(GFLOPS=4.993, K=[3 x 3], IN={1, 256, 46, 46}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|14.696\|14.606\|1.01\| \|conv::Conv::(GFLOPS=4.993, K=[3 x 3], IN={1, 512, 46, 46}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|16.201\|15.426\|1.05\| \|conv::Conv::(GFLOPS=4.994, K=[3 x 3], IN={1, 128, 92, 92}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|16.061\|14.292\|1.12\| \|conv::Conv::(GFLOPS=4.997, K=[3 x 3], IN={1, 64, 184, 184}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|17.743\|18.250\|0.97\| \|conv::Conv::(GFLOPS=5.780, K=[5 x 5], IN={1, 672, 32, 32}, OCN=672, S=[2 x 2], PM=SAME, OCV/CPU)\|77.909\|78.165\|1.00\| \|conv::Conv::(GFLOPS=6.116, K=[3 x 3], IN={1, 1152, 16, 16}, OCN=1152, PM=SAME, OCV/CPU)\|21.579\|21.879\|0.99\| \|conv::Conv::(GFLOPS=6.118, K=[3 x 3], IN={1, 144, 128, 128}, OCN=144, PM=SAME, OCV/CPU)\|20.424\|19.589\|1.04\| \|conv::Conv::(GFLOPS=6.637, K=[3 x 3], IN={1, 256, 75, 75}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|19.389\|19.461\|1.00\| \|conv::Conv::(GFLOPS=6.638, K=[3 x 3], IN={1, 128, 150, 150}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|21.319\|20.358\|1.05\| \|conv::Conv::(GFLOPS=6.641, K=[3 x 3], IN={1, 64, 150, 200}, OCN=192, PM=SAME, BIAS, OCV/CPU)\|22.609\|21.826\|1.04\| \|conv::Conv::(GFLOPS=6.641, K=[3 x 3], IN={1, 64, 300, 300}, OCN=64, P=[1 x 1], BIAS, OCV/CPU)\|25.497\|25.789\|0.99\| \|conv::Conv::(GFLOPS=6.814, K=[3 x 3], IN={1, 512, 38, 38}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|21.966\|22.108\|0.99\| \|conv::Conv::(GFLOPS=8.025, K=[3 x 3], IN={1, 1024, 19, 19}, OCN=1206, P=[1 x 1], BIAS, 
OCV/CPU)\|35.883\|33.470\|1.07\| \|conv::Conv::(GFLOPS=9.986, K=[3 x 3], IN={1, 512, 46, 46}, OCN=512, P=[1 x 1], BIAS, OCV/CPU)\|31.041\|29.314\|1.06\| \|conv::Conv::(GFLOPS=9.987, K=[3 x 3], IN={1, 256, 92, 92}, OCN=256, P=[1 x 1], BIAS, OCV/CPU)\|29.922\|28.145\|1.06\| \|conv::Conv::(GFLOPS=9.989, K=[3 x 3], IN={1, 128, 184, 184}, OCN=128, P=[1 x 1], BIAS, OCV/CPU)\|31.624\|31.148\|1.02\| \|conv::Conv::(GFLOPS=9.993, K=[3 x 3], IN={1, 64, 368, 368}, OCN=64, P=[1 x 1], BIAS, OCV/CPU)\|38.564\|39.164\|0.98\| \|conv::Conv::(GFLOPS=10.087, K=[3 x 3], IN={1, 576, 38, 50}, OCN=512, PM=SAME, BIAS, OCV/CPU)\|31.502\|30.269\|1.04\| \|conv::Conv::(GFLOPS=10.701, K=[3 x 3], IN={1, 512, 38, 38}, OCN=804, P=[1 x 1], BIAS, OCV/CPU)\|34.248\|34.589\|0.99\| \|conv::Conv::(GFLOPS=11.797, K=[5 x 5], IN={1, 240, 64, 64}, OCN=240, PM=SAME, OCV/CPU)\|130.211\|134.120\|0.97\| \|conv::Conv::(GFLOPS=11.797, K=[5 x 5], IN={1, 480, 32, 32}, OCN=480, PM=SAME, OCV/CPU)\|127.490\|132.874\|0.96\| \|conv::Conv::(GFLOPS=16.987, K=[5 x 5], IN={1, 1152, 16, 16}, OCN=1152, PM=SAME, OCV/CPU)\|199.834\|200.081\|1.00\| \|conv::Conv::(GFLOPS=23.122, K=[5 x 5], IN={1, 672, 32, 32}, OCN=672, PM=SAME, OCV/CPU)\|247.346\|247.523\|1.00\| ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. 
- [ ] The feature is well documented and sample code can be built with the project CMake ``` force_builders=Linux AVX2,Custom Win build_image:Custom Win=msvs2019 CPU_BASELINE:Custom Win=AVX512_SKX ```	2023-03-10 11:59:49 +03:00
Alexander Alekhin	9eb5e39ff3	dnn(tflite): fix wrong axis normalization	2023-02-21 21:20:37 +00:00
Alexander Alekhin	bdff0949bb	dnn(tflite): add 3rdparty flatbuffers with pre-generated schema	2023-02-21 16:06:19 +00:00
Zihao Mu	20dac7ea48	Merge pull request #23255 from zihaomu:fused_cuda_naryeltwise DNN: fuse conv+naryEletwise on CUDA backend.	2023-02-17 10:18:13 +00:00
Alexander Alekhin	58d8a2702a	Merge pull request #23243 from WanliZhong:accelerate_palm_det	2023-02-14 16:25:02 +00:00
Dmitry Kurtaev	76350cd30f	Merge pull request #23161 from dkurt:dnn_tflite TFLite models importer * initial commit * Refactor TFLiteImporter * Better FlatBuffers detection * Add permute before 4D->3D reshape * Track layers layout * TFLite Convolution2DTransposeBias layer * Skip TFLite tests without FlatBuffers * Fix check of FlatBuffers in tests. Add readNetFromTFLite from buffer * TFLite Max Unpooling test * Add skip for TFLite unpooling test * Revert DW convolution workaround * Fix ObjC bindings * Better errors handling * Regenerate TFLite schema using flatc * dnn(tflite): more checks, better logging * Checks for unimplemented fusion. Fix tests	2023-02-13 14:00:20 +00:00
Yuantao Feng	c2b7c1f13b	Merge pull request #23219 from fengyuentau:add_gelu Add GELU layer for vision transformers * add gelu and gelu approximation * drop setKernelParams	2023-02-10 18:03:29 +00:00
wanli	c8f5e228fc	release MUL and ADD operator on CUDA	2023-02-10 19:33:59 +08:00
Alexander Alekhin	96a45e842e	Merge pull request #23061 from WanliZhong:gemm_cuda DNN: make GEMM can be supported with transA and transB in CUDA	2023-02-09 00:06:32 +03:00
wanli	4718a4bf81	make GEMM can be supported with transA and transB in CUDA	2023-01-31 15:14:17 +08:00
Alexander Alekhin	f33598f55e	Merge branch 4.x	2023-01-28 17:31:32 +00:00
Alexander Alekhin	cd44aa0bb1	Merge pull request #23162 from zihaomu:issue_23151	2023-01-28 13:00:43 +00:00
zihaomu	f45a12439a	fix depth wise issue.	2023-01-28 11:41:00 +08:00
Yuantao Feng	4d918ba40b	Merge pull request #23047 from fengyuentau:layer_norm dnn: add layer normalization for vision transformers * add layer norm onnx parser, impl and tests * add onnx graph simplifier for layer norm expanded * handle the case when constants are of type Initializer * add test case for layer norm expanded with initializers * use CV_Assert & CV_CheckType in place of CV_Assert_N; use forward_fallback for OCL_FP16 * use const ref / ref in parameters of invoker::run; extract inner const if from nested loop; use size_t in place of ull * template hasBias * remove trailing whitespace * use pointer parameter with null check; move normSize division & mean_square division outside of loop; use std::max to ensure positive value before std::sqrt * refactor implementation, optimize parallel_for * disable layer norm expanded * remove the removal of layer norm optional outputs	2023-01-27 16:35:59 +03:00
Alexander Alekhin	8ffc06ff72	Merge pull request #23173 from tomoaki0705:fix_warning_master	2023-01-23 15:33:16 +00:00
Tomoaki Teshima	186c18668c	suppress warning	2023-01-23 22:47:43 +09:00
Alexander Alekhin	18cbfa4a4f	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2023-01-23 00:11:12 +00:00
Alexander Alekhin	a42d879925	Merge branch 4.x	2023-01-18 22:03:42 +00:00
Alexander Alekhin	3d5e3a910f	Merge pull request #23096 from zihaomu:issue_23074	2023-01-12 00:51:04 +00:00
zihaomu	840b1d5c94	add depthwise add fuse	2023-01-11 08:42:51 +08:00
Alexander Alekhin	593a376566	Merge branch 4.x	2023-01-09 11:08:02 +00:00
zihaomu	82616eec41	fix possible segmentation fault error in winograd on x86	2023-01-09 13:40:04 +08:00
Alexander Alekhin	9627ab9462	Merge pull request #23050 from zihaomu:fix_memory	2022-12-28 10:04:25 +00:00
zihaomu	71765858dc	fix invalid memory access	2022-12-28 17:16:11 +08:00
Alexander Alekhin	9a2a34f94e	dnn(openvino): remove undefined status	2022-12-28 06:55:00 +00:00
Alexander Alekhin	fc27a343e9	Merge pull request #22905 from zihaomu:clean_up_conv3d_1d	2022-12-26 17:39:18 +00:00
Alexander Alekhin	b42c11de82	pre: OpenCV 4.7.0 (version++)	2022-12-25 17:00:22 +00:00
Alexander Alekhin	a494c75bfe	pre: OpenCV 3.4.19 (version++)	2022-12-25 16:59:47 +00:00
Dmitry Kurtaev	8681686d8f	Merge pull request #22957 from dkurt:new_openvino_api Switch to new OpenVINO API after 2022.1 release * Pass Layer_Test_Convolution_DLDT.Accuracy/0 test * Pass test Test_Caffe_layers.Softmax * Failed 136 tests * Fix Concat. Failed 120 tests * Custom nGraph ops. 19 failed tests * Set and get properties from Core * Read model from buffer * Change MaxPooling layer output names. Restore reshape * Cosmetic changes * Cosmetic changes * Override getOutputsInfo * Fixes for OpenVINO < 2022.1 * Async inference for 2021.4 and less * Compile model with config * Fix serialize for 2022.1 * Asynchronous inference with 2022.1 * Handle 1d outputs * Work with model with dynamic output shape * Fixes with 1d output for old API * Control outputs by nGraph function for all OpenVINO versions * Refer inputs in PrePostProcessor by indices * Fix cycled dependency between InfEngineNgraphNode and InfEngineNgraphNet. Add InferRequest callback only for async inference. Do not capture InferRequest object. * Fix tests thresholds * Fix HETERO:GPU,CPU plugin issues with unsupported layer	2022-12-23 16:58:41 +00:00
Alexander Smorkalov	9012e6dd9b	Merge pull request #22965 from vrabaud:numpy_fix Remove references to deprecated NumPy type aliases.	2022-12-23 15:34:02 +03:00
Alexander Smorkalov	4930516652	Merge pull request #22898 from fengyuentau:slice_neg_steps dnn: support ONNX Slice with negative steps by adding and using cv::flipND	2022-12-23 14:15:06 +03:00
Vincent Rabaud	ad568edd7f	Remove references to deprecated NumPy type aliases. This change replaces references to a number of deprecated NumPy type aliases (np.bool, np.int, np.float, np.complex, np.object, np.str) with their recommended replacement (bool, int, float, complex, object, str). Those types were deprecated in 1.20 and are removed in 1.24, cf https://github.com/numpy/numpy/pull/22607.	2022-12-23 13:53:49 +03:00
Alexander Alekhin	1f41d06f9a	Merge pull request #23008 from mshabunin:fix-yolov4-tiny-hash	2022-12-23 10:14:25 +00:00
zihaomu	71c6339af0	remove old convolution branch, and optimize conv3d and conv1d.	2022-12-23 16:50:28 +08:00
fengyuentau	34a0897f90	add cv::flipND; support onnx slice with negative steps via cv::flipND	2022-12-23 16:39:53 +08:00
Maksim Shabunin	d35fbe6bfc	dnn: updated YOLOv4-tiny model and tests	2022-12-22 15:49:21 +03:00
Alexander Alekhin	6b4f3e5fab	Merge pull request #22993 from alalek:fixup_21738	2022-12-21 19:50:51 +00:00
Yuantao Feng	a2b3acfc6e	dnn: add the CANN backend (#22634) * cann backend impl v1 * cann backend impl v2: use opencv parsers to build models for cann * adjust fc according to the new transA and transB * put cann net in cann backend node and reuse forwardLayer * use fork() to create a child process and compile cann model * remove legacy code * remove debug code * fall back to CPU backend if there is one layer not supported by CANN backend * fix netInput forward	2022-12-21 09:04:41 +03:00
Alexander Alekhin	cdbb893b27	dnn: disable OpenCL code path in MatMul processing - this mode is not supported by 22828	2022-12-20 09:46:48 +00:00
Alexander Alekhin	1102b7eff8	dnn: fix gather layer implementation - support FP16 data	2022-12-20 06:09:34 +00:00
zoom	4891818114	make MatMul support 3D or 4D with broadcast	2022-12-15 10:36:08 +08:00
Alexander Alekhin	8ba44e7d55	Merge pull request #22882 from zihaomu:gemm_first_const	2022-12-08 14:18:33 +00:00
Zihao Mu	0a650b573b	Merge pull request #22840 from zihaomu:optimze_conv_memory_usage DNN: reduce the memory used in convolution layer * reduce the memory in winograd and disable the test when memory usage is larger than 2 GB. * remove VERY_LOG tag	2022-12-08 12:57:13 +00:00
Alexander Alekhin	b16f76eede	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2022-12-03 12:39:41 +00:00
Alexander Alekhin	d16b3b2487	dnn(test): restore openvino tests with 'Cannot get memory' message	2022-12-03 01:34:48 +00:00
Alexander Alekhin	74d0b4cc78	dnn(openvino): fix custom layers BlockingDesc	2022-12-03 01:34:10 +00:00
Alexander Smorkalov	e14ca39fd7	Merge pull request #22857 from fengyuentau:batched_nms dnn: add batched nms	2022-11-30 12:37:49 +03:00
Alexander Smorkalov	421ba8730a	Merge pull request #22809 from fengyuentau:tile dnn: support ONNX Tile	2022-11-29 14:42:28 +03:00
zihaomu	0d56524b72	gemm support transA and transB, and first input is constant.	2022-11-29 17:13:36 +08:00
fengyuentau	9fded9ca53	batched nms impl	2022-11-29 15:32:34 +08:00
fengyuentau	441624a5fb	tile impl	2022-11-29 11:15:38 +08:00
zoom	5044af69d1	let MatMul can work when both two inputs are const	2022-11-27 17:32:41 +08:00
Alexander Smorkalov	6ca205a029	Merge pull request #22478 from WanliZhong:nary_eltwise_cuda DNN: Let part of the operators in nary_eltwise support CUDA	2022-11-22 16:15:50 +03:00
zihaomu	5bf64e7dfe	fix the infinite loop in tf importer of 3.4 branch	2022-11-15 11:42:10 +08:00
zoom	ef2677b0a6	Make MatMul layer support 3d or 4d operation with const input	2022-11-10 11:41:44 +08:00
zoom	11d492b0b9	Let part of the operators in nary_eltwise support cuda	2022-11-02 14:08:21 +08:00
Zihao Mu	17f2b56291	remove never used code in onnximporter	2022-11-02 10:45:16 +08:00
Alexander Alekhin	ee9137f176	Merge pull request #22725 from zihaomu:fix_infinit_loop_in_tf	2022-10-31 17:03:03 +00:00
Zihao Mu	903bf0147e	Merge pull request #22666 from zihaomu:support_onnx_qdq_model DNN: let Quant and Dequant of ONNX_importer support the Constant input. * let Quant and Dequant support the Constant input. * fix negative value of axis.	2022-10-31 16:06:31 +00:00
Zihao Mu	18fbb72f7d	fix the infinite loop in tf importer.	2022-10-31 20:10:25 +08:00
Alexander Smorkalov	22f8fb4d5c	Do not fail tests if Yolo v7 model was not found.	2022-10-24 17:59:18 +03:00
Alexander Smorkalov	23edec83fb	Merge pull request #22667 from zihaomu:bug_fix_in_winograd DNN: bug fixed in Winograd	2022-10-21 17:54:13 +03:00
Alexander Smorkalov	e4cd430710	Merge pull request #22653 from WanliZhong:issue22597 DNN-TF: let StridedSlice layer support const input	2022-10-21 17:51:00 +03:00
Dmitry Kurtaev	35b2cff295	Merge pull request #22656 from dkurt:halide_fixes * Fixes for Halide * Enable some Halide tests	2022-10-21 17:49:49 +03:00
Zihao Mu	cee8c86b6e	fixed bug at winograd of SIMD128 and more robust code.	2022-10-21 19:14:54 +08:00
Alexander Smorkalov	5d292826b2	Merge pull request #22593 from zihaomu:optimize_wino optimize winograd further	2022-10-19 13:08:32 +03:00
Alexander Smorkalov	f378f02954	Merge pull request #22652 from rogday:cuda_test_fixes Address CUDA-related errors	2022-10-19 09:37:12 +03:00

2121 Commits