Commit Graph

1619 Commits

Author SHA1 Message Date
Alexander Alekhin
96a45e842e
Merge pull request #23061 from WanliZhong:gemm_cuda
DNN: support GEMM with transA and transB in CUDA
2023-02-09 00:06:32 +03:00
wanli
4718a4bf81 make GEMM support transA and transB in CUDA 2023-01-31 15:14:17 +08:00
Alexander Alekhin
cd44aa0bb1 Merge pull request #23162 from zihaomu:issue_23151 2023-01-28 13:00:43 +00:00
zihaomu
f45a12439a fix depth wise issue. 2023-01-28 11:41:00 +08:00
Yuantao Feng
4d918ba40b
Merge pull request #23047 from fengyuentau:layer_norm
dnn: add layer normalization for vision transformers

* add layer norm onnx parser, impl and tests

* add onnx graph simplifier for layer norm expanded

* handle the case when constants are of type Initializer

* add test case for layer norm expanded with initializers

* use CV_Assert & CV_CheckType in place of CV_Assert_N; use forward_fallback for OCL_FP16

* use const ref / ref in parameters of invoker::run; extract inner const if from nested loop; use size_t in place of ull

* template hasBias

* remove trailing whitespace

* use pointer parameter with null check; move normSize division & mean_square division outside of loop; use std::max to ensure positive value before std::sqrt

* refactor implementation, optimize parallel_for

* disable layer norm expanded

* remove the removal of layer norm optional outputs
2023-01-27 16:35:59 +03:00
Tomoaki Teshima
186c18668c suppress warning 2023-01-23 22:47:43 +09:00
Alexander Alekhin
3d5e3a910f Merge pull request #23096 from zihaomu:issue_23074 2023-01-12 00:51:04 +00:00
zihaomu
840b1d5c94 add depthwise add fuse 2023-01-11 08:42:51 +08:00
zihaomu
82616eec41 fix possible segmentation fault error in winograd on x86 2023-01-09 13:40:04 +08:00
Alexander Alekhin
9627ab9462 Merge pull request #23050 from zihaomu:fix_memory 2022-12-28 10:04:25 +00:00
zihaomu
71765858dc fix invalid memory access 2022-12-28 17:16:11 +08:00
Alexander Alekhin
9a2a34f94e dnn(openvino): remove undefined status 2022-12-28 06:55:00 +00:00
Alexander Alekhin
fc27a343e9 Merge pull request #22905 from zihaomu:clean_up_conv3d_1d 2022-12-26 17:39:18 +00:00
Dmitry Kurtaev
8681686d8f
Merge pull request #22957 from dkurt:new_openvino_api
Switch to new OpenVINO API after 2022.1 release

* Pass Layer_Test_Convolution_DLDT.Accuracy/0 test

* Pass test Test_Caffe_layers.Softmax

* Failed 136 tests

* Fix Concat. Failed 120 tests

* Custom nGraph ops. 19 failed tests

* Set and get properties from Core

* Read model from buffer

* Change MaxPooling layer output names. Restore reshape

* Cosmetic changes

* Cosmetic changes

* Override getOutputsInfo

* Fixes for OpenVINO < 2022.1

* Async inference for 2021.4 and less

* Compile model with config

* Fix serialize for 2022.1

* Asynchronous inference with 2022.1

* Handle 1d outputs

* Work with model with dynamic output shape

* Fixes with 1d output for old API

* Control outputs by nGraph function for all OpenVINO versions

* Refer inputs in PrePostProcessor by indices

* Fix cycled dependency between InfEngineNgraphNode and InfEngineNgraphNet.
Add InferRequest callback only for async inference. Do not capture InferRequest object.

* Fix tests thresholds

* Fix HETERO:GPU,CPU plugin issues with unsupported layer
2022-12-23 16:58:41 +00:00
Alexander Smorkalov
4930516652
Merge pull request #22898 from fengyuentau:slice_neg_steps
dnn: support ONNX Slice with negative steps by adding and using cv::flipND
2022-12-23 14:15:06 +03:00
zihaomu
71c6339af0 remove old convolution branch, and optimize conv3d and conv1d. 2022-12-23 16:50:28 +08:00
fengyuentau
34a0897f90 add cv::flipND; support onnx slice with negative steps via cv::flipND 2022-12-23 16:39:53 +08:00
Alexander Alekhin
6b4f3e5fab Merge pull request #22993 from alalek:fixup_21738 2022-12-21 19:50:51 +00:00
Yuantao Feng
a2b3acfc6e
dnn: add the CANN backend (#22634)
* cann backend impl v1

* cann backend impl v2: use opencv parsers to build models for cann

* adjust fc according to the new transA and transB

* put cann net in cann backend node and reuse forwardLayer

* use fork() to create a child process and compile cann model

* remove legacy code

* remove debug code

* fall back to CPU backend if there is one layer not supported by CANN backend

* fix netInput forward
2022-12-21 09:04:41 +03:00
Alexander Alekhin
cdbb893b27 dnn: disable OpenCL code path in MatMul processing
- this mode is not supported by 22828
2022-12-20 09:46:48 +00:00
Alexander Alekhin
1102b7eff8 dnn: fix gather layer implementation
- support FP16 data
2022-12-20 06:09:34 +00:00
zoom
4891818114 make MatMul support 3D or 4D with broadcast 2022-12-15 10:36:08 +08:00
Alexander Alekhin
8ba44e7d55 Merge pull request #22882 from zihaomu:gemm_first_const 2022-12-08 14:18:33 +00:00
Zihao Mu
0a650b573b
Merge pull request #22840 from zihaomu:optimze_conv_memory_usage
DNN: reduce the memory used in convolution layer

* reduce the memory in winograd and disable the test when memory usage is larger than 2 GB.

* remove VERY_LOG tag
2022-12-08 12:57:13 +00:00
Alexander Alekhin
b16f76eede Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2022-12-03 12:39:41 +00:00
Alexander Alekhin
74d0b4cc78 dnn(openvino): fix custom layers BlockingDesc 2022-12-03 01:34:10 +00:00
Alexander Smorkalov
e14ca39fd7
Merge pull request #22857 from fengyuentau:batched_nms
dnn: add batched nms
2022-11-30 12:37:49 +03:00
Alexander Smorkalov
421ba8730a
Merge pull request #22809 from fengyuentau:tile
dnn: support ONNX Tile
2022-11-29 14:42:28 +03:00
zihaomu
0d56524b72 gemm support transA and transB, and first input is constant. 2022-11-29 17:13:36 +08:00
fengyuentau
9fded9ca53 batched nms impl 2022-11-29 15:32:34 +08:00
fengyuentau
441624a5fb tile impl 2022-11-29 11:15:38 +08:00
zoom
5044af69d1 let MatMul work when both inputs are const 2022-11-27 17:32:41 +08:00
Alexander Smorkalov
6ca205a029
Merge pull request #22478 from WanliZhong:nary_eltwise_cuda
DNN: Let part of the operators in nary_eltwise support CUDA
2022-11-22 16:15:50 +03:00
zihaomu
5bf64e7dfe fix the infinite loop in tf importer of 3.4 branch 2022-11-15 11:42:10 +08:00
zoom
ef2677b0a6 Make MatMul layer support 3d or 4d operation with const input 2022-11-10 11:41:44 +08:00
zoom
11d492b0b9 Let part of the operators in nary_eltwise support cuda 2022-11-02 14:08:21 +08:00
Zihao Mu
17f2b56291 remove never used code in onnximporter 2022-11-02 10:45:16 +08:00
Alexander Alekhin
ee9137f176 Merge pull request #22725 from zihaomu:fix_infinit_loop_in_tf 2022-10-31 17:03:03 +00:00
Zihao Mu
903bf0147e
Merge pull request #22666 from zihaomu:support_onnx_qdq_model
DNN: let Quant and Dequant of ONNX_importer support the Constant input.

* let Quant and Dequant support the Constant input.

* fix negative value of axis.
2022-10-31 16:06:31 +00:00
Zihao Mu
18fbb72f7d fix the infinite loop in tf importer. 2022-10-31 20:10:25 +08:00
Alexander Smorkalov
23edec83fb
Merge pull request #22667 from zihaomu:bug_fix_in_winograd
DNN: bug fixed in Winograd
2022-10-21 17:54:13 +03:00
Alexander Smorkalov
e4cd430710
Merge pull request #22653 from WanliZhong:issue22597
DNN-TF: let StridedSlice layer support const input
2022-10-21 17:51:00 +03:00
Dmitry Kurtaev
35b2cff295
Merge pull request #22656 from dkurt:halide_fixes
* Fixes for Halide
* Enable some Halide tests
2022-10-21 17:49:49 +03:00
Zihao Mu
cee8c86b6e fixed bug at winograd of SIMD128 and more robust code. 2022-10-21 19:14:54 +08:00
Alexander Smorkalov
5d292826b2
Merge pull request #22593 from zihaomu:optimize_wino
optimize winograd further
2022-10-19 13:08:32 +03:00
Alexander Smorkalov
f378f02954
Merge pull request #22652 from rogday:cuda_test_fixes
Address CUDA-related errors
2022-10-19 09:37:12 +03:00
Smirnov Egor
dd14cf6a9c address CUDA-related errors and enable cuda in elementwise ops 2022-10-18 16:54:42 +03:00
Alexander Smorkalov
ec7fc5adca
Merge pull request #22529 from fengyuentau:scatter_scatternd
DNN: supports Scatter and ScatterND from ONNX
2022-10-17 14:57:46 +03:00
Alexander Smorkalov
02143cd0e2
Merge pull request #22531 from zihaomu:stop_rely_name
Parsing quantized nodes does not rely on names
2022-10-17 11:20:24 +03:00
Alexander Smorkalov
1c5dcbcac8
Merge pull request #22639 from WanliZhong:issue#22625
DNN: Make Unsqueeze layer support negative axes
2022-10-17 09:27:49 +03:00