Commit Graph

2067 Commits

Author SHA1 Message Date
wanli
c8f5e228fc release MUL and ADD operator on CUDA 2023-02-10 19:33:59 +08:00
Alexander Alekhin
96a45e842e
Merge pull request #23061 from WanliZhong:gemm_cuda
DNN: make GEMM can be supported with transA and transB in CUDA
2023-02-09 00:06:32 +03:00
wanli
4718a4bf81 make GEMM can be supported with transA and transB in CUDA 2023-01-31 15:14:17 +08:00
Alexander Alekhin
cd44aa0bb1 Merge pull request #23162 from zihaomu:issue_23151 2023-01-28 13:00:43 +00:00
zihaomu
f45a12439a fix depth wise issue. 2023-01-28 11:41:00 +08:00
Yuantao Feng
4d918ba40b
Merge pull request #23047 from fengyuentau:layer_norm
dnn: add layer normalization for vision transformers

* add layer norm onnx parser, impl and tests

* add onnx graph simplifier for layer norm expanded

* handle the case when constants are of type Initializer

* add test case for layer norm expanded with initializers

* use CV_Assert & CV_CheckType in place of CV_Assert_N; use forward_fallback for OCL_FP16

* use const ref / ref in parameters of invoker::run; extract inner const if from nested loop; use size_t in place of ull

* template hasBias

* remove trailing whitespace

* use pointer parameter with null check; move normSize division & mean_square division outside of loop; use std::max to ensure positive value before std::sqrt

* refactor implementation, optimize parallel_for

* disable layer norm expanded

* remove the removal of layer norm optional outputs
2023-01-27 16:35:59 +03:00
Alexander Alekhin
8ffc06ff72 Merge pull request #23173 from tomoaki0705:fix_warning_master 2023-01-23 15:33:16 +00:00
Tomoaki Teshima
186c18668c suppress warning 2023-01-23 22:47:43 +09:00
Alexander Alekhin
18cbfa4a4f Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2023-01-23 00:11:12 +00:00
Alexander Alekhin
3d5e3a910f Merge pull request #23096 from zihaomu:issue_23074 2023-01-12 00:51:04 +00:00
zihaomu
840b1d5c94 add depthwise add fuse 2023-01-11 08:42:51 +08:00
zihaomu
82616eec41 fix possible segmentation fault error in winograd on x86 2023-01-09 13:40:04 +08:00
Alexander Alekhin
9627ab9462 Merge pull request #23050 from zihaomu:fix_memory 2022-12-28 10:04:25 +00:00
zihaomu
71765858dc fix invalid memory access 2022-12-28 17:16:11 +08:00
Alexander Alekhin
9a2a34f94e dnn(openvino): remove undefined status 2022-12-28 06:55:00 +00:00
Alexander Alekhin
fc27a343e9 Merge pull request #22905 from zihaomu:clean_up_conv3d_1d 2022-12-26 17:39:18 +00:00
Alexander Alekhin
b42c11de82 pre: OpenCV 4.7.0 (version++) 2022-12-25 17:00:22 +00:00
Alexander Alekhin
a494c75bfe pre: OpenCV 3.4.19 (version++) 2022-12-25 16:59:47 +00:00
Dmitry Kurtaev
8681686d8f
Merge pull request #22957 from dkurt:new_openvino_api
Switch to new OpenVINO API after 2022.1 release

* Pass Layer_Test_Convolution_DLDT.Accuracy/0 test

* Pass test Test_Caffe_layers.Softmax

* Failed 136 tests

* Fix Concat. Failed 120 tests

* Custom nGraph ops. 19 failed tests

* Set and get properties from Core

* Read model from buffer

* Change MaxPooling layer output names. Restore reshape

* Cosmetic changes

* Cosmetic changes

* Override getOutputsInfo

* Fixes for OpenVINO < 2022.1

* Async inference for 2021.4 and less

* Compile model with config

* Fix serialize for 2022.1

* Asynchronous inference with 2022.1

* Handle 1d outputs

* Work with model with dynamic output shape

* Fixes with 1d output for old API

* Control outputs by nGraph function for all OpenVINO versions

* Refer inputs in PrePostProcessor by indices

* Fix cycled dependency between InfEngineNgraphNode and InfEngineNgraphNet.
Add InferRequest callback only for async inference. Do not capture InferRequest object.

* Fix tests thresholds

* Fix HETERO:GPU,CPU plugin issues with unsupported layer
2022-12-23 16:58:41 +00:00
Alexander Smorkalov
9012e6dd9b
Merge pull request #22965 from vrabaud:numpy_fix
Remove references to deprecated NumPy type aliases.
2022-12-23 15:34:02 +03:00
Alexander Smorkalov
4930516652
Merge pull request #22898 from fengyuentau:slice_neg_steps
dnn: support ONNX Slice with negative steps by adding and using cv::flipND
2022-12-23 14:15:06 +03:00
Vincent Rabaud
ad568edd7f Remove references to deprecated NumPy type aliases.
This change replaces references to a number of deprecated NumPy
type aliases (np.bool, np.int, np.float, np.complex, np.object,
np.str) with their recommended replacement (bool, int, float,
complex, object, str).

Those types were deprecated in 1.20 and are removed in 1.24,
cf https://github.com/numpy/numpy/pull/22607.
2022-12-23 13:53:49 +03:00
Alexander Alekhin
1f41d06f9a Merge pull request #23008 from mshabunin:fix-yolov4-tiny-hash 2022-12-23 10:14:25 +00:00
zihaomu
71c6339af0 remove old convolution branch, and optimize conv3d and conv1d. 2022-12-23 16:50:28 +08:00
fengyuentau
34a0897f90 add cv::flipND; support onnx slice with negative steps via cv::flipND 2022-12-23 16:39:53 +08:00
Maksim Shabunin
d35fbe6bfc dnn: updated YOLOv4-tiny model and tests 2022-12-22 15:49:21 +03:00
Alexander Alekhin
6b4f3e5fab Merge pull request #22993 from alalek:fixup_21738 2022-12-21 19:50:51 +00:00
Yuantao Feng
a2b3acfc6e
dnn: add the CANN backend (#22634)
* cann backend impl v1

* cann backend impl v2: use opencv parsers to build models for cann

* adjust fc according to the new transA and transB

* put cann net in cann backend node and reuse forwardLayer

* use fork() to create a child process and compile cann model

* remove legacy code

* remove debug code

* fall bcak to CPU backend if there is one layer not supoorted by CANN backend

* fix netInput forward
2022-12-21 09:04:41 +03:00
Alexander Alekhin
cdbb893b27 dnn: disable OpenCL code path in MatMul processing
- this mode is not supported by 22828
2022-12-20 09:46:48 +00:00
Alexander Alekhin
1102b7eff8 dnn: fix gather layer implementation
- support FP16 data
2022-12-20 06:09:34 +00:00
zoom
4891818114 make MatMul support 3D or 4D with broadcast 2022-12-15 10:36:08 +08:00
Alexander Alekhin
8ba44e7d55 Merge pull request #22882 from zihaomu:gemm_first_const 2022-12-08 14:18:33 +00:00
Zihao Mu
0a650b573b
Merge pull request #22840 from zihaomu:optimze_conv_memory_usage
DNN: reduce the memory used in convolution layer

* reduce the memory in winograd and disabel the test when usage memory is larger than 2gb.

* remove VERY_LOG tag
2022-12-08 12:57:13 +00:00
Alexander Alekhin
b16f76eede Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2022-12-03 12:39:41 +00:00
Alexander Alekhin
d16b3b2487 dnn(test): restore openvino tests with 'Cannot get memory' message 2022-12-03 01:34:48 +00:00
Alexander Alekhin
74d0b4cc78 dnn(openvino): fix custom layers BlockingDesc 2022-12-03 01:34:10 +00:00
Alexander Smorkalov
e14ca39fd7
Merge pull request #22857 from fengyuentau:batched_nms
dnn: add batched nms
2022-11-30 12:37:49 +03:00
Alexander Smorkalov
421ba8730a
Merge pull request #22809 from fengyuentau:tile
dnn: support ONNX Tile
2022-11-29 14:42:28 +03:00
zihaomu
0d56524b72 gemm support transA and transB, and first input is constance. 2022-11-29 17:13:36 +08:00
fengyuentau
9fded9ca53 batched nms impl 2022-11-29 15:32:34 +08:00
fengyuentau
441624a5fb tile impl 2022-11-29 11:15:38 +08:00
zoom
5044af69d1 let MatMul can work when both two inputs are const 2022-11-27 17:32:41 +08:00
Alexander Smorkalov
6ca205a029
Merge pull request #22478 from WanliZhong:nary_eltwise_cuda
DNN: Let part of the operators in nary_eltwise support CUDA
2022-11-22 16:15:50 +03:00
zihaomu
5bf64e7dfe fix the infinite loop in tf importer of 3.4 branch 2022-11-15 11:42:10 +08:00
zoom
ef2677b0a6 Make MatMul layer support 3d or 4d operation with const input 2022-11-10 11:41:44 +08:00
zoom
11d492b0b9 Let part of the operators in nary_eltwise support cuda 2022-11-02 14:08:21 +08:00
Zihao Mu
17f2b56291 remove never used code in onnximporter 2022-11-02 10:45:16 +08:00
Alexander Alekhin
ee9137f176 Merge pull request #22725 from zihaomu:fix_infinit_loop_in_tf 2022-10-31 17:03:03 +00:00
Zihao Mu
903bf0147e
Merge pull request #22666 from zihaomu:support_onnx_qdq_model
DNN: let Quant and Dequant of ONNX_importer support the Constant input.

* let Quant and Dequant support the Constant input.

* fix negative value of axis.
2022-10-31 16:06:31 +00:00
Zihao Mu
18fbb72f7d fix the infinite loop in tf importer. 2022-10-31 20:10:25 +08:00
Alexander Smorkalov
22f8fb4d5c Do not fail tests in Yolo v7 model was not found. 2022-10-24 17:59:18 +03:00
Alexander Smorkalov
23edec83fb
Merge pull request #22667 from zihaomu:bug_fix_in_winograd
DNN: bug fixed in Winograd
2022-10-21 17:54:13 +03:00
Alexander Smorkalov
e4cd430710
Merge pull request #22653 from WanliZhong:issue22597
DNN-TF: let StridedSlice layer support const input
2022-10-21 17:51:00 +03:00
Dmitry Kurtaev
35b2cff295
Merge pull request #22656 from dkurt:halide_fixes
* Fixes for Halide
* Enable some Halide tests
2022-10-21 17:49:49 +03:00
Zihao Mu
cee8c86b6e fixed bug at winograd of SIMD128 and more robust code. 2022-10-21 19:14:54 +08:00
Alexander Smorkalov
5d292826b2
Merge pull request #22593 from zihaomu:optimize_wino
optimize winograd futher more
2022-10-19 13:08:32 +03:00
Alexander Smorkalov
f378f02954
Merge pull request #22652 from rogday:cuda_test_fixes
Address CUDA-related errors
2022-10-19 09:37:12 +03:00
Zhi-Qiang Zhou
c8561eae2d
Update region_layer.cpp
Fix objectness (dstData[index + 4]) is not assigned if new_coords == 1.
2022-10-19 11:17:23 +08:00
Smirnov Egor
dd14cf6a9c address CUDA-related errors and enable cuda in elementwise ops 2022-10-18 16:54:42 +03:00
Alexander Smorkalov
ec7fc5adca
Merge pull request #22529 from fengyuentau:scatter_scatternd
DNN: supports Scatter and ScatterND from ONNX
2022-10-17 14:57:46 +03:00
Alexander Smorkalov
02143cd0e2
Merge pull request #22531 from zihaomu:stop_rely_name
Parsing quantized nodes does not rely on names
2022-10-17 11:20:24 +03:00
Alexander Smorkalov
1c5dcbcac8
Merge pull request #22639 from WanliZhong:issue#22625
DNN: Make Unsqueeze layer support negative axes
2022-10-17 09:27:49 +03:00
fengyuentau
d24d8f2abe implementation of scatter and scatternd with conformance tests enabled 2022-10-17 11:30:32 +08:00
Alexander Alekhin
762481411d Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2022-10-15 16:44:47 +00:00
zoom
d816442e4d Make Unsqueeze layer support negative axes. 2022-10-14 18:00:19 +08:00
Zihao Mu
0fa43e3aac Optimize the winograd futher more. 2022-10-14 10:15:45 +08:00
zoom
9119692bb8 let StridedSlice layer support const input 2022-10-12 11:50:44 +08:00
Alexander Smorkalov
ec26541771
Merge pull request #22577 from zihaomu:Disable_winograd_branch_in_tryquantize
DNN: add enableWinograd API for Net
2022-10-11 09:44:00 +03:00
Zihao Mu
d9eff7daeb parse quantized nodes does not rely on name. 2022-10-10 17:08:46 +08:00
Alexander Smorkalov
3419e64dcf
Merge pull request #22611 from zihaomu:greaterOrEqual
DNN: support GreaterOrEqual and LessOrEqual op in ONNX
2022-10-10 11:43:44 +03:00
Zihao Mu
1e2ceca4df add enableWinograd API for Net. 2022-10-09 09:33:07 +08:00
Alexander Alekhin
347246901e Merge pull request #21745 from alalek:dnn_plugin_openvino 2022-10-08 22:32:25 +00:00
Zihao Mu
9821fae59d add greater_or_equal and less_or_equal ONNX support 2022-10-08 15:51:40 +08:00
Alexander Alekhin
43b2bb2c25 dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
Alexander Smorkalov
96844b0ca5
Merge pull request #22554 from WanliZhong:slice_axes_no_seq
DNN: Let Slice layer support non-sequential and negative axes
2022-10-03 10:15:55 +03:00
zoom
4557971481 enhance slice layer
refactor the code for parsing Slice layer
add test for Slice layer
let 'begin' and 'end' resize to dims
add opset message comment
2022-10-01 17:12:07 +08:00
Zihao Mu
15cfafb360
DNN: Remove unused code in onnx_importer.cpp 2022-09-29 10:53:43 +08:00
Voron
cbf43a54fb added opencv for openvino tutorial 2022-09-28 12:05:28 +02:00
Alexander Smorkalov
a6274647a4
Merge pull request #21738 from rogday:gather
add Gather implementation
2022-09-19 16:21:14 +03:00
Egor Smirnov
65f71ce2eb add Gather implementation 2022-09-19 15:06:44 +03:00
Alexander Smorkalov
6aefb8e86f
Merge pull request #22290 from fengyuentau:naive_yolov7
Support for YOLOv7 ONNX (not simplified)
2022-09-19 14:43:18 +03:00
fengyuentau
4aef9b1c93 dnn: support yolov7 (not simplified) 2022-09-19 18:38:03 +08:00
Alexander Smorkalov
e1e9261450
Merge pull request #22479 from scottchou007:master
Fix issues in opencv_test_dnn from conv48 kernels without bias
2022-09-16 09:05:55 +03:00
scottchou007
a3cb2020bc Fix issues in opencv_test_dnn from conv48 kernels using uninitialized tensors when there is no bias. 2022-09-15 13:41:27 -07:00
Alexander Alekhin
65bdb3a544 dnn: eliminate GCC12 warning in total() call 2022-09-14 11:37:00 +00:00
Alexander Smorkalov
c2c8da2517
Merge pull request #22448 from Ichini24:reshape-permutations-fix
changed names of permutations if Reshpe is in NHWC
2022-09-13 09:24:56 +03:00
wxsheng
4154bd0667
Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX
* Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX
* Add resize.lasx.cpp for Loongson SIMD acceleration
* Add imgwarp.lasx.cpp for Loongson SIMD acceleration
* Add LASX acceleration support for dnn/conv
* Add CV_PAUSE(v) for Loongarch
* Set LASX by default on Loongarch64
* LoongArch: tune test threshold for Core/HAL.mat_decomp/15

Co-authored-by: shengwenxue <shengwenxue@loongson.cn>
2022-09-10 09:39:43 +03:00
Alexander Alekhin
ca7f964104 dnn: use inheritance for OpenVINO net impl 2022-09-06 18:05:00 +00:00
anton
337452b4c0 changed names of permutations if Reshpe is in NHWC 2022-09-03 19:02:41 +02:00
Zihao Mu
b69b1eae8f fix bug 22450 2022-09-02 16:30:06 +08:00
Alexander Smorkalov
70fb1cd603 Merge pull request #22440 from zihaomu:fix_conv_bug 2022-08-30 07:01:05 +00:00
Alexander Smorkalov
d2c48b898c Merge pull request #22306 from zihaomu:qgemm_and_squeeze_opset13_onnximporter 2022-08-30 06:33:57 +00:00
Zihao Mu
2d837efba7 add qgemm and squeeze op13 supported on ONNXImporter 2022-08-30 09:50:29 +08:00
Alexander Smorkalov
1fd45a1b85
Merge pull request #22362 from fengyuentau:conv_asym_pad_fuse
Remove asymmetric padding in Conv layer since it is supported in CPU backend
2022-08-29 17:56:17 +03:00
Zihao Mu
2cd7e17b65 replace v_add with + 2022-08-29 17:15:35 +08:00
Alexander Smorkalov
2619099fe5
Merge pull request #22337 from zihaomu:load_ONNX_fp16_as_fp32
DNN: load fp16 ONNX model as fp32
2022-08-29 09:32:25 +03:00
fengyuentau
2959286eb5 tengine: supports conv with asymmetric padding 2022-08-29 02:51:26 +00:00
Zihao Mu
9638e34ab0 reuse WORDS_BIGENDIAN. 2022-08-27 07:42:38 +08:00
Zihao Mu
bb64db98d8
Further optimization of Conv2D, fused Conv_Add_Activation, bring latest code from ficus OpConv.fx. (#22401) 2022-08-26 12:57:25 +03:00
Zihao Mu
7eaec9dd22 load fp16 as fp32 and align fp16 and double in onnx_graph_simplifie 2022-08-26 10:04:44 +08:00