wanli
c8f5e228fc
release MUL and ADD operator on CUDA
2023-02-10 19:33:59 +08:00
Alexander Alekhin
96a45e842e
Merge pull request #23061 from WanliZhong:gemm_cuda
...
DNN: support GEMM with transA and transB in CUDA
2023-02-09 00:06:32 +03:00
wanli
4718a4bf81
support GEMM with transA and transB in CUDA
2023-01-31 15:14:17 +08:00
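For readers unfamiliar with the transA and transB flags named in the entry above, the following is a minimal pure-Python sketch of the ONNX Gemm definition, Y = alpha * A' * B' + beta * C, where A' and B' are the optionally transposed inputs. It is not the OpenCV CUDA code; the function names `gemm` and `transpose` are hypothetical, and C is assumed to already have the full output shape (no broadcasting).

```python
def transpose(matrix):
    """Return the transpose of a row-major list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def gemm(a, b, c=None, alpha=1.0, beta=1.0, trans_a=False, trans_b=False):
    """Illustrative GEMM: alpha * op(a) @ op(b) + beta * c.

    op(x) is x transposed when the matching flag is set, otherwise x.
    Assumption: c, if given, has exactly the output shape.
    """
    if trans_a:
        a = transpose(a)
    if trans_b:
        b = transpose(b)
    inner = len(b)
    if len(a[0]) != inner:
        raise ValueError("inner dimensions do not match")
    rows, cols = len(a), len(b[0])
    out = [
        [alpha * sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]
    if c is not None:
        for i in range(rows):
            for j in range(cols):
                out[i][j] += beta * c[i][j]
    return out


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(gemm(a, b))                # plain product
    print(gemm(a, b, trans_a=True))  # uses the transpose of a
    print(gemm(a, b, trans_b=True))  # uses the transpose of b
```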
Alexander Alekhin
cd44aa0bb1
Merge pull request #23162 from zihaomu:issue_23151
2023-01-28 13:00:43 +00:00
zihaomu
f45a12439a
fix depthwise issue.
2023-01-28 11:41:00 +08:00
Yuantao Feng
4d918ba40b
Merge pull request #23047 from fengyuentau:layer_norm
...
dnn: add layer normalization for vision transformers
* add layer norm onnx parser, impl and tests
* add onnx graph simplifier for layer norm expanded
* handle the case when constants are of type Initializer
* add test case for layer norm expanded with initializers
* use CV_Assert & CV_CheckType in place of CV_Assert_N; use forward_fallback for OCL_FP16
* use const ref / ref in parameters of invoker::run; extract inner const if from nested loop; use size_t in place of ull
* template hasBias
* remove trailing whitespace
* use pointer parameter with null check; move normSize division & mean_square division outside of loop; use std::max to ensure positive value before std::sqrt
* refactor implementation, optimize parallel_for
* disable layer norm expanded
* remove the removal of layer norm optional outputs
2023-01-27 16:35:59 +03:00
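The bullet points above mention hoisting the normSize and mean_square divisions out of the loop and clamping with std::max before std::sqrt. A rough pure-Python sketch of layer normalization along those lines is shown below. It is an illustration only, not the OpenCV implementation: the function name `layer_norm` is hypothetical, the variance is assumed to be computed as mean_square minus the squared mean, and the default `eps` of 1e-5 follows the ONNX LayerNormalization default.

```python
import math


def layer_norm(row, scale, bias=None, eps=1e-5):
    """Normalize one row to zero mean and unit variance, then scale and shift.

    Illustrative sketch; names and structure are assumptions.
    """
    # Compute the reciprocal once instead of dividing inside the loops.
    inv_norm_size = 1.0 / len(row)
    mean = sum(row) * inv_norm_size
    mean_square = sum(v * v for v in row) * inv_norm_size
    # Rounding can make mean_square - mean^2 slightly negative,
    # so clamp to zero before taking the square root.
    variance = max(mean_square - mean * mean, 0.0)
    inv_std = 1.0 / math.sqrt(variance + eps)
    out = [(v - mean) * inv_std * s for v, s in zip(row, scale)]
    if bias is not None:  # bias is optional, mirroring the hasBias switch
        out = [o + b for o, b in zip(out, bias)]
    return out


if __name__ == "__main__":
    print(layer_norm([1.0, 2.0, 3.0], scale=[1.0, 1.0, 1.0]))
```

The clamp matters mostly for nearly constant rows, where the two terms of the variance cancel almost exactly.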
Alexander Alekhin
8ffc06ff72
Merge pull request #23173 from tomoaki0705:fix_warning_master
2023-01-23 15:33:16 +00:00
Tomoaki Teshima
186c18668c
suppress warning
2023-01-23 22:47:43 +09:00
Alexander Alekhin
18cbfa4a4f
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2023-01-23 00:11:12 +00:00
Alexander Alekhin
3d5e3a910f
Merge pull request #23096 from zihaomu:issue_23074
2023-01-12 00:51:04 +00:00
zihaomu
840b1d5c94
add depthwise add fuse
2023-01-11 08:42:51 +08:00
zihaomu
82616eec41
fix possible segmentation fault error in winograd on x86
2023-01-09 13:40:04 +08:00
Alexander Alekhin
9627ab9462
Merge pull request #23050 from zihaomu:fix_memory
2022-12-28 10:04:25 +00:00
zihaomu
71765858dc
fix invalid memory access
2022-12-28 17:16:11 +08:00
Alexander Alekhin
9a2a34f94e
dnn(openvino): remove undefined status
2022-12-28 06:55:00 +00:00
Alexander Alekhin
fc27a343e9
Merge pull request #22905 from zihaomu:clean_up_conv3d_1d
2022-12-26 17:39:18 +00:00
Alexander Alekhin
b42c11de82
pre: OpenCV 4.7.0 (version++)
2022-12-25 17:00:22 +00:00
Alexander Alekhin
a494c75bfe
pre: OpenCV 3.4.19 (version++)
2022-12-25 16:59:47 +00:00
Dmitry Kurtaev
8681686d8f
Merge pull request #22957 from dkurt:new_openvino_api
...
Switch to new OpenVINO API after 2022.1 release
* Pass Layer_Test_Convolution_DLDT.Accuracy/0 test
* Pass test Test_Caffe_layers.Softmax
* Failed 136 tests
* Fix Concat. Failed 120 tests
* Custom nGraph ops. 19 failed tests
* Set and get properties from Core
* Read model from buffer
* Change MaxPooling layer output names. Restore reshape
* Cosmetic changes
* Cosmetic changes
* Override getOutputsInfo
* Fixes for OpenVINO < 2022.1
* Async inference for 2021.4 and less
* Compile model with config
* Fix serialize for 2022.1
* Asynchronous inference with 2022.1
* Handle 1d outputs
* Work with model with dynamic output shape
* Fixes with 1d output for old API
* Control outputs by nGraph function for all OpenVINO versions
* Refer inputs in PrePostProcessor by indices
* Fix cycled dependency between InfEngineNgraphNode and InfEngineNgraphNet.
Add InferRequest callback only for async inference. Do not capture InferRequest object.
* Fix tests thresholds
* Fix HETERO:GPU,CPU plugin issues with unsupported layer
2022-12-23 16:58:41 +00:00
Alexander Smorkalov
9012e6dd9b
Merge pull request #22965 from vrabaud:numpy_fix
...
Remove references to deprecated NumPy type aliases.
2022-12-23 15:34:02 +03:00
Alexander Smorkalov
4930516652
Merge pull request #22898 from fengyuentau:slice_neg_steps
...
dnn: support ONNX Slice with negative steps by adding and using cv::flipND
2022-12-23 14:15:06 +03:00
Vincent Rabaud
ad568edd7f
Remove references to deprecated NumPy type aliases.
...
This change replaces references to a number of deprecated NumPy
type aliases (np.bool, np.int, np.float, np.complex, np.object,
np.str) with their recommended replacement (bool, int, float,
complex, object, str).
Those types were deprecated in 1.20 and are removed in 1.24,
cf https://github.com/numpy/numpy/pull/22607 .
2022-12-23 13:53:49 +03:00
Alexander Alekhin
1f41d06f9a
Merge pull request #23008 from mshabunin:fix-yolov4-tiny-hash
2022-12-23 10:14:25 +00:00
zihaomu
71c6339af0
remove old convolution branch, and optimize conv3d and conv1d.
2022-12-23 16:50:28 +08:00
fengyuentau
34a0897f90
add cv::flipND; support onnx slice with negative steps via cv::flipND
2022-12-23 16:39:53 +08:00
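The idea in the entry above, handling a negative-step slice by flipping first, can be illustrated on a one-dimensional Python list. This is a sketch of the general equivalence only, not the cv::flipND code path; the function name `slice_negative_step_via_flip` is hypothetical, and Python's own slice semantics are assumed for clamping out-of-range bounds.

```python
def slice_negative_step_via_flip(values, start, end, step):
    """Emulate values[start:end:step] for step < 0 with a flip plus a positive-step slice."""
    if step is None or step >= 0:
        raise ValueError("step must be negative")
    n = len(values)
    # Clamp the bounds the way Python does for a negative step;
    # afterwards both start and end lie in the range [-1, n - 1].
    start, end, _ = slice(start, end, step).indices(n)
    flipped = list(reversed(values))
    # The element at index i of values sits at index n - 1 - i of flipped,
    # so walking down from start to end becomes walking up in flipped.
    return flipped[n - 1 - start : n - 1 - end : -step]


if __name__ == "__main__":
    data = list(range(10))
    print(slice_negative_step_via_flip(data, 8, 2, -2))  # [8, 6, 4]
    print(data[8:2:-2])                                  # same result
```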
Maksim Shabunin
d35fbe6bfc
dnn: updated YOLOv4-tiny model and tests
2022-12-22 15:49:21 +03:00
Alexander Alekhin
6b4f3e5fab
Merge pull request #22993 from alalek:fixup_21738
2022-12-21 19:50:51 +00:00
Yuantao Feng
a2b3acfc6e
dnn: add the CANN backend ( #22634 )
...
* cann backend impl v1
* cann backend impl v2: use opencv parsers to build models for cann
* adjust fc according to the new transA and transB
* put cann net in cann backend node and reuse forwardLayer
* use fork() to create a child process and compile cann model
* remove legacy code
* remove debug code
* fall back to CPU backend if there is one layer not supported by CANN backend
* fix netInput forward
2022-12-21 09:04:41 +03:00
Alexander Alekhin
cdbb893b27
dnn: disable OpenCL code path in MatMul processing
...
- this mode is not supported by 22828
2022-12-20 09:46:48 +00:00
Alexander Alekhin
1102b7eff8
dnn: fix gather layer implementation
...
- support FP16 data
2022-12-20 06:09:34 +00:00
zoom
4891818114
make MatMul support 3D or 4D with broadcast
2022-12-15 10:36:08 +08:00
Alexander Alekhin
8ba44e7d55
Merge pull request #22882 from zihaomu:gemm_first_const
2022-12-08 14:18:33 +00:00
Zihao Mu
0a650b573b
Merge pull request #22840 from zihaomu:optimze_conv_memory_usage
...
DNN: reduce the memory used in convolution layer
* reduce the memory in winograd and disable the test when memory usage is larger than 2 GB.
* remove VERY_LOG tag
2022-12-08 12:57:13 +00:00
Alexander Alekhin
b16f76eede
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2022-12-03 12:39:41 +00:00
Alexander Alekhin
d16b3b2487
dnn(test): restore openvino tests with 'Cannot get memory' message
2022-12-03 01:34:48 +00:00
Alexander Alekhin
74d0b4cc78
dnn(openvino): fix custom layers BlockingDesc
2022-12-03 01:34:10 +00:00
Alexander Smorkalov
e14ca39fd7
Merge pull request #22857 from fengyuentau:batched_nms
...
dnn: add batched nms
2022-11-30 12:37:49 +03:00
Alexander Smorkalov
421ba8730a
Merge pull request #22809 from fengyuentau:tile
...
dnn: support ONNX Tile
2022-11-29 14:42:28 +03:00
zihaomu
0d56524b72
gemm supports transA and transB, and a constant first input.
2022-11-29 17:13:36 +08:00
fengyuentau
9fded9ca53
batched nms impl
2022-11-29 15:32:34 +08:00
fengyuentau
441624a5fb
tile impl
2022-11-29 11:15:38 +08:00
zoom
5044af69d1
let MatMul work when both inputs are const
2022-11-27 17:32:41 +08:00
Alexander Smorkalov
6ca205a029
Merge pull request #22478 from WanliZhong:nary_eltwise_cuda
...
DNN: Let part of the operators in nary_eltwise support CUDA
2022-11-22 16:15:50 +03:00
zihaomu
5bf64e7dfe
fix the infinite loop in tf importer of 3.4 branch
2022-11-15 11:42:10 +08:00
zoom
ef2677b0a6
Make MatMul layer support 3d or 4d operation with const input
2022-11-10 11:41:44 +08:00
zoom
11d492b0b9
Let part of the operators in nary_eltwise support cuda
2022-11-02 14:08:21 +08:00
Zihao Mu
17f2b56291
remove never used code in onnximporter
2022-11-02 10:45:16 +08:00
Alexander Alekhin
ee9137f176
Merge pull request #22725 from zihaomu:fix_infinit_loop_in_tf
2022-10-31 17:03:03 +00:00
Zihao Mu
903bf0147e
Merge pull request #22666 from zihaomu:support_onnx_qdq_model
...
DNN: let Quant and Dequant of ONNX_importer support the Constant input.
* let Quant and Dequant support the Constant input.
* fix negative value of axis.
2022-10-31 16:06:31 +00:00
Zihao Mu
18fbb72f7d
fix the infinite loop in tf importer.
2022-10-31 20:10:25 +08:00
Alexander Smorkalov
22f8fb4d5c
Do not fail tests if Yolo v7 model was not found.
2022-10-24 17:59:18 +03:00
Alexander Smorkalov
23edec83fb
Merge pull request #22667 from zihaomu:bug_fix_in_winograd
...
DNN: bug fixed in Winograd
2022-10-21 17:54:13 +03:00
Alexander Smorkalov
e4cd430710
Merge pull request #22653 from WanliZhong:issue22597
...
DNN-TF: let StridedSlice layer support const input
2022-10-21 17:51:00 +03:00
Dmitry Kurtaev
35b2cff295
Merge pull request #22656 from dkurt:halide_fixes
...
* Fixes for Halide
* Enable some Halide tests
2022-10-21 17:49:49 +03:00
Zihao Mu
cee8c86b6e
fixed bug at winograd of SIMD128 and more robust code.
2022-10-21 19:14:54 +08:00
Alexander Smorkalov
5d292826b2
Merge pull request #22593 from zihaomu:optimize_wino
...
optimize winograd further
2022-10-19 13:08:32 +03:00
Alexander Smorkalov
f378f02954
Merge pull request #22652 from rogday:cuda_test_fixes
...
Address CUDA-related errors
2022-10-19 09:37:12 +03:00
Zhi-Qiang Zhou
c8561eae2d
Update region_layer.cpp
...
Fix objectness (dstData[index + 4]) is not assigned if new_coords == 1.
2022-10-19 11:17:23 +08:00
Smirnov Egor
dd14cf6a9c
address CUDA-related errors and enable cuda in elementwise ops
2022-10-18 16:54:42 +03:00
Alexander Smorkalov
ec7fc5adca
Merge pull request #22529 from fengyuentau:scatter_scatternd
...
DNN: supports Scatter and ScatterND from ONNX
2022-10-17 14:57:46 +03:00
Alexander Smorkalov
02143cd0e2
Merge pull request #22531 from zihaomu:stop_rely_name
...
Parsing quantized nodes does not rely on names
2022-10-17 11:20:24 +03:00
Alexander Smorkalov
1c5dcbcac8
Merge pull request #22639 from WanliZhong:issue#22625
...
DNN: Make Unsqueeze layer support negative axes
2022-10-17 09:27:49 +03:00
fengyuentau
d24d8f2abe
implementation of scatter and scatternd with conformance tests enabled
2022-10-17 11:30:32 +08:00
Alexander Alekhin
762481411d
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2022-10-15 16:44:47 +00:00
zoom
d816442e4d
Make Unsqueeze layer support negative axes.
2022-10-14 18:00:19 +08:00
Zihao Mu
0fa43e3aac
Optimize winograd further.
2022-10-14 10:15:45 +08:00
zoom
9119692bb8
let StridedSlice layer support const input
2022-10-12 11:50:44 +08:00
Alexander Smorkalov
ec26541771
Merge pull request #22577 from zihaomu:Disable_winograd_branch_in_tryquantize
...
DNN: add enableWinograd API for Net
2022-10-11 09:44:00 +03:00
Zihao Mu
d9eff7daeb
parse quantized nodes does not rely on name.
2022-10-10 17:08:46 +08:00
Alexander Smorkalov
3419e64dcf
Merge pull request #22611 from zihaomu:greaterOrEqual
...
DNN: support GreaterOrEqual and LessOrEqual op in ONNX
2022-10-10 11:43:44 +03:00
Zihao Mu
1e2ceca4df
add enableWinograd API for Net.
2022-10-09 09:33:07 +08:00
Alexander Alekhin
347246901e
Merge pull request #21745 from alalek:dnn_plugin_openvino
2022-10-08 22:32:25 +00:00
Zihao Mu
9821fae59d
add greater_or_equal and less_or_equal ONNX support
2022-10-08 15:51:40 +08:00
Alexander Alekhin
43b2bb2c25
dnn: plugin support for OpenVINO
2022-10-07 16:57:31 +00:00
Alexander Smorkalov
96844b0ca5
Merge pull request #22554 from WanliZhong:slice_axes_no_seq
...
DNN: Let Slice layer support non-sequential and negative axes
2022-10-03 10:15:55 +03:00
zoom
4557971481
enhance slice layer
...
refactor the code for parsing Slice layer
add test for Slice layer
let 'begin' and 'end' resize to dims
add opset message comment
2022-10-01 17:12:07 +08:00
Zihao Mu
15cfafb360
DNN: Remove unused code in onnx_importer.cpp
2022-09-29 10:53:43 +08:00
Voron
cbf43a54fb
added opencv for openvino tutorial
2022-09-28 12:05:28 +02:00
Alexander Smorkalov
a6274647a4
Merge pull request #21738 from rogday:gather
...
add Gather implementation
2022-09-19 16:21:14 +03:00
Egor Smirnov
65f71ce2eb
add Gather implementation
2022-09-19 15:06:44 +03:00
Alexander Smorkalov
6aefb8e86f
Merge pull request #22290 from fengyuentau:naive_yolov7
...
Support for YOLOv7 ONNX (not simplified)
2022-09-19 14:43:18 +03:00
fengyuentau
4aef9b1c93
dnn: support yolov7 (not simplified)
2022-09-19 18:38:03 +08:00
Alexander Smorkalov
e1e9261450
Merge pull request #22479 from scottchou007:master
...
Fix issues in opencv_test_dnn from conv48 kernels without bias
2022-09-16 09:05:55 +03:00
scottchou007
a3cb2020bc
Fix issues in opencv_test_dnn from conv48 kernels using uninitialized tensors when there is no bias.
2022-09-15 13:41:27 -07:00
Alexander Alekhin
65bdb3a544
dnn: eliminate GCC12 warning in total() call
2022-09-14 11:37:00 +00:00
Alexander Smorkalov
c2c8da2517
Merge pull request #22448 from Ichini24:reshape-permutations-fix
...
changed names of permutations if Reshape is in NHWC
2022-09-13 09:24:56 +03:00
wxsheng
4154bd0667
Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX
...
* Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX
* Add resize.lasx.cpp for Loongson SIMD acceleration
* Add imgwarp.lasx.cpp for Loongson SIMD acceleration
* Add LASX acceleration support for dnn/conv
* Add CV_PAUSE(v) for Loongarch
* Set LASX by default on Loongarch64
* LoongArch: tune test threshold for Core/HAL.mat_decomp/15
Co-authored-by: shengwenxue <shengwenxue@loongson.cn>
2022-09-10 09:39:43 +03:00
Alexander Alekhin
ca7f964104
dnn: use inheritance for OpenVINO net impl
2022-09-06 18:05:00 +00:00
anton
337452b4c0
changed names of permutations if Reshape is in NHWC
2022-09-03 19:02:41 +02:00
Zihao Mu
b69b1eae8f
fix bug 22450
2022-09-02 16:30:06 +08:00
Alexander Smorkalov
70fb1cd603
Merge pull request #22440 from zihaomu:fix_conv_bug
2022-08-30 07:01:05 +00:00
Alexander Smorkalov
d2c48b898c
Merge pull request #22306 from zihaomu:qgemm_and_squeeze_opset13_onnximporter
2022-08-30 06:33:57 +00:00
Zihao Mu
2d837efba7
add qgemm and squeeze op13 supported on ONNXImporter
2022-08-30 09:50:29 +08:00
Alexander Smorkalov
1fd45a1b85
Merge pull request #22362 from fengyuentau:conv_asym_pad_fuse
...
Remove asymmetric padding in Conv layer since it is supported in CPU backend
2022-08-29 17:56:17 +03:00
Zihao Mu
2cd7e17b65
replace v_add with +
2022-08-29 17:15:35 +08:00
Alexander Smorkalov
2619099fe5
Merge pull request #22337 from zihaomu:load_ONNX_fp16_as_fp32
...
DNN: load fp16 ONNX model as fp32
2022-08-29 09:32:25 +03:00
fengyuentau
2959286eb5
tengine: supports conv with asymmetric padding
2022-08-29 02:51:26 +00:00
Zihao Mu
9638e34ab0
reuse WORDS_BIGENDIAN.
2022-08-27 07:42:38 +08:00
Zihao Mu
bb64db98d8
Further optimization of Conv2D, fused Conv_Add_Activation, bring latest code from ficus OpConv.fx. ( #22401 )
2022-08-26 12:57:25 +03:00
Zihao Mu
7eaec9dd22
load fp16 as fp32 and align fp16 and double in onnx_graph_simplifie
2022-08-26 10:04:44 +08:00