957 Commits

Author SHA1 Message Date
zoom
4891818114 make MatMul support 3D or 4D with broadcast 2022-12-15 10:36:08 +08:00
Alexander Alekhin
8ba44e7d55 Merge pull request #22882 from zihaomu:gemm_first_const 2022-12-08 14:18:33 +00:00
Zihao Mu
0a650b573b
Merge pull request #22840 from zihaomu:optimze_conv_memory_usage
DNN: reduce the memory used in convolution layer

* reduce the memory in winograd and disable the test when memory usage is larger than 2 GB.

* remove VERY_LOG tag
2022-12-08 12:57:13 +00:00
Alexander Alekhin
b16f76eede Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2022-12-03 12:39:41 +00:00
Alexander Alekhin
d16b3b2487 dnn(test): restore openvino tests with 'Cannot get memory' message 2022-12-03 01:34:48 +00:00
Alexander Smorkalov
e14ca39fd7
Merge pull request #22857 from fengyuentau:batched_nms
dnn: add batched nms
2022-11-30 12:37:49 +03:00
Alexander Smorkalov
421ba8730a
Merge pull request #22809 from fengyuentau:tile
dnn: support ONNX Tile
2022-11-29 14:42:28 +03:00
zihaomu
0d56524b72 gemm: support transA and transB, and a constant first input. 2022-11-29 17:13:36 +08:00
fengyuentau
9fded9ca53 batched nms impl 2022-11-29 15:32:34 +08:00
fengyuentau
441624a5fb tile impl 2022-11-29 11:15:38 +08:00
zoom
5044af69d1 let MatMul work when both inputs are const 2022-11-27 17:32:41 +08:00
zoom
ef2677b0a6 Make MatMul layer support 3d or 4d operation with const input 2022-11-10 11:41:44 +08:00
Zihao Mu
903bf0147e
Merge pull request #22666 from zihaomu:support_onnx_qdq_model
DNN: let Quant and Dequant of ONNX_importer support the Constant input.

* let Quant and Dequant support the Constant input.

* fix negative value of axis.
2022-10-31 16:06:31 +00:00
Alexander Smorkalov
22f8fb4d5c Do not fail tests if Yolo v7 model was not found. 2022-10-24 17:59:18 +03:00
Dmitry Kurtaev
35b2cff295
Merge pull request #22656 from dkurt:halide_fixes
* Fixes for Halide
* Enable some Halide tests
2022-10-21 17:49:49 +03:00
Alexander Smorkalov
5d292826b2
Merge pull request #22593 from zihaomu:optimize_wino
optimize winograd further
2022-10-19 13:08:32 +03:00
Alexander Smorkalov
f378f02954
Merge pull request #22652 from rogday:cuda_test_fixes
Address CUDA-related errors
2022-10-19 09:37:12 +03:00
Smirnov Egor
dd14cf6a9c address CUDA-related errors and enable cuda in elementwise ops 2022-10-18 16:54:42 +03:00
Alexander Smorkalov
ec7fc5adca
Merge pull request #22529 from fengyuentau:scatter_scatternd
DNN: supports Scatter and ScatterND from ONNX
2022-10-17 14:57:46 +03:00
fengyuentau
d24d8f2abe implementation of scatter and scatternd with conformance tests enabled 2022-10-17 11:30:32 +08:00
zoom
d816442e4d Make Unsqueeze layer support negative axes. 2022-10-14 18:00:19 +08:00
Zihao Mu
0fa43e3aac Optimize winograd further. 2022-10-14 10:15:45 +08:00
Alexander Smorkalov
ec26541771
Merge pull request #22577 from zihaomu:Disable_winograd_branch_in_tryquantize
DNN: add enableWinograd API for Net
2022-10-11 09:44:00 +03:00
Alexander Smorkalov
3419e64dcf
Merge pull request #22611 from zihaomu:greaterOrEqual
DNN: support GreaterOrEqual and LessOrEqual op in ONNX
2022-10-10 11:43:44 +03:00
Zihao Mu
1e2ceca4df add enableWinograd API for Net. 2022-10-09 09:33:07 +08:00
Alexander Alekhin
347246901e Merge pull request #21745 from alalek:dnn_plugin_openvino 2022-10-08 22:32:25 +00:00
Zihao Mu
9821fae59d add greater_or_equal and less_or_equal ONNX support 2022-10-08 15:51:40 +08:00
Alexander Alekhin
43b2bb2c25 dnn: plugin support for OpenVINO 2022-10-07 16:57:31 +00:00
zoom
4557971481 enhance slice layer
refactor the code for parsing Slice layer
add test for Slice layer
let 'begin' and 'end' resize to dims
add opset message comment
2022-10-01 17:12:07 +08:00
Alexander Smorkalov
a6274647a4
Merge pull request #21738 from rogday:gather
add Gather implementation
2022-09-19 16:21:14 +03:00
Egor Smirnov
65f71ce2eb add Gather implementation 2022-09-19 15:06:44 +03:00
Alexander Smorkalov
6aefb8e86f
Merge pull request #22290 from fengyuentau:naive_yolov7
Support for YOLOv7 ONNX (not simplified)
2022-09-19 14:43:18 +03:00
fengyuentau
4aef9b1c93 dnn: support yolov7 (not simplified) 2022-09-19 18:38:03 +08:00
anton
337452b4c0 changed names of permutations if Reshape is in NHWC 2022-09-03 19:02:41 +02:00
Alexander Smorkalov
d2c48b898c Merge pull request #22306 from zihaomu:qgemm_and_squeeze_opset13_onnximporter 2022-08-30 06:33:57 +00:00
Zihao Mu
2d837efba7 add qgemm and squeeze op13 support to ONNXImporter 2022-08-30 09:50:29 +08:00
Alexander Smorkalov
2619099fe5
Merge pull request #22337 from zihaomu:load_ONNX_fp16_as_fp32
DNN: load fp16 ONNX model as fp32
2022-08-29 09:32:25 +03:00
Zihao Mu
bb64db98d8
Further optimization of Conv2D, fused Conv_Add_Activation, bring latest code from ficus OpConv.fx. (#22401) 2022-08-26 12:57:25 +03:00
Zihao Mu
7eaec9dd22 load fp16 as fp32 and align fp16 and double in onnx_graph_simplifie 2022-08-26 10:04:44 +08:00
Alexander Alekhin
2ebdc04787 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2022-08-14 15:50:42 +00:00
Alexander Smorkalov
bb71cb200e
Merge pull request #22199 from zihaomu:bug_fix_22195
DNN: Reduce Layer (add dynamic batch and ReduceSum support)
2022-08-11 12:59:51 +03:00
Alexander Smorkalov
b2b7193374
Merge pull request #22311 from zihaomu:layer_fused_optmized_mish
DNN: add another two Mish activations to onnx_graph_simplifier
2022-08-05 14:22:06 +03:00
Zihao Mu
0614c40b42 add more skip for very long test case in test_dnn. 2022-08-02 14:58:05 +08:00
Zihao Mu
d4640f4647 support ReduceLayer without reshape layer. 2022-08-02 10:32:31 +08:00
Zihao Mu
3c5377ca1b add another Mish graph simplifier. 2022-07-28 11:21:29 +08:00
rogday
ed69bcae2d
Merge pull request #21865 from rogday:nary_eltwise_layers
Reimplementation of Element-wise layers with broadcasting support

* init

* semi-working initial version

* add small_vector

* wip

* remove smallvec

* add nary function

* replace auto with Mat in lambda expr used in transform

* uncomment asserts

* autobuffer shape_buf & step_buf

* fix a missing bracket

* fixed a missing addLayer in parseElementWise

* solve one-dimensional broadcast

* remove pre_broadcast_transform for the case of two constants; fix missing constBlobsExtraInfo when addConstant is called

* one autobuffer for step & shape

* temporary fix for the missing original dimension information

* fix parseUnsqueeze when it gets a 1d tensor constant

* support sum/mean/min/max with only one input

* reuse old code to handle cases of two non-constant inputs

* add condition to handle div & mul of two non-constant inputs

* use || instead of or

* remove trailing spaces

* enlarge buf in binary_forward to contain other buffer

* use autobuffer in nary_forward

* generate data randomly and add more cases for perf

* add op and, or & xor

* update perf_dnn

* remove some comments

* remove legacy; add two ONNX conformance tests in filter

* move from cpu_denylist to all_denylist

* adjust parsing for inputs>=2

Co-authored-by: fengyuentau <yuantao.feng@opencv.org.cn>
2022-07-19 06:14:05 +03:00
Zihao Mu
1b8fba8e26 support ReduceSum with two inputs and dynamic shape batch size in ReduceLayer. 2022-07-13 13:46:16 +08:00
Zihao Mu
45fbb67aba fix scale layer not handling 1x1 weight correctly. 2022-07-13 11:25:27 +08:00
Zihao Mu
a80fcacd90
Merge pull request #21372 from zihaomu:dnn_quantize_per_tensor
Add per_tensor_quantize to int8 quantize

* add per_tensor_quantize to dnn int8 module.

* change api flag from perTensor to perChannel, and recognize quantize type and onnx importer.

* change the default to hpp
2022-07-05 19:14:42 +03:00
Zihao Mu
59b870a87a
Merge pull request #21910 from zihaomu:fast_conv_ARM
DNN: Accelerating convolution

* Fast Conv of ARM, X86 and universal intrinsics.

* improve code style.

* error fixed.

* improve the License

* optimize memory allocation and adjust the threshold.

* change FasterRCNN_vgg16 to 2GB memory.
2022-07-01 13:03:15 +03:00