Commit Graph

1525 Commits

Author SHA1 Message Date
HAN Liutong
1599f9f0c0
Merge pull request #21086 from hanliutong:rvv-dnn
Further optimize DNN for RISC-V Vector.

* Optimize DNN on RVV by using vsetvl.

* Rename vl.

* Update fastConv by using setvl instead of mask.

* Fix fastDepthwiseConv
2021-12-10 16:03:22 +00:00
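The "vsetvl instead of mask" idea in the commit above can be pictured as a strip-mined loop: each iteration asks how many elements it may process and handles exactly that many, so the tail needs no separate mask. The sketch below only emulates this in Python and is not the OpenCV code. `VLMAX`, `setvl`, and `strip_mined_add` are hypothetical names, and `setvl` is modelled as `min(remaining, VLMAX)`, which simplifies what the real instruction is allowed to return.

```python
VLMAX = 4  # hypothetical maximum number of elements per vector operation


def setvl(remaining, vlmax=VLMAX):
    """Simplified model of vsetvl: process at most vlmax elements per step."""
    return min(remaining, vlmax)


def strip_mined_add(a, b, vlmax=VLMAX):
    """Element-wise a + b, processed in chunks whose length is chosen by setvl.

    Returns the result and the list of vector lengths used, so the shorter
    final chunk (the tail) is visible.
    """
    out, lengths = [], []
    i, n = 0, len(a)
    while i < n:
        vl = setvl(n - i, vlmax)  # the last iteration simply gets a smaller vl
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        lengths.append(vl)
        i += vl
    return out, lengths


result, lengths = strip_mined_add([1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60])
```

With six elements and `VLMAX = 4`, the loop runs twice, with vector lengths 4 and 2; the second pass is the tail.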
Gruhuang
17bc8565f6
Merge pull request #21154 from pccvlab:MatMul_with_two_inputs
Add BatchMatMul layer support for tf_importer

* two inputs

* support batch_matmul

* refactor: remove useless code

* refactor: decrease nesting
2021-12-10 14:44:27 +03:00
Smirnov Egor
e608adea60 add ArgMax and ArgMin layers 2021-12-06 20:49:54 +03:00
HAN Liutong
4935b14539
Merge pull request #21012 from hanliutong:rvv_clang
Update RVV backend for using Clang.

* Update cmake file of clang.

* Modify the RVV optimization on DNN to adapt to clang.

* Modify intrin_rvv: Disable some existing types.

* Modify intrin_rvv: Reinterpret instead of load&cast.

* Modify intrin_rvv: Update load&store without cast.

* Modify intrin_rvv: Rename vfredsum to fredosum.

* Modify intrin_rvv: Rewrite Check all/any by using vpopc.

* Modify intrin_rvv: Use reinterpret instead of c-style casting.

* Remove all macros which are not used in v_reinterpret

* Rename vpopc to vcpop according to spec.
2021-12-03 15:13:24 +00:00
Alexander Alekhin
8b4fa2605e Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-12-03 12:32:49 +00:00
Alexander Alekhin
35ff9af6ce Merge pull request #21162 from rogday:softmax_simplification 2021-12-02 17:14:48 +00:00
Alexander Alekhin
dad2b9aac8 Merge pull request #21160 from rogday:elu_alpha 2021-12-02 17:13:57 +00:00
rogday
1613d30544
Merge pull request #21159 from rogday:ceil_mode
fix ceil_mode for Average/MaxPooling

* fix ceil_mode

* add a comment
2021-12-02 20:11:11 +03:00
Alexander Alekhin
5da69c0b9a Merge pull request #21164 from rogday:sum_identity 2021-12-01 22:49:02 +00:00
Alexander Alekhin
a806e8cc58 Merge pull request #21163 from rogday:transpose_default 2021-12-01 22:47:57 +00:00
Smirnov Egor
33e97e994d add sum of 1 input 2021-11-30 15:42:20 +03:00
Smirnov Egor
11e6848bb9 add default order to transpose 2021-11-30 15:34:34 +03:00
Smirnov Egor
829410729c add new (Log)SoftMax simplification passes 2021-11-30 15:20:52 +03:00
Smirnov Egor
4995aecd62 add alpha parameter to ELU 2021-11-30 14:43:18 +03:00
Smirnov Egor
0e2a3686c0 add alpha parameter to ELU layer 2021-11-30 12:20:35 +03:00
Alexander Alekhin
0d2857a242 Merge pull request #21152 from rogday:fix_defaults 2021-11-29 22:39:27 +00:00
Alexander Alekhin
17d99e6266 Merge pull request #21142 from alalek:dnn_two_inputs_ocl_fp16_3.4 2021-11-29 21:44:59 +00:00
Andrew Ryrie
ea7d4be3f8
Merge pull request #20658 from smbz:lstm_optimisation
* dnn: LSTM optimisation

This uses the AVX-optimised fastGEMM1T for matrix multiplications where available, instead of the standard cv::gemm.

fastGEMM1T is already used by the fully-connected layer.  This commit involves two minor modifications:
 - Use unaligned access.  I don't believe this involves any performance hit on modern CPUs (Nehalem and Bulldozer onwards) in the case where the address is actually aligned.
 - Allow for weight matrices where the number of columns is not a multiple of 8.

I have not enabled AVX-512 as I don't have an AVX-512 CPU to test on.

* Fix warning about initialisation order

* Remove C++11 syntax

* Fix build when AVX(2) is not available

In this case the CV_TRY_X macros are defined to 0, rather than being undefined.

* Minor changes as requested:

 - Don't check hardware support for AVX(2) when dispatch is disabled for these
 - Add braces

* Fix out-of-bounds access in fully connected layer

The old tail handling in fastGEMM1T implicitly rounded vecsize up to the next multiple of 8, and the fully connected layer implements padding up to the next multiple of 8 to cope with this.  The new tail handling does not round the vecsize upwards like this, but it does require that the vecsize is at least 8.  To adapt to the new tail handling, the fully connected layer now rounds vecsize itself at the same time as adding the padding (which makes more sense anyway).

This also means that the fully connected layer always passes a vecsize of at least 8 to fastGEMM1T, which fixes the out-of-bounds access problems.

* Improve tail mask handling

 - Use static array for generating tail masks (as requested)
 - Apply tail mask to the weights as well as the input vectors to prevent spurious propagation of NaNs/Infs

* Revert whitespace change

* Improve readability of conditions for using AVX

* dnn(lstm): minor coding style changes, replaced left aligned load
2021-11-29 21:43:00 +00:00
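Two points from the commit message above lend themselves to a small illustration: rounding vecsize up to the next multiple of 8, and applying the tail mask to the weights as well as the inputs. The following is a plain-Python sketch, not the AVX code from the commit. `LANES`, `round_up`, `tail_mask`, and `masked_dot` are hypothetical names, and the mask is modelled as "keep the value or replace it with 0.0".

```python
import math

LANES = 8  # assumed number of floats per vector register


def round_up(n, multiple=LANES):
    """Round n up to the next multiple of `multiple`."""
    return (n + multiple - 1) // multiple * multiple


def tail_mask(valid, lanes=LANES):
    """True for the first `valid` lanes, False for the padding lanes."""
    return [i < valid for i in range(lanes)]


def masked_dot(weights, inputs, valid, mask_weights=True):
    """Dot product over one vector of LANES values with a tail mask.

    If only the inputs are masked, a NaN sitting in the padded part of the
    weights still poisons the sum, because NaN * 0.0 is NaN.
    """
    total = 0.0
    for w, x, keep in zip(weights, inputs, tail_mask(valid)):
        x = x if keep else 0.0
        if mask_weights:
            w = w if keep else 0.0
        total += w * x
    return total


# Five valid weights followed by three garbage (NaN) padding lanes.
weights = [1.0] * 5 + [math.nan] * 3
inputs = [2.0] * 8
clean = masked_dot(weights, inputs, valid=5)
poisoned = masked_dot(weights, inputs, valid=5, mask_weights=False)
```

Here `clean` is the expected sum over the five valid lanes, while `poisoned` becomes NaN. That is the spurious propagation the commit message describes.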
Smirnov Egor
05db8784ae fix Clip, LeakyReLU, LRN, Split defaults 2021-11-29 20:20:34 +03:00
Supernovae
b594ed99b8
Merge pull request #20933 from shubham-shahh:master
Improved overall readability of the code

* grid_nms.cu: minor fix-ups

* Update grid_stride_range.hpp

* Update tf_importer.cpp
2021-11-28 12:54:29 +00:00
Alexander Alekhin
58b06222ff dnn(DataLayer): fix CPU/OpenCL code paths for FP16 handling 2021-11-28 07:44:05 +00:00
yuki takehara
a6277370ca
Merge pull request #21107 from take1014:remove_assert_21038
resolves #21038

* remove C assert

* revert C header

* fix several points in review

* fix test_ds.cpp
2021-11-27 18:34:52 +00:00
Hanxi Guo
1fcf7ba5bc
Merge pull request #20406 from MarkGHX:gsoc_2021_webnn
[GSoC] OpenCV.js: Accelerate OpenCV.js DNN via WebNN

* Add WebNN backend for OpenCV DNN Module

Update dnn.cpp

Update dnn.cpp

Update dnn.cpp

Update dnn.cpp

Add WebNN header files into OpenCV 3rd party files

Create webnn.hpp

update cmake

Complete README and add OpenCVDetectWebNN.cmake file

add webnn.cpp

Modify webnn.cpp

Can successfully compile the codes for creating a MLContext

Update webnn.cpp

Update README.md

Update README.md

Update README.md

Update README.md

Update cmake files and

update README.md

Update OpenCVDetectWebNN.cmake and README.md

Update OpenCVDetectWebNN.cmake

Fix OpenCVDetectWebNN.cmake and update README.md

Add source webnn_cpp.cpp and library libwebnn_proc.so

Update dnn.cpp

Update dnn.cpp

Update dnn.cpp

Update dnn.cpp

update dnn.cpp

update op_webnn

update op_webnn

Update op_webnn.hpp

update op_webnn.cpp & hpp

Update op_webnn.hpp

Update op_webnn

update the skeleton

Update op_webnn.cpp

Update op_webnn

Update op_webnn.cpp

Update op_webnn.cpp

Update op_webnn.hpp

update op_webnn

update op_webnn

Solved the problems of released variables.

Fixed the bugs in op_webnn.cpp

Implement op_webnn

Implement Relu by WebNN API

Update dnn.cpp for better test

Update elementwise_layers.cpp

Implement ReLU6

Update elementwise_layers.cpp

Implement SoftMax using WebNN API

Implement Reshape by WebNN API

Implement PermuteLayer by WebNN API

Implement PoolingLayer using WebNN API

Update pooling_layer.cpp

Update pooling_layer.cpp

Update pooling_layer.cpp

Update pooling_layer.cpp

Update pooling_layer.cpp

Update pooling_layer.cpp

Implement poolingLayer by WebNN API and add more detailed logs

Update dnn.cpp

Update dnn.cpp

Remove redundant codes and add more logs for poolingLayer

Add more logs in the pooling layer implementation

Fix the indent issue and resolve the compiling issue

Fix the build problems

Fix the build issue

Fix the build issue

Update dnn.cpp

Update dnn.cpp

* Fix the build issue

* Implement BatchNorm Layer by WebNN API

* Update convolution_layer.cpp

This is a temporary file for Conv2d layer implementation

* Integrate some general functions into op_webnn.cpp&hpp

* Update const_layer.cpp

* Update convolution_layer.cpp

Still have some bugs that should be fixed.

* Update conv2d layer and fc layer

still have some problems to be fixed.

* update constLayer, conv layer, fc layer

There are still some bugs to be fixed.

* Fix the build issue

* Update concat_layer.cpp

Still have some bugs to be fixed.

* Update conv2d layer, fully connected layer and const layer

* Update convolution_layer.cpp

* Add OpenCV.js DNN module WebNN Backend (both using webnn-polyfill and electron)

* Delete bib19450.aux

* Update dnn.cpp

* Fix Error in dnn.cpp

* Resolve duplication in conditions in convolution_layer.cpp

* Fixed the issues in the comments

* Fix building issue

* Update tutorial

* Fixed comments

* Address the comments

* Update CMakeLists.txt

* Offer more accurate perf test on native

* Add better perf tests for both native and web

* Modify perf tests for better results

* Use a more recent version of Electron

* Support latest WebNN Clamp op

* Add definition of HAVE_WEBNN macro

* Support group convolution

* Implement Scale_layer using WebNN

* Add Softmax option for native classification example

* Fix comments

* Fix comments
2021-11-23 21:15:31 +00:00
Alexander Alekhin
394e640909 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-11-13 15:11:30 +00:00
Alexander Alekhin
8041ab8a61
Merge pull request #21025 from alalek:issue_21004
* dnn(ocl4dnn): fix LRN layer accuracy problems

- FP16 intermediate computation is not accurate and may provide NaN values

* dnn(test): update tolerance for FP16
2021-11-12 01:54:07 +03:00
ZaKiiiiiiiii
98b6ce353c
Merge pull request #20904 from Crayon-new:fix_bug_in_maxLayer
fix bug: wrong output dimension when "keep_dims" is false in pooling layer.

* fix bug in max layer

* code align

* delete permute layer and add test case

* add name assert

* check other cases

* remove c++11 features

* style:add "const" remove assert

* style:sanitize file names
2021-11-09 19:24:04 +03:00
Alexander Alekhin
7842181b47 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-11-05 09:27:46 +00:00
Alexander Alekhin
d484939c02
Merge pull request #20999 from alalek:dnn_replace_deprecated_calls
dnn(protobuf): replace deprecated calls

* dnn: replace deprecated ByteSize() => ByteSizeLong()

* dnn: replace deprecated calls, use GetRepeatedFieldRef
2021-11-03 15:59:36 +00:00
Alexander Alekhin
ec10f2e72b Merge pull request #20877 from rogday:simple_layers 2021-10-20 17:00:38 +00:00
rogday
b3f966e2ca
Merge pull request #20883 from rogday:eltwise_refactoring
* backport elementwise_layers refactor

* keep NULL
2021-10-19 13:29:22 +00:00
Alexander Alekhin
1926e919be dnn(int8): fix using of incorrect UMat constructor 2021-10-18 04:46:00 +00:00
Alexander Alekhin
31c40fa4cc Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-10-15 13:35:03 +00:00
Smirnov Egor
1feb3838b5 add Ceil, Floor, Log, Round, Sqrt, Not, Equal, Less, Greater 2021-10-15 16:02:46 +03:00
Alexander Alekhin
53d6c9b9c0 Merge pull request #20860 from rogday:sum_fix 2021-10-12 15:36:32 +00:00
Smirnov Egor
238dbffb48 change asserts for Sum 2021-10-11 20:59:44 +03:00
Smirnov Egor
a9d7b6eab7 fix const - input and remove unimplemented function 2021-10-11 18:58:10 +03:00
Alexander Alekhin
4672dbda2a Merge pull request #20818 from rogday:yolov4x_mish_cuda 2021-10-08 19:12:43 +00:00
Smirnov Egor
9c84749e2c backport YOLOv4x-mish new_coords CUDA implementation 2021-10-08 14:14:49 +03:00
Alexander Alekhin
cca4c47781 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-10-08 11:05:45 +00:00
Alexander Alekhin
81e7988eb9 Merge pull request #20840 from alalek:dnn_ocl_cleanup_code 2021-10-08 05:07:51 +00:00
Alexander Alekhin
8c2dd5fb9a dnn(ocl4dnn): cleanup dead code, improve logging 2021-10-08 00:39:40 +00:00
Alexander Alekhin
724e04e979 dnn(ocl4dnn): add extra checks to convolution layer
- prevent running code over unsupported/non-tested configurations
- prevent integer div by zero
2021-10-07 23:18:32 +00:00
Alexander Alekhin
03a08435e2 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-10-07 04:27:22 +00:00
Alexander Alekhin
822d468232 Merge pull request #20813 from rogday:soft_nms 2021-10-06 20:20:34 +00:00
Smirnov Egor
2221dcc9f2 add SoftNMS implementation 2021-10-06 21:31:45 +03:00
Oliver Kuckertz
a3d7811f24
Merge pull request #20725 from mologie:fix-dnn-tf-on-arm
* dnn: fix unaligned memory access crash on armv7

The getTensorContent function would return a Mat pointing to some
member of a Protobuf-encoded message. Protobuf does not make any
alignment guarantees, which results in a crash on armv7 when loading
models while bit 2 is set in /proc/cpu/alignment (or the relevant
kernel feature for alignment compatibility is disabled). Any read
attempt from the previously unaligned data member would send SIGBUS.

As a workaround, this commit makes an aligned copy via existing clone
functionality in getTensorContent. The unsafe copy=false option is
removed. Unfortunately, a rather crude hack in PReLUSubgraph in fact
writes(!) to the Protobuf message. We limit ourselves to fixing the
alignment issues in this commit, and add getTensorContentRefUnaligned
to cover the write case with a safe memcpy. A FIXME marks the issue.

* dnn: reduce amount of .clone() calls

* dnn: update FIXME comment

Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2021-10-06 16:41:05 +00:00
HAN Liutong
e5fb50476c
Merge pull request #20521 from hanliutong:dev-rvv-multiVLEN
Make the implementation of optimization in DNN adjustable to different vector sizes with RVV intrinsics.

* Update fastGEMM for multi VLEN.

* Update fastGEMM1T for multi VLEN.

* Update fastDepthwiseConv for multi VLEN.

* Update fastConv for multi VLEN.

* Replace malloc with cv::AutoBuffer.
2021-10-05 15:35:00 +00:00
Alexander Alekhin
1b70f94282 Merge pull request #20782 from YashasSamaga:cuda4dnn-eltwise-broadcast 2021-10-04 22:35:00 +00:00
Jebastin Nadar
cce78cc5e2
Merge pull request #20535 from SamFC10:onnx-q
dnn : int8 quantized layers support in onnx importer

* added quantized layers support in onnx importer

* added more cases in eltwise node, some more checks

* added tests for quantized nodes

* relax thresholds for failed tests, address review comments

* refactoring based on review comments

* added support for unsupported cases and pre-quantized resnet50 test

* relax thresholds due to int8 resize layer
2021-10-04 18:07:38 +00:00
Zihao Mu
9085b933d8
Merge pull request #20702 from zihaomu:tf_expand_dim_layer
Add ExpandDims layer of tf_importer.cpp

* Add ExpandDims to tf_importer.

* add -1 expand test case.

* Support different dimensions of input.

* Compatible with 5-dimensional NDHWC data

* Code align

* support 3-dim input.

* 3-dim bug fixed.

* fixing error of code format.
2021-10-04 16:37:38 +00:00
YashasSamaga
505dde09de support broadcasting in eltwise ops 2021-10-04 12:38:45 +05:30
SamFC10
87ebf2e50b fix illegal memory access in int8 convolution 2021-10-03 15:16:01 +05:30
Alexander Alekhin
37c3f0d8a0 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-10-02 17:57:18 +00:00
Alexander Alekhin
f977d10a19 dnn(ocl): fix conv DWCONV workgroup 2021-10-01 18:52:07 +00:00
Alexander Alekhin
846317ef37 dnn(ocl): fix conv BASIC workgroup 2021-09-29 14:55:46 +00:00
Alexander Alekhin
24fcb7f813 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-09-25 17:50:00 +00:00
rogday
38b9ec7a18
Merge pull request #20682 from rogday:min
* Add Min layer to CPU, OpenCL, Halide, Inference Engine, NGraph and CUDA

* fix indentation

* add min to fusion and halide tests; fix doc
2021-09-22 15:17:37 +03:00
SamFC10
9c5d7716e2 fix for unsqueeze opset version 13 2021-09-17 17:40:57 +05:30
rogday
c410d7a97d
Merge pull request #20671 from rogday:yolov4x-mish
Add support for YOLOv4x-mish

* backport to 3.4 for supporting yolov4x-mish

* add YOLOv4x-mish test

* address review comments

Co-authored-by: Guo Xu <guoxu@1school.com.cn>
2021-09-14 17:49:49 +00:00
YashasSamaga
50462dcdc6 fix effrank assert to allow input effrank <= output effrank 2021-09-13 20:44:33 +05:30
Alexander Alekhin
6e66a9222a dnn(onnx): fix format specifier 2021-09-11 22:26:52 +00:00
Alexander Alekhin
c3ac834526 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-09-11 21:27:26 +00:00
Zihao Mu
51b03b87e6 BiasAdd could load Const from second place. 2021-09-11 15:34:41 +00:00
Alexander Alekhin
6ace801418 Merge pull request #20661 from alalek:dnn_ocl_fix_gemm_like_kernel 2021-09-10 11:58:52 +00:00
rogday
d31b93b513
Merge pull request #20674 from rogday:prelu_slope
Fix PReLU negative slope access pattern

* fix prelu negative slope access pattern

* change begin() to ptr()
2021-09-10 11:07:16 +00:00
rogday
4807cd8a6e
Merge pull request #20605 from rogday:split_slice_shenanigans
Add Normalize subgraph, fix Slice, Mul and Expand

* Add Normalize subgraph, support for starts<0 and axis<0 in Slice, Mul broadcasting in the middle and fix Expand's unsqueeze

* remove todos

* remove range-based for loop

* address review comments

* change >> to > > in template

* fix indexation

* fix expand that does nothing
2021-09-09 14:41:40 +03:00
Alexander Alekhin
35e824c287 dnn(ocl): fix out of bound access in GEMM-like kernels
- dropped usage of CreateSubBuffer() - buffers lifetime management issue
- fixed elementwise offset
- avoid out of bounds read access
2021-09-06 18:17:21 +00:00
Alexander Alekhin
5578ad5e14 dnn(ocl): fix automatic globalsize adjusting
- if kernel code doesn't support that
2021-09-06 03:11:29 +00:00
Alexander Alekhin
0a43b23275 Merge pull request #20651 from alalek:issue_18361 2021-09-04 18:22:12 +00:00
Alexander Alekhin
7967683296 Merge pull request #20648 from alalek:issue_20615 2021-09-04 18:21:58 +00:00
Alexander Alekhin
5b2c016834 dnn(ocl): avoid out of buffer access in copyWeightsSwizzled 2021-09-04 15:45:59 +00:00
Alexander Alekhin
407adc7061 dnn(ocl): fix buffer offsets in IDLF kernel
- drop CreateSubBuffer
- fix FUSED_CONV_ELTWISE mode
2021-09-04 15:28:35 +00:00
rogday
d0e612dc36
Merge pull request #20647 from rogday:resize_concat_optimization
Fix resize+concat optimization

* fix resize+concat optimization

* add comment and fix indentation
2021-09-03 12:32:29 +00:00
Alexander Alekhin
5aa7435d25 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-09-02 15:24:04 +00:00
Alexander Alekhin
060a76dc3e Merge pull request #20573 from rogday:onnx_scale_fix 2021-09-01 14:09:17 +00:00
WJJ1995
edc442afdb
Merge pull request #20511 from wjj19950828:add_humanseg_support_0806
* support PPSeg model for dnn module

* fixed README for CI

* add test case

* fixed bug

* deal with comments

* rm dnn_model_runner

* update test case

* fixed bug for testcase

* update testcase
2021-09-01 10:10:05 +00:00
Alexander Alekhin
ae6fabc6fe dnn(ocl): drop CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE check
- it is a hint and it should not block kernel execution
2021-08-30 20:40:14 +00:00
Alexander Alekhin
4c05a697fa Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-08-28 21:30:28 +00:00
Vincent Rabaud
38d0063c36 Do not use deprecated ReleaseCleared in protobuf library.
This is to make code work with protobuf arenas for memory
management (ReleaseCleared is incompatible).
The cleaning of the memory is also simpler.
2021-08-26 15:36:22 +02:00
Alexander Alekhin
6fbfc58602 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-08-21 17:25:18 +00:00
Alexander Alekhin
77a5c43d50 Merge pull request #20586 from alalek:issue_20585 2021-08-21 17:22:58 +00:00
Alexander Alekhin
f28e4b86fb dnn(ocl): fix top initialization in verifyResult 2021-08-21 16:04:13 +00:00
rogday
6801dd043d
Merge pull request #20494 from rogday:onnx_diagnostic_fix
fix ONNXImporter diagnostic mode layer registration issue

* fix layer registration, thread unsafe access and align the behavior of DNN_DIAGNOSTICS_RUN between onnx and tf importers

* move skipModelInput

* print all missing layers

* address TF issue
2021-08-20 14:43:47 +00:00
Alexander Alekhin
a9817e9127 Merge pull request #20556 from rogday:onnx_split_sum_fix 2021-08-20 08:10:18 +00:00
Vincent Rabaud
9cfa84313c Use the one argument version of SetTotalBytesLimit.
The two-argument version has been deprecated, cf.
https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream
2021-08-19 14:31:29 +02:00
SamFC10
fa90e14b06 int8 layers and 8-bit quantization support 2021-08-19 09:56:47 +05:30
Smirnov Egor
fe625a558e fix hasDynamicShapes for batch_size and fix axis selection in Scale layer 2021-08-18 19:22:24 +03:00
thezane
210bfaf8d6
Merge pull request #20483 from thezane:support-cumsum-layer-for-onnx
* Support cumsum layer for onnx

* Add unit tests

* Address review comments
2021-08-17 20:09:25 +03:00
Smirnov Egor
9ef41f68fb fix Split partial sum 2021-08-16 15:44:54 +03:00
Alexander Alekhin
05d733e707 Merge pull request #20524 from yichenj:dnn_text_recognition_enhance 2021-08-15 12:30:25 +00:00
Alexander Alekhin
0c01cf7c85 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-08-14 18:24:00 +00:00
Julia Bareeva
cfb36443fb
Merge pull request #20506 from JulieBar:lstm_activations
* Support activations(Sigmoid, Tanh) for LSTM

* fix warning
2021-08-13 15:41:00 +03:00
Alexander Alekhin
9d3826c676 Merge pull request #20525 from SamFC10:fix-prior-variances 2021-08-13 10:06:55 +00:00
JIANG Yichen
955cf35d5f Implement ctc prefix beam search decode for TextRecognitionModel.
The algorithm is based on Hannun's paper: First-Pass Large Vocabulary
Continuous Speech Recognition using Bi-Directional Recurrent DNNs
2021-08-12 20:33:31 +08:00
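For readers unfamiliar with CTC prefix beam search, the core bookkeeping can be sketched in a few lines. This is a generic, minimal version without a language model, working on plain probabilities. It is not the TextRecognitionModel implementation, and the function name, parameters, and input layout are assumptions made for illustration. Each prefix carries two scores: the probability of the paths ending in a blank and of the paths ending in a non-blank symbol.

```python
from collections import defaultdict


def ctc_prefix_beam_search(probs, beam_size=10, blank=0):
    """Return the most probable label sequence and its total probability.

    probs: list of time steps, each a list of per-symbol probabilities.
    """
    # prefix -> (probability ending in blank, probability ending in non-blank)
    beam = [((), (1.0, 0.0))]
    for row in probs:
        next_beam = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beam:
            for s, p in enumerate(row):
                if s == blank:
                    # A blank keeps the prefix unchanged.
                    next_beam[prefix][0] += (p_b + p_nb) * p
                    continue
                extended = prefix + (s,)
                if prefix and prefix[-1] == s:
                    # Repeated symbol: it extends the prefix only after a blank,
                    # otherwise it collapses into the same prefix.
                    next_beam[extended][1] += p_b * p
                    next_beam[prefix][1] += p_nb * p
                else:
                    next_beam[extended][1] += (p_b + p_nb) * p
        # Keep only the beam_size most probable prefixes.
        ranked = sorted(next_beam.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beam = [(prefix, tuple(scores)) for prefix, scores in ranked[:beam_size]]
    best_prefix, (p_b, p_nb) = beam[0]
    return list(best_prefix), p_b + p_nb


# Two time steps, alphabet {blank, 'a'}: the single best path is blank-blank
# (0.36), yet the label sequence [1] collects 0.24 + 0.24 + 0.16 = 0.64.
labels, prob = ctc_prefix_beam_search([[0.6, 0.4], [0.6, 0.4]])
```

The example input is the standard case where greedy decoding and prefix search disagree: summing over all alignments of a prefix can outweigh the single most likely path.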
HAN Liutong
aaca4987c9
Merge pull request #20287 from hanliutong:dev-rvv-0.10
Optimization of DNN using native RISC-V vector intrinsics.

* Use RVV to optimize fastGEMM (FP32) in DNN.

* Use RVV to optimize fastGEMM1T in DNN.

* Use RVV to optimize fastConv in DNN.

* Use RVV to optimize fastDepthwiseConv in DNN.

* Vectorize tails using vl.

* Use "vl" instead of scalar to handle small block in fastConv.

* Fix memory access out of bound in "fastGEMM1T".

* Remove setvl.

* Remove useless initialization.

* Use loop unrolling to handle tail part instead of switch.
2021-08-11 01:16:03 +03:00
Smirnov Egor
739ff84732 add Max layer to TFImporter 2021-08-09 14:01:51 +03:00
SamFC10
2a177052de fix bug in prior-box variances 2021-08-09 12:08:55 +05:30
Julia Bareeva
e1cafa3834
Merge pull request #20442 from JulieBar:gru_layer
* Add initialization and inference for GRU layer

* fix issues found on review
2021-08-07 10:07:37 +03:00
Alexander Alekhin
907743eee7 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-07-30 14:50:36 +00:00
Smirnov Egor
27392f832d reimplement onnx refactor for master 2021-07-30 13:00:13 +03:00
rogday
cff0168f3a
Merge pull request #20453 from rogday:onnx_importer_fix
Split layer dispatch into functions in ONNXImporter

* split layer dispatch into functions

* fixes

* indentation and comment fixes

* fix constness
2021-07-28 18:06:24 +03:00
Alexander Alekhin
f4d6a3ec4e Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-07-24 15:34:58 +00:00
Julia Bareeva
4e5699fa71
Merge pull request #20450 from JulieBar:lstm_inside
Support non-zero hidden state for LSTM

* fully support non-zero hidden state for LSTM

* check dims of hidden state for LSTM

* fix failed test Test_Model.TextRecognition

* add new tests for LSTM w/ non-zero hidden params

Co-authored-by: Julie Bareeva <julia.bareeva@xperience.ai>
2021-07-23 17:11:50 +03:00
Smirnov Egor
024b43ca06 implement asymmetric padding for conv2d, max_pool and conv2d_backprop_input 2021-07-22 16:58:40 +03:00
Alexander Alekhin
b61a55eebf Merge pull request #20402 from rogday:tf_diag_dummy 2021-07-16 15:44:29 +00:00
Smirnov Egor
c30078c5a3 add NotImplemented layer 2021-07-16 15:39:54 +03:00
Alexander Alekhin
39b91c97f0 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-07-16 10:35:42 +00:00
Alexander Alekhin
8334ee18e6 Merge pull request #20394 from SamFC10:conv-asymmetric-pads 2021-07-16 10:33:42 +00:00
SamFC10
96d35f7c54 Fix convolution asymmetric padding bug in onnx importer 2021-07-16 09:39:41 +05:30
Alexander Alekhin
fbde0c6c96 dnn(ie): fix handling of 1D and non-32F outputs of InferenceEngine 2021-07-15 21:47:05 +00:00
Alexander Alekhin
9e42e04b4a Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-07-10 13:01:03 +00:00
César Gouveia
167a12028d
Merge pull request #20374 from cesarpgouveia:bugfix/fix_load_onnxModel_debug
* Fix bug while loading onnx model in debug

* dnn: fix other .at using

Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2021-07-09 18:21:56 +00:00
Alexander Alekhin
821fae0d94 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-07-03 00:30:58 +00:00
mitruska
18dbac203f Use explicit version of ngraph NormalizeL2 2021-07-02 21:33:05 +00:00
Alexander Alekhin
8fad85edda Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-07-01 10:52:31 +00:00
Alexander Alekhin
b699fe7a9d Merge pull request #20335 from SamFC10:concat-const-input 2021-07-01 10:25:35 +00:00
SamFC10
5b8c10f2f8 modified onnx importer to concat const input blobs 2021-07-01 10:58:31 +05:30
Alexander Alekhin
24983f62e2 Merge pull request #20325 from alalek:dnn_openvino_2021.4.0 2021-06-30 23:58:26 +00:00
Alexander Alekhin
f2057ce1ab dnn(ie): replace deprecated calls 2021-06-30 22:30:15 +00:00
Alexander Alekhin
7d842f5bcf dnn: use OpenVINO 2021.4 defines 2021-06-29 18:48:21 +00:00
Smirnov Egor
dc5199feea skipping missing layers and layer failures 2021-06-25 11:26:37 +03:00
SamFC10
55e1dfb778 Fix BatchNorm reinitialization 2021-06-20 13:19:29 +05:30
Alexander Alekhin
735a79ae83 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-06-19 18:44:16 +00:00
rogday
7ee1816612 split if into map of functions 2021-06-11 13:20:45 +03:00
YashasSamaga
32df5faa25 add MatMulOp 2021-05-22 01:01:29 +05:30
Alexander Alekhin
170bf6d7af Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-05-01 09:44:24 +00:00
Alexander Alekhin
71bae7c23f dnn(ie): implicit usage of IE::GPU OpenCL kernels cache 2021-04-29 12:43:22 +03:00
Aleksandr Voron
2e143b8799
Merge pull request #19961 from alvoron:dnn_ngraph_int64_fix
Explicit usage of int64_t in CropAndResizeLayer (IE backend)

* Update crop_and_resize_layer.cpp
2021-04-21 18:29:19 +00:00
Alexander Alekhin
3e1673e8b2 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-04-01 09:54:57 +00:00
Anastasia Murzova
cc6d48959e Added reduce sum by channel support 2021-03-30 23:01:22 +03:00
Vitaly Tuzov
aab62aa6dd
Merge pull request #18952 from terfendail:wui_doc
* Updated UI documentation to address WUI

* Added documentation for vx_ calls

* Removed vx_store operation overload

* Doxyfile updated to enable wide UI

* Enable doxygen documentation for vx_ WUI functions

* Wide intrinsics definition rework

* core: fix SIMD C++ emulator build (supports 128-bit only)
2021-03-30 16:18:03 +00:00
Alexander Alekhin
c89084e6b7 Merge pull request #19223 from YashasSamaga:cuda4dnn-halfpix-linear-resize 2021-03-30 13:19:41 +00:00
Anastasia M
e08de1101d
Merge pull request #19693 from LupusSanctus:onnx_diagnostic
ONNX diagnostic tool

* Final

* Add forgotten Normalize layer to the set of supported types

* ONNX diagnostic tool corrections

* Fixed CI test warnings

* Added minor code corrections

Co-authored-by: Sergey Slashchinin <sergei.slashchinin@xperience.ai>
2021-03-29 16:38:28 +00:00
Alexander Alekhin
35eaacd1db Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-03-27 15:35:16 +00:00
Alexander Alekhin
d27eb79fa6 Merge pull request #19785 from alalek:dnn_ocl_fix_async_kernels 2021-03-26 12:27:58 +00:00
Anastasia M
3e48a91d97
Merge pull request #19546 from LupusSanctus:am/slice_steps
* Added Steps support in DNN Slice layer

* Added code corrections

* dnn(slice): fix OCL and OCL_FP16 processing
2021-03-26 11:04:57 +00:00
Alexander Alekhin
86d0a86141 dnn(ocl): fix gemm kernel scheduling 2021-03-26 00:35:00 +00:00
Alexander Alekhin
b62d015285 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-03-24 18:58:46 +00:00
Alexander Alekhin
56bdd7db5c dnn: use OpenVINO 2021.3 defines
original commit: 6291503793
2021-03-24 10:26:24 +00:00
Anastasia Murzova
e75f1b071b Added reshape corrections 2021-03-24 10:53:11 +03:00
Anastasia Murzova
7a2b3ed471 Corrected DNN elementwise multiplication 2021-03-24 10:53:11 +03:00
Anastasia M
551d4a8ec1
Merge pull request #19477 from LupusSanctus:am/eltwice_vec
* Aligned OpenCV DNN and TF sum op behaviour

Support Mat (shape: [1, m, k, n] ) + Vec (shape: [1, 1, 1, n]) operation
by vec to mat expansion

* Added code corrections: backend, minor refactoring
2021-03-23 22:16:09 +00:00
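The "vec to mat expansion" mentioned in the commit above amounts to repeating the `[1, 1, 1, n]` vector along the `m` and `k` axes and then adding element-wise. A small pure-Python sketch using nested lists follows. `expand_vec` and `add_4d` are hypothetical helpers, not OpenCV functions, and the actual layer code may organise the work differently.

```python
def expand_vec(vec, m, k):
    """Expand a [1, 1, 1, n] nested list to shape [1, m, k, n] by repetition."""
    row = vec[0][0][0]
    return [[[list(row) for _ in range(k)] for _ in range(m)]]


def add_4d(a, b):
    """Element-wise sum of two 4-D nested lists of identical shape."""
    return [
        [
            [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(plane_a, plane_b)]
            for plane_a, plane_b in zip(batch_a, batch_b)
        ]
        for batch_a, batch_b in zip(a, b)
    ]


# Mat of shape [1, 2, 2, 3] plus Vec of shape [1, 1, 1, 3].
mat = [[[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]]
vec = [[[[10, 20, 30]]]]
result = add_4d(mat, expand_vec(vec, m=2, k=2))
```

Every innermost row of `mat` receives the same three offsets, which matches how the sum behaves with the last axis shared between the two operands.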
Alexander Alekhin
ca8c3dd9b5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-03-22 12:05:23 +00:00
Liubov Batanina
c0dd82fb53
Merge pull request #19632 from l-bat:lb/ie_arm_target
Added OpenVINO ARM target

* Added IE ARM target

* Added OpenVINO ARM target

* Delete ARM target

* Detect ARM platform

* Changed device name in ArmPlugin

* Change ARM detection
2021-03-20 11:20:02 +00:00
Alexander Alekhin
b19f860384 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-03-13 13:02:18 +00:00
Liubov Batanina
8d29a902e4 Added ngraph::op::v6::MVN 2021-03-12 21:02:03 +03:00
Liubov Batanina
95ab9468c1 Added ngraph::op::v4::Interpolation 2021-03-12 12:00:59 +03:00
Alexander Alekhin
fbb38cc245 Merge pull request #19222 from YashasSamaga:cuda4dnn-fix-build-diagnostics 2021-03-10 17:40:36 +00:00
Alexander Alekhin
e4692ac079 Merge pull request #19613 from WeiChungChang:NMS_refine 2021-03-10 17:36:57 +00:00
Qoo
47337e2196 boost NMS performance 2021-03-10 15:59:26 +00:00