Commit Graph

339 Commits

Author SHA1 Message Date
Alexander Alekhin
60c093f086 pre: OpenCV 3.4.17 (version++) 2021-12-17 10:05:52 +00:00
Smirnov Egor
e608adea60 add ArgMax and ArgMin layers 2021-12-06 20:49:54 +03:00
Alexander Alekhin
8b4fa2605e Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-12-03 12:32:49 +00:00
Smirnov Egor
0e2a3686c0 add alpha parameter to ELU layer 2021-11-30 12:20:35 +03:00
Hanxi Guo
1fcf7ba5bc
Merge pull request #20406 from MarkGHX:gsoc_2021_webnn
[GSoC] OpenCV.js: Accelerate OpenCV.js DNN via WebNN

* Add WebNN backend for OpenCV DNN Module

Update dnn.cpp

Update dnn.cpp

Update dnn.cpp

Update dnn.cpp

Add WebNN header files into OpenCV 3rd party files

Create webnn.hpp

update cmake

Complete README and add OpenCVDetectWebNN.cmake file

add webnn.cpp

Modify webnn.cpp

Can successfully compile the codes for creating a MLContext

Update webnn.cpp

Update README.md

Update README.md

Update README.md

Update README.md

Update cmake files and

update README.md

Update OpenCVDetectWebNN.cmake and README.md

Update OpenCVDetectWebNN.cmake

Fix OpenCVDetectWebNN.cmake and update README.md

Add source webnn_cpp.cpp and library libwebnn_proc.so

Update dnn.cpp

Update dnn.cpp

Update dnn.cpp

Update dnn.cpp

update dnn.cpp

update op_webnn

update op_webnn

Update op_webnn.hpp

update op_webnn.cpp & hpp

Update op_webnn.hpp

Update op_webnn

update the skeleton

Update op_webnn.cpp

Update op_webnn

Update op_webnn.cpp

Update op_webnn.cpp

Update op_webnn.hpp

update op_webnn

update op_webnn

Solved the problems of released variables.

Fixed the bugs in op_webnn.cpp

Implement op_webnn

Implement Relu by WebNN API

Update dnn.cpp for better test

Update elementwise_layers.cpp

Implement ReLU6

Update elementwise_layers.cpp

Implement SoftMax using WebNN API

Implement Reshape by WebNN API

Implement PermuteLayer by WebNN API

Implement PoolingLayer using WebNN API

Update pooling_layer.cpp

Update pooling_layer.cpp

Update pooling_layer.cpp

Update pooling_layer.cpp

Update pooling_layer.cpp

Update pooling_layer.cpp

Implement poolingLayer by WebNN API and add more detailed logs

Update dnn.cpp

Update dnn.cpp

Remove redundant codes and add more logs for poolingLayer

Add more logs in the pooling layer implementation

Fix the indent issue and resolve the compiling issue

Fix the build problems

Fix the build issue

Fix the build issue

Update dnn.cpp

Update dnn.cpp

* Fix the build issue

* Implement BatchNorm Layer by WebNN API

* Update convolution_layer.cpp

This is a temporary file for Conv2d layer implementation

* Integrate some general functions into op_webnn.cpp&hpp

* Update const_layer.cpp

* Update convolution_layer.cpp

Still have some bugs that should be fixed.

* Update conv2d layer and fc layer

still have some problems to be fixed.

* update constLayer, conv layer, fc layer

There are still some bugs to be fixed.

* Fix the build issue

* Update concat_layer.cpp

Still have some bugs to be fixed.

* Update conv2d layer, fully connected layer and const layer

* Update convolution_layer.cpp

* Add OpenCV.js DNN module WebNN Backend (both using webnn-polyfill and electron)

* Delete bib19450.aux

* Update dnn.cpp

* Fix Error in dnn.cpp

* Resolve duplication in conditions in convolution_layer.cpp

* Fixed the issues in the comments

* Fix building issue

* Update tutorial

* Fixed comments

* Address the comments

* Update CMakeLists.txt

* Offer more accurate perf test on native

* Add better perf tests for both native and web

* Modify perf tests for better results

* Use a more recent version of Electron

* Support latest WebNN Clamp op

* Add definition of HAVE_WEBNN macro

* Support group convolution

* Implement Scale_layer using WebNN

* Add Softmax option for native classification example

* Fix comments

* Fix comments
2021-11-23 21:15:31 +00:00
Alexander Alekhin
7ba26ada12 Merge branch 4.x 2021-10-15 21:53:39 +00:00
Smirnov Egor
1feb3838b5 add Ceil, Floor, Log, Round, Sqrt, Not, Equal, Less, Greater 2021-10-15 16:02:46 +03:00
Smirnov Egor
2221dcc9f2 add SoftNMS implementation 2021-10-06 21:31:45 +03:00
Alexander Alekhin
3e6f27522b pre: OpenCV 4.5.4 (version++) 2021-10-04 22:35:47 +00:00
Alexander Alekhin
ebef84e9ea pre: OpenCV 3.4.16 (version++) 2021-10-04 20:47:07 +00:00
Jebastin Nadar
cce78cc5e2
Merge pull request #20535 from SamFC10:onnx-q
dnn : int8 quantized layers support in onnx importer

* added quantized layers support in onnx importer

* added more cases in eltwise node, some more checks

* added tests for quantized nodes

* relax thresholds for failed tests, address review comments

* refactoring based on review comments

* added support for unsupported cases and pre-quantized resnet50 test

* relax thresholds due to int8 resize layer
2021-10-04 18:07:38 +00:00
rogday
38b9ec7a18
Merge pull request #20682 from rogday:min
* Add Min layer to CPU, OpenCL, Halide, Inference Engine, NGraph and CUDA

* fix indentation

* add min to fusion and halide tests; fix doc
2021-09-22 15:17:37 +03:00
rogday
6801dd043d
Merge pull request #20494 from rogday:onnx_diagnostic_fix
fix ONNXImporter diagnostic mode layer registration issue

* fix layer registration, thread unsafe access and align the behavior of DNN_DIAGNOSTICS_RUN between onnx and tf importers

* move skipModelInput

* print all missing layers

* address TF issue
2021-08-20 14:43:47 +00:00
SamFC10
fa90e14b06 int8 layers and 8-bit quantization support 2021-08-19 09:56:47 +05:30
thezane
210bfaf8d6
Merge pull request #20483 from thezane:support-cumsum-layer-for-onnx
* Support cumsum layer for onnx

* Add unit tests

* Address review comments
2021-08-17 20:09:25 +03:00
Alexander Alekhin
05d733e707 Merge pull request #20524 from yichenj:dnn_text_recognition_enhance 2021-08-15 12:30:25 +00:00
JIANG Yichen
955cf35d5f Implement ctc prefix beam search decode for TextRecognitionModel.
The algorithm is based on Hannun's paper: First-Pass Large Vocabulary
Continuous Speech Recognition using Bi-Directional Recurrent DNNs
2021-08-12 20:33:31 +08:00
Julia Bareeva
e1cafa3834
Merge pull request #20442 from JulieBar:gru_layer
* Add initialization and inference for GRU layer

* fix issues found on review
2021-08-07 10:07:37 +03:00
Alexander Alekhin
7a5f554bc4 Merge branch 4.x 2021-06-13 10:27:44 +00:00
Alexander Alekhin
b57faa41c2 pre: OpenCV 4.5.3 (version++) 2021-06-08 08:52:20 +00:00
Alexander Alekhin
43940f7ffc pre: OpenCV 3.4.15 (version++) 2021-06-07 20:10:34 +00:00
Alexander Alekhin
b91e0dca90 Merge branch 4.x 2021-06-04 15:18:51 +00:00
Alexander Alekhin
3e513ee6ab Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-06-03 16:23:36 +00:00
Paul Jurczak
ff60abb575
Merge pull request #20080 from pauljurczak:patch-3
* Update dnn.hpp

getPerfProfile is not supported by the CUDA backend, see https://github.com/opencv/opencv/issues/20077

* dnn.hpp: fix doxygen formatting
2021-06-02 19:15:52 +00:00
Alexander Alekhin
fc628014bb Merge branch 4.x 2021-04-10 18:03:01 +00:00
Anastasia M
e08de1101d
Merge pull request #19693 from LupusSanctus:onnx_diagnostic
ONNX diagnostic tool

* Final

* Add forgotten Normalize layer to the set of supported types

* ONNX diagnostic tool corrections

* Fixed CI test warnings

* Added code minor corrections

Co-authored-by: Sergey Slashchinin <sergei.slashchinin@xperience.ai>
2021-03-29 16:38:28 +00:00
Alexander Alekhin
35eaacd1db Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-03-27 15:35:16 +00:00
Anastasia M
3e48a91d97
Merge pull request #19546 from LupusSanctus:am/slice_steps
* Added Steps support in DNN Slice layer

* Added code corrections

* dnn(slice): fix OCL and OCL_FP16 processing
2021-03-26 11:04:57 +00:00
Alexander Alekhin
b62d015285 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-03-24 18:58:46 +00:00
Anastasia M
551d4a8ec1
Merge pull request #19477 from LupusSanctus:am/eltwice_vec
* Aligned OpenCV DNN and TF sum op behaviour

Support Mat (shape: [1, m, k, n] ) + Vec (shape: [1, 1, 1, n]) operation
by vec to mat expansion

* Added code corrections: backend, minor refactoring
2021-03-23 22:16:09 +00:00
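The "vec to mat expansion" named in the commit above can be pictured with a small sketch. This is plain Python over nested lists, not the OpenCV implementation; the helper names expand_vec and add_mat_vec are hypothetical and exist only for illustration.

```python
def expand_vec(vec, m, k):
    """Repeat a length-n vector into a nested list of shape [1, m, k, n]."""
    return [[[list(vec) for _ in range(k)] for _ in range(m)]]


def add_mat_vec(mat, vec):
    """Element-wise sum of a [1, m, k, n] nested list and a length-n vector.

    The vector stands for a tensor of shape [1, 1, 1, n]; it is first
    expanded to the matrix shape and then added element by element.
    """
    m, k = len(mat[0]), len(mat[0][0])
    expanded = expand_vec(vec, m, k)
    return [[[[a + b for a, b in zip(row, erow)]
              for row, erow in zip(plane, eplane)]
             for plane, eplane in zip(batch, ebatch)]
            for batch, ebatch in zip(mat, expanded)]


mat = [[[[1, 2, 3], [4, 5, 6]]]]  # shape [1, 1, 2, 3]
vec = [10, 20, 30]                # stands for shape [1, 1, 1, 3]
result = add_mat_vec(mat, vec)    # every row of mat is shifted by vec
```

Materialising the expanded vector mirrors the wording of the commit message; a real backend may instead index the vector directly without copying.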
Alexander Alekhin
ca8c3dd9b5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-03-22 12:05:23 +00:00
Liubov Batanina
c0dd82fb53
Merge pull request #19632 from l-bat:lb/ie_arm_target
Added OpenVINO ARM target

* Added IE ARM target

* Added OpenVINO ARM target

* Delete ARM target

* Detect ARM platform

* Changed device name in ArmPlugin

* Change ARM detection
2021-03-20 11:20:02 +00:00
Alexander Alekhin
a823b06fa5 pre: OpenCV 4.5.2 (version++) 2021-03-02 23:20:59 +00:00
Alexander Alekhin
a123c48d4d pre: OpenCV 3.4.14 (version++) 2021-03-02 20:47:29 +00:00
SamFC10
96947c30c0 Added exp layer
backport of commit: 6111935835
partial backport of commit: dd5976162b
2021-02-28 19:59:40 +00:00
SamFC10
6111935835 Added exp layer 2021-02-20 22:16:00 +05:30
Tsukasa Sugiura
107f233626
Merge pull request #19484 from UnaNancyOwen:fix_highlevelapi
* [dnn] fix high level api for python

* [dnn] add test_textdetection_model_db

* [dnn] fix textdetection test only check type and shape
2021-02-10 19:42:00 +00:00
Alexander Alekhin
6b474c4051 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-02-06 00:44:11 +00:00
Alexander Alekhin
83aa711346 dnn: rename clamp() => normalize_axis() 2021-02-04 08:13:55 +00:00
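The commit above records only the rename. As a rough illustration of what an axis-normalizing helper usually does, the sketch below maps a possibly negative axis index into the range [0, dims). That behaviour is an assumption and is not stated in the commit message; the Python function is a stand-in, not the C++ helper itself.

```python
def normalize_axis(axis, dims):
    """Map an axis in [-dims, dims) to its non-negative equivalent.

    Assumed semantics: -1 refers to the last axis, -dims to the first.
    """
    if not -dims <= axis < dims:
        raise ValueError(f"axis {axis} is out of range for {dims} dimensions")
    return axis + dims if axis < 0 else axis


last_axis = normalize_axis(-1, 4)   # last axis of a 4-D blob
same_axis = normalize_axis(2, 4)    # non-negative axes pass through unchanged
```

Under this reading, the new name describes the operation more accurately than clamp(), since out-of-range values are rejected rather than clamped to the nearest valid index.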
Alexander Alekhin
d1b5a78171 build warnings
- GCC 4.8.5 / CentOS 7
2020-12-05 20:08:29 +00:00
Wenqing Zhang
22d64ae08f
Merge pull request #17570 from HannibalAPE:text_det_recog_demo
[GSoC] High Level API and Samples for Scene Text Detection and Recognition

* APIs and samples for scene text detection and recognition

* update APIs and tutorial for Text Detection and Recognition

* API updates:
(1) put decodeType into struct Voc
(2) optimize the post-processing of DB

* sample update:
(1) add transformation into scene_text_spotting.cpp
(2) modify text_detection.cpp with API update

* update tutorial

* simplify text recognition API
update tutorial

* update impl usage in recognize() and detect()

* dnn: refactoring public API of TextRecognitionModel/TextDetectionModel

* update provided models
update opencv.bib

* dnn: adjust text rectangle angle

* remove points ordering operation in model.cpp

* update gts of DB test in test_model.cpp

* dnn: ensure to keep text rectangle angle

- avoid 90/180 degree turns

* dnn(text): use quadrangle result in TextDetectionModel API

* dnn: update Text Detection API
(1) keep points' order consistent with (bl, tl, tr, br) in unclip
(2) update contourScore with boundingRect
2020-12-03 18:47:40 +00:00
Daniel Cauchi
9d37cdaa66
Merge pull request #18891 from CowKeyMan:NMS_boxes_with_different_labels
Add option for NMS for boxes with different labels

* DetectionModel impl

* Add option for NMS for boxes with different labels

In the detect function in modules/dnn/include/opencv2/dnn/dnn.hpp, whose implementation can be found at modules/dnn/src/model.cpp, Non Max Suppression (NMS) is applied only to objects with the same label. A flag was therefore added so that developers can choose between keeping the default implementation and applying NMS to all boxes, regardless of label.

The flag is called nmsDifferentLabels and has a default value of false, which keeps the current default implementation and allows existing projects to update OpenCV without disruption.

Solves issue opencv#18832

* Change return type of set & Add default constr

* Add assertions due to default constructor
2020-12-01 13:50:24 +00:00
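The difference between per-label NMS and NMS across labels, as described in the commit above, can be sketched as follows. This is a minimal greedy NMS in plain Python, not the code from model.cpp; the function names iou, nms and suppress are hypothetical, and the parameter nms_different_labels only mirrors the flag name given in the commit message.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def nms(indices, boxes, scores, iou_threshold):
    """Greedy NMS over the given indices; returns the indices that survive."""
    order = sorted(indices, key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


def suppress(boxes, scores, labels, iou_threshold, nms_different_labels=False):
    """Run NMS per label (default) or over all boxes regardless of label."""
    all_indices = range(len(boxes))
    if nms_different_labels:
        return sorted(nms(all_indices, boxes, scores, iou_threshold))
    kept = []
    for label in sorted(set(labels)):
        group = [i for i in all_indices if labels[i] == label]
        kept.extend(nms(group, boxes, scores, iou_threshold))
    return sorted(kept)


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (50, 50, 10, 10)]
scores = [0.9, 0.8, 0.7]
labels = [0, 1, 1]
per_label = suppress(boxes, scores, labels, 0.5)
across = suppress(boxes, scores, labels, 0.5, nms_different_labels=True)
```

Boxes 0 and 1 overlap heavily but carry different labels, so the default mode keeps both, while the cross-label mode drops the lower-scoring one.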
Alexander Alekhin
9d2eabaaa2 Merge remote-tracking branch 'upstream/master' into merge-4.x 2020-11-27 18:15:28 +00:00
Alexander Alekhin
2155296a13 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-11-27 14:08:06 +00:00
Sergei Slashchinin
f4f462c50b
Merge pull request #18862 from sl-sergei:support_pool1d
Support for Pool1d layer for OpenCV and OpenCL targets

* Initial version of Pool1d support

* Fix variable naming

* Fix 1d pooling for OpenCL

* Change support logic, remove unnecessary variable, split the tests

* Remove other deprecated variables

* Fix warning. Check tests

* Change support check logic

* Change support check logic, 2
2020-11-24 16:52:45 +00:00
Alexander Alekhin
ce8027c6fb Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-11-17 21:56:26 +00:00
Alexander Alekhin
9485113923 pre: OpenCV 3.4.13 (version++) 2020-11-17 21:50:30 +00:00
Alexander Alekhin
9acbfc6e05 Merge pull request #18711 from alalek:dnn_fix_model_public_api 2020-11-17 21:47:59 +00:00
Omar Alzaibaq
a316b11aaa
Merge pull request #18220 from Omar-AE:hddl-supported
* added HDDL VPU support

* changed to return True in one line if any device connected

* dnn: use releaseHDDLPlugin()

* dnn(hddl): fix conditions
2020-11-17 19:47:24 +00:00
Alexander Alekhin
23baf1a75e dnn: fix High-Level public API (cv::dnn::Model class)
- proxy selected Net methods only (don't derive from Net directly)
- default Model ctor is protected
2020-11-17 11:01:31 +00:00
Sergey Slashchinin
32e7ef8a3d Add fixes and tests for different layers 2020-11-17 13:39:32 +03:00
Alexander Alekhin
9794ff1398 next: update versions handling 2020-10-11 08:11:32 +00:00
Alexander Alekhin
a12ceb04bb pre: OpenCV 4.5.0 (version++) 2020-09-08 06:08:58 +00:00
Alexander Alekhin
50ff40d684 pre: OpenCV 3.4.12 (version++) 2020-09-06 22:26:32 +00:00
Alexander Alekhin
fa25faa2d2 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-08-06 14:15:52 +00:00
kadi soheib
6bed5c181b Corrected Comment as requested by reviewer. 2020-07-31 23:43:38 +03:00
kadi soheib
17c430da88 Updated comment. 2020-07-04 06:37:59 +03:00
kadi soheib
96a501c08b Adding comment from source code to documentation. 2020-07-04 06:37:58 +03:00
Alexander Alekhin
5f3012fc9a pre: OpenCV 4.4.0 (version++) 2020-06-09 02:27:13 +00:00
Alexander Alekhin
a43e3bebe6 pre: OpenCV 3.4.11 (version++) 2020-06-08 18:46:27 +00:00
Liubov Batanina
d3aaf2d3a3
Merge pull request #17371 from l-bat:nms_model
* Fix NMS bug in DetectionModel

* Fixed comments

* Refactoring
2020-05-28 22:54:19 +00:00
Alexander Alekhin
21e28adb87 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-05-22 19:50:14 +00:00
Liubov Batanina
d991c22090
Merge pull request #16575 from l-bat:flownet2
Support FlowNet2 model

* Support DataAugmentation layer

* Fix warnings

* Fix comments

* Support Correlation layer

* TEST

* Support Correlation layer

* Supported Accum and FlowWarp layers

* Supported ChannelNorm layer

* Supported Resample with inputs.size() > 1

* Fixed comments

* Refactoring

* Added tests

* Add resample test

* Added asserts in resize layer

* Updated DataAugmentation layer

* Update convolution layer

* Refactoring

* Fix data augmentation layer

* Fix caffe importer

* Fix resize

* Switch to Mat ptr

* Remove useless resize type

* Used ResizeLayer in Accum

* Split ChannelNormLayer

* Delete duplicate assert

* Add sample

* Fix sample

* Added colormap
2020-05-19 12:29:50 +00:00
Alexander Alekhin
4cdb4652cf Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-03-20 21:41:15 +00:00
Pavel Rojtberg
66cf55ea1f dnn: expose only float variant of NMSBoxes for bindings
the float variant was always shadowed by the int version as
Rect2d is implicitly convertible to Rect.
This swaps things which is fine, as the vector of boxes was always
copied and the computation was done in double.
2020-03-19 12:36:35 +01:00
Alexander Alekhin
d00e58cdb0 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-03-10 22:49:51 +00:00
Alexander Alekhin
db95aec4a7 dnn(ie): switch to nGraph backend by default 2020-03-10 14:33:22 +03:00
Alexander Alekhin
45d073f889 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-02-26 20:09:03 +03:00
Alexander Alekhin
01048e5603
Merge pull request #16616 from alalek:dnn_fix_input_shape
* dnn: fix processing of input shapes

- importer: avoid using of .setInput() => .setInputShape()
- setInput: shape limitation check (partial)

* dnn(test): test .setInput() in readNet()
2020-02-21 22:39:54 +03:00
Alexander Alekhin
560f85f8e5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-01-28 14:26:57 +03:00
Alexander Alekhin
5429b1f5ff Merge pull request #16223 from l-bat:lip_jppnet 2020-01-27 19:17:43 +00:00
Liubov Batanina
4a19ac5aca Move instruction 2020-01-27 16:18:32 +03:00
Alexander Alekhin
3d14dd4e39 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-01-22 16:58:30 +03:00
Liubov Batanina
7e5b5390ba Fix comments 2020-01-22 14:57:54 +03:00
Liubov Batanina
832ca0734d Refactoring 2020-01-22 10:52:40 +03:00
Julien
886220b9be Merge pull request #16273 from JulienMaille:wrapper_available_target
* add a wrapper for getAvailableTargets

* add java wrapper on Target enum
2020-01-17 19:24:37 +03:00
Liubov Batanina
7eba3a7c96 Add pack description 2020-01-09 13:59:35 +03:00
Liubov Batanina
752653c70b Update global pooling 2019-12-28 18:03:40 +03:00
Brian Wignall
f9c514b391 Fix spelling typos
backport commit 659ffaddb4
2019-12-27 12:46:53 +00:00
Brian Wignall
659ffaddb4 Fix spelling typos 2019-12-26 06:45:03 -05:00
Liubov Batanina
543e0302d3 Support global pooling by axis 2019-12-24 16:16:58 +03:00
Alexander Alekhin
4c86fc13cb Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-19 15:09:05 +03:00
Alexander Alekhin
4342657762 Merge pull request #16034 from Quantizs:irLoadFromBuffer 2019-12-19 10:00:07 +00:00
antalzsiroscandid
aa80f754f4 dnn: reading IR models from buffer 2019-12-18 15:31:08 +01:00
Diego
5b0b59ecfb Merge pull request #15189 from dvd42:keypoints_module
Keypoints module
2019-12-13 18:00:06 +03:00
Alexander Alekhin
92b9888837 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-12 13:02:19 +03:00
Alexander Alekhin
5ee7abbe3c
Merge pull request #16088 from alalek:dnn_eltwise_layer_different_src_channels
dnn(eltwise): fix handling of different number of channels

* dnn(test): reproducer for Eltwise layer issue from PR16063

* dnn(eltwise): rework support for inputs with different channels

* dnn(eltwise): get rid of finalize(), variableChannels

* dnn(eltwise): update input sorting by number of channels

- do not swap inputs if number of channels are same after truncation

* dnn(test): skip "shortcut" with batch size 2 on MYRIAD targets
2019-12-11 20:16:58 +03:00
Alexander Alekhin
4b0132ed7a Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-02 16:26:52 +03:00
Lubov Batanina
7523c777c5 Merge pull request #15537 from l-bat:ngraph
* Support nGraph

* Fix resize
2019-12-02 16:16:06 +03:00
Manjunath Bhat
78c5e41c23 Merge pull request #15808 from thebhatman:Mish_swish
* Added Swish and Mish activations

* Fixed whitespace errors

* Kernel implementation done

* Added function for launching kernel

* Changed type of 1.0

* Attempt to add test for Swish and Mish

* Resolving type mismatch for log

* exp from device

* Use log1pexp instead of adding 1

* Added openCL kernels
2019-12-02 00:06:17 +03:00
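The step "Use log1pexp instead of adding 1" in the commit above refers to computing log(1 + exp(x)), the softplus term inside Mish, without overflow. The sketch below uses one common numerically stable formulation in plain Python. It is an illustration under that assumption, not the kernel code from the pull request, and the function names are chosen here for readability.

```python
import math


def log1pexp(x):
    """Stable log(1 + exp(x)).

    Uses the identity log(1 + e^x) = max(x, 0) + log1p(e^-|x|), so exp()
    never receives a large positive argument. The naive form
    math.log(1.0 + math.exp(1000.0)) raises OverflowError.
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(log1pexp(x))


def swish(x):
    """Swish activation: x * sigmoid(x), with a sigmoid that avoids overflow."""
    if x >= 0:
        s = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        s = e / (1.0 + e)
    return x * s


large = log1pexp(1000.0)   # stays finite; approaches x for large x
at_zero = log1pexp(0.0)    # equals log(2)
```

For large positive x both activations approach x itself, and for large negative x they approach zero, which the stable forms reproduce without raising.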
thebhatman
8a18d132fc Port Swish and Mish layers 2019-12-01 11:55:39 +03:00
Alexander Alekhin
b6a58818bb Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-11-11 20:25:42 +00:00
Alexander Alekhin
055ffc0425 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-10-24 18:21:19 +00:00
Yashas Samaga B L
613c12e590 Merge pull request #14827 from YashasSamaga:cuda4dnn-csl-low
CUDA backend for the DNN module

* stub cuda4dnn design

* minor fixes for tests and doxygen

* add csl public api directory to module headers

* add low-level CSL components

* add high-level CSL components

* integrate csl::Tensor into backbone code

* switch to CPU iff unsupported; otherwise, fail on error

* add fully connected layer

* add softmax layer

* add activation layers

* support arbitrary rank TensorDescriptor

* pass input wrappers to `initCUDA()`

* add 1d/2d/3d-convolution

* add pooling layer

* reorganize and refactor code

* fixes for gcc, clang and doxygen; remove cxx14/17 code

* add blank_layer

* add LRN layer

* add rounding modes for pooling layer

* split tensor.hpp into tensor.hpp and tensor_ops.hpp

* add concat layer

* add scale layer

* add batch normalization layer

* split math.cu into activations.cu and math.hpp

* add eltwise layer

* add flatten layer

* add tensor transform api

* add asymmetric padding support for convolution layer

* add reshape layer

* fix rebase issues

* add permute layer

* add padding support for concat layer

* refactor and reorganize code

* add normalize layer

* optimize bias addition in scale layer

* add prior box layer

* fix and optimize normalize layer

* add asymmetric padding support for pooling layer

* add event API

* improve pooling performance for some padding scenarios

* avoid over-allocation of compute resources to kernels

* improve prior box performance

* enable layer fusion

* add const layer

* add resize layer

* add slice layer

* add padding layer

* add deconvolution layer

* fix channelwise ReLU initialization

* add vector traits

* add vectorized versions of relu, clipped_relu, power

* add vectorized concat kernels

* improve concat_with_offsets performance

* vectorize scale and bias kernels

* add support for multi-billion element tensors

* vectorize prior box kernels

* fix address alignment check

* improve bias addition performance of conv/deconv/fc layers

* restructure code for supporting multiple targets

* add DNN_TARGET_CUDA_FP64

* add DNN_TARGET_FP16

* improve vectorization

* add region layer

* improve tensor API, add dynamic ranks

1. use ManagedPtr instead of a Tensor in backend wrapper
2. add new methods to tensor classes
  - size_range: computes the combined size of a given axis range
  - tensor span/view can be constructed from a raw pointer and shape
3. the tensor classes can change their rank at runtime (previously rank was fixed at compile-time)
4. remove device code from tensor classes (as they are unused)
5. enforce strict conditions on tensor class APIs to improve debugging ability

* fix parametric relu activation

* add squeeze/unsqueeze tensor API

* add reorg layer

* optimize permute and enable 2d permute

* enable 1d and 2d slice

* add split layer

* add shuffle channel layer

* allow tensors of different ranks in reshape primitive

* patch SliceOp to allow Crop Layer

* allow extra shape inputs in reshape layer

* use `std::move_backward` instead of `std::move` for insert in resizable_static_array

* improve workspace management

* add spatial LRN

* add nms (cpu) to region layer

* add max pooling with argmax (and a fix to limits.hpp)

* add max unpooling layer

* rename DNN_TARGET_CUDA_FP32 to DNN_TARGET_CUDA

* update supportBackend to be more rigorous

* remove stray include from preventing non-cuda build

* include op_cuda.hpp outside condition #if

* refactoring, fixes and many optimizations

* drop DNN_TARGET_CUDA_FP64

* fix gcc errors

* increase max. tensor rank limit to six

* add Interp layer

* drop custom layers; use BackendNode

* vectorize activation kernels

* fixes for gcc

* remove wrong assertion

* fix broken assertion in unpooling primitive

* fix build errors in non-CUDA build

* completely remove workspace from public API

* fix permute layer

* enable accuracy and perf. tests for DNN_TARGET_CUDA

* add asynchronous forward

* vectorize eltwise ops

* vectorize fill kernel

* fixes for gcc

* remove CSL headers from public API

* remove csl header source group from cmake

* update min. cudnn version in cmake

* add numerically stable FP32 log1pexp

* refactor code

* add FP16 specialization to cudnn based tensor addition

* vectorize scale1 and bias1 + minor refactoring

* fix doxygen build

* fix invalid alignment assertion

* clear backend wrappers before allocateLayers

* ignore memory lock failures

* do not allocate internal blobs

* integrate NVTX

* add numerically stable half precision log1pexp

* fix indentation, following coding style, improve docs

* remove accidental modification of IE code

* Revert "add asynchronous forward"

This reverts commit 1154b9da9da07e9b52f8a81bdcea48cf31c56f70.

* [cmake] throw error for unsupported CC versions

* fix rebase issues

* add more docs, refactor code, fix bugs

* minor refactoring and fixes

* resolve warnings/errors from clang

* remove haveCUDA() checks from supportBackend()

* remove NVTX integration

* changes based on review comments

* avoid exception when no CUDA device is present

* add color code for CUDA in Net::dump
2019-10-21 14:28:00 +03:00
Alexander Alekhin
48d41ab088 dnn: bump API version 2019-09-02 14:25:18 +03:00
Alexander Alekhin
70dfae31a2 experimental version++ 2019-09-02 14:17:36 +03:00
luz.paz
fcc7d8dd4e Fix modules/ typos
Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint`

backporting of commit: ec43292e1e
2019-08-16 17:34:29 +03:00
luz.paz
ec43292e1e Fix modules/ typos
Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint`
2019-08-15 18:02:09 -04:00
Diego
f7f2438478 Merge pull request #15082 from dvd42:segmentation-module
Segmentation module (#15082)
2019-08-13 23:38:48 +03:00
Dmitry Kurtaev
a9839af903 Add preprocessing warps for separate parameters 2019-08-07 14:51:41 +03:00