Commit Graph

177 Commits

Author SHA1 Message Date
NesQl
0bcdf7d03e
Merge pull request #16724 from liqi-c:3.4-tengine
* Add Tengine support .

* Modify printf to CV_LOG_WARNING

* a few minor fixes in the code

* Renew Tengine version

* Add header file for CV_LOG_WARNING

* Add #ifdef HAVE_TENGINE in tengine_graph_convolution.cpp

* remove trailing whitespace

* Remove trailing whitespace

* Modify for compile problem

* Modify some code style error

* remove whitespace

* Move some code style problem

* test

* add ios limit and build problem

* Modified as alalek suggested

* Add cmake 2.8 support

* modify cmake 3.5.1 problem

* test and set BUILD_ANDROID_PROJECTS OFF

* remove some compile error

* remove some extra code in tengine

* close test.

* Test again

* disable android.

* delete ndk version judgement

* Remove setenv() call . and add License information

* Set tengine default OFF. Close test .

Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
2020-03-09 14:59:23 +00:00
Alexander Alekhin
124bf8339f dnn(IE): use HAVE_DNN_IE_NN_BUILDER_2019 for NN Builder API code
- CMake option: OPENCV_DNN_IE_NN_BUILDER_2019
2020-03-03 08:07:54 +00:00
Alexander Alekhin
29d214474f dnn(IE): use HAVE_DNN_IE_NN_BUILDER_2019 for NN Builder API code
- CMake option: OPENCV_DNN_IE_NN_BUILDER_2019
2020-03-03 07:45:09 +00:00
Alexander Alekhin
560f85f8e5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-01-28 14:26:57 +03:00
Liubov Batanina
a3ae69893c Extend nGraph Deconvolution layer support 2020-01-23 15:10:42 +03:00
Yashas Samaga B L
17c485eb03 Merge pull request #16092 from YashasSamaga:cuda4dnn-conv-act-fuse
cuda4dnn: fuse activations with convolutions

* fuse ReLU, ReLU6, TanH, Sigmoid with conv

* fix OpenCL errors

* improve ReLU, add power, swish and mish

* fix missing fusion entries

* fix handling of unsetAttached

* remove whole file indentation

* optimize power = 1.0, use IDENTITY instead of NONE

* handle edge case: change backend and then clear
2019-12-14 22:26:58 +03:00
Alexander Alekhin
92b9888837 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-12 13:02:19 +03:00
Alexander Alekhin
939099b9ce Merge pull request #16107 from dkurt:dnn_ie_ngraph_v1_conv 2019-12-10 12:10:50 +00:00
Dmitry Kurtaev
fe77223dee Modify nGraph's ConvolutionBackpropData and GroupConvolution 2019-12-10 14:14:00 +03:00
Dmitry Kurtaev
c2ca3ee2fa Fix weights fusion for Convolution and Deconvolution layers in nGraph 2019-12-09 19:06:47 +03:00
Alexander Alekhin
4b0132ed7a Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-02 16:26:52 +03:00
Lubov Batanina
7523c777c5 Merge pull request #15537 from l-bat:ngraph
* Support nGraph

* Fix resize
2019-12-02 16:16:06 +03:00
Yashas Samaga B L
613c12e590 Merge pull request #14827 from YashasSamaga:cuda4dnn-csl-low
CUDA backend for the DNN module

* stub cuda4dnn design

* minor fixes for tests and doxygen

* add csl public api directory to module headers

* add low-level CSL components

* add high-level CSL components

* integrate csl::Tensor into backbone code

* switch to CPU iff unsupported; otherwise, fail on error

* add fully connected layer

* add softmax layer

* add activation layers

* support arbitary rank TensorDescriptor

* pass input wrappers to `initCUDA()`

* add 1d/2d/3d-convolution

* add pooling layer

* reorganize and refactor code

* fixes for gcc, clang and doxygen; remove cxx14/17 code

* add blank_layer

* add LRN layer

* add rounding modes for pooling layer

* split tensor.hpp into tensor.hpp and tensor_ops.hpp

* add concat layer

* add scale layer

* add batch normalization layer

* split math.cu into activations.cu and math.hpp

* add eltwise layer

* add flatten layer

* add tensor transform api

* add asymmetric padding support for convolution layer

* add reshape layer

* fix rebase issues

* add permute layer

* add padding support for concat layer

* refactor and reorganize code

* add normalize layer

* optimize bias addition in scale layer

* add prior box layer

* fix and optimize normalize layer

* add asymmetric padding support for pooling layer

* add event API

* improve pooling performance for some padding scenarios

* avoid over-allocation of compute resources to kernels

* improve prior box performance

* enable layer fusion

* add const layer

* add resize layer

* add slice layer

* add padding layer

* add deconvolution layer

* fix channelwise  ReLU initialization

* add vector traits

* add vectorized versions of relu, clipped_relu, power

* add vectorized concat kernels

* improve concat_with_offsets performance

* vectorize scale and bias kernels

* add support for multi-billion element tensors

* vectorize prior box kernels

* fix address alignment check

* improve bias addition performance of conv/deconv/fc layers

* restructure code for supporting multiple targets

* add DNN_TARGET_CUDA_FP64

* add DNN_TARGET_FP16

* improve vectorization

* add region layer

* improve tensor API, add dynamic ranks

1. use ManagedPtr instead of a Tensor in backend wrapper
2. add new methods to tensor classes
  - size_range: computes the combined size of for a given axis range
  - tensor span/view can be constructed from a raw pointer and shape
3. the tensor classes can change their rank at runtime (previously rank was fixed at compile-time)
4. remove device code from tensor classes (as they are unused)
5. enforce strict conditions on tensor class APIs to improve debugging ability

* fix parametric relu activation

* add squeeze/unsqueeze tensor API

* add reorg layer

* optimize permute and enable 2d permute

* enable 1d and 2d slice

* add split layer

* add shuffle channel layer

* allow tensors of different ranks in reshape primitive

* patch SliceOp to allow Crop Layer

* allow extra shape inputs in reshape layer

* use `std::move_backward` instead of `std::move` for insert in resizable_static_array

* improve workspace management

* add spatial LRN

* add nms (cpu) to region layer

* add max pooling with argmax ( and a fix to limits.hpp)

* add max unpooling layer

* rename DNN_TARGET_CUDA_FP32 to DNN_TARGET_CUDA

* update supportBackend to be more rigorous

* remove stray include from preventing non-cuda build

* include op_cuda.hpp outside condition #if

* refactoring, fixes and many optimizations

* drop DNN_TARGET_CUDA_FP64

* fix gcc errors

* increase max. tensor rank limit to six

* add Interp layer

* drop custom layers; use BackendNode

* vectorize activation kernels

* fixes for gcc

* remove wrong assertion

* fix broken assertion in unpooling primitive

* fix build errors in non-CUDA build

* completely remove workspace from public API

* fix permute layer

* enable accuracy and perf. tests for DNN_TARGET_CUDA

* add asynchronous forward

* vectorize eltwise ops

* vectorize fill kernel

* fixes for gcc

* remove CSL headers from public API

* remove csl header source group from cmake

* update min. cudnn version in cmake

* add numerically stable FP32 log1pexp

* refactor code

* add FP16 specialization to cudnn based tensor addition

* vectorize scale1 and bias1 + minor refactoring

* fix doxygen build

* fix invalid alignment assertion

* clear backend wrappers before allocateLayers

* ignore memory lock failures

* do not allocate internal blobs

* integrate NVTX

* add numerically stable half precision log1pexp

* fix indentation, following coding style,  improve docs

* remove accidental modification of IE code

* Revert "add asynchronous forward"

This reverts commit 1154b9da9da07e9b52f8a81bdcea48cf31c56f70.

* [cmake] throw error for unsupported CC versions

* fix rebase issues

* add more docs, refactor code, fix bugs

* minor refactoring and fixes

* resolve warnings/errors from clang

* remove haveCUDA() checks from supportBackend()

* remove NVTX integration

* changes based on review comments

* avoid exception when no CUDA device is present

* add color code for CUDA in Net::dump
2019-10-21 14:28:00 +03:00
Alexander Alekhin
e2a5a6a05c Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-09-25 18:32:44 +00:00
Lubov Batanina
e923712d81 Merge pull request #15572 from l-bat:deconv3d
Fix computation of internal shapes in Deconvolution layer

* Fix computation of internal shapes

* Refactoring
2019-09-25 15:35:04 +03:00
Alexander Alekhin
2ad0487cec Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-08-13 18:32:29 +00:00
Lubov Batanina
0e1ef8f8e1 Merge pull request #15184 from l-bat:IE_R2
Support new IE API (#15184)

* Add support OpenVINO R2 for layers

* Add Core API

* Fix tests

* Fix expectNoFallbacksFromIE for ONNX nets

* Remove deprecated API

* Remove td

* Remove TargetDevice

* Fix Async

* Add test

* Fix detectMyriadX

* Fix test

* Fix warning
2019-08-06 22:20:26 +03:00
Alexander Alekhin
f6c573880e Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-07-12 18:45:06 +00:00
Lubov Batanina
34f6b05467 Merge pull request #14996 from l-bat:ocv_deconv3d
* Support Deconvolution3D on IE backend

* Add test tag

* Fix tests
2019-07-12 15:51:44 +03:00
Lubov Batanina
8bcd7e122a Merge pull request #14842 from l-bat:ocv_conv3d
* Support Conv3D on OCV backend

* Add header

* Add perf tests

* Support pool3d

* Enable Resnet34_kinetics on OCV backend

* Add test

* Fix conv

* Optimize Conv2D
2019-07-11 20:13:52 +03:00
Alexander Alekhin
65552bf403 dnn: fix build with Vulkan 2019-07-01 17:54:40 +03:00
Alexander Alekhin
66d7956e67 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-06-15 16:25:11 +00:00
Dmitry Kurtaev
eba696a41e Merge pull request #14792 from dkurt:dnn_ie_min_version_r5
* Remove Inference Engine 2018R3 and 2018R4

* Fix 2018R5
2019-06-14 18:17:02 +03:00
Alexander Alekhin
f3de2b4be7 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-06-05 19:11:52 +03:00
Dmitry Kurtaev
9c0af1f675 Enable more deconvolution layer configurations with IE backend 2019-06-03 08:15:52 +03:00
Alexander Alekhin
b2abd8ca41 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-05-07 16:04:54 +00:00
Dmitry Kurtaev
471b83ccd5 Modify paddings computation for SAME pad mode 2019-05-06 10:49:10 +03:00
Alexander Alekhin
e28e3c9491 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-05-01 08:27:45 +00:00
Lubov Batanina
77fa59c3da Merge pull request #14301 from l-bat:conv3d
Support Convolution3D layer on IE backend (#14301)

* Add Convolution3D layer

* Disable CXX11

* Fixed tests

* Add Pooling3D layer

* Merge Conv2d with Conv3d and Pool2d with Pool3d layers

* Split pads

* Add Deconvolution layer

* Refactoring

* Deduplication

* Refactoring

* Add utils for Convolution and Pooling layers
2019-04-30 17:08:17 +03:00
Alexander Alekhin
5dc606097c Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-04-02 20:54:41 +00:00
Dmitry Kurtaev
e3286c9055 Enable 1x1 convolution optimization 2019-04-02 14:05:17 +03:00
Alexander Alekhin
8c0b0714e7 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-03-11 19:20:22 +00:00
Alexander Nesterov
74574dfae4 Added optimization fuse 2019-03-05 18:12:03 -01:00
Alexander Alekhin
c3cf35ab63 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-26 17:34:42 +03:00
Alexander Alekhin
ca4fd1e427 Merge pull request #13884 from dkurt:dnn_drop_ie_r1_r2 2019-02-22 11:21:43 +00:00
Dmitry Kurtaev
ed710eaa1c Make Inference Engine R3 as a minimal supported version 2019-02-21 09:32:26 +03:00
Dmitry Kurtaev
bfd663c281 Add a test for grouped deconvolution from ONNX 2019-02-21 08:54:35 +03:00
Alexander Alekhin
8bde6aea4b Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-19 19:49:13 +00:00
Dmitry Kurtaev
ca5976e3d4 Fix IE backend considering future changes. 2019-02-18 19:26:04 +03:00
Alexander Alekhin
665408e57f Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-01 13:17:32 +03:00
Alexander Alekhin
a42bbc9722 Merge pull request #13736 from dkurt:dnn_ie_future 2019-02-01 10:01:39 +00:00
Dmitry Kurtaev
c918ac298c Fix IE tests 2019-01-31 14:14:38 +03:00
Dmitry Kurtaev
ac262f5b5d Clone convolution layer weights only for fusion 2019-01-29 14:29:47 +03:00
Dmitry Kurtaev
ff775b2e54 Remove ASSERT_ANY_THROW checks fpr Myriad plugin and FP32 networks 2019-01-25 20:09:54 +03:00
Alexander Alekhin
631b246881 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-01-22 18:00:34 +00:00
Dmitry Kurtaev
f0ddf302b2 Move Inference Engine to new API 2019-01-17 14:28:48 +03:00
Alexander Alekhin
7e2ebecd52 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-01-10 12:29:41 +03:00
Dmitry Kurtaev
d0504c95f4 Add a text message for Convolution layer's input channels check 2019-01-09 13:10:19 +03:00
Alexander Alekhin
7fa7fa0226 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-21 08:33:39 +00:00
Dmitry Kurtaev
0d117312c9 DNN_TARGET_FPGA using Intel's Inference Engine 2018-11-19 11:41:43 +03:00
Alexander Alekhin
22dbcf98c5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-17 14:17:35 +00:00
Dmitry Kurtaev
b5c54e447c Extra hyperparameters for Intel's Inference Engine layers 2018-11-15 20:06:37 +03:00
WuZhiwen
6e3ea8b49d Merge pull request #12703 from wzw-intel:vkcom
* dnn: Add a Vulkan based backend

This commit adds a new backend "DNN_BACKEND_VKCOM" and a
new target "DNN_TARGET_VULKAN". VKCOM means vulkan based
computation library.

This backend uses Vulkan API and SPIR-V shaders to do
the inference computation for layers. The layer types
that implemented in DNN_BACKEND_VKCOM include:
Conv, Concat, ReLU, LRN, PriorBox, Softmax, MaxPooling,
AvePooling, Permute

This is just a beginning work for Vulkan in OpenCV DNN,
more layer types will be supported and performance
tuning is on the way.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/vulkan: Add FindVulkan.cmake to detect Vulkan SDK

In order to build dnn with Vulkan support, need installing
Vulkan SDK and setting environment variable "VULKAN_SDK" and
add "-DWITH_VULKAN=ON" to cmake command.

You can download Vulkan SDK from:
https://vulkan.lunarg.com/sdk/home#linux

For how to install, see
https://vulkan.lunarg.com/doc/sdk/latest/linux/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/windows/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/mac/getting_started.html
respectively for linux, windows and mac.

To run the vulkan backend, also need installing mesa driver.
On Ubuntu, use this command 'sudo apt-get install mesa-vulkan-drivers'

To test, use command '$BUILD_DIR/bin/opencv_test_dnn --gtest_filter=*VkCom*'

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/Vulkan: dynamically load Vulkan runtime

No compile-time dependency on Vulkan library.
If Vulkan runtime is unavailable, fallback to CPU path.

Use environment "OPENCL_VULKAN_RUNTIME" to specify path to your
own vulkan runtime library.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/Vulkan: Add a python script to compile GLSL shaders to SPIR-V shaders

The SPIR-V shaders are in format of text-based 32-bit hexadecimal
numbers, and inserted into .cpp files as unsigned int32 array.

* dnn/Vulkan: Put Vulkan headers into 3rdparty directory and some other fixes

Vulkan header files are copied from
https://github.com/KhronosGroup/Vulkan-Docs/tree/master/include/vulkan
to 3rdparty/include

Fix the Copyright declaration issue.

Refine OpenCVDetectVulkan.cmake

* dnn/Vulkan: Add vulkan backend tests into existing ones.

Also fixed some test failures.

- Don't use bool variable as uniform for shader
- Fix dispathed group number beyond max issue
- Bypass "group > 1" convolution. This should be support in future.

* dnn/Vulkan: Fix multiple initialization in one thread.
2018-10-29 17:51:26 +03:00
Dmitry Kurtaev
dc3406eed9 Fix Pooling and Convolution layers from Intel's Inference Engine 2018-10-15 16:40:28 +03:00
Alexander Alekhin
9d02d42afe dnn(ocl4dnn): don't use getUMat()
especially in CPU only processing
2018-10-05 15:24:51 +03:00
Dmitry Kurtaev
24ab751547 Merge pull request #12565 from dkurt:dnn_non_intel_gpu
* Remove isIntel check from deep learning layers

* Remove fp16->fp32 fallbacks where it's not necessary

* Fix Kernel::run to prevent localsize > globalsize
2018-09-26 16:27:00 +03:00
Lubov Batanina
43f889ae1f Merge pull request #12519 from l-bat:l-bat/onnx_parser
Support asymmetric padding in pooling layer (#12519)

* Add Inception_V1 support in ONNX

* Add asymmetric padding in OpenCL and Inference engine

* Refactoring
2018-09-17 20:26:17 +03:00
Dmitry Kurtaev
d486204a0d Merge pull request #12264 from dkurt:dnn_remove_forward_method
* Remove a forward method in dnn::Layer

* Add a test

* Fix tests

* Mark multiple dnn::Layer::finalize methods as deprecated

* Replace back dnn's inputBlobs to vector of pointers

* Remove Layer::forward_fallback from CV_OCL_RUN scopes
2018-09-06 13:26:47 +03:00
Dmitry Kurtaev
50bceea038 Include preprocessing nodes to object detection TensorFlow networks (#12211)
* Include preprocessing nodes to object detection TensorFlow networks

* Enable more fusion

* faster_rcnn_resnet50_coco_2018_01_28 test
2018-08-31 15:41:56 +03:00
Dmitry Kurtaev
3e027df583 Enable more deep learning tests using Intel's Inference Engine backend 2018-08-27 18:37:35 +03:00
Alexander Alekhin
d2e08a524e core: repair CV_Assert() messages
Multi-argument CV_Assert() is accessible via CV_Assert_N() (with malformed messages).
2018-08-15 17:43:10 +03:00
Dmitry Kurtaev
be08730cd6 MVN layer using Intel's Inference Engine backend 2018-08-02 17:49:03 +03:00
Maksim Shabunin
cbb1e867e5 More issues found by static analysis 2018-07-24 16:04:42 +03:00
Alexander Alekhin
ee743afebe dnn(ocl): don't use getUMat() for long live objects 2018-07-20 17:53:55 +03:00
Vadim Pisarevsky
523b6f32ba Merge pull request #11867 from dkurt:dnn_ie_layers 2018-07-06 13:13:20 +00:00
Dmitry Kurtaev
019c2f2115 Enable more deep learning tests 2018-07-05 14:23:15 +03:00
Alexander Alekhin
b09a4a98d4 opencv: Use cv::AutoBuffer<>::data() 2018-07-04 19:11:29 +03:00
Dmitry Kurtaev
2c291bc2fb Enable FastNeuralStyle and OpenFace networks with IE backend 2018-06-09 15:57:12 +03:00
rockzhan
1187a7fa34 Merge pull request #11649 from rockzhan:dnn_dw_prelu
dnn: Fix output mismatch when forward dnn model contain [depthwise conv(group=1) + bn + prelu]  (#11649)

* this can make sure [depthwise conv(group=1) + bn + prelu] output not shift

* add TEST to show the output mismatch in [DWconv+Prelu]

* fix typo

* change loading image to init cvMat directly

* build runtime model, without loading external model

* remove whitespace

* change way to create a cvmat

* add bias_term, add target output

* fix [dwconv + prelu] value mismatch when no optimizations

* fix Test error when change output channels

* add parametric test

* change num_output to group value

* change conv code and change test back
2018-06-07 13:45:54 +00:00
Vadim Pisarevsky
3cbd2e2764 Merge pull request #11650 from dkurt:dnn_default_backend 2018-06-06 09:30:39 +00:00
Dmitry Kurtaev
b781ac7346 Make Intel's Inference Engine backend is default if no preferable backend is specified. 2018-06-04 18:31:46 +03:00
Kuang Fangjun
9ae28415ec fix doc. 2018-06-03 17:44:24 +08:00
Alexander Alekhin
44572fac44 Merge pull request #11557 from tomoaki0705:relaxIntelOnlyOCL4DNN 2018-05-29 15:25:22 +00:00
Tomoaki Teshima
2e9e71ab9e make ocl4dnn available to run on other platform than Intel GPU 2018-05-29 19:18:10 +09:00
Maksim Shabunin
895e10c317 dnn: fixed IE support on Windows 2018-05-23 12:46:14 +03:00
Li Peng
3dd916882a fp16 ocl support for googlenet
Signed-off-by: Li Peng <peng.li@intel.com>
2018-05-16 22:45:02 +08:00
Dmitry Kurtaev
c99c3e761e Fuse multipliers but not convolution layers weights 2018-05-10 19:24:38 +03:00
Dmitry Kurtaev
66ce8cd7ea Fix bugs found by valgrind 2018-04-17 17:53:51 +03:00
Dmitry Kurtaev
709cf5d038 OpenCL GPU target for Inference Engine deep learning backend
Enable FP16 GPU target for DL Inference Engine backend.
2018-04-09 17:21:35 +03:00
Alexander Alekhin
1060c0f439 dnn: apply CV_OVERRIDE/CV_FINAL 2018-03-28 18:43:27 +03:00
Alexander Alekhin
6c051a55e5 cmake: don't add include <module>/src directory to avoid conflicts
during opencv_world builds
2018-03-19 11:14:15 +03:00
Alexander Alekhin
5b868ccd82 Merge pull request #10992 from dkurt:dnn_opencl_tests 2018-03-09 10:06:40 +00:00
Dmitry Kurtaev
0f01b40dd5 Reset OpenCL kernels if batch size changes 2018-03-07 17:06:59 +03:00
Alexander Alekhin
514f4193db Merge pull request #10959 from alalek:cmake_ocl4dnn 2018-03-07 10:26:14 +00:00
Alexander Alekhin
1b83bc48a1 dnn: make OpenCL DNN code optional 2018-03-01 12:12:40 +03:00
Wu Zhiwen
ef937dd676 ocl4dnn: Fix SAME padding mode for convolve
Signed-off-by: Wu, Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-28 21:02:41 +08:00
Li Peng
608968aa83 Deconvolution ocl fix
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-23 18:31:30 +08:00
Li Peng
c524f669c7 Fallback for "SAME" padMode in ocl convolution and pooling
It fixes tensorflow ocl testcase of MobileNetSSD and Inception_v2_SSD

Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-22 21:17:59 +08:00
Li Peng
2863f950d6 ReLU6 layer ocl support
include relu6 ocl kernel and layer fusion support

Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-20 15:11:09 +08:00
Alexander Alekhin
cff79609c8 Merge pull request #10854 from pengli:dnn 2018-02-14 12:49:53 +00:00
Vadim Pisarevsky
6dfd7e3da2 Merge pull request #10850 from dkurt:dnn_tf_deconv_tests 2018-02-14 10:35:14 +00:00
Li Peng
5992c46606 add fallback case for ocl convolution
The ocl convolution doesn't support tensorflow padMode well.
Add fallback check if we meet this situation, it could fix the
tensorflow MobileNet SSD failure.

Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-14 00:04:38 +08:00
Dmitry Kurtaev
514e6df460 Refactored deep learning layers fusion 2018-02-13 14:35:58 +03:00
Dmitry Kurtaev
a6baedd02c Fix deconvolution layer. Add batch norm layer with mean-variance normalization from TensorFlow. 2018-02-13 11:00:27 +03:00
Dmitry Kurtaev
10e1de74d2 Intel Inference Engine deep learning backend (#10608)
* Intel Inference Engine deep learning backend.

* OpenFace network using Inference Engine backend
2018-02-06 11:57:35 +03:00
Li Peng
e15928b49e convolution and tanh layer fusion
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-25 17:45:33 +08:00
Li Peng
2124361ff7 ocl support for Deconvolution layer
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-18 23:40:22 +08:00
Dmitry Kurtaev
1f4fdfd599 Untrainable version of Scale layer from Caffe 2018-01-13 10:35:29 +03:00
Dmitry Kurtaev
64a9e92390 Merge pull request #10466 from dkurt:reduce_umat_try_2
* UMat blobs are wrapped

* Replace getUMat and getMat at OpenCLBackendWrapper
2018-01-10 21:50:54 +03:00
Alexander Alekhin
7d67d60fb1 cmake(opt): AVX512_SKX 2017-12-29 07:18:11 +00:00