Commit Graph

878 Commits

Author SHA1 Message Date
Wu Zhiwen
34e9d1eb3c dnn/Vulkan: support log softmax
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2018-10-31 09:47:38 +08:00
Wu Zhiwen
3914c17b0d dnn/Vulkan: Refine error handle mechanism
Fallback to OPENCV backend and CPU target if catch exception from
vkcom backend.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2018-10-31 09:47:33 +08:00
Wu Zhiwen
7fff245f87 dnn/Vulkan: Rename function_list.inl
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2018-10-30 08:29:43 +08:00
WuZhiwen
6e3ea8b49d Merge pull request #12703 from wzw-intel:vkcom
* dnn: Add a Vulkan based backend

This commit adds a new backend "DNN_BACKEND_VKCOM" and a
new target "DNN_TARGET_VULKAN". VKCOM means vulkan based
computation library.

This backend uses Vulkan API and SPIR-V shaders to do
the inference computation for layers. The layer types
that implemented in DNN_BACKEND_VKCOM include:
Conv, Concat, ReLU, LRN, PriorBox, Softmax, MaxPooling,
AvePooling, Permute

This is just a beginning work for Vulkan in OpenCV DNN,
more layer types will be supported and performance
tuning is on the way.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/vulkan: Add FindVulkan.cmake to detect Vulkan SDK

In order to build dnn with Vulkan support, need installing
Vulkan SDK and setting environment variable "VULKAN_SDK" and
add "-DWITH_VULKAN=ON" to cmake command.

You can download Vulkan SDK from:
https://vulkan.lunarg.com/sdk/home#linux

For how to install, see
https://vulkan.lunarg.com/doc/sdk/latest/linux/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/windows/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/mac/getting_started.html
respectively for linux, windows and mac.

To run the vulkan backend, also need installing mesa driver.
On Ubuntu, use this command 'sudo apt-get install mesa-vulkan-drivers'

To test, use command '$BUILD_DIR/bin/opencv_test_dnn --gtest_filter=*VkCom*'

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/Vulkan: dynamically load Vulkan runtime

No compile-time dependency on Vulkan library.
If Vulkan runtime is unavailable, fallback to CPU path.

Use environment "OPENCL_VULKAN_RUNTIME" to specify path to your
own vulkan runtime library.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/Vulkan: Add a python script to compile GLSL shaders to SPIR-V shaders

The SPIR-V shaders are in format of text-based 32-bit hexadecimal
numbers, and inserted into .cpp files as unsigned int32 array.

* dnn/Vulkan: Put Vulkan headers into 3rdparty directory and some other fixes

Vulkan header files are copied from
https://github.com/KhronosGroup/Vulkan-Docs/tree/master/include/vulkan
to 3rdparty/include

Fix the Copyright declaration issue.

Refine OpenCVDetectVulkan.cmake

* dnn/Vulkan: Add vulkan backend tests into existing ones.

Also fixed some test failures.

- Don't use bool variable as uniform for shader
- Fix dispathed group number beyond max issue
- Bypass "group > 1" convolution. This should be support in future.

* dnn/Vulkan: Fix multiple initialization in one thread.
2018-10-29 17:51:26 +03:00
Alexander Alekhin
50bec53afc Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-10-26 17:56:55 +03:00
Maksim Shabunin
0ccd810738 Fixed several issues found by static analysis 2018-10-25 10:45:59 +03:00
Antonio Borondo
7a3cb2280b Recognize ConvolutionDepthwise as Convolution 2018-10-24 08:37:51 +01:00
Alexander Alekhin
9c23f2f1a6 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-10-20 11:37:54 +00:00
Dmitry Kurtaev
365451dab0 Implement getBatchSize for Intel's Inference Engine networks 2018-10-17 14:02:37 +03:00
Alexander Alekhin
edacd91a27 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-10-15 20:15:42 +00:00
Alexander Alekhin
113793fee7 Merge pull request #12837 from dkurt:dnn_fix_ie 2018-10-15 19:17:18 +00:00
Alexander Alekhin
f8a27d2603 Merge pull request #12775 from radomsak:radomsak_dnn_fix_caffe_importer_reused_layers 2018-10-15 14:44:23 +00:00
Dmitry Kurtaev
dc3406eed9 Fix Pooling and Convolution layers from Intel's Inference Engine 2018-10-15 16:40:28 +03:00
Adam Radomski
cc3ec5d453 Fix dnn caffe importer extract blobs from reused layers 2018-10-10 10:44:56 +02:00
Alexander Alekhin
dada5a422d Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-10-09 21:20:15 +00:00
Lubov Batanina
50811e04f2 Merge pull request #12596 from l-bat:l-bat/shufflenet_onnx
* Add Shufflenet support in ONNX

* Add test for transpose layer
2018-10-08 22:18:41 +03:00
Alexander Alekhin
26ba4f3c1d Merge pull request #12754 from alalek:dnn_ocl4dnn_async_expressions 2018-10-08 15:22:24 +00:00
Alexander Alekhin
634dd656d5 dnn: don't use Mat expressions with async UMat functions 2018-10-05 17:09:50 +03:00
Alexander Alekhin
9d02d42afe dnn(ocl4dnn): don't use getUMat()
especially in CPU only processing
2018-10-05 15:24:51 +03:00
Alexander Alekhin
eec468fa13 dnn(ocl4dnn): calculate activation expression once
- to avoid multiple conditional calls via sub_group() functions
2018-10-02 21:23:41 +00:00
Alexander Alekhin
690fb0544c Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-10-02 14:31:05 +03:00
Alexander Alekhin
0f031b6680 dnn(ocl4dnn): drop weights_buf
- avoid memory access violation during "prefetch" stage
2018-09-30 20:35:41 +00:00
Alexander Alekhin
a8b0db4e5d Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-09-28 14:14:47 +03:00
Alexander Alekhin
fae329a0ca
Merge pull request #12650 from alalek:dnn_ocl4dnn_verification_test
* dnn(ocl4dnn): update kernel checks

* dnn: workaround for IDLF kernels on Intel iGPU

* dnn(test): remove "skip" check for unstable cases
2018-09-27 12:54:23 +03:00
Dmitry Kurtaev
24ab751547 Merge pull request #12565 from dkurt:dnn_non_intel_gpu
* Remove isIntel check from deep learning layers

* Remove fp16->fp32 fallbacks where it's not necessary

* Fix Kernel::run to prevent localsize > globalsize
2018-09-26 16:27:00 +03:00
Dmitry Kurtaev
c8f3579f93 Fix #12542 (#12603)
* Fix #12542

* Remove ignore of non-virtual-dtor error
2018-09-26 16:08:51 +03:00
Alexander Alekhin
3eec8fd0eb dnn: fix printf format warning 2018-09-26 14:06:04 +03:00
Dmitry Kurtaev
f8398d80bc add Net::getUnconnectedOutLayersNames method 2018-09-25 18:10:45 +03:00
Maksim Shabunin
e0f524d3b7 Fixed several incorrect printf format specifiers 2018-09-24 11:31:40 +03:00
Alexander Alekhin
861415133e Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-09-19 10:58:43 +03:00
Dmitry Kurtaev
8ac7b21716 Enable Myriad device for OpenVINO models test 2018-09-18 13:49:24 +03:00
Alexander Alekhin
e6171d17f8 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-09-18 12:49:52 +03:00
Alexander Alekhin
27a4e370f9 Merge pull request #12559 from dkurt:dnn_remove_usrtype1 2018-09-17 18:13:29 +00:00
Lubov Batanina
43f889ae1f Merge pull request #12519 from l-bat:l-bat/onnx_parser
Support asymmetric padding in pooling layer (#12519)

* Add Inception_V1 support in ONNX

* Add asymmetric padding in OpenCL and Inference engine

* Refactoring
2018-09-17 20:26:17 +03:00
Dmitry Kurtaev
7d75526373 Use TorchType enum 2018-09-17 18:55:05 +03:00
Dmitry Kurtaev
a7b3d2581f Replace CV_USRTYPE1 for int64 to CV_32SC2 in Torch importer 2018-09-17 12:31:09 +03:00
Alexander Alekhin
808ba552c5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-09-14 23:44:35 +00:00
Alexander Alekhin
dbfeb8892d Merge pull request #12403 from dkurt:dnn_replace_darknet_reorg 2018-09-13 20:58:11 +00:00
George Mironov
cb5da8983f Rename tensorflow namespace 2018-09-12 21:33:11 +03:00
Dmitry Kurtaev
09fa758725 Replace Darknet's Reorg to permute layer 2018-09-12 18:13:39 +03:00
Vadim Pisarevsky
f4b9acb4db Merge pull request #12497 from tomoaki0705:removeRawSSE 2018-09-12 11:59:44 +00:00
Marat K
38f8fc6c82 Merge pull request #12249 from kopytjuk:feature/region-layer-batch-mode
Feature/region layer batch mode (#12249)

* Add batch mode for Darknet networks.

Swap variables in test_darknet.

Adapt reorg layer to batch mode.

Adapt region layer.

Add OpenCL implementation.

Remove trailing whitespace.

Bugifx reorg opencl implementation.

Fix bug in OpenCL reorg.

Fix modulo bug.

Fix bug.

Reorg openCL.

Restore reorg layer opencl code.

OpenCl fix.

Work on openCL reorg.

Remove whitespace.

Fix openCL region layer implementation.

Fix bug.

Fix softmax region opencl bug.

Fix opencl bug.

Fix openCL bug.

Update aff_trans.cpp

When the fullAffine parameter is set to false, the estimateRigidTransform function maybe return empty, then the _localAffineEstimate function will be called, but the bug in it will result in incorrect results.

core(libva): support YV12 too

Added to CPU path only.
OpenCL code path still expects NV12 only (according to Intel OpenCL extension)

cmake: allow to specify own libva paths

via CMake:
- `-DVA_LIBRARIES=/opt/intel/mediasdk/lib64/libva.so.2\;/opt/intel/mediasdk/lib64/libva-drm.so.2`

android: NDK17 support

tested with NDK 17b (17.1.4828580)

Enable more deep learning tests using Intel's Inference Engine backend

ts: don't pass NULL for std::string() constructor

openvino: use 2018R3 defines

experimental version++

OpenCV version++

OpenCV 3.4.3

OpenCV version '-openvino'

openvino: use 2018R3 defines

Fixed windows build with InferenceEngine

dnn: fix variance setting bug for PriorBoxLayer

- The size of second channel should be size[2] of output tensor,
- The Scalar should be {variance[0], variance[0], variance[0], variance[0]}
  for _variance.size() == 1 case.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

Fix lifetime of networks which are loaded from Model Optimizer IRs

Adds a small note describing BUILD_opencv_world (#12332)

* Added a mall note describing BUILD_opencv_world cmake option to the Installation in Windows tutorial.

* Made slight changes in BUILD_opencv_world documentation.

* Update windows_install.markdown

improved grammar

Update opengl_interop.cpp

resolves #12307

java: fix LIST_GET macro

fix typo

Added option to fail on missing testdata

Fixed that object_detection.py does not work in python3.

cleanup: IPP Async (IPP_A)

except header file with conversion routines (will be removed in OpenCV 4.0)

imgcodecs: add null pointer check

Include preprocessing nodes to object detection TensorFlow networks (#12211)

* Include preprocessing nodes to object detection TensorFlow networks

* Enable more fusion

* faster_rcnn_resnet50_coco_2018_01_28 test

countNonZero function reworked to use wide universal intrinsics instead of SSE2 intrinsics

resolve #5788

imgcodecs(webp): multiple fixes

- don't reallocate passed 'img' (test fixed - must use IMREAD_UNCHANGED / IMREAD_ANYCOLOR)
- avoid memory DDOS
- avoid reading of whole file during header processing
- avoid data access after allocated buffer during header processing (missing checks)
- use WebPFree() to free allocated buffers (libwebp >= 0.5.0)
- drop unused & undefined `.close()` method
- added checks for channels >= 5 in encoder

ml: fix adjusting K in KNearest (#12358)

dnn(perf): fix and merge Convolution tests

- OpenCL tests didn't run any OpenCL kernels
- use real configuration from existed models (the first 100 cases)
- batch size = 1

dnn(test): use dnnBackendsAndTargets() param generator

Bit-exact resize reworked to use wide intrinsics (#12038)

* Bit-exact resize reworked to use wide intrinsics

* Reworked bit-exact resize row data loading

* Added bit-exact resize row data loaders for SIMD256 and SIMD512

* Fixed type punned pointer dereferencing warning

* Reworked loading of source data for SIMD256 and SIMD512 bit-exact resize

Bit-exact GaussianBlur reworked to use wide intrinsics (#12073)

* Bit-exact GaussianBlur reworked to use wide intrinsics

* Added v_mul_hi universal intrinsic

* Removed custom SSE2 branch from bit-exact GaussianBlur

* Removed loop unrolling for gaussianBlur horizontal smoothing

doc: fix English gramma in tutorial out-of-focus-deblur filter (#12214)

* doc: fix English gramma in tutorial out-of-focus-deblur filter

* Update out_of_focus_deblur_filter.markdown

slightly modified one sentence

doc: add new tutorial motion deblur filter (#12215)

* doc: add new tutorial motion deblur filter

* Update motion_deblur_filter.markdown

a few minor changes

Replace Slice layer to Crop in Faster-RCNN networks from Caffe

js: use generated list of OpenCV headers

- replaces hand-written list

imgcodecs(webp): use safe cast to size_t on Win32

* Put Version status back to -dev.

follow the common codestyle

Exclude some target engines.

Refactor formulas.

Refactor code.

* Remove unused variable.

* Remove inference engine check for yolov2.

* Alter darknet batch tests to test with two different images.

* Add yolov3 second image GT.

* Fix bug.

* Fix bug.

* Add second test.

* Remove comment.

* Add NMS on network level.

* Add helper files to dev.

* syntax fix.

* Fix OD sample.

Fix sample dnn object detection.

Fix NMS boxes bug.

remove trailing whitespace.

Remove debug function.

Change thresholds for opencl tests.

* Adapt score diff and iou diff.

* Alter iouDiffs.

* Add debug messages.

* Adapt iouDiff.

* Fix tests
2018-09-12 13:29:43 +03:00
Hamdi Sahloul
10ae0c4364 Merge pull request #12486 from cv3d:fix_cpp11
Support MSVC 2013 (#12486)

* Added CV_CONSTEXPR macro

* Utilize CV_NOEXCEPT and CV_CONSTEXPR

* Provides some Ptr<> logical operators
2018-09-11 22:35:03 +03:00
Tomoaki Teshima
88b04c3cd4 remove raw SSE2 implementation 2018-09-11 21:28:18 +09:00
Lubov Batanina
0c8590027f Merge pull request #12071 from l-bat/l-bat:onnx_parser
* Add Squeezenet support in ONNX

* Add AlexNet support in ONNX

* Add Googlenet support in ONNX

* Add CaffeNet and RCNN support in ONNX

* Add VGG16 and VGG16 with batch normalization support in ONNX

* Add RCNN, ZFNet, ResNet18v1 and ResNet50v1 support in ONNX

* Add ResNet101_DUC_HDC

* Add Tiny Yolov2

* Add CNN_MNIST, MobileNetv2 and LResNet100 support in ONNX

* Add ONNX models for emotion recognition

* Add DenseNet121 support in ONNX

* Add Inception v1 support in ONNX

* Refactoring

* Fix tests

* Fix tests

* Skip unstable test

* Modify Reshape operation
2018-09-10 21:07:51 +03:00
Vadim Pisarevsky
b01f63835e Merge pull request #12467 from alalek:core_use_shared_ptr 2018-09-10 13:59:14 +00:00
Alexander Alekhin
dca657a2fd Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-09-10 00:10:21 +03:00
Alexander Alekhin
df8b057b44 avoid Ptr<> == NULL checks 2018-09-09 19:30:46 +00:00
Hamdi Sahloul
a39e0daacf Utilize CV_UNUSED macro 2018-09-07 20:33:52 +09:00
Alexander Alekhin
73bfe68821 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-09-07 12:40:27 +03:00
Dmitry Kurtaev
d486204a0d Merge pull request #12264 from dkurt:dnn_remove_forward_method
* Remove a forward method in dnn::Layer

* Add a test

* Fix tests

* Mark multiple dnn::Layer::finalize methods as deprecated

* Replace back dnn's inputBlobs to vector of pointers

* Remove Layer::forward_fallback from CV_OCL_RUN scopes
2018-09-06 13:26:47 +03:00
Alexander Alekhin
d74b98c3d9 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-09-04 18:39:03 +00:00
Dmitry Kurtaev
27a6be8763 Fix #12407 2018-09-04 17:48:52 +03:00
Alexander Alekhin
f10fd64630 dnn: update "guard" inline namespace
- differ from 3.4 branch
2018-09-03 20:46:57 +00:00
Dmitry Kurtaev
c7cf8fb35c Import SSDs from TensorFlow by training config (#12188)
* Remove TensorFlow and protobuf dependencies from object detection scripts

* Create text graphs for TensorFlow object detection networks from sample
2018-09-03 17:08:40 +03:00
Wu Zhiwen
a11d944f51 dnn: Remove a duplicated code snippet for flatten layer
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2018-09-03 10:57:33 +08:00
Vadim Pisarevsky
f9c8bb40b1 Merge pull request #12350 from dkurt:dnn_ie_caffe_faster_rcnn 2018-08-31 14:57:14 +00:00
Dmitry Kurtaev
50bceea038 Include preprocessing nodes to object detection TensorFlow networks (#12211)
* Include preprocessing nodes to object detection TensorFlow networks

* Enable more fusion

* faster_rcnn_resnet50_coco_2018_01_28 test
2018-08-31 15:41:56 +03:00
Alexander Alekhin
15e57d28f5 Merge pull request #12293 from alalek:cleanup_stl_string_replacement 2018-08-30 15:43:57 +00:00
Dmitry Kurtaev
ea43e28a37 Replace Slice layer to Crop in Faster-RCNN networks from Caffe 2018-08-30 17:57:08 +03:00
Alexander Alekhin
596a0125ed Merge pull request #12336 from dkurt:dnn_ie_fix_net_lifetime 2018-08-30 11:09:18 +00:00
Wu Zhiwen
ca51bbb7ff dnn: fix variance setting bug for PriorBoxLayer
- The size of second channel should be size[2] of output tensor,
- The Scalar should be {variance[0], variance[0], variance[0], variance[0]}
  for _variance.size() == 1 case.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2018-08-30 11:05:38 +08:00
Dmitry Kurtaev
4062ef5fcb Fix lifetime of networks which are loaded from Model Optimizer IRs 2018-08-29 13:34:26 +03:00
Dmitry Kurtaev
3e027df583 Enable more deep learning tests using Intel's Inference Engine backend 2018-08-27 18:37:35 +03:00
Alexander Alekhin
7f73b105ca core: std::string more changes 2018-08-27 15:41:01 +03:00
Dmitry Kurtaev
472b71ecef Merge pull request #12243 from dkurt:dnn_tf_mask_rcnn
* Support Mask-RCNN from TensorFlow

* Fix a sample
2018-08-24 14:47:32 +03:00
Alexander Alekhin
096366738b dnn(build): fix CV_Assert() usage 2018-08-22 16:04:40 +03:00
Alexander Alekhin
c9faa09d55 Merge pull request #12266 from mshabunin:fix-windows-ie-build 2018-08-21 13:07:44 +00:00
Maksim Shabunin
808c89adc1 Fixed windows build with InferenceEngine 2018-08-21 14:59:13 +03:00
Alexander Alekhin
d2e08a524e core: repair CV_Assert() messages
Multi-argument CV_Assert() is accessible via CV_Assert_N() (with malformed messages).
2018-08-15 17:43:10 +03:00
Alexander Alekhin
b9b66ca437 Merge pull request #12205 from dkurt:dnn_update_tf_face_detection 2018-08-14 10:53:12 +00:00
Dmitry Kurtaev
f056c0f137 UINT8 face detection network using Intel's Inference Engine backend 2018-08-13 18:38:47 +03:00
Alexander Alekhin
615883977f Merge pull request #12128 from dkurt:dnn_fix_12066 2018-08-10 14:14:16 +00:00
Vadim Pisarevsky
7c8ab271fc Merge pull request #12125 from dkurt:dnn_mobilenet_ppn 2018-08-06 14:40:50 +00:00
Vadim Pisarevsky
70b893333d Merge pull request #12130 from dkurt:dnn_ie_mvn 2018-08-06 14:37:46 +00:00
Dmitry Kurtaev
449696f1e5 Enable reshape-as-shape layer from TensorFlow 2018-08-06 17:35:06 +03:00
Vadim Pisarevsky
e0c93bcf6c Merge pull request #12082 from dkurt:dnn_ie_faster_rcnn 2018-08-06 14:28:58 +00:00
Alexander Alekhin
ac4a6aad15 Merge pull request #12050 from alalek:dnn_ocl_avoid_memory_access_violation 2018-08-05 14:47:01 +00:00
Dmitry Kurtaev
be08730cd6 MVN layer using Intel's Inference Engine backend 2018-08-02 17:49:03 +03:00
Dmitry Kurtaev
4fb086d6c3 MobileNet-SSD v1 from TensorFlow with shared convolution weights 2018-08-01 16:16:48 +03:00
Dmitry Kurtaev
8e034053af Faster-RCNN from TensorFlow on CPU with Intel's Inference Engine backend 2018-08-01 11:29:58 +03:00
Alexander Alekhin
814ebe39ae Merge pull request #12113 from dkurt:dnn_fix_ssd_on_myriad 2018-07-31 14:55:18 +00:00
Maksim Shabunin
7cf52de47e dnn: modified IE search, R2 compatibility fixed 2018-07-31 14:48:06 +03:00
Dmitry Kurtaev
ed0e79cb61 Add missing parameter to DetectionOutput layer from Intel's Inference Engine 2018-07-31 11:37:45 +03:00
Maksim Shabunin
fb1f12021b Fixed build with latest IE version 2018-07-27 19:56:35 +03:00
Alexander Alekhin
b597c87bed dnn(ocl): avoid memory access violation 2018-07-27 15:35:11 +03:00
Alexander Alekhin
9137e2d635 Merge pull request #12060 from alalek:dnn_debug_layers 2018-07-26 15:14:32 +00:00
Alexander Alekhin
c37d1a53b5 Merge pull request #12025 from Triplesalt:tfimport-relu 2018-07-26 15:08:05 +00:00
Triplesalt
9eb79926df Allow a different input order for Mul+Maximum.
Squashed : ReLU operand order tests.
2018-07-26 14:19:11 +02:00
Vadim Pisarevsky
fa466b022d Merge pull request #12052 from dkurt:dnn_ie_torch_tests 2018-07-26 09:09:35 +00:00
Dmitry Kurtaev
faa6c4e1e1 Faster-RCNN anf RFCN models on CPU using Intel's Inference Engine backend.
Enable Torch layers tests with Intel's Inference Engine backend.
2018-07-25 19:04:55 +03:00
Alexander Alekhin
45b5b3c13a dnn: check layer output for NaN/Inf 2018-07-25 16:25:18 +03:00
Maksim Shabunin
cbb1e867e5 More issues found by static analysis 2018-07-24 16:04:42 +03:00
Alexander Alekhin
8de08e0463 Merge pull request #12021 from dkurt:dnn_ie_tf_ssd 2018-07-24 13:03:41 +00:00
Alexander Alekhin
236f383969 Merge pull request #12037 from dkurt:test_openvino_models 2018-07-24 12:34:04 +00:00
Dmitry Kurtaev
28e08ae0bd Add a sample which tests OpenVINO models 2018-07-23 19:08:51 +03:00
Maksim Shabunin
e0603bb45f Fixed several issues found by static analysis tools 2018-07-23 17:22:47 +03:00
Alexander Alekhin
ee743afebe dnn(ocl): don't use getUMat() for long live objects 2018-07-20 17:53:55 +03:00
Maksim Shabunin
a4060e15a4 dnn, IE backend: updated to match new interface 2018-07-19 19:22:23 +03:00
Dmitry Kurtaev
c213a3823e Run entire SSDs from TensorFlow using Intel's Inference Engine 2018-07-19 17:05:56 +03:00
Dmitry Kurtaev
070393dfda uint8 inputs for deep learning networks 2018-07-19 14:37:33 +03:00
Alexander Alekhin
6c4f618db5 Merge pull request #11104 from asciian:reading_from_stream 2018-07-17 16:24:06 +00:00
Maksim Shabunin
1da46fe6fb Fixed issues found by static analysis (mostly DBZ) 2018-07-17 16:14:54 +03:00
Alexander Alekhin
78d07e841d Merge pull request #11959 from pengli:3.4 2018-07-17 11:20:02 +00:00
Li Peng
f0cadaa6e3 enable concat layer fuse for OCL target
Signed-off-by: Li Peng <peng.li@intel.com>
2018-07-17 12:46:16 +08:00
Alexander Alekhin
c9439476da Merge pull request #11970 from dkurt:dnn_enable_tf_tests 2018-07-16 15:51:27 +00:00
Alexander Alekhin
d6c669f5cf Merge pull request #11963 from dkurt:dnn_cl_fix_matmul 2018-07-16 11:10:32 +00:00
Dmitry Kurtaev
6eb8faea85 Enable TensorFlow networks tests for different backends and targets 2018-07-13 19:58:56 +03:00
Dmitry Kurtaev
de6f0a537d Fix fully-connected layer in case of number of rows less than 4 2018-07-13 16:35:37 +03:00
Dmitry Kurtaev
dcc1beb1f8 Clip kernel for OpenCL PriorBox layer 2018-07-13 14:49:13 +03:00
Alexander Alekhin
2508f7f971 dnn(ocl): fix wrong usage of stalled .getMat() pointers
Temporary object lifetime must be greater than pointer usage.
2018-07-11 19:11:36 +03:00
Dmitry Kurtaev
8b5f061dae Replace std::vector<char> to std::vector<uchar> for Java bindings of dnn importers 2018-07-11 18:58:56 +03:00
Alexander Alekhin
999aba3807 Merge pull request #11936 from berak:dnn_shufflelayer_name 2018-07-11 12:01:31 +00:00
Li Peng
4c5a86828a Fix gemmlike convolution input reading
use vload3 for half3 or float3 input vector reading,
also check read position to see if it exceed input width

Signed-off-by: Li Peng <peng.li@intel.com>
2018-07-11 15:25:21 +08:00
berak
a7b502f04a dnn: preserve name, type strings for ShuffleLayer 2018-07-11 08:19:23 +02:00
Dmitry Kurtaev
d57e5406f0 Add readNet* functions which parse models from byte arrays 2018-07-10 11:12:01 +03:00
Alexander Alekhin
7fe0727930 Merge pull request #11924 from alalek:dnn_ocl_fix_max_pool_forward 2018-07-09 16:25:34 +00:00
Alexander Alekhin
b6255ab9e7 dnn(ocl4dnn): fix args for 'max_pool_forward' kernel 2018-07-09 18:02:20 +03:00
Alexander Alekhin
e2b5d11290 dnn: allow to use external protobuf
"custom layers" feature will not work properly in these builds.
2018-07-09 17:28:45 +03:00
Dmitry Kurtaev
362d4f5395 Replace convertFp16 from dnn::Net::setInput() 2018-07-09 14:35:54 +03:00
asciian
61d8719b8d Reading net from std::ifstream
Remove some assertions

Replace std::ifstream to std::istream

Add test for new importer

Remove constructor to load file

Rename cfgStream and darknetModelStream to ifile

Add error notification to inform pathname to user

Use FileStorage instead of std::istream

Use FileNode instead of FileStorage

Fix typo
2018-07-09 10:02:05 +03:00
Vadim Pisarevsky
523b6f32ba Merge pull request #11867 from dkurt:dnn_ie_layers 2018-07-06 13:13:20 +00:00
Dmitry Kurtaev
019c2f2115 Enable more deep learning tests 2018-07-05 14:23:15 +03:00
Alexander Alekhin
0bb2c115aa Merge pull request #11719 from alalek:update_autobuffer_api 2018-07-05 10:01:15 +00:00
Alexander Alekhin
ccd2370bb7 Merge pull request #11890 from dkurt:keras_resize_nearest 2018-07-05 09:57:24 +00:00
Alexander Alekhin
b09a4a98d4 opencv: Use cv::AutoBuffer<>::data() 2018-07-04 19:11:29 +03:00
Dmitry Kurtaev
f25a01bb5a Disable fusion to output layers 2018-07-04 15:53:47 +03:00
Dmitry Kurtaev
36288eebe7 Nearest neighbor resize from Keras 2018-07-04 11:53:24 +03:00
Dmitry Kurtaev
7ed5d85f25 Add Reshape layer tests 2018-07-03 08:26:43 +03:00
Alexander Alekhin
9be3f7d41a Merge pull request #11854 from dkurt:dnn_tf_data_layouts_v2 2018-06-29 15:02:22 +00:00
Alexander Alekhin
f40231af5d Merge pull request #11851 from pengli:3.4 2018-06-29 15:01:20 +00:00
Li Peng
145eae321e pooling ocl kernel optimization
set global size with real output size, also optimize

max pooling index computation if necessary.

Signed-off-by: Li Peng <peng.li@intel.com>
2018-06-29 15:22:49 +08:00
Dmitry Kurtaev
d971678add Add a planar data layout tracking for TensorFlow importer 2018-06-29 09:50:14 +03:00
Dmitry Kurtaev
346871e27f Set output layers names and types for models in DLDT's intermediate representation 2018-06-28 10:21:45 +03:00
Dmitry Kurtaev
dbeb4a11be Parse strides and convolution kernel shapes considering data layout 2018-06-26 16:18:21 +03:00
Vadim Pisarevsky
e87425f047 Merge pull request #11835 from dkurt:dnn_tf_two_inputs 2018-06-26 12:12:24 +00:00
Dmitry Kurtaev
9510551c63 Multiple inputs for TensorFlow models 2018-06-26 14:03:59 +03:00
Vadim Pisarevsky
b80c7bca0d Merge pull request #11826 from dkurt:dnn_tf_data_layouts 2018-06-26 06:36:27 +00:00
Dmitry Kurtaev
715f40a48d Use layers consumers to predict data layout 2018-06-25 18:25:40 +03:00
Li, Peng
ab8022f74e update convolution opencl kernels in dnn module (#11762)
* optimize ocl kernel enqueue in fc layer

Signed-off-by: Li Peng <peng.li@intel.com>

* use CV_LOG_INFO in convolution auto tuning

Signed-off-by: Li Peng <peng.li@intel.com>

* update convolution IDLF kernel

extend parameter tuning range, also cleanup
ocl kernel implementation

Signed-off-by: Li Peng <peng.li@intel.com>

* update in-memory convolution cache config

fp16 and fp32 cache config are stored separately

Signed-off-by: Li Peng <peng.li@intel.com>
2018-06-25 17:06:18 +03:00
Dmitry Kurtaev
e8e9d1d021 Implement Interp layer using Resize layer 2018-06-22 19:26:47 +03:00
Alexander Alekhin
1894f1a37f Merge pull request #11773 from alalek:dnn_ocl_update_force_tuning_flag 2018-06-22 05:23:55 +00:00
Alexander Alekhin
50c607d206 dnn(ocl): fix external / predefined builtin configuration behavior
OPENCV_OCL4DNN_FORCE_AUTO_TUNING should ignore existed configuration from:
- builtin predefined configurations (for Intel OpenCL iGPUs)
- external configuration (via OPENCV_OCL4DNN_CONFIG_PATH)

Prefer external configuration over builtin.
2018-06-21 20:59:03 +03:00
Dmitry Kurtaev
4626246087 Add ShuffleChannel layer 2018-06-21 19:10:42 +03:00
Dmitry Kurtaev
40b85c1cd9 Remove undocumented feature to retreive layers outputs by indices 2018-06-20 14:44:21 +03:00
Alexander Alekhin
30d4e0261a Merge pull request #11766 from dkurt:dnn_darknet_avgpool_softmax 2018-06-14 13:18:30 +00:00
Dmitry Kurtaev
bd87eb6e66 Import average pooling and softmax layers from Darknet 2018-06-14 15:22:08 +03:00
Dmitry Kurtaev
693a7663e7 Import ClipByValue from Keras 2018-06-14 13:30:30 +03:00
Alexander Alekhin
5fd7cfbcad dnn: add runtime parameter OPENCV_DNN_BACKEND_DEFAULT
to control DNN_BACKEND_DEFAULT enumeration value behavior
2018-06-13 19:00:04 +03:00
Alexander Alekhin
f040282bf8 Merge pull request #11739 from dkurt:more_ie_models 2018-06-13 13:26:50 +00:00
Dmitry Kurtaev
7d727ac2fb Fuse top layers to batch normalization 2018-06-09 18:06:53 +03:00
Dmitry Kurtaev
2c291bc2fb Enable FastNeuralStyle and OpenFace networks with IE backend 2018-06-09 15:57:12 +03:00
rockzhan
1187a7fa34 Merge pull request #11649 from rockzhan:dnn_dw_prelu
dnn: Fix output mismatch when forward dnn model contain [depthwise conv(group=1) + bn + prelu]  (#11649)

* this can make sure [depthwise conv(group=1) + bn + prelu] output not shift

* add TEST to show the output mismatch in [DWconv+Prelu]

* fix typo

* change loading image to init cvMat directly

* build runtime model, without loading external model

* remove whitespace

* change way to create a cvmat

* add bias_term, add target output

* fix [dwconv + prelu] value mismatch when no optimizations

* fix Test error when change output channels

* add parametric test

* change num_output to group value

* change conv code and change test back
2018-06-07 13:45:54 +00:00
David
7175f257b5 Added ResizeBilinear op for tf (#11050)
* Added ResizeBilinear op for tf

Combined ResizeNearestNeighbor and ResizeBilinear layers into Resize (with an interpolation param).

Minor changes to tf_importer and resize layer to save some code lines

Minor changes in init.cpp

Minor changes in tf_importer.cpp

* Replaced implementation of a custom ResizeBilinear layer to all layers

* Use Mat::ptr. Replace interpolation flags
2018-06-07 16:29:04 +03:00
Dmitry Kurtaev
f3a6ae5f00 Wrap Inference Engine init to try-catch 2018-06-07 12:55:52 +03:00
Vadim Pisarevsky
3cbd2e2764 Merge pull request #11650 from dkurt:dnn_default_backend 2018-06-06 09:30:39 +00:00
Dmitry Kurtaev
b781ac7346 Make Intel's Inference Engine backend is default if no preferable backend is specified. 2018-06-04 18:31:46 +03:00
Vadim Pisarevsky
055f33ec46 Merge pull request #11657 from dkurt:dnn_ie_multiple_networks 2018-06-04 10:12:46 +00:00
Kuang Fangjun
9ae28415ec fix doc. 2018-06-03 17:44:24 +08:00
Dmitry Kurtaev
ab389142af Fix multiple networks with Intel's Inference Engine backend 2018-06-01 14:10:32 +03:00
Alexander Alekhin
da75e463a8
Merge pull request #11639 from alalek:fix_precomp_hpp 2018-05-31 16:35:21 +00:00
Alexander Alekhin
799b4f48e7 fix missing precomp.hpp 2018-05-31 16:53:44 +03:00
Dmitry Kurtaev
32bab45f81 Fix Inference Engine graphs with fused output layers 2018-05-31 16:21:08 +03:00
Vadim Pisarevsky
c58cc4c2ff Merge pull request #11255 from dkurt:dnn_tf_faster_rcnn 2018-05-31 11:07:39 +00:00
Dmitry Kurtaev
f96f934426 Update Intel's Inference Engine deep learning backend (#11587)
* Update Intel's Inference Engine deep learning backend

* Remove cpu_extension dependency

* Update Darknet accuracy tests
2018-05-31 14:05:21 +03:00
Dmitry Kurtaev
bf87a43185 Faster-RCNN object detection models from TensorFlow 2018-05-30 17:12:36 +03:00
Alexander Alekhin
44572fac44 Merge pull request #11557 from tomoaki0705:relaxIntelOnlyOCL4DNN 2018-05-29 15:25:22 +00:00
Tomoaki Teshima
2e9e71ab9e make ocl4dnn available to run on other platform than Intel GPU 2018-05-29 19:18:10 +09:00
Dmitry Kurtaev
085be6a445 Fix dilated convolution from Keras 2018-05-29 12:15:47 +03:00
Dmitry Kurtaev
2c3c59d018 Remove Shift deep learning layer 2018-05-28 18:18:56 +03:00
Alexander Alekhin
3654fb10d7 Merge pull request #11567 from alalek:code_quality 2018-05-23 15:47:11 +00:00
Maksim Shabunin
895e10c317 dnn: fixed IE support on Windows 2018-05-23 12:46:14 +03:00
Alexander Alekhin
471c17321f improve code quality
- eliminate rand() calls
- non initialized members/ variables
- unused return values
- missing/useless NULL checks
2018-05-22 17:08:31 +03:00
Maksim Shabunin
53a68783a5 dnn: support later IE versions 2018-05-22 15:18:18 +03:00
Alexander Alekhin
085b27fc3d Merge pull request #11390 from dkurt:east_text_detection 2018-05-21 13:02:29 +00:00
Dmitry Kurtaev
07dc6d2b45 Return a convex hull from rotatedRectangleIntersection 2018-05-18 14:20:17 +03:00
Alexander Alekhin
d6279bfff8 fix build warnings 2018-05-17 18:29:21 +03:00
Li Peng
ba5e8befa9 fp16 ocl support for more layers
Signed-off-by: Li Peng <peng.li@intel.com>
2018-05-16 22:45:04 +08:00
Li Peng
3dd916882a fp16 ocl support for googlenet
Signed-off-by: Li Peng <peng.li@intel.com>
2018-05-16 22:45:02 +08:00
Li Peng
329abb5b64 dnn fp16 support
Signed-off-by: Li Peng <peng.li@intel.com>
2018-05-16 22:44:39 +08:00
Alexander Alekhin
bb8ff2c463 Merge pull request #11494 from tomoaki0705:fixOpenCLDnn 2018-05-16 14:11:36 +00:00
Tomoaki Teshima
3f5347dd7a work around of the test failure of opencv_test_dnn
* let OpenCL kernel run only on Intel GPU
  * brush up the workaround based on 9a2b028 from alalek
2018-05-16 19:23:19 +09:00
Dmitry Kurtaev
8488f2e265 EAST: An Efficient and Accurate Scene Text Detector (https://arxiv.org/abs/1704.03155v2) 2018-05-11 14:55:42 +03:00
Dmitry Kurtaev
c99c3e761e Fuse multipliers but not convolution layers weights 2018-05-10 19:24:38 +03:00
Dmitry Kurtaev
777d77848c Free Convolution and MatMul weights after TensorFlow layers import 2018-05-04 11:20:14 +03:00
Dmitry Kurtaev
9ffe4694db Reduce memory consumption at Caffe importer 2018-05-04 09:24:13 +03:00
zuoshaobo
4ff6a1bc7b Merge pull request #11425 from zuoshaobo:relu_negative_slope
* FIX INF_ENGINE RELU ERROR

* set slope to variable

* tab in indentwq
2018-05-03 13:36:49 +03:00
Alexander Alekhin
083b08742d Merge pull request #11406 from alalek:core_matsize_dims 2018-04-28 14:38:42 +00:00
Alexander Alekhin
65b0b319eb eliminate MSVS2017 build warning
modules\dnn\src\layers\prior_box_layer.cpp(208): warning C4834: discarding return value of function with 'nodiscard' attribute
2018-04-28 15:14:41 +03:00
Alexander Alekhin
8c349ff8ff core: added MatSize::dims() method
to avoid accessing of 'p[-1]' (static code analysers dislike this)
2018-04-27 16:57:29 +03:00
Alexander Alekhin
576d2dbac0 refactor: don't use CV_ErrorNoReturn() internally 2018-04-24 15:38:42 +03:00
Dmitry Kurtaev
4ec456f0a0 Custom layers for deep learning networks (#11129)
* Custom deep learning layers support

* Stack custom deep learning layers
2018-04-24 14:59:59 +03:00
Alexander Alekhin
29b4fd2774 Merge pull request #11351 from dkurt:dnn_enable_inf_engine_tests 2018-04-23 09:16:39 +00:00
Dmitry Kurtaev
d959d7b9f0 Fuse deconvolution layer subgraphs from Keras 2018-04-20 16:51:38 +03:00
Dmitry Kurtaev
bd77d100e1 Enable some tests for clDNN plugin from Intel's Inference Engine 2018-04-20 10:47:46 +03:00
Dmitry Kurtaev
3b4a292ca9 Let switch CPU/OpenCL targets for models from Intel's Model Optimizer 2018-04-19 10:23:57 +03:00
Vadim Pisarevsky
b290bdafb9 Merge pull request #11322 from dkurt:dnn_yolov3 2018-04-18 12:11:13 +00:00
Dmitry Kurtaev
66ce8cd7ea Fix bugs found by valgrind 2018-04-17 17:53:51 +03:00
Dmitry Kurtaev
97fec07d96 Support YOLOv3 model from Darknet 2018-04-16 18:44:12 +03:00
Alexander Alekhin
a2d6ee2d31 Merge pull request #11305 from tomoaki0705:typoNVIDIA 2018-04-13 12:56:42 +00:00
Tomoaki Teshima
a40354d16f use correct name for NVIDIA
* remove NVidia and Nvidia
  * replace Cuda with CUDA
  * keep the letters for API
2018-04-13 20:33:19 +09:00
Dmitry Kurtaev
b92c3182ab Blank and L2-normalization layers from Intel's Inference Engine 2018-04-12 15:21:08 +03:00
Vadim Pisarevsky
0b9d075958 Merge pull request #11295 from dkurt:dnn_repeated_conv_params 2018-04-11 15:25:24 +00:00
Vadim Pisarevsky
533bb89800 Merge pull request #11236 from dkurt:dnn_fuse_l2_norm 2018-04-11 15:09:55 +00:00
Vadim Pisarevsky
30175594e9 Merge pull request #11062 from dkurt:dnn_inf_engine_cldnn 2018-04-11 15:06:18 +00:00
Dmitry Kurtaev
512632e574 Parse repeated values of ConvolutionParameter 2018-04-11 14:38:05 +03:00
Dmitry Kurtaev
4ef6c91583 Fix multiple inputs for models from Intel's Model Optimizer 2018-04-11 13:28:07 +03:00
Dmitry Kurtaev
1ba72ca0d3 Fuse tf.nn.l2_normalize layer 2018-04-10 10:12:44 +03:00
Dmitry Kurtaev
709cf5d038 OpenCL GPU target for Inference Engine deep learning backend
Enable FP16 GPU target for DL Inference Engine backend.
2018-04-09 17:21:35 +03:00
Vladislav Sovrasov
0d9c63744e Add CPU default extensions loading in IE dnn backend (#11252)
* Add CPU default extensions loading in IE dnn backend

* Load cpu_extensions for the future Intel's Inference Engine
2018-04-09 16:22:19 +03:00
Dmitry Kurtaev
ef1aaf12c9 Fix Proposal deep learning layer 2018-04-04 14:48:29 +03:00
Dmitry Kurtaev
598039c0ed Fix embedded Torch's nn.ConcatTable 2018-03-31 11:11:10 +03:00
Alexander Alekhin
e8a67de0d2 Merge pull request #11182 from dkurt:fix_11102_part_2 2018-03-30 13:11:01 +00:00
Alexander Alekhin
1060c0f439 dnn: apply CV_OVERRIDE/CV_FINAL 2018-03-28 18:43:27 +03:00
Alexander Alekhin
167034fb04 Merge pull request #11098 from dkurt:dnn_native_inf_engine 2018-03-28 14:52:08 +00:00
Dmitry Kurtaev
e039fc3a63 Replace protobuf's ReleaseLast to RemoveLast to deallocate memory.
Change an order of PriorBox layer operations.
2018-03-28 17:27:36 +03:00
Dmitry Kurtaev
2f3a9ba1d4 Update OpenCVDetectInferenceEngine.cmake 2018-03-28 16:34:37 +03:00
Alexander Alekhin
9e0dee1259 Merge pull request #11112 from alalek:cmake_src_include_fix 2018-03-27 13:06:48 +00:00
Dmitry Kurtaev
7972f47ed4 Load networks from intermediate representation of Intel's Deep learning deployment toolkit. 2018-03-26 07:24:21 +03:00
Dmitry Kurtaev
e8fe6ee4e3 Fix prior box generation in case of squared proposals.
Fix batch norm in training phase.
2018-03-23 09:44:59 +03:00
Alexander Alekhin
6c051a55e5 cmake: don't add include <module>/src directory to avoid conflicts
during opencv_world builds
2018-03-19 11:14:15 +03:00
Dmitry Kurtaev
069f9add80 Fix an issue https://github.com/opencv/opencv/issues/11102 2018-03-18 10:49:12 +03:00
Alexander Alekhin
d68466bb6a Merge pull request #10940 from dkurt:dnn_tf_graph_optim 2018-03-14 14:36:25 +00:00
Alexander Alekhin
ab110c0ad1 Merge pull request #10979 from dkurt:unite_dnn_samples 2018-03-14 14:33:49 +00:00
Dmitry Kurtaev
538fd42363 Add test for Scalar arguments at CommandLineParser 2018-03-13 11:01:07 +03:00
Dmitry Kurtaev
ab20d2a3fc Update assertions in batch norm layer 2018-03-12 10:53:06 +03:00
Dmitry Kurtaev
69a8f110b6 Fuse subgraphs from Keras 2018-03-12 10:53:06 +03:00
Dmitry Kurtaev
9457bf10ab Fuse batch normalization and flatten TensorFlow subgraphs in runtime 2018-03-12 10:51:35 +03:00
Alexander Alekhin
5b868ccd82 Merge pull request #10992 from dkurt:dnn_opencl_tests 2018-03-09 10:06:40 +00:00
Dmitry Kurtaev
0f01b40dd5 Reset OpenCL kernels if batch size changes 2018-03-07 17:06:59 +03:00
Alexander Alekhin
514f4193db Merge pull request #10959 from alalek:cmake_ocl4dnn 2018-03-07 10:26:14 +00:00
Dmitry Kurtaev
e1c3237532 Parametric OpenCL deep learning tests 2018-03-05 20:53:18 +03:00
Dmitry Kurtaev
f2440ceae6 Update tutorials. A new cv::dnn::readNet function 2018-03-04 20:30:22 +03:00
Alexander Alekhin
fe97dc67dc Merge pull request #10962 from alalek:dnn_precomp_hpp 2018-03-02 11:38:16 +00:00
Alexander Alekhin
97c1f09961 Merge pull request #10955 from pengli:dnn 2018-03-02 11:35:59 +00:00
Alexander Alekhin
a9ebc61f2a dnn(workaround): switch to CPU target if compiled without OpenCL 2018-03-01 12:12:40 +03:00
Alexander Alekhin
1b83bc48a1 dnn: make OpenCL DNN code optional 2018-03-01 12:12:40 +03:00
Alexander Alekhin
a838a97092 dnn: fix precomp.hpp usage 2018-02-28 17:06:26 +03:00
Wu Zhiwen
ef937dd676 ocl4dnn: Fix SAME padding mode for convolve
Signed-off-by: Wu, Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-28 21:02:41 +08:00
Maksim Shabunin
7c855aa3e1 Fixed two issues found by static analysis 2018-02-26 00:16:02 +03:00
Li Peng
608968aa83 Deconvolution ocl fix
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-23 18:31:30 +08:00
Li, Peng
5caf6244a3 Merge pull request #10922 from pengli:dnn
* ave pooling ocl fix

support the padded area control in ave pooling

Signed-off-by: Li Peng <peng.li@intel.com>

* warning fix: ununitialized field
2018-02-22 21:01:12 +03:00
Maksim Shabunin
92e9d4ec3a Fixed several issues detected by static analysis 2018-02-22 17:11:33 +03:00
Vadim Pisarevsky
5e0f95b948 Merge pull request #9708 from dkurt:tf_face_detector 2018-02-22 12:04:26 +00:00
Li Peng
e7d35d51fa Fix for opencv face detector ocl test
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-22 23:37:54 +08:00
Li Peng
c524f669c7 Fallback for "SAME" padMode in ocl convolution and pooling
It fixes tensorflow ocl testcase of MobileNetSSD and Inception_v2_SSD

Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-22 21:17:59 +08:00
Dmitry Kurtaev
eab556e1e0 OpenCV face detection network in TensorFlow 2018-02-21 19:58:24 +03:00
Alexander Alekhin
53305d4a7e Merge pull request #10891 from pengli:dnn 2018-02-20 08:59:07 +00:00
Li Peng
2863f950d6 ReLU6 layer ocl support
include relu6 ocl kernel and layer fusion support

Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-20 15:11:09 +08:00
Dmitry Kurtaev
8b4871a28d Use only absolute prior boxes explicit sizes. Remove scales attributes. (#10874)
* Use only absolute prior boxes explicit sizes. Remove scales attributes.

* Simplified PriorBox layer forward pass
2018-02-19 17:25:18 +03:00
Alexander Alekhin
0e4eed0ba1 Merge pull request #10867 from dkurt:dnn_fix_ave_pooling_area 2018-02-16 11:17:32 +00:00
Alexander Alekhin
c020a7bb67 build: portable integer types 2018-02-15 23:43:02 +03:00
Dmitry Kurtaev
f8d0d6365e Add a flag to manage average pooling with padding 2018-02-14 16:56:31 +03:00
Alexander Alekhin
cff79609c8 Merge pull request #10854 from pengli:dnn 2018-02-14 12:49:53 +00:00
Vadim Pisarevsky
ef70b0baa4 Merge pull request #10865 from dkurt:dnn_inf_engine_getInputsInfo 2018-02-14 12:25:18 +00:00
Dmitry Kurtaev
a66b5e2c13 Add const getInputsInfo 2018-02-14 14:17:44 +03:00
Vadim Pisarevsky
6dfd7e3da2 Merge pull request #10850 from dkurt:dnn_tf_deconv_tests 2018-02-14 10:35:14 +00:00
Li Peng
5992c46606 add fallback case for ocl convolution
The ocl convolution doesn't support tensorflow padMode well.
Add fallback check if we meet this situation, it could fix the
tensorflow MobileNet SSD failure.

Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-14 00:04:38 +08:00
Li Peng
00d2f34888 ocl fix for detection_output and prior_box layer
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-13 23:09:14 +08:00
Dmitry Kurtaev
514e6df460 Refactored deep learning layers fusion 2018-02-13 14:35:58 +03:00
Dmitry Kurtaev
a6baedd02c Fix deconvolution layer. Add batch norm layer with mean-variance normalization from TensorFlow. 2018-02-13 11:00:27 +03:00
Alexander Alekhin
66f3c1ae79 Merge pull request #10843 from luzpaz:misc-modules-typos 2018-02-12 13:47:12 +00:00
Sui Libin
1ad814a191 fix faster_rcnn sample crashed at PoolingInvoker on Windows7(x64). (#10724)
* fix faster_rcnn sample crashed at PoolingInvoker operator() of pooling_layer.

* find_odj onmouse bug about find matched point status.

* reverted AutoBuffer back to std::vector
2018-02-12 16:07:56 +03:00
luz.paz
5718d09e39 Misc. modules/ typos
Found via `codespell`
2018-02-12 07:09:43 -05:00
Rémi Ratajczak
b67523550f dnn : Added an imagesFromBlob method to the dnn module (#10607)
* Added the imagesFromBlob method to the dnn module.

* Rewritten imagesFromBlob based on first dkurt comments

* Updated code with getPlane()

* Modify comment of imagesFromBlob() in dnn module

* modified comments, removed useless assertions & added OutputArrayOfArray

* replaced tabs with whitespaces & put vectorOfChannels instantiation outside the loop

* Changed pre-commit.sample to pre-commit in .git/hooks/

* Added a test for imagesFromBlob in test_misc.cpp (dnn)

* Changed nbOfImages, robustified test with cv::randu, modified assertion
2018-02-12 14:51:07 +03:00
Dmitry Kurtaev
7fe97376c2 MobileNet-SSD from TensorFlow 1.3 and Inception-V2-SSD using Inference Engine backend 2018-02-09 13:45:45 +03:00
Dmitry Kurtaev
ed94136548 OpenCV face detection network using Inference Engine backend 2018-02-06 17:53:24 +03:00
Alexander Alekhin
398ebbac98 Merge pull request #10795 from pengli:dnn 2018-02-06 10:04:29 +00:00
Li Peng
c43498c6ad check vector emptiness before access it
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-06 22:59:51 +08:00
Li Peng
389fa5d38e slice layer ocl update
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-06 22:59:47 +08:00
Dmitry Kurtaev
10e1de74d2 Intel Inference Engine deep learning backend (#10608)
* Intel Inference Engine deep learning backend.

* OpenFace network using Inference Engine backend
2018-02-06 11:57:35 +03:00
Maksim Shabunin
e56d6054aa Do not build protobuf without dnn (#10689)
* Do not build protobuf if dnn is disabled

* Added BUILD_LIST cmake option to the cache

* Moved protobuf to the top level

* Fixed static build

* Fixed world build

* fixup! Fixed world build
2018-02-01 16:30:23 +03:00
Vadim Pisarevsky
713ec7be45 Merge pull request #10746 from dkurt:dnn_batch_norm_from_nvidia_caffe 2018-02-01 13:22:09 +00:00
Alexander Alekhin
42569cfd61 Merge pull request #10748 from dkurt:fix_dnn_slice_layer 2018-02-01 13:21:17 +00:00
Alexander Alekhin
9d25bd583f Merge pull request #10754 from dkurt:dnn_ocl_gemv_min_globalsize 2018-02-01 12:39:27 +00:00
Dmitry Kurtaev
65a6674c6e ocl4dnnGEMV in case of row_size < 4 2018-02-01 14:06:47 +03:00
Alexander Alekhin
9698b93d10 Merge pull request #10717 from pengli:dnn 2018-02-01 10:49:54 +00:00
Li Peng
6aec71d7ee mvn layer ocl update
it fuse ocl kernels to reduce kernel enqueue

Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-01 17:48:12 +08:00
Li Peng
83b16ab7b7 fix extra spaces in build option
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-01 17:46:11 +08:00
Li Peng
54c81cbde4 eltwise layer SUM op update
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-01 17:46:06 +08:00
Dmitry Kurtaev
184862582c Fix slice layer from TensorFlow 2018-01-31 19:12:37 +03:00
Arjan van de Ven
a75840d19c Merge pull request #10468 from fenrus75:avx512-2
* Add a 512 bit codepath to the AVX512 fastConv function

this patch adds a 512 wide codepath to the fastConv() function for
AVX512 use.
The basic idea is to process the first N * 16 elements of the vector
with avx512, and then run the rest of the vector using the traditional
AVX2 codepath.

* dnn: use unaligned AVX512 load (OpenCV aligns data on 32-byte boundary)

* dnn: change "vecsize" condition for AVX512

* dnn: fix indentation
2018-01-31 16:34:12 +03:00
Alexander Alekhin
f06c44f1f1 Merge pull request #10701 from dkurt:tf_ave_pooling 2018-01-31 13:28:09 +00:00
Dmitry Kurtaev
844f1d0281 Fix Batch Normalization layer imported from NVIDIA Caffe. 2018-01-31 16:25:45 +03:00
Dmitry Kurtaev
a2e9bfbaf4 Fix padding for average pooling from TensorFlow 2018-01-31 15:54:30 +03:00
Li Peng
7a4c5e9421 slice layer ocl support
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-29 22:34:32 +08:00
Alexander Alekhin
2876670de3 dnn(ocl): fix build options for Apple OpenCL 2018-01-28 01:54:25 +00:00
Alexander Alekhin
104502c5be Merge pull request #10676 from dkurt:dnn_for_newer_mobilenet_ssd 2018-01-26 04:02:21 +00:00
Li Peng
2493083935 mvn, batch_norm and relu layer fusion
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-25 18:57:05 +08:00
Li Peng
e15928b49e convolution and tanh layer fusion
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-25 17:45:33 +08:00
Dmitry Kurtaev
9e9926a2f0 PriorBox layer with explicit normalized sizes 2018-01-24 14:01:42 +03:00
Dmitry Kurtaev
a3d74704e5 OpenCV face detection network test 2018-01-23 09:27:58 +03:00
Alexander Alekhin
26e0f408f0 Merge pull request #10639 from pengli:dnn 2018-01-19 10:01:41 +00:00
Li Peng
fe494297e4 more update on MVN layer ocl implementation
cut one ocl kernel if normVariance is disabled,
also use native_powr for performance reason.

Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-19 22:54:04 +08:00
Alexander Alekhin
c3569211d5 Merge pull request #10591 from drkoller:master 2018-01-19 09:44:21 +00:00
Li Peng
2124361ff7 ocl support for Deconvolution layer
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-18 23:40:22 +08:00
David Koller
d1a3b530be Make DNN Crop layer match Caffe default offset behavior
and add parametric unit test for crop layer.
2018-01-17 10:52:36 -05:00
Li Peng
e77af4ae33 MVN layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-17 17:11:32 +08:00
Li Peng
7bc017601f Power, Tanh and Channels ReLU layer ocl support
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-17 17:11:27 +08:00
Li Peng
4189214d04 batch_norm layer ocl update
use a batch_norm ocl kernel to do the work

Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-16 19:01:58 +08:00
Alexander Alekhin
1255bd8d4b Merge pull request #10585 from dkurt:dnn_weightless_scale 2018-01-15 06:07:50 +00:00
Dmitry Kurtaev
6a395d88ff dnn::blobFromImage with OutputArray 2018-01-13 18:20:24 +03:00
Dmitry Kurtaev
1f4fdfd599 Untrainable version of Scale layer from Caffe 2018-01-13 10:35:29 +03:00
Dmitry Kurtaev
64a9e92390 Merge pull request #10466 from dkurt:reduce_umat_try_2
* UMat blobs are wrapped

* Replace getUMat and getMat at OpenCLBackendWrapper
2018-01-10 21:50:54 +03:00
Alexander Alekhin
4d4f291553 Merge pull request #10513 from pengli:dnn 2018-01-09 19:24:28 +00:00
Li Peng
e3b42bf93b batch_norm and blank layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-09 21:58:46 +08:00
Alexander Alekhin
da0904df2d Merge pull request #10550 from dkurt:replace_psroi_pooling_tag 2018-01-08 19:19:00 +00:00
Dmitry Kurtaev
27b55ea761 Replace Caffe's psroi_pooling_param tag from 10001 to 10002 2018-01-08 13:29:20 +03:00
Alexander Alekhin
6674a024fc dnn: add OPENCV_DNN_DISABLE_MEMORY_OPTIMIZATIONS runtime option
replaces REUSE_DNN_MEMORY compile-time option
2018-01-07 18:38:14 +00:00
Arthur Williams
8a67858068 Fixed missing #include "../precomp.hpp" 2018-01-05 15:10:39 +00:00
Li Peng
67f9406cbe add normalize_bbox layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-05 19:38:36 +08:00
Li Peng
f99a135eda add eltwise layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-05 19:38:30 +08:00
Li Peng
34bfd7ef51 add ocl implementation of proposal layer
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-04 18:40:51 +08:00
Alexander Alekhin
7d67d60fb1 cmake(opt): AVX512_SKX 2017-12-29 07:18:11 +00:00
Alexander Alekhin
8e7af7f089 Merge pull request #10456 from dkurt:dnn_allocate_mem_for_optimized_concat 2017-12-28 16:04:51 +00:00
Alexander Alekhin
a65b5df5da Merge pull request #10416 from fenrus75:avx512 2017-12-28 15:56:56 +00:00
Alexander Alekhin
898ca38257 cmake: AVX512 -> AVX_512F 2017-12-28 15:20:27 +00:00
Dmitry Kurtaev
a9807d8f54 Allocate new memory for optimized concat to prevent collisions.
Add a flag to disable memory reusing in dnn module.
2017-12-28 16:45:53 +03:00
Li Peng
00f03c5739 Add ocl version FasterRCNN accuracy test
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-28 19:15:15 +08:00
Alexander Alekhin
99a9c10b57 Merge pull request #10424 from dkurt:fix_concat_optim 2017-12-28 01:26:14 +00:00
Alexander Alekhin
f3880c60a6 Merge pull request #10428 from pengli:dnn 2017-12-27 13:18:10 +00:00
Arjan van de Ven
2938860b3f Provide a few AVX512 optimized functions for the DNN module
This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.

AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2017-12-26 16:00:17 +00:00
Dmitry Kurtaev
70c605a03d Limit Concat layer optimization 2017-12-26 16:49:33 +03:00
Li Peng
84e2fa79a0 dnn(ocl4dnn): update pre-tuned kernel config
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-26 20:14:41 +08:00
Alexander Alekhin
adf43e7d2a build: fix MSVS2010 build error 2017-12-23 00:06:34 +00:00
Alexander Alekhin
019b7c5a66 Merge pull request #10402 from dkurt:dnn_tf_quantized 2017-12-22 15:58:56 +00:00
Alexander Alekhin
59e825ee02 Merge pull request #10385 from pengli:dnn 2017-12-22 15:48:40 +00:00
Dmitry Kurtaev
bcc669f3f7 TensorFlow weights dequantization 2017-12-22 17:25:10 +03:00
Alexander Alekhin
97af608030 Merge pull request #10397 from mshabunin:fix-incorrect-assert 2017-12-22 14:07:02 +00:00
Li Peng
181b448c4d add one more convolution kernel tuning candidate
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-22 21:37:00 +08:00
Vadim Pisarevsky
0742e12f0b Merge pull request #10265 from dkurt:nms_for_region_layer 2017-12-22 13:29:37 +00:00
Maksim Shabunin
aa46e31c6d Replaced incorrect CV_Assert calls with CV_Error 2017-12-22 15:20:13 +03:00
Vadim Pisarevsky
325cbd7c84 Merge pull request #10364 from dkurt:dnn_smooth_tf_data_layout 2017-12-22 09:56:45 +00:00
Li Peng
c5fc8e03ff cleanup unnecessary macros in convolution ocl kernel
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-21 20:32:36 +08:00
Li Peng
0aa5e43a14 refactor candidate generation of convolution auto-tuning
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-21 23:05:54 +08:00
Dmitry Kurtaev
c67e75b68f Refactor NMS procedure at RegionLayer 2017-12-21 12:21:45 +03:00
Dmitry Kurtaev
7e48fa58eb Manage TensorFlow's NHWC data layout is smoother 2017-12-20 14:13:40 +03:00
Dmitry Kurtaev
0ed2cbc931 R-FCN models support 2017-12-20 10:43:22 +03:00
Alexander Alekhin
dcdd6af5a8 Merge pull request #10341 from pengli:dnn 2017-12-19 14:04:55 +00:00
Li Peng
436d7e4eaf add depthwise convolution kernel
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 17:59:13 +08:00
Li Peng
910d7dab1f prior box layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 17:44:10 +08:00
Dmitry Kurtaev
6aabd6cc7a Remove cv::dnn::Importer 2017-12-18 18:08:28 +03:00
Dmitry Kurtaev
2b43d4f477 Fix default pooling layer type 2017-12-17 16:46:40 +03:00
Maksim Shabunin
1033f2b1bd Fixed 3 issues found by static analysis 2017-12-15 17:29:26 +03:00
Vadim Pisarevsky
62359f70ff Merge pull request #10306 from dkurt:faster_rcnn 2017-12-15 12:23:53 +00:00
Dmitry Kurtaev
08112f3821 Faster-RCNN models support 2017-12-15 12:16:21 +03:00
Alexander Alekhin
0da947e6b3 dnn: more debug information 2017-12-14 19:21:17 +03:00
Alexander Alekhin
c231472ad6 Merge pull request #10290 from tomoaki0705:fixVS2012Round 2017-12-13 15:30:21 +00:00
Tomoaki Teshima
ecb6bcf2e0 fix build error on Visual Studio 2012
* round doesn't exists in standard library of Visual Studio 2012
  * apply the correct computation of ROI
2017-12-13 17:40:07 +03:00
Vitaly Tuzov
51cb56ef2c Implementation of bit-exact resize. Internal calls to linear resize updated to use bit-exact version. (#9468) 2017-12-13 15:00:38 +03:00
Alexander Alekhin
eff42f6387 dnn: more debug info 2017-12-12 12:04:10 +03:00
Vadim Pisarevsky
7e680bd9ff Merge pull request #10215 from dkurt:dnn_js 2017-12-11 12:47:52 +00:00
Vadim Pisarevsky
c24f10d647 Merge pull request #10268 from dkurt:fix_scale_layer 2017-12-08 18:46:50 +00:00
Dmitry Kurtaev
f503515082 JavaScript bindings for dnn module 2017-12-08 18:33:48 +03:00
Dmitry Kurtaev
e307065c8e Scale layer in case of 2D inputs 2017-12-08 17:34:59 +03:00
Alexander Alekhin
f2070c9f5d Merge pull request #10255 from dkurt:dnn_roi_pooling 2017-12-08 11:20:07 +00:00
Dmitry Kurtaev
17dcf0e82d ROIPooling layer 2017-12-07 19:04:38 +03:00
Dmitry Kurtaev
ef0650179b Fix conv/deconv/fc layers FLOPS computation 2017-12-07 11:42:04 +03:00
Alexander Alekhin
6074f92d48 Merge pull request #10228 from pengli:dnn_new 2017-12-06 15:50:12 +00:00
Li Peng
59cbaca4d3 detection_output layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-06 22:35:59 +08:00
Li Peng
66feea6cac region layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
7707c9bfba reorg layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
85b1c4060c support axis in concat layer ocl path
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
07bec6bdcd reshape layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:40 +08:00
Li Peng
7b7033ac60 permute layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-05 22:10:05 +08:00
Dmitry Kurtaev
bbbec300a6 nn.BatchNormalization and nn.Dropout layers from Torch 2017-12-04 12:57:21 +03:00
Alexander Alekhin
cc2ee923e4 Merge pull request #10164 from pengli:dnn 2017-11-29 12:05:10 +00:00
Wu Zhiwen
1f465a0ef9 dnn(ocl4dnn): fuseLayer() use umat_input/outputBlobs for OpenCL target
Also, fix bug when use OPENCL target but no OpenCL runtime

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-27 22:25:53 +08:00
Dmitry Kurtaev
99ed085752 Update PriorBox layer 2017-11-27 16:47:20 +03:00
Alexander Alekhin
13f374660f dnn(ocl4dnn): drop unused batch_size_ in pooling 2017-11-23 20:46:56 +00:00
Alexander Alekhin
e34b64c979 dnn(ocl4dnn): refactor pooling OpenCL calls 2017-11-23 20:46:44 +00:00
Alexander Alekhin
f071a48ec7 Merge pull request #10143 from pengli:ocl4dnn 2017-11-23 18:47:14 +00:00
Alexander Alekhin
107582c767 Merge pull request #9996 from dkurt:dnn_multiple_inputs 2017-11-23 18:22:37 +00:00
Li Peng
636d6368ee use OutputArrayOfArrays in net forward interface
It allows umat buffers used in net forward interface

Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-24 02:19:10 +08:00
Wu, Zhiwen
04edc8fe3a cleanup ocl4dnn spatial convolution kernels
remove unused macros and half definition macros,
also remove unused ocl::Queue

Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-24 02:19:10 +08:00
Alexander Alekhin
49a5280198 Merge pull request #10139 from alalek:dnn_rename_caffe_proto_package 2017-11-23 14:10:42 +00:00
Alexander Alekhin
f37f4cf3b4 Merge pull request #9994 from r2d3:dnn_memory_load 2017-11-22 18:15:00 +00:00
Alexander Alekhin
e7d62d6ef3 Merge pull request #10126 from alalek:dnn_issue_10125 2017-11-22 18:03:51 +00:00
Alexander Alekhin
1c88a566e0 dnn: rename caffe protobuf package 2017-11-22 18:34:07 +03:00
Alexander Alekhin
9db5cbf9a4 dnn: sync output/internals blobs back 2017-11-22 14:00:58 +03:00
Vadim Pisarevsky
f8ad289311 Merge pull request #10092 from alalek:dnn_rename_caffe_proto 2017-11-22 08:16:20 +00:00
Alexander Alekhin
0f34628af7 dnn: drop OpenCL code path for DetectionOutputLayer
getUMat()/getMat() calls are scope based. Results of these calls can't be
stored somewhere for future usage.
2017-11-21 17:28:42 +03:00
Alexander Alekhin
438e456ce9 Merge pull request #10113 from wzw-intel:fusion 2017-11-20 18:13:33 +00:00
Alexander Alekhin
f6d927ef3b dnn: avoid conflicts with original caffe.proto
rename caffe.proto => opencv-caffe.proto
2017-11-20 19:04:00 +03:00
David Geldreich
f723cede2e add loading TensorFlow/Caffe net from memory buffer
add a corresponding test
2017-11-20 16:28:22 +01:00
Dmitry Kurtaev
6c5dd5cf6d Replace caffe::NormalizedBBox to local structure 2017-11-20 18:03:31 +03:00
Wu Zhiwen
45d11dde57 dnn(ocl4dnn): add fusion support for Power activation and eltwise add
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-20 14:58:53 +08:00
Wu Zhiwen
394101d6ed dnn(ocl4dnn): Fix relu fusion bug
Incorrect type of negative_slope result in this bug.

Also and OCL test for darknet to validate this patch.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-17 16:21:56 +08:00
Wu Zhiwen
88e6daa315 dnn(ocl4dnn): Fix wrong measurement for tuning time
convolution kernel use default queue to run, so that ocl::Timer
, to measure the kernel run time, should use the default queue too.
Also remove useless parameter for convolve()

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-16 13:09:57 +08:00
Li Peng
55260a8d3c reshape mat before doing computation in fc layer
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-13 09:29:50 +08:00
Alexander Alekhin
bafdc44d37 Merge pull request #10061 from Sahloul:dnn_torch_fix 2017-11-10 05:05:52 +00:00
Alexander Alekhin
8a3a75cc16 Merge pull request #9882 from pengli:ocl4dnn 2017-11-09 18:54:43 +00:00
Hamdi Sahloul
06bda58a2c DNN Torch - workaround when torch importer is disabled 2017-11-10 00:44:06 +09:00
Li Peng
8f99083726 Add new layer forward interface
Add layer forward interface with InputArrayOfArrays and
OutputArrayOfArrays parameters, it allows UMat buffer to be
processed and transferred in the layers.

Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-09 15:59:39 +08:00
Alexander Alekhin
97181a90ba dnn(ocl4dnn/conv): bailout on missing kernel configuration 2017-11-07 17:02:17 +03:00
Alexander Alekhin
6e4f9433d0 Merge pull request #9998 from alalek:ocl_fix_dnn_softmax_9991 2017-11-03 09:16:39 +00:00
Dmitry Kurtaev
20a2dc6ac5 Fix multiple inputs models from Caffe.
Fixed Concat optimization.
2017-11-02 18:55:08 +03:00
Alexander Alekhin
bacc96f4e8 dnn(ocl): fix softmax global/local size consistency 2017-11-02 17:08:40 +03:00
Dmitry Kurtaev
14af2a0c0c Fixed Halide's copy_to_device invocation 2017-11-01 14:01:54 +03:00
Vadim Pisarevsky
bc348eb8ab Merge pull request #9963 from dkurt:fix_caffe_shrinker 2017-10-31 12:27:19 +00:00