Alexander Alekhin
596a0125ed
Merge pull request #12336 from dkurt:dnn_ie_fix_net_lifetime
2018-08-30 11:09:18 +00:00
Wu Zhiwen
ca51bbb7ff
dnn: fix variance setting bug for PriorBoxLayer
...
- The size of second channel should be size[2] of output tensor,
- The Scalar should be {variance[0], variance[0], variance[0], variance[0]}
for _variance.size() == 1 case.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2018-08-30 11:05:38 +08:00
Dmitry Kurtaev
4062ef5fcb
Fix lifetime of networks which are loaded from Model Optimizer IRs
2018-08-29 13:34:26 +03:00
Dmitry Kurtaev
3e027df583
Enable more deep learning tests using Intel's Inference Engine backend
2018-08-27 18:37:35 +03:00
Alexander Alekhin
7f73b105ca
core: std::string more changes
2018-08-27 15:41:01 +03:00
Dmitry Kurtaev
472b71ecef
Merge pull request #12243 from dkurt:dnn_tf_mask_rcnn
...
* Support Mask-RCNN from TensorFlow
* Fix a sample
2018-08-24 14:47:32 +03:00
Alexander Alekhin
096366738b
dnn(build): fix CV_Assert() usage
2018-08-22 16:04:40 +03:00
Alexander Alekhin
c9faa09d55
Merge pull request #12266 from mshabunin:fix-windows-ie-build
2018-08-21 13:07:44 +00:00
Maksim Shabunin
808c89adc1
Fixed windows build with InferenceEngine
2018-08-21 14:59:13 +03:00
Alexander Alekhin
d2e08a524e
core: repair CV_Assert() messages
...
Multi-argument CV_Assert() is accessible via CV_Assert_N() (with malformed messages).
2018-08-15 17:43:10 +03:00
Alexander Alekhin
b9b66ca437
Merge pull request #12205 from dkurt:dnn_update_tf_face_detection
2018-08-14 10:53:12 +00:00
Dmitry Kurtaev
f056c0f137
UINT8 face detection network using Intel's Inference Engine backend
2018-08-13 18:38:47 +03:00
Alexander Alekhin
615883977f
Merge pull request #12128 from dkurt:dnn_fix_12066
2018-08-10 14:14:16 +00:00
Vadim Pisarevsky
7c8ab271fc
Merge pull request #12125 from dkurt:dnn_mobilenet_ppn
2018-08-06 14:40:50 +00:00
Vadim Pisarevsky
70b893333d
Merge pull request #12130 from dkurt:dnn_ie_mvn
2018-08-06 14:37:46 +00:00
Dmitry Kurtaev
449696f1e5
Enable reshape-as-shape layer from TensorFlow
2018-08-06 17:35:06 +03:00
Vadim Pisarevsky
e0c93bcf6c
Merge pull request #12082 from dkurt:dnn_ie_faster_rcnn
2018-08-06 14:28:58 +00:00
Alexander Alekhin
ac4a6aad15
Merge pull request #12050 from alalek:dnn_ocl_avoid_memory_access_violation
2018-08-05 14:47:01 +00:00
Dmitry Kurtaev
be08730cd6
MVN layer using Intel's Inference Engine backend
2018-08-02 17:49:03 +03:00
Dmitry Kurtaev
4fb086d6c3
MobileNet-SSD v1 from TensorFlow with shared convolution weights
2018-08-01 16:16:48 +03:00
Dmitry Kurtaev
8e034053af
Faster-RCNN from TensorFlow on CPU with Intel's Inference Engine backend
2018-08-01 11:29:58 +03:00
Alexander Alekhin
814ebe39ae
Merge pull request #12113 from dkurt:dnn_fix_ssd_on_myriad
2018-07-31 14:55:18 +00:00
Maksim Shabunin
7cf52de47e
dnn: modified IE search, R2 compatibility fixed
2018-07-31 14:48:06 +03:00
Dmitry Kurtaev
ed0e79cb61
Add missing parameter to DetectionOutput layer from Intel's Inference Engine
2018-07-31 11:37:45 +03:00
Maksim Shabunin
fb1f12021b
Fixed build with latest IE version
2018-07-27 19:56:35 +03:00
Alexander Alekhin
b597c87bed
dnn(ocl): avoid memory access violation
2018-07-27 15:35:11 +03:00
Alexander Alekhin
9137e2d635
Merge pull request #12060 from alalek:dnn_debug_layers
2018-07-26 15:14:32 +00:00
Alexander Alekhin
c37d1a53b5
Merge pull request #12025 from Triplesalt:tfimport-relu
2018-07-26 15:08:05 +00:00
Triplesalt
9eb79926df
Allow a different input order for Mul+Maximum.
...
Squashed : ReLU operand order tests.
2018-07-26 14:19:11 +02:00
Vadim Pisarevsky
fa466b022d
Merge pull request #12052 from dkurt:dnn_ie_torch_tests
2018-07-26 09:09:35 +00:00
Dmitry Kurtaev
faa6c4e1e1
Faster-RCNN and RFCN models on CPU using Intel's Inference Engine backend.
...
Enable Torch layers tests with Intel's Inference Engine backend.
2018-07-25 19:04:55 +03:00
Alexander Alekhin
45b5b3c13a
dnn: check layer output for NaN/Inf
2018-07-25 16:25:18 +03:00
Maksim Shabunin
cbb1e867e5
More issues found by static analysis
2018-07-24 16:04:42 +03:00
Alexander Alekhin
8de08e0463
Merge pull request #12021 from dkurt:dnn_ie_tf_ssd
2018-07-24 13:03:41 +00:00
Alexander Alekhin
236f383969
Merge pull request #12037 from dkurt:test_openvino_models
2018-07-24 12:34:04 +00:00
Dmitry Kurtaev
28e08ae0bd
Add a sample which tests OpenVINO models
2018-07-23 19:08:51 +03:00
Maksim Shabunin
e0603bb45f
Fixed several issues found by static analysis tools
2018-07-23 17:22:47 +03:00
Alexander Alekhin
ee743afebe
dnn(ocl): don't use getUMat() for long-lived objects
2018-07-20 17:53:55 +03:00
Maksim Shabunin
a4060e15a4
dnn, IE backend: updated to match new interface
2018-07-19 19:22:23 +03:00
Dmitry Kurtaev
c213a3823e
Run entire SSDs from TensorFlow using Intel's Inference Engine
2018-07-19 17:05:56 +03:00
Dmitry Kurtaev
070393dfda
uint8 inputs for deep learning networks
2018-07-19 14:37:33 +03:00
Alexander Alekhin
6c4f618db5
Merge pull request #11104 from asciian:reading_from_stream
2018-07-17 16:24:06 +00:00
Maksim Shabunin
1da46fe6fb
Fixed issues found by static analysis (mostly DBZ)
2018-07-17 16:14:54 +03:00
Alexander Alekhin
78d07e841d
Merge pull request #11959 from pengli:3.4
2018-07-17 11:20:02 +00:00
Li Peng
f0cadaa6e3
enable concat layer fuse for OCL target
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-07-17 12:46:16 +08:00
Alexander Alekhin
c9439476da
Merge pull request #11970 from dkurt:dnn_enable_tf_tests
2018-07-16 15:51:27 +00:00
Alexander Alekhin
d6c669f5cf
Merge pull request #11963 from dkurt:dnn_cl_fix_matmul
2018-07-16 11:10:32 +00:00
Dmitry Kurtaev
6eb8faea85
Enable TensorFlow networks tests for different backends and targets
2018-07-13 19:58:56 +03:00
Dmitry Kurtaev
de6f0a537d
Fix fully-connected layer in case of number of rows less than 4
2018-07-13 16:35:37 +03:00
Dmitry Kurtaev
dcc1beb1f8
Clip kernel for OpenCL PriorBox layer
2018-07-13 14:49:13 +03:00
Alexander Alekhin
2508f7f971
dnn(ocl): fix wrong usage of stale .getMat() pointers
...
Temporary object lifetime must be greater than pointer usage.
2018-07-11 19:11:36 +03:00
Dmitry Kurtaev
8b5f061dae
Replace std::vector<char> with std::vector<uchar> for Java bindings of dnn importers
2018-07-11 18:58:56 +03:00
Alexander Alekhin
999aba3807
Merge pull request #11936 from berak:dnn_shufflelayer_name
2018-07-11 12:01:31 +00:00
Li Peng
4c5a86828a
Fix gemmlike convolution input reading
...
use vload3 for half3 or float3 input vector reading,
also check read position to see if it exceeds input width
Signed-off-by: Li Peng <peng.li@intel.com>
2018-07-11 15:25:21 +08:00
berak
a7b502f04a
dnn: preserve name, type strings for ShuffleLayer
2018-07-11 08:19:23 +02:00
Dmitry Kurtaev
d57e5406f0
Add readNet* functions which parse models from byte arrays
2018-07-10 11:12:01 +03:00
Alexander Alekhin
7fe0727930
Merge pull request #11924 from alalek:dnn_ocl_fix_max_pool_forward
2018-07-09 16:25:34 +00:00
Alexander Alekhin
b6255ab9e7
dnn(ocl4dnn): fix args for 'max_pool_forward' kernel
2018-07-09 18:02:20 +03:00
Alexander Alekhin
e2b5d11290
dnn: allow to use external protobuf
...
"custom layers" feature will not work properly in these builds.
2018-07-09 17:28:45 +03:00
Dmitry Kurtaev
362d4f5395
Replace convertFp16 from dnn::Net::setInput()
2018-07-09 14:35:54 +03:00
asciian
61d8719b8d
Reading net from std::ifstream
...
Remove some assertions
Replace std::ifstream with std::istream
Add test for new importer
Remove constructor to load file
Rename cfgStream and darknetModelStream to ifile
Add error notification to inform pathname to user
Use FileStorage instead of std::istream
Use FileNode instead of FileStorage
Fix typo
2018-07-09 10:02:05 +03:00
Vadim Pisarevsky
523b6f32ba
Merge pull request #11867 from dkurt:dnn_ie_layers
2018-07-06 13:13:20 +00:00
Dmitry Kurtaev
019c2f2115
Enable more deep learning tests
2018-07-05 14:23:15 +03:00
Alexander Alekhin
0bb2c115aa
Merge pull request #11719 from alalek:update_autobuffer_api
2018-07-05 10:01:15 +00:00
Alexander Alekhin
ccd2370bb7
Merge pull request #11890 from dkurt:keras_resize_nearest
2018-07-05 09:57:24 +00:00
Alexander Alekhin
b09a4a98d4
opencv: Use cv::AutoBuffer<>::data()
2018-07-04 19:11:29 +03:00
Dmitry Kurtaev
f25a01bb5a
Disable fusion to output layers
2018-07-04 15:53:47 +03:00
Dmitry Kurtaev
36288eebe7
Nearest neighbor resize from Keras
2018-07-04 11:53:24 +03:00
Dmitry Kurtaev
7ed5d85f25
Add Reshape layer tests
2018-07-03 08:26:43 +03:00
Alexander Alekhin
9be3f7d41a
Merge pull request #11854 from dkurt:dnn_tf_data_layouts_v2
2018-06-29 15:02:22 +00:00
Alexander Alekhin
f40231af5d
Merge pull request #11851 from pengli:3.4
2018-06-29 15:01:20 +00:00
Li Peng
145eae321e
pooling ocl kernel optimization
...
set global size with real output size, also optimize
max pooling index computation if necessary.
Signed-off-by: Li Peng <peng.li@intel.com>
2018-06-29 15:22:49 +08:00
Dmitry Kurtaev
d971678add
Add a planar data layout tracking for TensorFlow importer
2018-06-29 09:50:14 +03:00
Dmitry Kurtaev
346871e27f
Set output layers names and types for models in DLDT's intermediate representation
2018-06-28 10:21:45 +03:00
Dmitry Kurtaev
dbeb4a11be
Parse strides and convolution kernel shapes considering data layout
2018-06-26 16:18:21 +03:00
Vadim Pisarevsky
e87425f047
Merge pull request #11835 from dkurt:dnn_tf_two_inputs
2018-06-26 12:12:24 +00:00
Dmitry Kurtaev
9510551c63
Multiple inputs for TensorFlow models
2018-06-26 14:03:59 +03:00
Vadim Pisarevsky
b80c7bca0d
Merge pull request #11826 from dkurt:dnn_tf_data_layouts
2018-06-26 06:36:27 +00:00
Dmitry Kurtaev
715f40a48d
Use layers consumers to predict data layout
2018-06-25 18:25:40 +03:00
Li, Peng
ab8022f74e
update convolution opencl kernels in dnn module (#11762)
...
* optimize ocl kernel enqueue in fc layer
Signed-off-by: Li Peng <peng.li@intel.com>
* use CV_LOG_INFO in convolution auto tuning
Signed-off-by: Li Peng <peng.li@intel.com>
* update convolution IDLF kernel
extend parameter tuning range, also cleanup
ocl kernel implementation
Signed-off-by: Li Peng <peng.li@intel.com>
* update in-memory convolution cache config
fp16 and fp32 cache config are stored separately
Signed-off-by: Li Peng <peng.li@intel.com>
2018-06-25 17:06:18 +03:00
Dmitry Kurtaev
e8e9d1d021
Implement Interp layer using Resize layer
2018-06-22 19:26:47 +03:00
Alexander Alekhin
1894f1a37f
Merge pull request #11773 from alalek:dnn_ocl_update_force_tuning_flag
2018-06-22 05:23:55 +00:00
Alexander Alekhin
50c607d206
dnn(ocl): fix external / predefined builtin configuration behavior
...
OPENCV_OCL4DNN_FORCE_AUTO_TUNING should ignore existing configuration from:
- builtin predefined configurations (for Intel OpenCL iGPUs)
- external configuration (via OPENCV_OCL4DNN_CONFIG_PATH)
Prefer external configuration over builtin.
2018-06-21 20:59:03 +03:00
Dmitry Kurtaev
4626246087
Add ShuffleChannel layer
2018-06-21 19:10:42 +03:00
Dmitry Kurtaev
40b85c1cd9
Remove undocumented feature to retrieve layer outputs by indices
2018-06-20 14:44:21 +03:00
Alexander Alekhin
30d4e0261a
Merge pull request #11766 from dkurt:dnn_darknet_avgpool_softmax
2018-06-14 13:18:30 +00:00
Dmitry Kurtaev
bd87eb6e66
Import average pooling and softmax layers from Darknet
2018-06-14 15:22:08 +03:00
Dmitry Kurtaev
693a7663e7
Import ClipByValue from Keras
2018-06-14 13:30:30 +03:00
Alexander Alekhin
5fd7cfbcad
dnn: add runtime parameter OPENCV_DNN_BACKEND_DEFAULT
...
to control DNN_BACKEND_DEFAULT enumeration value behavior
2018-06-13 19:00:04 +03:00
Alexander Alekhin
f040282bf8
Merge pull request #11739 from dkurt:more_ie_models
2018-06-13 13:26:50 +00:00
Dmitry Kurtaev
7d727ac2fb
Fuse top layers to batch normalization
2018-06-09 18:06:53 +03:00
Dmitry Kurtaev
2c291bc2fb
Enable FastNeuralStyle and OpenFace networks with IE backend
2018-06-09 15:57:12 +03:00
rockzhan
1187a7fa34
Merge pull request #11649 from rockzhan:dnn_dw_prelu
...
dnn: Fix output mismatch when forwarding a dnn model that contains [depthwise conv(group=1) + bn + prelu] (#11649)
* this can make sure [depthwise conv(group=1) + bn + prelu] output not shift
* add TEST to show the output mismatch in [DWconv+Prelu]
* fix typo
* change loading image to init cvMat directly
* build runtime model, without loading external model
* remove whitespace
* change way to create a cvmat
* add bias_term, add target output
* fix [dwconv + prelu] value mismatch when no optimizations
* fix Test error when change output channels
* add parametric test
* change num_output to group value
* change conv code and change test back
2018-06-07 13:45:54 +00:00
David
7175f257b5
Added ResizeBilinear op for tf (#11050)
...
* Added ResizeBilinear op for tf
Combined ResizeNearestNeighbor and ResizeBilinear layers into Resize (with an interpolation param).
Minor changes to tf_importer and resize layer to save some code lines
Minor changes in init.cpp
Minor changes in tf_importer.cpp
* Replaced implementation of a custom ResizeBilinear layer to all layers
* Use Mat::ptr. Replace interpolation flags
2018-06-07 16:29:04 +03:00
Dmitry Kurtaev
f3a6ae5f00
Wrap Inference Engine init to try-catch
2018-06-07 12:55:52 +03:00
Vadim Pisarevsky
3cbd2e2764
Merge pull request #11650 from dkurt:dnn_default_backend
2018-06-06 09:30:39 +00:00
Dmitry Kurtaev
b781ac7346
Make Intel's Inference Engine backend the default if no preferable backend is specified.
2018-06-04 18:31:46 +03:00
Vadim Pisarevsky
055f33ec46
Merge pull request #11657 from dkurt:dnn_ie_multiple_networks
2018-06-04 10:12:46 +00:00
Kuang Fangjun
9ae28415ec
fix doc.
2018-06-03 17:44:24 +08:00
Dmitry Kurtaev
ab389142af
Fix multiple networks with Intel's Inference Engine backend
2018-06-01 14:10:32 +03:00
Alexander Alekhin
da75e463a8
Merge pull request #11639 from alalek:fix_precomp_hpp
2018-05-31 16:35:21 +00:00
Alexander Alekhin
799b4f48e7
fix missing precomp.hpp
2018-05-31 16:53:44 +03:00
Dmitry Kurtaev
32bab45f81
Fix Inference Engine graphs with fused output layers
2018-05-31 16:21:08 +03:00
Vadim Pisarevsky
c58cc4c2ff
Merge pull request #11255 from dkurt:dnn_tf_faster_rcnn
2018-05-31 11:07:39 +00:00
Dmitry Kurtaev
f96f934426
Update Intel's Inference Engine deep learning backend (#11587)
...
* Update Intel's Inference Engine deep learning backend
* Remove cpu_extension dependency
* Update Darknet accuracy tests
2018-05-31 14:05:21 +03:00
Dmitry Kurtaev
bf87a43185
Faster-RCNN object detection models from TensorFlow
2018-05-30 17:12:36 +03:00
Alexander Alekhin
44572fac44
Merge pull request #11557 from tomoaki0705:relaxIntelOnlyOCL4DNN
2018-05-29 15:25:22 +00:00
Tomoaki Teshima
2e9e71ab9e
make ocl4dnn available to run on platforms other than Intel GPU
2018-05-29 19:18:10 +09:00
Dmitry Kurtaev
085be6a445
Fix dilated convolution from Keras
2018-05-29 12:15:47 +03:00
Dmitry Kurtaev
2c3c59d018
Remove Shift deep learning layer
2018-05-28 18:18:56 +03:00
Alexander Alekhin
3654fb10d7
Merge pull request #11567 from alalek:code_quality
2018-05-23 15:47:11 +00:00
Maksim Shabunin
895e10c317
dnn: fixed IE support on Windows
2018-05-23 12:46:14 +03:00
Alexander Alekhin
471c17321f
improve code quality
...
- eliminate rand() calls
- non initialized members/ variables
- unused return values
- missing/useless NULL checks
2018-05-22 17:08:31 +03:00
Maksim Shabunin
53a68783a5
dnn: support later IE versions
2018-05-22 15:18:18 +03:00
Alexander Alekhin
085b27fc3d
Merge pull request #11390 from dkurt:east_text_detection
2018-05-21 13:02:29 +00:00
Dmitry Kurtaev
07dc6d2b45
Return a convex hull from rotatedRectangleIntersection
2018-05-18 14:20:17 +03:00
Alexander Alekhin
d6279bfff8
fix build warnings
2018-05-17 18:29:21 +03:00
Li Peng
ba5e8befa9
fp16 ocl support for more layers
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-05-16 22:45:04 +08:00
Li Peng
3dd916882a
fp16 ocl support for googlenet
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-05-16 22:45:02 +08:00
Li Peng
329abb5b64
dnn fp16 support
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-05-16 22:44:39 +08:00
Alexander Alekhin
bb8ff2c463
Merge pull request #11494 from tomoaki0705:fixOpenCLDnn
2018-05-16 14:11:36 +00:00
Tomoaki Teshima
3f5347dd7a
workaround for the test failure of opencv_test_dnn
...
* let OpenCL kernel run only on Intel GPU
* brush up the workaround based on 9a2b028 from alalek
2018-05-16 19:23:19 +09:00
Dmitry Kurtaev
8488f2e265
EAST: An Efficient and Accurate Scene Text Detector (https://arxiv.org/abs/1704.03155v2)
2018-05-11 14:55:42 +03:00
Dmitry Kurtaev
c99c3e761e
Fuse multipliers but not convolution layers weights
2018-05-10 19:24:38 +03:00
Dmitry Kurtaev
777d77848c
Free Convolution and MatMul weights after TensorFlow layers import
2018-05-04 11:20:14 +03:00
Dmitry Kurtaev
9ffe4694db
Reduce memory consumption at Caffe importer
2018-05-04 09:24:13 +03:00
zuoshaobo
4ff6a1bc7b
Merge pull request #11425 from zuoshaobo:relu_negative_slope
...
* FIX INF_ENGINE RELU ERROR
* set slope to variable
* tab in indent
2018-05-03 13:36:49 +03:00
Alexander Alekhin
083b08742d
Merge pull request #11406 from alalek:core_matsize_dims
2018-04-28 14:38:42 +00:00
Alexander Alekhin
65b0b319eb
eliminate MSVS2017 build warning
...
modules\dnn\src\layers\prior_box_layer.cpp(208): warning C4834: discarding return value of function with 'nodiscard' attribute
2018-04-28 15:14:41 +03:00
Alexander Alekhin
8c349ff8ff
core: added MatSize::dims() method
...
to avoid accessing 'p[-1]' (static code analysers dislike this)
2018-04-27 16:57:29 +03:00
Alexander Alekhin
576d2dbac0
refactor: don't use CV_ErrorNoReturn() internally
2018-04-24 15:38:42 +03:00
Dmitry Kurtaev
4ec456f0a0
Custom layers for deep learning networks (#11129)
...
* Custom deep learning layers support
* Stack custom deep learning layers
2018-04-24 14:59:59 +03:00
Alexander Alekhin
29b4fd2774
Merge pull request #11351 from dkurt:dnn_enable_inf_engine_tests
2018-04-23 09:16:39 +00:00
Dmitry Kurtaev
d959d7b9f0
Fuse deconvolution layer subgraphs from Keras
2018-04-20 16:51:38 +03:00
Dmitry Kurtaev
bd77d100e1
Enable some tests for clDNN plugin from Intel's Inference Engine
2018-04-20 10:47:46 +03:00
Dmitry Kurtaev
3b4a292ca9
Allow switching CPU/OpenCL targets for models from Intel's Model Optimizer
2018-04-19 10:23:57 +03:00
Vadim Pisarevsky
b290bdafb9
Merge pull request #11322 from dkurt:dnn_yolov3
2018-04-18 12:11:13 +00:00
Dmitry Kurtaev
66ce8cd7ea
Fix bugs found by valgrind
2018-04-17 17:53:51 +03:00
Dmitry Kurtaev
97fec07d96
Support YOLOv3 model from Darknet
2018-04-16 18:44:12 +03:00
Alexander Alekhin
a2d6ee2d31
Merge pull request #11305 from tomoaki0705:typoNVIDIA
2018-04-13 12:56:42 +00:00
Tomoaki Teshima
a40354d16f
use correct name for NVIDIA
...
* remove NVidia and Nvidia
* replace Cuda with CUDA
* keep the letters for API
2018-04-13 20:33:19 +09:00
Dmitry Kurtaev
b92c3182ab
Blank and L2-normalization layers from Intel's Inference Engine
2018-04-12 15:21:08 +03:00
Vadim Pisarevsky
0b9d075958
Merge pull request #11295 from dkurt:dnn_repeated_conv_params
2018-04-11 15:25:24 +00:00
Vadim Pisarevsky
533bb89800
Merge pull request #11236 from dkurt:dnn_fuse_l2_norm
2018-04-11 15:09:55 +00:00
Vadim Pisarevsky
30175594e9
Merge pull request #11062 from dkurt:dnn_inf_engine_cldnn
2018-04-11 15:06:18 +00:00
Dmitry Kurtaev
512632e574
Parse repeated values of ConvolutionParameter
2018-04-11 14:38:05 +03:00
Dmitry Kurtaev
4ef6c91583
Fix multiple inputs for models from Intel's Model Optimizer
2018-04-11 13:28:07 +03:00
Dmitry Kurtaev
1ba72ca0d3
Fuse tf.nn.l2_normalize layer
2018-04-10 10:12:44 +03:00
Dmitry Kurtaev
709cf5d038
OpenCL GPU target for Inference Engine deep learning backend
...
Enable FP16 GPU target for DL Inference Engine backend.
2018-04-09 17:21:35 +03:00
Vladislav Sovrasov
0d9c63744e
Add CPU default extensions loading in IE dnn backend (#11252)
...
* Add CPU default extensions loading in IE dnn backend
* Load cpu_extensions for the future Intel's Inference Engine
2018-04-09 16:22:19 +03:00
Dmitry Kurtaev
ef1aaf12c9
Fix Proposal deep learning layer
2018-04-04 14:48:29 +03:00
Dmitry Kurtaev
598039c0ed
Fix embedded Torch's nn.ConcatTable
2018-03-31 11:11:10 +03:00
Alexander Alekhin
e8a67de0d2
Merge pull request #11182 from dkurt:fix_11102_part_2
2018-03-30 13:11:01 +00:00
Alexander Alekhin
1060c0f439
dnn: apply CV_OVERRIDE/CV_FINAL
2018-03-28 18:43:27 +03:00
Alexander Alekhin
167034fb04
Merge pull request #11098 from dkurt:dnn_native_inf_engine
2018-03-28 14:52:08 +00:00
Dmitry Kurtaev
e039fc3a63
Replace protobuf's ReleaseLast with RemoveLast to deallocate memory.
...
Change an order of PriorBox layer operations.
2018-03-28 17:27:36 +03:00
Dmitry Kurtaev
2f3a9ba1d4
Update OpenCVDetectInferenceEngine.cmake
2018-03-28 16:34:37 +03:00
Alexander Alekhin
9e0dee1259
Merge pull request #11112 from alalek:cmake_src_include_fix
2018-03-27 13:06:48 +00:00
Dmitry Kurtaev
7972f47ed4
Load networks from intermediate representation of Intel's Deep learning deployment toolkit.
2018-03-26 07:24:21 +03:00
Dmitry Kurtaev
e8fe6ee4e3
Fix prior box generation in case of squared proposals.
...
Fix batch norm in training phase.
2018-03-23 09:44:59 +03:00
Alexander Alekhin
6c051a55e5
cmake: don't add include <module>/src directory to avoid conflicts
...
during opencv_world builds
2018-03-19 11:14:15 +03:00
Dmitry Kurtaev
069f9add80
Fix an issue https://github.com/opencv/opencv/issues/11102
2018-03-18 10:49:12 +03:00
Alexander Alekhin
d68466bb6a
Merge pull request #10940 from dkurt:dnn_tf_graph_optim
2018-03-14 14:36:25 +00:00
Alexander Alekhin
ab110c0ad1
Merge pull request #10979 from dkurt:unite_dnn_samples
2018-03-14 14:33:49 +00:00
Dmitry Kurtaev
538fd42363
Add test for Scalar arguments at CommandLineParser
2018-03-13 11:01:07 +03:00
Dmitry Kurtaev
ab20d2a3fc
Update assertions in batch norm layer
2018-03-12 10:53:06 +03:00
Dmitry Kurtaev
69a8f110b6
Fuse subgraphs from Keras
2018-03-12 10:53:06 +03:00
Dmitry Kurtaev
9457bf10ab
Fuse batch normalization and flatten TensorFlow subgraphs in runtime
2018-03-12 10:51:35 +03:00
Alexander Alekhin
5b868ccd82
Merge pull request #10992 from dkurt:dnn_opencl_tests
2018-03-09 10:06:40 +00:00
Dmitry Kurtaev
0f01b40dd5
Reset OpenCL kernels if batch size changes
2018-03-07 17:06:59 +03:00
Alexander Alekhin
514f4193db
Merge pull request #10959 from alalek:cmake_ocl4dnn
2018-03-07 10:26:14 +00:00
Dmitry Kurtaev
e1c3237532
Parametric OpenCL deep learning tests
2018-03-05 20:53:18 +03:00
Dmitry Kurtaev
f2440ceae6
Update tutorials. A new cv::dnn::readNet function
2018-03-04 20:30:22 +03:00
Alexander Alekhin
fe97dc67dc
Merge pull request #10962 from alalek:dnn_precomp_hpp
2018-03-02 11:38:16 +00:00
Alexander Alekhin
97c1f09961
Merge pull request #10955 from pengli:dnn
2018-03-02 11:35:59 +00:00
Alexander Alekhin
a9ebc61f2a
dnn(workaround): switch to CPU target if compiled without OpenCL
2018-03-01 12:12:40 +03:00
Alexander Alekhin
1b83bc48a1
dnn: make OpenCL DNN code optional
2018-03-01 12:12:40 +03:00
Alexander Alekhin
a838a97092
dnn: fix precomp.hpp usage
2018-02-28 17:06:26 +03:00
Wu Zhiwen
ef937dd676
ocl4dnn: Fix SAME padding mode for convolve
...
Signed-off-by: Wu, Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-28 21:02:41 +08:00
Maksim Shabunin
7c855aa3e1
Fixed two issues found by static analysis
2018-02-26 00:16:02 +03:00
Li Peng
608968aa83
Deconvolution ocl fix
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-23 18:31:30 +08:00
Li, Peng
5caf6244a3
Merge pull request #10922 from pengli:dnn
...
* ave pooling ocl fix
support the padded area control in ave pooling
Signed-off-by: Li Peng <peng.li@intel.com>
* warning fix: uninitialized field
2018-02-22 21:01:12 +03:00
Maksim Shabunin
92e9d4ec3a
Fixed several issues detected by static analysis
2018-02-22 17:11:33 +03:00
Vadim Pisarevsky
5e0f95b948
Merge pull request #9708 from dkurt:tf_face_detector
2018-02-22 12:04:26 +00:00
Li Peng
e7d35d51fa
Fix for opencv face detector ocl test
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-22 23:37:54 +08:00
Li Peng
c524f669c7
Fallback for "SAME" padMode in ocl convolution and pooling
...
It fixes tensorflow ocl testcase of MobileNetSSD and Inception_v2_SSD
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-22 21:17:59 +08:00
Dmitry Kurtaev
eab556e1e0
OpenCV face detection network in TensorFlow
2018-02-21 19:58:24 +03:00
Alexander Alekhin
53305d4a7e
Merge pull request #10891 from pengli:dnn
2018-02-20 08:59:07 +00:00
Li Peng
2863f950d6
ReLU6 layer ocl support
...
include relu6 ocl kernel and layer fusion support
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-20 15:11:09 +08:00
Dmitry Kurtaev
8b4871a28d
Use only absolute prior boxes explicit sizes. Remove scales attributes. (#10874)
...
* Use only absolute prior boxes explicit sizes. Remove scales attributes.
* Simplified PriorBox layer forward pass
2018-02-19 17:25:18 +03:00
Alexander Alekhin
0e4eed0ba1
Merge pull request #10867 from dkurt:dnn_fix_ave_pooling_area
2018-02-16 11:17:32 +00:00
Alexander Alekhin
c020a7bb67
build: portable integer types
2018-02-15 23:43:02 +03:00
Dmitry Kurtaev
f8d0d6365e
Add a flag to manage average pooling with padding
2018-02-14 16:56:31 +03:00
Alexander Alekhin
cff79609c8
Merge pull request #10854 from pengli:dnn
2018-02-14 12:49:53 +00:00
Vadim Pisarevsky
ef70b0baa4
Merge pull request #10865 from dkurt:dnn_inf_engine_getInputsInfo
2018-02-14 12:25:18 +00:00
Dmitry Kurtaev
a66b5e2c13
Add const getInputsInfo
2018-02-14 14:17:44 +03:00
Vadim Pisarevsky
6dfd7e3da2
Merge pull request #10850 from dkurt:dnn_tf_deconv_tests
2018-02-14 10:35:14 +00:00
Li Peng
5992c46606
add fallback case for ocl convolution
...
The ocl convolution doesn't support tensorflow padMode well.
Add fallback check if we meet this situation, it could fix the
tensorflow MobileNet SSD failure.
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-14 00:04:38 +08:00
Li Peng
00d2f34888
ocl fix for detection_output and prior_box layer
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-13 23:09:14 +08:00
Dmitry Kurtaev
514e6df460
Refactored deep learning layers fusion
2018-02-13 14:35:58 +03:00
Dmitry Kurtaev
a6baedd02c
Fix deconvolution layer. Add batch norm layer with mean-variance normalization from TensorFlow.
2018-02-13 11:00:27 +03:00
Alexander Alekhin
66f3c1ae79
Merge pull request #10843 from luzpaz:misc-modules-typos
2018-02-12 13:47:12 +00:00
Sui Libin
1ad814a191
fix faster_rcnn sample crash at PoolingInvoker on Windows7(x64). (#10724)
...
* fix faster_rcnn sample crash at PoolingInvoker operator() of pooling_layer.
* find_obj onmouse bug about find matched point status.
* reverted AutoBuffer back to std::vector
2018-02-12 16:07:56 +03:00
luz.paz
5718d09e39
Misc. modules/ typos
...
Found via `codespell`
2018-02-12 07:09:43 -05:00
Rémi Ratajczak
b67523550f
dnn: Added an imagesFromBlob method to the dnn module (#10607)
...
* Added the imagesFromBlob method to the dnn module.
* Rewritten imagesFromBlob based on first dkurt comments
* Updated code with getPlane()
* Modify comment of imagesFromBlob() in dnn module
* modified comments, removed useless assertions & added OutputArrayOfArray
* replaced tabs with whitespaces & put vectorOfChannels instantiation outside the loop
* Changed pre-commit.sample to pre-commit in .git/hooks/
* Added a test for imagesFromBlob in test_misc.cpp (dnn)
* Changed nbOfImages, robustified test with cv::randu, modified assertion
2018-02-12 14:51:07 +03:00
Dmitry Kurtaev
7fe97376c2
MobileNet-SSD from TensorFlow 1.3 and Inception-V2-SSD using Inference Engine backend
2018-02-09 13:45:45 +03:00
Dmitry Kurtaev
ed94136548
OpenCV face detection network using Inference Engine backend
2018-02-06 17:53:24 +03:00
Alexander Alekhin
398ebbac98
Merge pull request #10795 from pengli:dnn
2018-02-06 10:04:29 +00:00
Li Peng
c43498c6ad
check vector emptiness before accessing it
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-06 22:59:51 +08:00
Li Peng
389fa5d38e
slice layer ocl update
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-06 22:59:47 +08:00
Dmitry Kurtaev
10e1de74d2
Intel Inference Engine deep learning backend (#10608)
...
* Intel Inference Engine deep learning backend.
* OpenFace network using Inference Engine backend
2018-02-06 11:57:35 +03:00
Maksim Shabunin
e56d6054aa
Do not build protobuf without dnn (#10689)
...
* Do not build protobuf if dnn is disabled
* Added BUILD_LIST cmake option to the cache
* Moved protobuf to the top level
* Fixed static build
* Fixed world build
* fixup! Fixed world build
2018-02-01 16:30:23 +03:00
Vadim Pisarevsky
713ec7be45
Merge pull request #10746 from dkurt:dnn_batch_norm_from_nvidia_caffe
2018-02-01 13:22:09 +00:00
Alexander Alekhin
42569cfd61
Merge pull request #10748 from dkurt:fix_dnn_slice_layer
2018-02-01 13:21:17 +00:00
Alexander Alekhin
9d25bd583f
Merge pull request #10754 from dkurt:dnn_ocl_gemv_min_globalsize
2018-02-01 12:39:27 +00:00
Dmitry Kurtaev
65a6674c6e
ocl4dnnGEMV in case of row_size < 4
2018-02-01 14:06:47 +03:00
Alexander Alekhin
9698b93d10
Merge pull request #10717 from pengli:dnn
2018-02-01 10:49:54 +00:00
Li Peng
6aec71d7ee
mvn layer ocl update
...
it fuses ocl kernels to reduce kernel enqueues
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-01 17:48:12 +08:00
Li Peng
83b16ab7b7
fix extra spaces in build option
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-01 17:46:11 +08:00
Li Peng
54c81cbde4
eltwise layer SUM op update
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-02-01 17:46:06 +08:00
Dmitry Kurtaev
184862582c
Fix slice layer from TensorFlow
2018-01-31 19:12:37 +03:00
Arjan van de Ven
a75840d19c
Merge pull request #10468 from fenrus75:avx512-2
...
* Add a 512 bit codepath to the AVX512 fastConv function
this patch adds a 512 wide codepath to the fastConv() function for
AVX512 use.
The basic idea is to process the first N * 16 elements of the vector
with avx512, and then run the rest of the vector using the traditional
AVX2 codepath.
* dnn: use unaligned AVX512 load (OpenCV aligns data on 32-byte boundary)
* dnn: change "vecsize" condition for AVX512
* dnn: fix indentation
2018-01-31 16:34:12 +03:00
Alexander Alekhin
f06c44f1f1
Merge pull request #10701 from dkurt:tf_ave_pooling
2018-01-31 13:28:09 +00:00
Dmitry Kurtaev
844f1d0281
Fix Batch Normalization layer imported from NVIDIA Caffe.
2018-01-31 16:25:45 +03:00
Dmitry Kurtaev
a2e9bfbaf4
Fix padding for average pooling from TensorFlow
2018-01-31 15:54:30 +03:00
Li Peng
7a4c5e9421
slice layer ocl support
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-29 22:34:32 +08:00
Alexander Alekhin
2876670de3
dnn(ocl): fix build options for Apple OpenCL
2018-01-28 01:54:25 +00:00
Alexander Alekhin
104502c5be
Merge pull request #10676 from dkurt:dnn_for_newer_mobilenet_ssd
2018-01-26 04:02:21 +00:00
Li Peng
2493083935
mvn, batch_norm and relu layer fusion
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-25 18:57:05 +08:00
Li Peng
e15928b49e
convolution and tanh layer fusion
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-25 17:45:33 +08:00
Dmitry Kurtaev
9e9926a2f0
PriorBox layer with explicit normalized sizes
2018-01-24 14:01:42 +03:00
Dmitry Kurtaev
a3d74704e5
OpenCV face detection network test
2018-01-23 09:27:58 +03:00
Alexander Alekhin
26e0f408f0
Merge pull request #10639 from pengli:dnn
2018-01-19 10:01:41 +00:00
Li Peng
fe494297e4
more update on MVN layer ocl implementation
...
cut one ocl kernel if normVariance is disabled,
also use native_powr for performance reason.
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-19 22:54:04 +08:00
Alexander Alekhin
c3569211d5
Merge pull request #10591 from drkoller:master
2018-01-19 09:44:21 +00:00
Li Peng
2124361ff7
ocl support for Deconvolution layer
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-18 23:40:22 +08:00
David Koller
d1a3b530be
Make DNN Crop layer match Caffe default offset behavior
...
and add parametric unit test for crop layer.
2018-01-17 10:52:36 -05:00
Li Peng
e77af4ae33
MVN layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-17 17:11:32 +08:00
Li Peng
7bc017601f
Power, Tanh and Channels ReLU layer ocl support
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-17 17:11:27 +08:00
Li Peng
4189214d04
batch_norm layer ocl update
...
use a batch_norm ocl kernel to do the work
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-16 19:01:58 +08:00
Alexander Alekhin
1255bd8d4b
Merge pull request #10585 from dkurt:dnn_weightless_scale
2018-01-15 06:07:50 +00:00
Dmitry Kurtaev
6a395d88ff
dnn::blobFromImage with OutputArray
2018-01-13 18:20:24 +03:00
Dmitry Kurtaev
1f4fdfd599
Untrainable version of Scale layer from Caffe
2018-01-13 10:35:29 +03:00
Dmitry Kurtaev
64a9e92390
Merge pull request #10466 from dkurt:reduce_umat_try_2
...
* UMat blobs are wrapped
* Replace getUMat and getMat at OpenCLBackendWrapper
2018-01-10 21:50:54 +03:00
Alexander Alekhin
4d4f291553
Merge pull request #10513 from pengli:dnn
2018-01-09 19:24:28 +00:00
Li Peng
e3b42bf93b
batch_norm and blank layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-09 21:58:46 +08:00
Alexander Alekhin
da0904df2d
Merge pull request #10550 from dkurt:replace_psroi_pooling_tag
2018-01-08 19:19:00 +00:00
Dmitry Kurtaev
27b55ea761
Replace Caffe's psroi_pooling_param tag from 10001 to 10002
2018-01-08 13:29:20 +03:00
Alexander Alekhin
6674a024fc
dnn: add OPENCV_DNN_DISABLE_MEMORY_OPTIMIZATIONS runtime option
...
replaces REUSE_DNN_MEMORY compile-time option
2018-01-07 18:38:14 +00:00
Arthur Williams
8a67858068
Fixed missing #include "../precomp.hpp"
2018-01-05 15:10:39 +00:00
Li Peng
67f9406cbe
add normalize_bbox layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-05 19:38:36 +08:00
Li Peng
f99a135eda
add eltwise layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-05 19:38:30 +08:00
Li Peng
34bfd7ef51
add ocl implementation of proposal layer
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-04 18:40:51 +08:00
Alexander Alekhin
7d67d60fb1
cmake(opt): AVX512_SKX
2017-12-29 07:18:11 +00:00
Alexander Alekhin
8e7af7f089
Merge pull request #10456 from dkurt:dnn_allocate_mem_for_optimized_concat
2017-12-28 16:04:51 +00:00
Alexander Alekhin
a65b5df5da
Merge pull request #10416 from fenrus75:avx512
2017-12-28 15:56:56 +00:00
Alexander Alekhin
898ca38257
cmake: AVX512 -> AVX_512F
2017-12-28 15:20:27 +00:00
Dmitry Kurtaev
a9807d8f54
Allocate new memory for optimized concat to prevent collisions.
...
Add a flag to disable memory reusing in dnn module.
2017-12-28 16:45:53 +03:00
Li Peng
00f03c5739
Add ocl version FasterRCNN accuracy test
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-28 19:15:15 +08:00
Alexander Alekhin
99a9c10b57
Merge pull request #10424 from dkurt:fix_concat_optim
2017-12-28 01:26:14 +00:00
Alexander Alekhin
f3880c60a6
Merge pull request #10428 from pengli:dnn
2017-12-27 13:18:10 +00:00
Arjan van de Ven
2938860b3f
Provide a few AVX512 optimized functions for the DNN module
...
This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.
AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2017-12-26 16:00:17 +00:00
Dmitry Kurtaev
70c605a03d
Limit Concat layer optimization
2017-12-26 16:49:33 +03:00
Li Peng
84e2fa79a0
dnn(ocl4dnn): update pre-tuned kernel config
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-26 20:14:41 +08:00
Alexander Alekhin
adf43e7d2a
build: fix MSVS2010 build error
2017-12-23 00:06:34 +00:00
Alexander Alekhin
019b7c5a66
Merge pull request #10402 from dkurt:dnn_tf_quantized
2017-12-22 15:58:56 +00:00
Alexander Alekhin
59e825ee02
Merge pull request #10385 from pengli:dnn
2017-12-22 15:48:40 +00:00
Dmitry Kurtaev
bcc669f3f7
TensorFlow weights dequantization
2017-12-22 17:25:10 +03:00
Alexander Alekhin
97af608030
Merge pull request #10397 from mshabunin:fix-incorrect-assert
2017-12-22 14:07:02 +00:00
Li Peng
181b448c4d
add one more convolution kernel tuning candidate
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-22 21:37:00 +08:00
Vadim Pisarevsky
0742e12f0b
Merge pull request #10265 from dkurt:nms_for_region_layer
2017-12-22 13:29:37 +00:00
Maksim Shabunin
aa46e31c6d
Replaced incorrect CV_Assert calls with CV_Error
2017-12-22 15:20:13 +03:00
Vadim Pisarevsky
325cbd7c84
Merge pull request #10364 from dkurt:dnn_smooth_tf_data_layout
2017-12-22 09:56:45 +00:00
Li Peng
c5fc8e03ff
cleanup unnecessary macros in convolution ocl kernel
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-21 20:32:36 +08:00
Li Peng
0aa5e43a14
refactor candidate generation of convolution auto-tuning
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-21 23:05:54 +08:00
Dmitry Kurtaev
c67e75b68f
Refactor NMS procedure at RegionLayer
2017-12-21 12:21:45 +03:00
Dmitry Kurtaev
7e48fa58eb
Manage TensorFlow's NHWC data layout more smoothly
2017-12-20 14:13:40 +03:00
Dmitry Kurtaev
0ed2cbc931
R-FCN models support
2017-12-20 10:43:22 +03:00
Alexander Alekhin
dcdd6af5a8
Merge pull request #10341 from pengli:dnn
2017-12-19 14:04:55 +00:00
Li Peng
436d7e4eaf
add depthwise convolution kernel
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 17:59:13 +08:00
Li Peng
910d7dab1f
prior box layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 17:44:10 +08:00
Dmitry Kurtaev
6aabd6cc7a
Remove cv::dnn::Importer
2017-12-18 18:08:28 +03:00
Dmitry Kurtaev
2b43d4f477
Fix default pooling layer type
2017-12-17 16:46:40 +03:00
Maksim Shabunin
1033f2b1bd
Fixed 3 issues found by static analysis
2017-12-15 17:29:26 +03:00
Vadim Pisarevsky
62359f70ff
Merge pull request #10306 from dkurt:faster_rcnn
2017-12-15 12:23:53 +00:00
Dmitry Kurtaev
08112f3821
Faster-RCNN models support
2017-12-15 12:16:21 +03:00
Alexander Alekhin
0da947e6b3
dnn: more debug information
2017-12-14 19:21:17 +03:00
Alexander Alekhin
c231472ad6
Merge pull request #10290 from tomoaki0705:fixVS2012Round
2017-12-13 15:30:21 +00:00
Tomoaki Teshima
ecb6bcf2e0
fix build error on Visual Studio 2012
...
* round doesn't exist in the standard library of Visual Studio 2012
* apply the correct computation of ROI
2017-12-13 17:40:07 +03:00
Vitaly Tuzov
51cb56ef2c
Implementation of bit-exact resize. Internal calls to linear resize updated to use bit-exact version. (#9468)
2017-12-13 15:00:38 +03:00
Alexander Alekhin
eff42f6387
dnn: more debug info
2017-12-12 12:04:10 +03:00
Vadim Pisarevsky
7e680bd9ff
Merge pull request #10215 from dkurt:dnn_js
2017-12-11 12:47:52 +00:00
Vadim Pisarevsky
c24f10d647
Merge pull request #10268 from dkurt:fix_scale_layer
2017-12-08 18:46:50 +00:00
Dmitry Kurtaev
f503515082
JavaScript bindings for dnn module
2017-12-08 18:33:48 +03:00
Dmitry Kurtaev
e307065c8e
Scale layer in case of 2D inputs
2017-12-08 17:34:59 +03:00
Alexander Alekhin
f2070c9f5d
Merge pull request #10255 from dkurt:dnn_roi_pooling
2017-12-08 11:20:07 +00:00
Dmitry Kurtaev
17dcf0e82d
ROIPooling layer
2017-12-07 19:04:38 +03:00
Dmitry Kurtaev
ef0650179b
Fix conv/deconv/fc layers FLOPS computation
2017-12-07 11:42:04 +03:00
Alexander Alekhin
6074f92d48
Merge pull request #10228 from pengli:dnn_new
2017-12-06 15:50:12 +00:00
Li Peng
59cbaca4d3
detection_output layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-06 22:35:59 +08:00
Li Peng
66feea6cac
region layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
7707c9bfba
reorg layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
85b1c4060c
support axis in concat layer ocl path
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
07bec6bdcd
reshape layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:40 +08:00
Li Peng
7b7033ac60
permute layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-05 22:10:05 +08:00
Dmitry Kurtaev
bbbec300a6
nn.BatchNormalization and nn.Dropout layers from Torch
2017-12-04 12:57:21 +03:00
Alexander Alekhin
cc2ee923e4
Merge pull request #10164 from pengli:dnn
2017-11-29 12:05:10 +00:00
Wu Zhiwen
1f465a0ef9
dnn(ocl4dnn): fuseLayer() use umat_input/outputBlobs for OpenCL target
...
Also, fix bug when use OPENCL target but no OpenCL runtime
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-27 22:25:53 +08:00
Dmitry Kurtaev
99ed085752
Update PriorBox layer
2017-11-27 16:47:20 +03:00
Alexander Alekhin
13f374660f
dnn(ocl4dnn): drop unused batch_size_ in pooling
2017-11-23 20:46:56 +00:00
Alexander Alekhin
e34b64c979
dnn(ocl4dnn): refactor pooling OpenCL calls
2017-11-23 20:46:44 +00:00
Alexander Alekhin
f071a48ec7
Merge pull request #10143 from pengli:ocl4dnn
2017-11-23 18:47:14 +00:00
Alexander Alekhin
107582c767
Merge pull request #9996 from dkurt:dnn_multiple_inputs
2017-11-23 18:22:37 +00:00
Li Peng
636d6368ee
use OutputArrayOfArrays in net forward interface
...
It allows umat buffers used in net forward interface
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-24 02:19:10 +08:00
Wu, Zhiwen
04edc8fe3a
cleanup ocl4dnn spatial convolution kernels
...
remove unused macros and half definition macros,
also remove unused ocl::Queue
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-24 02:19:10 +08:00
Alexander Alekhin
49a5280198
Merge pull request #10139 from alalek:dnn_rename_caffe_proto_package
2017-11-23 14:10:42 +00:00
Alexander Alekhin
f37f4cf3b4
Merge pull request #9994 from r2d3:dnn_memory_load
2017-11-22 18:15:00 +00:00
Alexander Alekhin
e7d62d6ef3
Merge pull request #10126 from alalek:dnn_issue_10125
2017-11-22 18:03:51 +00:00
Alexander Alekhin
1c88a566e0
dnn: rename caffe protobuf package
2017-11-22 18:34:07 +03:00
Alexander Alekhin
9db5cbf9a4
dnn: sync output/internals blobs back
2017-11-22 14:00:58 +03:00
Vadim Pisarevsky
f8ad289311
Merge pull request #10092 from alalek:dnn_rename_caffe_proto
2017-11-22 08:16:20 +00:00
Alexander Alekhin
0f34628af7
dnn: drop OpenCL code path for DetectionOutputLayer
...
getUMat()/getMat() calls are scope based. Results of these calls can't be
stored somewhere for future usage.
2017-11-21 17:28:42 +03:00
Alexander Alekhin
438e456ce9
Merge pull request #10113 from wzw-intel:fusion
2017-11-20 18:13:33 +00:00
Alexander Alekhin
f6d927ef3b
dnn: avoid conflicts with original caffe.proto
...
rename caffe.proto => opencv-caffe.proto
2017-11-20 19:04:00 +03:00
David Geldreich
f723cede2e
add loading TensorFlow/Caffe net from memory buffer
...
add a corresponding test
2017-11-20 16:28:22 +01:00
Dmitry Kurtaev
6c5dd5cf6d
Replace caffe::NormalizedBBox to local structure
2017-11-20 18:03:31 +03:00
Wu Zhiwen
45d11dde57
dnn(ocl4dnn): add fusion support for Power activation and eltwise add
...
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-20 14:58:53 +08:00
Wu Zhiwen
394101d6ed
dnn(ocl4dnn): Fix relu fusion bug
...
Incorrect type of negative_slope resulted in this bug.
Also add an OCL test for darknet to validate this patch.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-17 16:21:56 +08:00
Wu Zhiwen
88e6daa315
dnn(ocl4dnn): Fix wrong measurement for tuning time
...
The convolution kernel uses the default queue to run, so ocl::Timer,
which measures the kernel run time, should use the default queue too.
Also remove a useless parameter of convolve()
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-16 13:09:57 +08:00
Li Peng
55260a8d3c
reshape mat before doing computation in fc layer
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-13 09:29:50 +08:00
Alexander Alekhin
bafdc44d37
Merge pull request #10061 from Sahloul:dnn_torch_fix
2017-11-10 05:05:52 +00:00
Alexander Alekhin
8a3a75cc16
Merge pull request #9882 from pengli:ocl4dnn
2017-11-09 18:54:43 +00:00
Hamdi Sahloul
06bda58a2c
DNN Torch - workaround when torch importer is disabled
2017-11-10 00:44:06 +09:00
Li Peng
8f99083726
Add new layer forward interface
...
Add layer forward interface with InputArrayOfArrays and
OutputArrayOfArrays parameters, it allows UMat buffer to be
processed and transferred in the layers.
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-09 15:59:39 +08:00
Alexander Alekhin
97181a90ba
dnn(ocl4dnn/conv): bailout on missing kernel configuration
2017-11-07 17:02:17 +03:00
Alexander Alekhin
6e4f9433d0
Merge pull request #9998 from alalek:ocl_fix_dnn_softmax_9991
2017-11-03 09:16:39 +00:00
Dmitry Kurtaev
20a2dc6ac5
Fix multiple inputs models from Caffe.
...
Fixed Concat optimization.
2017-11-02 18:55:08 +03:00
Alexander Alekhin
bacc96f4e8
dnn(ocl): fix softmax global/local size consistency
2017-11-02 17:08:40 +03:00
Dmitry Kurtaev
14af2a0c0c
Fixed Halide's copy_to_device invocation
2017-11-01 14:01:54 +03:00
Vadim Pisarevsky
bc348eb8ab
Merge pull request #9963 from dkurt:fix_caffe_shrinker
2017-10-31 12:27:19 +00:00
Dmitry Kurtaev
e1ebc4e991
Specify layer types for Caffe FP32->FP16 weights converter
2017-10-31 12:31:40 +03:00
Dmitry Kurtaev
03cefa7bfe
Set zero confidences in case of no detections
2017-10-30 10:17:57 +03:00
Vadim Pisarevsky
e0e40405ed
Merge pull request #9847 from wzw-intel:ocl4dnn_fusion
2017-10-27 13:59:46 +00:00
Vadim Pisarevsky
ff037ebe5f
Merge pull request #9845 from dkurt:fast_neural_style_models
2017-10-27 13:59:02 +00:00
Vadim Pisarevsky
5384d2f090
Merge pull request #9880 from dkurt:caffe_ceil_mode
2017-10-27 11:51:46 +00:00
Dmitry Kurtaev
4b52b8df34
Layers for fast-neural-style models: https://github.com/jcjohnson/fast-neural-style
2017-10-27 14:26:45 +03:00
Vadim Pisarevsky
bc93775385
Merge pull request #9862 from sovrasov:dnn_nms
2017-10-27 11:19:57 +00:00
Vadim Pisarevsky
825c0ffdb4
Merge pull request #9874 from dkurt:fix_identity_permute_layer
2017-10-27 11:11:48 +00:00
Vadim Pisarevsky
69f2590359
Merge pull request #9921 from dkurt:fix_prelu_after_fully_connected
2017-10-27 11:10:59 +00:00
Vadim Pisarevsky
7b8fb64f21
Merge pull request #9939 from alalek:fix_dnn_getUMat_crash
2017-10-27 11:06:22 +00:00
Vladislav Sovrasov
5bf39ceb5d
dnn: handle 4-channel images in blobFromImage (#9944)
2017-10-27 14:06:53 +03:00
Alexander Alekhin
436a1f72a5
dnn: fix sporadic crashes in getUMat()
...
Incorrect "total" buffer size calculated in StdMatAllocator::allocate() due to wrong step values.
2017-10-25 18:07:05 +03:00
Vladislav Sovrasov
7e3e9144de
dnn: add an accuracy test for NMS
2017-10-25 13:40:56 +03:00
Vladislav Sovrasov
c704942b8a
dnn: add documentation for NMS, fix missing experimental namespace
2017-10-25 13:35:49 +03:00
Vladislav Sovrasov
acedb4a579
dnn: make NMS function public
2017-10-25 13:35:49 +03:00
Dmitry Kurtaev
a36ebaecdc
PReLU layer for multidimensional input
2017-10-23 16:13:03 +03:00
Alexander Alekhin
185faf99bd
ocl: simplify ocl::Timer interface
2017-10-18 16:01:21 +03:00
Dmitry Kurtaev
b903ff8992
Ceil mode from experimental version of Caffe, https://github.com/BVLC/caffe/pull/3057
2017-10-18 14:04:53 +03:00
Dmitry Kurtaev
a3a446c197
Output blobs shapes initialization in case of identity permutation (NCHW->NCHW)
2017-10-17 17:15:25 +03:00
Wu Zhiwen
2d8f2c2aea
dnn(ocl4dnn): add fusion support
...
ocl4dnn supports following fusion styles:
Conv + [BN] + [Scale] + [ReLU/PReLU]
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-10-16 19:18:36 +08:00
Maksim Shabunin
b066dd36ff
Fixed uninitialized class fields
2017-10-16 13:47:43 +03:00
Alexander Alekhin
4857cae6ed
dnn: don't use "experimental_dnn_v1" namespace directly
2017-10-12 18:16:53 +03:00
Alexander Alekhin
df5b2224d7
Merge pull request #9829 from pengli:ocl4dnn
2017-10-12 11:26:20 +00:00
Li Peng
937b8e4277
dnn(ocl4dnn): support log softmax in ocl4dnn
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-10-12 09:51:13 +08:00
Vadim Pisarevsky
e356ca2369
Merge pull request #9835 from sovrasov:blob_from_img_crop_opt
2017-10-11 17:18:40 +00:00
Vadim Pisarevsky
8b168175ec
Merge pull request #9636 from dkurt:duplicate_lp_norm_layer
2017-10-11 13:36:14 +00:00
Vadim Pisarevsky
0873ebb9b0
Merge pull request #9820 from sovrasov:text_detector_dnn
2017-10-11 13:31:46 +00:00
Vadim Pisarevsky
babd21c764
Merge pull request #9823 from alalek:dnn_halide_bypass_tbb_threads
2017-10-11 13:28:38 +00:00
Vladislav Sovrasov
47e1133e71
dnn: add crop flag to blobFromImage
2017-10-11 15:46:20 +03:00
Vladislav Sovrasov
f7175f5050
dnn: fix additional text boxes handling after the latest adaptations for TF
2017-10-11 14:04:48 +03:00
Vladislav Sovrasov
050916fd6b
dnn: modify priorBox layer
2017-10-11 11:43:50 +03:00
Dmitry Kurtaev
905a9dada2
Removed LPNormalize layer.
2017-10-10 20:38:55 +03:00
Alexander Alekhin
3935e13603
dnn(halide): don't compile Halide via parallel_for_()
...
To avoid problem with reduced stack size of inner threads.
2017-10-10 18:06:03 +03:00
Vadim Pisarevsky
b7ff9ddcdd
Merge pull request #9705 from AlexeyAB:dnn_darknet_yolo_v2
2017-10-10 12:02:03 +00:00
Vadim Pisarevsky
046045239c
Merge pull request #9750 from dkurt:feature_dnn_tf_text_graph
2017-10-10 10:06:24 +00:00
AlexeyAB
ecc34dc521
Added DNN Darknet Yolo v2 for object detection
2017-10-09 21:08:44 +03:00
Dmitry Kurtaev
eabf728682
PReLU layer from Caffe
2017-10-09 20:30:37 +03:00
Vadim Pisarevsky
fee87ea3f7
Merge pull request #9800 from alalek:fix_build_msvs2010
2017-10-09 12:33:08 +00:00
Vadim Pisarevsky
6a80834ed4
Merge pull request #9803 from wzw-intel:ocl_timer
2017-10-09 12:11:22 +00:00
Maksim Shabunin
5a22d81fe5
Fixed warnings produced by static analyzer
2017-10-09 13:37:18 +03:00
Wu Zhiwen
dbe9ee0924
ocl: simplify ocl::Timer
...
Use clFinish to guarantee that commands have completed, instead of waiting for events.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-10-09 13:48:38 +08:00
Alexander Alekhin
e615fafe2d
build: fix MSVS2010
2017-10-08 23:32:22 +03:00
Dmitry Kurtaev
e4aa39f9e5
Text TensorFlow graphs parsing. MobileNet-SSD for 90 classes.
2017-10-08 22:25:29 +03:00
Vadim Pisarevsky
21bd834a59
Merge pull request #9772 from dkurt:fix_caffe_eltwise_and_fc_layers
2017-10-06 13:47:54 +00:00
Vadim Pisarevsky
b969d86415
Merge pull request #9787 from dkurt:feature_dnn_resize_nearest_neighbor
2017-10-06 13:46:50 +00:00
Vadim Pisarevsky
fe58b58937
Merge pull request #9778 from dkurt:dnn_colorization
2017-10-06 11:48:05 +00:00
Dmitry Kurtaev
b9f94c9315
Nearest neighbor resize layer
2017-10-06 14:33:26 +03:00
Dmitry Kurtaev
e268606e26
Grayscale colorization model ( https://github.com/richzhang/colorization ) test.
2017-10-06 09:33:41 +03:00
Dmitry Kurtaev
ad8bbaf008
Multidimensional eltwise layer.
...
Fixed fully-connected layer axis.
2017-10-04 14:01:44 +03:00
Dmitry Kurtaev
2a21c10837
Fix TensorFlow split layer
2017-10-02 22:44:42 +03:00
pengli
e340ff9c3a
Merge pull request #9114 from pengli:dnn_rebase
...
add libdnn acceleration to dnn module (#9114)
* import libdnn code
Signed-off-by: Li Peng <peng.li@intel.com>
* add convolution layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add pooling layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add softmax layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add lrn layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add innerproduct layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add HAVE_OPENCL macro
Signed-off-by: Li Peng <peng.li@intel.com>
* fix for convolution ocl
Signed-off-by: Li Peng <peng.li@intel.com>
* enable getUMat() for multi-dimension Mat
Signed-off-by: Li Peng <peng.li@intel.com>
* use getUMat for ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* use CV_OCL_RUN macro
Signed-off-by: Li Peng <peng.li@intel.com>
* set OPENCL target when it is available
and disable fuseLayer for OCL target for the time being
Signed-off-by: Li Peng <peng.li@intel.com>
* fix innerproduct accuracy test
Signed-off-by: Li Peng <peng.li@intel.com>
* remove trailing space
Signed-off-by: Li Peng <peng.li@intel.com>
* Fixed tensorflow demo bug.
The root cause is that TensorFlow uses a different algorithm than libdnn
to calculate the convolution output dimension.
libdnn no longer calculates the output dimension and just uses the one
passed in by the config.
* split gemm ocl file
split it into gemm_buffer.cl and gemm_image.cl
Signed-off-by: Li Peng <peng.li@intel.com>
* Fix compile failure
Signed-off-by: Li Peng <peng.li@intel.com>
* check env flag for auto tuning
Signed-off-by: Li Peng <peng.li@intel.com>
* switch to new ocl kernels for softmax layer
Signed-off-by: Li Peng <peng.li@intel.com>
* update softmax layer
on some platforms the subgroup extension may not work well,
so fall back to non-subgroup ocl acceleration.
Signed-off-by: Li Peng <peng.li@intel.com>
* fallback to cpu path for fc layer with multi output
Signed-off-by: Li Peng <peng.li@intel.com>
* update output message
Signed-off-by: Li Peng <peng.li@intel.com>
* update fully connected layer
fallback to gemm API if libdnn return false
Signed-off-by: Li Peng <peng.li@intel.com>
* Add ReLU OCL implementation
* disable layer fusion for now
Signed-off-by: Li Peng <peng.li@intel.com>
* Add OCL implementation for concat layer
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
* libdnn: update license and copyrights
Also refine libdnn coding style
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>
* DNN: Don't link OpenCL library explicitly
* DNN: Make default preferableTarget to DNN_TARGET_CPU
Users should set it to DNN_TARGET_OPENCL explicitly if they want to
use OpenCL acceleration.
Also don't fuse layers when using DNN_TARGET_OPENCL
* DNN: refine coding style
* Add getOpenCLErrorString
* DNN: Use int32_t/uint32_t instead of aliases
* Use namespace ocl4dnn to include libdnn things
* remove extra copyTo in softmax ocl path
Signed-off-by: Li Peng <peng.li@intel.com>
* update ReLU layer ocl path
Signed-off-by: Li Peng <peng.li@intel.com>
* Add prefer target property for layer class
It is used to indicate the target for layer forwarding,
either the default CPU target or OCL target.
Signed-off-by: Li Peng <peng.li@intel.com>
* Add cl_event based timer for cv::ocl
* Rename libdnn to ocl4dnn
Signed-off-by: Li Peng <peng.li@intel.com>
Signed-off-by: wzw <zhiwen.wu@intel.com>
* use UMat for ocl4dnn internal buffer
Remove allocateMemory which use clCreateBuffer directly
Signed-off-by: Li Peng <peng.li@intel.com>
Signed-off-by: wzw <zhiwen.wu@intel.com>
* enable buffer gemm in ocl4dnn innerproduct
Signed-off-by: Li Peng <peng.li@intel.com>
* replace int_tp globally for ocl4dnn kernels.
Signed-off-by: wzw <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>
* create UMat for layer params
Signed-off-by: Li Peng <peng.li@intel.com>
* update sign ocl kernel
Signed-off-by: Li Peng <peng.li@intel.com>
* update image based gemm of inner product layer
Signed-off-by: Li Peng <peng.li@intel.com>
* remove buffer gemm of inner product layer
call cv::gemm API instead
Signed-off-by: Li Peng <peng.li@intel.com>
* change ocl4dnn forward parameter to UMat
Signed-off-by: Li Peng <peng.li@intel.com>
* Refine auto-tuning mechanism.
- Use OPENCV_OCL4DNN_KERNEL_CONFIG_PATH to set cache directory
for fine-tuned kernel configuration.
e.g. export OPENCV_OCL4DNN_KERNEL_CONFIG_PATH=/home/tmp,
the cache directory will be /home/tmp/spatialkernels/ on Linux.
- Define environment OPENCV_OCL4DNN_ENABLE_AUTO_TUNING to enable
auto-tuning.
- OPENCV_OPENCL_ENABLE_PROFILING is only used to enable profiling
for the OpenCL command queue. This fixes the basic kernel reporting a wrong
running time, i.e. 0ms.
- If creating cache directory failed, disable auto-tuning.
* Detect and create cache dir on windows
Signed-off-by: Li Peng <peng.li@intel.com>
* Refine gemm like convolution kernel.
Signed-off-by: Li Peng <peng.li@intel.com>
* Fix redundant swizzleWeights calls when using a cached kernel config.
* Fix "out of resource" bug when auto-tuning too many kernels.
* replace cl_mem with UMat in ocl4dnnConvSpatial class
* OCL4DNN: reduce the tuning kernel candidate.
This patch could reduce 75% of the tuning candidates with less
than 2% performance impact for the final result.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
* replace cl_mem with umat in ocl4dnn convolution
Signed-off-by: Li Peng <peng.li@intel.com>
* remove weight_image_ of ocl4dnn inner product
Actually it is unused in the computation
Signed-off-by: Li Peng <peng.li@intel.com>
* Various fixes for ocl4dnn
1. OCL_PERFORMANCE_CHECK(ocl::Device::getDefault().isIntel())
2. Ptr<OCL4DNNInnerProduct<float> > innerProductOp
3. Code comments cleanup
4. ignore check on OCL cpu device
Signed-off-by: Li Peng <peng.li@intel.com>
* add build option for log softmax
Signed-off-by: Li Peng <peng.li@intel.com>
* remove unused ocl kernels in ocl4dnn
Signed-off-by: Li Peng <peng.li@intel.com>
* replace ocl4dnnSet with opencv setTo
Signed-off-by: Li Peng <peng.li@intel.com>
* replace ALIGN with cv::alignSize
Signed-off-by: Li Peng <peng.li@intel.com>
* check kernel build options
Signed-off-by: Li Peng <peng.li@intel.com>
* Handle program compilation fail properly.
* Use std::numeric_limits<float>::infinity() for large float number
* check ocl4dnn kernel compilation result
Signed-off-by: Li Peng <peng.li@intel.com>
* remove unused ctx_id
Signed-off-by: Li Peng <peng.li@intel.com>
* change clEnqueueNDRangeKernel to kernel.run()
Signed-off-by: Li Peng <peng.li@intel.com>
* change cl_mem to UMat in image based gemm
Signed-off-by: Li Peng <peng.li@intel.com>
* check intel subgroup support for lrn and pooling layer
Signed-off-by: Li Peng <peng.li@intel.com>
* Fix convolution bug if group is greater than 1
Signed-off-by: Li Peng <peng.li@intel.com>
* Set default layer preferableTarget to be DNN_TARGET_CPU
Signed-off-by: Li Peng <peng.li@intel.com>
* Add ocl perf test for convolution
Signed-off-by: Li Peng <peng.li@intel.com>
* Add more ocl accuracy test
Signed-off-by: Li Peng <peng.li@intel.com>
* replace cl_image with ocl::Image2D
Signed-off-by: Li Peng <peng.li@intel.com>
* Fix build failure in elementwise layer
Signed-off-by: Li Peng <peng.li@intel.com>
* use getUMat() to get blob data
Signed-off-by: Li Peng <peng.li@intel.com>
* replace cl_mem handle with ocl::KernelArg
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(build): don't use C++11, OPENCL_LIBRARIES fix
* dnn(ocl4dnn): remove unused OpenCL kernels
* dnn(ocl4dnn): extract OpenCL code into .cl files
* dnn(ocl4dnn): refine auto-tuning
Disable auto-tuning by default; set the OPENCV_OCL4DNN_ENABLE_AUTO_TUNING
environment variable to enable it.
Use a set of pre-tuned configs as default config if auto-tuning is disabled.
These configs are tuned for Intel GPU with 48/72 EUs, and for googlenet,
AlexNet, ResNet-50
If default config is not suitable, use the first available kernel config
from the candidates. Candidate priority from high to low is gemm like kernel,
IDLF kernel, basic kernel.
* dnn(ocl4dnn): pooling doesn't use OpenCL subgroups
* dnn(ocl4dnn): fix perf test
OpenCV has default 3sec time limit for each performance test.
Warmup OpenCL backend outside of perf measurement loop.
* use ocl::KernelArg as much as possible
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(ocl4dnn): fix bias bug for gemm like kernel
* dnn(ocl4dnn): wrap cl_mem into UMat
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(ocl4dnn): Refine signature of kernel config
- Use a more readable string as the signature of kernel config
- Don't count device name and vendor in signature string
- Default kernel configurations are tuned for Intel GPU with
24/48/72 EUs, and for googlenet, AlexNet, ResNet-50 net model.
* dnn(ocl4dnn): swap width/height in configuration
* dnn(ocl4dnn): enable configs for Intel OpenCL runtime only
* core: make configuration helper functions accessible from non-core modules
* dnn(ocl4dnn): update kernel auto-tuning behavior
Avoid unwanted creation of directories
* dnn(ocl4dnn): simplify kernel to workaround OpenCL compiler crash
* dnn(ocl4dnn): remove redundant code
* dnn(ocl4dnn): Add a clearer message for SIMD size mismatch.
* dnn(ocl4dnn): add const to const argument
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(ocl4dnn): force compiler use a specific SIMD size for IDLF kernel
* dnn(ocl4dnn): drop unused tuneLocalSize()
* dnn(ocl4dnn): specify OpenCL queue for Timer and convolve() method
* dnn(ocl4dnn): sanitize file names used for cache
* dnn(perf): enable Network tests with OpenCL
* dnn(ocl4dnn/conv): drop computeGlobalSize()
* dnn(ocl4dnn/conv): drop unused fields
* dnn(ocl4dnn/conv): simplify ctor
* dnn(ocl4dnn/conv): refactor kernelConfig localSize=NULL
* dnn(ocl4dnn/conv): drop unsupported double / untested half types
* dnn(ocl4dnn/conv): drop unused variable
* dnn(ocl4dnn/conv): alignSize/divUp
* dnn(ocl4dnn/conv): use enum values
* dnn(ocl4dnn): drop unused innerproduct variable
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(ocl4dnn): add a generic function to check cl option support
* dnn(ocl4dnn): run softmax subgroup version kernel first
Signed-off-by: Li Peng <peng.li@intel.com>
2017-10-02 15:38:00 +03:00
Vadim Pisarevsky
5e93c82023
Merge pull request #9491 from dkurt:tf_lstm
2017-09-28 21:04:06 +00:00
Vadim Pisarevsky
68cc2e292d
Merge pull request #9734 from dkurt:fix_deconv_layer_kernel_layout
2017-09-28 11:42:57 +00:00
Vadim Pisarevsky
45365e4df1
Merge pull request #9691 from dkurt:padding_layer_refactoring
2017-09-28 11:34:28 +00:00
Dmitry Kurtaev
6e593cd1f0
Swap dimensions of deconvolution kernel
2017-09-27 22:38:34 +03:00
Alexander Alekhin
3dee92ec50
fix usage of CV_FMA3 macro
2017-09-26 17:23:54 +03:00
Dmitry Kurtaev
84cec17913
LSTM layer for TensorFlow importer
2017-09-26 12:59:36 +03:00
Dmitry Kurtaev
222149b9c6
Refactored Padding layer
2017-09-22 12:39:00 +03:00
Dmitry Kurtaev
17a85b16fc
Remove reorder_dims attribute of Reshape layer
2017-09-21 16:42:03 +03:00
Dmitry Kurtaev
bdd8cc697a
Import wrapped Dropout subgraphs from TensorFlow
2017-09-20 13:30:25 +03:00
Vadim Pisarevsky
f7df5dd32c
Merge pull request #9305 from dkurt:public_dnn_importer_is_deprecated
2017-09-18 09:35:35 +00:00
Vadim Pisarevsky
e012ccda4a
Merge pull request #9517 from dkurt:tf_mobilenet
2017-09-18 09:31:19 +00:00
Dmitry Kurtaev
bd8e6b7e14
Mark external cv::dnn::Importer usage as deprecated
2017-09-18 08:52:36 +03:00
Dmitry Kurtaev
d891e9b1d8
Layers for MobileNet from TensorFlow
2017-09-15 20:17:30 +03:00
Dmitry Kurtaev
8646d5fb84
FP16 Caffe models import and export
2017-09-15 18:06:34 +03:00
Vadim Pisarevsky
6bf8fe815d
Merge pull request #9384 from dkurt:torch_split
2017-09-15 13:05:05 +00:00
Vadim Pisarevsky
41b23fde9f
Merge pull request #9524 from dkurt:dnn_torch_openface
2017-09-15 12:38:12 +00:00
Dmitry Kurtaev
0ce7c33bc8
Torch's Concat and ConcatTable don't use Split layer
2017-09-14 09:26:57 +03:00
Dmitry Kurtaev
7dc6b1d7d4
Layers for OpenFace face recognition network
2017-09-14 09:11:31 +03:00
Dmitry Kurtaev
58b890b9f7
Dilated convolution import from TensorFlow
2017-09-13 18:44:14 +03:00
Vadim Pisarevsky
8c7f19850f
Merge pull request #9576 from dkurt:feature_dnn_tf_importer_fp16
2017-09-13 13:57:38 +00:00
Vadim Pisarevsky
93c3f20deb
Merge pull request #9569 from dkurt:test_dnn_ssd_halide
2017-09-13 13:29:50 +00:00
Dmitry Kurtaev
ce41a15437
Import and convert FP16 weights from TensorFlow
2017-09-12 09:50:55 +03:00
Maksim Shabunin
248e2c7d47
Fixed some issues found by static analysis
2017-09-08 12:22:12 +03:00
Dmitry Kurtaev
cad7c4d51d
MobileNet-SSD and VGG-SSD topologies in Halide
2017-09-08 09:55:53 +03:00
Alexander Alekhin
3202bbe17c
Merge pull request #9349 from dkurt:tf_deconv
2017-08-24 15:58:38 +00:00
Alexander Alekhin
8e7e24ac80
Merge pull request #9394 from dkurt:fix_halide_wrapper
2017-08-24 15:56:54 +00:00
Alexander Alekhin
25a4559565
Merge pull request #9294 from arrybn:layers_perf
2017-08-24 09:37:49 +00:00
Aleksandr Rybnikov
8b1146deb2
Added function to get timings for layers
2017-08-23 13:40:05 +03:00
Dmitry Kurtaev
54f0616a13
Deconvolution layer from TensorFlow
2017-08-21 21:38:07 +03:00
Dmitry Kurtaev
4e28c00e7b
Fix Halide buffer behavior in case of OpenCL device memory allocation
2017-08-17 13:27:54 +03:00
dkurt
339793143c
Unit tests for TensorFlow importer
2017-08-03 11:29:48 +03:00
Alexander Alekhin
0bd357e7ec
Merge pull request #9296 from dkurt:halide_device_interface
2017-08-02 20:26:30 +00:00
dkurt
b1ef44b1ac
Replace halide_opencl_device_interface
2017-08-02 20:38:30 +03:00
Aleksandr Rybnikov
8d6b8b45b6
Added ELU and test for it
2017-08-02 11:13:59 +03:00
Alexander Alekhin
bab4bc0968
Merge pull request #9284 from ipuustin:dnn-opencl-fixes
2017-08-01 13:06:01 +00:00
Ismo Puustinen
c2de5cf735
dnn: force floating point literals to be float.
...
In the OpenCL code in activations.cl, force floating point literals
to be of type float. Otherwise the values are interpreted as
doubles, causing Beignet to have type conversion issues.
2017-08-01 15:02:24 +03:00
Alexander Alekhin
2959e7aba9
Merge pull request #9188 from arrybn:mobilenet_ssd_sample
2017-08-01 11:12:54 +00:00
Aleksandr Rybnikov
ce1cc352d9
MobileNet SSD sample
2017-08-01 12:30:27 +03:00
Alexander Alekhin
3f102e5d3a
dnn: protobuf shutdown
2017-07-26 17:21:46 +03:00
Alexander Alekhin
878a6906cc
dnn: fix torch importer memory leaks
2017-07-25 12:20:55 +03:00
Tomoaki Teshima
0f91faddae
fix linker error when trying CPU_BASELINE=AVX
2017-07-21 21:13:47 +09:00
Alexander Alekhin
ab58cac236
Merge pull request #9194 from tomoaki0705:fixBuildErrorDnn
2017-07-20 15:27:07 +00:00
Alexander Alekhin
08c94aa5c0
build: reuse int32_t workaround from softfloat.hpp
2017-07-20 14:01:21 +03:00
Tomoaki Teshima
1989bc33a7
fix build error on Visual Studio 2012
2017-07-20 11:00:04 +09:00
Aleksandr Rybnikov
7d1140340e
Rewrote googlenet tests
2017-07-18 18:49:14 +03:00
Vadim Pisarevsky
0488d9bdb2
optimize out scaleLayer & concatLayer whenever possible
...
fixed problem in concat layer by disabling memory re-use in layers with multiple inputs
trying to fix the tests when Halide is used to run deep nets
another attempt to fix Halide tests
see if the Halide tests will pass with concat layer fusion turned off
trying to fix failures in halide tests; another try
one more experiment to make halide_concat & halide_enet tests pass
continue attempts to fix halide tests
moving on
uncomment parallel concat layer
seemingly fixed failures in Halide tests and re-enabled concat layer fusion; thanks to dkurt for the patch
2017-07-14 18:30:53 +03:00
Alexander Alekhin
4784c7be5f
dnn: cleanup dispatched code, fix SIMD128 types
2017-07-13 19:00:34 +03:00
Alexander Alekhin
c3e6de293f
dnn: code cleanup, refactor detection output layer
2017-07-13 19:00:34 +03:00
Alexander Alekhin
544908d06c
dnn: some minor fixes in docs, indentation, unused code
2017-07-13 15:33:49 +03:00
Alexander Alekhin
520da7aaaf
Merge pull request #9111 from vpisarev:dnn_optim_avx1
2017-07-13 12:27:05 +00:00
dkurt
3203635765
Eltwise layer fixes
2017-07-10 12:58:11 +03:00
Vadim Pisarevsky
ed9564106c
reuse AVX2-optimized kernels for AVX1 CPUs (like IvyBridge)
2017-07-06 21:36:59 +03:00
abratchik
8f7181429f
add java wrappers to dnn module
2017-07-02 11:46:20 +04:00
Maksim Shabunin
e0393f8557
Fixed some issues found by static analysis (4th round)
2017-06-30 12:26:53 +03:00
Aleksandr Rybnikov
fab4f4b9d5
Disabled logging in caffe parser in release
2017-06-29 17:36:48 +03:00
Vadim Pisarevsky
ac49a17a82
Merge pull request #9022 from dkurt:keep_conv_weights_for_halide
2017-06-29 11:09:17 +00:00
Vadim Pisarevsky
fb1dcdd17d
Merge pull request #9029 from alalek:dnn_cleanup_torch
2017-06-29 11:07:35 +00:00
Maksim Shabunin
f1a56cb4b7
Merge pull request #9028 from alalek:dnn_experimental_namespace
2017-06-29 07:37:04 +00:00
Maksim Shabunin
ace0701a46
Merge pull request #9019 from alalek:dnn_trace
2017-06-29 07:33:46 +00:00
Alexander Alekhin
511e50c19c
dnn: cleanup torch integration code
2017-06-28 21:49:37 +00:00
Alexander Alekhin
324851882a
Merge pull request #9025 from mshabunin:fix-static-3
2017-06-28 20:50:21 +00:00
Alexander Alekhin
da0960321b
dnn: added "hidden" experimental namespace
...
The main purpose of this namespace is to avoid the use of incompatible
binaries that would cause application crashes.
This additional namespace will not impact the "Source code API".
This change allows ABI checks to be maintained (with easy filtering out).
2017-06-28 20:36:57 +00:00
Maksim Shabunin
a769d69a9d
Fixed several issues found by static analysis
2017-06-28 18:06:18 +03:00
dkurt
b46f5b1b38
Align convolutional layer weights separately from origin ones
2017-06-28 17:05:56 +03:00
Alexander Alekhin
ed10383359
dnn: added trace macros
2017-06-28 14:57:26 +03:00
Vadim Pisarevsky
c5faa9aefa
Merge pull request #9013 from arrybn:ssd_last_layers_optim
2017-06-28 10:38:55 +00:00
Vadim Pisarevsky
bbb14d3746
Merge pull request #9003 from dkurt:halide_bug_fixes
2017-06-28 08:48:27 +00:00
Aleksandr Rybnikov
ec321e651f
Removed usage of std::map in DetectionOutput layer
2017-06-28 11:31:38 +03:00
Vadim Pisarevsky
2ae849091c
Merge pull request #9009 from alalek:fix_dnn_initialization
2017-06-28 08:26:29 +00:00
Vadim Pisarevsky
8b3d6603d5
another round of dnn optimization (#9011)
...
* another round of dnn optimization:
* increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
* improved SIMD optimization of pooling layer, optimized average pooling
* cleaned up convolution layer implementation
* made activation layer "attachable" to all other layers, including fully connected and addition layer.
* fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it describes the topology.
* greatly optimized permutation layer, which improved SSD performance
* parallelized element-wise binary/ternary/... ops (sum, prod, max)
* also, added missing copyrights to many of the layer implementation files
* temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders
2017-06-28 11:15:22 +03:00
Alexander Alekhin
00dd433368
dnn: fix LayerFactory initialization
2017-06-27 23:19:53 +03:00
Alexander Alekhin
f8a75c4361
dispatch: added CV_TRY_${OPT} macro, fix dnn build
...
- 1: OPT is available directly or via dispatcher
- 0: optimization is not compiled at all
2017-06-27 17:05:15 +03:00
dkurt
121789f78e
Fixed some bugs from Halide tests
2017-06-27 14:52:46 +03:00
Alexander Alekhin
16d1bbf2ea
dnn: fix build
...
- winpack
- opencv_world
2017-06-27 09:07:01 +03:00
Alexander Alekhin
986d27e49c
dnn: fix failed Torch tests
...
"Torch invalid argument 2: position must be smaller than LLONG_MAX"
These conditions are always true for "long position" argument.
2017-06-26 22:02:22 +03:00
Alexander Alekhin
93091ba203
dnn: AVX2 fix invalid unaligned read
2017-06-26 19:48:42 +03:00
Alexander Alekhin
93729784bb
dnn: move module from opencv_contrib
...
e6f63c7a38/modules/dnn
2017-06-26 13:41:51 +03:00