Alexander Alekhin
a65b5df5da
Merge pull request #10416 from fenrus75:avx512
2017-12-28 15:56:56 +00:00
Alexander Alekhin
2b3c140f04
Merge pull request #10436 from alalek:test_threads
2017-12-28 18:29:30 +03:00
Alexander Alekhin
898ca38257
cmake: AVX512 -> AVX_512F
2017-12-28 15:20:27 +00:00
Dmitry Kurtaev
a9807d8f54
Allocate new memory for optimized concat to prevent collisions.
...
Add a flag to disable memory reusing in dnn module.
2017-12-28 16:45:53 +03:00
Li Peng
00f03c5739
Add ocl version FasterRCNN accuracy test
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-28 19:15:15 +08:00
Alexander Alekhin
99a9c10b57
Merge pull request #10424 from dkurt:fix_concat_optim
2017-12-28 01:26:14 +00:00
Alexander Alekhin
9b131b5f7e
dnn(test): avoid calling of cv::setNumThreads() in tests directly
...
It is not necessary by default.
Also it breaks test system command-line parameters: --perf_threads / --test_threads
2017-12-27 15:16:41 +00:00
Alexander Alekhin
f3880c60a6
Merge pull request #10428 from pengli:dnn
2017-12-27 13:18:10 +00:00
Arjan van de Ven
2938860b3f
Provide a few AVX512 optimized functions for the DNN module
...
This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.
AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2017-12-26 16:00:17 +00:00
Dmitry Kurtaev
70c605a03d
Limit Concat layer optimization
2017-12-26 16:49:33 +03:00
Li Peng
84e2fa79a0
dnn(ocl4dnn): update pre-tuned kernel config
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-26 20:14:41 +08:00
Alexander Alekhin
adf43e7d2a
build: fix MSVS2010 build error
2017-12-23 00:06:34 +00:00
Alexander Alekhin
019b7c5a66
Merge pull request #10402 from dkurt:dnn_tf_quantized
2017-12-22 15:58:56 +00:00
Alexander Alekhin
59e825ee02
Merge pull request #10385 from pengli:dnn
2017-12-22 15:48:40 +00:00
Dmitry Kurtaev
bcc669f3f7
TensorFlow weights dequantization
2017-12-22 17:25:10 +03:00
Alexander Alekhin
97af608030
Merge pull request #10397 from mshabunin:fix-incorrect-assert
2017-12-22 14:07:02 +00:00
Li Peng
181b448c4d
add one more convolution kernel tuning candidate
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-22 21:37:00 +08:00
Vadim Pisarevsky
0742e12f0b
Merge pull request #10265 from dkurt:nms_for_region_layer
2017-12-22 13:29:37 +00:00
Maksim Shabunin
aa46e31c6d
Replaced incorrect CV_Assert calls with CV_Error
2017-12-22 15:20:13 +03:00
Vadim Pisarevsky
325cbd7c84
Merge pull request #10364 from dkurt:dnn_smooth_tf_data_layout
2017-12-22 09:56:45 +00:00
Li Peng
c5fc8e03ff
cleanup unnecessary macros in convolution ocl kernel
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-21 20:32:36 +08:00
Li Peng
0aa5e43a14
refactor candidate generation of convolution auto-tuning
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-21 23:05:54 +08:00
Dmitry Kurtaev
c67e75b68f
Refactor NMS procedure at RegionLayer
2017-12-21 12:21:45 +03:00
Vadim Pisarevsky
eecb64a973
Merge pull request #10331 from arrybn:python_dnn_net
2017-12-20 14:30:27 +00:00
Dmitry Kurtaev
7e48fa58eb
Manage TensorFlow's NHWC data layout is smoother
2017-12-20 14:13:40 +03:00
Dmitry Kurtaev
0ed2cbc931
R-FCN models support
2017-12-20 10:43:22 +03:00
Alexander Alekhin
dcdd6af5a8
Merge pull request #10341 from pengli:dnn
2017-12-19 14:04:55 +00:00
Li Peng
3b84acfc48
add ocl accuracy test for tf mobilenet ssd
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 18:38:55 +08:00
Li Peng
436d7e4eaf
add depthwise convolution kernel
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 17:59:13 +08:00
Li Peng
910d7dab1f
prior box layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 17:44:10 +08:00
Dmitry Kurtaev
6aabd6cc7a
Remove cv::dnn::Importer
2017-12-18 18:08:28 +03:00
Alexander Rybnikov
19c914db51
Changed wrapping mode for cv::dnn::Net::forward
2017-12-18 15:56:09 +03:00
Dmitry Kurtaev
2b43d4f477
Fix default pooling layer type
2017-12-17 16:46:40 +03:00
Alexander Alekhin
3fddce67c6
experimental version++
2017-12-16 01:30:36 +03:00
Maksim Shabunin
1033f2b1bd
Fixed 3 issues found by static analysis
2017-12-15 17:29:26 +03:00
Vadim Pisarevsky
62359f70ff
Merge pull request #10306 from dkurt:faster_rcnn
2017-12-15 12:23:53 +00:00
Dmitry Kurtaev
08112f3821
Faster-RCNN models support
2017-12-15 12:16:21 +03:00
Alexander Alekhin
0da947e6b3
dnn: more debug information
2017-12-14 19:21:17 +03:00
Alexander Alekhin
c231472ad6
Merge pull request #10290 from tomoaki0705:fixVS2012Round
2017-12-13 15:30:21 +00:00
Tomoaki Teshima
ecb6bcf2e0
fix build error on Visual Studio 2012
...
* round doesn't exists in standard library of Visual Studio 2012
* apply the correct computation of ROI
2017-12-13 17:40:07 +03:00
Vitaly Tuzov
51cb56ef2c
Implementation of bit-exact resize. Internal calls to linear resize updated to use bit-exact version. ( #9468 )
2017-12-13 15:00:38 +03:00
Alexander Alekhin
eff42f6387
dnn: more debug info
2017-12-12 12:04:10 +03:00
Vadim Pisarevsky
7e680bd9ff
Merge pull request #10215 from dkurt:dnn_js
2017-12-11 12:47:52 +00:00
Vadim Pisarevsky
c24f10d647
Merge pull request #10268 from dkurt:fix_scale_layer
2017-12-08 18:46:50 +00:00
Dmitry Kurtaev
f503515082
JavaScript bindings for dnn module
2017-12-08 18:33:48 +03:00
Dmitry Kurtaev
e307065c8e
Scale layer in case of 2D inputs
2017-12-08 17:34:59 +03:00
Alexander Alekhin
f2070c9f5d
Merge pull request #10255 from dkurt:dnn_roi_pooling
2017-12-08 11:20:07 +00:00
Dmitry Kurtaev
17dcf0e82d
ROIPooling layer
2017-12-07 19:04:38 +03:00
Dmitry Kurtaev
ef0650179b
Fix conv/deconv/fc layers FLOPS computation
2017-12-07 11:42:04 +03:00
Alexander Alekhin
6074f92d48
Merge pull request #10228 from pengli:dnn_new
2017-12-06 15:50:12 +00:00
Alexander Alekhin
0b688cd23f
Merge pull request #10240 from alalek:dnn_perf_ssd
2017-12-06 15:41:18 +00:00
Li Peng
59cbaca4d3
detection_output layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-06 22:35:59 +08:00
Li Peng
66feea6cac
region layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
7707c9bfba
reorg layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
85b1c4060c
support axis in concat layer ocl path
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:46 +08:00
Li Peng
07bec6bdcd
reshape layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-07 02:26:40 +08:00
Alexander Alekhin
d8a737b4b0
dnn: SSD performance test
2017-12-06 15:55:18 +03:00
Li Peng
7b7033ac60
permute layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-05 22:10:05 +08:00
Dmitry Kurtaev
bbbec300a6
nn.BatchNormalization and nn.Dropout layers from Torch
2017-12-04 12:57:21 +03:00
Alexander Alekhin
cc2ee923e4
Merge pull request #10164 from pengli:dnn
2017-11-29 12:05:10 +00:00
Wu Zhiwen
1f465a0ef9
dnn(ocl4dnn): fuseLayer() use umat_input/outputBlobs for OpenCL target
...
Also, fix bug when use OPENCL target but no OpenCL runtime
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-27 22:25:53 +08:00
Li Peng
a47fbd2610
Add ocl accuracy test for a few dnn nets
...
They are alexnet, mobilenet-ssd, resnet50, squeezeNet_v1_1,
yolo and fast_neural_style.
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-27 23:33:21 +08:00
Dmitry Kurtaev
99ed085752
Update PriorBox layer
2017-11-27 16:47:20 +03:00
Alexander Alekhin
13f374660f
dnn(ocl4dnn): drop unused batch_size_ in pooling
2017-11-23 20:46:56 +00:00
Alexander Alekhin
e34b64c979
dnn(ocl4dnn): refactor pooling OpenCL calls
2017-11-23 20:46:44 +00:00
Alexander Alekhin
f071a48ec7
Merge pull request #10143 from pengli:ocl4dnn
2017-11-23 18:47:14 +00:00
Alexander Alekhin
107582c767
Merge pull request #9996 from dkurt:dnn_multiple_inputs
2017-11-23 18:22:37 +00:00
Li Peng
636d6368ee
use OutputArrayOfArrays in net forward interface
...
It allows umat buffers used in net forward interface
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-24 02:19:10 +08:00
Wu, Zhiwen
04edc8fe3a
cleanup ocl4dnn spatial convolution kernels
...
remove unused macros and half definition macros,
also remove unused ocl::Queue
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-24 02:19:10 +08:00
Alexander Alekhin
49a5280198
Merge pull request #10139 from alalek:dnn_rename_caffe_proto_package
2017-11-23 14:10:42 +00:00
Alexander Alekhin
f37f4cf3b4
Merge pull request #9994 from r2d3:dnn_memory_load
2017-11-22 18:15:00 +00:00
Alexander Alekhin
e7d62d6ef3
Merge pull request #10126 from alalek:dnn_issue_10125
2017-11-22 18:03:51 +00:00
Alexander Alekhin
b29893b938
dnn: autogenerated files
2017-11-22 18:34:07 +03:00
Alexander Alekhin
1c88a566e0
dnn: rename caffe protobuf package
2017-11-22 18:34:07 +03:00
Alexander Alekhin
9db5cbf9a4
dnn: sync output/internals blobs back
2017-11-22 14:00:58 +03:00
Vadim Pisarevsky
f8ad289311
Merge pull request #10092 from alalek:dnn_rename_caffe_proto
2017-11-22 08:16:20 +00:00
Alexander Alekhin
0f34628af7
dnn: drop OpenCL code path for DetectionOutputLayer
...
getUMat()/getMat() calls are scope based. Results of these calls can't be
stored somewhere for future usage.
2017-11-21 17:28:42 +03:00
Alexander Alekhin
438e456ce9
Merge pull request #10113 from wzw-intel:fusion
2017-11-20 18:13:33 +00:00
Alexander Alekhin
f19f2bbcde
dnn: autogenerated files
...
rename caffe.proto => opencv-caffe.proto
2017-11-20 19:04:02 +03:00
Alexander Alekhin
f6d927ef3b
dnn: avoid conflicts with original caffe.proto
...
rename caffe.proto => opencv-caffe.proto
2017-11-20 19:04:00 +03:00
David Geldreich
f723cede2e
add loading TensorFlow/Caffe net from memory buffer
...
add a corresponding test
2017-11-20 16:28:22 +01:00
Dmitry Kurtaev
6c5dd5cf6d
Replace caffe::NormalizedBBox to local structure
2017-11-20 18:03:31 +03:00
Wu Zhiwen
45d11dde57
dnn(ocl4dnn): add fusion support for Power activation and eltwise add
...
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-20 14:58:53 +08:00
Wu Zhiwen
394101d6ed
dnn(ocl4dnn): Fix relu fusion bug
...
Incorrect type of negative_slope result in this bug.
Also and OCL test for darknet to validate this patch.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-17 16:21:56 +08:00
Jcrist99
0608227e10
Merge pull request #9698 from abratchik:parse.doxygen
...
Support @deprecated tag in java wrappers (#9698 )
2017-11-16 16:48:12 +03:00
Wu Zhiwen
88e6daa315
dnn(ocl4dnn): Fix wrong measurement for tuning time
...
convolution kernel use default queue to run, so that ocl::Timer
, to measure the kernel run time, should use the default queue too.
Also remove useless parameter for convolve()
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-11-16 13:09:57 +08:00
Li Peng
55260a8d3c
reshape mat before doing computation in fc layer
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-13 09:29:50 +08:00
Alexander Alekhin
bafdc44d37
Merge pull request #10061 from Sahloul:dnn_torch_fix
2017-11-10 05:05:52 +00:00
Alexander Alekhin
8a3a75cc16
Merge pull request #9882 from pengli:ocl4dnn
2017-11-09 18:54:43 +00:00
Hamdi Sahloul
06bda58a2c
DNN Torch - workaround when torch importer is disabled
2017-11-10 00:44:06 +09:00
Li Peng
8f99083726
Add new layer forward interface
...
Add layer forward interface with InputArrayOfArrays and
OutputArrayOfArrays parameters, it allows UMat buffer to be
processed and transferred in the layers.
Signed-off-by: Li Peng <peng.li@intel.com>
2017-11-09 15:59:39 +08:00
Alexander Alekhin
97181a90ba
dnn(ocl4dnn/conv): bailout on missing kernel configuration
2017-11-07 17:02:17 +03:00
Alexander Alekhin
6e4f9433d0
Merge pull request #9998 from alalek:ocl_fix_dnn_softmax_9991
2017-11-03 09:16:39 +00:00
Dmitry Kurtaev
20a2dc6ac5
Fix multiple inputs models from Caffe.
...
Fixed Concat optimization.
2017-11-02 18:55:08 +03:00
Alexander Alekhin
bacc96f4e8
dnn(ocl): fix softmax global/local size consistency
2017-11-02 17:08:40 +03:00
Dmitry Kurtaev
14af2a0c0c
Fixed Halide's copy_to_device invocation
2017-11-01 14:01:54 +03:00
Vadim Pisarevsky
bc348eb8ab
Merge pull request #9963 from dkurt:fix_caffe_shrinker
2017-10-31 12:27:19 +00:00
Dmitry Kurtaev
e1ebc4e991
Specify layer types for Caffe FP32->FP16 weights converter
2017-10-31 12:31:40 +03:00
Dmitry Kurtaev
03cefa7bfe
Set zero confidences in case of no detections
2017-10-30 10:17:57 +03:00
Vadim Pisarevsky
e0e40405ed
Merge pull request #9847 from wzw-intel:ocl4dnn_fusion
2017-10-27 13:59:46 +00:00