Dmitry Kurtaev
64a9e92390
Merge pull request #10466 from dkurt:reduce_umat_try_2
...
* UMat blobs are wrapped
* Replace getUMat and getMat at OpenCLBackendWrapper
2018-01-10 21:50:54 +03:00
Alexander Alekhin
4d4f291553
Merge pull request #10513 from pengli:dnn
2018-01-09 19:24:28 +00:00
Li Peng
e3b42bf93b
batch_norm and blank layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-09 21:58:46 +08:00
Alexander Alekhin
da0904df2d
Merge pull request #10550 from dkurt:replace_psroi_pooling_tag
2018-01-08 19:19:00 +00:00
Dmitry Kurtaev
27b55ea761
Replace Caffe's psroi_pooling_param tag from 10001 to 10002
2018-01-08 13:29:20 +03:00
Alexander Alekhin
6674a024fc
dnn: add OPENCV_DNN_DISABLE_MEMORY_OPTIMIZATIONS runtime option
...
replaces REUSE_DNN_MEMORY compile-time option
2018-01-07 18:38:14 +00:00
Arthur Williams
8a67858068
Fixed missing #include "../precomp.hpp"
2018-01-05 15:10:39 +00:00
Li Peng
67f9406cbe
add normalize_bbox layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-05 19:38:36 +08:00
Li Peng
f99a135eda
add eltwise layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-05 19:38:30 +08:00
Li Peng
34bfd7ef51
add ocl implementation of proposal layer
...
Signed-off-by: Li Peng <peng.li@intel.com>
2018-01-04 18:40:51 +08:00
Alexander Alekhin
7d67d60fb1
cmake(opt): AVX512_SKX
2017-12-29 07:18:11 +00:00
Alexander Alekhin
8e7af7f089
Merge pull request #10456 from dkurt:dnn_allocate_mem_for_optimized_concat
2017-12-28 16:04:51 +00:00
Alexander Alekhin
a65b5df5da
Merge pull request #10416 from fenrus75:avx512
2017-12-28 15:56:56 +00:00
Alexander Alekhin
2b3c140f04
Merge pull request #10436 from alalek:test_threads
2017-12-28 18:29:30 +03:00
Alexander Alekhin
898ca38257
cmake: AVX512 -> AVX_512F
2017-12-28 15:20:27 +00:00
Dmitry Kurtaev
a9807d8f54
Allocate new memory for optimized concat to prevent collisions.
...
Add a flag to disable memory reusing in dnn module.
2017-12-28 16:45:53 +03:00
Li Peng
00f03c5739
Add ocl version FasterRCNN accuracy test
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-28 19:15:15 +08:00
Alexander Alekhin
99a9c10b57
Merge pull request #10424 from dkurt:fix_concat_optim
2017-12-28 01:26:14 +00:00
Alexander Alekhin
9b131b5f7e
dnn(test): avoid calling of cv::setNumThreads() in tests directly
...
It is not necessary by default.
Also it breaks test system command-line parameters: --perf_threads / --test_threads
2017-12-27 15:16:41 +00:00
Alexander Alekhin
f3880c60a6
Merge pull request #10428 from pengli:dnn
2017-12-27 13:18:10 +00:00
Arjan van de Ven
2938860b3f
Provide a few AVX512 optimized functions for the DNN module
...
This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.
AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2017-12-26 16:00:17 +00:00
Dmitry Kurtaev
70c605a03d
Limit Concat layer optimization
2017-12-26 16:49:33 +03:00
Li Peng
84e2fa79a0
dnn(ocl4dnn): update pre-tuned kernel config
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-26 20:14:41 +08:00
Alexander Alekhin
adf43e7d2a
build: fix MSVS2010 build error
2017-12-23 00:06:34 +00:00
Alexander Alekhin
019b7c5a66
Merge pull request #10402 from dkurt:dnn_tf_quantized
2017-12-22 15:58:56 +00:00
Alexander Alekhin
59e825ee02
Merge pull request #10385 from pengli:dnn
2017-12-22 15:48:40 +00:00
Dmitry Kurtaev
bcc669f3f7
TensorFlow weights dequantization
2017-12-22 17:25:10 +03:00
Alexander Alekhin
97af608030
Merge pull request #10397 from mshabunin:fix-incorrect-assert
2017-12-22 14:07:02 +00:00
Li Peng
181b448c4d
add one more convolution kernel tuning candidate
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-22 21:37:00 +08:00
Vadim Pisarevsky
0742e12f0b
Merge pull request #10265 from dkurt:nms_for_region_layer
2017-12-22 13:29:37 +00:00
Maksim Shabunin
aa46e31c6d
Replaced incorrect CV_Assert calls with CV_Error
2017-12-22 15:20:13 +03:00
Vadim Pisarevsky
325cbd7c84
Merge pull request #10364 from dkurt:dnn_smooth_tf_data_layout
2017-12-22 09:56:45 +00:00
Li Peng
c5fc8e03ff
cleanup unnecessary macros in convolution ocl kernel
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-21 20:32:36 +08:00
Li Peng
0aa5e43a14
refactor candidate generation of convolution auto-tuning
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-21 23:05:54 +08:00
Dmitry Kurtaev
c67e75b68f
Refactor NMS procedure at RegionLayer
2017-12-21 12:21:45 +03:00
Vadim Pisarevsky
eecb64a973
Merge pull request #10331 from arrybn:python_dnn_net
2017-12-20 14:30:27 +00:00
Dmitry Kurtaev
7e48fa58eb
Manage TensorFlow's NHWC data layout is smoother
2017-12-20 14:13:40 +03:00
Dmitry Kurtaev
0ed2cbc931
R-FCN models support
2017-12-20 10:43:22 +03:00
Alexander Alekhin
dcdd6af5a8
Merge pull request #10341 from pengli:dnn
2017-12-19 14:04:55 +00:00
Li Peng
3b84acfc48
add ocl accuracy test for tf mobilenet ssd
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 18:38:55 +08:00
Li Peng
436d7e4eaf
add depthwise convolution kernel
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 17:59:13 +08:00
Li Peng
910d7dab1f
prior box layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
2017-12-19 17:44:10 +08:00
Dmitry Kurtaev
6aabd6cc7a
Remove cv::dnn::Importer
2017-12-18 18:08:28 +03:00
Alexander Rybnikov
19c914db51
Changed wrapping mode for cv::dnn::Net::forward
2017-12-18 15:56:09 +03:00
Dmitry Kurtaev
2b43d4f477
Fix default pooling layer type
2017-12-17 16:46:40 +03:00
Alexander Alekhin
3fddce67c6
experimental version++
2017-12-16 01:30:36 +03:00
Maksim Shabunin
1033f2b1bd
Fixed 3 issues found by static analysis
2017-12-15 17:29:26 +03:00
Vadim Pisarevsky
62359f70ff
Merge pull request #10306 from dkurt:faster_rcnn
2017-12-15 12:23:53 +00:00
Dmitry Kurtaev
08112f3821
Faster-RCNN models support
2017-12-15 12:16:21 +03:00
Alexander Alekhin
0da947e6b3
dnn: more debug information
2017-12-14 19:21:17 +03:00