opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-16 02:19:12 +08:00

Author	SHA1	Message	Date
Li, Peng	ab8022f74e	update convolution opencl kernels in dnn module (#11762 ) * optimize ocl kernel enqueue in fc layer Signed-off-by: Li Peng <peng.li@intel.com> * use CV_LOG_INFO in convolution auto tuning Signed-off-by: Li Peng <peng.li@intel.com> * update convolution IDLF kernel extend parameter tuning range, also cleanup ocl kernel implementation Signed-off-by: Li Peng <peng.li@intel.com> * update in-memory convolution cache config fp16 and fp32 cache config are stored separately Signed-off-by: Li Peng <peng.li@intel.com>	2018-06-25 17:06:18 +03:00
Dmitry Kurtaev	e8e9d1d021	Implement Interp layer using Resize layer	2018-06-22 19:26:47 +03:00
Alexander Alekhin	1894f1a37f	Merge pull request #11773 from alalek:dnn_ocl_update_force_tuning_flag	2018-06-22 05:23:55 +00:00
Alexander Alekhin	50c607d206	dnn(ocl): fix external / predefined builtin configuration behavior OPENCV_OCL4DNN_FORCE_AUTO_TUNING should ignore existed configuration from: - builtin predefined configurations (for Intel OpenCL iGPUs) - external configuration (via OPENCV_OCL4DNN_CONFIG_PATH) Prefer external configuration over builtin.	2018-06-21 20:59:03 +03:00
Dmitry Kurtaev	4626246087	Add ShuffleChannel layer	2018-06-21 19:10:42 +03:00
Dmitry Kurtaev	40b85c1cd9	Remove undocumented feature to retreive layers outputs by indices	2018-06-20 14:44:21 +03:00
Alexander Alekhin	30d4e0261a	Merge pull request #11766 from dkurt:dnn_darknet_avgpool_softmax	2018-06-14 13:18:30 +00:00
Dmitry Kurtaev	bd87eb6e66	Import average pooling and softmax layers from Darknet	2018-06-14 15:22:08 +03:00
Dmitry Kurtaev	693a7663e7	Import ClipByValue from Keras	2018-06-14 13:30:30 +03:00
Alexander Alekhin	5fd7cfbcad	dnn: add runtime parameter OPENCV_DNN_BACKEND_DEFAULT to control DNN_BACKEND_DEFAULT enumeration value behavior	2018-06-13 19:00:04 +03:00
Alexander Alekhin	f040282bf8	Merge pull request #11739 from dkurt:more_ie_models	2018-06-13 13:26:50 +00:00
Dmitry Kurtaev	7d727ac2fb	Fuse top layers to batch normalization	2018-06-09 18:06:53 +03:00
Dmitry Kurtaev	2c291bc2fb	Enable FastNeuralStyle and OpenFace networks with IE backend	2018-06-09 15:57:12 +03:00
rockzhan	1187a7fa34	Merge pull request #11649 from rockzhan:dnn_dw_prelu dnn: Fix output mismatch when forward dnn model contain [depthwise conv(group=1) + bn + prelu] (#11649) * this can make sure [depthwise conv(group=1) + bn + prelu] output not shift * add TEST to show the output mismatch in [DWconv+Prelu] * fix typo * change loading image to init cvMat directly * build runtime model, without loading external model * remove whitespace * change way to create a cvmat * add bias_term, add target output * fix [dwconv + prelu] value mismatch when no optimizations * fix Test error when change output channels * add parametric test * change num_output to group value * change conv code and change test back	2018-06-07 13:45:54 +00:00
David	7175f257b5	Added ResizeBilinear op for tf (#11050 ) * Added ResizeBilinear op for tf Combined ResizeNearestNeighbor and ResizeBilinear layers into Resize (with an interpolation param). Minor changes to tf_importer and resize layer to save some code lines Minor changes in init.cpp Minor changes in tf_importer.cpp * Replaced implementation of a custom ResizeBilinear layer to all layers * Use Mat::ptr. Replace interpolation flags	2018-06-07 16:29:04 +03:00
Dmitry Kurtaev	f3a6ae5f00	Wrap Inference Engine init to try-catch	2018-06-07 12:55:52 +03:00
Vadim Pisarevsky	3cbd2e2764	Merge pull request #11650 from dkurt:dnn_default_backend	2018-06-06 09:30:39 +00:00
Dmitry Kurtaev	b781ac7346	Make Intel's Inference Engine backend is default if no preferable backend is specified.	2018-06-04 18:31:46 +03:00
Vadim Pisarevsky	055f33ec46	Merge pull request #11657 from dkurt:dnn_ie_multiple_networks	2018-06-04 10:12:46 +00:00
Kuang Fangjun	9ae28415ec	fix doc.	2018-06-03 17:44:24 +08:00
Dmitry Kurtaev	ab389142af	Fix multiple networks with Intel's Inference Engine backend	2018-06-01 14:10:32 +03:00
Alexander Alekhin	da75e463a8	Merge pull request #11639 from alalek:fix_precomp_hpp	2018-05-31 16:35:21 +00:00
Alexander Alekhin	799b4f48e7	fix missing precomp.hpp	2018-05-31 16:53:44 +03:00
Dmitry Kurtaev	32bab45f81	Fix Inference Engine graphs with fused output layers	2018-05-31 16:21:08 +03:00
Vadim Pisarevsky	c58cc4c2ff	Merge pull request #11255 from dkurt:dnn_tf_faster_rcnn	2018-05-31 11:07:39 +00:00
Dmitry Kurtaev	f96f934426	Update Intel's Inference Engine deep learning backend (#11587 ) * Update Intel's Inference Engine deep learning backend * Remove cpu_extension dependency * Update Darknet accuracy tests	2018-05-31 14:05:21 +03:00
Dmitry Kurtaev	bf87a43185	Faster-RCNN object detection models from TensorFlow	2018-05-30 17:12:36 +03:00
Alexander Alekhin	44572fac44	Merge pull request #11557 from tomoaki0705:relaxIntelOnlyOCL4DNN	2018-05-29 15:25:22 +00:00
Tomoaki Teshima	2e9e71ab9e	make ocl4dnn available to run on other platform than Intel GPU	2018-05-29 19:18:10 +09:00
Dmitry Kurtaev	085be6a445	Fix dilated convolution from Keras	2018-05-29 12:15:47 +03:00
Dmitry Kurtaev	2c3c59d018	Remove Shift deep learning layer	2018-05-28 18:18:56 +03:00
Alexander Alekhin	3654fb10d7	Merge pull request #11567 from alalek:code_quality	2018-05-23 15:47:11 +00:00
Maksim Shabunin	895e10c317	dnn: fixed IE support on Windows	2018-05-23 12:46:14 +03:00
Alexander Alekhin	471c17321f	improve code quality - eliminate rand() calls - non initialized members/ variables - unused return values - missing/useless NULL checks	2018-05-22 17:08:31 +03:00
Maksim Shabunin	53a68783a5	dnn: support later IE versions	2018-05-22 15:18:18 +03:00
Alexander Alekhin	085b27fc3d	Merge pull request #11390 from dkurt:east_text_detection	2018-05-21 13:02:29 +00:00
Dmitry Kurtaev	07dc6d2b45	Return a convex hull from rotatedRectangleIntersection	2018-05-18 14:20:17 +03:00
Alexander Alekhin	d6279bfff8	fix build warnings	2018-05-17 18:29:21 +03:00
Li Peng	ba5e8befa9	fp16 ocl support for more layers Signed-off-by: Li Peng <peng.li@intel.com>	2018-05-16 22:45:04 +08:00
Li Peng	3dd916882a	fp16 ocl support for googlenet Signed-off-by: Li Peng <peng.li@intel.com>	2018-05-16 22:45:02 +08:00
Li Peng	329abb5b64	dnn fp16 support Signed-off-by: Li Peng <peng.li@intel.com>	2018-05-16 22:44:39 +08:00
Alexander Alekhin	bb8ff2c463	Merge pull request #11494 from tomoaki0705:fixOpenCLDnn	2018-05-16 14:11:36 +00:00
Tomoaki Teshima	3f5347dd7a	work around of the test failure of opencv_test_dnn * let OpenCL kernel run only on Intel GPU * brush up the workaround based on 9a2b028 from alalek	2018-05-16 19:23:19 +09:00
Dmitry Kurtaev	8488f2e265	EAST: An Efficient and Accurate Scene Text Detector (https://arxiv.org/abs/1704.03155v2 )	2018-05-11 14:55:42 +03:00
Dmitry Kurtaev	c99c3e761e	Fuse multipliers but not convolution layers weights	2018-05-10 19:24:38 +03:00
Dmitry Kurtaev	777d77848c	Free Convolution and MatMul weights after TensorFlow layers import	2018-05-04 11:20:14 +03:00
Dmitry Kurtaev	9ffe4694db	Reduce memory consumption at Caffe importer	2018-05-04 09:24:13 +03:00
zuoshaobo	4ff6a1bc7b	Merge pull request #11425 from zuoshaobo:relu_negative_slope * FIX INF_ENGINE RELU ERROR * set slope to variable * tab in indentwq	2018-05-03 13:36:49 +03:00
Alexander Alekhin	083b08742d	Merge pull request #11406 from alalek:core_matsize_dims	2018-04-28 14:38:42 +00:00
Alexander Alekhin	65b0b319eb	eliminate MSVS2017 build warning modules\dnn\src\layers\prior_box_layer.cpp(208): warning C4834: discarding return value of function with 'nodiscard' attribute	2018-04-28 15:14:41 +03:00
Alexander Alekhin	8c349ff8ff	core: added MatSize::dims() method to avoid accessing of 'p[-1]' (static code analysers dislike this)	2018-04-27 16:57:29 +03:00
Alexander Alekhin	576d2dbac0	refactor: don't use CV_ErrorNoReturn() internally	2018-04-24 15:38:42 +03:00
Dmitry Kurtaev	4ec456f0a0	Custom layers for deep learning networks (#11129 ) * Custom deep learning layers support * Stack custom deep learning layers	2018-04-24 14:59:59 +03:00
Alexander Alekhin	29b4fd2774	Merge pull request #11351 from dkurt:dnn_enable_inf_engine_tests	2018-04-23 09:16:39 +00:00
Dmitry Kurtaev	d959d7b9f0	Fuse deconvolution layer subgraphs from Keras	2018-04-20 16:51:38 +03:00
Dmitry Kurtaev	bd77d100e1	Enable some tests for clDNN plugin from Intel's Inference Engine	2018-04-20 10:47:46 +03:00
Dmitry Kurtaev	3b4a292ca9	Let switch CPU/OpenCL targets for models from Intel's Model Optimizer	2018-04-19 10:23:57 +03:00
Vadim Pisarevsky	b290bdafb9	Merge pull request #11322 from dkurt:dnn_yolov3	2018-04-18 12:11:13 +00:00
Dmitry Kurtaev	66ce8cd7ea	Fix bugs found by valgrind	2018-04-17 17:53:51 +03:00
Dmitry Kurtaev	97fec07d96	Support YOLOv3 model from Darknet	2018-04-16 18:44:12 +03:00
Alexander Alekhin	a2d6ee2d31	Merge pull request #11305 from tomoaki0705:typoNVIDIA	2018-04-13 12:56:42 +00:00
Tomoaki Teshima	a40354d16f	use correct name for NVIDIA * remove NVidia and Nvidia * replace Cuda with CUDA * keep the letters for API	2018-04-13 20:33:19 +09:00
Dmitry Kurtaev	b92c3182ab	Blank and L2-normalization layers from Intel's Inference Engine	2018-04-12 15:21:08 +03:00
Vadim Pisarevsky	0b9d075958	Merge pull request #11295 from dkurt:dnn_repeated_conv_params	2018-04-11 15:25:24 +00:00
Vadim Pisarevsky	533bb89800	Merge pull request #11236 from dkurt:dnn_fuse_l2_norm	2018-04-11 15:09:55 +00:00
Vadim Pisarevsky	30175594e9	Merge pull request #11062 from dkurt:dnn_inf_engine_cldnn	2018-04-11 15:06:18 +00:00
Dmitry Kurtaev	512632e574	Parse repeated values of ConvolutionParameter	2018-04-11 14:38:05 +03:00
Dmitry Kurtaev	4ef6c91583	Fix multiple inputs for models from Intel's Model Optimizer	2018-04-11 13:28:07 +03:00
Dmitry Kurtaev	1ba72ca0d3	Fuse tf.nn.l2_normalize layer	2018-04-10 10:12:44 +03:00
Dmitry Kurtaev	709cf5d038	OpenCL GPU target for Inference Engine deep learning backend Enable FP16 GPU target for DL Inference Engine backend.	2018-04-09 17:21:35 +03:00
Vladislav Sovrasov	0d9c63744e	Add CPU default extensions loading in IE dnn backend (#11252 ) * Add CPU default extensions loading in IE dnn backend * Load cpu_extensions for the future Intel's Inference Engine	2018-04-09 16:22:19 +03:00
Dmitry Kurtaev	ef1aaf12c9	Fix Proposal deep learning layer	2018-04-04 14:48:29 +03:00
Dmitry Kurtaev	598039c0ed	Fix embedded Torch's nn.ConcatTable	2018-03-31 11:11:10 +03:00
Alexander Alekhin	e8a67de0d2	Merge pull request #11182 from dkurt:fix_11102_part_2	2018-03-30 13:11:01 +00:00
Alexander Alekhin	1060c0f439	dnn: apply CV_OVERRIDE/CV_FINAL	2018-03-28 18:43:27 +03:00
Alexander Alekhin	167034fb04	Merge pull request #11098 from dkurt:dnn_native_inf_engine	2018-03-28 14:52:08 +00:00
Dmitry Kurtaev	e039fc3a63	Replace protobuf's ReleaseLast to RemoveLast to deallocate memory. Change an order of PriorBox layer operations.	2018-03-28 17:27:36 +03:00
Dmitry Kurtaev	2f3a9ba1d4	Update OpenCVDetectInferenceEngine.cmake	2018-03-28 16:34:37 +03:00
Alexander Alekhin	9e0dee1259	Merge pull request #11112 from alalek:cmake_src_include_fix	2018-03-27 13:06:48 +00:00
Dmitry Kurtaev	7972f47ed4	Load networks from intermediate representation of Intel's Deep learning deployment toolkit.	2018-03-26 07:24:21 +03:00
Dmitry Kurtaev	e8fe6ee4e3	Fix prior box generation in case of squared proposals. Fix batch norm in training phase.	2018-03-23 09:44:59 +03:00
Alexander Alekhin	6c051a55e5	cmake: don't add include <module>/src directory to avoid conflicts during opencv_world builds	2018-03-19 11:14:15 +03:00
Dmitry Kurtaev	069f9add80	Fix an issue https://github.com/opencv/opencv/issues/11102	2018-03-18 10:49:12 +03:00
Alexander Alekhin	d68466bb6a	Merge pull request #10940 from dkurt:dnn_tf_graph_optim	2018-03-14 14:36:25 +00:00
Alexander Alekhin	ab110c0ad1	Merge pull request #10979 from dkurt:unite_dnn_samples	2018-03-14 14:33:49 +00:00
Dmitry Kurtaev	538fd42363	Add test for Scalar arguments at CommandLineParser	2018-03-13 11:01:07 +03:00
Dmitry Kurtaev	ab20d2a3fc	Update assertions in batch norm layer	2018-03-12 10:53:06 +03:00
Dmitry Kurtaev	69a8f110b6	Fuse subgraphs from Keras	2018-03-12 10:53:06 +03:00
Dmitry Kurtaev	9457bf10ab	Fuse batch normalization and flatten TensorFlow subgraphs in runtime	2018-03-12 10:51:35 +03:00
Alexander Alekhin	5b868ccd82	Merge pull request #10992 from dkurt:dnn_opencl_tests	2018-03-09 10:06:40 +00:00
Dmitry Kurtaev	0f01b40dd5	Reset OpenCL kernels if batch size changes	2018-03-07 17:06:59 +03:00
Alexander Alekhin	514f4193db	Merge pull request #10959 from alalek:cmake_ocl4dnn	2018-03-07 10:26:14 +00:00
Dmitry Kurtaev	e1c3237532	Parametric OpenCL deep learning tests	2018-03-05 20:53:18 +03:00
Dmitry Kurtaev	f2440ceae6	Update tutorials. A new cv::dnn::readNet function	2018-03-04 20:30:22 +03:00
Alexander Alekhin	fe97dc67dc	Merge pull request #10962 from alalek:dnn_precomp_hpp	2018-03-02 11:38:16 +00:00
Alexander Alekhin	97c1f09961	Merge pull request #10955 from pengli:dnn	2018-03-02 11:35:59 +00:00
Alexander Alekhin	a9ebc61f2a	dnn(workaround): switch to CPU target if compiled without OpenCL	2018-03-01 12:12:40 +03:00
Alexander Alekhin	1b83bc48a1	dnn: make OpenCL DNN code optional	2018-03-01 12:12:40 +03:00
Alexander Alekhin	a838a97092	dnn: fix precomp.hpp usage	2018-02-28 17:06:26 +03:00
Wu Zhiwen	ef937dd676	ocl4dnn: Fix SAME padding mode for convolve Signed-off-by: Wu, Zhiwen <zhiwen.wu@intel.com> Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-28 21:02:41 +08:00
Maksim Shabunin	7c855aa3e1	Fixed two issues found by static analysis	2018-02-26 00:16:02 +03:00
Li Peng	608968aa83	Deconvolution ocl fix Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-23 18:31:30 +08:00
Li, Peng	5caf6244a3	Merge pull request #10922 from pengli:dnn * ave pooling ocl fix support the padded area control in ave pooling Signed-off-by: Li Peng <peng.li@intel.com> * warning fix: ununitialized field	2018-02-22 21:01:12 +03:00
Maksim Shabunin	92e9d4ec3a	Fixed several issues detected by static analysis	2018-02-22 17:11:33 +03:00
Vadim Pisarevsky	5e0f95b948	Merge pull request #9708 from dkurt:tf_face_detector	2018-02-22 12:04:26 +00:00
Li Peng	e7d35d51fa	Fix for opencv face detector ocl test Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-22 23:37:54 +08:00
Li Peng	c524f669c7	Fallback for "SAME" padMode in ocl convolution and pooling It fixes tensorflow ocl testcase of MobileNetSSD and Inception_v2_SSD Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-22 21:17:59 +08:00
Dmitry Kurtaev	eab556e1e0	OpenCV face detection network in TensorFlow	2018-02-21 19:58:24 +03:00
Alexander Alekhin	53305d4a7e	Merge pull request #10891 from pengli:dnn	2018-02-20 08:59:07 +00:00
Li Peng	2863f950d6	ReLU6 layer ocl support include relu6 ocl kernel and layer fusion support Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-20 15:11:09 +08:00
Dmitry Kurtaev	8b4871a28d	Use only absolute prior boxes explicit sizes. Remove scales attributes. (#10874 ) * Use only absolute prior boxes explicit sizes. Remove scales attributes. * Simplified PriorBox layer forward pass	2018-02-19 17:25:18 +03:00
Alexander Alekhin	0e4eed0ba1	Merge pull request #10867 from dkurt:dnn_fix_ave_pooling_area	2018-02-16 11:17:32 +00:00
Alexander Alekhin	c020a7bb67	build: portable integer types	2018-02-15 23:43:02 +03:00
Dmitry Kurtaev	f8d0d6365e	Add a flag to manage average pooling with padding	2018-02-14 16:56:31 +03:00
Alexander Alekhin	cff79609c8	Merge pull request #10854 from pengli:dnn	2018-02-14 12:49:53 +00:00
Vadim Pisarevsky	ef70b0baa4	Merge pull request #10865 from dkurt:dnn_inf_engine_getInputsInfo	2018-02-14 12:25:18 +00:00
Dmitry Kurtaev	a66b5e2c13	Add const getInputsInfo	2018-02-14 14:17:44 +03:00
Vadim Pisarevsky	6dfd7e3da2	Merge pull request #10850 from dkurt:dnn_tf_deconv_tests	2018-02-14 10:35:14 +00:00
Li Peng	5992c46606	add fallback case for ocl convolution The ocl convolution doesn't support tensorflow padMode well. Add fallback check if we meet this situation, it could fix the tensorflow MobileNet SSD failure. Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-14 00:04:38 +08:00
Li Peng	00d2f34888	ocl fix for detection_output and prior_box layer Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-13 23:09:14 +08:00
Dmitry Kurtaev	514e6df460	Refactored deep learning layers fusion	2018-02-13 14:35:58 +03:00
Dmitry Kurtaev	a6baedd02c	Fix deconvolution layer. Add batch norm layer with mean-variance normalization from TensorFlow.	2018-02-13 11:00:27 +03:00
Alexander Alekhin	66f3c1ae79	Merge pull request #10843 from luzpaz:misc-modules-typos	2018-02-12 13:47:12 +00:00
Sui Libin	1ad814a191	fix faster_rcnn sample crashed at PoolingInvoker on Windows7(x64). (#10724 ) * fix faster_rcnn sample crashed at PoolingInvoker operator() of pooling_layer. * find_odj onmouse bug about find matched point status. * reverted AutoBuffer back to std::vector	2018-02-12 16:07:56 +03:00
luz.paz	5718d09e39	Misc. modules/ typos Found via `codespell`	2018-02-12 07:09:43 -05:00
Rémi Ratajczak	b67523550f	dnn : Added an imagesFromBlob method to the dnn module (#10607 ) * Added the imagesFromBlob method to the dnn module. * Rewritten imagesFromBlob based on first dkurt comments * Updated code with getPlane() * Modify comment of imagesFromBlob() in dnn module * modified comments, removed useless assertions & added OutputArrayOfArray * replaced tabs with whitespaces & put vectorOfChannels instantiation outside the loop * Changed pre-commit.sample to pre-commit in .git/hooks/ * Added a test for imagesFromBlob in test_misc.cpp (dnn) * Changed nbOfImages, robustified test with cv::randu, modified assertion	2018-02-12 14:51:07 +03:00
Dmitry Kurtaev	7fe97376c2	MobileNet-SSD from TensorFlow 1.3 and Inception-V2-SSD using Inference Engine backend	2018-02-09 13:45:45 +03:00
Dmitry Kurtaev	ed94136548	OpenCV face detection network using Inference Engine backend	2018-02-06 17:53:24 +03:00
Alexander Alekhin	398ebbac98	Merge pull request #10795 from pengli:dnn	2018-02-06 10:04:29 +00:00
Li Peng	c43498c6ad	check vector emptiness before access it Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-06 22:59:51 +08:00
Li Peng	389fa5d38e	slice layer ocl update Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-06 22:59:47 +08:00
Dmitry Kurtaev	10e1de74d2	Intel Inference Engine deep learning backend (#10608 ) * Intel Inference Engine deep learning backend. * OpenFace network using Inference Engine backend	2018-02-06 11:57:35 +03:00
Maksim Shabunin	e56d6054aa	Do not build protobuf without dnn (#10689 ) * Do not build protobuf if dnn is disabled * Added BUILD_LIST cmake option to the cache * Moved protobuf to the top level * Fixed static build * Fixed world build * fixup! Fixed world build	2018-02-01 16:30:23 +03:00
Vadim Pisarevsky	713ec7be45	Merge pull request #10746 from dkurt:dnn_batch_norm_from_nvidia_caffe	2018-02-01 13:22:09 +00:00
Alexander Alekhin	42569cfd61	Merge pull request #10748 from dkurt:fix_dnn_slice_layer	2018-02-01 13:21:17 +00:00
Alexander Alekhin	9d25bd583f	Merge pull request #10754 from dkurt:dnn_ocl_gemv_min_globalsize	2018-02-01 12:39:27 +00:00
Dmitry Kurtaev	65a6674c6e	ocl4dnnGEMV in case of row_size < 4	2018-02-01 14:06:47 +03:00
Alexander Alekhin	9698b93d10	Merge pull request #10717 from pengli:dnn	2018-02-01 10:49:54 +00:00
Li Peng	6aec71d7ee	mvn layer ocl update it fuse ocl kernels to reduce kernel enqueue Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-01 17:48:12 +08:00
Li Peng	83b16ab7b7	fix extra spaces in build option Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-01 17:46:11 +08:00
Li Peng	54c81cbde4	eltwise layer SUM op update Signed-off-by: Li Peng <peng.li@intel.com>	2018-02-01 17:46:06 +08:00
Dmitry Kurtaev	184862582c	Fix slice layer from TensorFlow	2018-01-31 19:12:37 +03:00
Arjan van de Ven	a75840d19c	Merge pull request #10468 from fenrus75:avx512-2 * Add a 512 bit codepath to the AVX512 fastConv function this patch adds a 512 wide codepath to the fastConv() function for AVX512 use. The basic idea is to process the first N * 16 elements of the vector with avx512, and then run the rest of the vector using the traditional AVX2 codepath. * dnn: use unaligned AVX512 load (OpenCV aligns data on 32-byte boundary) * dnn: change "vecsize" condition for AVX512 * dnn: fix indentation	2018-01-31 16:34:12 +03:00
Alexander Alekhin	f06c44f1f1	Merge pull request #10701 from dkurt:tf_ave_pooling	2018-01-31 13:28:09 +00:00
Dmitry Kurtaev	844f1d0281	Fix Batch Normalization layer imported from NVIDIA Caffe.	2018-01-31 16:25:45 +03:00
Dmitry Kurtaev	a2e9bfbaf4	Fix padding for average pooling from TensorFlow	2018-01-31 15:54:30 +03:00
Li Peng	7a4c5e9421	slice layer ocl support Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-29 22:34:32 +08:00
Alexander Alekhin	2876670de3	dnn(ocl): fix build options for Apple OpenCL	2018-01-28 01:54:25 +00:00
Alexander Alekhin	104502c5be	Merge pull request #10676 from dkurt:dnn_for_newer_mobilenet_ssd	2018-01-26 04:02:21 +00:00
Li Peng	2493083935	mvn, batch_norm and relu layer fusion Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-25 18:57:05 +08:00
Li Peng	e15928b49e	convolution and tanh layer fusion Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-25 17:45:33 +08:00
Dmitry Kurtaev	9e9926a2f0	PriorBox layer with explicit normalized sizes	2018-01-24 14:01:42 +03:00
Dmitry Kurtaev	a3d74704e5	OpenCV face detection network test	2018-01-23 09:27:58 +03:00
Alexander Alekhin	26e0f408f0	Merge pull request #10639 from pengli:dnn	2018-01-19 10:01:41 +00:00
Li Peng	fe494297e4	more update on MVN layer ocl implementation cut one ocl kernel if normVariance is disabled, also use native_powr for performance reason. Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-19 22:54:04 +08:00
Alexander Alekhin	c3569211d5	Merge pull request #10591 from drkoller:master	2018-01-19 09:44:21 +00:00
Li Peng	2124361ff7	ocl support for Deconvolution layer Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-18 23:40:22 +08:00
David Koller	d1a3b530be	Make DNN Crop layer match Caffe default offset behavior and add parametric unit test for crop layer.	2018-01-17 10:52:36 -05:00
Li Peng	e77af4ae33	MVN layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-17 17:11:32 +08:00
Li Peng	7bc017601f	Power, Tanh and Channels ReLU layer ocl support Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-17 17:11:27 +08:00
Li Peng	4189214d04	batch_norm layer ocl update use a batch_norm ocl kernel to do the work Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-16 19:01:58 +08:00
Alexander Alekhin	1255bd8d4b	Merge pull request #10585 from dkurt:dnn_weightless_scale	2018-01-15 06:07:50 +00:00
Dmitry Kurtaev	6a395d88ff	dnn::blobFromImage with OutputArray	2018-01-13 18:20:24 +03:00
Dmitry Kurtaev	1f4fdfd599	Untrainable version of Scale layer from Caffe	2018-01-13 10:35:29 +03:00
Dmitry Kurtaev	64a9e92390	Merge pull request #10466 from dkurt:reduce_umat_try_2 * UMat blobs are wrapped * Replace getUMat and getMat at OpenCLBackendWrapper	2018-01-10 21:50:54 +03:00
Alexander Alekhin	4d4f291553	Merge pull request #10513 from pengli:dnn	2018-01-09 19:24:28 +00:00
Li Peng	e3b42bf93b	batch_norm and blank layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-09 21:58:46 +08:00
Alexander Alekhin	da0904df2d	Merge pull request #10550 from dkurt:replace_psroi_pooling_tag	2018-01-08 19:19:00 +00:00
Dmitry Kurtaev	27b55ea761	Replace Caffe's psroi_pooling_param tag from 10001 to 10002	2018-01-08 13:29:20 +03:00
Alexander Alekhin	6674a024fc	dnn: add OPENCV_DNN_DISABLE_MEMORY_OPTIMIZATIONS runtime option replaces REUSE_DNN_MEMORY compile-time option	2018-01-07 18:38:14 +00:00
Arthur Williams	8a67858068	Fixed missing #include "../precomp.hpp"	2018-01-05 15:10:39 +00:00
Li Peng	67f9406cbe	add normalize_bbox layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-05 19:38:36 +08:00
Li Peng	f99a135eda	add eltwise layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-05 19:38:30 +08:00
Li Peng	34bfd7ef51	add ocl implementation of proposal layer Signed-off-by: Li Peng <peng.li@intel.com>	2018-01-04 18:40:51 +08:00
Alexander Alekhin	7d67d60fb1	cmake(opt): AVX512_SKX	2017-12-29 07:18:11 +00:00
Alexander Alekhin	8e7af7f089	Merge pull request #10456 from dkurt:dnn_allocate_mem_for_optimized_concat	2017-12-28 16:04:51 +00:00
Alexander Alekhin	a65b5df5da	Merge pull request #10416 from fenrus75:avx512	2017-12-28 15:56:56 +00:00
Alexander Alekhin	898ca38257	cmake: AVX512 -> AVX_512F	2017-12-28 15:20:27 +00:00
Dmitry Kurtaev	a9807d8f54	Allocate new memory for optimized concat to prevent collisions. Add a flag to disable memory reusing in dnn module.	2017-12-28 16:45:53 +03:00
Li Peng	00f03c5739	Add ocl version FasterRCNN accuracy test Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-28 19:15:15 +08:00
Alexander Alekhin	99a9c10b57	Merge pull request #10424 from dkurt:fix_concat_optim	2017-12-28 01:26:14 +00:00
Alexander Alekhin	f3880c60a6	Merge pull request #10428 from pengli:dnn	2017-12-27 13:18:10 +00:00
Arjan van de Ven	2938860b3f	Provide a few AVX512 optimized functions for the DNN module This patch adds AVX512 optimized fastConv as well as the hookups needed to get these called in the convolution_layer. AVX512 fastConv is code-identical on a C level to the AVX2 one, but is measurably faster due to AVX512 having more registers available to cache results in. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2017-12-26 16:00:17 +00:00
Dmitry Kurtaev	70c605a03d	Limit Concat layer optimization	2017-12-26 16:49:33 +03:00
Li Peng	84e2fa79a0	dnn(ocl4dnn): update pre-tuned kernel config Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-26 20:14:41 +08:00
Alexander Alekhin	adf43e7d2a	build: fix MSVS2010 build error	2017-12-23 00:06:34 +00:00
Alexander Alekhin	019b7c5a66	Merge pull request #10402 from dkurt:dnn_tf_quantized	2017-12-22 15:58:56 +00:00
Alexander Alekhin	59e825ee02	Merge pull request #10385 from pengli:dnn	2017-12-22 15:48:40 +00:00
Dmitry Kurtaev	bcc669f3f7	TensorFlow weights dequantization	2017-12-22 17:25:10 +03:00
Alexander Alekhin	97af608030	Merge pull request #10397 from mshabunin:fix-incorrect-assert	2017-12-22 14:07:02 +00:00
Li Peng	181b448c4d	add one more convolution kernel tuning candidate Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-22 21:37:00 +08:00
Vadim Pisarevsky	0742e12f0b	Merge pull request #10265 from dkurt:nms_for_region_layer	2017-12-22 13:29:37 +00:00
Maksim Shabunin	aa46e31c6d	Replaced incorrect CV_Assert calls with CV_Error	2017-12-22 15:20:13 +03:00
Vadim Pisarevsky	325cbd7c84	Merge pull request #10364 from dkurt:dnn_smooth_tf_data_layout	2017-12-22 09:56:45 +00:00
Li Peng	c5fc8e03ff	cleanup unnecessary macros in convolution ocl kernel Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-21 20:32:36 +08:00
Li Peng	0aa5e43a14	refactor candidate generation of convolution auto-tuning Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-21 23:05:54 +08:00
Dmitry Kurtaev	c67e75b68f	Refactor NMS procedure at RegionLayer	2017-12-21 12:21:45 +03:00
Dmitry Kurtaev	7e48fa58eb	Manage TensorFlow's NHWC data layout is smoother	2017-12-20 14:13:40 +03:00
Dmitry Kurtaev	0ed2cbc931	R-FCN models support	2017-12-20 10:43:22 +03:00
Alexander Alekhin	dcdd6af5a8	Merge pull request #10341 from pengli:dnn	2017-12-19 14:04:55 +00:00
Li Peng	436d7e4eaf	add depthwise convolution kernel Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-19 17:59:13 +08:00
Li Peng	910d7dab1f	prior box layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-19 17:44:10 +08:00
Dmitry Kurtaev	6aabd6cc7a	Remove cv::dnn::Importer	2017-12-18 18:08:28 +03:00
Dmitry Kurtaev	2b43d4f477	Fix default pooling layer type	2017-12-17 16:46:40 +03:00
Maksim Shabunin	1033f2b1bd	Fixed 3 issues found by static analysis	2017-12-15 17:29:26 +03:00
Vadim Pisarevsky	62359f70ff	Merge pull request #10306 from dkurt:faster_rcnn	2017-12-15 12:23:53 +00:00
Dmitry Kurtaev	08112f3821	Faster-RCNN models support	2017-12-15 12:16:21 +03:00
Alexander Alekhin	0da947e6b3	dnn: more debug information	2017-12-14 19:21:17 +03:00
Alexander Alekhin	c231472ad6	Merge pull request #10290 from tomoaki0705:fixVS2012Round	2017-12-13 15:30:21 +00:00
Tomoaki Teshima	ecb6bcf2e0	fix build error on Visual Studio 2012 * round doesn't exists in standard library of Visual Studio 2012 * apply the correct computation of ROI	2017-12-13 17:40:07 +03:00
Vitaly Tuzov	51cb56ef2c	Implementation of bit-exact resize. Internal calls to linear resize updated to use bit-exact version. (#9468 )	2017-12-13 15:00:38 +03:00
Alexander Alekhin	eff42f6387	dnn: more debug info	2017-12-12 12:04:10 +03:00
Vadim Pisarevsky	7e680bd9ff	Merge pull request #10215 from dkurt:dnn_js	2017-12-11 12:47:52 +00:00
Vadim Pisarevsky	c24f10d647	Merge pull request #10268 from dkurt:fix_scale_layer	2017-12-08 18:46:50 +00:00
Dmitry Kurtaev	f503515082	JavaScript bindings for dnn module	2017-12-08 18:33:48 +03:00
Dmitry Kurtaev	e307065c8e	Scale layer in case of 2D inputs	2017-12-08 17:34:59 +03:00
Alexander Alekhin	f2070c9f5d	Merge pull request #10255 from dkurt:dnn_roi_pooling	2017-12-08 11:20:07 +00:00
Dmitry Kurtaev	17dcf0e82d	ROIPooling layer	2017-12-07 19:04:38 +03:00
Dmitry Kurtaev	ef0650179b	Fix conv/deconv/fc layers FLOPS computation	2017-12-07 11:42:04 +03:00
Alexander Alekhin	6074f92d48	Merge pull request #10228 from pengli:dnn_new	2017-12-06 15:50:12 +00:00
Li Peng	59cbaca4d3	detection_output layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-06 22:35:59 +08:00
Li Peng	66feea6cac	region layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-07 02:26:46 +08:00
Li Peng	7707c9bfba	reorg layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-07 02:26:46 +08:00
Li Peng	85b1c4060c	support axis in concat layer ocl path Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-07 02:26:46 +08:00
Li Peng	07bec6bdcd	reshape layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-07 02:26:40 +08:00
Li Peng	7b7033ac60	permute layer ocl implementation Signed-off-by: Li Peng <peng.li@intel.com>	2017-12-05 22:10:05 +08:00
Dmitry Kurtaev	bbbec300a6	nn.BatchNormalization and nn.Dropout layers from Torch	2017-12-04 12:57:21 +03:00
Alexander Alekhin	cc2ee923e4	Merge pull request #10164 from pengli:dnn	2017-11-29 12:05:10 +00:00
Wu Zhiwen	1f465a0ef9	dnn(ocl4dnn): fuseLayer() use umat_input/outputBlobs for OpenCL target Also, fix bug when use OPENCL target but no OpenCL runtime Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>	2017-11-27 22:25:53 +08:00
Dmitry Kurtaev	99ed085752	Update PriorBox layer	2017-11-27 16:47:20 +03:00
Alexander Alekhin	13f374660f	dnn(ocl4dnn): drop unused batch_size_ in pooling	2017-11-23 20:46:56 +00:00
Alexander Alekhin	e34b64c979	dnn(ocl4dnn): refactor pooling OpenCL calls	2017-11-23 20:46:44 +00:00
Alexander Alekhin	f071a48ec7	Merge pull request #10143 from pengli:ocl4dnn	2017-11-23 18:47:14 +00:00
Alexander Alekhin	107582c767	Merge pull request #9996 from dkurt:dnn_multiple_inputs	2017-11-23 18:22:37 +00:00
Li Peng	636d6368ee	use OutputArrayOfArrays in net forward interface It allows umat buffers used in net forward interface Signed-off-by: Li Peng <peng.li@intel.com>	2017-11-24 02:19:10 +08:00
Wu, Zhiwen	04edc8fe3a	cleanup ocl4dnn spatial convolution kernels remove unused macros and half definition macros, also remove unused ocl::Queue Signed-off-by: Li Peng <peng.li@intel.com>	2017-11-24 02:19:10 +08:00
Alexander Alekhin	49a5280198	Merge pull request #10139 from alalek:dnn_rename_caffe_proto_package	2017-11-23 14:10:42 +00:00
Alexander Alekhin	f37f4cf3b4	Merge pull request #9994 from r2d3:dnn_memory_load	2017-11-22 18:15:00 +00:00
Alexander Alekhin	e7d62d6ef3	Merge pull request #10126 from alalek:dnn_issue_10125	2017-11-22 18:03:51 +00:00
Alexander Alekhin	1c88a566e0	dnn: rename caffe protobuf package	2017-11-22 18:34:07 +03:00
Alexander Alekhin	9db5cbf9a4	dnn: sync output/internals blobs back	2017-11-22 14:00:58 +03:00
Vadim Pisarevsky	f8ad289311	Merge pull request #10092 from alalek:dnn_rename_caffe_proto	2017-11-22 08:16:20 +00:00
Alexander Alekhin	0f34628af7	dnn: drop OpenCL code path for DetectionOutputLayer getUMat()/getMat() calls are scope based. Results of these calls can't be stored somewhere for future usage.	2017-11-21 17:28:42 +03:00
Alexander Alekhin	438e456ce9	Merge pull request #10113 from wzw-intel:fusion	2017-11-20 18:13:33 +00:00
Alexander Alekhin	f6d927ef3b	dnn: avoid conflicts with original caffe.proto rename caffe.proto => opencv-caffe.proto	2017-11-20 19:04:00 +03:00
David Geldreich	f723cede2e	add loading TensorFlow/Caffe net from memory buffer add a corresponding test	2017-11-20 16:28:22 +01:00
Dmitry Kurtaev	6c5dd5cf6d	Replace caffe::NormalizedBBox to local structure	2017-11-20 18:03:31 +03:00
Wu Zhiwen	45d11dde57	dnn(ocl4dnn): add fusion support for Power activation and eltwise add Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>	2017-11-20 14:58:53 +08:00
Wu Zhiwen	394101d6ed	dnn(ocl4dnn): Fix relu fusion bug Incorrect type of negative_slope result in this bug. Also and OCL test for darknet to validate this patch. Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>	2017-11-17 16:21:56 +08:00
Wu Zhiwen	88e6daa315	dnn(ocl4dnn): Fix wrong measurement for tuning time convolution kernel use default queue to run, so that ocl::Timer , to measure the kernel run time, should use the default queue too. Also remove useless parameter for convolve() Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>	2017-11-16 13:09:57 +08:00
Li Peng	55260a8d3c	reshape mat before doing computation in fc layer Signed-off-by: Li Peng <peng.li@intel.com>	2017-11-13 09:29:50 +08:00
Alexander Alekhin	bafdc44d37	Merge pull request #10061 from Sahloul:dnn_torch_fix	2017-11-10 05:05:52 +00:00
Alexander Alekhin	8a3a75cc16	Merge pull request #9882 from pengli:ocl4dnn	2017-11-09 18:54:43 +00:00
Hamdi Sahloul	06bda58a2c	DNN Torch - workaround when torch importer is disabled	2017-11-10 00:44:06 +09:00
Li Peng	8f99083726	Add new layer forward interface Add layer forward interface with InputArrayOfArrays and OutputArrayOfArrays parameters, it allows UMat buffer to be processed and transferred in the layers. Signed-off-by: Li Peng <peng.li@intel.com>	2017-11-09 15:59:39 +08:00
Alexander Alekhin	97181a90ba	dnn(ocl4dnn/conv): bailout on missing kernel configuration	2017-11-07 17:02:17 +03:00
Alexander Alekhin	6e4f9433d0	Merge pull request #9998 from alalek:ocl_fix_dnn_softmax_9991	2017-11-03 09:16:39 +00:00
Dmitry Kurtaev	20a2dc6ac5	Fix multiple inputs models from Caffe. Fixed Concat optimization.	2017-11-02 18:55:08 +03:00
Alexander Alekhin	bacc96f4e8	dnn(ocl): fix softmax global/local size consistency	2017-11-02 17:08:40 +03:00
Dmitry Kurtaev	14af2a0c0c	Fixed Halide's copy_to_device invocation	2017-11-01 14:01:54 +03:00
Vadim Pisarevsky	bc348eb8ab	Merge pull request #9963 from dkurt:fix_caffe_shrinker	2017-10-31 12:27:19 +00:00
Dmitry Kurtaev	e1ebc4e991	Specify layer types for Caffe FP32->FP16 weights converter	2017-10-31 12:31:40 +03:00
Dmitry Kurtaev	03cefa7bfe	Set zero confidences in case of no detections	2017-10-30 10:17:57 +03:00
Vadim Pisarevsky	e0e40405ed	Merge pull request #9847 from wzw-intel:ocl4dnn_fusion	2017-10-27 13:59:46 +00:00
Vadim Pisarevsky	ff037ebe5f	Merge pull request #9845 from dkurt:fast_neural_style_models	2017-10-27 13:59:02 +00:00
Vadim Pisarevsky	5384d2f090	Merge pull request #9880 from dkurt:caffe_ceil_mode	2017-10-27 11:51:46 +00:00
Dmitry Kurtaev	4b52b8df34	Layers for fast-neural-style models: https://github.com/jcjohnson/fast-neural-style	2017-10-27 14:26:45 +03:00
Vadim Pisarevsky	bc93775385	Merge pull request #9862 from sovrasov:dnn_nms	2017-10-27 11:19:57 +00:00
Vadim Pisarevsky	825c0ffdb4	Merge pull request #9874 from dkurt:fix_identity_permute_layer	2017-10-27 11:11:48 +00:00
Vadim Pisarevsky	69f2590359	Merge pull request #9921 from dkurt:fix_prelu_after_fully_connected	2017-10-27 11:10:59 +00:00
Vadim Pisarevsky	7b8fb64f21	Merge pull request #9939 from alalek:fix_dnn_getUMat_crash	2017-10-27 11:06:22 +00:00
Vladislav Sovrasov	5bf39ceb5d	dnn: handle 4-channel images in blobFromImage (#9944 )	2017-10-27 14:06:53 +03:00
Alexander Alekhin	436a1f72a5	dnn: fix sporadic crashes in getUMat() Incorrect "total" buffer size calculated in StdMatAllocator::allocate() due wrong step values.	2017-10-25 18:07:05 +03:00
Vladislav Sovrasov	7e3e9144de	dnn: add an accuracy test for NMS	2017-10-25 13:40:56 +03:00
Vladislav Sovrasov	c704942b8a	dnn: add a documentation for NMS, fix missing experimantal namespace	2017-10-25 13:35:49 +03:00
Vladislav Sovrasov	acedb4a579	dnn: make NMS function public	2017-10-25 13:35:49 +03:00
Dmitry Kurtaev	a36ebaecdc	PReLU layer for multidimensional input	2017-10-23 16:13:03 +03:00
Alexander Alekhin	185faf99bd	ocl: simplify ocl::Timer interface	2017-10-18 16:01:21 +03:00
Dmitry Kurtaev	b903ff8992	Ceil mode from experimental version of Caffe, https://github.com/BVLC/caffe/pull/3057	2017-10-18 14:04:53 +03:00
Dmitry Kurtaev	a3a446c197	Output blobs shapes initialization in case of identity permutation (NCHW->NCHW)	2017-10-17 17:15:25 +03:00
Wu Zhiwen	2d8f2c2aea	dnn(ocl4dnn): add fusion support ocl4dnn supports following fusion styles: Conv + [BN] + [Scale] + [ReLU/PReLU] Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>	2017-10-16 19:18:36 +08:00
Maksim Shabunin	b066dd36ff	Fixed uninitialized class fields	2017-10-16 13:47:43 +03:00
Alexander Alekhin	4857cae6ed	dnn: don't use "experimental_dnn_v1" namespace directly	2017-10-12 18:16:53 +03:00
Alexander Alekhin	df5b2224d7	Merge pull request #9829 from pengli:ocl4dnn	2017-10-12 11:26:20 +00:00
Li Peng	937b8e4277	dnn(ocl4dnn): support log softmax in ocl4dnn Signed-off-by: Li Peng <peng.li@intel.com>	2017-10-12 09:51:13 +08:00
Vadim Pisarevsky	e356ca2369	Merge pull request #9835 from sovrasov:blob_from_img_crop_opt	2017-10-11 17:18:40 +00:00
Vadim Pisarevsky	8b168175ec	Merge pull request #9636 from dkurt:duplicate_lp_norm_layer	2017-10-11 13:36:14 +00:00
Vadim Pisarevsky	0873ebb9b0	Merge pull request #9820 from sovrasov:text_detector_dnn	2017-10-11 13:31:46 +00:00
Vadim Pisarevsky	babd21c764	Merge pull request #9823 from alalek:dnn_halide_bypass_tbb_threads	2017-10-11 13:28:38 +00:00
Vladislav Sovrasov	47e1133e71	dnn: add crop flag to blobFromImage	2017-10-11 15:46:20 +03:00
Vladislav Sovrasov	f7175f5050	dnn: fix additional text boxes handling after the latest adaptations for TF	2017-10-11 14:04:48 +03:00
Vladislav Sovrasov	050916fd6b	dnn: modify priorBox layer	2017-10-11 11:43:50 +03:00
Dmitry Kurtaev	905a9dada2	Removed LPNormalize layer.	2017-10-10 20:38:55 +03:00
Alexander Alekhin	3935e13603	dnn(halide): don't compile Halide via parallel_for_() To avoid problem with reduced stack size of inner threads.	2017-10-10 18:06:03 +03:00
Vadim Pisarevsky	b7ff9ddcdd	Merge pull request #9705 from AlexeyAB:dnn_darknet_yolo_v2	2017-10-10 12:02:03 +00:00
Vadim Pisarevsky	046045239c	Merge pull request #9750 from dkurt:feature_dnn_tf_text_graph	2017-10-10 10:06:24 +00:00
AlexeyAB	ecc34dc521	Added DNN Darknet Yolo v2 for object detection	2017-10-09 21:08:44 +03:00
Dmitry Kurtaev	eabf728682	PReLU layer from Caffe	2017-10-09 20:30:37 +03:00
Vadim Pisarevsky	fee87ea3f7	Merge pull request #9800 from alalek:fix_build_msvs2010	2017-10-09 12:33:08 +00:00
Vadim Pisarevsky	6a80834ed4	Merge pull request #9803 from wzw-intel:ocl_timer	2017-10-09 12:11:22 +00:00
Maksim Shabunin	5a22d81fe5	Fixed warnings produced by static analyzer	2017-10-09 13:37:18 +03:00
Wu Zhiwen	dbe9ee0924	ocl: simplify ocl::Timer Use clFinish to gurantee commands completed, instead of waiting for events. Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>	2017-10-09 13:48:38 +08:00
Alexander Alekhin	e615fafe2d	build: fix MSVS2010	2017-10-08 23:32:22 +03:00
Dmitry Kurtaev	e4aa39f9e5	Text TensorFlow graphs parsing. MobileNet-SSD for 90 classes.	2017-10-08 22:25:29 +03:00
Vadim Pisarevsky	21bd834a59	Merge pull request #9772 from dkurt:fix_caffe_eltwise_and_fc_layers	2017-10-06 13:47:54 +00:00
Vadim Pisarevsky	b969d86415	Merge pull request #9787 from dkurt:feature_dnn_resize_nearest_neighbor	2017-10-06 13:46:50 +00:00
Vadim Pisarevsky	fe58b58937	Merge pull request #9778 from dkurt:dnn_colorization	2017-10-06 11:48:05 +00:00
Dmitry Kurtaev	b9f94c9315	Nearest neighbor resize layer	2017-10-06 14:33:26 +03:00
Dmitry Kurtaev	e268606e26	Grayscale colorization model (https://github.com/richzhang/colorization ) test.	2017-10-06 09:33:41 +03:00
Dmitry Kurtaev	ad8bbaf008	Multidimensional eltwise layer. Fixed fully-connected layer axis.	2017-10-04 14:01:44 +03:00
Dmitry Kurtaev	2a21c10837	Fix TensorFlow split layer	2017-10-02 22:44:42 +03:00
pengli	e340ff9c3a	Merge pull request #9114 from pengli:dnn_rebase add libdnn acceleration to dnn module (#9114) * import libdnn code Signed-off-by: Li Peng <peng.li@intel.com> * add convolution layer ocl acceleration Signed-off-by: Li Peng <peng.li@intel.com> * add pooling layer ocl acceleration Signed-off-by: Li Peng <peng.li@intel.com> * add softmax layer ocl acceleration Signed-off-by: Li Peng <peng.li@intel.com> * add lrn layer ocl acceleration Signed-off-by: Li Peng <peng.li@intel.com> * add innerproduct layer ocl acceleration Signed-off-by: Li Peng <peng.li@intel.com> * add HAVE_OPENCL macro Signed-off-by: Li Peng <peng.li@intel.com> * fix for convolution ocl Signed-off-by: Li Peng <peng.li@intel.com> * enable getUMat() for multi-dimension Mat Signed-off-by: Li Peng <peng.li@intel.com> * use getUMat for ocl acceleration Signed-off-by: Li Peng <peng.li@intel.com> * use CV_OCL_RUN macro Signed-off-by: Li Peng <peng.li@intel.com> * set OPENCL target when it is available and disable fuseLayer for OCL target for the time being Signed-off-by: Li Peng <peng.li@intel.com> * fix innerproduct accuracy test Signed-off-by: Li Peng <peng.li@intel.com> * remove trailing space Signed-off-by: Li Peng <peng.li@intel.com> * Fixed tensorflow demo bug. Root cause is that tensorflow has different algorithm with libdnn to calculate convolution output dimension. libdnn don't calculate output dimension anymore and just use one passed in by config. * split gemm ocl file split it into gemm_buffer.cl and gemm_image.cl Signed-off-by: Li Peng <peng.li@intel.com> * Fix compile failure Signed-off-by: Li Peng <peng.li@intel.com> * check env flag for auto tuning Signed-off-by: Li Peng <peng.li@intel.com> * switch to new ocl kernels for softmax layer Signed-off-by: Li Peng <peng.li@intel.com> * update softmax layer on some platform subgroup extension may not work well, fallback to non subgroup ocl acceleration. Signed-off-by: Li Peng <peng.li@intel.com> * fallback to cpu path for fc layer with multi output Signed-off-by: Li Peng <peng.li@intel.com> * update output message Signed-off-by: Li Peng <peng.li@intel.com> * update fully connected layer fallback to gemm API if libdnn return false Signed-off-by: Li Peng <peng.li@intel.com> * Add ReLU OCL implementation * disable layer fusion for now Signed-off-by: Li Peng <peng.li@intel.com> * Add OCL implementation for concat layer Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com> * libdnn: update license and copyrights Also refine libdnn coding style Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com> Signed-off-by: Li Peng <peng.li@intel.com> * DNN: Don't link OpenCL library explicitly * DNN: Make default preferableTarget to DNN_TARGET_CPU User should set it to DNN_TARGET_OPENCL explicitly if want to use OpenCL acceleration. Also don't fusion when using DNN_TARGET_OPENCL * DNN: refine coding style * Add getOpenCLErrorString * DNN: Use int32_t/uint32_t instread of alias * Use namespace ocl4dnn to include libdnn things * remove extra copyTo in softmax ocl path Signed-off-by: Li Peng <peng.li@intel.com> * update ReLU layer ocl path Signed-off-by: Li Peng <peng.li@intel.com> * Add prefer target property for layer class It is used to indicate the target for layer forwarding, either the default CPU target or OCL target. Signed-off-by: Li Peng <peng.li@intel.com> * Add cl_event based timer for cv::ocl * Rename libdnn to ocl4dnn Signed-off-by: Li Peng <peng.li@intel.com> Signed-off-by: wzw <zhiwen.wu@intel.com> * use UMat for ocl4dnn internal buffer Remove allocateMemory which use clCreateBuffer directly Signed-off-by: Li Peng <peng.li@intel.com> Signed-off-by: wzw <zhiwen.wu@intel.com> * enable buffer gemm in ocl4dnn innerproduct Signed-off-by: Li Peng <peng.li@intel.com> * replace int_tp globally for ocl4dnn kernels. Signed-off-by: wzw <zhiwen.wu@intel.com> Signed-off-by: Li Peng <peng.li@intel.com> * create UMat for layer params Signed-off-by: Li Peng <peng.li@intel.com> * update sign ocl kernel Signed-off-by: Li Peng <peng.li@intel.com> * update image based gemm of inner product layer Signed-off-by: Li Peng <peng.li@intel.com> * remove buffer gemm of inner product layer call cv::gemm API instead Signed-off-by: Li Peng <peng.li@intel.com> * change ocl4dnn forward parameter to UMat Signed-off-by: Li Peng <peng.li@intel.com> * Refine auto-tuning mechanism. - Use OPENCV_OCL4DNN_KERNEL_CONFIG_PATH to set cache directory for fine-tuned kernel configuration. e.g. export OPENCV_OCL4DNN_KERNEL_CONFIG_PATH=/home/tmp, the cache directory will be /home/tmp/spatialkernels/ on Linux. - Define environment OPENCV_OCL4DNN_ENABLE_AUTO_TUNING to enable auto-tuning. - OPENCV_OPENCL_ENABLE_PROFILING is only used to enable profiling for OpenCL command queue. This fix basic kernel get wrong running time, i.e. 0ms. - If creating cache directory failed, disable auto-tuning. * Detect and create cache dir on windows Signed-off-by: Li Peng <peng.li@intel.com> * Refine gemm like convolution kernel. Signed-off-by: Li Peng <peng.li@intel.com> * Fix redundant swizzleWeights calling when use cached kernel config. * Fix "out of resource" bug when auto-tuning too many kernels. * replace cl_mem with UMat in ocl4dnnConvSpatial class * OCL4DNN: reduce the tuning kernel candidate. This patch could reduce 75% of the tuning candidates with less than 2% performance impact for the final result. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> * replace cl_mem with umat in ocl4dnn convolution Signed-off-by: Li Peng <peng.li@intel.com> * remove weight_image_ of ocl4dnn inner product Actually it is unused in the computation Signed-off-by: Li Peng <peng.li@intel.com> * Various fixes for ocl4dnn 1. OCL_PERFORMANCE_CHECK(ocl::Device::getDefault().isIntel()) 2. Ptr<OCL4DNNInnerProduct<float> > innerProductOp 3. Code comments cleanup 4. ignore check on OCL cpu device Signed-off-by: Li Peng <peng.li@intel.com> * add build option for log softmax Signed-off-by: Li Peng <peng.li@intel.com> * remove unused ocl kernels in ocl4dnn Signed-off-by: Li Peng <peng.li@intel.com> * replace ocl4dnnSet with opencv setTo Signed-off-by: Li Peng <peng.li@intel.com> * replace ALIGN with cv::alignSize Signed-off-by: Li Peng <peng.li@intel.com> * check kernel build options Signed-off-by: Li Peng <peng.li@intel.com> * Handle program compilation fail properly. * Use std::numeric_limits<float>::infinity() for large float number * check ocl4dnn kernel compilation result Signed-off-by: Li Peng <peng.li@intel.com> * remove unused ctx_id Signed-off-by: Li Peng <peng.li@intel.com> * change clEnqueueNDRangeKernel to kernel.run() Signed-off-by: Li Peng <peng.li@intel.com> * change cl_mem to UMat in image based gemm Signed-off-by: Li Peng <peng.li@intel.com> * check intel subgroup support for lrn and pooling layer Signed-off-by: Li Peng <peng.li@intel.com> * Fix convolution bug if group is greater than 1 Signed-off-by: Li Peng <peng.li@intel.com> * Set default layer preferableTarget to be DNN_TARGET_CPU Signed-off-by: Li Peng <peng.li@intel.com> * Add ocl perf test for convolution Signed-off-by: Li Peng <peng.li@intel.com> * Add more ocl accuracy test Signed-off-by: Li Peng <peng.li@intel.com> * replace cl_image with ocl::Image2D Signed-off-by: Li Peng <peng.li@intel.com> * Fix build failure in elementwise layer Signed-off-by: Li Peng <peng.li@intel.com> * use getUMat() to get blob data Signed-off-by: Li Peng <peng.li@intel.com> * replace cl_mem handle with ocl::KernelArg Signed-off-by: Li Peng <peng.li@intel.com> * dnn(build): don't use C++11, OPENCL_LIBRARIES fix * dnn(ocl4dnn): remove unused OpenCL kernels * dnn(ocl4dnn): extract OpenCL code into .cl files * dnn(ocl4dnn): refine auto-tuning Defaultly disable auto-tuning, set OPENCV_OCL4DNN_ENABLE_AUTO_TUNING environment variable to enable it. Use a set of pre-tuned configs as default config if auto-tuning is disabled. These configs are tuned for Intel GPU with 48/72 EUs, and for googlenet, AlexNet, ResNet-50 If default config is not suitable, use the first available kernel config from the candidates. Candidate priority from high to low is gemm like kernel, IDLF kernel, basick kernel. * dnn(ocl4dnn): pooling doesn't use OpenCL subgroups * dnn(ocl4dnn): fix perf test OpenCV has default 3sec time limit for each performance test. Warmup OpenCL backend outside of perf measurement loop. * use ocl::KernelArg as much as possible Signed-off-by: Li Peng <peng.li@intel.com> * dnn(ocl4dnn): fix bias bug for gemm like kernel * dnn(ocl4dnn): wrap cl_mem into UMat Signed-off-by: Li Peng <peng.li@intel.com> * dnn(ocl4dnn): Refine signature of kernel config - Use more readable string as signture of kernel config - Don't count device name and vendor in signature string - Default kernel configurations are tuned for Intel GPU with 24/48/72 EUs, and for googlenet, AlexNet, ResNet-50 net model. * dnn(ocl4dnn): swap width/height in configuration * dnn(ocl4dnn): enable configs for Intel OpenCL runtime only * core: make configuration helper functions accessible from non-core modules * dnn(ocl4dnn): update kernel auto-tuning behavior Avoid unwanted creation of directories * dnn(ocl4dnn): simplify kernel to workaround OpenCL compiler crash * dnn(ocl4dnn): remove redundant code * dnn(ocl4dnn): Add more clear message for simd size dismatch. * dnn(ocl4dnn): add const to const argument Signed-off-by: Li Peng <peng.li@intel.com> * dnn(ocl4dnn): force compiler use a specific SIMD size for IDLF kernel * dnn(ocl4dnn): drop unused tuneLocalSize() * dnn(ocl4dnn): specify OpenCL queue for Timer and convolve() method * dnn(ocl4dnn): sanitize file names used for cache * dnn(perf): enable Network tests with OpenCL * dnn(ocl4dnn/conv): drop computeGlobalSize() * dnn(ocl4dnn/conv): drop unused fields * dnn(ocl4dnn/conv): simplify ctor * dnn(ocl4dnn/conv): refactor kernelConfig localSize=NULL * dnn(ocl4dnn/conv): drop unsupported double / untested half types * dnn(ocl4dnn/conv): drop unused variable * dnn(ocl4dnn/conv): alignSize/divUp * dnn(ocl4dnn/conv): use enum values * dnn(ocl4dnn): drop unused innerproduct variable Signed-off-by: Li Peng <peng.li@intel.com> * dnn(ocl4dnn): add an generic function to check cl option support * dnn(ocl4dnn): run softmax subgroup version kernel first Signed-off-by: Li Peng <peng.li@intel.com>	2017-10-02 15:38:00 +03:00
Vadim Pisarevsky	5e93c82023	Merge pull request #9491 from dkurt:tf_lstm	2017-09-28 21:04:06 +00:00
Vadim Pisarevsky	68cc2e292d	Merge pull request #9734 from dkurt:fix_deconv_layer_kernel_layout	2017-09-28 11:42:57 +00:00
Vadim Pisarevsky	45365e4df1	Merge pull request #9691 from dkurt:padding_layer_refactoring	2017-09-28 11:34:28 +00:00
Dmitry Kurtaev	6e593cd1f0	Swap dimensions of deconvolution kernel	2017-09-27 22:38:34 +03:00
Alexander Alekhin	3dee92ec50	fix usage of CV_FMA3 macro	2017-09-26 17:23:54 +03:00
Dmitry Kurtaev	84cec17913	LSTM layer for TensorFlow importer	2017-09-26 12:59:36 +03:00
Dmitry Kurtaev	222149b9c6	Refactored Padding layer	2017-09-22 12:39:00 +03:00
Dmitry Kurtaev	17a85b16fc	Remove reorder_dims attribute of Reshape layer	2017-09-21 16:42:03 +03:00
Dmitry Kurtaev	bdd8cc697a	Import wrapped Dropout subgraphs from TensorFlow	2017-09-20 13:30:25 +03:00
Vadim Pisarevsky	f7df5dd32c	Merge pull request #9305 from dkurt:public_dnn_importer_is_deprecated	2017-09-18 09:35:35 +00:00
Vadim Pisarevsky	e012ccda4a	Merge pull request #9517 from dkurt:tf_mobilenet	2017-09-18 09:31:19 +00:00
Dmitry Kurtaev	bd8e6b7e14	Make external cv::dnn::Importer usage is deprecated	2017-09-18 08:52:36 +03:00
Dmitry Kurtaev	d891e9b1d8	Layers for MobileNet from TensorFlow	2017-09-15 20:17:30 +03:00
Dmitry Kurtaev	8646d5fb84	FP16 Caffe models import and export	2017-09-15 18:06:34 +03:00
Vadim Pisarevsky	6bf8fe815d	Merge pull request #9384 from dkurt:torch_split	2017-09-15 13:05:05 +00:00
Vadim Pisarevsky	41b23fde9f	Merge pull request #9524 from dkurt:dnn_torch_openface	2017-09-15 12:38:12 +00:00
Dmitry Kurtaev	0ce7c33bc8	Torch's Concat and ConcatTable doesn't use Split layer	2017-09-14 09:26:57 +03:00
Dmitry Kurtaev	7dc6b1d7d4	Layers for OpenFace face recognition network	2017-09-14 09:11:31 +03:00
Dmitry Kurtaev	58b890b9f7	Dilated convolution import from TensorFlow	2017-09-13 18:44:14 +03:00
Vadim Pisarevsky	8c7f19850f	Merge pull request #9576 from dkurt:feature_dnn_tf_importer_fp16	2017-09-13 13:57:38 +00:00
Vadim Pisarevsky	93c3f20deb	Merge pull request #9569 from dkurt:test_dnn_ssd_halide	2017-09-13 13:29:50 +00:00
Dmitry Kurtaev	ce41a15437	Import and convert FP16 weights from TensorFlow	2017-09-12 09:50:55 +03:00
Maksim Shabunin	248e2c7d47	Fixed some issues found by static analysis	2017-09-08 12:22:12 +03:00
Dmitry Kurtaev	cad7c4d51d	MobileNet-SSD and VGG-SSD topologies in Halide	2017-09-08 09:55:53 +03:00
Alexander Alekhin	3202bbe17c	Merge pull request #9349 from dkurt:tf_deconv	2017-08-24 15:58:38 +00:00
Alexander Alekhin	8e7e24ac80	Merge pull request #9394 from dkurt:fix_halide_wrapper	2017-08-24 15:56:54 +00:00
Alexander Alekhin	25a4559565	Merge pull request #9294 from arrybn:layers_perf	2017-08-24 09:37:49 +00:00
Aleksandr Rybnikov	8b1146deb2	Added function to get timings for layers	2017-08-23 13:40:05 +03:00
Dmitry Kurtaev	54f0616a13	Deconvolution layer from TensorFlow	2017-08-21 21:38:07 +03:00
Dmitry Kurtaev	4e28c00e7b	Fix Halide buffer behavior in case of OpenCL device memory allocation	2017-08-17 13:27:54 +03:00
dkurt	339793143c	Unit tests for TensorFlow importer	2017-08-03 11:29:48 +03:00
Alexander Alekhin	0bd357e7ec	Merge pull request #9296 from dkurt:halide_device_interface	2017-08-02 20:26:30 +00:00
dkurt	b1ef44b1ac	Replace halide_opencl_device_interface	2017-08-02 20:38:30 +03:00
Aleksandr Rybnikov	8d6b8b45b6	Added ELU and test for it	2017-08-02 11:13:59 +03:00
Alexander Alekhin	bab4bc0968	Merge pull request #9284 from ipuustin:dnn-opencl-fixes	2017-08-01 13:06:01 +00:00
Ismo Puustinen	c2de5cf735	dnn: force floating point literals to be float. In OpenCL code in activations.cl, make the type of floating point literals to be float. Otherwise the values will be interpreted as doubles, causing Beignet to have type conversion issues.	2017-08-01 15:02:24 +03:00
Alexander Alekhin	2959e7aba9	Merge pull request #9188 from arrybn:mobilenet_ssd_sample	2017-08-01 11:12:54 +00:00
Aleksandr Rybnikov	ce1cc352d9	MobileNet SSD sample	2017-08-01 12:30:27 +03:00
Alexander Alekhin	3f102e5d3a	dnn: protobuf shutdown	2017-07-26 17:21:46 +03:00
Alexander Alekhin	878a6906cc	dnn: fix torch importer memory leaks	2017-07-25 12:20:55 +03:00
Tomoaki Teshima	0f91faddae	fix linker error when trying CPU_BASELINE=AVX	2017-07-21 21:13:47 +09:00
Alexander Alekhin	ab58cac236	Merge pull request #9194 from tomoaki0705:fixBuildErrorDnn	2017-07-20 15:27:07 +00:00
Alexander Alekhin	08c94aa5c0	build: reuse int32_t workaround from softfloat.hpp	2017-07-20 14:01:21 +03:00
Tomoaki Teshima	1989bc33a7	fix build error on Visual Studio 2012	2017-07-20 11:00:04 +09:00
Aleksandr Rybnikov	7d1140340e	Rewrote googlenet tests	2017-07-18 18:49:14 +03:00
Vadim Pisarevsky	0488d9bdb2	optimize out scaleLayer & concatLayer whenever possible fixed problem in concat layer by disabling memory re-use in layers with multiple inputs trying to fix the tests when Halide is used to run deep nets another attempt to fix Halide tests see if the Halide tests will pass with concat layer fusion turned off trying to fix failures in halide tests; another try one more experiment to make halide_concat & halide_enet tests pass continue attempts to fix halide tests moving on uncomment parallel concat layer seemingly fixed failures in Halide tests and re-enabled concat layer fusion; thanks to dkurt for the patch	2017-07-14 18:30:53 +03:00
Alexander Alekhin	4784c7be5f	dnn: cleanup dispatched code, fix SIMD128 types	2017-07-13 19:00:34 +03:00
Alexander Alekhin	c3e6de293f	dnn: code cleanup, refactor detection output layer	2017-07-13 19:00:34 +03:00
Alexander Alekhin	544908d06c	dnn: some minor fixes in docs, indentation, unused code	2017-07-13 15:33:49 +03:00
Alexander Alekhin	520da7aaaf	Merge pull request #9111 from vpisarev:dnn_optim_avx1	2017-07-13 12:27:05 +00:00
dkurt	3203635765	Eltwise layer fixes	2017-07-10 12:58:11 +03:00
Vadim Pisarevsky	ed9564106c	reuse AVX2-optimized kernels for AVX1 CPUs (like IvyBridge)	2017-07-06 21:36:59 +03:00
abratchik	8f7181429f	add java wrappers to dnn module	2017-07-02 11:46:20 +04:00
Maksim Shabunin	e0393f8557	Fixed some issues found by static analysis (4th round)	2017-06-30 12:26:53 +03:00
Aleksandr Rybnikov	fab4f4b9d5	Disabled logging in caffe parser in release	2017-06-29 17:36:48 +03:00
Vadim Pisarevsky	ac49a17a82	Merge pull request #9022 from dkurt:keep_conv_weights_for_halide	2017-06-29 11:09:17 +00:00
Vadim Pisarevsky	fb1dcdd17d	Merge pull request #9029 from alalek:dnn_cleanup_torch	2017-06-29 11:07:35 +00:00
Maksim Shabunin	f1a56cb4b7	Merge pull request #9028 from alalek:dnn_experimental_namespace	2017-06-29 07:37:04 +00:00
Maksim Shabunin	ace0701a46	Merge pull request #9019 from alalek:dnn_trace	2017-06-29 07:33:46 +00:00
Alexander Alekhin	511e50c19c	dnn: cleanup torch integration code	2017-06-28 21:49:37 +00:00
Alexander Alekhin	324851882a	Merge pull request #9025 from mshabunin:fix-static-3	2017-06-28 20:50:21 +00:00
Alexander Alekhin	da0960321b	dnn: added "hidden" experimental namespace Main purpose of this namespace is to avoid using of incompatible binaries that will cause applications crashes. This additional namespace will not impact "Source code API". This change allows to maintain ABI checks (with easy filtering out).	2017-06-28 20:36:57 +00:00
Maksim Shabunin	a769d69a9d	Fixed several issues found by static analysis	2017-06-28 18:06:18 +03:00
dkurt	b46f5b1b38	Align convolutional layer weights separately from origin ones	2017-06-28 17:05:56 +03:00
Alexander Alekhin	ed10383359	dnn: added trace macros	2017-06-28 14:57:26 +03:00
Vadim Pisarevsky	c5faa9aefa	Merge pull request #9013 from arrybn:ssd_last_layers_optim	2017-06-28 10:38:55 +00:00
Vadim Pisarevsky	bbb14d3746	Merge pull request #9003 from dkurt:halide_bug_fixes	2017-06-28 08:48:27 +00:00
Aleksandr Rybnikov	ec321e651f	Removed usage of std::map in DetectionOutput layer	2017-06-28 11:31:38 +03:00
Vadim Pisarevsky	2ae849091c	Merge pull request #9009 from alalek:fix_dnn_initialization	2017-06-28 08:26:29 +00:00
Vadim Pisarevsky	8b3d6603d5	another round of dnn optimization (#9011 ) * another round of dnn optimization: * increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly * improved SIMD optimization of pooling layer, optimized average pooling * cleaned up convolution layer implementation * made activation layer "attacheable" to all other layers, including fully connected and addition layer. * fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it desctibes the topology. * greatly optimized permutation layer, which improved SSD performance * parallelized element-wise binary/ternary/... ops (sum, prod, max) * also, added missing copyrights to many of the layer implementation files * temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders	2017-06-28 11:15:22 +03:00
Alexander Alekhin	00dd433368	dnn: fix LayerFactory initialization	2017-06-27 23:19:53 +03:00
Alexander Alekhin	f8a75c4361	dispatch: added CV_TRY_${OPT} macro, fix dnn build - 1: OPT is available directly or via dispatcher - 0: optimization is not compiled at all	2017-06-27 17:05:15 +03:00
dkurt	121789f78e	Fixed some bugs from Halide tests	2017-06-27 14:52:46 +03:00
Alexander Alekhin	16d1bbf2ea	dnn: fix build - winpack - opencv_world	2017-06-27 09:07:01 +03:00
Alexander Alekhin	986d27e49c	dnn: fix failed Torch tests "Torch invalid argument 2: position must be smaller than LLONG_MAX" These conditions are always true for "long position" argument.	2017-06-26 22:02:22 +03:00
Alexander Alekhin	93091ba203	dnn: AVX2 fix invalid unaligned read	2017-06-26 19:48:42 +03:00
Alexander Alekhin	93729784bb	dnn: move module from opencv_contrib `e6f63c7a38/modules/dnn`	2017-06-26 13:41:51 +03:00

... 26 27 28 29 30 ...

1739 Commits