opencv

mirror of https://github.com/opencv/opencv.git synced 2025-06-22 03:22:10 +08:00

Author	SHA1	Message	Date
rockzhan	1187a7fa34	Merge pull request #11649 from rockzhan:dnn_dw_prelu dnn: Fix output mismatch when forward dnn model contain [depthwise conv(group=1) + bn + prelu] (#11649) * this can make sure [depthwise conv(group=1) + bn + prelu] output not shift * add TEST to show the output mismatch in [DWconv+Prelu] * fix typo * change loading image to init cvMat directly * build runtime model, without loading external model * remove whitespace * change way to create a cvmat * add bias_term, add target output * fix [dwconv + prelu] value mismatch when no optimizations * fix Test error when change output channels * add parametric test * change num_output to group value * change conv code and change test back	2018-06-07 13:45:54 +00:00
Arjan van de Ven	a75840d19c	Merge pull request #10468 from fenrus75:avx512-2 * Add a 512 bit codepath to the AVX512 fastConv function this patch adds a 512 wide codepath to the fastConv() function for AVX512 use. The basic idea is to process the first N * 16 elements of the vector with avx512, and then run the rest of the vector using the traditional AVX2 codepath. * dnn: use unaligned AVX512 load (OpenCV aligns data on 32-byte boundary) * dnn: change "vecsize" condition for AVX512 * dnn: fix indentation	2018-01-31 16:34:12 +03:00
Alexander Alekhin	7d67d60fb1	cmake(opt): AVX512_SKX	2017-12-29 07:18:11 +00:00
Alexander Alekhin	898ca38257	cmake: AVX512 -> AVX_512F	2017-12-28 15:20:27 +00:00
Arjan van de Ven	2938860b3f	Provide a few AVX512 optimized functions for the DNN module This patch adds AVX512 optimized fastConv as well as the hookups needed to get these called in the convolution_layer. AVX512 fastConv is code-identical on a C level to the AVX2 one, but is measurably faster due to AVX512 having more registers available to cache results in. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2017-12-26 16:00:17 +00:00
Alexander Alekhin	3dee92ec50	fix usage of CV_FMA3 macro	2017-09-26 17:23:54 +03:00
Alexander Alekhin	4784c7be5f	dnn: cleanup dispatched code, fix SIMD128 types	2017-07-13 19:00:34 +03:00
Vadim Pisarevsky	ed9564106c	reuse AVX2-optimized kernels for AVX1 CPUs (like IvyBridge)	2017-07-06 21:36:59 +03:00

8 Commits
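
Note on a75840d19c: the message describes processing the first N * 16 elements with AVX512 and handing the rest of the vector to the AVX2 codepath. Below is a minimal sketch of that wide-prefix-plus-tail split in portable C++. It is not the fastConv() code from the commit. The names `kWide`, `kNarrow`, `widePrefix` and `dotSplit` are hypothetical, and plain loops stand in for the 512-bit and 256-bit intrinsics, so only the loop structure is illustrated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed lane counts: 16 floats fit a 512-bit register, 8 fit a 256-bit one.
constexpr std::size_t kWide = 16;
constexpr std::size_t kNarrow = 8;

// Number of leading elements the wide path covers:
// the largest multiple of kWide that does not exceed n.
std::size_t widePrefix(std::size_t n) { return n - n % kWide; }

// Dot product split into a wide prefix, a narrower block, and a scalar tail.
float dotSplit(const float* a, const float* b, std::size_t n) {
    const std::size_t wideEnd = widePrefix(n);
    assert(wideEnd % kWide == 0 && wideEnd <= n);

    float acc = 0.0f;
    std::size_t i = 0;

    // Wide path: whole blocks of kWide elements (stand-in for AVX512).
    for (; i < wideEnd; i += kWide)
        for (std::size_t k = 0; k < kWide; ++k)
            acc += a[i + k] * b[i + k];

    // Narrow path: remaining whole blocks of kNarrow (stand-in for AVX2).
    for (; i + kNarrow <= n; i += kNarrow)
        for (std::size_t k = 0; k < kNarrow; ++k)
            acc += a[i + k] * b[i + k];

    // Scalar tail: whatever is left, fewer than kNarrow elements.
    for (; i < n; ++i)
        acc += a[i] * b[i];

    return acc;
}
```

For a vector of 37 floats, the wide path would cover the first 32 elements and the remaining 5 fall to the scalar tail, since fewer than 8 are left for the narrow path.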