Commit Graph

5 Commits

Author SHA1 Message Date
Alexander Alekhin
898ca38257 cmake: AVX512 -> AVX_512F 2017-12-28 15:20:27 +00:00
Arjan van de Ven
2938860b3f Provide a few AVX512 optimized functions for the DNN module

This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.

AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2017-12-26 16:00:17 +00:00
Alexander Alekhin
3dee92ec50 fix usage of CV_FMA3 macro 2017-09-26 17:23:54 +03:00
Alexander Alekhin
4784c7be5f dnn: cleanup dispatched code, fix SIMD128 types 2017-07-13 19:00:34 +03:00
Vadim Pisarevsky
ed9564106c reuse AVX2-optimized kernels for AVX1 CPUs (like IvyBridge) 2017-07-06 21:36:59 +03:00