opencv/modules
insoow 2922738b6d Merge pull request #8104 from insoow:master
Gemm kernels for Intel GPU (#8104)

* Fix an issue with Kernel object reset release when consecutive Kernel::run calls

Kernel::run launches OCL GPU kernels and sets an event callback function
to decrease the ref count of the UMat or remove the UMat when the launched
workloads are completed. However, some OCL kernels require multiple calls
of the Kernel::run function with some kernel parameter changes (e.g., input
and output buffer offset) to get the final computation result.
In that case, the current implementation requires unnecessary
synchronization and cleanupMat.

This fix requires the user to specify whether there will be more work or not.
If there is no remaining computation, Kernel::run will reset the
kernel object.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* GEMM kernel optimization for Intel GEN

The optimized kernels use the cl_intel_subgroups extension for better
performance.

Note: These optimized kernels will be part of ISAAC in a code generation
way under the MIT license.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Fix API compatibility error

This patch fixes an OCV API compatibility error. The error was reported
due to the interface changes of Kernel::run. To resolve the issue,
an overloaded function of Kernel::run is added. It takes a flag indicating
whether there is more work to be done with the kernel object without
releasing resources related to it.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Renaming intel_gpu_gemm.cpp to intel_gpu_gemm.inl.hpp

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Revert "Fix API compatibility error"

This reverts commit 2ef427db91.

Conflicts:
	modules/core/src/intel_gpu_gemm.inl.hpp

* Revert "Fix an issue with Kernel object reset release when consecutive Kernel::run calls"

This reverts commit cc7f9f5469.

* Fix the case of uninitialized D

When C is null and beta is non-zero, D is used without initialization.
This resolves the issue.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* fix potential output error due to 0 * nan

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* whitespace fix, eliminate non-ASCII symbols

* fix build warning
2017-04-19 12:57:54 +03:00
calib3d fix typo and align white spaces 2017-04-17 08:32:08 +09:00
core Merge pull request #8104 from insoow:master 2017-04-19 12:57:54 +03:00
cudaarithm Implement DFT as cv::Algorithm to support concurrent streams 2017-03-21 13:55:13 +02:00
cudabgsegm Fixed identifiers warns 2016-09-30 15:16:29 +05:30
cudacodec Fixed identifiers warns 2016-09-30 15:16:29 +05:30
cudafeatures2d test: fix cuda build 2016-11-29 01:18:10 +03:00
cudafilters BufferPool is used for temporary buffer, use mat create directly 2016-12-29 18:29:44 +08:00
cudaimgproc Merge pull request #8367 from khnaba:cuda-calchist-with-mask 2017-03-15 09:34:00 +00:00
cudalegacy Merge pull request #7370 from souch55:Fixxn 2016-10-01 10:44:56 +00:00
cudaobjdetect Add cuda::streams to by_rows and 8UC1 functions 2017-03-27 10:54:07 +02:00
cudaoptflow Fixed bug #7482. Updated dense flow routine to reference bound textures. 2017-01-18 19:30:45 +02:00
cudastereo test: fix cuda build 2016-11-29 01:18:10 +03:00
cudawarping Fixed identifiers warns 2016-09-30 15:16:29 +05:30
cudev Use %% for inline assembly rather than % so this compiles with clang. 2017-04-05 10:57:50 -07:00
features2d Reduce dependencies between modules 2017-03-15 17:58:52 +03:00
flann Merge pull request #8346 from Sahloul:fixes/python_wrapper/flann 2017-03-10 20:01:54 +00:00
highgui Merge pull request #7462 from alalek:cpu_multi_target 2017-03-21 19:51:32 +00:00
imgcodecs Merge pull request #8492 from brian-armstrong-discord:exif_inmemory 2017-04-14 23:12:07 +03:00
imgproc Fixed cvtColor OCL compilation issue (BGRA2mBGRA) 2017-04-05 11:48:29 +03:00
java Added javadoc generation 2017-04-05 18:18:39 +03:00
ml Fixed Algorithm.save and other methods work in Java 2017-04-05 17:48:38 +03:00
objdetect Merge pull request #7462 from alalek:cpu_multi_target 2017-03-21 19:51:32 +00:00
photo suppress warning on Jetson TK1 2017-04-11 18:27:12 +09:00
python Avoid memory leakage in smart pointers wrapper 2017-04-01 18:27:57 +09:00
shape remove precomp.cpp 2017-01-23 18:45:53 +03:00
stitching Clear old CameraParameters in AffineBasedEstimator 2017-03-28 15:19:25 +02:00
superres suppress warnings from cuda 2017-04-05 08:30:16 +09:00
ts ocl: dump OpenCL driver version in tests 2017-03-15 18:23:30 +03:00
video Merge pull request #8565 from iglesias:fix/bgsknn-initialization 2017-04-17 08:08:34 +00:00
videoio Merge pull request #8492 from brian-armstrong-discord:exif_inmemory 2017-04-14 23:12:07 +03:00
videostab Add method KeypointBasedMotionEstimator::estimate(InputArray, InputArray) to support both cpu & opencl algorithm processing 2017-04-14 10:08:48 +08:00
viz viz: fix compilation - we need the VTK includes before ocv_define_module 2017-01-27 15:51:19 +01:00
world core: CPU target dispatcher update 2017-03-23 16:12:11 +03:00
CMakeLists.txt world fix 2014-08-05 20:12:35 +04:00