Commit Graph

202 Commits

Author SHA1 Message Date
Alexander Alekhin
44b75eb116 core: fix OpenCL runtime compilation with HAVE_OPENCL_STATIC 2017-09-08 12:43:33 +03:00
Ricardo Ribalda Delgado
6fc5697950 ocl: Fix OpenCL library detection in Linux
OpenCL runtime does not require OpenCL development file (libOpenCL.so),
just the "run" library (so.1).

This patch searches for the run library (so.1) if the dev library (.so)
is not found.

Web search shows that this error has been present since at least 2015
http://answers.opencv.org/question/80532/haveopencl-return-false/

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
2017-08-23 16:38:06 +02:00
Maksim Shabunin
02db592014 Fixed several issues found by static analysis (Windows specific) 2017-07-10 23:14:02 +03:00
Maksim Shabunin
1f23202ad8 Issues found by static analysis (5th round) 2017-07-01 18:56:24 +03:00
Tomoaki Teshima
d81cdb8e1c add OpenCL version of convertFp16 and test
* disable vector operation for now
* brush up the implementation based on comment
2017-05-23 20:00:21 +09:00
insoow
2922738b6d Merge pull request #8104 from insoow:master
Gemm kernels for Intel GPU (#8104)

* Fix an issue with Kernel object reset release when consecutive Kernel::run calls

Kernel::run launches OCL GPU kernels and sets an event callback function
to decrease the ref count of the UMat or remove the UMat when the launched workloads
are completed. However, some OCL kernels require multiple calls of the
Kernel::run function with some kernel parameter changes (e.g., input
and output buffer offset) to get the final computation result.
In that case, the current implementation requires unnecessary
synchronization and cleanupMat.

This fix requires the user to specify whether there will be more work or not.
If there is no remaining computation, the Kernel::run will reset the
kernel object

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* GEMM kernel optimization for Intel GEN

The optimized kernels use the cl_intel_subgroups extension for better
performance.

Note: These optimized kernels will be part of ISAAC in a code generation
way under MIT license.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Fix API compatibility error

This patch fixes an OCV API compatibility error. The error was reported
due to the interface changes of Kernel::run. To resolve the issue,
an overloaded function of Kernel::run is added. It takes a flag indicating
whether there is more work to be done with the kernel object without
releasing resources related to it.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Renaming intel_gpu_gemm.cpp to intel_gpu_gemm.inl.hpp

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Revert "Fix API compatibility error"

This reverts commit 2ef427db91.

Conflicts:
	modules/core/src/intel_gpu_gemm.inl.hpp

* Revert "Fix an issue with Kernel object reset release when consecutive Kernel::run calls"

This reverts commit cc7f9f5469.

* Fix the case of uninitialization D

When C is null and beta is non-zero, D is used without initialization.
This resolves the issue.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* fix potential output error due to 0 * nan

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* whitespace fix, eliminate non-ASCII symbols

* fix build warning
2017-04-19 12:57:54 +03:00
Alexander Alekhin
fe8501c931 ocl: fix SVM code 2016-10-10 20:52:30 +03:00
Alexander Alekhin
411e551335 ocl: autogenerated code 2016-10-02 02:43:02 +03:00
Alexander Alekhin
a0add98b30 ocl: eliminate build warning
Templates are replaced by macro
2016-10-02 02:43:02 +03:00
Alexander Alekhin
c7bdbef504 ocl: fix OpenGL sharing detection (6052)
Apple OpenCL framework hasn't OpenGL sharing extension
2016-02-11 12:46:22 +03:00
Alexander Alekhin
87bbaa2c27 ocl: OpenCL headers are located in "CL" subfolder (3rdparty/include) 2016-02-11 12:44:45 +03:00
Alexander Alekhin
7e472fbf68 ocl: thread-safe OpenCL loading (6056) 2016-02-03 18:30:40 +03:00
Alexander Alekhin
2b2bc83b61 Merge pull request #4238 from vladimir-dudnik:d3d11-nv12-interop 2015-07-30 10:36:25 +00:00
Vladimir Dudnik
6bd01a96d9 finished with NV12 support for D3D11-interop. Now, if texture is in NV12 format then it will be converted to/from BGR UMat. 2015-07-29 19:52:05 +03:00
Vladimir Dudnik
d4774ead43 d3d11-nv12 interop
fixed issues with ocl nv12 cvt kernel

finished ocl nv12-to-rgba kernel, update dx-interop samples. (ocl rgba-to-nv12 kernel will be added later)

an attempt to fix build issue

fix for non opencl build issue

fix typo

fix compilation warnings

fix compile issue for Mac (OpenCL)

add conversion from rgba to nv12 (still need to debug kernel)

remove empty line at the EOF

fixed compilation warning
2015-07-29 19:52:03 +03:00
Yan Wang
132416ebe9 It is unnecessary to use fma() if no scaling.
Signed-off-by: Yan Wang <yan.wang@linux.intel.com>
2015-07-23 10:18:11 +08:00
Alexander Alekhin
c0d61964d6 ocl: fix unaligned memory access
http://code.opencv.org/issues/4462
2015-07-06 13:58:17 +03:00
Alexander Alekhin
04b2edcc8c ocl: autogenerated files for cl_gl.h 2015-06-26 14:08:27 +03:00
Alexander Alekhin
ee68d26f99 ocl: update generator scripts 2015-06-26 14:08:20 +03:00
Vadim Pisarevsky
d280205245 fixed compile errors on ARM, as well as failures in OCL_Dft* regression tests 2015-05-06 10:00:10 +03:00
Alexander Alekhin
f282fd0ebf ocl: print missing error message only if OPENCV_OPENCL_RUNTIME is used 2015-01-29 13:16:31 +03:00
Alexander Alekhin
0a07d780e0 ocl: OpenCL SVM support 2015-01-23 20:37:45 +03:00
Alexander Karsakov
462c3c25a9 Removed incorrect using of rootn() and powr() in ocl_pow 2014-11-06 16:23:02 +03:00
ElenaGvozdeva
d88fdd0378 use LOCAL_SIZE+1 2014-10-28 15:18:31 +03:00
ElenaGvozdeva
65b8a1cb37 Some small fixes 2014-10-27 14:38:22 +03:00
Elena Gvozdeva
c5a2879ce0 use vectors 2014-10-27 14:38:22 +03:00
Elena Gvozdeva
2d89df1804 use local memory 2014-10-27 14:38:21 +03:00
Elena Gvozdeva
d78bc3c321 naive implementation 2014-10-27 14:38:21 +03:00
Alexander Karsakov
c942c6539a Remove mul24 since id can be larger than 2^23 2014-09-08 13:11:58 +04:00
Vadim Pisarevsky
1f85ffa11b Merge pull request #3166 from akarsakov:ocl_native_sqrt 2014-09-01 10:36:50 +00:00
Alexander Alekhin
4d474d40e7 Merge pull request #3171 from akarsakov:amd_fft_fix 2014-08-29 16:28:31 +00:00
Ilya Lavrenov
71ec6144bd attempt to fix compilation of OpenCL cv::transpose for AMD 2014-08-29 16:59:30 +04:00
Alexander Alekhin
57fec2f2da OCL: enable clAmdFftGetVersion 2014-08-29 13:45:04 +04:00
Alexander Karsakov
491bf41356 Disabled native_sqrt for double, since it may be not implemented and gives compilation error. 2014-08-28 17:01:49 +04:00
Alexander Karsakov
a89ff402fc Refactoring of OCL_FftPlan class 2014-08-27 10:33:25 +04:00
Alexander Karsakov
3ae95150c7 Added double support for OCL version of DFT 2014-08-25 18:08:43 +04:00
Alexander Alekhin
52ac61d87c Merge pull request #3088 from vbystricky:ocl_enableNormEtc 2014-08-14 14:34:40 +00:00
vbystricky
1fe403f461 Enable OpenCL version of norm and convertScaleAbs or 32F data
Fix error in minmaxloc.cl
Change test for convertScaleAbs
Fix minMaxIdx for _src2 align
Change epsilon on the tests
2014-08-13 18:33:01 +04:00
Alexander Karsakov
c3100eeb19 Fixed buffer initialization in reduce kernel. Enabled OCL version of reduce for SUM, MAX, MIN modes. 2014-08-13 12:03:06 +04:00
vbystricky
6fb282aa39 Remove mul24, for CV8UC3 3840x2160 it generates implementation specific result 2014-08-12 11:25:23 +04:00
Alexander Karsakov
6ad4521b78 Fixed typos 2014-08-08 13:11:35 +04:00
Alexander Alekhin
55188fe991 world fix 2014-08-05 20:12:35 +04:00
vbystricky
774d277c1f Fix error in OpenCL version of meanstddev for continuous src and non-continuous mask 2014-08-05 17:30:06 +04:00
vbystricky
11a0e3ff78 Fix error in OCL minmaxloc 2014-07-31 19:04:38 +04:00
Vadim Pisarevsky
962b519708 Merge pull request #2996 from akarsakov:ocl_dft_new_concept 2014-07-28 15:59:59 +00:00
Elena Gvozdeva
fe29af2e58 Fixed bug in reduce.cl 2014-07-25 14:51:30 +04:00
Alexander Karsakov
37d01e2d27 Added license header, using cv::Ptr, small fixes. 2014-07-25 13:27:00 +04:00
Alexander Karsakov
66ac46214d Final refactoring, fixes 2014-07-24 13:23:02 +04:00
Alexander Karsakov
1d2cf0e20e Added nonzero_rows support 2014-07-22 18:31:08 +04:00
Alexander Karsakov
52f76a3283 Added rest Elena's changes 2014-07-22 18:31:08 +04:00