Commit Graph

1213 Commits

Author SHA1 Message Date
insoow
2922738b6d Merge pull request #8104 from insoow:master
Gemm kernels for Intel GPU (#8104)

* Fix an issue with Kernel object reset release when consecutive Kernel::run calls

Kernel::run launch OCL gpu kernels and set a event callback function
to decreate the ref count of UMat or remove UMat when the lauched workloads
are completed. However, for some OCL kernels requires multiple call of
Kernel::run function with some kernel parameter changes (e.g., input
and output buffer offset) to get the final computation result.
In the case, the current implementation requires unnecessary
synchronization and cleanupMat.

This fix requires the user to specify whether there will be more work or not.
If there is no remaining computation, the Kernel::run will reset the
kernel object

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* GEMM kernel optimization for Intel GEN

The optimized kernels uses cl_intel_subgroups extension for better
performance.

Note: This optimized kernels will be part of ISAAC in a code generation
way under MIT license.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Fix API compatibility error

This patch fixes a OCV API compatibility error. The error was reported
due to the interface changes of Kernel::run. To resolve the issue,
An overloaded function of Kernel::run is added. It take a flag indicating
whether there are more work to be done with the kernel object without
releasing resources related to it.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Renaming intel_gpu_gemm.cpp to intel_gpu_gemm.inl.hpp

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Revert "Fix API compatibility error"

This reverts commit 2ef427db91.

Conflicts:
	modules/core/src/intel_gpu_gemm.inl.hpp

* Revert "Fix an issue with Kernel object reset release when consecutive Kernel::run calls"

This reverts commit cc7f9f5469.

* Fix the case of uninitialization D

When C is null and beta is non-zero, D is used without initialization.
This resloves the issue

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* fix potential output error due to 0 * nan

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* whitespace fix, eliminate non-ASCII symbols

* fix build warning
2017-04-19 12:57:54 +03:00
Alexander Alekhin
8ba95cd498 Merge pull request #8548 from csukuangfj:fix-typo-RNG 2017-04-11 11:46:06 +00:00
Fangjun KUANG
4065778255 fix typos. 2017-04-10 09:32:50 +02:00
nnorwitz
24e8cd1a78 Use %% for inline assembly rather than % so this compiles with clang.
Same as 9210cefb36 but for this file too.
2017-04-06 12:54:56 -07:00
Alexander Alekhin
e5d9b608c4 cmake: fix fp16 support 2017-04-04 20:34:58 +03:00
Alexander Alekhin
297ba85323 Merge pull request #8441 from alalek:dispatch_mathfuncs_core 2017-04-03 14:03:49 +00:00
Vadim Pisarevsky
4aa51f6a32 Merge pull request #8484 from berak:patch-2 2017-04-03 09:57:58 +00:00
Alexander Alekhin
bbdd8ba078 Merge pull request #8506 from sergiud:mat-move-assignment-dont-copy 2017-04-02 10:13:17 +00:00
Sergiu Deitsch
4f31759965 prevent copying in cv::Mat_<T> move assignment 2017-04-01 21:53:30 +02:00
Alexander Alekhin
b67382cbeb Merge pull request #8494 from tomoaki0705:fixWarningCuda 2017-03-31 15:42:01 +00:00
Tomoaki Teshima
507071cc6f suppress warnings on Jetson TK1 2017-03-31 08:20:59 +09:00
berak
3e0b63f65b fix comment in optim.hpp 2017-03-30 11:07:52 +02:00
jexner
b45e784beb Fix segmentation fault in cv::Mat::forEach
This issue concerns only matrices with more dimensions than columns.
See https://github.com/opencv/opencv/issues/8447
2017-03-23 18:48:59 +01:00
Alexander Alekhin
17e5e4cd5a core: CPU target dispatcher update
- use suffixes like '.avx.cpp'
- added CMake-generated files for '.simd.hpp' optimization approach
- wrap HAL intrinsic headers into separate namespaces for different build flags
- automatic vzeroupper insertion (via CV_INSTRUMENT_REGION macro)
2017-03-23 16:12:11 +03:00
Vadim Pisarevsky
8abd163464 Merge pull request #8404 from khnaba:stream-with-custom-allocator 2017-03-21 20:06:56 +00:00
Vadim Pisarevsky
e5dbd2c3a5 Merge pull request #8406 from khnaba:dft-as-algorithm 2017-03-21 20:05:54 +00:00
Vadim Pisarevsky
a57d144076 Merge pull request #7462 from alalek:cpu_multi_target 2017-03-21 19:51:32 +00:00
Naba Kumar
29680100ac Support for creating streams with custom allocator 2017-03-21 14:50:14 +02:00
Naba Kumar
00f3ad7217 Implement DFT as cv::Algorithm to support concurrent streams 2017-03-21 13:55:13 +02:00
Naba Kumar
cdcf44b3ef Expose BufferPool class for external use also 2017-03-21 13:50:02 +02:00
Alexander Alekhin
73e9c44377 Merge pull request #8370 from csukuangfj:patch-7 2017-03-14 13:32:35 +00:00
Alexander Alekhin
661f3e2160 Merge pull request #8371 from csukuangfj:patch-8 2017-03-14 13:23:02 +00:00
Alexander Alekhin
6fcb07d41e Merge pull request #8375 from Sahloul:fixes/matx/init 2017-03-14 13:22:26 +00:00
Alexander Alekhin
54f7ebdec9 Merge pull request #8380 from csukuangfj:patch-9 2017-03-14 13:20:52 +00:00
Hamdi Sahloul
171e705ba4 Fixes the constructor of 1x14, 2x7, 7x2 or 14x1 matrix 2017-03-14 18:26:22 +09:00
Fangjun KUANG
3ad6d13ff3 Fix an error in the documentation. 2017-03-14 09:57:37 +01:00
hailong-wang
207218e920 Fix the bug of Mat_<>::opeartor []
`template<typename _Tp> inline const _Tp* Mat_<_Tp>::operator [](int y) const` does not support 3d matrix since it checks rows.

This operator[] shall check size.p[0] instead.
2017-03-14 13:02:59 +08:00
Fangjun KUANG
31cc519cd3 fix typos. 2017-03-13 13:51:22 +01:00
Fangjun KUANG
3c15913f53 Impove the documentation for Mat::diag 2017-03-13 12:46:50 +01:00
Alexander Alekhin
502aa1f053 Merge pull request #8368 from csukuangfj:patch-5 2017-03-13 10:15:30 +00:00
Fangjun KUANG
95468b72f3 Fix typos in the documentation for cv::Mat. 2017-03-13 10:20:41 +01:00
Alexander Alekhin
8ef23d64a1 Merge pull request #8308 from sovrasov:fs_dmatch_kpts_update 2017-03-07 19:28:34 +00:00
Alexander Alekhin
f9f013e264 Merge pull request #8323 from csukuangfj:csukuangfj-patch-8 2017-03-07 11:24:19 +00:00
Fangjun KUANG
8a679128fa Update comments for cv::InputArray. 2017-03-06 14:45:30 +01:00
Vladislav Sovrasov
931b32d102 core: add single DMatch/Keypoint I/O 2017-03-03 13:58:55 +03:00
Vadim Pisarevsky
c7049ca627 Merge pull request #8293 from alalek:update_rng_in_parallel_for 2017-03-02 05:51:01 +00:00
Vadim Pisarevsky
ddfe688be6 Merge pull request #8299 from sovrasov:fs_fix_kpts_dmatch_output 2017-03-02 05:46:38 +00:00
Alexander Alekhin
69f1d1ddff Merge pull request #8296 from ville-v:master 2017-03-01 14:12:00 +00:00
Alexander Alekhin
47c4dcc8a3 Merge pull request #8204 from terfendail:ovx_tlcontext 2017-03-01 12:36:37 +00:00
Vladislav Sovrasov
c321d025c4 Fix DMatch and Keypoint I/O in FileStorage 2017-03-01 15:07:38 +03:00
ville-v
0c1bcf354c Fix issue #8278: "CV_XADD compile errors with Embarcadero C++ Builder 10.1" 2017-03-01 08:47:49 +02:00
ville-v
1de10f9f86 Add files via upload
Fix issue #8280: "fastmath.h related compile errors with Embarcadero C++ Builder 10.1"
2017-03-01 08:42:14 +02:00
Alexander Alekhin
649bb7ac04 core: parallel_for_(): update RNG state of the main thread 2017-02-28 18:28:15 +03:00
Alexander Alekhin
c624d82383 Merge pull request #8239 from tomoaki0705:buildUniversalIntrinsicBlend 2017-02-24 11:17:51 +00:00
Vadim Pisarevsky
12d7429ff0 Merge pull request #8064 from terfendail:sgbm_bigbuffer 2017-02-23 20:11:26 +00:00
Vitaly Tuzov
9a4b5a4545 OpenVX calls updated to use single common OpenVX context per thread 2017-02-21 16:08:23 +03:00
Alexander Alekhin
ec7f74f7b4 core(TLS): add cleanup() method 2017-02-21 16:08:23 +03:00
Fangjun KUANG
526220a171 Fix typos in the documentation (#8226)
* fix typos.

* Fix typos.

* Fix typos.

* Fix typos.

* Fix typos.
2017-02-21 12:48:15 +03:00
Vadim Pisarevsky
5bfaf9931b Merge pull request #8228 from csukuangfj:csukuangfj-patch 2017-02-21 09:46:09 +00:00
Fangjun KUANG
b1851e2f16 Add support to print cv::UMat.
Now a user can use `std::cout` to print an object of `cv::UMat` just like `cv::Mat`.
2017-02-20 16:22:46 +01:00