1277 Commits

Author SHA1 Message Date
insoow
2922738b6d Merge pull request #8104 from insoow:master
Gemm kernels for Intel GPU (#8104)

* Fix an issue with Kernel object reset release when consecutive Kernel::run calls

Kernel::run launches OCL GPU kernels and sets an event callback function
to decrease the ref count of the UMat, or remove the UMat, when the launched
workloads are completed. However, some OCL kernels require multiple calls of
the Kernel::run function with some kernel parameter changes (e.g., input
and output buffer offset) to get the final computation result.
In that case, the current implementation requires unnecessary
synchronization and cleanupMat.

This fix requires the user to specify whether there will be more work or not.
If there is no remaining computation, Kernel::run will reset the
kernel object.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* GEMM kernel optimization for Intel GEN

The optimized kernels use the cl_intel_subgroups extension for better
performance.

Note: These optimized kernels will be part of ISAAC, by way of code
generation, under the MIT license.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Fix API compatibility error

This patch fixes an OCV API compatibility error. The error was reported
due to the interface changes of Kernel::run. To resolve the issue,
an overloaded function of Kernel::run is added. It takes a flag indicating
whether there is more work to be done with the kernel object, in which case
the resources related to it are not released.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Renaming intel_gpu_gemm.cpp to intel_gpu_gemm.inl.hpp

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Revert "Fix API compatibility error"

This reverts commit 2ef427db91.

Conflicts:
	modules/core/src/intel_gpu_gemm.inl.hpp

* Revert "Fix an issue with Kernel object reset release when consecutive Kernel::run calls"

This reverts commit cc7f9f5469.

* Fix the case of uninitialized D

When C is null and beta is non-zero, D is used without initialization.
This resolves the issue.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* fix potential output error due to 0 * nan

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* whitespace fix, eliminate non-ASCII symbols

* fix build warning
2017-04-19 12:57:54 +03:00
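
The "0 * nan" item in the commit above refers to a general GEMM pitfall: in D = alpha * A * B + beta * C, multiplying a NaN entry of C by beta == 0 still yields NaN. The following is a minimal CPU sketch of the usual guard, not the OpenCL kernel from this commit; the function name `gemm_naive` and the row-major layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference GEMM: D = alpha * A * B + beta * C, row-major.
// A is m x k, B is k x n, C and D are m x n.
// When beta == 0, C is not read at all, so NaN values in C (for example
// from uninitialized memory) cannot leak into D through 0 * NaN.
std::vector<float> gemm_naive(const std::vector<float>& A,
                              const std::vector<float>& B,
                              const std::vector<float>& C,
                              std::size_t m, std::size_t n, std::size_t k,
                              float alpha, float beta)
{
    assert(A.size() == m * k && B.size() == k * n);
    assert(beta == 0.0f || C.size() == m * n);
    std::vector<float> D(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];
            float d = alpha * acc;
            if (beta != 0.0f)  // skip C entirely when beta is zero
                d += beta * C[i * n + j];
            D[i * n + j] = d;
        }
    }
    return D;
}
```

With beta == 0 the result depends only on A and B, even if every element of C is NaN.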
Vitaly Tuzov
9dc36a1ece Tuned restrictions for Canny, Warp, FAST, Accumulate and Convolution OpenVX HAL calls on small images 2017-04-14 13:20:25 +03:00
Alexander Alekhin
8ba95cd498 Merge pull request #8548 from csukuangfj:fix-typo-RNG 2017-04-11 11:46:06 +00:00
Vitaly Tuzov
87bb74312b Disabled vxuConvolution call for Sobel, GaussianBlur and Box filter evaluation 2017-04-11 14:11:55 +03:00
Matthias Grundmann
fce7469961 Update utility.hpp
Adding missing header for ostream decl. in line 384
2017-04-10 12:04:09 -07:00
Fangjun KUANG
4065778255 fix typos. 2017-04-10 09:32:50 +02:00
Vitaly Tuzov
0f1a56da7c Changed restrictions for OpenVX HAL calls on small images 2017-04-09 01:17:57 +03:00
Fangjun KUANG
ff31d069d0 avoid allocating memory for string with a length of zero.
Remove the specifier "explicit", because the constructor has no parameter. There is no point in adding it here.
2017-04-07 20:46:17 +02:00
jveitchmichaelis
8f19363c07 Update documentation for getCudaEnabledDeviceCount
Inform users that getCudaEnabledDeviceCount can return -1 in some cases.
2017-04-07 14:30:14 +01:00
nnorwitz
24e8cd1a78 Use %% for inline assembly rather than % so this compiles with clang.
Same as 9210cefb36 but for this file too.
2017-04-06 12:54:56 -07:00
Vitaly Tuzov
bf62dca45a Extended restrictions for OpenVX HAL calls on small images 2017-04-06 18:17:34 +03:00
Vitaly Tuzov
bf5b7843e8 Extended set of OpenVX HAL calls disabled for small images 2017-04-06 18:17:32 +03:00
Alexander Alekhin
e5d9b608c4 cmake: fix fp16 support 2017-04-04 20:34:58 +03:00
Alexander Alekhin
297ba85323 Merge pull request #8441 from alalek:dispatch_mathfuncs_core 2017-04-03 14:03:49 +00:00
Vadim Pisarevsky
4aa51f6a32 Merge pull request #8484 from berak:patch-2 2017-04-03 09:57:58 +00:00
Alexander Alekhin
bbdd8ba078 Merge pull request #8506 from sergiud:mat-move-assignment-dont-copy 2017-04-02 10:13:17 +00:00
Sergiu Deitsch
4f31759965 prevent copying in cv::Mat_<T> move assignment 2017-04-01 21:53:30 +02:00
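
One common way a move assignment ends up copying is forwarding a named rvalue reference to the base class without `std::move`. The sketch below illustrates that pattern with hypothetical `Buffer` and `TypedBuffer` types; it is an assumption about the kind of fix involved, not the actual cv::Mat_ code.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical untyped base class that counts copy assignments.
struct Buffer {
    std::vector<int> data;
    int copies = 0;  // number of copy assignments performed on this object

    Buffer() = default;
    Buffer(const Buffer&) = default;
    Buffer(Buffer&&) = default;

    Buffer& operator=(const Buffer& other)
    {
        data = other.data;
        ++copies;
        return *this;
    }
    Buffer& operator=(Buffer&& other) noexcept
    {
        data = std::move(other.data);
        return *this;
    }
};

// Hypothetical typed wrapper around Buffer.
template <typename T>
struct TypedBuffer : Buffer {
    TypedBuffer() = default;

    TypedBuffer& operator=(TypedBuffer&& other) noexcept
    {
        // A named rvalue reference is an lvalue. Without std::move the
        // base class copy assignment would be selected here.
        Buffer::operator=(std::move(other));
        return *this;
    }
};
```

After `a = std::move(b)` the `copies` counter of `a` stays at zero, which shows that the move path of the base class was taken.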
Alexander Alekhin
b67382cbeb Merge pull request #8494 from tomoaki0705:fixWarningCuda 2017-03-31 15:42:01 +00:00
Tomoaki Teshima
507071cc6f suppress warnings on Jetson TK1 2017-03-31 08:20:59 +09:00
berak
3e0b63f65b fix comment in optim.hpp 2017-03-30 11:07:52 +02:00
jexner
b45e784beb Fix segmentation fault in cv::Mat::forEach
This issue concerns only matrices with more dimensions than columns.
See https://github.com/opencv/opencv/issues/8447
2017-03-23 18:48:59 +01:00
Fangjun KUANG
da94d85789 add more info to the error code. 2017-03-23 14:40:34 +01:00
Fangjun KUANG
f82d64c6e5 Add more info to the error code. 2017-03-23 14:34:24 +01:00
Alexander Alekhin
17e5e4cd5a core: CPU target dispatcher update
- use suffixes like '.avx.cpp'
- added CMake-generated files for '.simd.hpp' optimization approach
- wrap HAL intrinsic headers into separate namespaces for different build flags
- automatic vzeroupper insertion (via CV_INSTRUMENT_REGION macro)
2017-03-23 16:12:11 +03:00
Fangjun KUANG
94521629ab fix issue 8411. 2017-03-22 23:24:47 +01:00
KUANG, Fangjun
eae1ebfd29 fix issue 8411. 2017-03-22 22:03:29 +01:00
Vadim Pisarevsky
8abd163464 Merge pull request #8404 from khnaba:stream-with-custom-allocator 2017-03-21 20:06:56 +00:00
Vadim Pisarevsky
e5dbd2c3a5 Merge pull request #8406 from khnaba:dft-as-algorithm 2017-03-21 20:05:54 +00:00
Vadim Pisarevsky
a57d144076 Merge pull request #7462 from alalek:cpu_multi_target 2017-03-21 19:51:32 +00:00
Naba Kumar
29680100ac Support for creating streams with custom allocator 2017-03-21 14:50:14 +02:00
Naba Kumar
00f3ad7217 Implement DFT as cv::Algorithm to support concurrent streams 2017-03-21 13:55:13 +02:00
Naba Kumar
cdcf44b3ef Expose BufferPool class for external use also 2017-03-21 13:50:02 +02:00
Alexander Alekhin
73e9c44377 Merge pull request #8370 from csukuangfj:patch-7 2017-03-14 13:32:35 +00:00
Alexander Alekhin
661f3e2160 Merge pull request #8371 from csukuangfj:patch-8 2017-03-14 13:23:02 +00:00
Alexander Alekhin
6fcb07d41e Merge pull request #8375 from Sahloul:fixes/matx/init 2017-03-14 13:22:26 +00:00
Alexander Alekhin
54f7ebdec9 Merge pull request #8380 from csukuangfj:patch-9 2017-03-14 13:20:52 +00:00
Hamdi Sahloul
171e705ba4 Fixes the constructor of 1x14, 2x7, 7x2 or 14x1 matrix 2017-03-14 18:26:22 +09:00
Fangjun KUANG
3ad6d13ff3 Fix an error in the documentation. 2017-03-14 09:57:37 +01:00
hailong-wang
207218e920 Fix the bug of Mat_<>::operator []
`template<typename _Tp> inline const _Tp* Mat_<_Tp>::operator [](int y) const` does not support 3D matrices because it checks rows.

This operator[] shall check size.p[0] instead.
2017-03-14 13:02:59 +08:00
Fangjun KUANG
31cc519cd3 fix typos. 2017-03-13 13:51:22 +01:00
Fangjun KUANG
3c15913f53 Improve the documentation for Mat::diag 2017-03-13 12:46:50 +01:00
Alexander Alekhin
502aa1f053 Merge pull request #8368 from csukuangfj:patch-5 2017-03-13 10:15:30 +00:00
Fangjun KUANG
95468b72f3 Fix typos in the documentation for cv::Mat. 2017-03-13 10:20:41 +01:00
KUANG, Fangjun
3c5d87cbae Add more information to the error code. 2017-03-11 10:55:50 +01:00
Alexander Alekhin
8ef23d64a1 Merge pull request #8308 from sovrasov:fs_dmatch_kpts_update 2017-03-07 19:28:34 +00:00
Alexander Alekhin
f9f013e264 Merge pull request #8323 from csukuangfj:csukuangfj-patch-8 2017-03-07 11:24:19 +00:00
Fangjun KUANG
8a679128fa Update comments for cv::InputArray. 2017-03-06 14:45:30 +01:00
Vladislav Sovrasov
931b32d102 core: add single DMatch/Keypoint I/O 2017-03-03 13:58:55 +03:00
Vadim Pisarevsky
c7049ca627 Merge pull request #8293 from alalek:update_rng_in_parallel_for 2017-03-02 05:51:01 +00:00
Vadim Pisarevsky
ddfe688be6 Merge pull request #8299 from sovrasov:fs_fix_kpts_dmatch_output 2017-03-02 05:46:38 +00:00
Alexander Alekhin
69f1d1ddff Merge pull request #8296 from ville-v:master 2017-03-01 14:12:00 +00:00
Alexander Alekhin
47c4dcc8a3 Merge pull request #8204 from terfendail:ovx_tlcontext 2017-03-01 12:36:37 +00:00
Vladislav Sovrasov
c321d025c4 Fix DMatch and Keypoint I/O in FileStorage 2017-03-01 15:07:38 +03:00
ville-v
0c1bcf354c Fix issue #8278: "CV_XADD compile errors with Embarcadero C++ Builder 10.1" 2017-03-01 08:47:49 +02:00
ville-v
1de10f9f86 Add files via upload
Fix issue #8280: "fastmath.h related compile errors with Embarcadero C++ Builder 10.1"
2017-03-01 08:42:14 +02:00
Alexander Alekhin
649bb7ac04 core: parallel_for_(): update RNG state of the main thread 2017-02-28 18:28:15 +03:00
Alexander Alekhin
c624d82383 Merge pull request #8239 from tomoaki0705:buildUniversalIntrinsicBlend 2017-02-24 11:17:51 +00:00
Vadim Pisarevsky
12d7429ff0 Merge pull request #8064 from terfendail:sgbm_bigbuffer 2017-02-23 20:11:26 +00:00
Vitaly Tuzov
9a4b5a4545 OpenVX calls updated to use single common OpenVX context per thread 2017-02-21 16:08:23 +03:00
Alexander Alekhin
ec7f74f7b4 core(TLS): add cleanup() method 2017-02-21 16:08:23 +03:00
Fangjun KUANG
526220a171 Fix typos in the documentation (#8226)
* fix typos.

* Fix typos.

* Fix typos.

* Fix typos.

* Fix typos.
2017-02-21 12:48:15 +03:00
Vadim Pisarevsky
5bfaf9931b Merge pull request #8228 from csukuangfj:csukuangfj-patch 2017-02-21 09:46:09 +00:00
Fangjun KUANG
b1851e2f16 Add support to print cv::UMat.
Now a user can use `std::cout` to print an object of `cv::UMat` just like `cv::Mat`.
2017-02-20 16:22:46 +01:00
Tomoaki Teshima
64cf206fb5 optimize blend using universal intrinsic
- add more channels/depth performance test for blend
2017-02-20 19:09:26 +09:00
Alexander Alekhin
b2da9df82d Merge pull request #8221 from csukuangfj:csukuangfj-path-2 2017-02-19 10:16:00 +00:00
Fangjun KUANG
e827a5bd9e Fix an error in the demo code for cv::Mat::forEach 2017-02-18 10:14:29 +01:00
Fangjun KUANG
57ed0e57f0 Fix the documentation for Mat::diag(int). (#8199)
* Fix the documentation for Mat::diag(int).

Fix issue #8181

* Fix the documentation for Mat::diag(int).

Fix issue #8181.

* Add support for printing out cv::Complex.

* Remove extra spaces.

* cv::Complex is submitted as a new pull request.
2017-02-16 18:00:32 +03:00
Fangjun KUANG
a8a208e0fe Merge pull request #8208 from csukuangfj:complex_support
Add support for printing out cv::Complex. (#8208)

* Add support for printing out cv::Complex.

* Conform to the format of std::complex.

* Remove extra spaces.

* Remove extra spaces.
2017-02-15 21:50:14 +03:00
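
"Conform to the format of std::complex" most likely means the `(re,im)` layout that `operator<<` produces for `std::complex`. The sketch below shows that layout for a hypothetical `SimpleComplex` type, which stands in for cv::Complex and is not its real definition.

```cpp
#include <cassert>
#include <complex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical minimal complex type, standing in for cv::Complex.
template <typename T>
struct SimpleComplex {
    T re;
    T im;
};

// Print as "(re,im)", the same layout std::complex uses with operator<<.
template <typename T>
std::ostream& operator<<(std::ostream& os, const SimpleComplex<T>& c)
{
    return os << '(' << c.re << ',' << c.im << ')';
}

// Helper for comparisons: format any streamable value into a string.
template <typename V>
std::string to_text(const V& v)
{
    std::ostringstream oss;
    oss << v;
    return oss.str();
}
```

Comparing `to_text(SimpleComplex<double>{1.5, -2.0})` with `to_text(std::complex<double>(1.5, -2.0))` is a quick way to check that the two formats agree.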
Alexander Alekhin
3fbaabc866 Merge pull request #8209 from csukuangfj:csukuangfj-patch-2 2017-02-15 18:48:01 +00:00
Fangjun KUANG
46fe74177d Fix typos. 2017-02-15 14:52:00 +01:00
Pavel Rojtberg
df86f0752a add missing casts to _Tp as determinant() always returns double 2017-02-15 12:21:17 +01:00
Alexander Alekhin
e16227b53c cmake: support multiple CPU targets 2017-02-13 19:52:59 +03:00
Fangjun KUANG
1e11657ba4 Merge pull request #8197 from csukuangfj/csukuangfj-patch-1
Fix typos in the documentation for AutoBuffer. (#8197)

* Allocate 1000 floats to match the documentation

Fix the documentation of `AutoBuffer`. By default, the following code
```.cpp
cv::AutoBuffer<float> m;
```
allocates only 264 floats. But the comment in the demonstration code says it allocates 1000 floats, which is not correct.

* fix typo in the comment.
2017-02-13 13:58:44 +03:00
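
The 264 figure in the commit above is consistent with a default capacity of "1024 bytes worth of elements plus 8". The sketch below reproduces that arithmetic; the rule and the helper name `default_fixed_size` are assumptions chosen to match the number quoted in the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: default element count of a small-buffer container,
// assuming the rule "1024 bytes worth of elements plus 8 spare slots".
// The rule is an assumption that reproduces the 264 figure for 4-byte floats.
template <typename T>
constexpr std::size_t default_fixed_size()
{
    return 1024 / sizeof(T) + 8;
}

// On platforms with 4-byte floats: 1024 / 4 + 8 == 264.
static_assert(sizeof(float) != 4 || default_fixed_size<float>() == 264,
              "4-byte floats give 264 elements");
```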
Alexander Alekhin
9ac9e9e29a core: fix String::end() implementation 2017-02-09 16:36:22 +03:00
Vitaly Tuzov
4950f542d1 Fix for SGBM compute() buffer allocation failure on big images 2017-02-08 12:49:21 +03:00
Alexander Alekhin
519e452e1a Merge pull request #8128 from LaurentBerger:MatrixExpressions
Add a link to MatExpr in Detailed Description of Mat
2017-02-06 10:34:12 +00:00
Alexander Alekhin
48f7cbec75 Merge pull request #8107 from reunanen:fix8093 2017-02-06 10:31:34 +00:00
LaurentBerger
488eb11ba8 Add a link to MatExpr in Detailed Description of Mat 2017-02-04 11:10:50 +01:00
Pavel Vlasov
a47b7a34be Adds IPP control functions to bindings export 2017-02-01 10:29:35 +03:00
Juha Reunanen
f3cb5084cf Fix #8093: CV_DbgAssert that the result of area() fits in the return value 2017-01-31 15:02:36 +02:00
Tomoaki Teshima
fd711219a2 use universal intrinsic in VBLAS
- brush up v_reduce_sum of SSE version
2017-01-31 05:36:27 +09:00
mshabunin
c6c519166e Added CV_DEPRECATED macro 2017-01-24 15:57:06 +03:00
Tomoaki Teshima
8b22099da2 use universal intrinsic and SSE4 popcount instruction in normHamming
- add v_popcount in universal intrinsic
 - add test for v_popcount
 - add wrapper of popcount for both MSVC and GCC
2017-01-12 09:09:22 +09:00
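
A Hamming norm counts the bits that differ between two buffers, which is why a population count is at its core. The following is a portable scalar sketch; the names `popcount8` and `hamming_distance` are hypothetical, and the bit-clearing loop is a stand-in for the compiler intrinsics and SIMD code the commit mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable population count for one byte (Kernighan's bit-clearing loop).
// Compiler-specific popcount intrinsics could replace this function.
inline int popcount8(std::uint8_t v)
{
    int n = 0;
    while (v) {
        v = static_cast<std::uint8_t>(v & (v - 1));  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hamming distance between two byte buffers of equal length:
// the number of bit positions in which they differ.
inline int hamming_distance(const std::uint8_t* a, const std::uint8_t* b,
                            std::size_t len)
{
    assert(a != nullptr && b != nullptr);
    int dist = 0;
    for (std::size_t i = 0; i < len; ++i)
        dist += popcount8(static_cast<std::uint8_t>(a[i] ^ b[i]));
    return dist;
}
```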
Alexander Alekhin
7dd3723abe Merge tag '3.2.0' 2016-12-23 16:20:02 +03:00
Alexander Alekhin
70bbf17b13 OpenCV Version++
3.2.0
2016-12-23 15:54:44 +03:00
Alexander Alekhin
7a95e654eb ocl: update compiled programs
- minimize library initialization time (lazy calculations of program hash)
- LRU cache of in-memory compiled programs
2016-12-19 17:17:20 +03:00
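
An LRU cache like the one named in the commit above keeps the most recently used entries and evicts the oldest one when full. The sketch below is a generic list-plus-hash-map version; the class name `LruCache`, the string key and the `int` value (standing in for a compiled program handle) are all hypothetical and do not describe the OpenCV implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache keyed by a program identifier string.
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity)
    {
        assert(capacity > 0);
    }

    // Returns a pointer to the cached value and marks it most recently used,
    // or nullptr when the key is absent.
    const int* get(const std::string& key)
    {
        auto it = index_.find(key);
        if (it == index_.end())
            return nullptr;
        items_.splice(items_.begin(), items_, it->second);  // move to front
        return &it->second->second;
    }

    // Inserts or updates an entry; evicts the least recently used one if full.
    void put(const std::string& key, int value)
    {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = value;
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (items_.size() == capacity_) {
            index_.erase(items_.back().first);
            items_.pop_back();
        }
        items_.emplace_front(key, value);
        index_[key] = items_.begin();
    }

    std::size_t size() const { return items_.size(); }

private:
    using Entry = std::pair<std::string, int>;
    std::size_t capacity_;
    std::list<Entry> items_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

`std::list::splice` moves a node without invalidating iterators, so the map entries stay valid when an item is promoted to the front.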
Alexander Alekhin
d85c11e525 OpenCV version++
3.2.0-rc
2016-12-19 17:12:18 +03:00
Alexander Alekhin
862c3aa6e1 Merge pull request #7873 from addisonElliott:Mat_Range_InitializerList 2016-12-16 16:45:17 +00:00
Alexander Alekhin
0e4dde1781 Merge pull request #7872 from alalek:merge-2.4 2016-12-16 16:03:14 +02:00
Alexander Alekhin
c038d1be60 Merge pull request #7858 from addisonElliott:master 2016-12-16 10:57:27 +00:00
Addison Elliott
eb04b2bfa9 Added N-dim submat selection with vectors
Currently, selecting a submatrix of an N-dimensional matrix requires
two lines of code, while only one line of code is required when using a 2D
array.

I added functionality to be able to select an N-dim submatrix using a
vector list instead of a Range pointer. This allows initializer lists to
be used for a one-line selection.
2016-12-15 09:16:40 -06:00
Addison Elliott
fa6692afcf Added new overloaded functions for Mat and UMat that accept std::vector<int> instead of int * for the sizes of an N-dimensional array.
This allows an N-dimensional array to be set up in one line instead of two when using C++11 initializer lists. cv::Mat(3, {zDim, yDim, xDim}, ...) can be used instead of having to create an int pointer to hold the size array.
2016-12-14 13:52:03 -06:00
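
The difference between the pointer-based and the vector-based constructors can be sketched without OpenCV. The class `NdArray` below is hypothetical and only illustrates why a `std::vector<int>` parameter allows a one-line setup with a braced initializer list; it is not the cv::Mat interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical N-dimensional array of floats, used only to compare the
// two construction styles.
class NdArray {
public:
    // Pointer-based form: the caller must first build a separate size array.
    NdArray(int ndims, const int* sizes)
        : NdArray(std::vector<int>(sizes, sizes + ndims)) {}

    // Vector-based form: accepts a braced initializer list directly.
    explicit NdArray(const std::vector<int>& sizes)
        : sizes_(sizes), data_(count(sizes), 0.0f) {}

    int dims() const { return static_cast<int>(sizes_.size()); }
    int size(int axis) const { return sizes_.at(static_cast<std::size_t>(axis)); }
    std::size_t total() const { return data_.size(); }

private:
    // Total number of elements: the product of all dimension sizes.
    static std::size_t count(const std::vector<int>& sizes)
    {
        std::size_t n = sizes.empty() ? 0 : 1;
        for (int s : sizes) {
            assert(s > 0);
            n *= static_cast<std::size_t>(s);
        }
        return n;
    }

    std::vector<int> sizes_;
    std::vector<float> data_;
};
```

With the pointer form, a caller writes `const int sz[] = {4, 5, 6}; NdArray a(3, sz);`. With the vector form, `NdArray b({4, 5, 6});` does the same in one line.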
Rostislav Vasilikhin
8b9422a052 OpenVX wrappers rewritten with CV_OVX_RUN, VX_DbgThrow 2016-12-14 17:49:41 +03:00
mshabunin
965deaba8d Documentation fixes for latest doxygen 2016-12-14 14:14:13 +03:00
Alexander Alekhin
fbf2383d5d Merge pull request #7787 from alalek:ocl_explicit_only 2016-12-13 10:22:33 +00:00
apavlenko
1e2ddc30b1 Canny via OpenVX, Node wrapper extended (query/set attribute), some naming fixes 2016-12-09 14:53:06 +03:00
Alexander Alekhin
0724d13bcd build: cuda warnings 2016-12-04 03:10:05 +03:00
Alexander Alekhin
44d9d59f08 ocl: stop using of OpenCL without explicit UMat arguments 2016-12-04 02:34:17 +03:00
mshabunin
695c518384 Updated TBB search script and code checks 2016-12-01 16:58:38 +03:00
Alexander Alekhin
69949025db core: drop type/dims/rows/cols information in Mat::release() 2016-11-23 13:51:37 +03:00