Commit Graph

2135 Commits

Author SHA1 Message Date
Christof Kaufmann
46a668c565 Add multi-channel mask support to mean, meanStdDev and setTo
This adds support for multi-channel masks to the functions
cv::mean, cv::meanStdDev and the method Mat::setTo. The tests now use
multi-channel masks, with some probability, for operations that support them.
This also covers Mat::copyTo, which already supported multi-channel masks
but had no test confirming this.
2017-09-04 19:40:27 +02:00
Alexander Alekhin
0451629e22 core(persistence): resolve DMatch/KeyPoint problem 2017-08-31 19:35:48 +03:00
Vadim Pisarevsky
518c6ae8c6 Merge pull request #9327 from sovrasov:fs_free_on_error_fix 2017-08-28 20:25:34 +00:00
Zoltán Mizsei
6258ff36bc Haiku build fix 2017-08-26 11:37:59 +02:00
Alexander Alekhin
603fa03ac6 Merge pull request #9441 from wzw-intel:delete_program 2017-08-25 12:03:27 +00:00
Wu Zhiwen
da3da84a20 ocl: Add a function to unload a run-time cached program
This function is the counterpart of "Context::getProg".
With this function, users can unload a program
from the global run-time cached programs and save resources.
2017-08-25 08:42:11 +08:00
Alexander Alekhin
9c14a2f0aa Merge pull request #9395 from lupustr3:pvlasov/icv2017u3_update 2017-08-24 11:48:53 +00:00
Alexander Alekhin
d0509f6702 Merge pull request #9449 from ribalda:ocv 2017-08-23 19:40:36 +00:00
Ricardo Ribalda Delgado
6fc5697950 ocl: Fix OpenCL library detection in Linux
OpenCL runtime does not require OpenCL development file (libOpenCL.so),
just the "run" library (so.1).

This patch searches for the run library (so.1) if the dev library (.so)
is not found.

Web search shows that this error has been present since at least 2015
http://answers.opencv.org/question/80532/haveopencl-return-false/

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
2017-08-23 16:38:06 +02:00
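The fallback described in this commit (try the development library name first, then the runtime one) can be sketched as below. This is only an illustration, not the code from the patch: `openclCandidates`, `pickLibrary` and the `canLoad` callback are hypothetical names, and the file names `libOpenCL.so` and `libOpenCL.so.1` are assumed from the commit message.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical candidate list: development name first, runtime name second.
inline std::vector<std::string> openclCandidates()
{
    return std::vector<std::string>{"libOpenCL.so", "libOpenCL.so.1"};
}

// Returns the first candidate accepted by canLoad, or an empty string if
// none is accepted. canLoad stands in for the real loader call (assumption).
inline std::string pickLibrary(const std::vector<std::string>& candidates,
                               const std::function<bool(const std::string&)>& canLoad)
{
    for (const std::string& name : candidates)
    {
        if (canLoad(name))
            return name;
    }
    return std::string();
}
```

On a machine with only the runtime package installed, a loader that rejects `libOpenCL.so` would make `pickLibrary` return `libOpenCL.so.1` instead of reporting that OpenCL is unavailable.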
Pavel Vlasov
a57718e1ac ICV2017u3 package update;
- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
2017-08-23 14:24:43 +03:00
KUANG Fangjun
97ec91ad67 fix cv::CommandLineParser.
It should handle bool value not only of "true" but also of "TRUE" and "True".
2017-08-23 11:38:58 +02:00
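A minimal sketch of the behaviour this commit asks for, assuming a case-insensitive comparison is what is wanted. `parseBoolFlag` is a hypothetical helper for illustration and is not the cv::CommandLineParser implementation.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: treats "true", "TRUE", "True" (any letter case)
// as true and every other value as false.
inline bool parseBoolFlag(std::string value)
{
    // Lower-case the copy so a single comparison covers all spellings.
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value == "true";
}
```

The cast through `unsigned char` avoids undefined behaviour in `std::tolower` for negative `char` values.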
Rostislav Vasilikhin
66b0651607 Merge pull request #9329 from savuor:softfloat_sincos
SoftFloat: added sin, cos and docs (#9329)

* softfloat: comparison operators made inline, min() max() eps() isSubnormal() added

* softfloat: get/set sign/exp

* softfloat: get/set frac

* softfloat: tests rewritten with new tools

* softfloat: added pi(), sin(), cos()

* softfloat: more comments

* softfloat: updated sincos arg reduction

* softfloat: initial tests for sincos added

* softfloat: test works, code cleanup is pending

* softfloat: sincos argreduce rewritten

* softfloat: sincos refactored and simplified

* softfloat sincos: epsilons calibrated

* softfloat: junk code removed from tests

* softfloat: docs added

* inline comparisons undone; warning fixed
2017-08-15 09:23:26 +00:00
Alexander Alekhin
a048cb9f0d Merge pull request #9338 from dkurt:fix_ocl 2017-08-14 12:56:07 +00:00
Alexander Alekhin
ca9a88785e core(build): fix compilation of parallel.cpp (OpenMP configuration) 2017-08-14 11:42:49 +03:00
Alexander Alekhin
0ebabe17df core: fix flag processing for nested regions in cv::parallel_for_() 2017-08-10 08:37:47 +00:00
Dmitry Kurtaev
41519d3ac0 Fixed some OpenCL interface bugs 2017-08-09 11:54:55 +03:00
Vladislav Sovrasov
9a10bdbae5 core: use new assert in matmul.cpp 2017-08-08 23:00:11 +03:00
Vladislav Sovrasov
5e68b28ad3 core: fix file not closed when exception in FS 2017-08-07 21:03:59 +03:00
Alexander Alekhin
9ca39821c8 core: divUp function 2017-08-03 19:51:45 +03:00
Alexander Alekhin
dcc63d7408 Merge pull request #9248 from alalek:alloc_refactoring 2017-08-03 16:25:29 +00:00
Alexander Alekhin
16fb74425e ocl: fix program cache key 2017-07-31 17:24:08 +03:00
Alexander Alekhin
e58a778bd5 core(stat): disable IPP optimization in meanStdDev (cn > 1) 2017-07-31 14:09:18 +03:00
Alexander Alekhin
fffd0f5b68 Merge pull request #9241 from alalek:tlsSlotsSize 2017-07-30 09:53:39 +00:00
Alexander Alekhin
b46e741c95 core(alloc): drop unused code, use memalign() functions instead of hacks
valgrind provides better detection without memory buffer hacks
2017-07-27 18:10:41 +03:00
Alexander Alekhin
34f9c039c5 Merge pull request #9238 from alalek:valgrind_fixes 2017-07-27 14:33:01 +00:00
Alexander Alekhin
d35422b523 core(tls): hide assertions from Thread Sanitizer 2017-07-27 17:31:51 +03:00
Alexander Alekhin
68ef903a7c core(tls): don't use tlsSlots without synchronization 2017-07-26 22:45:55 +03:00
Alexander Alekhin
bf0173bf38 ts: update valgrind suppressions 2017-07-26 17:21:45 +03:00
Alexander Alekhin
b4e300b78b Merge pull request #9236 from dkurt:fix_json_bool 2017-07-26 13:08:13 +00:00
Alexander Alekhin
402a77e7f7 Merge pull request #9237 from alalek:fix_winrt_build 2017-07-26 10:42:49 +00:00
dkurt
583b327523 Fix JSON booleans without quotes 2017-07-26 12:51:06 +03:00
Alexander Alekhin
c512bf6c66 Merge pull request #9232 from dkurt:json_named_nodes 2017-07-25 15:56:03 +00:00
dkurt
3515f6ec33 Missed NAMED bit of JSON nodes tag 2017-07-25 13:39:32 +03:00
Alexander Alekhin
602f047fe8 build: replace WIN32 => _WIN32 2017-07-25 13:30:48 +03:00
Alexander Alekhin
7f3eea6325 core: fix Mat/UMat cleanup on exceptions in deallocate() 2017-07-25 12:27:30 +03:00
Alexander Alekhin
bc3c7e80a6 Merge pull request #9209 from alalek:fix_persistence_format 2017-07-21 10:55:40 +00:00
Alexander Alekhin
544eb4be1f IPP: update minMaxIdx, disable some AVX optimizations with mask 2017-07-21 12:56:36 +03:00
Alexander Alekhin
ec7ce81401 core: fix FileStorage format detection in case of .gz archives 2017-07-20 19:58:36 +03:00
Alexander Alekhin
5bc291937f test: FileStorage format regression test 2017-07-20 19:58:10 +03:00
Alexander Alekhin
e5ed9cc612 Merge pull request #8498 from savuor:bit_exact_lab 2017-07-17 14:01:05 +00:00
Alexander Alekhin
b4716b1d92 core: fix convertTo() AVX2 optimization 2017-07-17 15:02:14 +03:00
Rostislav Vasilikhin
4b75be801e initial version of Lab2RGB_f tetrahedral interpolation written
RGB2Lab_f added, bugs fixed, moved to float

several bugs fixed

LUT fixed, no switch in tetraInterpolate()

temporary code; to be removed and rewritten

before refactoring

extra interpolations removed, some things to do left

added Lab2RGB_b +XYZ version, etc.

basic version is done, to be sped up

tetra refactored

interpolations: LUT for weights, refactor., etc.

address arithm optimized

initial version of vectorized code added (not compiling now)

compilation fixed, now segfaults

a lot of fixes, vectorization temp. disabled

fixed trilinear shift size, max error dropped from 19 to 10

fixed several bugs (255 vs 256, signed vs unsigned, bIdx)

minor changes

packed: address arithmetics fixed

shorter code

experiments with pure integer calculations

Lab2RGB max error decreased to 2; need to clean the code

ready for vectorization; need cleaning

vectorized, to be debugged

precision fixed, max error is 2

Lab->XYZ shortened

minor fixes

Lab2RGB_f version fixed, to be completely rewritten using _b code

RGB2Lab_f vectorized

minors

moved to separate file

refactored Lab2RGB to float and int versions

minor fix

Lab2RGB_f vectorized

minor refactoring

Lab2RGBint refactored: process methods, vectorize by 4 pix

Lab2RGB_f int version is done

cleanup extra code

code copied to color.cpp

fixed blue idx bug

optimizations enabled when testing; mulFracConst introduced

divConst -> mulFracConst

calc min time in perf instead of avg

minors

process() slightly sped up

Lab2RGB_f: disabled int version

reinterpret added, minor fixes in names

some warnings fixed

changes transferred to color.cpp

RGB2Lab_f code (and trilinear interpolation code) moved to rgb2lab_faster

whitespace

shift negative fixed

more warnings fixed

"constant condition" warnings fixed, little speed up

minor changes

test_photo decolor fixed

changes copied to test_lab.cpp

idx bounds checking in LUT init

several fixes

WIP: softfloat almost integrated

test_lab partially rewritten to SoftFloat

color.cpp rewritten to SoftFloat

test_lab.cpp: accuracy code added

several fixes

RGB2Lab_b testing fixed

splineBuild() rewritten to SoftFloat

accuracy control improved

rounding fixed

Luv <=> RGB: rewritten to SoftFloat

OCL cvtColor Lab and Lut rewritten to SoftFloat

minor fixes

refactored to new SoftFloat interface

round() -> cvRound, etc.

fixed OCL tests

softfloat.cpp: internal functions made static, unused ones removed

meaningful constants

extra lines removed

unused function removed

unfinished work

it works, need to fix TODOs

refactoring; more calls rewritten

mulFracConst removed

constants made bit exact; minors

changes moved to color.cpp

fixed 1 bug and 4 warnings

OCL: fixed constants

pow(x, _1_3f) replaced by cubeRoot(x)

fixed compilation on MSVC32

magic constants explained

file with internal accuracy&speed tests moved to lab_tetra branch
2017-07-17 00:32:30 +03:00
Alexander Alekhin
5a54acef4e Merge pull request #9130 from alalek:android_define 2017-07-14 17:17:24 +00:00
Alexander Alekhin
11626fe32c Merge pull request #8975 from sovrasov:fs_additional_errors 2017-07-14 17:13:50 +00:00
Alexander Alekhin
4e39d0371d Merge pull request #9074 from alalek:cpu_dispatch_core_hamming
cpu dispatch(core): hamming
2017-07-14 16:48:07 +00:00
Alexander Alekhin
eef78f5664 Merge pull request #9061 from terfendail:convert_avx
AVX and SSE4.1 optimized conversion migrated
2017-07-14 16:43:54 +00:00
Alexander Alekhin
5ebfb52a4a ipp(minmaxIdx): disable SSE4.2 optimizations for 32f datatype
NaN values handling issue
2017-07-12 16:06:18 +03:00
PkLab.net
6dd9e18b2e add std::string overload for cv::read() 2017-07-12 15:51:11 +03:00
Vladislav Sovrasov
5b833db558 core: forbid conversion real->int in some cases in FileStorage 2017-07-12 15:50:57 +03:00
Maksim Shabunin
02db592014 Fixed several issues found by static analysis (Windows specific) 2017-07-10 23:14:02 +03:00
Alexander Alekhin
a4a47b538c build: detect Android via '__ANDROID__' macro
https://sourceforge.net/p/predef/wiki/OperatingSystems
2017-07-10 12:43:59 +03:00
Alexander Alekhin
da8dbf6cf5 ocl: async cl_buffer cleanup queue (for event callback) 2017-07-07 13:41:20 +03:00
Alexander Alekhin
daee982106 ocl: rework events handling with clSetEventCallback 2017-07-06 13:25:32 +03:00
Vitaly Tuzov
5448d9186a AVX and SSE4.1 optimized conversion implementations migrated to separate files 2017-07-04 14:48:01 +03:00
Alexander Alekhin
b66c349bba core(stat): add required CV_AVX_GUARD
Added guard with 'vzeroupper' instruction
2017-07-02 22:45:10 +00:00
Alexander Alekhin
c45d3568ae core(stat): register dispatched code, fix build 2017-07-02 22:45:10 +00:00
Alexander Alekhin
6a6222d21c core(stat): remove useless checks 2017-07-02 22:45:10 +00:00
Alexander Alekhin
880052d3f3 core(stat): create dispatch.cpp file 2017-07-02 22:45:10 +00:00
Alexander Alekhin
85afbd409b core(stat): move implementations into .hpp file w/o changes 2017-07-02 22:45:09 +00:00
Alexander Alekhin
03c3e0edcf core(stat): stat.cpp minor refactoring
- remove unused code
- added: #if CV_ENABLE_UNROLLED in Hamming's functions
2017-07-02 22:45:09 +00:00
Maksim Shabunin
1f23202ad8 Issues found by static analysis (5th round) 2017-07-01 18:56:24 +03:00
Maksim Shabunin
e0393f8557 Fixed some issues found by static analysis (4th round) 2017-06-30 12:26:53 +03:00
Vadim Pisarevsky
2ac819018d Merge pull request #9014 from sovrasov:compare_scalars_fix 2017-06-29 11:14:44 +00:00
Maksim Shabunin
a769d69a9d Fixed several issues found by static analysis 2017-06-28 18:06:18 +03:00
Vladislav Sovrasov
35a1ecef2a core: fix infinite recursion in compare 2017-06-28 15:00:52 +03:00
Maksim Shabunin
32d4af36e2 Fixing some static analysis issues 2017-06-27 14:30:26 +03:00
Alexander Alekhin
650830b9d6 build: eliminate warning 2017-06-27 08:16:40 +03:00
Vadim Pisarevsky
ef2e5a9f82 Merge pull request #8988 from sovrasov:repeat_src_eq_dst_fix 2017-06-26 21:58:26 +00:00
Rostislav Vasilikhin
e63feba8e2 fixed typo 2017-06-26 20:19:18 +03:00
Alexander Alekhin
006966e629 trace: initial support for code trace 2017-06-26 17:07:13 +03:00
Vladislav Sovrasov
4f9871817a core: forbid handling of the case when src=dst in cv::repeat 2017-06-26 14:02:52 +03:00
Vadim Pisarevsky
fa7e7e0ff9 Merge pull request #8900 from alalek:update_android_build 2017-06-23 10:58:53 +00:00
Alexander Alekhin
3e3e2dd512 android: make optional "cpufeatures", build fixes for NDK r15 2017-06-21 13:34:19 +03:00
Alexander Alekhin
d3ebe665e0 core: fix IPP optimization for sortIdx 2017-06-21 03:04:16 +00:00
Rostislav Vasilikhin
939c8e8a99 float constant replaced by int hex representations 2017-06-15 15:10:41 +03:00
Rostislav Vasilikhin
29593635ed licence updated 2017-06-14 21:20:10 +03:00
Alexander Alekhin
9fa90c8851 Merge pull request #8899 from tomoaki0705:fixSuppressWarningsUnreachable 2017-06-14 01:07:47 +00:00
Vadim Pisarevsky
fbafc700ea added v_reduce_sum4() universal intrinsic; corrected number of threads in cv::getNumThreads() in the case of GCD 2017-06-13 18:04:00 +03:00
Tomoaki Teshima
94848a3e1f suppress unreachable code warning
- fix the define condition based on the comment
2017-06-13 08:11:04 +09:00
Alexander Alekhin
a3189e36c0 Merge pull request #8753 from RyuheiMori:fix-cpu-feature-detection-on-android 2017-06-12 16:08:08 +00:00
Alexander Alekhin
3dee87b697 update CPU detection on ANDROID patch 2017-06-11 05:06:49 +00:00
Alexander Alekhin
0213b508dc Merge pull request #8868 from alalek:fix_build_softfloat 2017-06-08 20:22:53 +00:00
Alexander Alekhin
e3c0d11b55 Merge pull request #8876 from alalek:fix_build_msvs 2017-06-08 19:53:29 +00:00
Maksim Shabunin
f71ea4dfe9 Merge pull request #8816 from mshabunin:sprintf-fix
Fixed snprintf for VS 2013 (#8816)

* Fixed snprintf for VS 2013

* snprintf: removed declaration from header, changed implementation

* cv_snprintf corrected according to comments

* update snprintf patch
2017-06-08 21:53:16 +02:00
Alexander Alekhin
5c0a287ce8 build: fix warning
C4189: 'clImageUV' : local variable is initialized but not referenced
2017-06-08 20:40:36 +03:00
Alexander Alekhin
71517a910a build: fix errors for MSVS2010-2013, reduce default softfloat scope 2017-06-08 01:09:21 +00:00
Alexander Alekhin
125abe2fe4 Merge pull request #8838 from tomoaki0705:dispatchFp16 2017-06-06 15:31:42 +00:00
Tomoaki Teshima
e269ef96cb update convertFp16 using CV_CPU_CALL_FP16
* avoid link error (move the implementation of software version to header)
* make getConvertFuncFp16 local (move from precomp.hpp to convert.hpp)
* fix error on 32bit x86
2017-06-06 22:26:51 +09:00
Rostislav Vasilikhin
c6a3a18894 SoftFloat integrated (#8668)
* everything is put into softfloat.cpp and softfloat.hpp

* WIP: try to integrate softfloat into OpenCV

* extra functions removed

* softfloat made stateless

* CV_EXPORTS added

* operators fixed

* exp added, log: WIP

* log32 fixed

* shorter names; a lot of TODOs

* log64 rewritten

* cbrt32 added

* minors, refactoring

* "inline" -> "CV_INLINE"

* cast to bool warnings fixed

* several warnings fixed

* fixed warning about unsigned unary minus

* fixed warnings on type cast

* inline -> CV_INLINE

* special cases processing added (NaNs, Infs, etc.)

* constants for NaN and Inf added

* more macros and helper functions added

* added (or fixed) tests for pow32, pow64, cbrt32

* exp-like functions fixed

* minor changes

* fixed random number generation for tests

* tests for exp32 and exp64: values are compared to SoftFloat-based naive implementation

* minor warning fix

* pow(f, i) 32/64: special cases handling added

* unused functions removed

* refactoring is in progress (not compiling)

* CV_inline added

* unions {uint_t, float_t} removed

* tests compilation fixed

* static const members -> static methods returning const

* reinterpret_cast

* warning fixed

* const-ness fixed

* all FP calculations (even compile-time) are done in SoftFloat + minor fixes

* pow(f, i) removed from interface (can cause incorrect cast) to internals of pow(f, f), tests fixed

* CV_INLINE -> inline

* internal constants moved to .cpp file

* toInt_minMag() methods merged into toInt() methods

* macros moved to .cpp file

* refactoring: types renamed to softfloat and softdouble; explicit constructors, etc.

* toFloat(), toDouble() -> operator float(), operator double()

* removed f32/f64 prefixes from functions names

* toType() methods removed, round() and trunc() functions added

* minor change

* minors

* MSVC: warnings fixed

* added int cvRound(), cvFloor, cvCeil, cvTrunc, saturate_cast<T>()

* typo fixed

* type cast fixed
2017-05-29 17:07:25 +03:00
Alexander Alekhin
36918b3bb8 Merge pull request #8814 from woodychow:openmp_num_threads 2017-05-29 13:17:41 +00:00
Woody Chow
6e00c7651b Use num_threads clause of #pragma omp parallel instead to avoid calling omp_set_num_threads for the entire application 2017-05-29 14:16:10 +09:00
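The difference this commit relies on is that a `num_threads` clause applies to a single parallel region, while `omp_set_num_threads` changes the default for later regions in the whole process. A minimal sketch, assuming a simple reduction loop; `sumSquares` is a hypothetical function and is unrelated to the code in parallel.cpp.

```cpp
#include <cassert>

// Sums i*i for i in [0, n). The thread count is requested for this
// region only; no process-wide OpenMP setting is modified.
inline long long sumSquares(int n, int threads)
{
    assert(threads > 0); // num_threads requires a positive value
    long long total = 0;
    #pragma omp parallel for num_threads(threads) reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += static_cast<long long>(i) * i;
    return total;
}
```

Without OpenMP enabled at compile time the pragma is ignored and the loop runs serially with the same result.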
Vadim Pisarevsky
ee257ffe9e Merge pull request #8455 from terfendail:ovxhal_skipsmall 2017-05-26 12:10:18 +00:00
Vitaly Tuzov
1d62a025b3 Moved size restrictions for OpenVX processed images to corresponding cpp files 2017-05-25 19:25:17 +03:00
Matthias Grundmann
cf4e9e5ce2 Update matrix.cpp
Fix race condition in getDefaultAllocator and setDefaultAllocator interaction / not threadsafe currently
2017-05-24 13:55:18 +03:00
Vadim Pisarevsky
7c3577f7ae Merge pull request #8779 from vpisarev:empty_cmp_fix 2017-05-23 19:06:57 +00:00
Alexander Alekhin
15a2c7724d Merge pull request #8743 from tomoaki0705:featureConvertFp16UMat 2017-05-23 15:32:12 +00:00
Vadim Pisarevsky
4eda8efd42 resolves https://github.com/opencv/opencv/issues/7792 2017-05-23 18:16:40 +03:00
Vadim Pisarevsky
a065e4b9aa Merge pull request #8769 from mshabunin:kw-fixes 2017-05-23 14:59:36 +00:00
Tomoaki Teshima
d81cdb8e1c add OpenCL version of convertFp16 and test
* disable vector operation for now
* brush up the implementation based on comment
2017-05-23 20:00:21 +09:00
Maksim Shabunin
f23b6ba652 Fixed multidimensional count non-zero IPP implementation 2017-05-23 13:23:59 +03:00
Maksim Shabunin
b04ed5956e Fixed several issues found by static analysis in core module 2017-05-23 12:35:31 +03:00
Vadim Pisarevsky
17b89b2a35 Merge pull request #8770 from alalek:fix_pthreads_default 2017-05-22 22:18:26 +00:00
Alexander Alekhin
16ea72e6b9 build: fix snprintf() usage 2017-05-22 22:24:17 +03:00
Alexander Alekhin
900c406541 core: fix threads count in pthreads 2017-05-22 21:45:25 +03:00
Ryuhei Mori
bb3a416320 Fix cpu features detection on android 2017-05-19 21:19:13 +08:00
Alexander Alekhin
17eef4d8a9 Merge pull request #8596 from nnorwitz:nnorwitz 2017-05-12 19:48:28 +00:00
Philipp Hasper
dcd8589b67 Fixed exp32f() compilation on MSVC 2017-05-10 18:25:39 +02:00
Vadim Pisarevsky
b683e68223 Merge pull request #8398 from woodychow:normL2Sqr_avx2 2017-05-03 14:49:56 +00:00
nnorwitz
256b6bb3db Don't blow out the stack. Use a smaller buffer and prevent buffer overruns with snprintf. 2017-05-03 16:56:09 +03:00
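The pattern named in this commit, a small fixed buffer written with a bounded `snprintf`, can be sketched as follows. `formatIndex` and the 32-byte buffer size are assumptions made for the example, not values taken from the patch.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats "<name>_<index>" into a small stack buffer. snprintf never
// writes more than sizeof(buf) bytes and always NUL-terminates, so an
// over-long name is truncated instead of overrunning the buffer.
inline std::string formatIndex(const char* name, int index)
{
    assert(name != nullptr);
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%s_%d", name, index);
    return std::string(buf);
}
```

A caller that needs to detect truncation can compare the return value of `snprintf` against the buffer size; the sketch leaves that out for brevity.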
Vadim Pisarevsky
925594d1e3 Merge pull request #7894 from alalek:ocl_program 2017-05-03 13:48:58 +00:00
Alexander Alekhin
75f28245a8 core: fix persistence bug in RAW I/O code
- persistence.cpp code expects a special sizeof value for passed structures
- this assumption leads to memory corruption problems
- fixed/worked around the test to prevent memory corruption on Linux 32-bit systems
2017-04-26 17:19:26 +03:00
Alexander Alekhin
26be2402a3 Merge pull request #8629 from lupustr3:pvlasov/icv2017u2_update2 2017-04-26 10:45:37 +00:00
Pavel Vlasov
11c2ffaf1c Update for IPP for OpenCV 2017u2 integration;
Updated integrations for:
cv::split
cv::merge
cv::insertChannel
cv::extractChannel
cv::Mat::convertTo - now with scaled conversions support
cv::LUT - disabled due to performance issues
Mat::copyTo
Mat::setTo
cv::flip
cv::copyMakeBorder - currently disabled
cv::polarToCart
cv::pow - ipp pow function was removed due to performance issues
cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance
cv::countNonZero
cv::minMaxIdx
cv::norm
cv::Canny - new integration. Disabled for threaded;
cv::cornerHarris
cv::boxFilter
cv::bilateralFilter
cv::integral
2017-04-25 15:53:12 +03:00
Vadim Pisarevsky
96aaac186d Merge pull request #8616 from vpisarev:dnn4 2017-04-25 06:32:16 +00:00
Alexander Alekhin
f1c8094f5f Merge pull request #8575 from lupustr3:pvlasov/icv2017u2_initial_update 2017-04-21 10:55:29 +00:00
Pavel Vlasov
35c7216846 IPP for OpenCV 2017u2 initial enabling patch; 2017-04-20 20:26:30 +03:00
Vadim Pisarevsky
dd54f7a22a got rid of Blob and BlobShape completely; use cv::Mat and std::vector<int> instead 2017-04-19 23:20:17 +03:00
Arnaud Brejeon
636ab095b0 Merge pull request #8535 from arnaudbrejeon:std_array
Add support for std::array<T, N> (#8535)

* Add support for std::array<T, N>

* Add std::array<Mat, N> support

* Remove UMat constructor with std::array parameter
2017-04-19 13:13:39 +03:00
insoow
2922738b6d Merge pull request #8104 from insoow:master
Gemm kernels for Intel GPU (#8104)

* Fix an issue with Kernel object reset release when consecutive Kernel::run calls

Kernel::run launches OCL GPU kernels and sets an event callback function
to decrease the ref count of the UMat or remove the UMat when the launched
workloads are completed. However, some OCL kernels require multiple calls of
the Kernel::run function with some kernel parameter changes (e.g., input
and output buffer offset) to get the final computation result.
In that case, the current implementation requires unnecessary
synchronization and cleanupMat.

This fix requires the user to specify whether there will be more work or not.
If there is no remaining computation, Kernel::run will reset the
kernel object.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* GEMM kernel optimization for Intel GEN

The optimized kernels use the cl_intel_subgroups extension for better
performance.

Note: These optimized kernels will be part of ISAAC in a code generation
way under MIT license.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Fix API compatibility error

This patch fixes an OCV API compatibility error. The error was reported
due to the interface changes of Kernel::run. To resolve the issue,
an overloaded function of Kernel::run is added. It takes a flag indicating
whether there is more work to be done with the kernel object without
releasing resources related to it.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Renaming intel_gpu_gemm.cpp to intel_gpu_gemm.inl.hpp

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Revert "Fix API compatibility error"

This reverts commit 2ef427db91.

Conflicts:
	modules/core/src/intel_gpu_gemm.inl.hpp

* Revert "Fix an issue with Kernel object reset release when consecutive Kernel::run calls"

This reverts commit cc7f9f5469.

* Fix the case of uninitialization D

When C is null and beta is non-zero, D is used without initialization.
This resolves the issue.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* fix potential output error due to 0 * nan

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* whitespace fix, eliminate non-ASCII symbols

* fix build warning
2017-04-19 12:57:54 +03:00
Yuriy Solovyov
26ccc09c46 Fix zlib issue on iOS 2017-04-14 17:16:00 +03:00
Vitaly Tuzov
bf5b7843e8 Extended set of OpenVX HAL calls disabled for small images 2017-04-06 18:17:32 +03:00
Alexander Alekhin
297ba85323 Merge pull request #8441 from alalek:dispatch_mathfuncs_core 2017-04-03 14:03:49 +00:00
Alexander Alekhin
1e6ce1d2f8 core(mathfuncs_core): cpu optimization dispatched code 2017-03-23 16:17:10 +03:00
KUANG, Fangjun
03c4c37969 fix issue 8189. 2017-03-22 22:24:20 +01:00
Vadim Pisarevsky
0b3d13645f Merge pull request #8364 from csukuangfj:patch-2 2017-03-22 14:13:13 +00:00
Alexander Alekhin
ba104b61bf Merge branch 'pr8392' 2017-03-22 13:45:24 +03:00
Vadim Pisarevsky
8abd163464 Merge pull request #8404 from khnaba:stream-with-custom-allocator 2017-03-21 20:06:56 +00:00
Vadim Pisarevsky
e5dbd2c3a5 Merge pull request #8406 from khnaba:dft-as-algorithm 2017-03-21 20:05:54 +00:00
Vadim Pisarevsky
a57d144076 Merge pull request #7462 from alalek:cpu_multi_target 2017-03-21 19:51:32 +00:00
Naba Kumar
29680100ac Support for creating streams with custom allocator 2017-03-21 14:50:14 +02:00
Naba Kumar
00f3ad7217 Implement DFT as cv::Algorithm to support concurrent streams 2017-03-21 13:55:13 +02:00
Naba Kumar
cdcf44b3ef Expose BufferPool class for external use also 2017-03-21 13:50:02 +02:00
Woody Chow
c370cc10e9 Optimize normL2Sqr_ with AVX2 2017-03-16 14:20:41 +09:00
Woody Chow
a8763c1fec Optimize exp32f with AVX2 2017-03-15 17:03:36 +09:00
KUANG, Fangjun
debc1c4c95 fix an error while setting kernel argument for a 3-D matrix. 2017-03-12 18:29:49 +01:00
KUANG, Fangjun
be7d4608fb Add more comments to the members of CoreTLSData related to OpenCL. 2017-03-12 16:13:00 +01:00
Alexander Alekhin
5d31d6ebbb Merge pull request #8306 from chacha21:portability 2017-03-03 04:46:05 +00:00
chacha21
74abbd0898 Fix compilation when USE_ZLIB is false
create a dummy gzFile type
2017-03-02 16:58:51 +01:00
chacha21
aa1b031274 get rid of warning C4800 under VS2010
the "std::basic_ios::operator bool" differs between C++98 and C++11. The
"double not" syntax is portable and covers both cases with equivalent
meaning
2017-03-02 16:56:20 +01:00
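The "double not" idiom mentioned in this commit can be illustrated with a short sketch. `canReadInt` is a hypothetical function; the point is only that `!!stream` yields a plain `bool` under both C++98 and C++11 stream conversion rules.

```cpp
#include <sstream>
#include <string>

// Returns true when an int can be extracted from text.
// operator! reports the stream's failure state, so !!in is true exactly
// when extraction did not fail, with no implicit conversion involved.
inline bool canReadInt(const std::string& text)
{
    std::istringstream in(text);
    int value = 0;
    in >> value;
    return !!in;
}
```

Writing `return in;` instead would rely on an implicit conversion, which MSVC flags with C4800 in C++98 mode and which does not compile in C++11, where the conversion operator is explicit.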
Vadim Pisarevsky
c7049ca627 Merge pull request #8293 from alalek:update_rng_in_parallel_for 2017-03-02 05:51:01 +00:00
Vadim Pisarevsky
ddfe688be6 Merge pull request #8299 from sovrasov:fs_fix_kpts_dmatch_output 2017-03-02 05:46:38 +00:00
Alexander Alekhin
47c4dcc8a3 Merge pull request #8204 from terfendail:ovx_tlcontext 2017-03-01 12:36:37 +00:00
Vladislav Sovrasov
c321d025c4 Fix DMatch and Keypoint I/O in FileStorage 2017-03-01 15:07:38 +03:00
Alexander Alekhin
649bb7ac04 core: parallel_for_(): update RNG state of the main thread 2017-02-28 18:28:15 +03:00
Alexander Alekhin
b28fd79fdc core: parallel_for_(): propagate RNG state from the main thread 2017-02-28 18:22:46 +03:00
Alexander Alekhin
eee638fd81 Merge pull request #8244 from sovrasov:adjust_roi_fix 2017-02-24 11:18:35 +00:00
Vadim Pisarevsky
12d7429ff0 Merge pull request #8064 from terfendail:sgbm_bigbuffer 2017-02-23 20:11:26 +00:00
Vladislav Sovrasov
595437bdd1 hal: replace round() with cvRound() 2017-02-22 14:08:38 +03:00
Vladislav Sovrasov
14451f3f06 core: fix adjustROI behavior on indexes overflow 2017-02-22 14:05:51 +03:00
Vitaly Tuzov
9a4b5a4545 OpenVX calls updated to use single common OpenVX context per thread 2017-02-21 16:08:23 +03:00