Commit Graph

260 Commits

Author SHA1 Message Date
Alexander Alekhin
4fa82809df ocl: avoid rescheduling of async kernels 2020-09-18 14:53:50 +00:00
Alexander Alekhin
efcf307b4c ocl: cleanup dead code in case of disabled OpenCL 2020-08-31 11:30:42 +00:00
Alexander Alekhin
00890aecdf core(ocl): fix ocl::Image2d::isFormatSupported()
in case of OPENCV_OPENCL_DEVICE=disabled
2020-08-13 18:33:18 +00:00
Alexander Alekhin
54063c40de core(ocl): options to control buffer access flags
- control using of clEnqueueMapBuffer or clEnqueueReadBuffer[Rect]
- added benchmarks with OpenCL buffer access use cases
2020-04-02 11:11:06 +00:00
Jan Solanti
ad16c243ca core(ocl): Don't query image formats when none exist
clGetSupportedImageFormats returns CL_INVALID_VALUE if called with
num_entries 0 and a non-NULL image_formats pointer so let's not do that.
2020-03-04 14:15:33 +02:00
Vadim Pisarevsky
07b475062f
Merge pull request #16608 from vpisarev:fix_mac_ocl_tests
* fixed several problems when running tests on Mac:
* OCL_pyrUp
* OCL_flip
* some basic UMat tests
* histogram badarg test (out of range access)

* retained the storepix fix in ocl_flip only for 16U/16S datatype, where the OpenCL compiler on Mac generates incorrect code

* moved deletion of ACCESS_FAST flag to non-SVM branch (where SVM is shared virtual memory (in OpenCL 2.x), not support vector machine)

* force OpenCL to use read/write for GPU<=>CPU memory transfers on machines with discrete video only on Macs. On Windows/Linux the drivers are seemingly smart enough to implement map/unmap properly (and maybe more efficiently than explicit read/write)
2020-02-21 16:13:41 +03:00
Igor Murzov
cdbfdcc363 Fix OpenCL device detection when some OpenCL platform has no devices
It's not an error if some OpenCL platform has no devices. This makes
OpenCL device detection work correctly in the following scenario:

$ OPENCV_OPENCL_DEVICE=:GPU: ./opencv_test_dnn

OpenCV version: 4.1.2-dev
OpenCV VCS version: 4.1.2-80-g467748ee98-dirty
Build type: Debug
Compiler: /usr/bin/g++  (ver 7.4.0)
Parallel framework: pthreads
CPU features: SSE SSE2 SSE3 *SSE4.1 *SSE4.2 *FP16 *AVX *AVX2 *AVX512-SKX?
Intel(R) IPP version: ippIP AVX2 (l9) 2019.0.0 Gold (-) Jul 24 2018
OpenCL Platforms:
    AMD Accelerated Parallel Processing
    Portable Computing Language
        CPU: pthread-AMD Ryzen 7 2700X Eight-Core Processor (OpenCL 1.2 pocl HSTR: pthread-x86_64-pc-linux-gnu-znver1)
    NVIDIA CUDA
        dGPU: GeForce GTX 1080 (OpenCL 1.2 CUDA)
Current OpenCL device:
    Type = dGPU
    Name = GeForce GTX 1080
    Version = OpenCL 1.2 CUDA
    Driver version = 430.26
2019-11-05 20:02:39 +03:00
Alexander Alekhin
17e2bf5717 core(tls): implement releasing of TLS on thread termination
- move TLS & instrumentation code out of core/utility.hpp
- (*) TLSData lost .gather() method (to dispose thread data on thread termination)
- use TLSDataAccumulator for reliable collecting of thread data
- prefer using of .detachData() + .cleanupDetachedData() instead of .gather() method

(*) API is broken: replace TLSData => TLSDataAccumulator if gather required
(objects disposal on threads termination is not available in accumulator mode)
2019-10-24 06:36:18 +00:00
Alexander Alekhin
eacadf0e73 core(ocl): add flag OPENCV_OPENCL_ENABLE_MEM_USE_HOST_PTR
to control CL_MEM_USE_HOST_PTR usage
2019-09-25 15:12:36 +03:00
luz.paz
fcc7d8dd4e Fix modules/ typos
Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint`

backporting of commit: ec43292e1e
2019-08-16 17:34:29 +03:00
Hugo Lindström
2ee00e7f7d Merge pull request #15059 from hugolm84:improved-support-for-wince
* Improve support for Windows Embedded Compact

* Remove redundant set(WINCE true) and format CMake
2019-07-24 23:12:09 +03:00
Alexander Alekhin
b38de57f9a ts: test tags for flexible/reliable tests filtering
- added functionality to collect memory usage of OpenCL sybsystem
- memory usage of fastMalloc() (disabled by default):
  * It is not accurate sometimes - external memory profiler is required.
- specify common `CV_TEST_TAG_` macros
- added applyTestTag() function
- write memory usage / enabled tags into Google Tests output file (.xml)
2019-04-08 19:12:49 +00:00
Alexander Alekhin
8c8715c4dd fix static analysis issues 2019-03-13 17:19:39 +03:00
Alexander Alekhin
66d9a33b50 core(ocl): fix log messages 2019-02-07 16:35:14 +03:00
Alexander Alekhin
4501a2cdea ocl: support empty "ptr only" UMat in Kernel::set()
add messages to avoid silent kernel destruction
2019-01-30 14:51:06 +03:00
Alexander Alekhin
d9d9b05912 core(ocl): add parameter to limit device max workgroup size
used by OpenCV
2018-12-17 18:33:05 +00:00
Alexander Alekhin
9fd822f97e ocl: fix kernels launching with USE_HOST_PTR UMat
created from RAW memory buffers (without proper lifetime management)
2018-11-24 15:37:16 +00:00
Alexander Alekhin
b74b05d1b3 Revert CV_TRY/CV_CATCH macros
This reverts commit 7349b8f5ce (partially).
2018-11-08 19:56:52 +03:00
Alexander Alekhin
11e2a216c5 ocl(win32): bypass deallocate() during process termination 2018-10-10 18:06:06 +00:00
Alexander Alekhin
94201b7cf9 ocl: OPENCV_OPENCL_BUILD_EXTRA_OPTIONS parameter 2018-10-01 17:56:17 +03:00
Rostislav Vasilikhin
be989b3b60 Merge pull request #12637 from savuor:fix/instr_ipp_ocl
Fixes for instrumentation of IPP and OCL (#12637)

* fixed warning about re-declaring variable when both IPP and instrumentation are enabled

* fixed segfault when no funName provided

* compilation fixed when both OCL and instrumentation are enabled
2018-09-27 22:39:06 +03:00
Dmitry Kurtaev
24ab751547 Merge pull request #12565 from dkurt:dnn_non_intel_gpu
* Remove isIntel check from deep learning layers

* Remove fp16->fp32 fallbacks where it's not necessary

* Fix Kernel::run to prevent localsize > globalsize
2018-09-26 16:27:00 +03:00
Hamdi Sahloul
5d54def264 Add semicolons after CV_INSTRUMENT macros 2018-09-14 06:45:31 +09:00
Hamdi Sahloul
a39e0daacf Utilize CV_UNUSED macro 2018-09-07 20:33:52 +09:00
Alexander Alekhin
89528d7c3a core(ocl): don't expose exceptions from OpenCL callback
to avoid silent crashes of OpenCL worker threads.
2018-07-28 10:29:26 +00:00
Alexander Alekhin
b09a4a98d4 opencv: Use cv::AutoBuffer<>::data() 2018-07-04 19:11:29 +03:00
Vadim Pisarevsky
e0dbe5cfcc
handle huge matrices correctly (#11505)
* make sure that the matrix with more than INT_MAX elements is marked as non-continuous, and thus all the pixel-wise functions process it correctly (i.e. row-by-row, not as a single row, where integer overflow may occur when computing the total number of elements)
2018-05-14 15:29:14 +03:00
Alexander Alekhin
576d2dbac0 refactor: don't use CV_ErrorNoReturn() internally 2018-04-24 15:38:42 +03:00
Alexander Alekhin
d76b41b50e ocl: CL_MEM_USE_HOST_PTR workaround test 2018-04-20 14:58:42 +03:00
Alexander Alekhin
670ef403b0 ocl: improve trace messages of OpenCL calls 2018-04-13 14:54:27 +03:00
Alexander Alekhin
9111538bfb core: apply CV_OVERRIDE/CV_FINAL 2018-03-28 17:57:59 +03:00
Alexander Alekhin
b1fc7d46a5 ocl: update getOpenCLErrorString() code 2018-03-01 13:52:43 +03:00
Alexander Alekhin
ebdb0eb0c1 ocl: force clBuildProgram() call after clCreateProgramWithBinary() 2018-01-29 15:51:07 +03:00
Alexander Alekhin
cec700525c core(ocl): fix deadlock in UMatDataAutoLock
UMatData locks are not mapped on real locks (they are mapped to some "pre-initialized" pool).

Concurrent execution of these statements may lead to deadlock:
- a.copyTo(b) from thread 1
- c.copyTo(d) from thread 2
where:
- 'a' and 'd' are mapped to single lock "A".
- 'b' and 'c' are mapped to single lock "B".

Workaround is to process locks with strict order.
2018-01-16 17:33:06 +03:00
Maksim Shabunin
594a93316c Fixed concurrent OpenCL cache folder name generation 2018-01-12 19:03:16 +03:00
Alexander Alekhin
534645a12f ocl: workaround option to disable usage of buffer "Rect" operations 2017-12-22 13:05:03 +03:00
Jiri Horner
3dbf392d48 fix build with intrinsics enabled
* since #10231 opencv with instrumentation does not build
2017-12-17 20:23:15 +01:00
Tomoaki Teshima
267c5a747b suppress warnings on OpenCL build
* stop re-enabling the warning C4127
  * disabling is done in CMakeLists.txt
2017-12-13 15:07:51 +09:00
Vadim Pisarevsky
9fa505027a Merge pull request #10263 from mshabunin:embedded-build 2017-12-11 12:42:45 +00:00
Maksim Shabunin
7349b8f5ce Build for embedded systems 2017-12-11 13:27:37 +03:00
Alexander Alekhin
a82d2363f4 ocl: refactor Program API
- don't store ProgramSource in compiled Programs (resolved problem with "source" buffers lifetime)
- completelly remove Program::read/write methods implementation:
  - replaced with method to query RAW OpenCL binary without any "custom" data
- deprecate Program::getPrefix() methods
2017-12-05 22:25:14 +03:00
Alexander Alekhin
13c4a02157 ocl: low-level API to support OpenCL binary programs 2017-12-05 22:25:14 +03:00
Vadim Pisarevsky
5ce38e516e Merge pull request #10223 from vpisarev:ocl_mac_fixes
* fixed OpenCL functions on Mac, so that the tests pass

* fixed compile warnings; temporarily disabled OCL branch of TV L1 optical flow on mac

* fixed other few warnings on macos
2017-12-05 13:32:28 +03:00
Alexander Alekhin
0595ab3eef ocl: fix usage of invalid OpenCL cache on mixed 64/32-bit platforms
Observed during launch of 32/64-bit applications on Windows.
Added '32-bit' prefix for 32-bit OpenCL devices. No prefix on 64-bit configurations.
2017-12-01 14:20:18 +03:00
Vadim Pisarevsky
f5dba12762 Merge pull request #10180 from alalek:ocl_avoid_unnecessary_initialization 2017-11-29 11:42:22 +00:00
Alexander Alekhin
0ed3209b00 ocl: avoid unnecessary loading/initializing OpenCL subsystem
If there are no OpenCL/UMat methods calls from application.

OpenCL subsystem is initialized:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called

Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
2017-11-28 14:02:42 +03:00
Alexander Alekhin
c4b158ff91 Merge pull request #10167 from alalek:ocl_fix_issue_contrib1467 2017-11-27 11:05:07 +00:00
Alexander Alekhin
92b35e6758 ocl: fix null pointer access crash 2017-11-27 12:43:29 +03:00
Alexander Alekhin
b6abf0d3f9 ocl: drop obsolete cache directories after upgrade of OpenCL driver
Entries with the same platform name, the same device name and with different driver versions
are assumed obsolete.
2017-11-24 17:02:28 +03:00
Alexander Alekhin
8e6280fc8e ocl: binary program cache 2017-11-22 12:56:38 +03:00