Commit Graph

2115 Commits

Author SHA1 Message Date
Alexander Alekhin
e49febb70f Merge pull request #10269 from terfendail:softdouble_round 2017-12-11 12:48:03 +00:00
Vadim Pisarevsky
9fa505027a Merge pull request #10263 from mshabunin:embedded-build 2017-12-11 12:42:45 +00:00
Vitaly Tuzov
86b128dbb3 Added implementation of softdouble rounding to int64_t 2017-12-11 14:29:32 +03:00
Maksim Shabunin
7349b8f5ce Build for embedded systems 2017-12-11 13:27:37 +03:00
Alexander Alekhin
a82d2363f4 ocl: refactor Program API
- don't store ProgramSource in compiled Programs (resolved problem with "source" buffers lifetime)
- completelly remove Program::read/write methods implementation:
  - replaced with method to query RAW OpenCL binary without any "custom" data
- deprecate Program::getPrefix() methods
2017-12-05 22:25:14 +03:00
Alexander Alekhin
13c4a02157 ocl: low-level API to support OpenCL binary programs 2017-12-05 22:25:14 +03:00
Vadim Pisarevsky
5ce38e516e Merge pull request #10223 from vpisarev:ocl_mac_fixes
* fixed OpenCL functions on Mac, so that the tests pass

* fixed compile warnings; temporarily disabled OCL branch of TV L1 optical flow on mac

* fixed other few warnings on macos
2017-12-05 13:32:28 +03:00
Alexander Alekhin
0595ab3eef ocl: fix usage of invalid OpenCL cache on mixed 64/32-bit platforms
Observed during launch of 32/64-bit applications on Windows.
Added '32-bit' prefix for 32-bit OpenCL devices. No prefix on 64-bit configurations.
2017-12-01 14:20:18 +03:00
Vadim Pisarevsky
f5dba12762 Merge pull request #10180 from alalek:ocl_avoid_unnecessary_initialization 2017-11-29 11:42:22 +00:00
Roman Cattaneo
d381e499ea Fixes Issue #10181
This PR fixes incorrect division by zero handling in template
specialization of `Div_SIMD` for type `double`.
2017-11-28 13:48:40 +01:00
Alexander Alekhin
0ed3209b00 ocl: avoid unnecessary loading/initializing OpenCL subsystem
If there are no OpenCL/UMat methods calls from application.

OpenCL subsystem is initialized:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called

Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
2017-11-28 14:02:42 +03:00
Alexander Alekhin
abad8977b6 Merge pull request #9247 from paroj:wrap_alog_rw 2017-11-28 10:35:33 +00:00
Pavel Rojtberg
6fbf0758bc Python: wrap Algorithm::read and Algorithm::write 2017-11-27 17:04:56 +01:00
Alexander Alekhin
c4b158ff91 Merge pull request #10167 from alalek:ocl_fix_issue_contrib1467 2017-11-27 11:05:07 +00:00
Vadim Pisarevsky
a83c12c3d5 Merge pull request #10154 from alalek:ocl_cleanup_obsolete_cache 2017-11-27 10:04:04 +00:00
Alexander Alekhin
92b35e6758 ocl: fix null pointer access crash 2017-11-27 12:43:29 +03:00
Alexander Alekhin
c38620e966 ocl: update OpenCL runtime loader
- fix compilation on Apple (undefined 'oclpath')
- don't warn on OPENCV_OPENCL_RUNTIME=disabled
2017-11-24 19:18:22 +03:00
Alexander Alekhin
b6abf0d3f9 ocl: drop obsolete cache directories after upgrade of OpenCL driver
Entries with the same platform name, the same device name and with different driver versions
are assumed obsolete.
2017-11-24 17:02:28 +03:00
Alexander Alekhin
e5d1790b7b Merge pull request #10018 from alalek:ocl_binary_cache 2017-11-23 13:37:32 +00:00
Alexander Alekhin
9db5cbf9a4 dnn: sync output/internals blobs back 2017-11-22 14:00:58 +03:00
Alexander Alekhin
8e6280fc8e ocl: binary program cache 2017-11-22 12:56:38 +03:00
Alexander Alekhin
b389c1cfe7 core(ocl): syntax fix 2017-11-20 19:19:35 +03:00
Tomoaki Teshima
3cbe60cca2 Merge pull request #9753 from tomoaki0705:universalMatmul
* add accuracy test and performance check for matmul
  * add performance tests for transform and dotProduct
  * add test Core_TransformLargeTest for 8u version of transform

* remove raw SSE2/NEON implementation from matmul.cpp
  * use universal intrinsic instead of raw intrinsic
  * remove unused templated function
  * add v_matmuladd which multiply 3x3 matrix and add 3x1 vector
  * add v_rotate_left/right in universal intrinsic
  * suppress intrinsic on some function and platform
  * add pure SW implementation of new universal intrinsics
  * add test for new universal intrinsics

* core: prevent memory access after the end of buffer

* fix perf tests
2017-11-20 15:56:53 +03:00
Maksim Shabunin
751cee8e67 Merge pull request #9907 from seiko2plus:vsxFixesImproves 2017-11-16 15:20:16 +00:00
Alexander Alekhin
981009ac1f Merge pull request #9999 from mshabunin:fix-gcc72-warnings 2017-11-07 13:37:25 +00:00
Maksim Shabunin
184daa155f Fixed minor issues reported by GCC 7.2 2017-11-03 18:06:39 +03:00
Alexander Alekhin
9c4f0a984f ocl: drop CV_OclDbgAssert 2017-11-03 13:31:37 +03:00
Alexander Alekhin
8fb48c09f7 ocl: improve debug information 2017-11-03 13:31:37 +03:00
Alexander Alekhin
7db612a545 core(ocl): fix parameters for 'intelblas_gemm_buffer_NT' kernel 2017-11-02 12:50:32 +03:00
Alexander Alekhin
bc9c9abab0 Merge pull request #9877 from mapreri:non-linux 2017-10-30 15:41:36 +00:00
Alexander Alekhin
7809c4156f core(ocl): workaround CL_OUT_OF_RESOURCES error
Flush deallocation queue before calling map/unmap
2017-10-30 17:54:56 +03:00
Sayed Adel
def444d99f core: Several improvements to Power/VSX
- changed behavior of vec_ctf, vec_ctu, vec_cts
  in gcc and clang to make them compatible with XLC
- implemented most of missing conversion intrinsics in gcc and clang
- implemented conversions intrinsics of odd-numbered elements
- ignored gcc bug warning that caused by -Wunused-but-set-variable in rare cases
- replaced right shift with algebraic right shift for signed vectors
  to shift in the sign bit.
- added new universal intrinsics v_matmuladd, v_rotate_left/right
- avoid using floating multiply-add in RNG
2017-10-28 17:46:12 +00:00
Maksim Shabunin
bf418ba342 Merge pull request #9917 from alalek:ocl_cache_program_failures 2017-10-23 12:25:25 +00:00
Alexander Alekhin
734ea77c9a ocl(macosx): fix CL_INVALID_BUILD_OPTIONS for gemm programs
MacOSX OpenCL compiler is very strict to whitespace issues
2017-10-23 13:56:11 +03:00
Alexander Alekhin
d96cac1341 ocl: cache program build failures
To prevent unnecessary compiler invocations
2017-10-23 13:46:56 +03:00
Alexander Alekhin
185faf99bd ocl: simplify ocl::Timer interface 2017-10-18 16:01:21 +03:00
Mattia Rizzolo
d026d7dcfb
Fix build for non-linux architectures but still glibc-based
Exampls of these are gnu/kfreebsd and gnu/hurd, both available as
unofficial Debian ports.

They don't define __linux__ (as they are non-linux…) but still define
__GLIBC__, so check on that.

Signed-off-by: Mattia Rizzolo <mattia@mapreri.org>
2017-10-17 00:45:27 +02:00
Vadim Pisarevsky
1563300197 Merge pull request #9833 from tomoaki0705:universalMathFuncs 2017-10-16 10:46:56 +00:00
Gregory Morse
d30a0c6f03 Merge pull request #9856 from GregoryMorse:patch-1
* Update OpenCVCompilerOptimizations.cmake

Neon not supported on MSVC ARM breaking build fix

* Update OpenCVCompilerOptimizations.cmake

Whitespace

* Update intrin.hpp

Many problems in MSVC ARM builds (at least on VS2017) being fixed in this PR now.

C:\Users\Gregory\DOCUME~1\MYLIBR~1\OPENCV~3\opencv\sources\modules\core\include\opencv2/core/hal/intrin.hpp(444): error C3861: '_tzcnt_u32': identifier not found

* Update hal_replacement.hpp

Passing variadic expansion in a macro to another macro does not work properly in MSVC and a famous known workaround is hereby applied.  Discussion of it: https://stackoverflow.com/questions/5134523/msvc-doesnt-expand-va-args-correctly
Only needed the fix for ARM builds: TEGRA_ macros are used for cv_hal_ functions in the carotene library.

C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): warning C4003: not enough actual parameters for macro 'TEGRA_ADD'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2143: syntax error: missing ')' before ','
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2059: syntax error: ')'

* Update hal_replacement.hpp

All hal_replacement's using carotene\hal\tegra_hal.hpp TEGRA_ functions as macros preprocessed by variadic macros should be changed, identical as was done in core.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): warning C4003: not enough actual parameters for macro 'TEGRA_CVTBGRTOBGR'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): error C2059: syntax error: '=='

* Update OpenCVCompilerOptimizations.cmake

* Update hal_replacement.hpp

* Update hal_replacement.hpp
2017-10-16 12:12:35 +03:00
Tomoaki Teshima
2a781bb616 remove raw SSE2/NEON implementation from mathfuncs.cpp
* replace the implementation by universal intrinsic
  * make sure no degradation happens on ARM platform
2017-10-15 00:24:31 +09:00
Alexander Alekhin
b0c6bd0a5b build: resolve naming issue 2017-10-12 13:28:30 +03:00
Vadim Pisarevsky
7d55c09a9f Merge pull request #9763 from seiko2plus:addVsxCore 2017-10-10 10:00:31 +00:00
Vadim Pisarevsky
bfb12acde6 Merge pull request #9792 from alalek:port_9776 2017-10-09 12:46:06 +00:00
Vadim Pisarevsky
44699c59b3 Merge pull request #9799 from alalek:ocl_program 2017-10-09 12:43:46 +00:00
Wu Zhiwen
dbe9ee0924 ocl: simplify ocl::Timer
Use clFinish to gurantee commands completed, instead of waiting for events.

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
2017-10-09 13:48:38 +08:00
Sayed Adel
d077778074 Added support for VSX 2017-10-09 00:32:29 +00:00
Alexander Alekhin
6be25727ec ocl: refactor program compilation 2017-10-08 19:55:01 +03:00
Alexander Alekhin
04b4495493 ocl: define ProgramSource before program
no changes in code
2017-10-08 19:55:01 +03:00
Igor Wodiany
ffb9554787
Extract code from scalarToRawData
The same code was repeated several time for different data types, so
it was extracted as a templated function to improve maintability and
make a code more clear.
2017-10-07 19:46:45 +01:00
Igor Wodiany
b638aa74d7 Fix a memory leak in the Mat copying constructor
Exception may be rasied inside the body of a copying constructor after
refcount has been increased, and beacause in the case of the exception
destrcutor is never called what causes memory leak. This commit adds a
workaround that calls the release() function before the exception is
thrown outside the contructor.
2017-10-07 10:50:17 +03:00