1648 Commits

Author SHA1 Message Date
Alexander Alekhin
2e45095e8d winrt: fix build 2018-01-31 15:00:45 +03:00
Alexander Alekhin
f57630d92b Merge pull request #10691 from alalek:parallel_for_2018 2018-01-30 14:13:29 +00:00
Ali Sentas
4d80419f29 Fix cv::CommandLineParser::check() documentation 2018-01-29 15:14:21 +03:00
Alexander Alekhin
c8930cc279 opencv_version: dump detected HW features 2018-01-27 17:08:29 +00:00
Alexander Alekhin
01f4a173ab opencv_version: dump OpenCL information via opencv_version
fix missing "opencv2/core/opencl" headers from core module (updated install list)
2018-01-27 17:08:28 +00:00
Alexander Alekhin
c49d5d5252 core: fix pthreads performance
OpenCV pthreads-based implementation changes:
- rework worker threads pool, allow to execute job by the main thread too
- rework synchronization scheme (wait for job completion, threads 'pong' answer is not required)
- allow "active wait" (spin) by worker threads and by the main thread
- use _mm_pause() during active wait (support for Hyper-Threading technology)
- use sched_yield() to avoid preemption of still working other workers
- don't use getTickCount()
- optional builtin thread pool profiler (disabled by compilation flag)
2018-01-26 04:09:11 +00:00
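The "active wait" described in the commit above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual OpenCV thread pool: the helper name `waitForCompletion`, the macro `SKETCH_CPU_PAUSE` and the spin-count parameter are hypothetical, and `std::this_thread::yield()` stands in for `sched_yield()`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// _mm_pause() is only available on x86 with SSE2; elsewhere fall back to a no-op.
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_CPU_PAUSE() _mm_pause()
#else
#define SKETCH_CPU_PAUSE() ((void)0)
#endif

// Hypothetical helper: poll `done` for up to `activeSpins` iterations
// (active wait), then fall back to yielding the CPU until it is set.
// Returns true if the flag was observed during the active-wait phase.
inline bool waitForCompletion(const std::atomic<bool>& done, int activeSpins)
{
    assert(activeSpins >= 0);
    for (int i = 0; i < activeSpins; ++i)
    {
        if (done.load(std::memory_order_acquire))
            return true;            // job finished while spinning
        SKETCH_CPU_PAUSE();         // hint to the core that this is a spin loop
    }
    while (!done.load(std::memory_order_acquire))
        std::this_thread::yield();  // portable stand-in for sched_yield()
    return false;                   // had to fall back to yielding
}
```

Spinning briefly avoids the latency of putting a thread to sleep and waking it for short jobs, while the yield phase keeps a waiting thread from starving workers that are still running.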
Vadim Pisarevsky
a1d2258ac3 Merge pull request #10635 from csukuangfj:doc-checkVector 2018-01-23 10:42:05 +00:00
Fangjun Kuang
eb2901bd69 Improve the doc for fundamental matrix. 2018-01-19 13:41:47 +01:00
Fangjun Kuang
8efe7bafaa Improve the doc for cv::Mat::checkVector. 2018-01-18 16:48:59 +01:00
Alexander Alekhin
0def2dbb73 Merge pull request #10605 from alalek:ocl_fix_deadlock 2018-01-18 13:39:36 +00:00
Alexander Alekhin
f056e713c3 Merge pull request #10512 from sturkmen72:update_documentation 2018-01-18 04:44:59 +00:00
csukuangfj
decf6cab5e Improve the documentation for cv::completeSymm and cv::RANSACUpdateNumIters. 2018-01-17 08:05:39 +01:00
Alexander Alekhin
4dc788ff84 Merge pull request #10606 from alalek:update_copyright_2018 2018-01-16 17:47:30 +00:00
Alexander Alekhin
cec700525c core(ocl): fix deadlock in UMatDataAutoLock
UMatData locks are not mapped on real locks (they are mapped to some "pre-initialized" pool).

Concurrent execution of these statements may lead to deadlock:
- a.copyTo(b) from thread 1
- c.copyTo(d) from thread 2
where:
- 'a' and 'd' are mapped to single lock "A".
- 'b' and 'c' are mapped to single lock "B".

Workaround is to process locks with strict order.
2018-01-16 17:33:06 +03:00
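The "strict order" workaround from the commit above can be illustrated with a small RAII guard. This is a sketch under assumptions: the class name `OrderedPairLock` is hypothetical, and ordering by mutex address is one possible strict order, not necessarily the one used in UMatDataAutoLock.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical guard: always acquires two pool locks in the same global
// order (here: by address), so two threads can never hold them crosswise.
class OrderedPairLock
{
public:
    OrderedPairLock(std::mutex& m1, std::mutex& m2)
        : first_(&m1), second_(&m2)
    {
        if (first_ == second_)
            second_ = nullptr;                       // both objects map to one lock
        else if (std::less<std::mutex*>()(second_, first_))
            std::swap(first_, second_);              // enforce the strict order
        first_->lock();
        if (second_)
            second_->lock();
    }
    ~OrderedPairLock()
    {
        if (second_)
            second_->unlock();
        first_->unlock();
    }
    OrderedPairLock(const OrderedPairLock&) = delete;
    OrderedPairLock& operator=(const OrderedPairLock&) = delete;

    const std::mutex* first() const { return first_; }
    const std::mutex* second() const { return second_; }

private:
    std::mutex* first_;
    std::mutex* second_;
};
```

With this guard, a.copyTo(b) on thread 1 and c.copyTo(d) on thread 2 would both take lock "A" before lock "B" (or both the reverse), whichever pair of objects they start from.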
Maksim Shabunin
8b87c4b96a Fixed several warnings produced by clang 6 and static analyzers 2018-01-16 15:26:28 +03:00
Alexander Alekhin
e6ed853905 copyright: 2018 2018-01-16 13:55:42 +03:00
Suleyman TURKMEN
dcd4f8f5db Update documentation 2018-01-12 22:21:14 +03:00
Fangjun Kuang
a2869109f0 Improve the documentation for cv::Affine3. 2018-01-05 19:35:38 +01:00
Alexander Alekhin
7d67d60fb1 cmake(opt): AVX512_SKX 2017-12-29 07:18:11 +00:00
Alexander Alekhin
898ca38257 cmake: AVX512 -> AVX_512F 2017-12-28 15:20:27 +00:00
Arjan van de Ven
fc8e848a54 Add basic plumbing for AVX512 support
The opencv infrastructure mostly has the basics for supporting avx512 math functions,
but it wasn't hooked up (likely due to lack of users)

In order to compile the DNN functions for AVX512, a few things need to be hooked up
and this patch does that

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2017-12-25 21:06:52 +00:00
Alexander Alekhin
047764f476 Merge tag '3.4.0' 2017-12-22 23:22:50 +00:00
Alexander Alekhin
6d4f66472e OpenCV version++
3.4.0
2017-12-22 19:46:21 +00:00
Alexander Alekhin
b450811e4b core(logger): add log level configuration option 2017-12-19 22:57:07 +00:00
Alexander Alekhin
cac4a7e5b5 OpenCV version++
OpenCV 3.4.0-rc
2017-12-16 01:30:43 +03:00
Sayed Adel
1b8acd662f core:ppc Fix several issues for VSX (#10303)
- fix conversion intrinsics compatibility with xlc
- implement odd-elements 2 to 4 conversion intrinsics
- improve implementation of universal intrinsic v_popcount
- rename FORCE_INLINE to VSX_FINLINE in vsx_utils.hpp
2017-12-15 14:03:46 +03:00
Alexander Alekhin
825b14278e core: fix persistence with deprecated traits 2017-12-12 17:07:36 +03:00
Alexander Alekhin
e49febb70f Merge pull request #10269 from terfendail:softdouble_round 2017-12-11 12:48:03 +00:00
Vadim Pisarevsky
9fa505027a Merge pull request #10263 from mshabunin:embedded-build 2017-12-11 12:42:45 +00:00
Vadim Pisarevsky
558b17dede Merge pull request #10231 from alalek:ocl_refactor_program_api 2017-12-11 12:34:22 +00:00
Vitaly Tuzov
86b128dbb3 Added implementation of softdouble rounding to int64_t 2017-12-11 14:29:32 +03:00
Maksim Shabunin
7349b8f5ce Build for embedded systems 2017-12-11 13:27:37 +03:00
Pavel Rojtberg
6fb9d42c3f Hid symbols in static builds, added LTO flags, removed exports from ts 2017-12-07 10:26:48 +03:00
Alexander Alekhin
a82d2363f4 ocl: refactor Program API
- don't store ProgramSource in compiled Programs (resolved problem with "source" buffers lifetime)
- completely remove Program::read/write methods implementation:
  - replaced with method to query RAW OpenCL binary without any "custom" data
- deprecate Program::getPrefix() methods
2017-12-05 22:25:14 +03:00
Alexander Alekhin
13c4a02157 ocl: low-level API to support OpenCL binary programs 2017-12-05 22:25:14 +03:00
Alexander Alekhin
0105518422 Merge pull request #10190 from seiko2plus:issue10189 2017-11-30 07:16:12 +00:00
Vadim Pisarevsky
f5dba12762 Merge pull request #10180 from alalek:ocl_avoid_unnecessary_initialization 2017-11-29 11:42:22 +00:00
Vadim Pisarevsky
614e254331 Merge pull request #10170 from LaurentBerger:Issue10166 2017-11-29 09:51:20 +00:00
Sayed Adel
6fe6436162 core:ppc Fixed compilation with xlc, clang.
- Use EXPECT_TRUE instead of EXPECT_EQ for comparing NULL in xlc
- Added support for int64 to vec_promote in xlc, clang
- Fixed v_rotate_right in xlc
2017-11-29 07:48:26 +00:00
Vadim Pisarevsky
2a8344f75b Merge pull request #10149 from mshabunin:fix-saturate-intrin 2017-11-28 13:17:10 +00:00
Alexander Alekhin
0ed3209b00 ocl: avoid unnecessary loading/initializing OpenCL subsystem
If there are no OpenCL/UMat methods calls from application.

OpenCL subsystem is initialized:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called

Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
2017-11-28 14:02:42 +03:00
Alexander Alekhin
abad8977b6 Merge pull request #9247 from paroj:wrap_alog_rw 2017-11-28 10:35:33 +00:00
Maksim Shabunin
6c135261b2 Universal Intrinsics: aligned v_pack behavior on different platforms, fixed 64-bit register on ARM, added more saturate_cast variants 2017-11-28 13:31:56 +03:00
Pavel Rojtberg
6fbf0758bc Python: wrap Algorithm::read and Algorithm::write 2017-11-27 17:04:56 +01:00
LaurentBerger
606a5fd537 Try to solve issue 10166 2017-11-27 13:13:05 +01:00
Alexander Alekhin
b6abf0d3f9 ocl: drop obsolete cache directories after upgrade of OpenCL driver
Entries with the same platform name, the same device name and with different driver versions
are assumed obsolete.
2017-11-24 17:02:28 +03:00
Alexander Alekhin
e5d1790b7b Merge pull request #10018 from alalek:ocl_binary_cache 2017-11-23 13:37:32 +00:00
Alexander Alekhin
e4aa2ccd66 Merge pull request #10136 from alalek:issue_10134 2017-11-22 18:39:47 +00:00
Alexander Alekhin
e7d62d6ef3 Merge pull request #10126 from alalek:dnn_issue_10125 2017-11-22 18:03:51 +00:00
Alexander Alekhin
3f37be5a30 core: fix compilation of intrinsic code 2017-11-22 17:28:50 +03:00
Alexander Alekhin
9db5cbf9a4 dnn: sync output/internals blobs back 2017-11-22 14:00:58 +03:00
Alexander Alekhin
8e6280fc8e ocl: binary program cache 2017-11-22 12:56:38 +03:00
Maksim Shabunin
e57f22a386 Fixed allocSingleton 2017-11-21 18:07:30 +03:00
Maksim Shabunin
12662e064b align singleton malloc 2017-11-21 17:55:23 +03:00
Maksim Shabunin
e75056a084 static init 2017-11-21 17:55:23 +03:00
Tomoaki Teshima
3cbe60cca2 Merge pull request #9753 from tomoaki0705:universalMatmul
* add accuracy test and performance check for matmul
  * add performance tests for transform and dotProduct
  * add test Core_TransformLargeTest for 8u version of transform

* remove raw SSE2/NEON implementation from matmul.cpp
  * use universal intrinsic instead of raw intrinsic
  * remove unused templated function
  * add v_matmuladd which multiplies a 3x3 matrix and adds a 3x1 vector
  * add v_rotate_left/right in universal intrinsic
  * suppress intrinsic on some function and platform
  * add pure SW implementation of new universal intrinsics
  * add test for new universal intrinsics

* core: prevent memory access after the end of buffer

* fix perf tests
2017-11-20 15:56:53 +03:00
Alexander Alekhin
017a38a54e Merge pull request #10108 from mshabunin:fix-eigen-stride 2017-11-17 20:09:08 +00:00
Alexander Alekhin
b45403ed75 Merge pull request #10102 from seiko2plus:coreVsxPacksFix 2017-11-17 19:01:38 +00:00
Maksim Shabunin
f50ec229de Eigen: fix Mat construction stride 2017-11-17 18:27:09 +03:00
Maksim Shabunin
eb136ebba6 Do not reset step for single-row Mat created on user data 2017-11-17 13:15:15 +03:00
Sayed Adel
56bda8917d core:vsx Fix vec_packs in gcc-5 2017-11-16 21:54:56 +00:00
Maksim Shabunin
e730048f69 Merge pull request #10078 from justdoitqd:master 2017-11-16 15:20:44 +00:00
Maksim Shabunin
751cee8e67 Merge pull request #9907 from seiko2plus:vsxFixesImproves 2017-11-16 15:20:16 +00:00
Alexander Alekhin
3a0039d204 core(intrinsics): v_load_low 2017-11-15 16:04:18 +03:00
Simon Guo
2610a47c89 core:ppc Fix 2 interleave logic errors in vsx_utils.hpp
When elements are 64 bits, the vec_st_interleave()/vec_ld_deinterleave()
doesn't interleave 4 elements correctly.

For vec_st_interleave(), following is saved into mem:
	a0 b0 a1 b1 c0 d0 c1 d1
     -> we expected:
	a0 b0 c0 d0 a1 b1 c1 d1

for vec_ld_deinterleave(), following is loaded into a b c d for memory
string { 1 2 3 4 5 6 7 8 }:
	a: 1 3
	b: 2 4
	c: 5 7
	d: 6 8
   -> we expected:
   	a: 1 5
	b: 2 6
	c: 3 7
	d: 4 8

This patch corrects this behavior.

Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
2017-11-13 12:47:10 +08:00
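The expected layout from the commit above can be written as a scalar reference. This is an illustrative sketch only, with no VSX intrinsics: the names `Lanes4x2`, `deinterleave4` and `interleave4` are hypothetical, and each "vector" is modeled as two 64-bit lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Four vectors (a, b, c, d) of two 64-bit lanes each.
struct Lanes4x2
{
    std::array<std::int64_t, 2> a, b, c, d;
};

// Memory holds tuples (a_i, b_i, c_i, d_i); lane i of each vector
// receives the matching element of tuple i.
inline Lanes4x2 deinterleave4(const std::array<std::int64_t, 8>& mem)
{
    Lanes4x2 r{};
    for (std::size_t i = 0; i < 2; ++i)
    {
        r.a[i] = mem[4 * i + 0];
        r.b[i] = mem[4 * i + 1];
        r.c[i] = mem[4 * i + 2];
        r.d[i] = mem[4 * i + 3];
    }
    return r;
}

// Inverse operation: stores a0 b0 c0 d0 a1 b1 c1 d1.
inline std::array<std::int64_t, 8> interleave4(const Lanes4x2& v)
{
    std::array<std::int64_t, 8> mem{};
    for (std::size_t i = 0; i < 2; ++i)
    {
        mem[4 * i + 0] = v.a[i];
        mem[4 * i + 1] = v.b[i];
        mem[4 * i + 2] = v.c[i];
        mem[4 * i + 3] = v.d[i];
    }
    return mem;
}
```

For the memory string { 1 2 3 4 5 6 7 8 } this reference yields a: 1 5, b: 2 6, c: 3 7, d: 4 8, which is the behavior the commit message describes as expected.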
Pavel Rojtberg
7190028b23 FileStorage: use copydoc to add base64 info to constructor 2017-11-10 13:06:36 +01:00
Suleyman TURKMEN
63fb79b519 updates documentation and samples 2017-11-07 19:21:21 +03:00
Sayed Adel
def444d99f core: Several improvements to Power/VSX
- changed behavior of vec_ctf, vec_ctu, vec_cts
  in gcc and clang to make them compatible with XLC
- implemented most of missing conversion intrinsics in gcc and clang
- implemented conversions intrinsics of odd-numbered elements
- ignored gcc bug warning that is caused by -Wunused-but-set-variable in rare cases
- replaced right shift with algebraic right shift for signed vectors
  to shift in the sign bit.
- added new universal intrinsics v_matmuladd, v_rotate_left/right
- avoid using floating multiply-add in RNG
2017-10-28 17:46:12 +00:00
Alexander Alekhin
21c8e6d02d Merge tag '3.3.1' 2017-10-23 18:42:41 +03:00
Alexander Alekhin
a871f9e4f7 Merge branch 'update_version' into release 2017-10-23 18:41:12 +03:00
Alexander Alekhin
185faf99bd ocl: simplify ocl::Timer interface 2017-10-18 16:01:21 +03:00
Vadim Pisarevsky
2808bea7fa Merge pull request #9857 from americast:mat_fix 2017-10-16 10:43:37 +00:00
Gregory Morse
d30a0c6f03 Merge pull request #9856 from GregoryMorse:patch-1
* Update OpenCVCompilerOptimizations.cmake

Neon not supported on MSVC ARM breaking build fix

* Update OpenCVCompilerOptimizations.cmake

Whitespace

* Update intrin.hpp

Many problems in MSVC ARM builds (at least on VS2017) being fixed in this PR now.

C:\Users\Gregory\DOCUME~1\MYLIBR~1\OPENCV~3\opencv\sources\modules\core\include\opencv2/core/hal/intrin.hpp(444): error C3861: '_tzcnt_u32': identifier not found

* Update hal_replacement.hpp

Passing variadic expansion in a macro to another macro does not work properly in MSVC and a famous known workaround is hereby applied.  Discussion of it: https://stackoverflow.com/questions/5134523/msvc-doesnt-expand-va-args-correctly
Only needed the fix for ARM builds: TEGRA_ macros are used for cv_hal_ functions in the carotene library.

C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): warning C4003: not enough actual parameters for macro 'TEGRA_ADD'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2143: syntax error: missing ')' before ','
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2059: syntax error: ')'

* Update hal_replacement.hpp

All hal_replacement's using carotene\hal\tegra_hal.hpp TEGRA_ functions as macros preprocessed by variadic macros should be changed, identical as was done in core.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): warning C4003: not enough actual parameters for macro 'TEGRA_CVTBGRTOBGR'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): error C2059: syntax error: '=='

* Update OpenCVCompilerOptimizations.cmake

* Update hal_replacement.hpp

* Update hal_replacement.hpp
2017-10-16 12:12:35 +03:00
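The variadic-macro workaround mentioned in the commit above is commonly written with an extra expansion step. The following is a minimal sketch of that general technique; the macro names are hypothetical and are not the TEGRA_ or cv_hal_ macros from the actual patch.

```cpp
#include <cassert>

// Extra expansion step: forces the preprocessor to rescan the argument list.
#define SKETCH_EXPAND(x) x

// A fixed-arity macro that we want to call through a variadic wrapper.
#define SKETCH_ADD3(a, b, c) ((a) + (b) + (c))

// Forwarding __VA_ARGS__ directly, i.e. SKETCH_ADD3(__VA_ARGS__), is what
// MSVC's traditional preprocessor mishandles: it passes the whole list as
// a single argument ("not enough actual parameters for macro").
// Wrapping the call in SKETCH_EXPAND makes the list split as intended.
#define SKETCH_CALL_ADD3(...) SKETCH_EXPAND(SKETCH_ADD3(__VA_ARGS__))

inline int sum3()
{
    return SKETCH_CALL_ADD3(1, 2, 3);
}
```

Compilers with a conforming preprocessor accept both forms, so the wrapper is harmless outside MSVC.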
Sayan Sinha
60bcb16ca8 Fix typo in mat.hpp 2017-10-14 21:46:11 +05:30
Alexander Alekhin
88225eb65e ocl: fix world compilation on Windows 2017-10-11 19:04:42 +03:00
Alexander Alekhin
024be9b8c9 Merge pull request #9818 from tz70s:issue#9570 2017-10-11 15:19:17 +00:00
tz70s
6c1247b38c fix#9570: implement mat ptr for generic types
The original template based mat ptr for indexing is not implemented,
add the similar implementation as uchar type, but cast to
user-defined type from the uchar pointer.
2017-10-10 21:46:49 +08:00
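The idea in the commit above, a typed row pointer obtained by casting the uchar row pointer, can be sketched on a toy matrix type. `TinyMat` and `Pixel2f` are hypothetical stand-ins used only for illustration; they are not cv::Mat or an OpenCV type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char uchar;

// Hypothetical user-defined element type.
struct Pixel2f
{
    float x, y;
};

// Toy row-major matrix: raw byte storage plus a row stride in bytes.
struct TinyMat
{
    std::vector<uchar> storage;
    std::size_t step;  // bytes per row

    TinyMat(int rows, std::size_t rowBytes)
        : storage(static_cast<std::size_t>(rows) * rowBytes), step(rowBytes)
    {
    }

    // Byte pointer to the start of a row.
    uchar* ptr(int row)
    {
        return storage.data() + static_cast<std::size_t>(row) * step;
    }

    // Typed pointer: same address, reinterpreted as the user's element type.
    template <typename T>
    T* ptr(int row)
    {
        return reinterpret_cast<T*>(ptr(row));
    }
};
```

The typed overload adds no arithmetic of its own, so `m.ptr<Pixel2f>(r)` and `m.ptr(r)` refer to the same address; the caller is responsible for choosing a row stride that suits the element type.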
Vadim Pisarevsky
0739f28e56 Merge pull request #9786 from LaurentBerger:Histo3d 2017-10-10 10:58:34 +00:00
Vadim Pisarevsky
7d55c09a9f Merge pull request #9763 from seiko2plus:addVsxCore 2017-10-10 10:00:31 +00:00
Alexander Alekhin
bd6fb497bc OpenCV version++
OpenCV 3.3.1
2017-10-10 12:29:57 +03:00
LaurentBerger
752f232335 It's done 2017-10-09 22:25:57 +02:00
Sayed Adel
4b968d1fe2 Added universal intrinsic for VSX 2017-10-09 00:32:41 +00:00
Sayed Adel
d077778074 Added support for VSX 2017-10-09 00:32:29 +00:00
Alexander Alekhin
6be25727ec ocl: refactor program compilation 2017-10-08 19:55:01 +03:00
Jasper Shemilt
0136711cf4 Adds fitEllipseAMS to imgproc: The Approximate Mean Square (AMS) proposed by Taubin 1991.
Adds fitEllipseDirect to imgproc: The Direct least square (Direct) method by Fitzgibbon1999.

New Tests are included for the methods.
fitEllipseAMS Tests
fitEllipseDirect Tests

Comparative examples are added to fitEllipse.cpp in Samples.
2017-10-02 16:38:41 +01:00
pengli
e340ff9c3a Merge pull request #9114 from pengli:dnn_rebase
add libdnn acceleration to dnn module  (#9114)

* import libdnn code

Signed-off-by: Li Peng <peng.li@intel.com>

* add convolution layer ocl acceleration

Signed-off-by: Li Peng <peng.li@intel.com>

* add pooling layer ocl acceleration

Signed-off-by: Li Peng <peng.li@intel.com>

* add softmax layer ocl acceleration

Signed-off-by: Li Peng <peng.li@intel.com>

* add lrn layer ocl acceleration

Signed-off-by: Li Peng <peng.li@intel.com>

* add innerproduct layer ocl acceleration

Signed-off-by: Li Peng <peng.li@intel.com>

* add HAVE_OPENCL macro

Signed-off-by: Li Peng <peng.li@intel.com>

* fix for convolution ocl

Signed-off-by: Li Peng <peng.li@intel.com>

* enable getUMat() for multi-dimension Mat

Signed-off-by: Li Peng <peng.li@intel.com>

* use getUMat for ocl acceleration

Signed-off-by: Li Peng <peng.li@intel.com>

* use CV_OCL_RUN macro

Signed-off-by: Li Peng <peng.li@intel.com>

* set OPENCL target when it is available

and disable fuseLayer for OCL target for the time being

Signed-off-by: Li Peng <peng.li@intel.com>

* fix innerproduct accuracy test

Signed-off-by: Li Peng <peng.li@intel.com>

* remove trailing space

Signed-off-by: Li Peng <peng.li@intel.com>

* Fixed tensorflow demo bug.

Root cause is that tensorflow uses a different algorithm than libdnn
to calculate the convolution output dimension.

libdnn doesn't calculate the output dimension anymore and just uses the one
passed in by config.

* split gemm ocl file

split it into gemm_buffer.cl and gemm_image.cl

Signed-off-by: Li Peng <peng.li@intel.com>

* Fix compile failure

Signed-off-by: Li Peng <peng.li@intel.com>

* check env flag for auto tuning

Signed-off-by: Li Peng <peng.li@intel.com>

* switch to new ocl kernels for softmax layer

Signed-off-by: Li Peng <peng.li@intel.com>

* update softmax layer

on some platforms the subgroup extension may not work well,
so fall back to non-subgroup ocl acceleration.

Signed-off-by: Li Peng <peng.li@intel.com>

* fallback to cpu path for fc layer with multi output

Signed-off-by: Li Peng <peng.li@intel.com>

* update output message

Signed-off-by: Li Peng <peng.li@intel.com>

* update fully connected layer

fallback to gemm API if libdnn return false

Signed-off-by: Li Peng <peng.li@intel.com>

* Add ReLU OCL implementation

* disable layer fusion for now

Signed-off-by: Li Peng <peng.li@intel.com>

* Add OCL implementation for concat layer

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* libdnn: update license and copyrights

Also refine libdnn coding style

Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>

* DNN: Don't link OpenCL library explicitly

* DNN: Make default preferableTarget to DNN_TARGET_CPU

User should set it to DNN_TARGET_OPENCL explicitly if they want to
use OpenCL acceleration.

Also don't fuse layers when using DNN_TARGET_OPENCL

* DNN: refine coding style

* Add getOpenCLErrorString

* DNN: Use int32_t/uint32_t instead of alias

* Use namespace ocl4dnn to include libdnn things

* remove extra copyTo in softmax ocl path

Signed-off-by: Li Peng <peng.li@intel.com>

* update ReLU layer ocl path

Signed-off-by: Li Peng <peng.li@intel.com>

* Add prefer target property for layer class

It is used to indicate the target for layer forwarding,
either the default CPU target or OCL target.

Signed-off-by: Li Peng <peng.li@intel.com>

* Add cl_event based timer for cv::ocl

* Rename libdnn to ocl4dnn

Signed-off-by: Li Peng <peng.li@intel.com>
Signed-off-by: wzw <zhiwen.wu@intel.com>

* use UMat for ocl4dnn internal buffer

Remove allocateMemory which use clCreateBuffer directly

Signed-off-by: Li Peng <peng.li@intel.com>
Signed-off-by: wzw <zhiwen.wu@intel.com>

* enable buffer gemm in ocl4dnn innerproduct

Signed-off-by: Li Peng <peng.li@intel.com>

* replace int_tp globally for ocl4dnn kernels.

Signed-off-by: wzw <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>

* create UMat for layer params

Signed-off-by: Li Peng <peng.li@intel.com>

* update sign ocl kernel

Signed-off-by: Li Peng <peng.li@intel.com>

* update image based gemm of inner product layer

Signed-off-by: Li Peng <peng.li@intel.com>

* remove buffer gemm of inner product layer

call cv::gemm API instead

Signed-off-by: Li Peng <peng.li@intel.com>

* change ocl4dnn forward parameter to UMat

Signed-off-by: Li Peng <peng.li@intel.com>

* Refine auto-tuning mechanism.

- Use OPENCV_OCL4DNN_KERNEL_CONFIG_PATH to set cache directory
  for fine-tuned kernel configuration.
  e.g. export OPENCV_OCL4DNN_KERNEL_CONFIG_PATH=/home/tmp,
  the cache directory will be /home/tmp/spatialkernels/ on Linux.

- Define environment OPENCV_OCL4DNN_ENABLE_AUTO_TUNING to enable
  auto-tuning.

- OPENCV_OPENCL_ENABLE_PROFILING is only used to enable profiling
  for OpenCL command queue. This fixes the basic kernel reporting a wrong
  running time, i.e. 0ms.

- If creating cache directory failed, disable auto-tuning.

* Detect and create cache dir on windows

Signed-off-by: Li Peng <peng.li@intel.com>

* Refine gemm like convolution kernel.

Signed-off-by: Li Peng <peng.li@intel.com>

* Fix redundant swizzleWeights calling when use cached kernel config.

* Fix "out of resource" bug when auto-tuning too many kernels.

* replace cl_mem with UMat in ocl4dnnConvSpatial class

* OCL4DNN: reduce the tuning kernel candidate.

This patch could reduce 75% of the tuning candidates with less
than 2% performance impact for the final result.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>

* replace cl_mem with umat in ocl4dnn convolution

Signed-off-by: Li Peng <peng.li@intel.com>

* remove weight_image_ of ocl4dnn inner product

Actually it is unused in the computation

Signed-off-by: Li Peng <peng.li@intel.com>

* Various fixes for ocl4dnn

1. OCL_PERFORMANCE_CHECK(ocl::Device::getDefault().isIntel())
2. Ptr<OCL4DNNInnerProduct<float> > innerProductOp
3. Code comments cleanup
4. ignore check on OCL cpu device

Signed-off-by: Li Peng <peng.li@intel.com>

* add build option for log softmax

Signed-off-by: Li Peng <peng.li@intel.com>

* remove unused ocl kernels in ocl4dnn

Signed-off-by: Li Peng <peng.li@intel.com>

* replace ocl4dnnSet with opencv setTo

Signed-off-by: Li Peng <peng.li@intel.com>

* replace ALIGN with cv::alignSize

Signed-off-by: Li Peng <peng.li@intel.com>

* check kernel build options

Signed-off-by: Li Peng <peng.li@intel.com>

* Handle program compilation fail properly.

* Use std::numeric_limits<float>::infinity() for large float number

* check ocl4dnn kernel compilation result

Signed-off-by: Li Peng <peng.li@intel.com>

* remove unused ctx_id

Signed-off-by: Li Peng <peng.li@intel.com>

* change clEnqueueNDRangeKernel to kernel.run()

Signed-off-by: Li Peng <peng.li@intel.com>

* change cl_mem to UMat in image based gemm

Signed-off-by: Li Peng <peng.li@intel.com>

* check intel subgroup support for lrn and pooling layer

Signed-off-by: Li Peng <peng.li@intel.com>

* Fix convolution bug if group is greater than 1

Signed-off-by: Li Peng <peng.li@intel.com>

* Set default layer preferableTarget to be DNN_TARGET_CPU

Signed-off-by: Li Peng <peng.li@intel.com>

* Add ocl perf test for convolution

Signed-off-by: Li Peng <peng.li@intel.com>

* Add more ocl accuracy test

Signed-off-by: Li Peng <peng.li@intel.com>

* replace cl_image with ocl::Image2D

Signed-off-by: Li Peng <peng.li@intel.com>

* Fix build failure in elementwise layer

Signed-off-by: Li Peng <peng.li@intel.com>

* use getUMat() to get blob data

Signed-off-by: Li Peng <peng.li@intel.com>

* replace cl_mem handle with ocl::KernelArg

Signed-off-by: Li Peng <peng.li@intel.com>

* dnn(build): don't use C++11, OPENCL_LIBRARIES fix

* dnn(ocl4dnn): remove unused OpenCL kernels

* dnn(ocl4dnn): extract OpenCL code into .cl files

* dnn(ocl4dnn): refine auto-tuning

Disable auto-tuning by default; set the OPENCV_OCL4DNN_ENABLE_AUTO_TUNING
environment variable to enable it.

Use a set of pre-tuned configs as default config if auto-tuning is disabled.
These configs are tuned for Intel GPU with 48/72 EUs, and for googlenet,
AlexNet, ResNet-50

If default config is not suitable, use the first available kernel config
from the candidates. Candidate priority from high to low is gemm like kernel,
IDLF kernel, basic kernel.

* dnn(ocl4dnn): pooling doesn't use OpenCL subgroups

* dnn(ocl4dnn): fix perf test

OpenCV has default 3sec time limit for each performance test.
Warmup OpenCL backend outside of perf measurement loop.

* use ocl::KernelArg as much as possible

Signed-off-by: Li Peng <peng.li@intel.com>

* dnn(ocl4dnn): fix bias bug for gemm like kernel

* dnn(ocl4dnn): wrap cl_mem into UMat

Signed-off-by: Li Peng <peng.li@intel.com>

* dnn(ocl4dnn): Refine signature of kernel config

- Use more readable string as signature of kernel config
- Don't count device name and vendor in signature string
- Default kernel configurations are tuned for Intel GPU with
  24/48/72 EUs, and for googlenet, AlexNet, ResNet-50 net model.

* dnn(ocl4dnn): swap width/height in configuration

* dnn(ocl4dnn): enable configs for Intel OpenCL runtime only

* core: make configuration helper functions accessible from non-core modules

* dnn(ocl4dnn): update kernel auto-tuning behavior

Avoid unwanted creation of directories

* dnn(ocl4dnn): simplify kernel to workaround OpenCL compiler crash

* dnn(ocl4dnn): remove redundant code

* dnn(ocl4dnn): Add clearer message for SIMD size mismatch.

* dnn(ocl4dnn): add const to const argument

Signed-off-by: Li Peng <peng.li@intel.com>

* dnn(ocl4dnn): force compiler use a specific SIMD size for IDLF kernel

* dnn(ocl4dnn): drop unused tuneLocalSize()

* dnn(ocl4dnn): specify OpenCL queue for Timer and convolve() method

* dnn(ocl4dnn): sanitize file names used for cache

* dnn(perf): enable Network tests with OpenCL

* dnn(ocl4dnn/conv): drop computeGlobalSize()

* dnn(ocl4dnn/conv): drop unused fields

* dnn(ocl4dnn/conv): simplify ctor

* dnn(ocl4dnn/conv): refactor kernelConfig localSize=NULL

* dnn(ocl4dnn/conv): drop unsupported double / untested half types

* dnn(ocl4dnn/conv): drop unused variable

* dnn(ocl4dnn/conv): alignSize/divUp

* dnn(ocl4dnn/conv): use enum values

* dnn(ocl4dnn): drop unused innerproduct variable

Signed-off-by: Li Peng <peng.li@intel.com>

* dnn(ocl4dnn): add a generic function to check cl option support

* dnn(ocl4dnn): run softmax subgroup version kernel first

Signed-off-by: Li Peng <peng.li@intel.com>
2017-10-02 15:38:00 +03:00
Alexander Alekhin
529632f8d0 core: cv::eigenNonSymmetric() via EigenvalueDecomposition 2017-10-01 07:45:32 +00:00
Alexander Alekhin
1283d62e49 ocl: Kernel::runProfiling() 2017-09-19 15:34:35 +03:00
Alexander Alekhin
d9ab31490c ocl: profiling queue 2017-09-19 15:32:15 +03:00
Vadim Pisarevsky
f4136679ea Merge pull request #9551 from ChristofKaufmann:MultiChannelMask 2017-09-18 09:28:34 +00:00
Christof Kaufmann
7ec59fc097 Revert changes of mean and meanStdDev 2017-09-17 21:00:28 +02:00
Alexander Alekhin
9dea296241 Merge pull request #9458 from csukuangfj:fix-doc 2017-09-15 19:35:01 +00:00
Vadim Pisarevsky
5707304777 Merge pull request #9633 from saskatchewancatch:psnr-doc 2017-09-15 12:26:34 +00:00
saskatchewancatch
a90a93b454 i9629 - Added actual documentation for cv::PSNR function 2017-09-14 18:20:00 -06:00
Vadim Pisarevsky
32bb71d686 Merge pull request #9603 from alalek:ocl_device_extensions 2017-09-13 14:43:36 +00:00
Vadim Pisarevsky
f1aa180a40 Merge pull request #9574 from saskatchewancatch:i9482 2017-09-13 13:59:43 +00:00
saskatchewancatch
c9d3c0f206 More whitespace fixes 2017-09-10 19:03:56 -06:00
Alexander Alekhin
169add5aa6 ocl: added cv::ocl::Device::isExtensionSupported() method 2017-09-10 20:32:30 +00:00
Alexander Alekhin
26ad229bd6 Merge pull request #9590 from alalek:ocl_runtime_fix 2017-09-09 19:30:07 +00:00
saskatchewancatch
30ff197f78 Fix whitespace issues 2017-09-08 20:27:03 -06:00
saskatchewancatch
8877e3aedb Adjustments 2017-09-08 19:55:19 -06:00
Alexander Alekhin
576a85368d Merge pull request #9582 from sovrasov:fix_gcc7_macro_warning 2017-09-08 13:31:27 +00:00
Alexander Alekhin
44b75eb116 core: fix OpenCL runtime compilation with HAVE_OPENCL_STATIC 2017-09-08 12:43:33 +03:00
Pavel Vlasov
37ab318657 Compatibility improvement with old IPP versions (tested on 9.0.1);
Manual IPP dispatcher simplification;
2017-09-08 11:08:24 +03:00
saskatchewancatch
fda1e76776 Feedback.
Still need to remove the descriptions of these flags from cv::norm
2017-09-07 22:43:55 -06:00
Vladislav Sovrasov
eeba5c3603 core: fix gcc7 warning about empty VA_ARGS in Assert macro 2017-09-07 16:20:38 +03:00
Vadim Pisarevsky
d25cbaaba8 Merge pull request #9293 from sovrasov:assert_improvement 2017-09-07 11:17:42 +00:00
Maksim Shabunin
c92c99ed0b Enabled forEach for const Mats 2017-09-07 11:35:14 +03:00
saskatchewancatch
570083fb9f i9482:
Removing description for cv::NormTypes that's already in cv::norm

Masking out NORM_TYPE_MASK from docs since it's not intended to be exposed to user
2017-09-06 22:42:01 -06:00
saskatchewancatch
b32685c714 i9482: Improve documentation concerning norm functionality.
Added forkfour Latex command to math js support.

Split cv::norm documentation between the cv::norm and its overload, to make things clearer

Corrected some typos and cleaned up grammar.

Result is clearer documentation for the norms.

Work pending...
2017-09-06 22:35:39 -06:00
Christof Kaufmann
46a668c565 Add multi-channel mask support to mean, meanStdDev and setTo
This adds the possibility to use multi-channel masks for the functions
cv::mean, cv::meanStdDev and the method Mat::setTo. The tests have now a
probability to use multi-channel masks for operations that support them.
This also includes Mat::copyTo, which supported multi-channel masks
before, but there was no test confirming this.
2017-09-04 19:40:27 +02:00
Alexander Alekhin
0451629e22 core(persistence): resolve DMatch/KeyPoint problem 2017-08-31 19:35:48 +03:00
Alexander Alekhin
86b55b3923 core: eliminate CV_ELEM_SIZE() 2017-08-31 19:35:48 +03:00
Alexander Alekhin
7e12c879c2 core: extend traits::Type / traits::Depth for compatible types
DMatch and Keypoint are not compatible types (mixed float/int fields)
2017-08-31 19:35:48 +03:00
Alexander Alekhin
72f789bf34 core: fix type traits 2017-08-31 15:05:46 +03:00
Boris Fomitchev
dde04d5d3e Addressing CUDA9 shfl deprecation 2017-08-29 13:27:21 -07:00
Maarten de Vries
3571e8f263 Restrict std::initializer_list constructors to arithmetic types. 2017-08-29 16:37:20 +02:00
Alexander Alekhin
034aaa7a70 Merge pull request #9465 from tomoaki0705:fixJetsonTK1Build 2017-08-29 11:46:10 +00:00
Tomoaki Teshima
6531fd142c fix build error on Jetson TK1
* guard correctly in header file
  * guard correctly in cmake file
2017-08-29 19:05:13 +09:00
KUANG Fangjun
11fa0094ff Improve the documentation.
Add demo code for cv::reduce, cv::merge and cv::split.
2017-08-28 12:36:23 +02:00
Alexander Alekhin
603fa03ac6 Merge pull request #9441 from wzw-intel:delete_program 2017-08-25 12:03:27 +00:00
Wu Zhiwen
da3da84a20 ocl: Add a function to unload a run-time cached program
This function is the counterpart of "Context::getProg".
With this function, users have a chance to unload a program
from global run-time cached programs, and save resources.
2017-08-25 08:42:11 +08:00
Alexander Alekhin
9c14a2f0aa Merge pull request #9395 from lupustr3:pvlasov/icv2017u3_update 2017-08-24 11:48:53 +00:00
Boris Fomitchev
c48807c383 Merge pull request #9418 from borisfom:cuda9
CUDA9 build fixed, added detection (#9418)

* CUDA9 build fixed, added detection

* Replacing deprecated __shfl_xxx with __shfl_sync, fixing bogus CUDA9 warnings
2017-08-24 07:11:44 +00:00
Alexander Alekhin
a893b147dc Merge pull request #9428 from csukuangfj:fix-commandline-parser 2017-08-23 12:48:28 +00:00
Pavel Vlasov
a57718e1ac ICV2017u3 package update;
- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
2017-08-23 14:24:43 +03:00
KUANG Fangjun
97ec91ad67 fix cv::CommandLineParser.
It should handle bool value not only of "true" but also of "TRUE" and "True".
2017-08-23 11:38:58 +02:00
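The case-insensitive handling described in the commit above might look like the following. This is a sketch under assumptions: `parseBoolOption` is a hypothetical helper, not the cv::CommandLineParser implementation, and treating every other string as false is a choice made here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: lower-case the text, then compare with "true".
// Anything else (including "1" or "yes") is treated as false in this sketch.
inline bool parseBoolOption(std::string value)
{
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value == "true";
}
```

Normalizing the case once before the comparison covers "true", "TRUE" and "True" without listing each spelling separately.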
KUANG Fangjun
336996152a Improve the documentation. 2017-08-20 17:21:39 +02:00
Alexander Alekhin
71e1889825 core: fix memcpy with zero size 2017-08-17 18:30:31 +03:00
Rostislav Vasilikhin
66b0651607 Merge pull request #9329 from savuor:softfloat_sincos
SoftFloat: added sin, cos and docs (#9329)

* softfloat: comparison operators made inline, min() max() eps() isSubnormal() added

* softfloat: get/set sign/exp

* softfloat: get/set frac

* softfloat: tests rewritten with new tools

* softfloat: added pi(), sin(), cos()

* softfloat: more comments

* softfloat: updated sincos arg reduction

* softfloat: initial tests for sincos added

* softfloat: test works, code cleanup is pending

* softfloat: sincos argreduce rewritten

* softfloat: sincos refactored and simplified

* softfloat sincos: epsilons calibrated

* softfloat: junk code removed from tests

* softfloat: docs added

* inline comparisons undone; warning fixed
2017-08-15 09:23:26 +00:00
Alexander Alekhin
803274e207 Merge pull request #9358 from azatsman:master 2017-08-15 09:16:45 +00:00
Alexander Alekhin
a048cb9f0d Merge pull request #9338 from dkurt:fix_ocl 2017-08-14 12:56:07 +00:00
Alex Zatsman
e2bfd1a036 Changed NORM_RELATIVE_INF, NORM_RELATIVE_L1 and NORM_RELATIVE_L2 to
NORM_RELATIVE | NORM_INF, NORM_RELATIVE | NORM_L1 and NORM_RELATIVE | NORM_L2
respectively in the documentation for cv::norm and cv::NormTypes
2017-08-13 11:55:35 -04:00
Alexander Alekhin
fa288af58b Merge pull request #9343 from PhilLab:patch-4 2017-08-10 10:25:36 +00:00
Philipp Hasper
2c7a15b195 Clarified documentation cv::RotatedRect::points 2017-08-10 11:06:40 +02:00
Dmitry Kurtaev
41519d3ac0 Fixed some OpenCL interface bugs 2017-08-09 11:54:55 +03:00
Vladislav Sovrasov
9a10bdbae5 core: use new assert in matmul.cpp 2017-08-08 23:00:11 +03:00
KUANG, Fangjun
4bbe67451d fix some typos in the documentation. 2017-08-08 17:32:04 +02:00
Alexander Alekhin
87c27a074d Merge tag '3.3.0'
OpenCV 3.3.0
2017-08-04 00:00:18 +00:00
Alexander Alekhin
4af3ca4e4d OpenCV version++
OpenCV 3.3.0
2017-08-03 23:58:23 +00:00
Alexander Alekhin
922ac1a1ec Merge pull request #9303 from alalek:akaze_update 2017-08-03 17:17:03 +00:00
Alexander Alekhin
c95a97389d Merge pull request #9235 from sturkmen72:patch-3 2017-08-03 17:04:28 +00:00
Alexander Alekhin
9ca39821c8 core: divUp function 2017-08-03 19:51:45 +03:00
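For readers unfamiliar with the idiom, a "divide and round up" helper is commonly written as below. This is a sketch under the assumption of a non-negative dividend and a positive divisor; `divUpSketch` is a hypothetical name and the code is not copied from the commit.

```cpp
// Hypothetical sketch (not the OpenCV implementation): number of
// chunks of size `grain` needed to cover `total` items.
// Assumes total >= 0 and grain > 0.
inline int divUpSketch(int total, int grain)
{
    // Adding (grain - 1) before dividing rounds the quotient up.
    return (total + grain - 1) / grain;
}
```

For example, 10 items in chunks of 4 need 3 chunks, while 8 items need exactly 2.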
Vladislav Sovrasov
5375a77f84 core: add multi-argument CV_Assert 2017-08-03 15:31:05 +03:00
Alexander Alekhin
321c0ec533 core: empty() for Rect/Size templates
Checking for empty objects via .area() is not good practice due to overflows
2017-08-01 17:19:18 +03:00
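The overflow concern in 321c0ec533 can be sketched as follows, assuming a 32-bit `int`. `SizeSketch`, `exactArea` and `areaFitsInInt` are hypothetical names used for illustration and are not the cv::Size_ template itself.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a size type with int members.
struct SizeSketch
{
    int width;
    int height;

    // Emptiness is decided per dimension, so no multiplication is needed.
    bool empty() const { return width <= 0 || height <= 0; }
};

// Exact area computed in 64 bits, which cannot overflow for int inputs.
inline std::int64_t exactArea(const SizeSketch& s)
{
    return static_cast<std::int64_t>(s.width) * s.height;
}

// True when width * height would still be representable as an int.
inline bool areaFitsInInt(const SizeSketch& s)
{
    return exactArea(s) <= std::numeric_limits<int>::max();
}
```

A 65536 x 65536 size is clearly not empty, yet its area is 2^32, which does not fit in a 32-bit `int`. Signed overflow is undefined behaviour in C++, so a test such as `area() == 0` can give a wrong answer for such sizes, whereas the per-dimension check cannot.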
Alexander Alekhin
37a7e08b38 Merge pull request #9097 from piotr-semenov:fix/cv_rect_monoid_identity 2017-08-01 13:06:37 +00:00
Alexander Alekhin
728bd68977 Merge pull request #9272 from tomoaki0705:fixCudaBuild 2017-07-31 12:02:40 +00:00
Tomoaki Teshima
1c49796e8e guard for CUDA correctly 2017-07-31 18:42:36 +09:00
Suleyman TURKMEN
89480801b8 some improvements on tutorials 2017-07-29 20:08:19 +03:00
Alexander Alekhin
d35422b523 core(tls): hide assertions from Thread Sanitizer 2017-07-27 17:31:51 +03:00
Alexander Alekhin
602f047fe8 build: replace WIN32 => _WIN32 2017-07-25 13:30:48 +03:00
Alexander Alekhin
5bc291937f test: FileStorage format regression test 2017-07-20 19:58:10 +03:00
Tomoaki Teshima
71496e3be4 fix build error on Visual Studio 2012 2017-07-20 22:56:05 +09:00
Alexander Alekhin
ca479c3f5b Merge pull request #9161 from alalek:separate_debug_symbols 2017-07-19 15:34:43 +00:00
Alexander Alekhin
acc8589083 core: clarify documentation of cv::Mat::deallocate() method 2017-07-17 13:31:47 +03:00
Vladislav Sovrasov
e5fbb4f5d2 Merge pull request #9034 from sovrasov:mats_from_initializer_list
Add constructors taking initializer_list for some of OpenCV data types (#9034)

* Add a constructor taking initializer_list for Matx

* Add a constructor taking initializer list for Mat and Mat_

* Add one more method to initialize Mat to the corresponding tutorial

* Add a note how to initialize Matx

* CV_CXX_11->CV_CXX11
2017-07-14 17:17:09 +00:00
Alexander Alekhin
11626fe32c Merge pull request #8975 from sovrasov:fs_additional_errors 2017-07-14 17:13:50 +00:00
Alexander Alekhin
4e39d0371d Merge pull request #9074 from alalek:cpu_dispatch_core_hamming
cpu dispatch(core): hamming
2017-07-14 16:48:07 +00:00
Alexander Alekhin
928bfe0b93 Merge pull request #9088 from sovrasov:no_nostl
core: get rid of OPENCV_NOSTL definition
2017-07-14 16:26:03 +00:00
Alexander Alekhin
e251ed7773 Merge pull request #9122 from ivsgroup:fix_msvc_virtual_destructor 2017-07-14 14:37:22 +00:00
Alexander Alekhin
10e6491c22 Merge pull request #9024 from tomoaki0705:featureDispatchAccumulate 2017-07-14 14:30:06 +00:00
Alexander Alekhin
f448d75aa8 build: added DEBUG build guard
To prevent linkage of binary incompatible DEBUG/RELEASE binaries/runtimes
2017-07-14 01:25:31 +03:00
Alexander Alekhin
9b9e685dbc Merge pull request #9142 from alalek:vzeroupper_guard_unused_warning 2017-07-12 16:44:00 +00:00
Alexander Alekhin
5ebfb52a4a ipp(minmaxIdx): disable SSE4.2 optimizations for 32f datatype
NaN values handling issue
2017-07-12 16:06:18 +03:00
PkLab.net
6dd9e18b2e add std::string overload for cv::read() 2017-07-12 15:51:11 +03:00
Alexander Alekhin
e7cc2eea1d build: fix unused variable warning for vzeroupper guard 2017-07-11 16:46:35 +03:00
Pascal Thomet
309c962169 core/bufferpool.hpp: let msvc accept a non virtual protected destructor
BufferPoolController has a non virtual protected destructor (which is legitimate)

However, Visual Studio sees this as a bug, if you enable more warnings, like below
```
add_compile_options(/W3)     # level 3 warnings
add_compile_options(/we4265) # warning about missing virtual destructors
```

This is a proposal to silence this warning.

See https://github.com/ivsgroup/boost_warnings_minimal_demo for a demo of the same problem
with boost/exception.hpp
2017-07-08 16:15:26 +02:00
Alexander Alekhin
da8dbf6cf5 ocl: async cl_buffer cleanup queue (for event callback) 2017-07-07 13:41:20 +03:00
Tomoaki Teshima
e7d5dbfec0 dispatch accumulate series
- use universal intrinsic for base
 - dispatch for float/double version using AVX
 - AVX2 optimization not done yet
2017-07-07 18:45:30 +09:00
Piotr Semenov
c5b5d5c8d3 Fix. Now cv::Rect() is the identity under cv::Rect::operator| operation 2017-07-05 19:01:13 +03:00
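What "identity under operator|" means in c5b5d5c8d3 can be sketched with a bounding-box union that returns the other operand when one side is empty. `RectSketch` and `unite` are hypothetical names; this is an illustration under those assumptions, not the code from the commit.

```cpp
#include <algorithm>

// Hypothetical stand-in for an integer rectangle.
struct RectSketch
{
    int x, y, width, height;

    bool empty() const { return width <= 0 || height <= 0; }
};

// Bounding-box union. An empty rectangle acts as the identity:
// uniting it with r yields r unchanged.
inline RectSketch unite(const RectSketch& a, const RectSketch& b)
{
    if (a.empty()) return b;
    if (b.empty()) return a;

    const int x1 = std::min(a.x, b.x);
    const int y1 = std::min(a.y, b.y);
    const int x2 = std::max(a.x + a.width, b.x + b.width);
    const int y2 = std::max(a.y + a.height, b.y + b.height);
    return RectSketch{x1, y1, x2 - x1, y2 - y1};
}
```

Without the two early returns, uniting a default rectangle (0, 0, 0, 0) with a rectangle at (10, 20) would stretch the result to the origin, which is why the empty case needs separate handling.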
Vladislav Sovrasov
2a2a1dc5b4 Get rid of OPENCV_NOSTL definition 2017-07-04 14:17:02 +03:00
Alexander Alekhin
7b8e6307f8 Merge pull request #9080 from alalek:version_3.3.0-rc
version 3.3.0-rc
2017-07-03 16:21:45 +00:00
Tony Lian
c8783f3e23 Merge pull request #9075 from TonyLianLong:master
Remove unnecessary Non-ASCII characters from source code (#9075)

* Remove unnecessary Non-ASCII characters from source code

Remove unnecessary Non-ASCII characters and replace them with ASCII
characters

* Remove dashes in the @param statement

Remove dashes and place single space in the @param statement to keep
coding style

* misc: more fixes for non-ASCII symbols

* misc: fix non-ASCII symbol in CMake file
2017-07-03 16:14:17 +00:00
Alexander Alekhin
8aa3011f2d Merge pull request #9064 from sadika9:patch-1 2017-07-03 16:07:53 +00:00
Alexander Alekhin
1b8d363231 version 3.3.0-rc 2017-07-03 16:47:05 +03:00
Vladislav Sovrasov
267fdc4c91 Add a note about cxx11 range-based loop in Mat_ documentation 2017-07-03 12:49:11 +03:00
Alexander Alekhin
b66c349bba core(stat): add required CV_AVX_GUARD
Added guard with 'vzeroupper' instruction
2017-07-02 22:45:10 +00:00
Sadika Sumanapala
40e1f2fc03 Fix style 2017-07-01 06:59:27 +05:30
Maksim Shabunin
e0393f8557 Fixed some issues found by static analysis (4th round) 2017-06-30 12:26:53 +03:00
Vadim Pisarevsky
5f1b6ee889 Merge pull request #9017 from sovrasov:parallel_for_cxx11 2017-06-29 11:12:57 +00:00
Alexander Alekhin
b9a2d7b600 build: remove #define to prevent unexpected impact on user applications 2017-06-28 18:50:55 +03:00
Vladislav Sovrasov
08db55fb62 core: add CV_CXX_11 flag to cvdef.h 2017-06-28 16:17:53 +03:00
Vladislav Sovrasov
3c748ccf10 core: add an ability to use cxx11 lambda as a parallel_for_ body 2017-06-28 16:16:05 +03:00
Alexander Alekhin
dcf3d988d5 Merge pull request #8543 from csukuangfj:fix-String 2017-06-28 11:20:29 +00:00
Vadim Pisarevsky
8b3d6603d5 another round of dnn optimization (#9011)
* another round of dnn optimization:
* increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
* improved SIMD optimization of pooling layer, optimized average pooling
* cleaned up convolution layer implementation
* made activation layer "attachable" to all other layers, including fully connected and addition layer.
* fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it describes the topology.
* greatly optimized permutation layer, which improved SSD performance
* parallelized element-wise binary/ternary/... ops (sum, prod, max)

* also, added missing copyrights to many of the layer implementation files

* temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders
2017-06-28 11:15:22 +03:00
Alexander Alekhin
82ec76c123 Merge pull request #8990 from mshabunin:fix-static-2 2017-06-27 14:53:26 +00:00
Alexander Alekhin
f8a75c4361 dispatch: added CV_TRY_${OPT} macro, fix dnn build
- 1: OPT is available directly or via dispatcher
- 0: optimization is not compiled at all
2017-06-27 17:05:15 +03:00
Maksim Shabunin
32d4af36e2 Fixing some static analysis issues 2017-06-27 14:30:26 +03:00
Alexander Alekhin
006966e629 trace: initial support for code trace 2017-06-26 17:07:13 +03:00
James Clarke
25020f2672 fast_math.hpp: Use __asm__ rather than asm; fixes including with -std=c99 2017-06-23 15:28:09 +01:00
Vadim Pisarevsky
fa7e7e0ff9 Merge pull request #8900 from alalek:update_android_build 2017-06-23 10:58:53 +00:00
Alexander Alekhin
3e3e2dd512 android: make optional "cpufeatures", build fixes for NDK r15 2017-06-21 13:34:19 +03:00
Rostislav Vasilikhin
29593635ed licence updated 2017-06-14 21:20:10 +03:00
Alexander Alekhin
e23b59da5c build: fix v_reduce_sum4 (requires SSE3) 2017-06-14 09:37:06 +00:00
Vadim Pisarevsky
fbafc700ea added v_reduce_sum4() universal intrinsic; corrected number of threads in cv::getNumThreads() in the case of GCD 2017-06-13 18:04:00 +03:00
Alexander Alekhin
71517a910a build: fix errors for MSVS2010-2013, reduce default softfloat scope 2017-06-08 01:09:21 +00:00
Rostislav Vasilikhin
c6a3a18894 SoftFloat integrated (#8668)
* everything is put into softfloat.cpp and softfloat.hpp

* WIP: try to integrate softfloat into OpenCV

* extra functions removed

* softfloat made stateless

* CV_EXPORTS added

* operators fixed

* exp added, log: WIP

* log32 fixed

* shorter names; a lot of TODOs

* log64 rewritten

* cbrt32 added

* minors, refactoring

* "inline" -> "CV_INLINE"

* cast to bool warnings fixed

* several warnings fixed

* fixed warning about unsigned unary minus

* fixed warnings on type cast

* inline -> CV_INLINE

* special cases processing added (NaNs, Infs, etc.)

* constants for NaN and Inf added

* more macros and helper functions added

* added (or fixed) tests for pow32, pow64, cbrt32

* exp-like functions fixed

* minor changes

* fixed random number generation for tests

* tests for exp32 and exp64: values are compared to SoftFloat-based naive implementation

* minor warning fix

* pow(f, i) 32/64: special cases handling added

* unused functions removed

* refactoring is in progress (not compiling)

* CV_inline added

* unions {uint_t, float_t} removed

* tests compilation fixed

* static const members -> static methods returning const

* reinterpret_cast

* warning fixed

* const-ness fixed

* all FP calculations (even compile-time) are done in SoftFloat + minor fixes

* pow(f, i) removed from interface (can cause incorrect cast) to internals of pow(f, f), tests fixed

* CV_INLINE -> inline

* internal constants moved to .cpp file

* toInt_minMag() methods merged into toInt() methods

* macros moved to .cpp file

* refactoring: types renamed to softfloat and softdouble; explicit constructors, etc.

* toFloat(), toDouble() -> operator float(), operator double()

* removed f32/f64 prefixes from functions names

* toType() methods removed, round() and trunc() functions added

* minor change

* minors

* MSVC: warnings fixed

* added int cvRound(), cvFloor, cvCeil, cvTrunc, saturate_cast<T>()

* typo fixed

* type cast fixed
2017-05-29 17:07:25 +03:00
catree
542cdb2c39 Improve solvePnP doc, add assert >= 4 in solvePnP, escape underscore character for Scalar_ documentation.
Add reference to SOLVEPNP_ITERATIVE in the doc.
2017-05-29 14:59:14 +02:00
Vadim Pisarevsky
ee257ffe9e Merge pull request #8455 from terfendail:ovxhal_skipsmall 2017-05-26 12:10:18 +00:00
Vitaly Tuzov
1d62a025b3 Moved size restrictions for OpenVX processed images to corresponding cpp files 2017-05-25 19:25:17 +03:00
mschoeneck
4a4d94f266 Merge pull request #8694 from mschoeneck:Canny
Parallelize Canny with custom gradient (#8694)

* New Canny implementation. Restructuring code in parallelCanny class. Align mag buffer and map.

* Fix warnings.

* Missing SIMD check added.

* Replaced local trailingZeros in contours.cpp. Use alignSize in canny.cpp

* Fix warnings in alignSize and allocate just minimum extra columns.

* Fix another warning in map.create.

* Exchange for loop by do loop to avoid double check at the beginning.
Define extra SIMD CANNY_CHECK to avoid unnecessary continue.
2017-05-24 16:20:25 +03:00
krishraghuram
9ea2f5211e Correct the existing documented T-API functions to match the doxygen format (#8758)
* Correct the existing documented T-API functions to match the doxygen format.

* docs: fix comments style

* T-API documentation: minor formatting changes
2017-05-24 13:31:35 +03:00
Vadim Pisarevsky
a065e4b9aa Merge pull request #8769 from mshabunin:kw-fixes 2017-05-23 14:59:36 +00:00
Alexander Alekhin
0448260ed7 Merge pull request #8542 from jveitchmichaelis:update-cudadevo-doc
Update documentation for getCudaEnabledDeviceCount
2017-05-23 17:07:20 +03:00
Maksim Shabunin
b04ed5956e Fixed several issues found by static analysis in core module 2017-05-23 12:35:31 +03:00
Alexander Alekhin
c5e9d1adae macro for static analysis tools 2017-05-23 12:35:31 +03:00
cDc
003745432f fix Mat_ release #8680 2017-05-23 12:19:57 +03:00
Vadim Pisarevsky
925594d1e3 Merge pull request #7894 from alalek:ocl_program 2017-05-03 13:48:58 +00:00
Vadim Pisarevsky
dd81c29834 Merge pull request #8359 from csukuangfj:patch-fix-error-code-documentation 2017-05-03 13:48:28 +00:00
Vadim Pisarevsky
fe2416575b Merge pull request #8432 from csukuangfj:issue-8411 2017-05-03 12:39:55 +00:00
Vadim Pisarevsky
b92bbffa1a Merge pull request #8557 from grundman:patch-3 2017-05-03 12:12:59 +00:00
André Mewes
70e6391f38 create homogeneous affine matrix when constructing from 4x3 cv::Mat 2017-05-02 14:09:20 +03:00
Robert Bragg
8f5ea7deda core: avoid clash with _N define from ctype.h in headers
This updates the public headers to use _Nm instead of _N in templates
since _N is defined by the widely used ctype.h.
2017-04-27 14:45:24 +01:00
Alexander Alekhin
26be2402a3 Merge pull request #8629 from lupustr3:pvlasov/icv2017u2_update2 2017-04-26 10:45:37 +00:00
Pavel Vlasov
11c2ffaf1c Update for IPP for OpenCV 2017u2 integration;
Updated integrations for:
cv::split
cv::merge
cv::insertChannel
cv::extractChannel
cv::Mat::convertTo - now with scaled conversions support
cv::LUT - disabled due to performance issues
Mat::copyTo
Mat::setTo
cv::flip
cv::copyMakeBorder - currently disabled
cv::polarToCart
cv::pow - ipp pow function was removed due to performance issues
cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance
cv::countNonZero
cv::minMaxIdx
cv::norm
cv::canny - new integration. Disabled for threaded;
cv::cornerHarris
cv::boxFilter
cv::bilateralFilter
cv::integral
2017-04-25 15:53:12 +03:00
Vadim Pisarevsky
96aaac186d Merge pull request #8616 from vpisarev:dnn4 2017-04-25 06:32:16 +00:00
Peter Würtz
4c095a76c0 Add docstring for UMat::handle 2017-04-22 09:44:29 +02:00
Alexander Alekhin
f1c8094f5f Merge pull request #8575 from lupustr3:pvlasov/icv2017u2_initial_update 2017-04-21 10:55:29 +00:00
Pavel Vlasov
35c7216846 IPP for OpenCV 2017u2 initial enabling patch; 2017-04-20 20:26:30 +03:00
Vadim Pisarevsky
dd54f7a22a got rid of Blob and BlobShape completely; use cv::Mat and std::vector<int> instead 2017-04-19 23:20:17 +03:00
Arnaud Brejeon
636ab095b0 Merge pull request #8535 from arnaudbrejeon:std_array
Add support for std::array<T, N> (#8535)

* Add support for std::array<T, N>

* Add std::array<Mat, N> support

* Remove UMat constructor with std::array parameter
2017-04-19 13:13:39 +03:00
insoow
2922738b6d Merge pull request #8104 from insoow:master
Gemm kernels for Intel GPU (#8104)

* Fix an issue with Kernel object reset release when consecutive Kernel::run calls

Kernel::run launches OCL GPU kernels and sets an event callback function
to decrease the ref count of the UMat or remove the UMat when the launched workloads
are completed. However, some OCL kernels require multiple calls of the
Kernel::run function with some kernel parameter changes (e.g., input
and output buffer offset) to get the final computation result.
In that case, the current implementation requires unnecessary
synchronization and cleanupMat.

This fix requires the user to specify whether there will be more work or not.
If there is no remaining computation, Kernel::run will reset the
kernel object.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* GEMM kernel optimization for Intel GEN

The optimized kernels use the cl_intel_subgroups extension for better
performance.

Note: These optimized kernels will be part of ISAAC, in code-generated
form, under the MIT license.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Fix API compatibility error

This patch fixes an OCV API compatibility error. The error was reported
due to the interface changes of Kernel::run. To resolve the issue,
an overloaded Kernel::run function is added. It takes a flag indicating
whether there is more work to be done with the kernel object without
releasing resources related to it.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Renaming intel_gpu_gemm.cpp to intel_gpu_gemm.inl.hpp

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* Revert "Fix API compatibility error"

This reverts commit 2ef427db91.

Conflicts:
	modules/core/src/intel_gpu_gemm.inl.hpp

* Revert "Fix an issue with Kernel object reset release when consecutive Kernel::run calls"

This reverts commit cc7f9f5469.

* Fix the case of uninitialization D

When C is null and beta is non-zero, D is used without initialization.
This resolves the issue.

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* fix potential output error due to 0 * nan

Signed-off-by: Woo, Insoo <insoo.woo@intel.com>

* whitespace fix, eliminate non-ASCII symbols

* fix build warning
2017-04-19 12:57:54 +03:00
Vitaly Tuzov
9dc36a1ece Tuned restrictions for Canny, Warp, FAST, Accumulate and Convolution OpenVX HAL calls on small images 2017-04-14 13:20:25 +03:00
Alexander Alekhin
8ba95cd498 Merge pull request #8548 from csukuangfj:fix-typo-RNG 2017-04-11 11:46:06 +00:00
Vitaly Tuzov
87bb74312b Disabled vxuConvolution call for Sobel, GaussianBlur and Box filter evaluation 2017-04-11 14:11:55 +03:00
Matthias Grundmann
fce7469961 Update utility.hpp
Adding missing header for ostream decl. in line 384
2017-04-10 12:04:09 -07:00
Fangjun KUANG
4065778255 fix typos. 2017-04-10 09:32:50 +02:00
Vitaly Tuzov
0f1a56da7c Changed restrictions for OpenVX HAL calls on small images 2017-04-09 01:17:57 +03:00
Fangjun KUANG
ff31d069d0 avoid allocating memory for string with a length of zero.
Remove the specifier "explicit", because the constructor has no parameter. There is no point in adding it here.
2017-04-07 20:46:17 +02:00
jveitchmichaelis
8f19363c07 Update documentation for getCudaEnabledDeviceCount
Inform users that getCudaEnabledDeviceCount can return -1 in some cases.
2017-04-07 14:30:14 +01:00
nnorwitz
24e8cd1a78 Use %% for inline assembly rather than % so this compiles with clang.
Same as 9210cefb36 but for this file too.
2017-04-06 12:54:56 -07:00
Vitaly Tuzov
bf62dca45a Extended restrictions for OpenVX HAL calls on small images 2017-04-06 18:17:34 +03:00
Vitaly Tuzov
bf5b7843e8 Extended set of OpenVX HAL calls disabled for small images 2017-04-06 18:17:32 +03:00
Alexander Alekhin
e5d9b608c4 cmake: fix fp16 support 2017-04-04 20:34:58 +03:00
Alexander Alekhin
297ba85323 Merge pull request #8441 from alalek:dispatch_mathfuncs_core 2017-04-03 14:03:49 +00:00
Vadim Pisarevsky
4aa51f6a32 Merge pull request #8484 from berak:patch-2 2017-04-03 09:57:58 +00:00
Alexander Alekhin
bbdd8ba078 Merge pull request #8506 from sergiud:mat-move-assignment-dont-copy 2017-04-02 10:13:17 +00:00
Sergiu Deitsch
4f31759965 prevent copying in cv::Mat_<T> move assignment 2017-04-01 21:53:30 +02:00
Alexander Alekhin
b67382cbeb Merge pull request #8494 from tomoaki0705:fixWarningCuda 2017-03-31 15:42:01 +00:00
Tomoaki Teshima
507071cc6f suppress warnings on Jetson TK1 2017-03-31 08:20:59 +09:00
berak
3e0b63f65b fix comment in optim.hpp 2017-03-30 11:07:52 +02:00
jexner
b45e784beb Fix segmentation fault in cv::Mat::forEach
This issue concerns only matrices with more dimensions than columns.
See https://github.com/opencv/opencv/issues/8447
2017-03-23 18:48:59 +01:00
Fangjun KUANG
da94d85789 add more info to the error code. 2017-03-23 14:40:34 +01:00
Fangjun KUANG
f82d64c6e5 Add more info to the error code. 2017-03-23 14:34:24 +01:00
Alexander Alekhin
17e5e4cd5a core: CPU target dispatcher update
- use suffixes like '.avx.cpp'
- added CMake-generated files for '.simd.hpp' optimization approach
- wrap HAL intrinsic headers into separate namespaces for different build flags
- automatic vzeroupper insertion (via CV_INSTRUMENT_REGION macro)
2017-03-23 16:12:11 +03:00
Fangjun KUANG
94521629ab fix issue 8411. 2017-03-22 23:24:47 +01:00
KUANG, Fangjun
eae1ebfd29 fix issue 8411. 2017-03-22 22:03:29 +01:00
Vadim Pisarevsky
8abd163464 Merge pull request #8404 from khnaba:stream-with-custom-allocator 2017-03-21 20:06:56 +00:00
Vadim Pisarevsky
e5dbd2c3a5 Merge pull request #8406 from khnaba:dft-as-algorithm 2017-03-21 20:05:54 +00:00
Vadim Pisarevsky
a57d144076 Merge pull request #7462 from alalek:cpu_multi_target 2017-03-21 19:51:32 +00:00