Vadim Pisarevsky
1bdd86edeb
Merge pull request #3523 from jet47:fix-cuda-buffer-pool
2014-12-24 11:20:27 +00:00
Vadim Pisarevsky
cddee22cf2
Merge pull request #3527 from jet47:cuda-deprivate-old-device-layer
2014-12-24 11:20:06 +00:00
Vadim Pisarevsky
214f633dd4
Merge pull request #3541 from jet47:find-cuda-update
2014-12-24 11:14:54 +00:00
Vadim Pisarevsky
5d15676b7b
Merge pull request #3532 from oresths:filter_neon
2014-12-24 09:02:24 +00:00
Vadim Pisarevsky
8d4d36f805
Merge pull request #3538 from alalek:icv_fix_package
2014-12-24 09:00:32 +00:00
Vladislav Vinogradov
e4d0652899
[FindCUDA] improvements for cross-platform build
...
* improve `CUDA_TARGET_CPU_ARCH` cache initialization,
allow overriding the initial value from the calling script;
* add `CUDA_TARGET_OS_VARIANT` option to select the OS variant;
* add `CUDA_TARGET_TRIPLET` option to select the target triplet from the
`${CUDA_TOOLKIT_ROOT_DIR}/targets` folder;
* remove `CUDA_TOOLKIT_TARGET_DIR` option, it is now calculated from
`CUDA_TARGET_TRIPLET`; the old approach can still be used for compatibility;
* for CUDA 6.5 and newer, try to locate static libraries too, because
the 6.5 toolkit for ARM cross compilation includes only static libraries.
2014-12-23 17:48:18 +03:00
Vladislav Vinogradov
c4246bc59c
update FindCUDA CMake module to the latest version from upstream
2014-12-23 17:47:04 +03:00
Vladislav Vinogradov
1d82aecf45
minor reorganization for CUDA doxygen groups:
...
move main CUDA group to modules/core/cuda.hpp
2014-12-23 17:42:20 +03:00
Vladislav Vinogradov
b5ab82fdbd
mark old CUDA device layer as deprecated and remove it from doxygen documentation
...
add a note to use new cudev module as a replacement
2014-12-23 17:42:14 +03:00
Vladislav Vinogradov
68e08bbecd
fix null stream initialization for multi-gpu systems
2014-12-23 17:41:24 +03:00
Vladislav Vinogradov
05d40946f3
move StackAllocator to cpp file
...
it is an internal class, so there is no need to export it
2014-12-23 17:41:24 +03:00
Vladislav Vinogradov
7ed38b97c3
fix cuda::BufferPool deinitialization
...
The deinitialization of BufferPool internal objects is controlled by a global
object, but it depends on other global objects, which leads to errors
caused by the undefined deinitialization order of global objects.
Merge the initialization of these global objects into a single class, which
performs initialization and deinitialization in the correct order.
2014-12-23 17:41:24 +03:00
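The single-class approach described in the cuda::BufferPool fix above can be illustrated with a minimal sketch. It is not the code from the commit: `MemoryStack`, `PoolAllocator`, `Initializer`, and `eventLog` are hypothetical names chosen for illustration. The sketch relies only on the C++ rule that class members are constructed in declaration order and destroyed in reverse order, so one owning object gives a well-defined teardown sequence where separate globals would not.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log, used only to observe construction/destruction order.
inline std::vector<std::string>& eventLog()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for a low-level resource (not an OpenCV class).
struct MemoryStack
{
    MemoryStack()  { eventLog().push_back("stack ctor"); }
    ~MemoryStack() { eventLog().push_back("stack dtor"); }
};

// Hypothetical stand-in for an object that depends on MemoryStack.
struct PoolAllocator
{
    explicit PoolAllocator(MemoryStack& s) : stack(&s)
    {
        eventLog().push_back("allocator ctor");
    }
    ~PoolAllocator()
    {
        // The dependency must still be alive while the allocator is torn down.
        assert(stack != nullptr);
        eventLog().push_back("allocator dtor");
    }
    MemoryStack* stack;
};

// Single owner replacing several independent globals.
// Members are constructed top to bottom and destroyed bottom to top,
// so the allocator is always destroyed before the stack it uses.
class Initializer
{
public:
    Initializer() : stack_(), allocator_(stack_) {}
    PoolAllocator& allocator() { return allocator_; }

private:
    MemoryStack stack_;       // constructed first, destroyed last
    PoolAllocator allocator_; // constructed second, destroyed first
};
```

With one `Initializer` instance in place of several globals, the relative order of the dependent objects no longer depends on how translation units happen to be linked.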
Alexander Alekhin
864ec5ef45
IPPICV: don't use full paths in dependencies
2014-12-23 17:23:35 +03:00
Vadim Pisarevsky
fd6ef87c32
Merge pull request #3529 from jet47:fix-linux-install
2014-12-23 13:38:23 +00:00
Vadim Pisarevsky
95ec92994d
Merge pull request #3536 from mshabunin:doxygen-intro
2014-12-23 13:36:40 +00:00
Maksim Shabunin
06c2a70c49
Fixed some mistakes
2014-12-22 17:21:37 +03:00
Maksim Shabunin
637b615e08
Tutorial: documenting OpenCV
2014-12-22 15:51:37 +03:00
Vadim Pisarevsky
d9f159a554
Merge pull request #3513 from mshabunin:compat-30
2014-12-22 11:58:01 +00:00
Vadim Pisarevsky
c0005fd293
Merge pull request #3520 from JoeHowse:master
2014-12-22 11:14:29 +00:00
Vadim Pisarevsky
f12bd999bf
Merge pull request #3524 from jet47:fix-cuda-warnings
2014-12-22 10:58:07 +00:00
Vadim Pisarevsky
a1df295079
Merge pull request #3525 from jet47:fix-cudev-tests
2014-12-22 10:57:07 +00:00
Vadim Pisarevsky
7b20ce4952
Merge pull request #3490 from oresths:symmcolumnsmall_fix
2014-12-22 10:44:47 +00:00
Vadim Pisarevsky
432546e4c4
Merge pull request #3512 from vins31:OpenNi2_AsusXtion
2014-12-22 10:39:42 +00:00
Vadim Pisarevsky
700a388173
Merge pull request #3499 from StevenPuttemans:fix_2432
2014-12-22 10:29:31 +00:00
Vadim Pisarevsky
5fea331d42
Merge pull request #3510 from boaz001:feature-4057
2014-12-22 10:27:33 +00:00
Vadim Pisarevsky
1ab551487d
Merge pull request #3516 from ana-GT:openni2_defaultMode
2014-12-22 10:26:30 +00:00
Vadim Pisarevsky
060d67517a
Merge pull request #3518 from wangyan42164:ocl_cascade_detect
2014-12-22 10:25:47 +00:00
Vadim Pisarevsky
199f1aec2e
Merge pull request #3519 from fvgoto:patch-1
2014-12-22 10:25:00 +00:00
Vadim Pisarevsky
35d730bf2b
Merge pull request #3528 from ilya-lavrenov:update_android_cmake
2014-12-22 10:22:20 +00:00
Vladislav Vinogradov
ec33c4ae36
increase epsilons for tests due to different optimizations (IPP vs CUDA, float vs double)
2014-12-22 11:48:45 +03:00
Vladislav Vinogradov
25f33a7e30
update cudev color conversions according to the latest changes in CPU code
2014-12-22 11:48:45 +03:00
Vladislav Vinogradov
48c9c24da6
disable -Wshadow warning for CUDA modules:
...
it is generated by CUDA headers and we can't fix it
2014-12-22 11:48:19 +03:00
orestis
fffe2464cd
Change DescriptorExtractor_ORB regression test
...
to compensate for NEON's non-compliance with IEEE 754.
Also changed the comparison between the max valid and calculated distance to
make the error message more accurate (in case curMaxDist == maxDist)
2014-12-21 21:27:03 +02:00
orestis
9811a739b0
Change gaussianBlur5x5 perf test epsilon
...
Set it to 1 instead of 0.001, as is already done in gaussianBlur3x3. That
will allow integer destination matrices that are not exactly the same,
but very close to the expected result, to pass the test.
2014-12-20 17:14:21 +02:00
orestis
9c6da03504
SymmRowSmallVec_32f 1x5 asymm
...
NEON speedup: 2.31x
Auto-vect speedup: 2.26x
Test kernel: [-0.9432, -1.1528, 0, 1.1528, 0.9432]
2014-12-19 22:51:42 +02:00
orestis
13c0855114
SymmRowSmallVec_32f 1x5
...
NEON speedup: 2.36x
Auto-vect speedup: 2.36x
Test kernel: [0.1, 0.2408, 0.3184, 0.2408, 0.1]
2014-12-19 22:47:06 +02:00
orestis
ed0ce48179
SymmColumnVec_32f16s asymm
...
NEON speedup: 9.46x
Auto-vect speedup: 1x
Test kernel: [-0.9432, -1.1528, 0, 1.1528, 0.9432]
2014-12-19 22:44:39 +02:00
orestis
a2a131799f
SymmColumnVec_32f16s
...
NEON speedup: 8.64x
Auto-vect speedup: 1x
Test kernel: [0.1, 0.2408, 0.3184, 0.2408, 0.1]
2014-12-19 22:42:31 +02:00
orestis
37e018454d
SymmColumnSmallVec_32s16s 3x1 asymm
...
NEON speedup: 2.12x
Auto-vect speedup: 1.01x
Test kernel: [-2, 0, 2]
2014-12-19 22:40:55 +02:00
orestis
4443d6b0a1
SymmColumnSmallVec_32s16s [-1, 0, 1]
...
NEON speedup: 3.27x
Auto-vect speedup: 1.01x
2014-12-19 22:37:52 +02:00
orestis
99e782e62c
SymmColumnSmallVec_32s16s 3x1
...
NEON speedup: 1.75x
Auto-vect speedup: 1x
2014-12-19 22:36:46 +02:00
orestis
33dfeb85be
SymmColumnSmallVec_32s16s [3, 10, 3] Scharr
...
NEON speedup: 2.04x
Auto-vect speedup: 1x
2014-12-19 22:35:52 +02:00
orestis
61a7f48bf4
SymmColumnSmallVec_32s16s [1, -2, 1]
...
NEON speedup: 2.75x
Auto-vect speedup: 1.01x
2014-12-19 22:34:11 +02:00
orestis
4f906372e2
SymmColumnSmallVec_32s16s [1, 2, 1]
...
NEON speedup: 2.66x
Auto-vect speedup: 1x
2014-12-19 22:33:11 +02:00
orestis
80a0364465
SymmColumnVec_32s8u asymm
...
NEON speedup: 2.95x
Auto-vect speedup: 1x
Test kernel: [-0.9432, -1.1528, 0, 1.1528, 0.9432]
2014-12-19 22:29:54 +02:00
orestis
4f5916f12d
SymmColumnVec_32s8u
...
NEON speedup: 1.96x
Auto-vect speedup: 1x
Test kernel: [0.0708, 0.2445, 0.3694, 0.2445, 0.0708]
2014-12-19 22:26:41 +02:00
orestis
1fb966dc61
SymmRowSmallVec_8u32s 1x5 asymm
...
NEON speedup: 3.14x
Auto-vect speedup: 1.6x
Test kernel: [-5, -2, 0, 2, 5]
2014-12-19 22:23:09 +02:00
orestis
2e7b9a2c0f
SymmRowSmallVec_8u32s 1x3 asymmetric
...
NEON speedup: 1.95x
Auto-vect speedup: 1.17x
Test kernel: [-2, 0, 2]
2014-12-19 22:15:37 +02:00
orestis
969a218057
SymmRowSmallVec_8u32s [-1, 0, 1]
...
NEON speedup: 1.84x
Auto-vect speedup: 1.2x
2014-12-19 22:11:52 +02:00
orestis
c0019a42e4
SymmRowSmallVec_8u32s 1x5 general
...
NEON speedup: 3.86x
Auto-vect speedup: 1.67x
Test kernel: [0.0708, 0.2445, 0.3694, 0.2445, 0.0708]
2014-12-19 22:10:58 +02:00