57 Commits

Author SHA1 Message Date
Namgoo Lee
5a2faab2e6 CUDA 10.1 Build Issue Fix 2019-03-03 16:40:43 +00:00
Namgoo Lee
2b6be3cb0f cudev - Rework some code
- Use shfl_down, instead of __shfl_down, on warp scan
- Remove race conditions
2019-02-25 13:46:32 +09:00
Namgoo Lee
21eb60f88b cudalegacy: Use safe block scan function 2019-02-13 01:55:02 +09:00
Namgoo Lee
970293a229 __shfl_up_sync with mask for CUDA >= 9
* __shfl_up_sync with proper mask value for CUDA >= 9

* BlockScanInclusive for CUDA >= 9

* compatible_shfl_up for use in integral.hpp

* Use CLAHE in cudev

* Add tests for BlockScan
2019-01-21 15:31:15 +00:00
Tomoaki Teshima
e6ef9221cb fix test failure of cudev
* follow the implementation of Luv2RGBfloat in imgproc/src/color_lab.cpp
  * loosen threshold in cudaimgproc
2018-09-29 23:13:12 +09:00
Tomoaki Teshima
6a5266df79 fix CvFp16Test failure 2018-09-25 15:00:37 +09:00
cyy
8f78a1123b fix uninitialized read errors reported by CUDA-INITCHECK 2018-09-11 14:47:39 +08:00
Hamdi Sahloul
a39e0daacf Utilize CV_UNUSED macro 2018-09-07 20:33:52 +09:00
luz.paz
d05714995c Misc. modules/ cont. pt2
Found via `codespell`
2018-02-13 11:28:11 -05:00
Namgoo Lee
25c36fb05f cv::cuda::cvtColor bug fix (#10640)
* cuda::cvtColor bug fix

Fixed bug in conversion formula between RGB space and LUV space.
Testing with opencv_test_cudaimgproc.exe, this commit reduces the number
of failed tests from 191 to 95. (96 more tests pass)

* Rename variables
2018-01-19 14:06:05 +03:00
catree
6d06fcb414 Fix CUDA integral. 2017-12-04 02:22:52 +01:00
Peter J. Stieber
5669ee815b Replace private.cuda.hpp with conditional include of cuda_fp16.h. 2017-10-03 17:47:52 -07:00
Boris Fomitchev
c48807c383 Merge pull request #9418 from borisfom:cuda9
CUDA9 build fixed, added detection (#9418)

* CUDA9 build fixed, added detection

* Replacing deprecated __shfl_xxx with __shfl_sync, fixing bogus CUDA9 warnings
2017-08-24 07:11:44 +00:00
nnorwitz
9210cefb36 Use %% for inline assembly rather than % so this compiles with clang. 2017-04-05 10:57:50 -07:00
Alexander Alekhin
1c18b1d245 Merge pull request #7370 from souch55:Fixxn 2016-10-01 10:44:56 +00:00
sourin
a34fbf7bb1 Fixed identifiers warns 2016-09-30 15:16:29 +05:30
Tomoaki Teshima
2974b049e7 cudev: add feature to convert FP32(float) from/to FP16(half) on GPU
* add feature of Fp16 on GPU (cudev)
  * add test
  * leave template function as unimplemented to raise error
2016-08-01 00:55:16 +09:00
aravind
f4f1561781 Fixed cv::cuda::reduce bug. 2016-02-27 08:30:10 +05:30
Vladislav Vinogradov
2afb02fcb4 fix BORDER_WRAP processing on Maxwell generation 2015-11-27 16:45:26 +03:00
Vladislav Vinogradov
e22979f334 fix #4343 : cv::cuda::findMinMaxLoc incorrect output for single row matrix 2015-05-18 14:16:55 +03:00
Vadim Pisarevsky
0ff67253f7 Merge pull request #3531 from jet47:cuda-core-refactoring 2014-12-26 12:12:42 +00:00
Vladislav Vinogradov
9b8c3fd675 rewrite cuda::cvtColor with new device layer and fix test failures 2014-12-25 19:23:15 +03:00
Vladislav Vinogradov
8237418be6 add Allocator parameter to cudev::GpuMat_ contructors 2014-12-23 17:42:49 +03:00
Vladislav Vinogradov
53862687d5 rename CudaMem -> HostMem to better reflect its purpose 2014-12-23 17:42:49 +03:00
Vladislav Vinogradov
b5ab82fdbd mark old CUDA device layer as deprecated and remove it from doxygen documentation
add a note to use new cudev module as a replacement
2014-12-23 17:42:14 +03:00
Vladislav Vinogradov
25f33a7e30 update cudev color conversions according to the latest changes in CPU code 2014-12-22 11:48:45 +03:00
Maksim Shabunin
ceb6e8bd94 Doxygen documentation: cuda 2014-12-01 15:47:13 +03:00
Vladislav Vinogradov
f1e44fa5ca fix bug #3678 (cuda::integral failures) 2014-05-14 12:48:12 +04:00
Roman Donchenko
bfa40e180f Removed another usage of __func__, following #1763. 2013-11-11 17:02:50 +04:00
Roman Donchenko
21233656bd Merge pull request #1540 from jet47:gpuarithm-cudev 2013-10-21 16:34:45 +04:00
Roman Donchenko
e290436a4c Merge pull request #1492 from jet47:gpucodec-cudev 2013-10-21 16:30:15 +04:00
Vladislav Vinogradov
23cc31e041 used new device layer for cv::cuda::LUT 2013-10-01 15:24:17 +04:00
Vladislav Vinogradov
1ef211b889 used new device layer for cv::gpu::reduce 2013-10-01 12:18:39 +04:00
Vladislav Vinogradov
e1aa2fd06c added gridMinMaxLoc function 2013-10-01 12:18:39 +04:00
Vladislav Vinogradov
bbd519be42 fixed warnings 2013-10-01 12:18:38 +04:00
Vladislav Vinogradov
045a856c24 used new device layer for cv::gpu::minMax 2013-10-01 12:18:38 +04:00
Vladislav Vinogradov
b705e0d886 used new device layer for cv::gpu::sum 2013-10-01 12:18:38 +04:00
Vladislav Vinogradov
9fe92e2111 renamed grid/glob_reduce.hpp -> grid/reduce.hpp 2013-10-01 12:18:38 +04:00
Vladislav Vinogradov
7b3bbcea71 used new device layer for cv::gpu::transpose 2013-10-01 12:18:37 +04:00
Vladislav Vinogradov
6dbb32a05d switched to new device layer in split/merge 2013-10-01 12:18:37 +04:00
Vladislav Vinogradov
7c8c836a7b switched to new device layer in polar <-> cart 2013-10-01 12:18:37 +04:00
Vladislav Vinogradov
b11cccaaca switched to new device layer in bitwize operations 2013-10-01 12:18:36 +04:00
Vladislav Vinogradov
ef9917ecf1 used new device layer for cv::gpu::compare 2013-10-01 12:18:36 +04:00
Vladislav Vinogradov
9c5da2ea22 used new device layer for cv::gpu::add 2013-10-01 12:18:35 +04:00
Vladislav Vinogradov
32d578f5f0 fixed gridTransform overloads problems 2013-10-01 12:18:35 +04:00
Vladislav Vinogradov
f4fb7fe1be fixed compilation error "ambiguous symbol" on CUDA 5.0:
disabled Texture Reference API for old CUDA toolkits
2013-10-01 12:15:30 +04:00
Vladislav Vinogradov
776c0cb08c switched to new device layer in gpucodec module 2013-09-23 12:16:57 +04:00
Alexander Smorkalov
298a1d50d2 Merge pull request #1299 from jet47:gpu-cuda-rename 2013-09-23 10:31:51 +04:00
Vladislav Vinogradov
20f636fcee fixed cudev compilation for old pre-Fermi archs 2013-09-17 17:43:12 +04:00
Vladislav Vinogradov
cfe4a71dc6 renamed gpu* source to cuda* in core module 2013-09-02 14:00:42 +04:00