opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-11 14:39:11 +08:00

Author	SHA1	Message	Date
ohnozzy	db9f611767	Add OpenCL support to linearPolar & logPolar. The OpenCL code uses float instead of double, so it does not require the cl_khr_fp64 extension, at the cost of a slight loss of precision. Add explicit conversion from double to float to eliminate a warning during compilation.	2016-04-24 08:37:56 +08:00
Zhigang Gong	0b08d2559e	fix potential race condition in canny.cl. See the code snippet below: while(l_counter != 0) { int mod = l_counter % LOCAL_TOTAL; int pix_per_thr = l_counter / LOCAL_TOTAL + ((lid < mod) ? 1 : 0); for (int i = 0; i < pix_per_thr; ++i) { int index = atomic_dec(&l_counter) - 1; .... } .... barrier(CLK_LOCAL_MEM_FENCE); } If we don't put a barrier before the for loop, there is a possibility that some work items enter the loop while others have not. l_counter is then decremented in the for loop and may reach zero, so the other work items may never enter the while loop. If this happens, it breaks the barrier rule, which requires all work items to reach the same barrier, and it may hang the GPU depending on the implementation of the OpenCL platform. This issue is raised at: https://github.com/Itseez/opencv/issues/5175 Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>	2016-03-15 19:11:15 +08:00
Zhigang Gong	0f7de40e66	Fixed the race condition between inc and dec on the l_counter. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>	2015-05-26 22:06:18 +08:00
Zhigang Gong	3c85200989	Avoid negative index for a local buffer in Canny.cl. int pix_per_thr = l_counter / LOCAL_TOTAL + ((lid < mod) ? 1 : 0); The pix_per_thr * LOCAL_TOTAL may be larger than l_counter. Thus the index into l_stack may be negative, which may cause serious problems. Skip the loop when we get a negative index, and add back to l_counter to keep it balanced and avoid a potentially negative counter. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>	2015-05-26 08:48:05 +08:00
Pavel Rojtberg	1ea41e7246	fix gftt opencv kernel when using mask	2015-04-22 16:13:50 +02:00
Yan Wang	6e7050555e	Optimize pyrUp_unrolled() by mad function. It could improve performance when image size is large. E.g. OCL_PyrUpFixture_PyrUp.PyrUp/18	2014-11-26 16:55:08 +08:00
Alexander Alekhin	569a95e9e1	Merge pull request #3394 from akarsakov:ocl_canny	2014-11-07 12:42:24 +00:00
Alexander Karsakov	7c870014fb	Correctly unrolled some cycles	2014-11-07 12:13:00 +03:00
Alexander Karsakov	0ec0aeb7d0	Minor optimization for ocl_canny	2014-11-06 13:07:33 +03:00
vbystricky	957e5ef8eb	Fix OpenCL version of HoughLinesP function	2014-11-05 14:31:06 +04:00
Alexander Karsakov	643c906f3d	Added optimized loading to YUV2RGB_422 kernel	2014-10-28 15:07:51 +03:00
Alexander Karsakov	1466621f99	Added loading 4 pixels in line instead of 2 to RGB[A] -> YUV(420) kernel	2014-10-27 16:00:34 +03:00
Alexander Karsakov	60367907fe	Used direct float calculations	2014-10-21 17:18:03 +03:00
Alexander Karsakov	5aa9ac9a77	Added OCL code for YUV422 -> RGB[A]\|BGR[A] color conversion	2014-10-21 17:18:03 +03:00
Alexander Karsakov	c8707b891b	Added OCL code for RGB[A]\|BGR[A] -> YUV_[YV12\|IYUV] color conversion	2014-10-21 17:18:03 +03:00
Alexander Karsakov	1cc17a7186	Added OCL code for YUV2BGR_YV12 and YUV2BGR_IYUV color conversions	2014-10-21 17:18:02 +03:00
Alexander Karsakov	85b60ee3cb	Added support for YUV2RGB[A]_NV21 and YUV2BGR[A]_NV21 conversion	2014-10-21 17:18:02 +03:00
Vadim Pisarevsky	397870d7a5	Merge pull request #3279 from akarsakov:ocl_houghlines	2014-10-09 14:56:45 +00:00
Alexander Karsakov	66a8acfd3d	Optimization for HoughLinesP	2014-10-07 17:53:33 +04:00
Alexander Alekhin	14d5358982	Merge pull request #3210 from akarsakov:ocl_gftt_opt	2014-10-07 09:06:54 +00:00
Alexander Karsakov	eaf5a163b1	Added HoughLinesP OCL implementation	2014-09-29 16:48:16 +04:00
Alexander Karsakov	3695a31606	Combined counter and corner buffers into one	2014-09-29 11:10:57 +04:00
Vadim Pisarevsky	470f427a95	Merge pull request #3232 from Chuanbo-Weng:master	2014-09-18 11:48:29 +00:00
Chuanbo Weng	c5552788c5	Use vload to read unaligned data instead of the dereference operator. According to the OpenCL 1.2 spec, section 6.1.5: For arguments to a __kernel function declared to be a pointer to a data type, the OpenCL compiler can assume that the pointee is always appropriately aligned as required by the data type. The behavior of an unaligned load or store is undefined, except for the vloadn, vload_halfn, vstoren, and vstore_halfn functions defined in section 6.12.7. The original code read data of type T from an address that is not a multiple of sizeof(T), so the result was incorrect. With this patch, the cases ./opencv_perf_imgproc --gtest_filter=OCL_ImgSize_TmplSize_Method_MatType_MatchTemplate.MatchTemplate/* work well with beignet 0.9.3. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>	2014-09-17 19:28:07 +08:00
Alexander Karsakov	8c08714b8c	Remove two "set" kernel calls	2014-09-11 18:11:23 +04:00
vbystricky	b0bf8478e5	Optimization OpenCL version of Filter2D	2014-09-11 12:59:51 +04:00
Alexander Karsakov	39b27a19be	Refactoring and optimization	2014-09-05 12:20:29 +04:00
Alexander Karsakov	d59a6fa518	Optimization for getLines	2014-09-05 11:37:16 +04:00
Alexander Karsakov	fee8f29f48	Refactoring, minor optimization	2014-09-04 16:31:30 +04:00
Alexander Karsakov	07d57db91c	Fixed calculation of l_stack_size	2014-09-03 17:40:17 +04:00
Alexander Karsakov	214dab39f6	Fixed BORDER_REFLECT and BORDER_REFLECT_101 extrapolation for case x > 2*maxV	2014-09-02 11:53:31 +04:00
Alexander Alekhin	b332152bef	Merge pull request #2956 from ilya-lavrenov:tapi_accumulate	2014-08-28 09:08:51 +00:00
Alexander Karsakov	f7aadd07f6	Added getLines, fill_accum_local kernels	2014-08-27 17:57:22 +04:00
Vadim Pisarevsky	d66815978a	Merge pull request #3117 from KruchDmitriy:canny_opt	2014-08-27 10:07:55 +00:00
VBystricky	9ee0789174	Fix issues	2014-08-26 14:39:11 +04:00
vbystricky	e75cd74f5a	Optimize OpenCL version of Laplacian filter for kernel size greater than 3	2014-08-25 17:56:09 +04:00
Alexander Karsakov	038bfb98ec	Added fill_accum kernel	2014-08-25 13:55:09 +04:00
Ilya Lavrenov	a350b76738	optimization of cv::accumulate**	2014-08-25 11:25:01 +04:00
Alexander Karsakov	5c1f71de51	Added make_point_list kernel	2014-08-22 16:50:01 +04:00
Vadim Pisarevsky	e7539bd2c8	Merge pull request #3144 from ElenaGvozdeva:ocl_morphSmall	2014-08-22 12:14:06 +00:00
$U-KruchininD-ПК\KruchininD$ U-KruchininD-ПК\KruchininD	6ed168d3af	New optimization for canny; new hysteresis; delete whitespaces; fix problem with mad24; Dynamic work group size; dynamic work group size; Fix problem with warnings; Fix some problems with border; Another one fix; Delete trailing whitespaces; some changes; fix problem with warning	2014-08-22 11:22:15 +04:00
Alexander Karsakov	3d222d313b	Fixed range for 'v' channel for 8U images	2014-08-21 17:22:06 +04:00
Elena Gvozdeva	5302e56071	fix for ocl_morphSmall	2014-08-21 16:31:24 +04:00
Vadim Pisarevsky	e9729a9601	multiple yet minor fixes to make most of the tests pass on Mac with Iris graphics	2014-08-16 00:29:10 +04:00
Alexander Karsakov	5898dcae4a	Added ROUNDING_EPS for identical rounding after dividing on different platforms	2014-08-12 14:28:48 +04:00
Alexander Alekhin	d0f789dc90	Merge pull request #3044 from akarsakov:fix_ocl_tests	2014-08-07 20:14:17 +00:00
Alexander Karsakov	44fbfb2cf6	Fixed extrapolation in pyrDown	2014-08-07 10:39:25 +04:00
Alexander Karsakov	2a0b39d30a	Fixed calculate_histogram kernel	2014-08-07 10:39:24 +04:00
Alexander Karsakov	eb9fdb0164	Fixed rounding in remap INTER_LINEAR mode	2014-08-07 10:39:24 +04:00
Alexander Karsakov	fec21239c8	Revert optimization for warpAffine INTER_NEAREST mode	2014-08-07 10:39:18 +04:00
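
Commit c5552788c5 in the list above replaces a pointer dereference with vload because an unaligned load through a typed pointer is undefined. The sketch below is a host-side C analogy only, not the OpenCL kernel code from that commit: plain C has the same restriction, and copying the bytes with memcpy is the well-defined way to read a value from an address with no alignment guarantee. The helper name load_float_unaligned is hypothetical.

```c
#include <string.h>

/* Hypothetical helper (not from the commit): read a float from a byte
 * address that may not be a multiple of sizeof(float).
 *
 * Casting p to (const float *) and dereferencing it would be undefined
 * behavior when p is misaligned. memcpy copies the bytes one at a time
 * in the abstract machine, so it is valid for any address. This plays
 * the role that vload plays in an OpenCL kernel. */
static float load_float_unaligned(const unsigned char *p)
{
    float value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

A caller could store a float at offset 1 of a byte buffer and read it back with load_float_unaligned(buf + 1); compilers typically lower the memcpy to a single load on targets that tolerate unaligned access.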


256 Commits