This ocl kernel is 46% to 171% faster than the current Laplacian 3x3 ocl kernel in the perf test with the "CV_8UC1" image format.

Signed-off-by: Li Peng <peng.li@intel.com>