* add accuracy test and performance check for matmul
* add performance tests for transform and dotProduct
* add test Core_TransformLargeTest for 8u version of transform
* remove raw SSE2/NEON implementation from matmul.cpp
* use universal intrinsic instead of raw intrinsic
* remove unused templated function
* add v_matmuladd which multiply 3x3 matrix and add 3x1 vector
* add v_rotate_left/right in universal intrinsic
* suppress intrinsic on some function and platform
* add pure SW implementation of new universal intrinsics
* add test for new universal intrinsics
* core: prevent memory access after the end of buffer
* fix perf tests
Fix GDAL image decoding color problems identified by issue #10089, by: (#10093)
* Fix GDAL image decoding color problems identified by issue #10089, by:
Fixing CV_8UC1 symbol, which should be CV_8UC3 for RGB GDAL color table images.
Fixing image.ptr<VecX>(row,col)[] to be (*image.ptr<VecX>(row,col))[] to correctly access VecX array elements, as ptr<VecX>() returns a pointer to the VecX, not the first element of VecX. This fixes the color problem with color table gif images, and avoids out-of-bounds memory access.
Respecting the color identification of raster bands provided by the GDAL image driver, and produce BGR or BGRA images. Note that color bands of images using the HSL, CMY, CMYK, or YCbCr color space are ignored, rather than converting them to BGR.
* When reading image files using the GDAL decoder, exit with an error if a color band is encountered that isn't used (eg. from CMYK or YCbCbr), rather than silently ignoring the band's data.
Fix build with FFmpeg master. Some deprecated APIs have been removed. (#10011)
* Fix build with FFmpeg master.
* ffmpeg: update AVFMT_RAWPICTURE support removal
Incorrect type of negative_slope result in this bug.
Also and OCL test for darknet to validate this patch.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
convolution kernel use default queue to run, so that ocl::Timer
, to measure the kernel run time, should use the default queue too.
Also remove useless parameter for convolve()
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
When elements are 64 bits, the vec_st_interleave()/vec_ld_deinterleave()
doesn't interleave 4 elements correctly.
For vec_st_interleave(), following is saved into mem:
a0 b0 a1 b1 c0 d0 c1 d1
-> we expected:
a0 b0 c0 d0 a1 b1 c1 d1
for vec_ld_deinterleave(), following is loaded into a b c d for memory
string { 1 2 3 4 5 6 7 8 }:
a: 1 3
b: 2 4
c: 5 7
d: 6 8
-> we expected:
a: 1 5
b: 2 6
c: 3 7
d: 4 8
This patch corrects this behavior.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>