Paul E. Murphy
c1cdb2416a
imgproc(resize): improve 8u3 HResize vector exit calc
...
Actually, we can do this in constant time. xofs always
contains non-decreasing offset values, so we can instead
find the most extreme value used and never attempt to load it.
Similarly, for all dx >= 0, the condition
xofs[dx] + cn < xofs[dwidth - cn] implies dx < (dwidth - cn).
Thus, we can use this to control our loop termination optimally.
This fixes #16137 with little or no performance impact. I have
also added a debug check as a sanity check.
2020-01-03 14:46:59 -06:00
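The exit calculation described in the commit above can be sketched roughly as follows. This is a simplified illustration and not the OpenCV implementation: `vectorLoopEnd` and `makeOffsets` are hypothetical names, and the sketch assumes `xofs` stores one non-decreasing source offset per destination element.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not the OpenCV code: returns the first destination
// index (a multiple of cn) that the vector loop must leave to a scalar tail.
// Assumes xofs holds non-decreasing per-element source offsets.
int vectorLoopEnd(const std::vector<int>& xofs, int dwidth, int cn)
{
    assert(cn > 0 && dwidth >= cn && static_cast<int>(xofs.size()) >= dwidth);
    // A single lookup yields the most extreme source offset in use;
    // the vector loop never loads from it.
    const int smax = xofs[dwidth - cn];
    int dx = 0;
    // Each comparison is O(1); xofs is not rescanned to find the stop.
    while (dx < dwidth && xofs[dx] + cn < smax)
        dx += cn;
    return dx;
}

// Illustrative data only: per-element offsets mapping srcPixels onto
// dstPixels with cn interleaved channels.
std::vector<int> makeOffsets(int srcPixels, int dstPixels, int cn)
{
    std::vector<int> xofs(dstPixels * cn);
    for (int d = 0; d < dstPixels; ++d)
        for (int k = 0; k < cn; ++k)
            xofs[d * cn + k] = (d * srcPixels / dstPixels) * cn + k;
    return xofs;
}
```

With 4 source pixels, 8 destination pixels and 3 channels, the loop stops at element 12, so a 4-byte load starting at `xofs[dx] + cn` stays inside the 12-byte source row.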
Alexander Alekhin
40ac72a8f1
Merge pull request #16238 from alalek:imgproc_resize_fix_types
2020-01-03 16:30:28 +00:00
Brian Wignall
f9c514b391
Fix spelling typos
...
backport commit 659ffaddb4
2019-12-27 12:46:53 +00:00
Alexander Alekhin
07729e396d
imgproc(resize): avoid unnecessary type conversions
2019-12-26 00:02:52 +00:00
Alexander Alekhin
4733a19bab
Merge pull request #16194 from alalek:fix_16192
...
* imgproc(test): resize(LANCZOS4) reproducer 16192
* imgproc: fix resize LANCZOS4 coefficients generation
2019-12-19 13:20:42 +03:00
Alexander Alekhin
a8345133ac
Merge pull request #16191 from terfendail:lres2c_fix
2019-12-18 22:31:52 +00:00
Vitaly Tuzov
f5a84f75c4
Fix for CV_8UC2 linear resize vectorization
2019-12-18 21:41:36 +00:00
mcellis33
5d15c65e48
Merge pull request #16136 from mcellis33:mec-nan
...
* Handle det == 0 in findCircle3pts.
Issue 16051 shows a case where findCircle3pts returns NaN for the
center coordinates and radius due to dividing by a determinant of 0. In
this case, the points are colinear, so the longest distance between any
2 points is the diameter of the minimum enclosing circle.
* imgproc(test): update test checks for minEnclosingCircle()
* imgproc: fix handling of special cases in minEnclosingCircle()
2019-12-18 17:25:59 +03:00
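The det == 0 case from the commit above can be illustrated with a small standalone sketch. The types `Pt` and `Circle` and both function names are hypothetical and are not taken from OpenCV; the code only shows the idea that, for collinear points, the farthest pair spans the diameter.

```cpp
#include <cmath>

struct Pt { float x, y; };
struct Circle { Pt center; float radius; };

static float dist(Pt a, Pt b) { return std::hypot(a.x - b.x, a.y - b.y); }

// A zero determinant means the three points lie on one line, so no
// circumscribed circle exists and dividing by it would produce NaN.
bool isCollinear(Pt a, Pt b, Pt c)
{
    const float det = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    return det == 0.0f;
}

// Hypothetical fallback: the two points that are farthest apart define
// the diameter of the minimum enclosing circle.
Circle enclosingCircleCollinear(Pt a, Pt b, Pt c)
{
    Pt p = a, q = b;
    float d = dist(a, b);
    if (dist(b, c) > d) { p = b; q = c; d = dist(b, c); }
    if (dist(c, a) > d) { p = c; q = a; d = dist(c, a); }
    return Circle{ Pt{ (p.x + q.x) * 0.5f, (p.y + q.y) * 0.5f }, d * 0.5f };
}
```

For the points (0,0), (1,1) and (3,3), the sketch returns the midpoint of the outer pair as the center and half their distance as the radius.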
Paul Murphy
1c4a64f0a1
Merge pull request #16138 from pmur:reg_16137
...
* imgproc: Prevent 1B overrun of 8C3 SIMD optimization
The fourth value read via v_load_q is essentially ignored,
but can cause trouble if it happens to cross page boundaries.
The final few iterations may attempt to read the most extreme
elements of S, which will read 1B beyond the array in most
alignment cases. Dynamically compute the stop. This could be
hoisted from the loop, but will require a more extensive change.
Likewise, clean up the iteration increment statements to make
it more obvious they do channel count (3) elements per pass.
This should resolve #16137
* imgproc(resize): extra check
2019-12-12 13:00:44 +03:00
shimat
b89581960c
s/Voroni/Voronoi/g
2019-12-11 09:13:58 +09:00
Maksim Shabunin
435c97c7a2
imgproc: add parameter checks in calcHist and calcBackProj
2019-12-10 16:10:19 +03:00
RAJKIRAN NATARAJAN
b9435b9e38
Merge pull request #16094 from saskatchewancatch:issue-16053
...
* Add eps error checking for approxPolyDP to allow sensible values only
for epsilon value of Douglas-Peucker algorithm.
* Review changes for PR
2019-12-09 22:24:35 +03:00
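A parameter check in the spirit of the approxPolyDP commit above might look like the sketch below. The commit only says "sensible values", so the accepted range here (finite and non-negative) is an assumption, and `isSensibleEpsilon` is a hypothetical name.

```cpp
#include <cmath>
#include <limits>

// Hypothetical validation helper; the accepted range is an assumption.
// std::isfinite rejects both NaN and infinities, which would otherwise
// slip past a plain "epsilon < 0" comparison.
bool isSensibleEpsilon(double epsilon)
{
    return std::isfinite(epsilon) && epsilon >= 0.0;
}
```

A caller would run this check before starting the Douglas-Peucker simplification and report an error when it fails.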
Paul Murphy
a011035ed6
Merge pull request #15257 from pmur:resize
...
* resize: HResizeLinear reduce duplicate work
There appears to be a 2x unroll of the HResizeLinear against k,
however the k value is only incremented by 1 during the unroll. This
results in k - 1 duplicate passes when k > 1.
Likewise, the final pass may not respect the work done by the vector
loop. Start it with the offset returned by the vector op if
implemented. Note, no vector ops are implemented today.
The performance gain is most noticeable on a linear downscale. A set of
performance tests is added to characterize this. The performance
improvement is 10-50% depending on the scaling.
* imgproc: vectorize HResizeLinear
Performance is mostly gated by the gather operations
for x inputs.
Likewise, provide a 2x unroll against k, this reduces the
number of alpha gathers by 1/2 for larger k.
While not a 4x improvement, it still performs substantially
better under P9 for a 1.4x improvement. P8 baseline is
1.05-1.10x due to reduced VSX instruction set.
For float types, this results in a more modest
1.2x improvement.
* Update U8 processing for non-bitexact linear resize
* core: hal: vsx: improve v_load_expand_q
With a little help, we can do this quickly without gprs on
all VSX enabled targets.
* resize: Fix cn == 3 step per feedback
Per feedback, ensure we don't overrun. This was caught via the
failure observed in Test_TensorFlow.inception_accuracy.
2019-12-09 14:54:06 +03:00
Alexander Alekhin
734de34b7a
Merge pull request #16085 from alalek:imgproc_threshold_to_zero_ipp_bug
...
* imgproc(IPP): wrong result from threshold(THRESH_TOZERO)
* imgproc(IPP): disable IPP code to pass THRESH_TOZERO test
2019-12-09 14:51:02 +03:00
Alexander Alekhin
b369c456f2
imgproc(color): clarify error message
2019-12-06 13:25:51 +03:00
Brian Wignall
af997529a1
Fix some typos
2019-11-26 18:41:19 +03:00
Everton Constantino
75315fb297
Merge pull request #15494 from everton1984:hal_vector_get_n
...
Improving VSX performance of integral function
* Adding support for vector get function on VSX datatypes so the
integral function gains a bit of performance.
* Removing get as a datatype member function and implementing a new HAL
instruction v_extract_n to get the n-th element of a vector register.
* Adding SSE/NEON/AVX intrinsics.
* Implement new HAL instruction v_broadcast_element on VSX/AVX/NEON/SSE.
* core(simd): add tests for v_extract_n/v_broadcast_element
- updated docs
- commented out code to repair compilation
- added WASM and MSA default implementations
* core(simd): fix compilation
- x86: avoid _mm256_extract_epi64/32/16/8 with MSVS 2015
- x86: _mm_extract_epi64 is 64-bit only
* cleanup
2019-11-20 13:41:07 +03:00
clunietp
2185bce4b7
Fix 13577
2019-11-18 07:41:34 -05:00
Alexander Alekhin
f4d55d512f
imgproc: fix bit-exact GaussianBlur() / sepFilter2D() (#15855)
...
* imgproc: fix bit-exact GaussianBlur() / sepFilter2D()
- avoid kernels with bad approximation
- GaussianBlur - apply error-diffusion approximation for kernel (8-bit fraction)
* java(test): update features2d ref data
* test: update test_facedetect
2019-11-18 01:39:27 +03:00
Alexander Alekhin
686ea5c1a6
Merge pull request #15917 from ChipKerchner:demosaicingToHal2
2019-11-16 19:45:37 +00:00
ChipKerchner
1d33335e33
Convert demosaicing with variable number of gradients to HAL - 5.5x faster
2019-11-15 07:42:03 -06:00
Alexander Alekhin
6773b938b3
Merge pull request #15896 from alalek:build_gcc_9
2019-11-14 14:22:02 +00:00
Alexander Alekhin
763b80d5fa
imgproc(IPP): disable ippiDistanceTransform_3x3_8u32f_C1R
2019-11-13 14:14:19 +03:00
Alexander Alekhin
7ecdcf6ca6
build: GCC9 compilation
2019-11-12 18:49:34 +03:00
Chip Kerchner
2112aa31e6
Merge pull request #15828 from ChipKerchner:momentsToHal
...
* Convert moments in tile algorithms to HAL (1.3x faster for VSX).
* Adding NEON code back in for non 64-bit platforms.
* Remove floats from post processing.
2019-11-05 18:52:35 +03:00
Ciprian Alexandru Pitis
d2e02779c4
Merge pull request #15799 from Cpitis:feature/parallelization
...
Parallelize pyrDown & calcSharrDeriv
* ::pyrDown has been parallelized
* CalcSharrDeriv parallelized
* Fixed whitespace
* Set granularity based on number of threads enabled
* Granularity changed to cv::getNumThreads, now each thread should receive 1/n sized stripes
* imgproc: move PyrDownInvoker<CastOp>::operator() implementation
* imgproc(pyramid): remove syloopboundary()
* video: SharrDerivInvoker replace 'Mat*' => 'Mat&' fields
2019-10-31 23:38:49 +03:00
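The "1/n sized stripes" idea from the parallelization commit above can be sketched as a simple row split. This is an illustration only: `stripeRange` is a hypothetical helper, and in the commit the stripe count comes from cv::getNumThreads while here it is a plain parameter.

```cpp
#include <utility>

// Hypothetical helper: returns the half-open row range [begin, end) that
// stripe number idx receives when rows are divided into nstripes parts.
// 64-bit intermediates avoid overflow for large images.
std::pair<int, int> stripeRange(int rows, int nstripes, int idx)
{
    const long long r = rows;
    const int begin = static_cast<int>(r * idx / nstripes);
    const int end   = static_cast<int>(r * (idx + 1) / nstripes);
    return std::make_pair(begin, end);
}
```

Consecutive stripes share their boundary, so every row is covered exactly once even when the row count is not a multiple of the stripe count.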
Alexander Alekhin
bad4e5c3eb
Merge pull request #15692 from alalek:core_tls_handle_thread_termination
2019-10-29 20:40:35 +00:00
Alexander Alekhin
7cf1054d36
Merge pull request #15764 from ChipKerchner:demosaicingToHal
2019-10-25 13:49:46 +00:00
Alexander Alekhin
17e2bf5717
core(tls): implement releasing of TLS on thread termination
...
- move TLS & instrumentation code out of core/utility.hpp
- (*) TLSData lost .gather() method (to dispose thread data on thread termination)
- use TLSDataAccumulator for reliable collecting of thread data
- prefer using of .detachData() + .cleanupDetachedData() instead of .gather() method
(*) API is broken: replace TLSData => TLSDataAccumulator if gather required
(objects disposal on threads termination is not available in accumulator mode)
2019-10-24 06:36:18 +00:00
ChipKerchner
c46f119e0e
Convert demosaic functions to HAL
2019-10-23 10:47:07 -05:00
Steve Nicholson
acb3b3bd4d
Add documentation and example program for intersectConvexConvex
2019-10-19 22:08:07 -07:00
jasjuang
4c7db02925
document CC_STAT_MAX in ConnectedComponentsTypes
2019-10-16 17:22:25 -07:00
Everton Constantino
9ca9249992
Merge pull request #15527 from everton1984:faster_acc
...
* Adding support for vectorized masking for uchar/ushort.
* Fixing bug where mask was zeroing the dst. Improved the way to calculate
the mask and tweaked for further performance improvements.
* Fixing mask comparison test.
* Restricting to one channel.
* Adding support for 3 channels, switch old approach to start using HAL's
v_select.
2019-10-11 18:32:59 +03:00
Alexander Alekhin
4748aca61f
Merge pull request #15642 from alalek:issue_15597
2019-10-08 00:33:20 +03:00
Alexander Alekhin
a007220c52
imgproc: update histogram test
2019-10-07 15:06:43 +03:00
Alexander Alekhin
f301f17b61
imgproc: accurate histogram value thresholding
2019-10-04 19:56:25 +03:00
Alexander Alekhin
c69245da1f
imgproc: fix fitLine() implementation
...
- update optimal solutions on each iteration
2019-10-03 21:23:52 +00:00
Alexander Alekhin
f81e401cd0
imgproc: fix indexing issue in pyramids
...
UBSAN violation expression: 'tab = tabR - x;'
2019-09-26 18:09:47 +03:00
Vitaly Tuzov
1c17b3281a
Fixed OOB reading in pyrDown
2019-09-25 13:24:17 +03:00
Vitaly Tuzov
7b3a752012
Fixed universal intrinsic undistort() implementation
2019-09-16 17:16:38 +03:00
Alexander Alekhin
e7b6753a10
imgproc: avoid manual memory allocation in connectedcomponents.cpp
2019-09-05 16:20:08 +03:00
Everton Constantino
76e403cf25
Merge pull request #15440 from everton1984:new_integral_tests
...
* Adding all possible data type interactions to the perf tests since some
use SIMD acceleration and others do not.
* Disabling full tests by default.
* Giving proper names, removing magic numbers and sanity checks of new
performance tests for the integral function.
* Giving proper names, making array static.
2019-09-04 19:14:00 +03:00
Chip Kerchner
26228e6b4d
Merge pull request #15358 from ChipKerchner:imgwarpToHal
...
* Convert ImgWarp from SSE SIMD to HAL - 2.8x faster on Power (VSX) and 15% speedup on x86
* Change compile flag from CV_SIMD128 to CV_SIMD128_64F for use of v_float64x2 type
* Changing WarpPerspectiveLine from class functions and dispatching to static functions.
* Re-add dynamic runtime and dispatch execution.
* Restore SSE4_1 optimizations inside opt_SSE4_1 namespace
2019-08-31 13:47:58 +03:00
atinfinity
824465ea27
Merge pull request #15388 from atinfinity:impl-turbo-colormap
...
Implementation of colormap "Turbo" (#15388)
* implemented turbo colormap
* add colormap image
* changed float value to avoid cast
* sorted flag check alphabetically
2019-08-26 17:55:10 +03:00
Alexander Alekhin
29dbeb253c
build: fix build with ICC
2019-08-23 16:36:32 +03:00
luz.paz
fcc7d8dd4e
Fix modules/ typos
...
Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint`
backporting of commit: ec43292e1e
2019-08-16 17:34:29 +03:00
Alexander Alekhin
32772a5436
3.4: backported changes from 'master' branch
2019-08-14 16:36:08 +03:00
Maksim Shabunin
6d5ac67681
Restored IPP call reduction
2019-07-31 15:41:22 +03:00
dcouwenh
d3cf0d2c06
Bayer VNG Demosaicing Fix #2 (Merge pull request #15086)
...
* Update demosaicing.cpp
Fixed calculation of Bs for non-green pixels.
* Fixed cvtColor perf test for bayer VNG
2019-07-30 23:49:46 +03:00
Vitaly Tuzov
e0f8bb83a6
Merge pull request #14994 from terfendail:wintr_undistort
...
WUI based implementation to initUndistortRectifyMap (#14994)
* Add initUndistortRectifyMap performance test
* Move cv namespace boundaries
* Add wide universal intrinsics based implementation to initUndistortRectifyMap
* Dispatch undistort
2019-07-18 19:32:51 +03:00
Chip Kerchner
c9fcc12e3b
Merge pull request #15048 from ChipKerchner:reduceStoreGatheringThreshold
...
* Reduce store gathering pressures - speeds thresholds by up to 20%
* Rename temporary histogram array and initialize so that MACOSX builder is happy
2019-07-16 16:10:49 +03:00
Alexander Alekhin
054c796213
Merge pull request #15026 from terfendail:gaussian_fix
2019-07-12 18:31:09 +00:00
Vitaly Tuzov
894ad33bf4
Fix pixel value evaluation overflow in bit-exact GaussianBlur implementation
2019-07-12 18:11:51 +03:00
Alexander Alekhin
32c6e58bdb
imgproc: fix unaligned memory access
...
may cause crashes on ARM platform
2019-07-11 20:49:47 +00:00
Alexander Alekhin
39a975cb29
Merge pull request #14983 from tomoaki0705:fixOclCvtColorMRGBA
2019-07-05 09:31:08 +00:00
Tomoaki Teshima
594a95839c
fix test failure of OCL_ImgProc/CvtColor8u.mRGBA2RGBA
2019-07-05 11:22:22 +09:00
Vitaly Tuzov
82e5b961d3
Fixed initUndistortRectifyMap AVX2 implementation
2019-07-04 15:49:33 +03:00
arnaudbrejeon
a37201abee
Fix crash, add assert and test
2019-07-02 09:56:31 -07:00
Vitaly Tuzov
9befb7a1d7
Merge pull request #14916 from terfendail:wsignmask_deprecated
...
* Avoid using v_signmask universal intrinsic and mark it as deprecated
* Renamed v_find_negative to v_scan_forward
2019-07-01 19:53:51 +03:00
StefanBruens
3e4a195b61
Merge pull request #14936 from StefanBruens:crosscorr_cleanup
...
Crosscorr cleanup (#14936)
* Simplify code for convolution destination type/size
For the 2d filter code, destination size equals source size, and the
crossCorr function even (re-)creates the output matrix with the given size.
The number of channels also has to match. The destination type() is the
one used to create the output matrix, so we can use its type() here.
This is a preparatory patch.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
* Remove redundant destination size and type parameters from crossCorr
All calling sites of crossCorr already use (...,
mat, mat.size(), mat.type(), ...), so the parameters are redundant.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
2019-06-30 19:04:25 +03:00
Alexander Alekhin
4112866821
Merge pull request #14886 from alalek:fix_grabcut_kmeans_call_14879
2019-06-26 20:03:04 +00:00
Alexander Alekhin
0a461e7922
Merge pull request #13252 from take1014:filter2d_13179
2019-06-26 13:34:10 +00:00
Alexander Alekhin
4a6888ccf6
imgproc: fix kmeans() call from grabCut()
2019-06-25 13:42:04 +03:00
Alexander Alekhin
5ac55fc132
core: eliminate AVX512 build warnings
...
from MSVS2017 and GCC8 -O1 mode
2019-06-20 20:00:09 +03:00
Alexander Alekhin
8ca4252303
Merge pull request #14583 from FanaticsKang:fix_undistortPoint_bug
2019-06-14 18:30:26 +00:00
Kang
549c53121a
Fix the bug where icdist may be negative at the edge of the image when k[4] is negative.
2019-06-14 19:00:36 +03:00
Vitaly Tuzov
d2aadabc5e
Merge pull request #14743 from terfendail:wui512_fixvswarn
...
Fix for MSVS2019 build warnings (#14743)
* AVX512 arch support for MSVS
* Fix for MSVS2019 build warnings: updated integral() AVX512 implementation
* Fix for MSVS2019 build warnings: reworked v_rotate_right AVX512 implementation
* fix indentation
2019-06-11 23:07:39 +03:00
Alexander Alekhin
1e9ad5476d
core(intrin): drop hasSIMD128 checks
...
- use compile-time checks instead (`#if CV_SIMD128`)
- runtime checks are useless
2019-06-08 19:20:20 +00:00
bommo1
a38157a1f4
Fix https://github.com/opencv/opencv/issues/14265
2019-06-03 23:05:03 +02:00
Vitaly Tuzov
3b015dfc7d
Merge pull request #14210 from terfendail:wui_512
...
AVX512 wide universal intrinsics (#14210)
* Added implementation of 512-bit wide universal intrinsics(WIP)
* Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP)
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks
* Added implementation of 512-bit wide universal intrinsics(WIP): build fixes
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros
* Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask()
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines
* Added implementation of 512-bit wide universal intrinsics(WIP): Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable.
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask.
* Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512()
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build
* Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.
2019-06-03 18:05:35 +03:00
Alexander Alekhin
aaf56c2839
Merge pull request #14649 from savuor:fix/luv_hls_read_oob
2019-05-27 16:24:55 +00:00
Alexander Alekhin
a81c0e6db9
Merge pull request #14447 from catree:fix_issue_14423
2019-05-27 15:00:21 +00:00
Rostislav Vasilikhin
8c698262ea
rgb2hls_b: out of bounds read fixed
2019-05-27 16:19:52 +03:00
Rostislav Vasilikhin
791ebd05fc
out of bounds read fixed in rgb2luv_b
2019-05-27 16:19:01 +03:00
Rostislav Vasilikhin
e07ffe902e
Merge pull request #14616 from savuor:hsv_wide
...
HSV and HLS color conversions rewritten to wide intrinsics (#14616)
* RGB2HSV_b vectorized
* RGB2HSV_f: widen
* RGB2HSV_f: shorten, more intuitive
* HSV2RGB_f and HSV2RGB_b widen
* hls2rgb_f widen
* instrumentation instead vx_cleanup
* RGB2HLS_f widen
* RGB2HLS_b rewritten to wide universal intrinsics
* define guard against no SIMD code
* hls2rgb_b rewritten
* extra define removed
* warning fixed
* hls2rgb_b: performance fixed
2019-05-24 23:01:08 +03:00
Ahmed Ashour
f3319f6140
java: remove redundant declaration of java.lang package
2019-05-23 14:06:34 +02:00
catree
7ed858e38e
Fix issue with solvePnPRansac and Nx3 1-channel input when the number of points is 5. Try to unify the input shape of projectPoints and undistortPoints.
2019-05-22 14:19:16 +02:00
Rostislav Vasilikhin
e90e0ef9aa
Merge pull request #14106 from savuor:lab_wide
...
Lab, Luv and XYZ conversions rewritten to wide intrinsics (#14106)
* rgb2xyz<float> re-vectorized
* rgb2xyz_i vectorized for ushort and uchar
* xyz2rgb<float> vectorized
* xyz2rgb_i vectorized for both uchar and ushort
* intermediate conversions (int->float) rewritten
* packed rgb2luv rewritten
* (some) float conversions rewritten
* burnt volatile int _3 and similar
* RGB2Lab_b rewritten
* tests: logging made better
* RGB2Lab_f (LRGB path) rewritten
* Lab2RGBfloat rewritten
* Lab2RGBinteger and Lab2RGB_b rewritten to wide universal intrinsics
* Luv2RGBinteger wide vectorized
* RGB2Lab_b fixed: v_sub_wrap instead of saturated sub
* warnings fixed
* trying to fix compilation on older compilers
* using 16x8 registers for 8-element dot product
* cleanup added
* splineInterpolate: loop unrolled, perf fix for f32x4
* Lab2RGBfloat: grab 2x more data to process on f32x4
* nrepeats for Luv2RGBfloat, +20% perf
* minor
* nrepeats to RGB2Lab_f
* Lab2RGBinteger: no tab for linear BGR
* nrepeats for RGB2Luvfloat
* Luv2RGBinteger: no tab for linear RGB
* +10% more to perf of Luv2RGBfloat
* nrepeats for 256-simd for Lab2RGBfloat
* fewer warnings
* BOM removed
* CV_SIMD_WIDTH used for lanes number checking
* trilinearPackedInterpolate: 128-bit specialization added
* fix build; no vx_cleanup(), instrumentation instead
2019-05-20 21:10:20 +03:00
Alexander Alekhin
30a595789c
Merge pull request #14463 from thangktran:thangktran/fix-imgproc-intersectConvexConvex
2019-05-16 14:50:20 +00:00
Thang Tran
1aff378ae8
imgproc: fixed bug from intersectConvexConvex
...
Added checks for all of the vertices of each contour instead of checking
only the first vertex.
2019-05-01 11:06:30 +02:00
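The intersectConvexConvex fix above replaces a first-vertex check with a check over all vertices. A minimal sketch of that kind of containment test follows; `P2`, `insideConvex` and `allVerticesInside` are hypothetical names, and the code assumes convex polygons given in counter-clockwise order rather than reproducing the OpenCV routine.

```cpp
#include <cstddef>
#include <vector>

struct P2 { double x, y; };

// True if q lies inside or on the boundary of a convex polygon whose
// vertices are listed in counter-clockwise order (an assumption here).
bool insideConvex(const std::vector<P2>& poly, P2 q)
{
    const std::size_t n = poly.size();
    if (n < 3) return false;
    for (std::size_t i = 0; i < n; ++i) {
        const P2& a = poly[i];
        const P2& b = poly[(i + 1) % n];
        // A negative cross product puts q on the outer side of edge a->b.
        const double cross = (b.x - a.x) * (q.y - a.y) - (b.y - a.y) * (q.x - a.x);
        if (cross < 0.0) return false;
    }
    return true;
}

// Tests every vertex of inner; looking only at inner[0] can wrongly
// report containment when another vertex sticks out of outer.
bool allVerticesInside(const std::vector<P2>& inner, const std::vector<P2>& outer)
{
    if (inner.empty()) return false;
    for (const P2& v : inner)
        if (!insideConvex(outer, v)) return false;
    return true;
}
```

A triangle whose first vertex is inside a square but whose second vertex lies outside is the kind of input where a first-vertex-only check gives the wrong answer.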
Alexander Alekhin
1c180f4c7f
imgproc: fix RemoveOverlaps() with empty input vector
2019-04-29 21:15:23 +00:00
Suleyman TURKMEN
3f9343e238
Update imgproc.hpp
2019-04-22 00:48:11 +03:00
Alexander Alekhin
9dccfe2a96
Merge pull request #13917 from sturkmen72:removed_c_api
2019-04-17 19:04:33 +00:00
Brad Kelly
0fe17eeb68
Implementing AVX512 Support for 1 channel mats for CV_64F format
2019-03-22 09:44:23 -07:00
Alexander Alekhin
8c8715c4dd
fix static analysis issues
2019-03-13 17:19:39 +03:00
take1014
e0b664f390
fix dftFilter2D
2019-03-13 00:27:56 +09:00
Alexander Alekhin
2c07c6718f
imgproc: dispatch morph
2019-03-11 13:54:12 +00:00
Alexander Alekhin
5a01227aa1
imgproc: dispatch box_filter
2019-03-11 13:54:12 +00:00
Alexander Alekhin
ce3c92eb1f
imgproc: dispatch bilateral_filter
2019-03-11 13:54:12 +00:00
Alexander Alekhin
b99c9145bf
imgproc: dispatch smooth
2019-03-11 13:54:12 +00:00
Alexander Alekhin
6ec08f268f
imgproc: dispatch medianBlur
2019-03-11 13:54:12 +00:00
Alexander Alekhin
8546ac3ce6
imgproc: get rid of filter.avx2.cpp
2019-03-11 13:54:12 +00:00
Alexander Alekhin
9a8dbfd57f
imgproc: dispatch filter.cpp
2019-03-11 13:54:12 +00:00
Alexander Alekhin
756a98a395
imgproc: keep history of filters files
2019-03-11 13:54:07 +00:00
Alexander Alekhin
9dc7554089
imgproc: copy .dispatch.cpp
2019-03-11 13:53:59 +00:00
Alexander Alekhin
6eac8f78b9
imgproc: copy .simd.hpp
2019-03-11 13:53:59 +00:00
Alexander Alekhin
7e8cc580c9
Merge pull request #13997 from alalek:imgproc_dispatch_cvtcolor
2019-03-08 16:18:44 +00:00
Alexander Alekhin
8b541e450b
imgproc: dispatch color*
...
Lab/XYZ modes have been postponed (color_lab.cpp):
- need to split code for tables initialization and for pixels processing first
- no significant performance improvements for switching between SSE42 / AVX2 code generation
2019-03-07 15:45:05 +03:00
Alexander Alekhin
39783a6584
core: keep history of color*.cpp
2019-03-07 15:38:13 +03:00
Alexander Alekhin
f26912960f
imgproc: clone color*.dispatch.cpp
2019-03-07 15:35:49 +03:00
Alexander Alekhin
db588bb831
imgproc: clone color*.simd.hpp
2019-03-07 15:35:13 +03:00
Alexander Alekhin
d5a2fe5180
perf: ignore _ovx tests
2019-03-06 15:52:23 +03:00
Vitaly Tuzov
99b39aa5bd
Fixed out of bound reading in LINEAR_EXACT resize for 8UC3
2019-03-05 17:21:21 +03:00
Suleyman TURKMEN
3d1dbd2ccd
clean up C API
2019-03-03 21:43:27 +03:00
Alexander Alekhin
3ba49ccecc
imgproc: removed LSD code due to original code license conflict
2019-03-01 16:25:39 +03:00
Vitaly Tuzov
9548093b46
Horizontal line processing for pyrDown() reworked using wide universal intrinsics.
2019-02-28 00:12:57 +03:00
Alexander Alekhin
1db5d82b7f
Merge pull request #13844 from brad-kelly:integral_avx512_cn234
2019-02-20 12:27:16 +00:00
Vitaly Tuzov
334c4d62b5
Merge pull request #13781 from terfendail:warp_wintr
...
Resize reworked using wide universal intrinsics (#13781)
* Added wide universal intrinsics optimized implementation for 3 channel bit-exact linear resize
* Reworked linear resize using new wide LUT intrinsics
* Fix for VSX intrinsics
2019-02-20 14:30:28 +03:00
Brad Kelly
507f8add1c
Implementing AVX512 Support for 2 and 4 channel mats for CV_64F format
2019-02-19 11:31:20 -08:00
Alexander Alekhin
757d8ac8f7
Merge pull request #13769 from savuor:cvtColor_tests_16u_32f
2019-02-08 15:29:35 +00:00
Alexander Alekhin
8f7e92e466
Merge pull request #13764 from nglee:dev_CudaCLAHE16bitSupport
2019-02-08 10:13:11 +00:00
Rostislav Vasilikhin
4e679e1cc5
disabled 16u and 32f perf tests
2019-02-07 19:26:36 +03:00
Rostislav Vasilikhin
87f651c119
disabled sanity check for 32f
2019-02-07 18:20:29 +03:00
Vitaly Tuzov
07c10d6fc3
Fixed out of bound reading issue in erode() and dilate()
2019-02-07 17:28:58 +03:00
Namgoo Lee
fb8e652c3f
Add CV_16UC1 support for cuda::CLAHE
...
Due to size limit of shared memory, histogram is built on
the global memory for CV_16UC1 case.
The amount of memory needed for building histogram is:
65536 * 4 bytes = 256KB
and shared memory limit is 48KB typically.
Added test cases for CV_16UC1 and various clip limits.
Added perf tests for CV_16UC1 on both CPU and CUDA code.
There was also a bug in CV_8UC1 case when redistributing
"residual" clipped pixels. Adding the test case where clip
limit is 5.0 exposes this bug.
2019-02-06 17:21:55 +00:00
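The memory arithmetic in the cuda::CLAHE commit above can be checked with a few compile-time expressions. The function and constant names are hypothetical; the 4-byte bin size and the 48KB limit are the figures quoted in the commit message.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed for one histogram with a 4-byte counter per intensity level.
constexpr std::size_t histogramBytes(unsigned bitsPerPixel)
{
    return (std::size_t{1} << bitsPerPixel) * sizeof(std::uint32_t);
}

// Typical shared memory limit quoted in the commit message.
constexpr std::size_t kSharedMemLimit = 48 * 1024;

constexpr bool fitsInSharedMemory(unsigned bitsPerPixel)
{
    return histogramBytes(bitsPerPixel) <= kSharedMemLimit;
}

static_assert(histogramBytes(16) == 256 * 1024, "65536 bins * 4 bytes = 256KB");
static_assert(!fitsInSharedMemory(16), "16-bit histogram exceeds 48KB");
static_assert(fitsInSharedMemory(8), "8-bit histogram (1KB) fits");
```

Under these figures an 8-bit histogram needs 1KB while a 16-bit one needs 256KB, which is why the commit builds the CV_16UC1 histogram in global memory.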
Rostislav Vasilikhin
bbedebb57c
perf tests for cvtColor for 16U and 32f added
2019-02-06 17:56:44 +03:00
Rostislav Vasilikhin
554eae56d1
Merge pull request #13708 from savuor:yuv42x_wide
...
YUV42x color conversions rewritten to wide intrinsics (#13708)
* a*b+c -> fma
* YUV420sp2RGB initially vectorized
* shorter var names
* loops by 4
* yuv420p2rgb vectorized
* yuv422toRGB vectorized
* reg arrays
* rgb2yuv420 vectorized
* warnings fixed
* try to fix align error
2019-02-01 19:09:31 +03:00
Vitaly Tuzov
2f5af1bd33
Merge pull request #13693 from terfendail:spatialgrad_wintr
...
* spatialGradient() reworked to use wide universal intrinsics
* Moved row pointers inside loops
2019-01-30 22:37:27 +03:00
Alexander Alekhin
268d73165e
Merge pull request #13684 from terfendail:lblend_wintr
2019-01-29 16:21:08 +00:00
Alexander Alekhin
5916ebf500
Merge pull request #13679 from alalek:imgproc_median_blur_cleanup
...
* imgproc: cleanup medianBlur_8u_O1 code
Unnecessary per-channel buffers: H[c] / lut[c]
* imgproc(medianBlur_8u_O1): use CV_SIMD_WIDTH for alignment
2019-01-29 19:20:24 +03:00
Arnaud Brejeon
d998e70a25
Merge pull request #13672 from arnaudbrejeon:bug_fix_12961
...
PyrDown: Fix bug #12961 (#13672)
* Force unaligned pointer and create test
* More cross-platform solution
* MSVC expects a proper order
* Remove useless clang macro
2019-01-28 21:36:00 +03:00
Vitaly Tuzov
ed2e1af3e8
Added performance test for blendLinear
2019-01-25 14:16:19 +03:00
Vitaly Tuzov
266725a378
blendLinear() reworked to use wide universal intrinsics
2019-01-25 14:16:20 +03:00
Rostislav Vasilikhin
74ba4b7ae2
fixed (un)signed packing s16 -> u8
2019-01-21 18:10:29 +03:00
Alexander Alekhin
a84e11451b
imgproc(test): RGB2YUV regression test
2019-01-21 16:07:20 +03:00
Alexander Alekhin
0395b2ea9c
Merge pull request #13650 from terfendail:shapedescr_wintr
2019-01-18 16:18:47 +00:00
Rostislav Vasilikhin
3812ae7949
Merge pull request #13649 from savuor:yuv_wide
...
YUV/YCrCb conversions rewritten to wide intrinsics (#13649)
* YUV: minors
* YUV42x conversions template-merged
* more template-merged YUV42x conversions; some NEON code removed
* rgb2yuv<float> vectorized
* yuv2rgb<float> vectorized
* memcpy removed
* Yuv2RGB<ushort> vectorized
* unused code removed
* rgb2yuv<ushort> vectorized
* rgb2yuv<uchar> vectorized
* v_pack_u used (up to +30% perf)
* yuv2rgb<uchar> vectorized
* fixed compilation
2019-01-18 19:06:29 +03:00
Vitaly Tuzov
a84bbc62b1
boundingRect() reworked to use wide universal intrinsics
2019-01-18 18:31:54 +03:00
Vitaly Tuzov
78f80c35d2
Performance test for bounding rect estimation
2019-01-18 15:50:21 +03:00
Alexander Alekhin
ca00c1dce2
Merge pull request #13631 from terfendail:thresh_wintr
2019-01-16 15:45:26 +00:00
Alexander Alekhin
133eb8d13a
Merge pull request #13593 from brad-kelly:integral_avx512_ver34
2019-01-15 17:47:21 +00:00
Vitaly Tuzov
a202dc9a90
threshold() reworked to use wide universal intrinsics
2019-01-15 19:15:19 +03:00
Alexander Alekhin
0e9c90a0d9
Merge pull request #13610 from terfendail:morph_wintr
2019-01-15 11:22:00 +00:00
Brad Kelly
0165ffa90d
Implementing AVX512 support for 3 channel cv::integral for CV_64F
2019-01-14 16:11:01 -08:00
Vitaly Tuzov
012e43de4b
Morphology reworked to use wide universal intrinsics
2019-01-14 19:02:58 +03:00
Vitaly Tuzov
ea882d58c6
Added CV_ALWAYS_INLINE macro
2019-01-11 22:40:35 +03:00
catree
d745af6763
Add Matplotlib Perceptually Uniform Sequential colormaps (viridis, plasma, inferno, magma, cividis, twilight and twilight shifted).
2019-01-06 22:48:06 +01:00
Vitaly Tuzov
7beb24553a
Speedup filter2d by loop unrolling
...
Added filter2d tests for 16S
2018-12-25 14:40:48 +03:00
Alexander Alekhin
c0e11bb50e
imgproc: revert "Speedup filter2d by loop unrolling"
...
Commit: 124011c321
PR: https://github.com/opencv/opencv/pull/13392
Sobel filter with 16S/16U datatype is broken.
2018-12-22 05:37:29 +00:00
Alexander Alekhin
26c5b846e6
Merge pull request #13392 from terfendail:filter_wintr
2018-12-21 11:00:44 +00:00
Vitaly Tuzov
124011c321
Speedup filter2d by loop unrolling
2018-12-20 21:18:42 +03:00
Vitaly Tuzov
131c09cf76
Fixed medianBlur implementation for hi-resolution images
2018-12-19 18:05:42 +03:00
Vitaly Tuzov
06f32e3b3e
Reworked separable filter to use wide universal intrinsics
2018-12-19 17:50:09 +03:00
vishwesh5
3eb2c940de
Fix Scharr and Sobel functions
...
Resolves #13375
2018-12-17 20:39:22 +05:30
Rostislav Vasilikhin
d99a4af229
Merge pull request #13379 from savuor:color_5x5
...
RGB to/from Gray rewritten to wide intrinsics (#13379)
* 5x5 to RGB added
* RGB25x5 added
* Gray2RGB added
* Gray2RGB5x5 added
* vx_set moved out of loops
* RGB5x52Gray added
* RGB2Gray written
* warnings fixed (int -> (u)short conversion)
* warning fixed
* warning fixed
* "i < n-vsize+1" to "i <= n-vsize"
* RGBA2mRGBA vectorized
* try to fix ARM builds
* fixed ARM build for RGB2RGB5x5
* mRGBA2RGBA: saturation, vectorization
* fixed CL implementation of mRGBA2RGBA (saturation added)
2018-12-14 17:01:01 +03:00
Vitaly Tuzov
3903174f7c
Merge pull request #13334 from terfendail:histogram_wintr
...
* added performance test for compareHist
* compareHist reworked to use wide universal intrinsics
* Disabled vectorization for CV_COMP_CORREL and CV_COMP_BHATTACHARYYA if f64 is unsupported
2018-12-13 14:20:22 +03:00
Namgoo Lee
83c7dfb6a4
Fix error in LineIterator example code in doc
2018-12-05 11:31:19 +09:00
Alexander Alekhin
2d5ccc7b3e
imgproc(resize): update checks (static analyzers)
2018-12-03 13:13:48 +03:00
Alexander Alekhin
4e29e2fc7d
imgproc(test): fix resize bitexact test
...
- use "random" area on input image
- avoid duplicate cases
2018-11-30 16:38:07 +03:00
Alexander Alekhin
5ed7d5a5d9
imgproc: local "CV_Assert(totalSampleCount > 0)" check
2018-11-28 20:16:37 +00:00
Alexander Alekhin
b1064efb44
Merge pull request #13294 from terfendail:contours_wintr
2018-11-27 13:54:23 +00:00
Alexander Alekhin
83c8214b38
eliminate build warnings
2018-11-27 15:24:59 +03:00
Vitaly Tuzov
e991e05b9b
Added anonymous namespace to perf_contours
2018-11-27 11:35:40 +03:00
Alexander Alekhin
223893ea5a
Merge pull request #13242 from terfendail:contours_wintr
2018-11-26 12:29:31 +00:00
Vitaly Tuzov
e9e8bf4b81
Added performance tests for findContours
2018-11-21 19:57:02 +03:00
Vitaly Tuzov
e1a2c034e8
Updated findContours to use wide universal intrinsics
2018-11-21 19:57:02 +03:00
Vitaly Tuzov
9ad1a84853
Unrolled bilateral filter neighbor processing loop
2018-11-16 13:51:46 +03:00
Vitaly Tuzov
f5b6bea2d4
Raised bilateralFilter processing precision for CV_32F matrices containing NaNs
2018-11-16 12:07:04 +03:00
Alexander Alekhin
1c04a5ec47
Merge pull request #12965 from terfendail:medianBlur_wintr
2018-11-16 00:47:11 +00:00
Alexander Alekhin
42742727d6
imgproc(ocl): fix morph generic filter checks
...
'ksize' is not updated with 'kernel'
2018-11-14 20:15:01 +03:00
Vitaly Tuzov
2dd98e7cc6
bilateralFilter implementation moved to separate file
2018-11-09 18:26:26 +03:00
Vitaly Tuzov
28fd967148
Updated bilateralFilter implementations to use wide universal intrinsics
2018-11-09 15:27:30 +03:00
tompollok
2da56d5af6
refactoring catching all exceptions as const ref
2018-11-08 19:59:47 +03:00
Alexander Alekhin
b74b05d1b3
Revert CV_TRY/CV_CATCH macros
...
This reverts commit 7349b8f5ce (partially).
2018-11-08 19:56:52 +03:00
Vitaly Tuzov
e5d7f446d6
Merge pull request #13056 from terfendail:box_wintr
...
* Updated boxFilter implementations to use wide universal intrinsics
* boxFilter implementation moved to separate file
* Replaced ROUNDUP macro with roundUp() function
2018-11-07 23:59:36 +03:00
Alexander Alekhin
4531f9f2f4
Merge pull request #13023 from terfendail:medianBlur_sep
2018-11-06 20:22:08 +00:00
lqy123000
cceeca3052
Merge pull request #12916 from lqy123000:bugfix_templmatch
...
* avoid rounding errors
* imgproc: replace condition in matchTemplate
2018-11-06 19:13:48 +03:00
Vitaly Tuzov
877de883b0
medianBlur() implementation moved to separate file
2018-11-02 16:28:23 +03:00
Vitaly Tuzov
0fda551dbc
Updated medianBlur implementations to use wide universal intrinsics
2018-11-02 12:26:23 +03:00
Suleyman TURKMEN
4d0ed5c13c
Merge pull request #12971 from sturkmen72:upd_imgproc_hpp
...
* Update imgproc.hpp
* update color conversion codes
2018-10-31 18:08:24 +03:00
Rostislav Vasilikhin
fa91d621fa
Merge pull request #12876 from savuor:color_rgb2rgb_wide
...
* RGB2RGB initially rewritten
* NEON impl removed
* templated version added for ushort, float
* data copying allowed for RGB2RGB
* inplace processing fixed
* fields to local vars
* no zeroupper until it's fixed
* vx_cleanup() added back
2018-10-30 18:36:23 +03:00
maver1
e397434cb6
Merge pull request #12877 from maver1:3.4
...
* Updated ICV packages and IPP integration
* core(test): minMaxIdx IPP regression test
* core(ipp): workaround minMaxIdx problem
* core(ipp): workaround meanStdDev() CV_32FC3 buffer overrun
* Returned semicolon after CV_INSTRUMENT_REGION_IPP()
2018-10-24 15:02:53 +03:00
Michał Janiszewski
c8e6ce304f
Catch exceptions by const-reference
...
Exceptions caught by value incur needless cost in C++. Most of them can
be caught by const-reference, especially as nearly none are actually
used. This could allow the compiler to generate slightly more efficient code.
2018-10-16 22:43:54 +02:00
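A minimal sketch of the pattern described in commit c8e6ce304f above, not code taken from the OpenCV tree. The helper names failing_step and describe_failure are hypothetical and exist only for illustration. Catching by const reference avoids copying the exception object and keeps its dynamic type, so what() still reports the message of the derived exception.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: always throws an exception of a derived type.
inline void failing_step() {
    throw std::out_of_range("index 7 is out of range");
}

// Hypothetical helper: runs the failing step and reports what went wrong.
inline std::string describe_failure() {
    try {
        failing_step();
    } catch (const std::exception& e) {
        // Caught by const reference: no copy of the exception object is made
        // and the object is not sliced down to std::exception.
        return e.what();
    }
    return "no error";
}
```

Calling describe_failure() returns the message passed to std::out_of_range. A handler written as catch (std::exception e) would instead copy-construct a sliced base object.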
Alexander Alekhin
1cc3f7abbb
Merge pull request #12516 from seiko2plus:changeUnvMultiply16
2018-10-15 12:07:40 +00:00
tompollok
0b77600718
change area() emptiness checks to empty()
2018-10-13 21:35:10 +02:00
Alexander Alekhin
0d63c4c28e
Merge pull request #12811 from take1014:resize_large_image
2018-10-12 16:59:25 +00:00
take1014
24af70c7e0
resolves 11283
2018-10-12 23:08:25 +09:00
Sayed Adel
9dc1d388af
imgproc: Enable VSX on pyrDown & pyrUp
2018-10-11 23:03:57 +00:00
Alexander Alekhin
53785b6ac6
Merge pull request #12784 from terfendail:pyramids_wintr
2018-10-11 19:26:36 +00:00
Alexander Alekhin
2332fb852d
Merge pull request #12748 from terfendail:resize_wintr2
2018-10-11 19:26:17 +00:00
Sayed Adel
8965f3ae06
imgproc:simd Enable VSX and wide universal intrinsics for accumulate operations
...
- improve cpu dispatching calls to allow more SIMD extensions
(SSE4.1, AVX2, VSX)
- wide universal intrinsics
- replace dummy v_expand with v_expand_low
- replace v_expand + v_mul_wrap with v_mul_expand for product accumulate operations
- use FMA for accumulate operations
- add mask and more types to accumulate's performance tests
2018-10-11 04:37:12 +02:00
Sayed Adel
5771fd693d
Change behaviour of 16-bit multiply operator
...
- redefine 16-bit multiply operator to perform saturating multiply
instead of non-saturating multiply
- implement 8-bit multiply operator to perform saturating multiply
- implement v_mul_wrap() for 8-bit, 16-bit non-saturating multiply
- improve performance of v_mul_hi() for VSX
- update intrin tests with new changes
- replace unv 16-bit multiplication operator with v_mul_wrap due to behavior changes
- Several improvements based on vpisarev's review
* initial forward declarations for universal intrinsics
* move emulating SSE intrinsics into separate file
* implement v_mul_expand for 8-bit
* reimplement saturating multiply using v_mul_expand + v_pack
* map v_expand, v_load_expand, v_load_expand_q to sse4.1
* fix overflow avx2::v_pack(uint32)
* implement two universal intrinsics v_expand_low and v_expand_high
2018-10-11 04:35:39 +02:00
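The difference between the saturating and the non-saturating (wrapping) 16-bit multiply described in commit 5771fd693d above can be illustrated with a scalar sketch. This is not the universal intrinsics implementation: mul_saturate_u16 and mul_wrap_u16 are hypothetical names, and the code handles one unsigned lane at a time. The saturating variant follows the "widen, then narrow with saturation" idea that the commit attributes to v_mul_expand + v_pack.

```cpp
#include <cstdint>

// Hypothetical scalar model of a saturating 16-bit multiply:
// widen to 32 bits, multiply, then clamp to the 16-bit range.
inline std::uint16_t mul_saturate_u16(std::uint16_t a, std::uint16_t b) {
    // Cast before multiplying: uint16_t operands would otherwise promote
    // to int, and 65535 * 65535 overflows a 32-bit int.
    const std::uint32_t wide =
        static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
    return wide > 0xFFFFu ? static_cast<std::uint16_t>(0xFFFFu)
                          : static_cast<std::uint16_t>(wide);
}

// Hypothetical scalar model of a wrapping 16-bit multiply:
// keep only the low 16 bits of the product.
inline std::uint16_t mul_wrap_u16(std::uint16_t a, std::uint16_t b) {
    const std::uint32_t wide =
        static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
    return static_cast<std::uint16_t>(wide & 0xFFFFu);
}
```

For 300 * 300 = 90000, the saturating version yields 65535 while the wrapping version yields 90000 - 65536 = 24464. The two agree whenever the product fits in 16 bits, which is why callers that relied on the old wrapping behaviour had to switch to v_mul_wrap.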
Vitaly Tuzov
cc10e6b344
pyrDown and pyrUp SSE2 implementations replaced with wide universal intrinsics implementations
2018-10-10 21:12:47 +03:00
Apoorv Goel
b8aa0cddab
Merge pull request #12777 from UnderscoreAsterisk:document-cvtColorTwoPlane
...
* Add documentation for cvtColorTwoPlane
* Change brief and add links
2018-10-09 15:49:17 +03:00
Vitaly Tuzov
9d602f2752
Replaced SSE2 area resize implementation with wide universal intrinsic implementation
2018-10-08 16:27:52 +03:00
Vitaly Tuzov
6b84990620
integral() implementation updated to utilize wide universal intrinsics
2018-10-01 17:25:43 +03:00
Alexander Alekhin
70f38b4dfa
Merge pull request #12510 from take1014:doc_hough
2018-09-17 21:39:12 +03:00
Alexander Alekhin
92ec971453
Merge pull request #12526 from terfendail:avx2_resize_fix
2018-09-14 15:57:47 +00:00
Hamdi Sahloul
5d54def264
Add semicolons after CV_INSTRUMENT macros
2018-09-14 06:45:31 +09:00
Vitaly Tuzov
29770e13e8
Fixed bit-exact resize SIMD implementation for AVX2 baseline
2018-09-13 18:20:27 +03:00
Mark Harfouche
095b0d3272
Fix BayerXX2RGBA when blue is on the first line.
2018-09-12 16:06:44 -04:00
take1014
57ae3ac7a2
fix document about HoughLines
2018-09-12 22:18:30 +09:00
Mark Harfouche
53bbed89ae
Output RGBA images when bayer_xx2YYYA is called
2018-09-07 16:03:04 -04:00
Hamdi Sahloul
a39e0daacf
Utilize CV_UNUSED macro
2018-09-07 20:33:52 +09:00
Alexander Alekhin
bd98ed46bd
Merge pull request #12446 from alalek:imgproc_grabcut_numeric_issues
2018-09-06 20:18:45 +00:00
Alexander Alekhin
24e72e151a
imgproc: grabcut numeric stability
2018-09-06 17:05:54 +03:00
Alexander Alekhin
8a3c394d6a
don't use constructors for C API structures
2018-09-06 14:34:16 +03:00
Alexander Alekhin
ad146e5a6b
core: remove constructors from C API structures
...
POD structures can't have constructors.
2018-09-06 14:34:09 +03:00
Alexander Alekhin
e70526625f
imgproc: fix Subdiv2D::getTriangleList()
2018-09-05 16:24:27 +03:00
Alexander Alekhin
7f7f30a08b
Merge pull request #12406 from alalek:backport_12357_12391
2018-09-04 16:09:44 +00:00