Commit Graph

3051 Commits

Author SHA1 Message Date
Alexander Alekhin
92b9888837 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-12 13:02:19 +03:00
Paul Murphy
1c4a64f0a1 Merge pull request #16138 from pmur:reg_16137
* imgproc: Prevent 1B overrun of 8C3 SIMD optimization

The fourth value read via v_load_q is essentially ignored,
but can cause trouble if it happens to cross page boundaries.

The final few iterations may attempt to read the most extreme
elements of S, which will read 1B beyond the array in most
aligment cases. Dynamically compute the stop. This could be
hoised from the loop, but will require a more extensive change.

Likewise, cleanup the iteration increment statements to make
it more obvious they do channel count (3) elements per pass.

This should resolve #16137

* imgproc(resize): extra check
2019-12-12 13:00:44 +03:00
shimat
b89581960c s/Voroni/Voronoi/g 2019-12-11 09:13:58 +09:00
Maksim Shabunin
435c97c7a2 imgproc: add parameter checks in calcHist and calcBackProj 2019-12-10 16:10:19 +03:00
RAJKIRAN NATARAJAN
b9435b9e38 Merge pull request #16094 from saskatchewancatch:issue-16053
* Add eps error checking for approxPolyDP to allow sensible values only
for epsilon value of Douglas-Peucker algorithm.

* Review changes for PR
2019-12-09 22:24:35 +03:00
Paul Murphy
a011035ed6 Merge pull request #15257 from pmur:resize
* resize: HResizeLinear reduce duplicate work

There appears to be a 2x unroll of the HResizeLinear against k,
however the k value is only incremented by 1 during the unroll. This
results in k - 1 duplicate passes when k > 1.

Likewise, the final pass may not respect the work done by the vector
loop. Start it with the offset returned by the vector op if
implemented. Note, no vector ops are implemented today.

The performance is most noticable on a linear downscale. A set of
performance tests are added to characterize this.  The performance
improvement is 10-50% depending on the scaling.

* imgproc: vectorize HResizeLinear

Performance is mostly gated by the gather operations
for x inputs.

Likewise, provide a 2x unroll against k, this reduces the
number of alpha gathers by 1/2 for larger k.

While not a 4x improvement, it still performs substantially
better under P9 for a 1.4x improvement. P8 baseline is
1.05-1.10x due to reduced VSX instruction set.

For float types, this results in a more modest
1.2x improvement.

* Update U8 processing for non-bitexact linear resize

* core: hal: vsx: improve v_load_expand_q

With a little help, we can do this quickly without gprs on
all VSX enabled targets.

* resize: Fix cn == 3 step per feedback

Per feedback, ensure we don't overrun. This was caught via the
failure observed in Test_TensorFlow.inception_accuracy.
2019-12-09 14:54:06 +03:00
Alexander Alekhin
734de34b7a
Merge pull request #16085 from alalek:imgproc_threshold_to_zero_ipp_bug
* imgproc(IPP): wrong result from threshold(THRESH_TOZERO)

* imgproc(IPP): disable IPP code to pass THRESH_TOZERO test
2019-12-09 14:51:02 +03:00
Alexander Alekhin
b369c456f2 imgproc(color): clarify error message 2019-12-06 13:25:51 +03:00
Brian Wignall
af997529a1 Fix some typos 2019-11-26 18:41:19 +03:00
Brian Wignall
9276f1910b Fix some typos 2019-11-25 19:55:07 -05:00
Alexander Alekhin
ad0ab4109a Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-11-22 22:47:13 +00:00
Everton Constantino
75315fb297 Merge pull request #15494 from everton1984:hal_vector_get_n
Improving VSX performance of integral function

* Adding support for vector get function on VSX datatypes so the
integral function gains a bit of performance.

* Removing get as a datatype member function and implementing a new HAL
instruction v_extract_n to get the n-th element of a vector register.

* Adding SSE/NEON/AVX intrinsics.

* Implement new HAL instruction v_broadcast_element on VSX/AVX/NEON/SSE.

* core(simd): add tests for v_extract_n/v_broadcast_element

- updated docs
- commented out code to repair compilation
- added WASM and MSA default implementations

* core(simd): fix compilation

- x86: avoid _mm256_extract_epi64/32/16/8 with MSVS 2015
- x86: _mm_extract_epi64 is 64-bit only

* cleanup
2019-11-20 13:41:07 +03:00
Alexander Alekhin
318cba4ce3 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-11-19 19:48:49 +00:00
clunietp
2185bce4b7 Fix 13577 2019-11-18 07:41:34 -05:00
Alexander Alekhin
fc41c18c6f Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-11-18 13:56:24 +03:00
Alexander Alekhin
f4d55d512f
imgproc: fix bit-exact GaussianBlur() / sepFilter2D() (#15855)
* imgproc: fix bit-exact GaussianBlur() / sepFilter2D()

- avoid kernels with bad approximation
- GaussiabBlur - apply error-diffusion approximation for kernel (8-bit fraction)

* java(test): update features2d ref data

* test: update test_facedetect
2019-11-18 01:39:27 +03:00
Alexander Alekhin
686ea5c1a6 Merge pull request #15917 from ChipKerchner:demosaicingToHal2 2019-11-16 19:45:37 +00:00
ChipKerchner
1d33335e33 Convert demosiacing with variable number of gradients to HAL - 5.5x faster 2019-11-15 07:42:03 -06:00
Alexander Alekhin
6773b938b3 Merge pull request #15896 from alalek:build_gcc_9 2019-11-14 14:22:02 +00:00
Alexander Alekhin
763b80d5fa imgproc(IPP): disable ippiDistanceTransform_3x3_8u32f_C1R 2019-11-13 14:14:19 +03:00
Alexander Alekhin
7ecdcf6ca6 build: GCC9 compilation 2019-11-12 18:49:34 +03:00
Alexander Alekhin
b6a58818bb Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-11-11 20:25:42 +00:00
Chip Kerchner
2112aa31e6 Merge pull request #15828 from ChipKerchner:momentsToHal
* Convert moments in tile algorithms to HAL (1.3x faster for VSX).

* Adding NEON code back in for non 64-bit platforms.

* Remove floats from post processing.
2019-11-05 18:52:35 +03:00
Alexander Alekhin
0d7f770996 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-11-04 09:58:29 +00:00
Ciprian Alexandru Pitis
d2e02779c4 Merge pull request #15799 from Cpitis:feature/parallelization
Parallelize pyrDown & calcSharrDeriv

* ::pyrDown has been parallelized

* CalcSharrDeriv parallelized

* Fixed whitespace

* Set granularity based on amount of threads enabled

* Granularity changed to cv::getNumThreads, now each thread should receive 1/n sized stripes

* imgproc: move PyrDownInvoker<CastOp>::operator() implementation

* imgproc(pyramid): remove syloopboundary()

* video: SharrDerivInvoker replace 'Mat*' => 'Mat&' fields
2019-10-31 23:38:49 +03:00
Alexander Alekhin
ea5499fa51 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-10-29 20:46:51 +00:00
Alexander Alekhin
bad4e5c3eb Merge pull request #15692 from alalek:core_tls_handle_thread_termination 2019-10-29 20:40:35 +00:00
Alexander Alekhin
7cf1054d36 Merge pull request #15764 from ChipKerchner:demosaicingToHal 2019-10-25 13:49:46 +00:00
Alexander Alekhin
055ffc0425 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-10-24 18:21:19 +00:00
Alexander Alekhin
17e2bf5717 core(tls): implement releasing of TLS on thread termination
- move TLS & instrumentation code out of core/utility.hpp
- (*) TLSData lost .gather() method (to dispose thread data on thread termination)
- use TLSDataAccumulator for reliable collecting of thread data
- prefer using of .detachData() + .cleanupDetachedData() instead of .gather() method

(*) API is broken: replace TLSData => TLSDataAccumulator if gather required
(objects disposal on threads termination is not available in accumulator mode)
2019-10-24 06:36:18 +00:00
ChipKerchner
c46f119e0e Convert demosaic functions to HAL 2019-10-23 10:47:07 -05:00
Steve Nicholson
acb3b3bd4d Add documentation and example program for intersectConvexConvex 2019-10-19 22:08:07 -07:00
jasjuang
4c7db02925 document CC_STAT_MAX in ConnectedComponentsTypes 2019-10-16 17:22:25 -07:00
Everton Constantino
9ca9249992 Merge pull request #15527 from everton1984:faster_acc
* Adding support for vectorized masking for uchar/ushort.

* Fixing bug where mask was zeroing the dst. Improved the way to calculate
the mask and tweaked for further performance improvements.

* Fixing mask comparison test.

* Restricting to one channel.

* Adding support for 3 channels, switch old approach to start using HAL's
v_select.
2019-10-11 18:32:59 +03:00
Alexander Alekhin
65573784c4 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-10-09 19:46:18 +00:00
Alexander Alekhin
4748aca61f
Merge pull request #15642 from alalek:issue_15597 2019-10-08 00:33:20 +03:00
Alexander Alekhin
a007220c52 imgproc: update histogram test 2019-10-07 15:06:43 +03:00
Alexander Alekhin
626bfbf309 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-10-05 15:45:31 +00:00
Alexander Alekhin
f301f17b61 imgproc: accurate histogram value thresholding 2019-10-04 19:56:25 +03:00
Alexander Alekhin
c69245da1f imgproc: fix fitLine() implementation
- update optimal solutions on each iteration
2019-10-03 21:23:52 +00:00
Alexander Alekhin
3fb6617d62 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-10-02 17:49:19 +03:00
Alexander Alekhin
507ca291e1 Merge pull request #12670 from alalek:imgproc_getRotationMatrix2D_return_type 2019-09-28 18:03:34 +00:00
Alexander Alekhin
f81e401cd0 imgproc: fix indexing issue in pyramids
UBSAN violation expression: 'tab = tabR - x;'
2019-09-26 18:09:47 +03:00
Alexander Alekhin
e2a5a6a05c Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-09-25 18:32:44 +00:00
Vitaly Tuzov
1c17b3281a Fixed OOB reading in pyrDown 2019-09-25 13:24:17 +03:00
Vitaly Tuzov
7b3a752012 Fixed universal intrinsic undistort() implementation 2019-09-16 17:16:38 +03:00
Alexander Alekhin
b4c5b50a3e Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-09-13 17:15:45 +00:00
Alexander Alekhin
e7b6753a10 imgproc: avoid manual memory allocation in connectedcomponents.cpp 2019-09-05 16:20:08 +03:00
Alexander Alekhin
bea2c75452 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-09-05 14:29:22 +03:00
Everton Constantino
76e403cf25 Merge pull request #15440 from everton1984:new_integral_tests
* Adding all possible data type interactions to the perf tests since some
use SIMD acceleration and others do not.

* Disabling full tests by default.

* Giving proper names, removing magic numbers and sanity checks of new
performance tests for the integral function.

* Giving proper names, making array static.
2019-09-04 19:14:00 +03:00