Bit exact gaussian blur for 16bit unsigned int
* bit-exact gaussian kernel for CV_16U
* SIMD optimization
* template GaussianBlurFixedPoint
* remove template specialization
* simd support for h3N121 uint16
* test for u16 gaussian blur
* remove unnecessary comments
* fix return type of raw()
* add typedef of native internal type in fixedpoint
* update return type of raw()
Bit-exact Nearest Neighbor Resizing
* bit exact resizeNN
* change the value of method enum
* add bitexact-nn to ResizeExactTest
* test to compare with non-exact version
* add perf for bit-exact resizenn
* use cvFloor-equivalent
* 1/3 scaling is not stable for floating calculation
* stricter test
* bugfix: broken data in case of 6 or 12bytes elements
* bugfix: broken data in default pix_size
* stricter threshold
* use raw() for floor
* use double instead of int
* follow code reviews
* fewer cases in perf test
* center pixel convention
Clarify component statistics documentation
* Change ConnectedComponentsTypes documentation
Change from "algorithm output formats" to "statistics" because it specifies types of statistics, not formats.
* Documentation: clarify component statistics
Explain that ConnectedComponentTypes selects a statistic.
* improved fitEllipse and fitEllipseDirect accuracy in singular or close-to-singular cases (see issue #9923)
* scale points using double precision
* added normalization to fitEllipseAMS as well; fixed Java test case by raising the tolerance (it's unclear what is the correct result in this case).
* improved point perturbation a bit. make the code a little bit more clear
* trying to fix Java fitEllipseTest by slightly raising the tolerance threshold
* synchronized C++ version of Java's fitEllipse test
* removed trailing whitespaces
* Vectorize calculating integral for line for single and multiple channels
* Single vector processing for 4-channels - 25-30% faster
* Single vector processing for 4-channels - 25-30% faster
* Fixed AVX512 code for 4 channels
* Disable 3 channel 8UC1 to 32S for SSE2 and SSE3 (slower). Use new version of 8UC1 to 64F for AVX512.
fixed the ordering of contour convex hull points
* partially fixed the issue #4539
* fixed warnings and test failures
* fixed integer overflow (issue #14521)
* added comment to force buildbot to re-run
* extended the test for the issue 4539. Check the expected behaviour on the original contour as well
* added comment; fixed typo, renamed another variable for a little better clarity
* added yet another part to the test for issue #4539, where we run convexHull and convexityDetects on the original contour, without any manipulations. the rest of the test stays the same
* fixed several problems when running tests on Mac:
* OCL_pyrUp
* OCL_flip
* some basic UMat tests
* histogram badarg test (out of range access)
* retained the storepix fix in ocl_flip only for 16U/16S datatype, where the OpenCL compiler on Mac generates incorrect code
* moved deletion of ACCESS_FAST flag to non-SVM branch (where SVM is shared virtual memory (in OpenCL 2.x), not support vector machine)
* force OpenCL to use read/write for GPU<=>CPU memory transfers on machines with discrete video only on Macs. On Windows/Linux the drivers are seemingly smart enough to implement map/unmap properly (and maybe more efficiently than explicit read/write)
* Fix NN resize with dimentions > 4
* add test check for nn resize with channels > 4
* Change types from float to double
* Del unnecessary test file. Move nn test to test_imgwarp. Add 5 channels test only.
Actually, we can do this in constant time. xofs always
contains same or increasing offset values. We can instead
find the most extreme value used and never attempt to load it.
Similarly, we can note for all dx >= 0 and dx < (dwidth - cn)
where xofs[dx] + cn < xofs[dwidth-cn] implies dx < (dwidth - cn).
Thus, we can use this to control our loop termination optimally.
This fixes#16137 with little or no performance impact. I have
also added a debug check as a sanity check.