OpenCV pthreads-based implementation changes:
- rework worker thread pool, allow the main thread to execute jobs too
- rework synchronization scheme (wait for job completion, threads 'pong' answer is not required)
- allow "active wait" (spin) by worker threads and by the main thread
- use _mm_pause() during active wait (support for Hyper-Threading technology)
- use sched_yield() to avoid preempting other workers that are still working
- don't use getTickCount()
- optional builtin thread pool profiler (disabled by compilation flag)
Allows Stitcher to be used for scans from within Python.
I had to use very strange notation because I couldn't export the `enum`
`Mode`, which made the generated CPython code unable to compile.
```c++
class Stitcher {
public:
enum Mode
{
PANORAMA = 0,
SCANS = 1,
};
...
```
Also removed duplicate code from the `createStitcher` function making
use of the `Stitcher::create` function
* Bit-exact implementation of GaussianBlur smoothing
* Added universal intrinsics based implementation for bit-exact CV_8U GaussianBlur smoothing.
* Added parallel_for to evaluation of bit-exact GaussianBlur
* Added custom implementations for 3x3 and 5x5 bit-exact GaussianBlur
* cuda::cvtColor bug fix
Fixed bug in conversion formula between RGB space and LUV space.
Testing with opencv_test_cudaimgproc.exe, this commit reduces the number
of failed tests from 191 to 95. (96 more tests pass)
* Rename variables
Documentation generation refactoring (#10621)
* Documentation build updates:
- disable documentation by default, do not add to ALL target
- combine Doxygen and Javadoc
- optimize Doxygen html
* javadoc: fix path in build directory
* cmake: fix "Documentation" status line
* Newton's method can be more efficient
When we take the result of distortPoint for the point (0, 0) and pass it to undistortPoint, we do not get (0, 0) back. We then discovered that the old method sometimes does not converge. Newton's method finally gave the right values.
* modify by advice Newton's method...#10574
* calib3d(fisheye): fix codestyle, update theta before exit EPS check
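As a rough illustration of the Newton iteration mentioned above, here is a minimal standalone sketch. It assumes the fisheye polynomial theta_d = theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8); the function names `distortTheta` and `undistortTheta`, the iteration limit and the tolerance are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cmath>

// Forward model (assumed): theta_d = theta * (1 + k1*t^2 + k2*t^4 + k3*t^6 + k4*t^8).
double distortTheta(double theta, const double k[4])
{
    const double t2 = theta * theta;
    return theta * (1.0 + t2 * (k[0] + t2 * (k[1] + t2 * (k[2] + t2 * k[3]))));
}

// Hypothetical helper: recover theta from theta_d with Newton's method.
double undistortTheta(double theta_d, const double k[4],
                      int maxIter = 20, double eps = 1e-12)
{
    double theta = theta_d;  // initial guess
    for (int i = 0; i < maxIter; ++i)
    {
        const double t2 = theta * theta;
        const double f = distortTheta(theta, k) - theta_d;
        // Derivative of the forward model:
        // 1 + 3*k1*t^2 + 5*k2*t^4 + 7*k3*t^6 + 9*k4*t^8
        const double df = 1.0 + t2 * (3.0 * k[0] + t2 * (5.0 * k[1]
                        + t2 * (7.0 * k[2] + t2 * 9.0 * k[3])));
        const double step = f / df;
        theta -= step;               // update theta before the EPS exit check
        if (std::fabs(step) < eps)
            break;
    }
    return theta;
}
```

With this sketch, an input of 0 stays at 0 after one iteration, which is the round-trip property the commit message describes for the point (0, 0).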
UMatData locks are not mapped on real locks (they are mapped to some "pre-initialized" pool).
Concurrent execution of these statements may lead to deadlock:
- a.copyTo(b) from thread 1
- c.copyTo(d) from thread 2
where:
- 'a' and 'd' are mapped to single lock "A".
- 'b' and 'c' are mapped to single lock "B".
Workaround is to process locks with strict order.
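The strict-order workaround can be sketched as follows. This is only an illustration using `std::mutex`: the pool `g_lockPool`, its size and the helpers `lockPair`/`unlockPair` are hypothetical names, not the UMatData implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>

// Hypothetical pool of pre-initialized locks; several objects may share one slot.
static const std::size_t kPoolSize = 4;
static std::mutex g_lockPool[kPoolSize];

// Lock two pool slots in ascending index order, so that every thread
// acquires any given pair of locks in the same order.
// Returns the (first, second) indices in the order they were locked.
std::pair<std::size_t, std::size_t> lockPair(std::size_t i, std::size_t j)
{
    const std::size_t first = i < j ? i : j;
    const std::size_t second = i < j ? j : i;
    g_lockPool[first].lock();
    if (second != first)          // both objects may map to the same lock
        g_lockPool[second].lock();
    return std::make_pair(first, second);
}

// Unlock in reverse order of acquisition.
void unlockPair(std::size_t i, std::size_t j)
{
    const std::size_t first = i < j ? i : j;
    const std::size_t second = i < j ? j : i;
    if (second != first)
        g_lockPool[second].unlock();
    g_lockPool[first].unlock();
}
```

In the scenario above, thread 1 would ask for (A, B) and thread 2 for (B, A), but both end up locking the lower-indexed slot first, so neither can hold one lock while waiting for the other.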
- fix imports override.
Problem is observed with BoostDesc.
- add Ptr<> handling (constructor is protected from other packages).
Observed in ximgproc:
"Ptr<StereoMatcher> createRightMatcher(Ptr<StereoMatcher> matcher_left)"
where "StereoMatcher" is from another package (calib3d)
Adding capability to parse subsections of a byte array in Java bindings (#10489)
* Adding capability to parse subsections of a byte array in Java bindings. (Java lacks pointers, so reading an image from a subsection of a byte array is not possible directly. Because of this, many IO functions in Java take additional offset and length parameters to define which section of an array is to be read.)
* Corrected according to the review. Previous interfaces were restored; instead, internal interfaces were modified to provide subsampling of Java byte arrays.
* Adding tests and test related files.
* Adding missing files for the test.
* Simplified the test
* Check was corrected according to discussion. An OutOfRangeException will be thrown instead of returning.
* java: update MatOfByte implementation checks / tests
v2: fix stray trailing whitespace
v3: only allow up to one property window at a time
Opening multiple windows in the same process will just confuse
the camera filter or outright crash.
Suggested-by: @alalek
Also return whether a dialog was opened at the time.
* added write as pbm
* add tests for pbm
* imgcodecs: PBM support
- drop additional PBM parameters
- write: fix P1/P4 mode (no maxval 255 value after width/height)
- write: invert values for P1/P4
- write: P1: compact ASCII mode (no spaces)
- simplify pbm test
- drop .pxm extension (http://netpbm.sourceforge.net/doc/ doesn't know such extension)
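A minimal sketch of the plain (P1) output described in the list above: no maxval after width/height, inverted values, and no spaces between pixels. The helper name `encodePbmP1` and the "nonzero pixel means white" input convention are assumptions for illustration, not the imgcodecs code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: serialize a row-major single-channel image to plain PBM (P1).
// In PBM, '1' is black and '0' is white, so a nonzero (white) input pixel is
// written as '0' and a zero (black) pixel as '1'.
std::string encodePbmP1(const std::vector<unsigned char>& pixels,
                        std::size_t width, std::size_t height)
{
    // Header: magic number, then width and height. No maxval line for P1/P4.
    std::string out = "P1\n" + std::to_string(width) + " " + std::to_string(height) + "\n";
    for (std::size_t y = 0; y < height; ++y)
    {
        for (std::size_t x = 0; x < width; ++x)
            out.push_back(pixels[y * width + x] ? '0' : '1');  // compact: no spaces
        out.push_back('\n');
    }
    return out;
}
```

For a 2x2 image with pixels {255, 0, 0, 255} this yields the header followed by the rows `01` and `10`. The sketch writes one image row per text line and does not wrap long lines, which a real writer may need to do.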
Hough many circles (#10232)
* Add Hui's optimization. Merge with latest changes in OpenCV.
* Use conditional compilation instead of a runtime flag.
* Whitespace.
* Create the sequence for the nonzero edge pixels only if using that approach.
* Improve performance for finding very large numbers of circles
* Return the circles with the larger accumulator values first, as per API documentation.
Use a separate step to check distance between circles. Allows circles to be sorted by strength first. Avoids locking in EstimateRadius which was slowing it down.
Return centers only if maxRadius == 0 as per API documentation.
* Sort the circles so results are deterministic. Otherwise the order of circles with the same strength depends on parallel processing completion order.
* Add test for HoughCircles.
* Add beads test.
* Wrap the non-zero points structure in a common interface so the code can use either a vector or a matrix.
* Remove the special case for skipping the radius search if maxRadius==0.
* Add performance tests.
* Use NULL instead of nullptr.
OpenCV should compile with C++98 compiler.
* Put test suite name first.
Use different test suite names for each test to avoid an error from the test runner.
* Address build bot errors and warnings.
* Skip radius search if maxRadius < 0.
* Dynamically switch to NZPointList when it will be faster than NZPointSet.
* Fix compile error: missing 'typename' prior to dependent type name.
* Fix compile error: missing 'typename' prior to dependent type name.
This time fix it the non C++ 11 way.
* Fix compile error: no type named 'const_reference' in 'class cv::NZPointList'
* Disable ManySmallCircles tests. Failing on Mac.
* Change beads image to JPEG for smaller file size.
Try enabling the ManySmallCircles tests again.
* Remove ManySmallCircles tests. They are failing on the Mac build.
* Fix expectations to check all circles.
* Changing case on a case-insensitive file system
Step 1: remove the old file names
* Changing case on a case-insensitive file system
Step 2: add them back with the new names
* Fix cmpAccum function to be strictly weak ordered.
* Add tests for many small circles.
* imgproc(perf): fix HoughCircles tests
* imgproc(houghCircles): refactor code
- simplify NZPointList
- drop broken (de-synchronization of 'current'/'mi' fields) NZPointSet iterator
- NZPointSet iterator is replaced to direct area scan
- use SIMD intrinsics
- avoid std exceptions (build for embedded systems)
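The deterministic, strongest-first ordering with a strictly weak comparator mentioned above could look roughly like this. `Candidate`, `strongerFirst` and `sortCandidates` are hypothetical names; the tie-break on index is an assumption about how equal accumulator values might be ordered.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical circle-center candidate: accumulator value and position index.
struct Candidate
{
    int accum;
    int index;
};

// Strict weak ordering: larger accumulator first; ties broken by smaller index,
// so the result does not depend on the order in which candidates were produced.
bool strongerFirst(const Candidate& a, const Candidate& b)
{
    if (a.accum != b.accum)
        return a.accum > b.accum;
    return a.index < b.index;
}

// Returns a sorted copy of the candidates.
std::vector<Candidate> sortCandidates(std::vector<Candidate> c)
{
    std::sort(c.begin(), c.end(), strongerFirst);
    return c;
}
```

Using `>` rather than `>=` in the comparator matters: a comparator that returns true for two equal elements is not a strict weak ordering, and `std::sort` has undefined behavior with it.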
This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.
AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Fix the "initializing global variables with values that are not
compile-time constants" issue in the Intel SDK for OpenCL. The root cause
is that a global variable initialized with a value requires that value to
be a compile-time constant.
Thanks to Zheng, Yang <yang.zheng@intel.com> and
Chodor, Jaroslaw <jaroslaw.chodor@intel.com> for their help.
Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
Signed-off-by: Jun Zhao <jun.zhao@intel.com>
The opencv infrastructure mostly has the basics for supporting avx512 math functions,
but it wasn't hooked up (likely due to lack of users)
In order to compile the DNN functions for AVX512, a few things need to be hooked up
and this patch does that
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
* Simulated Annealing for ANN_MLP training method
* EXPECT_LT
* just to test new data
* manage RNG
* Try again
* Just run buildbot with new data
* try to understand
* Test layer
* New data- new test
* Force RNG in backprop
* Use Impl to avoid virtual method
* reset all weights
* try to solve ABI
* retry
* ABI solved?
* till problem with dynamic_cast
* Something is wrong
* Solved?
* disable backprop test
* remove ANN_MLP_ANNEALImpl
* Disable weight in varmap
* Add example for SimulatedAnnealing
* Fix issue #10114
Convert table change
From:
CV_8U -> HALF
CV_8S -> HALF
CV_16U -> UINT
CV_16S -> UINT
CV_32S -> UINT
CV_32F -> FLOAT
To:
CV_8U -> HALF
CV_8S -> HALF
CV_16U -> UINT
CV_16S -> FLOAT
CV_32S -> FLOAT (loses precision)
CV_32F -> FLOAT
Signed integers can't be represented well with UINT. Even with a bias adjustment, CV_16S and CV_32S would be confused when loaded from an EXR file.
Also fix incorrect handling of negative CV_8S values
* EXR import and export
imread() from EXR returns CV_32F only
imwrite() accepts CV_32 cv::Mat only and stores FLOAT images by default. Add imwrite() flag to store in HALF format.
* fix compiling error
* clean up
* fix EXR import issues
* remove raw SSE2/NEON implementation from convert.cpp
* remove raw implementation from Cvt_SIMD
* remove raw implementation from cvtScale_SIMD
* remove raw implementation from cvtScaleAbs_SIMD
* remove duplicated implementation cvt_<float, short>
* remove duplicated implementation cvtScale_<short, short, float>
* add "from double" version of Cvt_SIMD
* modify the condition of test ConvertScaleAbs
* Update convert.cpp
fixed crash in cvtScaleAbs(8s=>8u)
* fixed compile error on Win32
* fixed several test failures because of accuracy loss in cvtScale(int=>int)
* fixed NEON implementation of v_cvt_f64(int=>double) intrinsic
* another attempt to fix test failures
* keep trying to fix the test failures and just introduced compile warnings
* fixed one remaining test (subtractScalar)
- don't store ProgramSource in compiled Programs (resolved problem with "source" buffers lifetime)
- completely remove Program::read/write methods implementation:
- replaced with method to query RAW OpenCL binary without any "custom" data
- deprecate Program::getPrefix() methods
* Add test that fails
* Fix integer pointPolygonTest for large coordinate values
* Review fixes:
- change type from long long to int64
- move test code to test_contours.cpp, and make it C++98 compliant
* Hopefully fix compiler error by using push_back instead of emplace_back
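To illustrate why a 64-bit type helps with large integer coordinates, here is a small sketch of an orientation test. The function `orientation` is hypothetical and is not the pointPolygonTest code; it only shows the widening before multiplication.

```cpp
#include <cassert>
#include <cstdint>

// Sign of the cross product (b - a) x (p - a):
// +1 if p is to the left of the directed line a->b, -1 if to the right, 0 if collinear.
// The differences are widened to 64 bits before multiplying, so the products
// do not overflow as long as each coordinate difference fits in 31 bits.
int orientation(int ax, int ay, int bx, int by, int px, int py)
{
    const std::int64_t abx = static_cast<std::int64_t>(bx) - ax;
    const std::int64_t aby = static_cast<std::int64_t>(by) - ay;
    const std::int64_t apx = static_cast<std::int64_t>(px) - ax;
    const std::int64_t apy = static_cast<std::int64_t>(py) - ay;
    const std::int64_t cross = abx * apy - aby * apx;
    return (cross > 0) - (cross < 0);
}
```

With coordinates around 100000, a single product is already about 1e10, which exceeds the range of a 32-bit int; the 64-bit intermediate avoids that.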
* fixed OpenCL functions on Mac, so that the tests pass
* fixed compile warnings; temporarily disabled OCL branch of TV L1 optical flow on mac
* fixed other few warnings on macos
'WITH_' variables are intended to enable CMake scripts with some autodetection logic.
'WITH_' can be off while components are actually enabled via command-line options
with proper variables setup (including 'HAVE_').
If the application makes no OpenCL/UMat method calls, the OpenCL subsystem is not initialized.
The OpenCL subsystem is initialized when:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called
Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
Tests are usually launched from the source directory, so additional unnecessary
files should be eliminated.
Alternative ways (command line):
- python -B ...
- PYTHONDONTWRITEBYTECODE=1 python ...
* add accuracy test and performance check for matmul
* add performance tests for transform and dotProduct
* add test Core_TransformLargeTest for 8u version of transform
* remove raw SSE2/NEON implementation from matmul.cpp
* use universal intrinsic instead of raw intrinsic
* remove unused templated function
* add v_matmuladd, which multiplies a 3x3 matrix and adds a 3x1 vector
* add v_rotate_left/right in universal intrinsic
* suppress intrinsic on some function and platform
* add pure SW implementation of new universal intrinsics
* add test for new universal intrinsics
* core: prevent memory access after the end of buffer
* fix perf tests
Fix GDAL image decoding color problems identified by issue #10089, by: (#10093)
* Fix GDAL image decoding color problems identified by issue #10089, by:
Fixing CV_8UC1 symbol, which should be CV_8UC3 for RGB GDAL color table images.
Fixing image.ptr<VecX>(row,col)[] to be (*image.ptr<VecX>(row,col))[] to correctly access VecX array elements, as ptr<VecX>() returns a pointer to the VecX, not the first element of VecX. This fixes the color problem with color table gif images, and avoids out-of-bounds memory access.
Respecting the color identification of raster bands provided by the GDAL image driver, and produce BGR or BGRA images. Note that color bands of images using the HSL, CMY, CMYK, or YCbCr color space are ignored, rather than converting them to BGR.
* When reading image files using the GDAL decoder, exit with an error if a color band is encountered that isn't used (e.g. from CMYK or YCbCr), rather than silently ignoring the band's data.
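The pointer-indexing fix described above can be illustrated without OpenCV. In this sketch `Pixel3` is a stand-in for a 3-channel Vec type and `TinyImage` is a hypothetical container; neither is the GDAL decoder code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::array<unsigned char, 3> Pixel3;  // stand-in for a 3-channel Vec type

// Hypothetical minimal image whose ptr() returns a pointer to a whole pixel.
struct TinyImage
{
    std::size_t cols;
    std::vector<Pixel3> data;  // row-major pixels

    Pixel3* ptr(std::size_t row, std::size_t col) { return &data[row * cols + col]; }
};

// Write one channel of one pixel.
void setChannel(TinyImage& img, std::size_t row, std::size_t col,
                std::size_t channel, unsigned char value)
{
    // (*p)[channel] selects a channel of the pixel p points to.
    // p[channel] would instead address the pixel `channel` positions after p,
    // which is a different pixel and can run past the end of the buffer.
    (*img.ptr(row, col))[channel] = value;
}
```

Writing channel 2 of pixel (0, 0) through `setChannel` touches only that pixel; the neighbouring pixel stays unchanged.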
Fix build with FFmpeg master. Some deprecated APIs have been removed. (#10011)
* Fix build with FFmpeg master.
* ffmpeg: update AVFMT_RAWPICTURE support removal
The incorrect type of negative_slope results in this bug.
Also add an OCL test for darknet to validate this patch.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
The convolution kernel uses the default queue to run, so ocl::Timer,
which measures the kernel run time, should use the default queue too.
Also remove a useless parameter of convolve()
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
When elements are 64 bits, vec_st_interleave()/vec_ld_deinterleave()
don't interleave 4 elements correctly.
For vec_st_interleave(), following is saved into mem:
a0 b0 a1 b1 c0 d0 c1 d1
-> we expected:
a0 b0 c0 d0 a1 b1 c1 d1
for vec_ld_deinterleave(), following is loaded into a b c d for memory
string { 1 2 3 4 5 6 7 8 }:
a: 1 3
b: 2 4
c: 5 7
d: 6 8
-> we expected:
a: 1 5
b: 2 6
c: 3 7
d: 4 8
This patch corrects this behavior.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
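The expected memory layout above can be written down as a scalar reference. This is a sketch for checking the layout only; `storeInterleave4` and `loadDeinterleave4` are hypothetical names and do not use VSX intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// Scalar reference for four 2-lane vectors a, b, c, d of 64-bit elements.
// Expected memory order: a0 b0 c0 d0 a1 b1 c1 d1.
void storeInterleave4(std::int64_t* mem,
                      const std::int64_t a[2], const std::int64_t b[2],
                      const std::int64_t c[2], const std::int64_t d[2])
{
    for (int i = 0; i < 2; ++i)
    {
        mem[4 * i + 0] = a[i];
        mem[4 * i + 1] = b[i];
        mem[4 * i + 2] = c[i];
        mem[4 * i + 3] = d[i];
    }
}

// Inverse operation: memory { 1 2 3 4 5 6 7 8 } gives a = {1, 5}, b = {2, 6},
// c = {3, 7}, d = {4, 8}.
void loadDeinterleave4(const std::int64_t* mem,
                       std::int64_t a[2], std::int64_t b[2],
                       std::int64_t c[2], std::int64_t d[2])
{
    for (int i = 0; i < 2; ++i)
    {
        a[i] = mem[4 * i + 0];
        b[i] = mem[4 * i + 1];
        c[i] = mem[4 * i + 2];
        d[i] = mem[4 * i + 3];
    }
}
```

A reference like this can serve as the expected value when testing a vectorized implementation: a load followed by a store should reproduce the original memory.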
Add layer forward interface with InputArrayOfArrays and
OutputArrayOfArrays parameters, it allows UMat buffer to be
processed and transferred in the layers.
Signed-off-by: Li Peng <peng.li@intel.com>
* Adding constants to json file
* adding sub-module to constants name
* adding namespace to functions
* adding namespace to classes
* remove namespace from methods
* static methods
* python signatures generation
* python: more fixes for signatures generation
Python names existence can be checked via command:
python -c "import cv2 as cv; print(repr(<py_name>))"
* Error in the documentation for cv::getRectSubPix. #9788
The function name is corrected to GetRectSubPix since, it uses the notation
of src, dst and center. Also added the number of channel assertion criteria.
* Error in the documentation for cv::getRectSubPix. #9788
Replace dst with patch in the formula, reverted function name to
getRectSubPix, removed BorderTypes comment line due to no explicit call
to the function found.
- changed behavior of vec_ctf, vec_ctu, vec_cts
in gcc and clang to make them compatible with XLC
- implemented most of missing conversion intrinsics in gcc and clang
- implemented conversions intrinsics of odd-numbered elements
- ignored gcc bug warning that is caused by -Wunused-but-set-variable in rare cases
- replaced right shift with algebraic right shift for signed vectors
to shift in the sign bit.
- added new universal intrinsics v_matmuladd, v_rotate_left/right
- avoid using floating multiply-add in RNG
getViewerPose() doesn't modify an object of the class so it can be
made const. It also makes this method consistent with other getters
in the class as they are defined as const.
Examples of these are gnu/kfreebsd and gnu/hurd, both available as
unofficial Debian ports.
They don't define __linux__ (as they are non-linux…) but still define
__GLIBC__, so check on that.
Signed-off-by: Mattia Rizzolo <mattia@mapreri.org>
* Update OpenCVCompilerOptimizations.cmake
Neon not supported on MSVC ARM breaking build fix
* Update OpenCVCompilerOptimizations.cmake
Whitespace
* Update intrin.hpp
Many problems in MSVC ARM builds (at least on VS2017) being fixed in this PR now.
C:\Users\Gregory\DOCUME~1\MYLIBR~1\OPENCV~3\opencv\sources\modules\core\include\opencv2/core/hal/intrin.hpp(444): error C3861: '_tzcnt_u32': identifier not found
* Update hal_replacement.hpp
Passing a variadic expansion from one macro to another does not work properly in MSVC, so a well-known workaround is applied. Discussion of it: https://stackoverflow.com/questions/5134523/msvc-doesnt-expand-va-args-correctly
Only needed the fix for ARM builds: TEGRA_ macros are used for cv_hal_ functions in the carotene library.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): warning C4003: not enough actual parameters for macro 'TEGRA_ADD'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2143: syntax error: missing ')' before ','
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2059: syntax error: ')'
* Update hal_replacement.hpp
All hal_replacement's using carotene\hal\tegra_hal.hpp TEGRA_ functions as macros preprocessed by variadic macros should be changed, identical as was done in core.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): warning C4003: not enough actual parameters for macro 'TEGRA_CVTBGRTOBGR'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): error C2059: syntax error: '=='
* Update OpenCVCompilerOptimizations.cmake
* Update hal_replacement.hpp
* Update hal_replacement.hpp
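The variadic-macro workaround referred to above is commonly written with an extra expansion macro. The sketch below uses hypothetical `SKETCH_` macros, not the actual TEGRA_ or cv_hal_ macros, and compiles the same way on compilers that do not need the workaround.

```cpp
#include <cassert>

// Hypothetical macros for illustration only.
#define SKETCH_EXPAND(x) x
#define SKETCH_ADD3(a, b, c) ((a) + (b) + (c))

// Forward variadic arguments to another function-like macro.
// Without the SKETCH_EXPAND wrapper, MSVC's traditional preprocessor can pass
// __VA_ARGS__ to `fn` as one single argument ("not enough actual parameters").
// The wrapper forces another scan so the arguments are split at the commas.
#define SKETCH_CALL(fn, ...) SKETCH_EXPAND(fn(__VA_ARGS__))

int sumViaMacro()
{
    return SKETCH_CALL(SKETCH_ADD3, 1, 2, 3);  // expands to ((1) + (2) + (3))
}
```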
These two typedefs are not compiled when BUILD_opencv_dnn is set to
false, but other modules use these typedefs, so
this may cause build errors. Moving the typedefs to the python module
ensures they are always defined.
The original template based mat ptr for indexing is not implemented,
add the similar implementation as uchar type, but cast to
user-defined type from the uchar pointer.
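The idea of reusing the uchar pointer computation and casting to the user-defined type might look like this in isolation. `RawMat` and its members are hypothetical and much simpler than cv::Mat; the sketch only shows the cast.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix view over a byte buffer.
struct RawMat
{
    unsigned char* data;  // start of the buffer
    std::size_t step;     // bytes per row

    // Byte-level accessor: pointer to the first byte of a row.
    unsigned char* rowBytes(int row) { return data + step * static_cast<std::size_t>(row); }

    // Typed accessor: same address computation, then cast to the element type.
    template <typename T>
    T* ptr(int row, int col)
    {
        return reinterpret_cast<T*>(rowBytes(row)) + col;
    }
};
```

For a buffer of 2 rows by 3 floats, `m.ptr<float>(1, 2)` addresses the last element. The cast is only valid when the buffer really holds elements of type `T` with suitable alignment.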
The same code was repeated several times for different data types, so
it was extracted as a templated function to improve maintainability and
make the code clearer.
An exception may be raised inside the body of a copy constructor after
the refcount has been increased, and because the destructor is never
called in that case, this causes a memory leak. This commit adds a
workaround that calls the release() function before the exception is
thrown outside the constructor.
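A minimal sketch of that pattern, using a hypothetical `SharedBuffer` class with an external counter. The `failOnCopy` flag is a test hook that simulates the failure; none of these names come from the patch.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical reference-counted handle.
struct SharedBuffer
{
    int* refcount;    // shared counter owned by the caller in this sketch
    bool failOnCopy;  // test hook: simulate an exception inside the copy constructor

    SharedBuffer(int* counter, bool fail) : refcount(counter), failOnCopy(fail)
    {
        ++*refcount;
    }

    SharedBuffer(const SharedBuffer& other)
        : refcount(other.refcount), failOnCopy(other.failOnCopy)
    {
        ++*refcount;  // the reference is taken first
        try
        {
            if (failOnCopy)
                throw std::runtime_error("copy failed");
        }
        catch (...)
        {
            // The destructor will not run for a partially constructed object,
            // so drop the reference here before propagating the exception.
            release();
            throw;
        }
    }

    ~SharedBuffer() { release(); }

    void release()
    {
        if (refcount)
        {
            --*refcount;
            refcount = 0;
        }
    }

private:
    SharedBuffer& operator=(const SharedBuffer&);  // not needed for the sketch
};
```

After a failed copy the counter is back at its previous value, so the last owner can still free the resource.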
The entire AssetsLibrary framework is deprecated since iOS 8.0. The
camera example code can use UIKit to save videos to the camera
instead, which avoids linking with PhotoKit and so avoids
increasing the iOS deployment target.
Adds fitEllipseDirect to imgproc: The Direct least square (Direct) method by Fitzgibbon1999.
New Tests are included for the methods.
fitEllipseAMS Tests
fitEllipseDirect Tests
Comparative examples are added to fitEllipse.cpp in Samples.
add libdnn acceleration to dnn module (#9114)
* import libdnn code
Signed-off-by: Li Peng <peng.li@intel.com>
* add convolution layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add pooling layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add softmax layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add lrn layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add innerproduct layer ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* add HAVE_OPENCL macro
Signed-off-by: Li Peng <peng.li@intel.com>
* fix for convolution ocl
Signed-off-by: Li Peng <peng.li@intel.com>
* enable getUMat() for multi-dimension Mat
Signed-off-by: Li Peng <peng.li@intel.com>
* use getUMat for ocl acceleration
Signed-off-by: Li Peng <peng.li@intel.com>
* use CV_OCL_RUN macro
Signed-off-by: Li Peng <peng.li@intel.com>
* set OPENCL target when it is available
and disable fuseLayer for OCL target for the time being
Signed-off-by: Li Peng <peng.li@intel.com>
* fix innerproduct accuracy test
Signed-off-by: Li Peng <peng.li@intel.com>
* remove trailing space
Signed-off-by: Li Peng <peng.li@intel.com>
* Fixed tensorflow demo bug.
Root cause is that tensorflow uses a different algorithm than libdnn
to calculate the convolution output dimension.
libdnn doesn't calculate the output dimension anymore and just uses the one
passed in by config.
* split gemm ocl file
split it into gemm_buffer.cl and gemm_image.cl
Signed-off-by: Li Peng <peng.li@intel.com>
* Fix compile failure
Signed-off-by: Li Peng <peng.li@intel.com>
* check env flag for auto tuning
Signed-off-by: Li Peng <peng.li@intel.com>
* switch to new ocl kernels for softmax layer
Signed-off-by: Li Peng <peng.li@intel.com>
* update softmax layer
on some platform subgroup extension may not work well,
fallback to non subgroup ocl acceleration.
Signed-off-by: Li Peng <peng.li@intel.com>
* fallback to cpu path for fc layer with multi output
Signed-off-by: Li Peng <peng.li@intel.com>
* update output message
Signed-off-by: Li Peng <peng.li@intel.com>
* update fully connected layer
fallback to gemm API if libdnn return false
Signed-off-by: Li Peng <peng.li@intel.com>
* Add ReLU OCL implementation
* disable layer fusion for now
Signed-off-by: Li Peng <peng.li@intel.com>
* Add OCL implementation for concat layer
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
* libdnn: update license and copyrights
Also refine libdnn coding style
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>
* DNN: Don't link OpenCL library explicitly
* DNN: Make default preferableTarget to DNN_TARGET_CPU
Users should set it to DNN_TARGET_OPENCL explicitly if they want to
use OpenCL acceleration.
Also don't fuse layers when using DNN_TARGET_OPENCL
* DNN: refine coding style
* Add getOpenCLErrorString
* DNN: Use int32_t/uint32_t instead of alias
* Use namespace ocl4dnn to include libdnn things
* remove extra copyTo in softmax ocl path
Signed-off-by: Li Peng <peng.li@intel.com>
* update ReLU layer ocl path
Signed-off-by: Li Peng <peng.li@intel.com>
* Add prefer target property for layer class
It is used to indicate the target for layer forwarding,
either the default CPU target or OCL target.
Signed-off-by: Li Peng <peng.li@intel.com>
* Add cl_event based timer for cv::ocl
* Rename libdnn to ocl4dnn
Signed-off-by: Li Peng <peng.li@intel.com>
Signed-off-by: wzw <zhiwen.wu@intel.com>
* use UMat for ocl4dnn internal buffer
Remove allocateMemory which use clCreateBuffer directly
Signed-off-by: Li Peng <peng.li@intel.com>
Signed-off-by: wzw <zhiwen.wu@intel.com>
* enable buffer gemm in ocl4dnn innerproduct
Signed-off-by: Li Peng <peng.li@intel.com>
* replace int_tp globally for ocl4dnn kernels.
Signed-off-by: wzw <zhiwen.wu@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>
* create UMat for layer params
Signed-off-by: Li Peng <peng.li@intel.com>
* update sign ocl kernel
Signed-off-by: Li Peng <peng.li@intel.com>
* update image based gemm of inner product layer
Signed-off-by: Li Peng <peng.li@intel.com>
* remove buffer gemm of inner product layer
call cv::gemm API instead
Signed-off-by: Li Peng <peng.li@intel.com>
* change ocl4dnn forward parameter to UMat
Signed-off-by: Li Peng <peng.li@intel.com>
* Refine auto-tuning mechanism.
- Use OPENCV_OCL4DNN_KERNEL_CONFIG_PATH to set cache directory
for fine-tuned kernel configuration.
e.g. export OPENCV_OCL4DNN_KERNEL_CONFIG_PATH=/home/tmp,
the cache directory will be /home/tmp/spatialkernels/ on Linux.
- Define environment OPENCV_OCL4DNN_ENABLE_AUTO_TUNING to enable
auto-tuning.
- OPENCV_OPENCL_ENABLE_PROFILING is only used to enable profiling
for the OpenCL command queue. This fixes the basic kernel reporting a wrong
running time, i.e. 0 ms.
- If creating cache directory failed, disable auto-tuning.
* Detect and create cache dir on windows
Signed-off-by: Li Peng <peng.li@intel.com>
* Refine gemm like convolution kernel.
Signed-off-by: Li Peng <peng.li@intel.com>
* Fix redundant swizzleWeights calling when use cached kernel config.
* Fix "out of resource" bug when auto-tuning too many kernels.
* replace cl_mem with UMat in ocl4dnnConvSpatial class
* OCL4DNN: reduce the tuning kernel candidate.
This patch could reduce 75% of the tuning candidates with less
than 2% performance impact for the final result.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
* replace cl_mem with umat in ocl4dnn convolution
Signed-off-by: Li Peng <peng.li@intel.com>
* remove weight_image_ of ocl4dnn inner product
Actually it is unused in the computation
Signed-off-by: Li Peng <peng.li@intel.com>
* Various fixes for ocl4dnn
1. OCL_PERFORMANCE_CHECK(ocl::Device::getDefault().isIntel())
2. Ptr<OCL4DNNInnerProduct<float> > innerProductOp
3. Code comments cleanup
4. ignore check on OCL cpu device
Signed-off-by: Li Peng <peng.li@intel.com>
* add build option for log softmax
Signed-off-by: Li Peng <peng.li@intel.com>
* remove unused ocl kernels in ocl4dnn
Signed-off-by: Li Peng <peng.li@intel.com>
* replace ocl4dnnSet with opencv setTo
Signed-off-by: Li Peng <peng.li@intel.com>
* replace ALIGN with cv::alignSize
Signed-off-by: Li Peng <peng.li@intel.com>
* check kernel build options
Signed-off-by: Li Peng <peng.li@intel.com>
* Handle program compilation fail properly.
* Use std::numeric_limits<float>::infinity() for large float number
* check ocl4dnn kernel compilation result
Signed-off-by: Li Peng <peng.li@intel.com>
* remove unused ctx_id
Signed-off-by: Li Peng <peng.li@intel.com>
* change clEnqueueNDRangeKernel to kernel.run()
Signed-off-by: Li Peng <peng.li@intel.com>
* change cl_mem to UMat in image based gemm
Signed-off-by: Li Peng <peng.li@intel.com>
* check intel subgroup support for lrn and pooling layer
Signed-off-by: Li Peng <peng.li@intel.com>
* Fix convolution bug if group is greater than 1
Signed-off-by: Li Peng <peng.li@intel.com>
* Set default layer preferableTarget to be DNN_TARGET_CPU
Signed-off-by: Li Peng <peng.li@intel.com>
* Add ocl perf test for convolution
Signed-off-by: Li Peng <peng.li@intel.com>
* Add more ocl accuracy test
Signed-off-by: Li Peng <peng.li@intel.com>
* replace cl_image with ocl::Image2D
Signed-off-by: Li Peng <peng.li@intel.com>
* Fix build failure in elementwise layer
Signed-off-by: Li Peng <peng.li@intel.com>
* use getUMat() to get blob data
Signed-off-by: Li Peng <peng.li@intel.com>
* replace cl_mem handle with ocl::KernelArg
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(build): don't use C++11, OPENCL_LIBRARIES fix
* dnn(ocl4dnn): remove unused OpenCL kernels
* dnn(ocl4dnn): extract OpenCL code into .cl files
* dnn(ocl4dnn): refine auto-tuning
Disable auto-tuning by default; set the OPENCV_OCL4DNN_ENABLE_AUTO_TUNING
environment variable to enable it.
Use a set of pre-tuned configs as default config if auto-tuning is disabled.
These configs are tuned for Intel GPU with 48/72 EUs, and for googlenet,
AlexNet, ResNet-50
If the default config is not suitable, use the first available kernel config
from the candidates. Candidate priority from high to low is gemm like kernel,
IDLF kernel, basic kernel.
* dnn(ocl4dnn): pooling doesn't use OpenCL subgroups
* dnn(ocl4dnn): fix perf test
OpenCV has default 3sec time limit for each performance test.
Warmup OpenCL backend outside of perf measurement loop.
* use ocl::KernelArg as much as possible
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(ocl4dnn): fix bias bug for gemm like kernel
* dnn(ocl4dnn): wrap cl_mem into UMat
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(ocl4dnn): Refine signature of kernel config
- Use a more readable string as the signature of a kernel config
- Don't count device name and vendor in signature string
- Default kernel configurations are tuned for Intel GPU with
24/48/72 EUs, and for googlenet, AlexNet, ResNet-50 net model.
* dnn(ocl4dnn): swap width/height in configuration
* dnn(ocl4dnn): enable configs for Intel OpenCL runtime only
* core: make configuration helper functions accessible from non-core modules
* dnn(ocl4dnn): update kernel auto-tuning behavior
Avoid unwanted creation of directories
* dnn(ocl4dnn): simplify kernel to workaround OpenCL compiler crash
* dnn(ocl4dnn): remove redundant code
* dnn(ocl4dnn): Add a clearer message for SIMD size mismatch.
* dnn(ocl4dnn): add const to const argument
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(ocl4dnn): force compiler use a specific SIMD size for IDLF kernel
* dnn(ocl4dnn): drop unused tuneLocalSize()
* dnn(ocl4dnn): specify OpenCL queue for Timer and convolve() method
* dnn(ocl4dnn): sanitize file names used for cache
* dnn(perf): enable Network tests with OpenCL
* dnn(ocl4dnn/conv): drop computeGlobalSize()
* dnn(ocl4dnn/conv): drop unused fields
* dnn(ocl4dnn/conv): simplify ctor
* dnn(ocl4dnn/conv): refactor kernelConfig localSize=NULL
* dnn(ocl4dnn/conv): drop unsupported double / untested half types
* dnn(ocl4dnn/conv): drop unused variable
* dnn(ocl4dnn/conv): alignSize/divUp
* dnn(ocl4dnn/conv): use enum values
* dnn(ocl4dnn): drop unused innerproduct variable
Signed-off-by: Li Peng <peng.li@intel.com>
* dnn(ocl4dnn): add a generic function to check cl option support
* dnn(ocl4dnn): run softmax subgroup version kernel first
Signed-off-by: Li Peng <peng.li@intel.com>
imgproc: use universal intrinsic as much as possible (#9714)
* use universal intrinsic as much as possible
* make SSE3 part as common as possible with universal intrinsic implementation
* put the reducing part out of the main loop
* follow the comment
* fix the typo
* use v_reduce_sum4
* follow the comment again
* remove all CV_SSE3 part from smooth.cpp
The non-maximum suppression in the Hough accumulator incorrectly ignores maxima that extend over more than one cell, i.e. two neighboring cells both have the same accumulator value. This maximum is dropped completely instead of picking at least one of the entries. This frequently results in obvious circles being missed.
The behavior is now changed to be the same as for hough_lines.
See also https://github.com/opencv/opencv/issues/4440
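One way to keep exactly one entry of a flat maximum is to compare strictly against one neighbour and non-strictly against the other. The 1-D sketch below illustrates that idea under this assumption; `localMaxima` is a hypothetical helper, and the real accumulator is 2-D.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Indices of local maxima above `threshold` in a 1-D accumulator.
// A cell qualifies if it is strictly greater than its left neighbour and
// greater than or equal to its right neighbour, so a plateau of equal
// values contributes its first cell instead of being dropped entirely.
std::vector<std::size_t> localMaxima(const std::vector<int>& accum, int threshold)
{
    std::vector<std::size_t> peaks;
    for (std::size_t i = 1; i + 1 < accum.size(); ++i)
    {
        if (accum[i] > threshold &&
            accum[i] > accum[i - 1] &&
            accum[i] >= accum[i + 1])
        {
            peaks.push_back(i);
        }
    }
    return peaks;
}
```

For the accumulator {0, 5, 5, 0}, a test that is strict on both sides finds no maximum at all, while this version reports index 1.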
GSoC 2017: Improve and Extend the JavaScript Bindings for OpenCV (#9466)
* Initial support for build with emscripten
mkdir build_js
cd build_js
cmake -D CMAKE_TOOLCHAIN_FILE=/path/to/emsdk/emsdk-portable/emscripten/master/cmake/Modules/Platform/Emscripten.cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
* Add js module
The output is build/bin/opencv_js.js
* Fix opencv2/calib3d.hpp not found issue
* Add module name
Usage:
var cv = cv();
* Add total memory as 128MB and allow growth
* Add compilation flags for emscripten
* Use EMSCRIPTEN build target
* Disable js module for non emscripten build
* Bind the preload file path to root
Usage:
face_cascade.load('haarcascade_frontalface_default.xml');
* add test folder
* fix test files
* Copy js module test to bin
* Support to run tests on Node.js
Fix tests to import cv Module when runtime is node.
Add tests.js to use qunit to auto run tests.
Modify umd wrapper to support Module is not defined.
Usage:
node tests.js
* Support UMD and file system
Wrap the opencv_js.js to opencv.js by UMD wrapper
Use emscripten file system API to load files instead of generating data file or
embedding them. It supports both browser and node.js usages.
* Fix incorrect module name in tests
* Add package.json to add dependence of qunit
* Add js_tutorials folder and an intro page of opencv.js
Enable BUILD_DOCS in CMakeLists.txt.
Add new folder of js_tutorials in folder opencv/doc.
Imitate the tutorials of OpenCV-Python to create an intro page of opencv.js and a setup guide
* Import and use binding gen from opencvjs project
* Modify the embindgen.py to pass the build and test
* Add classes and functions white list
* Consolidate hdr_parser.py (#31)
Use hdr_parser.py of python module
Add js flag to support js binding generator.
* Use emscripten::vecFromJSArray for input vector param
Fix part of #23
* Fix test cases after #34
Fix#39
* Expose groupRectangles and CascadeClassifier.empty
* Add js highgui tutorials
add tutorials of imread&imshow and createTrackbar in doc/js_tutorials/js_gui folder
add interactive tutorial webpage for imread&imshow and createTrackbar in doc/js_tutorials/js_interactive_tutorials folder, and some images needed.
change doc/CMakeLists.txt to copy the interactive tutorial webpage and opencv.js to the tutorials' destination folder
* rm useless annotation in doc/CMakeLists.txt
* fix some nonstandard indentation and space
* add check if canvas is valid
* Expose BackgroundSubtractorMOG2
Fix#43
* Fix build of js doc
Limit copy_js_interactive_tutorials for doxygen build
Add dep to opencv.js
Fix#53
* Implement cv.imread & cv.imshow and insert interactive pages in tutorials (#55)
* add helper.js
* delete ALL in add target copy_js_interactive_tutorials to avoid dependence error
* Insert interactive pages in tutorials
insert the old interactive pages in markdown by using \htmlonly and \endhtmlonly command.
delete the useless interactive page
rename js_interactive_tutorials to js_assets to put some images needed in
* fix the depends of the target doxygen
add opencv.js to depends and delete the useless target of copy_js_assets
* change filename helper.js to helpers.js
* disable button or trackbar before opencv.js is ready
* Expose CV_64F
Fix#65
* improve cv.imshow to display different types as native imshow
* add utils.js to reuse functions and update tutorials
* Make doxygen depend on bin/opencv.js
* Fix memory issue of matFromArray
Fix#37
* Merge pull request from ganwenyao/tutorial_18
* Add notes for ganwenyao/tutorial_18
* Modifying for ganwenyao/tutorial_18
* Change Mat constructor with data to 5 parameters
* Mat supports constructor with Scalar
Fix#60
* update cv.imread cause the memory issue of matFromArray has been fixed
* fix canvas name and default input image
* Expose cv::Moments
Fix#85
* Add -Wno-missing-prototypes for emscripten build
* fix canvas name
* add tutorial of video input and output
* Expose enums as emscripten consts
Fix#72
* update the tutorial to use Mat constructor with Scalar and change lena.jpg
* Exclude cv::Mat for vecFromJSArray
Fix#82
* Add unit tests for cv.moments
* Fix the unit tests.
* add checkbox and stop button
* add adapter.js to ensure compatibility of video tutorials
* Support default parameters with function overloading
* modify enums to constants
* Use https URL for MathJax.js
Fix#109
* Comment out the debug print in embindgen.py
* Expose RotatedRect
Fix#96
* replace enum with constants and improve onload function
* delete some useless paras cause #105 fixed this
* Modify const name
* Modify Contour Properties
* tutorials for imgprc2 and objdec
* Expose more functions for img proc tutorials
Fix#76
* Expose polylines for video analysis tutorial
Fix#121
* Expose constants for default parameters of img proc tutorials
Fix#122
* Fix wrong parameter types of Mat.copyTo
Fix#87
* Support default parameters of mat.convertTo
Fix#123
* Support default parameters for external constructors
Fix#131
* Revert "Expose polylines for video analysis tutorial"
This reverts commit 3ce7615652e510d30e3c0014706ac38c98883189.
Fix#121
* Support cv.minMaxLoc
Fix#127
* Expose cv.minEnclosingCircle
Fix#126
* Add video analysis tutorials
add three video tutorials: Meanshift and Camshift, Optical Flow, and Background Subtraction
add cup.mp4 and box.mp4 for demo in tutorials
* improve image processing tutorials
* replace console.warn with throw to throw exception
* add try-catch to throw exception in code demo
* Change mat.size() return value to JS Array object
Fix#140
* add a note about different channels order between canvas and native opencv
* add a note about how to capture video from video files
* Binding cv.Scalar to JS array
Fix#147
* Add JS cv.Scalar object into helpers.js
* Update Install OpenCV-JavaScript tutorial page
Fix#44
* Update the OpenCV-JavaScript introduction page
Fix#44
* add cv.VideoCapture and read() function
* set the size of the hidden canvas same as the video
* Add Using OpenCV-JavaScript tutorial page
Fix#44
* fix some bad code style
* Update tutorials after 8/2 sync meeting
Changes include:
- Use OpenCV.js name instead of OpenCV-JavaScript
- Put using OpenCV.js ahead of build OpenCV.js
- Refine usage and introduction page
- Muted the video in tutorials
* Fix a typo in introduction page
* use cv.VideoCapture and its read() function to read video
* replace OpenCV-JavaScript with OpenCV.js
* Use onload of async script in js_usage tutorial
* add more info about mat.data
* Change Size to value_object
* Integrate Moh and Sajjad's editing into introduction page
* Change Point to value_object
* Change Rect to value_object with helper object
* Add helper objects for Point and Size
* Change RotatedRect to value_object with helpers
* Change MinMaxLoc and Circle to value_object
* Change TermCriteria to value_object
* Fix core_bindings.cpp for MinMaxLoc and Circle
* Remove unused types
* Change meanShift and CamShift to return Rect
* Change methods of RotatedRect to static
* Change mat.data from methods to property
Fix#75 and #77
* support img id and element in cv.imread
* Change mat.size to property and add mat.step
Fix#163
* Add matFromArray and matFromImageData as JS helpers
Fix#79, #78
* Lower camel case for Mat element getters
Fix#81
* Mat.getRoiRect and tests
Fix#86
* Support type for Mat.ptr
Fix#83
* Name changing of Mat element getters
'getUcharAt' -> 'ucharAt'
* fix code style and args names
* Fix helpers.js due to cv.Mat API update
* Fix opencv.js usage tutorial
* Fix a typo of js_setup
* Change Moments to value_object
* Add Range as value_object
Fix#171
* Support Mat.diag and Mat.isContinuous
Fix#84 and #89
* Support Mat.setTo
Fix#88
* Apply edits to js_intro
* Apply edits to js_usage
* Apply edits to js_setup
* update tutorials to apply data type change
* Modify tutorials
* add core tutorials
* delete MatVector elements and delete useless set operation
* add tutorials_objdec_camera
* Add instructions for WebAssembly
* apply tech writer's feedbacks into tutorials
* Organize white list by modules
* Change size to method and bind to MatExpr.size()
Fix#177
* improve tutorials
* Modify core tutorials
* add params list and explanations for OpenCV.js functions
* remove face_profile from Face Detection in Video Capture
* Add demos link
* Change Gui to GUI
* Update js_intro based on Moh and Sajjad's edits
* Fixup for 3.3.0 rebase
* Update js_intro per Moh's suggestion
* Update contributors list per Moh's idea
* add adapter.js in video_display tutorial
* Change Mat.getRoiRect to Mat.roi
Fix#194
* Remove unnecessary files for test
Fix#192
* Licenses updated to UC BSD 3-Clause
* Apply OpenCV coding style for C++ files
* Add OpenCV license for python and js files
* Fix coding style issue in helpers.js
* Remove unused test_commons.js
* Fix coding style of test_imgproc.js
* Fix coding style of test_mat.js
* Fix space before semicolon
* Fix coding style of test_objdetect.js
* Fix coding style of tests.js
* Fix coding style of test_utils.js
* Fix coding style of test_video.js
* Fix failures of node.js tests
* Add eslint rule config and fix eslint errors
* Add eslint config for js/src and fix eslint errors
* Clean up the opencv.js dependencies
Fix#186
* Fix build issue for python generator
* Fix doxygen buildbot failure
* delete trailing whitespace, blank line at EOF and replace tab with space
* Fix tutorial_js_root reference issue for non opencv.js build
* replace the file with small size
* Initial commit of build_js.py
* Move the js build configurations to build script
* Add wasm build support
* Update OpenCV.js build tutorial by using script
* Fix global var issue in tests
* Add a README.md for build_js.py
* Copy the haar cascade files from data dir for tutorials
* Not use memory init file
* Disable debug print for modules/js/CMakeLists.txt
* Check files when build done
* Fix image name in js_gradients tutorial
* Fix image load issue in js_trackbar tutorial
* Find the opencv source directory via relative path by default
* Make the cmake args based on build_doc option
* Fix a typo in js_setup.markdown
* Fix make failure issue on config generated by build_js.py
* Eliminate js branch of hdr_parser.py
* Extract examples from js_basic_ops tutorial
* Fix coding style of utils.js
* Improve examples error handling
Handle:
1. opencv.js loading errors
2. script errors (Error)
3. cv::Exception
Fix#217
* Add enable_exception option into build_js.py
* Support print exception for exception catching disabled build
* Extract example from js_usage tutorial
* Avoid copying .eslintrc.json when building doc
Fix#223
* Revert to use onload as opencv.js ready event
* Use 4 spaces indentation for js examples
* embed html in tutorials with iframe tag
* Revert to use onload as opencv.js ready event
* Extract examples from js_video_display tutorial
* Implement Utils object
* modify core imgprc and face_detection tutorials
* Fix examples of js_gui tutorials
* Fix coding style of utils.js
* Modify tutorials
* Extract example from js_face_detection_camera tutorial
* Disable new-cap check in eslint
* Extract examples from js_meanshift tutorial
* Extract examples from video tutorials
* Remove new-cap declaration and update grammar in comments
* Change textarea width to 100 to align with eslint config
* Fix printError issue when opencv.js loading fails
* Remove BUILD_opencv_js dependency for doc build
Fix#213
* Expose cv::getBuildInformation
* Dump opencv build info when opencv.js loaded for live examples
* Make the button to stand out in js live examples
Fix#235
* Style for disabled button
* Add js_imgproc_camera.html example
* Fix coding style of imgproc_camera example
* Add js_imgproc_camera tutorial
* Remove link to opencv.js demos
* doc: copy opencv.js on build, use absolute paths for assets
* doc: reuse existed file box.mp4
Added gradientSize param into goodFeaturesToTrack API (#9618)
* Added gradientSize param into goodFeaturesToTrack API
Removed the hardcoded value 3 in the goodFeaturesToTrack API and
added a new param 'gradientSize' to this API so that the user can
pass any gradient size such as 3, 5 or 7.
Signed-off-by: Vipin Anand <anand.vipin@gmail.com>
Signed-off-by: Nilaykumar Patel <nilay.nilpat@gmail.com>
Signed-off-by: Prashanth Voora <prashanthx85@gmail.com>
* fixed compilation error for java test
Signed-off-by: Vipin Anand <anand.vipin@gmail.com>
* Modifying code for previous binary compatibility and fixing other warnings
fixed ABI break issue
resolved merged conflict
compilation error fix
Signed-off-by: Vipin Anand <anand.vipin@gmail.com>
Signed-off-by: Patel, Nilaykumar K <nilay.nilpat@gmail.com>
- use GTest tuple definitions instead of std::tr1
- use "const static" for cv::Size constants to reduce generated binary code
- PERF_TEST_P() violates TEST_P() original semantic. Added PERF_TEST_P_() macro
* lab_tetra squashed
* initial version is almost written
* unfinished work
* compilation fixed, to be debugged
* Lab test removed
* more fixes
* Luv2RGBinteger: channels order fixed
* Lab structs removed
* good trilinear interpolation added
* several fixes
* removed Luv2RGB interpolations, XYZ tables; 8-cell LUT added
* no_interpolate made 8-cell
* interpolations rewritten to 8-cell, minor fixes
* packed interpolation added for RGB2Luv
* tetra implemented
* removing unnecessary code
* LUT building merged
* changes ported to color.cpp
* minor fixes; try to suppress warnings
* fixed v range of Luv
* fixed incorrect src channel number
* minor fixes
* preliminary version of Luv2RGBinteger is done
* Luv2RGB_b is in progress
* XYZ color constants converted to softfloat
* Luv test: precision fixed
* Luv bit-exactness test added
* warnings fixed
* compilation fixed, error message fixed
* Luv check is limited to [0-2,0-2,0-2] by XYZ
* L->Y generation moved to LUT
* LUTs added for up and vp of Luv2RGB_b
* still works
* fixed-point is done, works at maxerr 2
* vectorized code is done, 2x slower than original
* perf improved by 10%
* extra comments removed
* code moved to color.cpp
* test_lab.cpp updated
* minor refactoring
* test added for Luv2RGB
* OCL Luv2RGB_b: XYZ are limited to [0, 2]; docs updated
* Luv2RGB_b rewritten to universal intrinsics
* test_lab.cpp moved to luv_tetra branch
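For readers unfamiliar with the LUT interpolation mentioned in this list, here is a rough floating-point sketch of trilinear interpolation over the 8 cells surrounding a sample point. It assumes a cubic lookup table sampled on a regular grid over [0, 1]^3; `build_lut` and `trilinear` are hypothetical helper names, and the actual color conversion code uses fixed-point arithmetic and packed tables rather than this layout:

```python
def build_lut(f, n):
    """Sample f on an (n + 1)^3 regular grid covering [0, 1]^3."""
    return [[[f(i / n, j / n, k / n) for k in range(n + 1)]
             for j in range(n + 1)]
            for i in range(n + 1)]


def trilinear(lut, x, y, z):
    """Interpolate the LUT at (x, y, z) from the 8 surrounding grid cells."""
    n = len(lut) - 1

    def split(t):
        # Clamp to [0, 1], then split into cell index and fractional offset.
        t = min(max(t, 0.0), 1.0) * n
        i = min(int(t), n - 1)
        return i, t - i

    (i, fx), (j, fy), (k, fz) = split(x), split(y), split(z)
    result = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                # Weight of each corner is the product of per-axis weights.
                w = ((fx if di else 1.0 - fx)
                     * (fy if dj else 1.0 - fy)
                     * (fz if dk else 1.0 - fz))
                result += w * lut[i + di][j + dj][k + dk]
    return result


# Trilinear interpolation reproduces any function that is linear per axis.
lut = build_lut(lambda x, y, z: x + 2 * y + 3 * z, 4)
value = trilinear(lut, 0.3, 0.55, 0.9)
```

Tetrahedral interpolation, also named in the list, reads only 4 of the 8 corners per lookup; it is not shown here.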
* Using environment variable to store options parsed by av_dict_parse_string(ENV{OPENCV_FFMPEG_CAPTURE_OPTIONS}, ";", "|")
* Adding missing mandatory flags parameter
* Guarding against missing function via LIBAVUTIL version
* Code review fixes
Copy/paste error due to coder mistake reverted
Proper version checking for LIBAVUTIL_BUILD
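The option string format implied by the separators above (";" between key and value, "|" between pairs) can be illustrated with a simplified parser. This is only a sketch of the format: it ignores the escaping and quoting rules of the real av_dict_parse_string, the helper name `parse_capture_options` is hypothetical, and the option names in the example are chosen purely for illustration:

```python
import os


def parse_capture_options(text, key_val_sep=";", pairs_sep="|"):
    """Split 'key;value|key;value' into a dict (no escaping support)."""
    options = {}
    for pair in text.split(pairs_sep):
        if not pair:
            continue  # tolerate empty input and trailing separators
        key, sep, value = pair.partition(key_val_sep)
        if not sep:
            raise ValueError(f"missing {key_val_sep!r} in {pair!r}")
        options[key] = value
    return options


# Example value a user might export before opening a capture.
example = "rtsp_transport;udp|buffer_size;1024"
parsed = parse_capture_options(example)

# Reading the real variable; an unset variable yields no options.
from_env = parse_capture_options(
    os.environ.get("OPENCV_FFMPEG_CAPTURE_OPTIONS", ""))
```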
* Imgproc_ColorLab_Full.accuracy test fixed
* Lab and Luv tests: rewritten, constants explained
* CV_ColorCvtBaseTest: added methods for 8u implementations
* Lab2RGB_b: bit-exactness enabled for all modes; non-vectorized code fixed to comply with vectorized
* srgb support added
* XYZ constants made softdouble
* bit-exact tests written for Lab
* ColorLab_full test fixed
* reverted: no 8u convertors for CV_ColorCvtBaseTest
* added checksum-based test for Lab bit-exactness
* extra declarations removed
* Lab test fix: stop at first mismatch
* test info output improved
* error message fixed
* test_lab.cpp removed
Added forkfour Latex command to math js support.
Split cv::norm documentation between the cv::norm and its overload, to make things clearer
Corrected some typos and cleaned up grammar.
Result is clearer documentation for the norms.
Work pending...
This adds the possibility to use multi-channel masks for the functions
cv::mean, cv::meanStdDev and the method Mat::setTo. The tests now use
multi-channel masks, with some probability, for operations that support them.
This also includes Mat::copyTo, which supported multi-channel masks
before, but there was no test confirming this.
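A small sketch of what a multi-channel mask means for a setTo-style operation, assuming the mask is applied element by element per channel and that a single-channel mask applies to all channels of a pixel. Images are plain nested lists here, and `set_to` is an illustrative helper rather than the OpenCV function:

```python
def set_to(img, value, mask):
    """Assign value[ch] to every image element whose mask element is non-zero.

    img:  rows x cols x channels nested lists (modified in place)
    mask: same shape as img, or rows x cols x 1 for a single-channel mask
    """
    for row, mask_row in zip(img, mask):
        for pixel, mask_pixel in zip(row, mask_row):
            for ch in range(len(pixel)):
                # A single-channel mask is shared by all channels.
                m = mask_pixel[ch] if len(mask_pixel) > 1 else mask_pixel[0]
                if m:
                    pixel[ch] = value[ch]
    return img


img = [[[1, 2, 3], [4, 5, 6]]]          # 1 x 2 image, 3 channels
mask = [[[1, 0, 0], [0, 0, 255]]]       # 3-channel mask
out = set_to(img, (9, 9, 9), mask)
```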
CUDA implementation wants to convert std::vector<KeyPoint> <-> GpuMat.
There is no direct mapping from KeyPoint (mix of int/float fields)
into cv::Mat element type, so this conversion must be avoided.
Legacy mode is turned back for CUDA builds.
This function is the counterpart of "Context::getProg".
With this function, users have a chance to unload a program
from the globally cached run-time programs and save resources.
OpenCL runtime does not require OpenCL development file (libOpenCL.so),
just the "run" library (so.1).
This patch searches for the run library (so.1) if the dev library (.so)
is not found.
Web search shows that this error has been present since at least 2015
http://answers.opencv.org/question/80532/haveopencl-return-false/
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
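The fallback described in this message (try the development name first, then the versioned run-time name) can be sketched as follows. The helper `load_first` and the injectable `loader` argument are assumptions made for the example; the actual patch changes the library lookup inside OpenCV, not Python code:

```python
import ctypes


def load_first(names, loader=ctypes.CDLL):
    """Return (name, handle) for the first library that loads, else (None, None)."""
    for name in names:
        try:
            return name, loader(name)
        except OSError:
            continue  # not installed under this name, try the next one
    return None, None


# Simulate a system that ships only the run-time library (so.1).
def fake_loader(name):
    if name != "libOpenCL.so.1":
        raise OSError(f"{name}: cannot open shared object file")
    return object()


found, handle = load_first(["libOpenCL.so", "libOpenCL.so.1"], fake_loader)
```

Passing the default loader instead of `fake_loader` attempts to load the real libraries on the current machine.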
- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
[GSOC] Enable OCL for AKAZE (#9330)
* revert e0489cb - reenable OCL for AKAZE
* deal with conversion internally in AKAZE
* pass InputArray directly to AKAZE to allow distinguishing input Mat/UMat. deal with conversion there
* ensure that keypoints orientations are always computed. prevents misuse of internal AKAZE class.
* convert internal AKAZE functions to use InputArray/OutputArray
* make internal functions private in AKAZE
* split OCL and CPU paths in AKAZE
* create 2 separate pyramids, 1 for OCL and 1 for CPU
* template functions that use temporaries to always store them as correct type (UMat/Mat)
* remove variable used only in OCL path
causes unused variable warning
* update AKAZE documentation
* run ocl version only when ocl is enabled
* add tests for OCL path in AKAZE
* relax condition for keypoints angle
[GSOC] Speeding-up AKAZE, part #3 (#9249)
* use finding of scale extrema from fast_akaze
* incorporate finding of extrema and subpixel refinement from Hideaki Suzuki's fast_akaze (https://github.com/h2suzuki/fast_akaze)
* use opencv parallel framework
* do not search for keypoints near the border, where we can't compute sensible descriptors. Earlier fixes (ffd9ad99f4, 2c5389594b) addressed the bugs, but the descriptors were still not 100% correct; this is a better solution
this version produces fewer keypoints with the same threshold. It is more effective at pruning similar keypoints (which do not bring any new information), so we have fewer keypoints, but of high quality. Accuracy is about the same.
* incorporate bugfix from upstream
* fix bug in subpixel refinement
* see commit db3dc22981e856ca8111f2f7fe57d9c2e0286efc in Pablo's repo
* rework finding of scale space extrema
* store just keypoints positions
* store positions in uchar mask for effective spatial search for neighbours
* construct keypoints structs at the very end
* lower inlier threshold in test
* win32 has lower accuracy
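As a rough illustration of skipping keypoints near the border, the sketch below keeps only candidates whose surrounding patch fits inside the image. The fixed `margin` is an assumption made for the example; in AKAZE the usable margin would depend on the keypoint scale and descriptor size, which this sketch does not model:

```python
def inside_border(points, width, height, margin):
    """Keep (x, y) candidates at least `margin` pixels away from every edge."""
    return [(x, y) for (x, y) in points
            if margin <= x < width - margin and margin <= y < height - margin]


candidates = [(1, 1), (10, 10), (19, 10)]
kept = inside_border(candidates, width=20, height=20, margin=3)
```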
[GSOC] Speeding-up AKAZE, part #2 (#8951)
* feature2d: instrument more functions used in AKAZE
* rework Compute_Determinant_Hessian_Response
* this takes 84% of time of Feature_Detection
* run everything in parallel
* compute Scharr kernels just once
* compute sigma more efficiently
* allocate all matrices in evolution without zeroing
* features2d: add one bigger image to tests
* the tests now have images: 600x768, 900x600 and 1385x700 to cover different resolutions
* explicitly zero Lx and Ly
* add Lflow and Lstep to evolution as in original AKAZE code
* reworked computing keypoints orientation
integrated faster function from https://github.com/h2suzuki/fast_akaze
* use standard fastAtan2 instead of getAngle
* compute keypoints orientation in parallel
* fix visual studio warnings
* replace some wrapped functions with direct calls to OpenCV functions
* improved readability for people familiar with opencv
* do not same image twice in base level
* rework diffusivity stencil
* use one pass stencil for diffusivity from https://github.com/h2suzuki/fast_akaze
* improve locality in Create_Scale_Space
* always compute determinant of Hessian and spatial derivatives
* this needs to be computed always as we need derivatives while computing descriptors
* fixed tests of AKAZE with KAZE descriptors which have been affected by this
Currently it computes all first and second order derivatives together with the determinant of the Hessian. For descriptors it would be enough to compute just the first order derivatives, but it is probably not worth optimizing for the scenario where descriptors and keypoints are computed separately, since that is already very inefficient. When computing keypoints and descriptors together it is faster to do it the current way (preserves locality).
* parallelize non linear diffusion computation
* do multiplication right in the nlp diffusivity kernel
* rework kfactor computation
* get rid of sharing buffers when creating scale space pyramid, the performance impact is negligible
* features2d: initialize TBB scheduler in perf tests
* ensures more stable output
* more reasonable profiles, since the first call of parallel_for_ no longer takes a big performance hit
* compute_kfactor: interleave finding of maximum and computing distance
* no need to go twice through the data
* start to use UMats in AKAZE to leverage OpenCL in the future
* fixed bug that prevented computing determinant for scale pyramid of size 1 (just the base image)
* all descriptors now support writing to uninitialized memory
* use InputArray and OutputArray for input image and descriptors, which allows us to make use of a UMat that the user passes to us
* enable use of all existing ocl paths in AKAZE
* all parts that use ocl-enabled functions should use ocl by now
* imgproc: fix dispatching of IPP version when OCL is disabled
* when OCL is disabled the IPP version should always be preferred (even when the dst is UMat)
* get rid of copy in DeterminantHessian response
* this slows CPU version considerably
* do not run in parallel when running with OCL
* store derivatives as UMat in pyramid
* enables the OCL path for computing the determinant of the Hessian
* will allow to compute descriptors on GPU in the future
* port diffusivity to OCL
* diffusivity itself is not a blocker, but this saves us downloading and uploading derivatives
* implement kernel for nonlinear scalar diffusion step
* download the pyramid from GPU just once
we don't want to download matrices ad hoc from the GPU whenever a function in AKAZE needs them. There is a HUGE mapping overhead and, without shared memory support, a LOT of unnecessary transfers.
This maps/downloads matrices just once.
* fix bug with uninitialized values in non linear diffusion
* this was causing spurious segfaults in stitching tests due to propagation of NaNs
* added new test, which checks for NaNs (added new debug asserts for NaNs)
* valgrind now says everything is ok
* add nonlinear diffusion step OCL implementation
* Lt in pyramid changed to UMat, it will be downloaded from GPU along with Lx, Ly
* fix bug in pm_g2 kernel. OpenCV mangles dimensions passed to OpenCL, so we need to check for boundaries in each OCL kernel.
* port computing of determinant to OCL
* computing of determinant is not a blocker, but with this change we don't need to download all spatial derivatives to CPU, we only download determinant
* make Ldet in the pyramid UMat, download it from GPU together with the other parts of the pyramid
* add profiling macros
* fix visual studio warning
* instrument non_linear_diffusion
* remove changes I have made to TEvolution
* TEvolution is used only in KAZE now
* Revert "features2d: initialize TBB scheduler in perf tests"
This reverts commit ba81e2a711.
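The "no need to go twice through the data" point about compute_kfactor can be illustrated with a single-pass loop that computes each gradient magnitude and tracks the running maximum at the same time. This is a simplified sketch on flat lists; the function name and the use of plain Python lists are assumptions, and the remaining steps of the k-factor computation are omitted:

```python
import math


def magnitudes_and_max(lx, ly):
    """Compute gradient magnitudes and their maximum in one pass."""
    magnitudes = []
    hmax = 0.0
    for gx, gy in zip(lx, ly):
        m = math.hypot(gx, gy)
        magnitudes.append(m)
        if m > hmax:
            hmax = m  # update the maximum while the value is at hand
    return magnitudes, hmax


mags, hmax = magnitudes_and_max([3.0, 0.0], [4.0, 1.0])
```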
In OpenCL code in activations.cl, make the type of floating point
literals to be float. Otherwise the values will be interpreted as
doubles, causing Beignet to have type conversion issues.
Previously, only file-based encoding and decoding were supported with
the libtiff library, leading to the possible use of temporary files.
This fixes issue #8483.
Previously, the return value of fwrite and fclose were not properly
checked, leading to possible silent truncation of the data if writing
failed, e.g. due to lack of disk space.
Fixes issue #9251.
RGB2Lab_f added, bugs fixed, moved to float
several bugs fixed
LUT fixed, no switch in tetraInterpolate()
temporary code; to be removed and rewritten
before refactoring
extra interpolations removed, some things to do left
added Lab2RGB_b +XYZ version, etc.
basic version is done, to be sped up
tetra refactored
interpolations: LUT for weights, refactor., etc.
address arithm optimized
initial version of vectorized code added (not compiling now)
compilation fixed, now segfaults
a lot of fixes, vectorization temp. disabled
fixed trilinear shift size, max error dropped from 19 to 10
fixed several bugs (255 vs 256, signed vs unsigned, bIdx)
minor changes
packed: address arithmetics fixed
shorter code
experiments with pure integer calculations
Lab2RGB max error decreased to 2; need to clean the code
ready for vectorization; need cleaning
vectorized, to be debugged
precision fixed, max error is 2
Lab->XYZ shortened
minor fixes
Lab2RGB_f version fixed, to be completely rewritten using _b code
RGB2Lab_f vectorized
minors
moved to separate file
refactored Lab2RGB to float and int versions
minor fix
Lab2RGB_f vectorized
minor refactoring
Lab2RGBint refactored: process methods, vectorize by 4 pix
Lab2RGB_f int version is done
cleanup extra code
code copied to color.cpp
fixed blue idx bug
optimizations enabled when testing; mulFracConst introduced
divConst -> mulFracConst
calc min time in perf instead of avg
minors
process() slightly sped up
Lab2RGB_f: disabled int version
reinterpret added, minor fixes in names
some warnings fixed
changes transferred to color.cpp
RGB2Lab_f code (and trilinear interpolation code) moved to rgb2lab_faster
whitespace
shift negative fixed
more warnings fixed
"constant condition" warnings fixed, little speed up
minor changes
test_photo decolor fixed
changes copied to test_lab.cpp
idx bounds checking in LUT init
several fixes
WIP: softfloat almost integrated
test_lab partially rewritten to SoftFloat
color.cpp rewritten to SoftFloat
test_lab.cpp: accuracy code added
several fixes
RGB2Lab_b testing fixed
splineBuild() rewritten to SoftFloat
accuracy control improved
rounding fixed
Luv <=> RGB: rewritten to SoftFloat
OCL cvtColor Lab and Luv rewritten to SoftFloat
minor fixes
refactored to new SoftFloat interface
round() -> cvRound, etc.
fixed OCL tests
softfloat.cpp: internal functions made static, unused ones removed
meaningful constants
extra lines removed
unused function removed
unfinished work
it works, need to fix TODOs
refactoring; more calls rewritten
mulFracConst removed
constants made bit exact; minors
changes moved to color.cpp
fixed 1 bug and 4 warnings
OCL: fixed constants
pow(x, _1_3f) replaced by cubeRoot(x)
fixed compilation on MSVC32
magic constants explained
file with internal accuracy&speed tests moved to lab_tetra branch
Add constructors taking initializer_list for some of OpenCV data types (#9034)
* Add a constructor taking initializer_list for Matx
* Add a constructor taking initializer list for Mat and Mat_
* Add one more method to initialize Mat to the corresponding tutorial
* Add a note how to initialize Matx
* CV_CXX_11->CV_CXX11
Add gstreamer capture capability for some YUV formats (#8914)
* Add gstreamer capture capability for some YUV formats.(only for gstreamer-1.0)
* avoid cross initialization error
* add checking if pipeline is manualpipeline, for compatibility.
fixed problem in concat layer by disabling memory re-use in layers with multiple inputs
trying to fix the tests when Halide is used to run deep nets
another attempt to fix Halide tests
see if the Halide tests will pass with concat layer fusion turned off
trying to fix failures in halide tests; another try
one more experiment to make halide_concat & halide_enet tests pass
continue attempts to fix halide tests
moving on
uncomment parallel concat layer
seemingly fixed failures in Halide tests and re-enabled concat layer fusion; thanks to dkurt for the patch
BufferPoolController has a non virtual protected destructor (which is legitimate)
However, Visual Studio sees this as a bug, if you enable more warnings, like below
```
add_compile_options(/W3) # level 3 warnings
add_compile_options(/we4265) # warning about missing virtual destructors
```
This is a proposal to silence this warning.
See https://github.com/ivsgroup/boost_warnings_minimal_demo for a demo of the same problem
with boost/exception.hpp
The merge_histogram kernel only needs "BINS" threads to accumulate the
histograms, so it is not efficient to use maxGroupSize directly as the
local size if maxGroupSize is far greater than BINS.
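One plausible way to express this sizing rule is to clamp the local size to the number of bins. The message does not state the exact rule used by the patch, so the helper below is an assumption for illustration only:

```python
def merge_local_size(max_group_size, bins):
    """Pick a local work size that never exceeds the number of bins."""
    return min(max_group_size, bins)


# A device allowing 1024 work-items per group still only needs 256 here.
local_size = merge_local_size(max_group_size=1024, bins=256)
```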
Remove unnecessary Non-ASCII characters from source code (#9075)
* Remove unnecessary Non-ASCII characters from source code
Remove unnecessary Non-ASCII characters and replace them with ASCII
characters
* Remove dashes in the @param statement
Remove dashes and place single space in the @param statement to keep
coding style
* misc: more fixes for non-ASCII symbols
* misc: fix non-ASCII symbol in CMake file
The old error message was not giving any hint which input array (image)
led to an ill conditioned matrix. This made it near impossible to
identify poor images in a larger set.
A better approach would be to implement a checker function which gives
each image a rating before the real calibration is performed. This could
also include some image properties like sharpness, etc.
The objective is to:
* Greatly reduce the number of lines of code in the Java code;
* Make it easy for Java users to add a trackbar and show the results;
* Make the code more similar between C++, Java and Python, making the tutorials more uniform.
The main purpose of this namespace is to avoid the use of incompatible
binaries, which would cause application crashes.
This additional namespace will not impact the "Source code API".
This change allows ABI checks to be maintained (with easy filtering out).
Enable p3p and ap3p in solvePnPRansac (#8585)
* add paper info
* allow p3p and ap3p being RANSAC kernel
* keep previous code
* apply catree's comment
* fix getMat
* add comment
* add solvep3p test
* test return value
* fix warnings
* another round of dnn optimization:
* increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
* improved SIMD optimization of pooling layer, optimized average pooling
* cleaned up convolution layer implementation
* made activation layer "attachable" to all other layers, including fully connected and addition layer.
* fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it describes the topology.
* greatly optimized permutation layer, which improved SSD performance
* parallelized element-wise binary/ternary/... ops (sum, prod, max)
* also, added missing copyrights to many of the layer implementation files
* temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders
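The alignment change in this list (16 to 64 bytes) comes down to rounding sizes or addresses up to a multiple of a power of two. A small sketch of that arithmetic, using the usual bit-mask trick; `align_size` is written here as a standalone illustrative helper and is not a binding to the OpenCV function of the same purpose:

```python
def align_size(size, n):
    """Round size up to the nearest multiple of n (n must be a power of two)."""
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    # -n has all bits set above the low log2(n) bits, so the mask clears them.
    return (size + n - 1) & -n


old = align_size(100, 16)   # 16-byte alignment
new = align_size(100, 64)   # 64-byte alignment
```

A 64-byte boundary matches the width of an AVX-512 register, which is the motivation given in the list above.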
Previous commit, 6f39f9a, tries to fix the color issue for emulator. But the condition for detecting emulator is incomplete, e.g. it stops working for emulators using Google Play, whose Build.BRAND=="google". https://stackoverflow.com/a/21505193 shows a more accurate condition for this.
[GSOC] Speeding-up AKAZE, part #1 (#8869)
* ts: expand arguments before stringifications in CV_ENUM and CV_FLAGS
added protective macros to always force macro expansion of arguments. This allows using CV_ENUM and CV_FLAGS with macro arguments.
* feature2d: unify perf test
use the same test for all detectors/descriptors we have.
* added AKAZE tests
* features2d: extend perf tests
* add BRISK, KAZE, MSER
* run all extract tests on AKAZE keypoints, so that the test is more comparable for the speed of extraction
* feature2d: rework opencl perf tests
use the same configuration as cpu tests
* feature2d: fix descriptors allocation for AKAZE and KAZE
fix crash when descriptors are UMat
* feature2d: name enum to fix build with older gcc
* Revert "ts: expand arguments before stringifications in CV_ENUM and CV_FLAGS"
This reverts commit 19538cac1e.
This wasn't a great idea after all. There are a lot of flags implemented as #define that we don't want to expand.
* feature2d: fix expansion problems with CV_ENUM in perf
* expand arguments before passing them to CV_ENUM. This does not need modifications of CV_ENUM.
* added include guards to `perf_feature2d.hpp`
* feature2d: fix crash in AKAZE when using KAZE descriptors
* out-of-bound access in Get_MSURF_Descriptor_64
* this happened reliably when running on provided keypoints (not computed by the same instance)
* feature2d: added regression tests for AKAZE
* test with both MLDB and KAZE keypoints
* feature2d: do not compute keypoints orientation twice
* always compute keypoints orientation, when computing keypoints
* do not recompute keypoint orientation when computing descriptors
this allows to test detection and extraction separately
* features2d: fix crash in AKAZE
* out-of-bound reads near the image edge
* same as the bug in KAZE descriptors
* feature2d: refactor invariance testing
* split detectors and descriptors tests
* rewrite to google test to simplify debugging
* add tests for AKAZE and one test for ORB
* stitching: add tests with AKAZE feature finder
* added basic stitching cpu and ocl tests
* fix bug in AKAZE wrapper for stitching pipeline causing lots of
! OPENCV warning: getUMat()/getMat() call chain possible problem.
! Base object is dead, while nested/derived object is still alive or processed.
! Please check lifetime of UMat/Mat objects!
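The `CV_ENUM` items above (forcing expansion, the revert, and expanding at the call site instead) all turn on one preprocessor rule: the `#` operator stringifies its argument without expanding it, so a second macro layer is needed whenever expansion is wanted. A minimal sketch of that rule, using hypothetical macro names rather than the actual `CV_ENUM` implementation:

```cpp
#include <cassert>
#include <string>

// Hypothetical names for illustration; these are not OpenCV macros.
#define STRINGIFY_RAW(x) #x                     // '#' suppresses expansion of x
#define STRINGIFY_EXPANDED(x) STRINGIFY_RAW(x)  // x is expanded before the inner call

#define MY_FLAG 42  // a flag implemented as a #define

// Keeps the flag's name, which is what enum printing usually wants.
inline std::string rawName() { return STRINGIFY_RAW(MY_FLAG); }

// Loses the name and yields the value, the side effect that motivated the revert.
inline std::string expandedName() { return STRINGIFY_EXPANDED(MY_FLAG); }
```

With these definitions `rawName()` yields the text `MY_FLAG`, while `expandedName()` yields `42`, which is why unconditional expansion is undesirable for flags defined with `#define`.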
general:
- all iterative tests have been replaced with parameterized tests
- old-style try..catch tests have been modified to use EXPECT_/ASSERT_ gtest macros
- added temporary files cleanup
- modified MatComparator error message formatting
imgcodecs:
- test_grfmt.cpp split to test_jpg.cpp, test_png.cpp, test_tiff.cpp, etc.
videoio:
- added public HAVE_VIDEO_INPUT, HAVE_VIDEO_OUTPUT definitions to cvconfig.h
- built-in MotionJPEG codec could not be tested on some platforms (read_write test was disabled if ffmpeg is off, encoding/decoding was handled by ffmpeg otherwise).
- image-related tests moved to imgcodecs (Videoio_Image)
- several property get/set tests have been combined into one
- added MotionJPEG test video to opencv_extra
Fixed snprintf for VS 2013 (#8816)
* Fixed snprintf for VS 2013
* snprintf: removed declaration from header, changed implementation
* cv_snprintf corrected according to comments
* update snprintf patch
* avoid link error (move the implementation of software version to header)
* make getConvertFuncFp16 local (move from precomp.hpp to convert.hpp)
* fix error on 32bit x86
Parallelize Canny with custom gradient (#8694)
* New Canny implementation. Restructuring code in parallelCanny class. Align mag buffer and map.
* Fix warnings.
* Missing SIMD check added.
* Replaced local trailingZeros in contours.cpp. Use alignSize in canny.cpp
* Fix warnings in alignSize and allocate just minimum extra columns.
* Fix another warning in map.create.
* Exchange for loop by do loop to avoid double check at the beginning.
Define extra SIMD CANNY_CHECK to avoid unnecessary continue.
* Correct the existing documented T-API functions to match the doxygen format.
* docs: fix comments style
* T-API documentation: minor formatting changes
Aravis: Use of std::fabs, added support for 16-bit mono files and exposure compensation (#8711)
* Use of std::fabs, added support for 16-bit mono files
* Correction in priority2 stage & adding exposure compensation
- persistence.cpp code expects special sizeof value for passed structures
- this assumption leads to memory corruption problems
- fixed/worked around test to prevent memory corruption on Linux 32-bit systems
Updated integrations for:
cv::split
cv::merge
cv::insertChannel
cv::extractChannel
cv::Mat::convertTo - now with scaled conversions support
cv::LUT - disabled due to performance issues
Mat::copyTo
Mat::setTo
cv::flip
cv::copyMakeBorder - currently disabled
cv::polarToCart
cv::pow - ipp pow function was removed due to performance issues
cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance
cv::countNonZero
cv::minMaxIdx
cv::norm
cv::canny - new integration. Disabled for threaded;
cv::cornerHarris
cv::boxFilter
cv::bilateralFilter
cv::integral
Add support for std::array<T, N> (#8535)
* Add support for std::array<T, N>
* Add std::array<Mat, N> support
* Remove UMat constructor with std::array parameter
Gemm kernels for Intel GPU (#8104)
* Fix an issue with Kernel object reset release when consecutive Kernel::run calls
Kernel::run launches OCL GPU kernels and sets an event callback function
to decrease the ref count of the UMat or remove the UMat when the launched workloads
are completed. However, some OCL kernels require multiple calls of the
Kernel::run function with some kernel parameter changes (e.g., input
and output buffer offset) to get the final computation result.
In that case, the current implementation requires unnecessary
synchronization and cleanupMat.
This fix requires the user to specify whether there will be more work or not.
If there is no remaining computation, Kernel::run will reset the
kernel object.
Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
* GEMM kernel optimization for Intel GEN
The optimized kernels use the cl_intel_subgroups extension for better
performance.
Note: These optimized kernels will be part of ISAAC in a code generation
way under MIT license.
Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
* Fix API compatibility error
This patch fixes an OCV API compatibility error. The error was reported
due to the interface changes of Kernel::run. To resolve the issue,
an overloaded function of Kernel::run is added. It takes a flag indicating
whether there is more work to be done with the kernel object without
releasing resources related to it.
Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
* Renaming intel_gpu_gemm.cpp to intel_gpu_gemm.inl.hpp
Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
* Revert "Fix API compatibility error"
This reverts commit 2ef427db91.
Conflicts:
modules/core/src/intel_gpu_gemm.inl.hpp
* Revert "Fix an issue with Kernel object reset release when consecutive Kernel::run calls"
This reverts commit cc7f9f5469.
* Fix the case of uninitialized D
When C is null and beta is non-zero, D is used without initialization.
This resolves the issue
Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
* fix potential output error due to 0 * nan
Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
* whitespace fix, eliminate non-ASCII symbols
* fix build warning
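The two GEMM fixes above (uninitialized D, and 0 * nan) both concern the update D = alpha * A * B + beta * C. A minimal CPU sketch of the safe form is shown below. It assumes row-major storage and uses a hypothetical helper name, `gemmReference`; it is not the OpenCL kernel from the patch, only an illustration of why C must not be read when it is absent or when beta is zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference D = alpha * A * B + beta * C for row-major matrices.
// A is m x k, B is k x n, C (optional) and D are m x n.
// Hypothetical helper for illustration; not the OpenCL kernel from the patch.
inline std::vector<float> gemmReference(const std::vector<float>& A,
                                        const std::vector<float>& B,
                                        const std::vector<float>* C,
                                        float alpha, float beta,
                                        std::size_t m, std::size_t k, std::size_t n)
{
    std::vector<float> D(m * n, 0.0f);  // D always starts from a defined value
    const bool useC = (C != nullptr) && (beta != 0.0f);
    for (std::size_t i = 0; i < m; ++i)
    {
        for (std::size_t j = 0; j < n; ++j)
        {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];
            float out = alpha * acc;
            if (useC)
                out += beta * (*C)[i * n + j];  // skipped when beta == 0, so NaN in C cannot leak
            D[i * n + j] = out;
        }
    }
    return D;
}
```

Multiplying by beta unconditionally would turn every output into NaN as soon as C holds a NaN, because 0 * NaN is NaN in IEEE arithmetic; the explicit branch avoids that.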
The method does cvCheckPixelBackgroundNP (which reads bgmodel)
before it is ever updated (in cvUpdatePixelBackgroundNP). This
initialization is thus needed to avoid reads of uninitialized values.
New p3p algorithm (accepted by CVPR 2017) (#8301)
* add p3p source code
* indent 4
* update publication info
* fix filename
* interface done
* plug in done, test needed
* debugging
* for test
* a working version
* clean p3p code
* test
* test
* fix warning, blank line
* apply patch from @catree
* add reference info
* namespace, indent 4
* static solveQuartic
* put small functions to anonymous namespace
AffineBasedEstimator crashed when called with an existing CameraParameters.
This happens e.g. when using Stitcher in SCANS mode.
CameraParameters is now cleared before any calculation is executed.
Use identity matrix if homography finding failed. Current behavior zeros out all points.
Update circlesgrid.cpp
Addressed comments
Update circlesgrid.cpp
removed whitespace
In the previous version only the default stream was/could be used, i.e.
cv::cuda::Stream::Null().
With this change, HOG::compute() will now run in parallel over different
cuda::Streams.
The code has been reordered so that all data allocation is completed
first, then all the kernels are run in parallel over streams.
Fix #8177
Added assertions to remap and warpAffine functions
As @mshabunin said, remap and warpAffine functions do not support more than 4 channels in
Bicubic and Lanczos4 interpolation modes. Assertions were added. Appropriate test was changed.
resolves #8272
This test case uses a matrix with more dimensions than columns. Without
the fix in
b45e784beb
this crashes with a segmentation fault, hangs or simply fails with wrong
values.
Stitcher will now make a working copy of the CameraParams object to avoid side effects when composing Panorama.
Makes it possible to estimate transform once and then compose multiple panoramas. Useful for setup with fixed cameras.
- use suffixes like '.avx.cpp'
- added CMake-generated files for '.simd.hpp' optimization approach
- wrap HAL intrinsic headers into separate namespaces for different build flags
- automatic vzeroupper insertion (via CV_INSTRUMENT_REGION macro)
* export SVM::trainAuto to python #7224
* workaround for ABI compatibility of SVM::trainAuto
* add parameter comments to new SVM::trainAuto function
* Export ParamGrid member variables
Warping a matrix with more than 4 channels using BORDER_CONSTANT and
INTER_NEAREST, INTER_CUBIC or INTER_LANCZOS4 interpolation led to
undefined behaviour. This commit changes the behavior of these methods
to be similar to that of INTER_LINEAR. Changed the scope of some of the
variables to more local. Modified some tests to be able to detect the
error described.
* Introduced OSGi Blueprint XML file and Bean class to automatically load the native library.
* Introduced integration testing module to deploy to Karaf OSGi implementation.
* Clears library executable stack flag during build.
* Updated README document.
Implement cv::cuda::calcHist with mask support (#8367)
* Implement cuda::calcHist with mask
* Fix documentation build warning
* Have their own step sizes for src and mask. Fix review comment.
`template<typename _Tp> inline const _Tp* Mat_<_Tp>::operator [](int y) const` does not support 3d matrix since it checks rows.
This operator[] shall check size.p[0] instead.
added 64b optimization for 3 channels case
not added 64b optimization for 4 channels case since timings did not
show any improvement
split ICV_HLINE cases into inline functions instead of macro for code
size reduction, without significant speed drawback at first sight
Finished with several samples support, need regression testing
Gave a more relevant name to function (getVotes)
Finished implicit implementation
Removed printf, finished regression testing
Fixed conversion warning
Finished test for Rtrees
Fixed documentation
Initialized variable
Added doxygen documentation
Added parameter name
medianBlur called with "empty" source and ksize >= 7 crashes the application with an access violation. With this extra assert this is avoided and the application may normally catch the thrown exception.
* Fix the documentation for Mat::diag(int).
Fix issue #8181
* Fix the documentation for Mat::diag(int).
Fix issue #8181.
* Add support for printing out cv::Complex.
* Remove extra spaces.
* cv::Complex is submitted as a new pull request.
Add support for printing out cv::Complex. (#8208)
* Add support for printing out cv::Complex.
* Conform to the format of std::complex.
* Remove extra spaces.
* Remove extra spaces.
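For reference, the `std::complex` stream format mentioned above is `(real,imag)`. The sketch below mimics it for a hypothetical stand-in type, `ComplexLike`, with `re` and `im` members; it is not the actual `cv::Complex` patch, and `toText` is a helper introduced only for this example.

```cpp
#include <cassert>
#include <complex>
#include <sstream>
#include <string>

// Hypothetical stand-in for a complex type with `re` and `im` members.
template <typename T>
struct ComplexLike { T re, im; };

// Print as "(re,im)", the same layout operator<< uses for std::complex.
template <typename T>
std::ostream& operator<<(std::ostream& os, const ComplexLike<T>& c)
{
    return os << "(" << c.re << "," << c.im << ")";
}

// Format any streamable value into a string (helper for comparing outputs).
template <typename V>
std::string toText(const V& v)
{
    std::ostringstream os;
    os << v;
    return os.str();
}
```

Comparing `toText(ComplexLike<double>{1.5, -2.0})` against `toText(std::complex<double>(1.5, -2.0))` is a quick way to check that the two layouts agree.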
Although both `cl_platform_info` and `cl_device_info` are defined as macro `cl_uint`, it needs to use `cl_platform_info` to get
the platform information.
- don't use undefined flag=0. It should be CONSTANT instead.
- don't allow 'UMat* m=NULL' argument (except LOCAL/CONSTANT flags).
This case is not handled well to provide NULL __global pointers.
It is better to use '-D' macro defines instead (at least for performance)
Fix typos in the documentation for AutoBuffer. (#8197)
* Allocate 1000 floats to match the documentation
Fix the documentation of `AutoBuffer`. By default, the following code
```.cpp
cv::AutoBuffer<float> m;
```
allocates only 264 floats. But the comment in the demonstration code says it allocates 1000 floats, which is
not correct.
* fix typo in the comment.
fix for opencv/opencv#8105, compilation issue with mingw32 (in
google/googletest#721 a similar issue was solved and the reason was
described as MinGW defines _CRITICAL_SECTION and _RTL_CRITICAL_SECTION
as two separate (equivalent) structs, instead of using typedef)
CMake: Building Dynamic Framework on iOS (#8009)
* Updated python script with dynamic parameter
Updated python script to build static library by default
Updated python script to include bitcode flag
Added bitcode flag to c flags
Fixed directories and targets with static
Bitcode parameter fixed
Fixed script for static library
Fixed parameters in build function
Updated cmake common toolchain
Added changes to OpenCV Utils
Updates to cmake
Added cache internal
Updates to common toolchain
Fixed path in framework destination and added UIKit dependency
Dynamic plist for framework
Lib version removed hardcoded value
Removed trailing whitespace in toolchain
* Removed trailing whitespace
* Fixed typo in comment
* Renamed bitcode variable to bitcodedisabled
* Fixed target device family
Append zero to trailing decimal place for FileStorage JSON write of a float or double value (#7952)
* Fix for FileStorage JSON write of a float or double value that has no fractional part; appends a zero character after the trailing decimal place to meet JSON standard.
* strlen return to size_t type rather than unnecessary cast to int
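The JSON grammar requires at least one digit after a decimal point, so text such as `1.` is not a valid number. The rule described above can be sketched as follows; the function name is hypothetical and this is not the FileStorage code itself, only the string-level fix it describes.

```cpp
#include <cassert>
#include <string>

// If the formatted number ends with a bare decimal point ("1."), append '0' so the
// result ("1.0") is a valid JSON number. Hypothetical helper, not the FileStorage code.
inline std::string appendZeroAfterTrailingPoint(std::string text)
{
    if (!text.empty() && text.back() == '.')
        text.push_back('0');
    return text;
}
```

Values that already carry a fractional part, such as `2.5`, pass through unchanged.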
* moved BLAS/LAPACK detection scripts from opencv_contrib/dnn to the main repository.
* trying to fix the bug with undefined symbols sgesdd_ and dgesdd_
* removed extra whitespaces; disabled LAPACK on IOS
* OpenVX HAL updated to use generic OpenVX wrappers
* vxErr class from OpenVX HAL replaced with ivx::WrapperError
* reduced usage of vxImage class from OpenVX HAL replaced with ivx::Image
* vxImage class rewritten as ivx::Image subclass that calls swapHandle prior release
* Fix OpenVX HAL build
* Fix for review comments
Added OpenVX based processing to FAST (#7720)
* added wrapper for OVX FAST & fixes to IVX wrappers
* fixed type checks in wrappers, array downloading code simplified
* rewritten for new macro use
OpenVX pyrDown wrappers (#7793)
* wrappers for vx_pyramid added
* initial version of pyrDown() wrapper added
* disabled for Khronos
* rewritten for new macro use; border mode added to node
OpenVX optical flow PyrLK wrappers added (#7774)
* wrappers for vx_pyramid added
* initial version of Optical Flow PyrLK wrappers added
* array downloading code simplified
* disabled due to bad accuracy; fixed bugs, e.g. vendor-specific ones
* rewritten for new macro use
Currently, to select a submatrix of a N-dimensional matrix, it requires
two lines of code while only one line of code is required if using a 2D
array.
I added functionality to be able to select an N-dim submatrix using a
vector list instead of a Range pointer. This allows initializer lists to
be used for a one-line selection.
This allows for an N-dimensional array to be setup in one line instead of two when using C++11 initializer lists. cv::Mat(3, {zDim, yDim, xDim}, ...) can be used instead of having to create an int pointer to hold the size array.
The current camera model is only valid up to 180° FOV; for larger FOV the
undistort loop does not converge.
Clip values so we still get plausible results for super fisheye images >
180°.
Add new 5x5 gaussian blur kernel for CV_8UC1 format,
it is 50% ~ 70% faster than current ocl kernel in the perf test.
Signed-off-by: Li Peng <peng.li@intel.com>
Add support for image save parameters in cv::VideoWriter.
This change makes it possible to set the same parameters as
cv::imwrite() via cv::VideoWriter::set( cv::IMWRITE_*, value ).
Add new OpenCL kernels for bicubic interpolation, it is 20% faster
than current warp image kernel with bicubic interpolation.
Signed-off-by: Li Peng <peng.li@intel.com>
Add new ocl kernels for warpAffine and warpPerspective,
The average performance improvement is about 30%. The new
ocl kernels require CV_8UC1 format and support nearest
neighbor and bilinear interpolation.
Signed-off-by: Li Peng <peng.li@intel.com>
* restore Google Test 1.7.0 (get patch)
* ts: update Google Test to 1.8.0 release
https://github.com/google/googletest
* ts: re-apply OpenCV patch for gtest
* ts: fixes for gtest 1.8.0
* ts: workaround MSVS2015 problem in gtest
ExifReader::getExif may enter an infinite loop with a JPEG image that has no EOI.
For example, bytesToSkip may be set to 0, and the fseek call then behaves like fseek(f, -2, SEEK_CUR) for an image that ends with RST7 (FF D7) instead of EOI.
This ocl kernel is 46%~171% faster than current laplacian 3x3
ocl kernel in the perf test, with image format "CV_8UC1".
Signed-off-by: Li Peng <peng.li@intel.com>
Change contour test images to be very wide (#7464)
* Change contour test images to be very wide (#7409, #7458)
Unfortunately, slows down the tests.
* Decrease the number of contour test cases, in order to (at least partially) offset the test run duration increase caused by making the test images wider
* Don't test with very wide images on 32-bit architectures
Maximum depth limit var was added to the instrumentation structure;
Trace names output console output fix: improper tree formatting could happen;
Output in case of error was added;
Custom regions improvements;
Improved timing and weight calculation for parallel regions; New TC (threads counter) value to indicate how many different threads accessed particular node;
parallel_for, warnings fixes and ReturnAddress code from Alexander Alekhin;
This ocl kernel is for 3x3 kernel size and CV_8UC1 format
It is 115% ~ 300% faster than current ocl path in perf test
python ./modules/ts/misc/run.py -t imgproc --gtest_filter=OCL_GaussianBlurFixture*
Signed-off-by: Li Peng <peng.li@intel.com>
This kernel is for CV_8UC1 format and 3x3 kernel size,
It is about 33% ~ 55% faster than current ocl kernel with below perf test
python ./modules/ts/misc/run.py -t imgproc --gtest_filter=OCL_ErodeFixture*
python ./modules/ts/misc/run.py -t imgproc --gtest_filter=OCL_DilateFixture*
Also add accuracy test cases for this kernel, the test command is
./bin/opencv_test_imgproc --gtest_filter=OCL_Filter/MorphFilter3x3*
Signed-off-by: Li Peng <peng.li@intel.com>
Fixes #7603, which was caused by OCVViewPort::icvmouseProcessing
not being declared as virtual, and hence was not overridden by
DefaultViewPort::icvmouseProcessing (which does the inverse
coordinate mapping).
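The dispatch rule behind this fix can be shown in isolation. The sketch below uses hypothetical stand-in classes (not the real view port classes) to show that a call through a base reference reaches the derived implementation only when the method is declared `virtual` in the base.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the view port classes; only the dispatch rule matters here.
struct BaseNonVirtual {
    std::string handle() const { return "base"; }
};
struct DerivedNonVirtual : BaseNonVirtual {
    std::string handle() const { return "derived"; }  // hides, does not override
};

struct BaseVirtual {
    virtual ~BaseVirtual() = default;
    virtual std::string handle() const { return "base"; }
};
struct DerivedVirtual : BaseVirtual {
    std::string handle() const override { return "derived"; }
};

// Calls through a base reference, as generic window code would.
inline std::string callThroughBase(const BaseNonVirtual& v) { return v.handle(); }
inline std::string callThroughBase(const BaseVirtual& v) { return v.handle(); }
```

Passing a `DerivedNonVirtual` object runs the base implementation, whereas passing a `DerivedVirtual` object runs the derived one. Marking the derived method `override` also lets the compiler reject the mistake if the base declaration is not virtual.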
modules/objectdetect/src/detection_based_tracker.cpp: made unique_lock<mutex> local to each function
samples/cpp/dbt_face_detection.cpp: fixed warnings on loop in Visual Studio
* use hasSIMD128 rather than calling checkHardwareSupport
* add SIMD check in spatialgradient.cpp
* add SIMD check in stereosgbm.cpp
* add SIMD check in canny.cpp
checks whether the window exists and is visible. On QT closing a window
merely hides it, so the common hack for checking whether a window exists
exists = cv2.getWindowProperty(.., 0) >= 0
does not work.
The optimization is for CV_8UC1 format and 3x3 box filter,
it is 15%~87% faster than current ocl kernel with below perf test
./modules/ts/misc/run.py -t imgproc --gtest_filter=OCL_BlurFixture*
Also add test cases for this ocl kernel.
Signed-off-by: Li Peng <peng.li@intel.com>
A bug in ICC improperly identified the first parameter as "void*"
rather than the proper "volatile long*". This is scheduled to be
fixed in ICC in a future release.
This patch casts only to a "long*" to preserve backwards compatibility
with the ICC 16 and ICC 17 releases.
In YAML 1.0 the colon is mandatory. See http://yaml.org/spec/1.0/#id2558635.
This also allows prior releases to read YAML files created with the current version.
[GSOC] New camera model for stitching pipeline
* implement estimateAffine2D
estimates affine transformation using robust RANSAC method.
* uses RANSAC framework in calib3d
* includes accuracy test
* uses SVD decomposition for solving 3 point equation
* implement estimateAffinePartial2D
estimates limited affine transformation
* includes accuracy test
* stitching: add affine matcher
initial version of matcher that estimates affine transformation
* stitching: added affine transform estimator
initial version of estimator that simply chains transformations in homogeneous coordinates
* calib3d: rename estimateAffine3D test
test Calib3d_EstimateAffineTransform renamed to Calib3d_EstimateAffine3D. This is more descriptive and prevents confusion with estimateAffine2D tests.
* added perf test for estimateAffine functions
tests both estimateAffine2D and estimateAffinePartial2D
* calib3d: compare error in square in estimateAffine2D
* incorporates fix from #6768
* rerun affine estimation on inliers
* stitching: new API for parallel feature finding
due to ABI breakage new functionality is added to `FeaturesFinder2`, `SurfFeaturesFinder2` and `OrbFeaturesFinder2`
* stitching: add tests for parallel feature find API
* perf test (about linear speed up)
* accuracy test compares results with serial version
* stitching: use dynamic_cast to overcome ABI issues
adding parallel API to FeaturesFinder breaks ABI. This commit uses dynamic_cast and hardcodes thread-safe finders to avoid breaking ABI.
This should be replaced by proper method similar to FeaturesMatcher on next ABI break.
* use estimateAffinePartial2D in AffineBestOf2NearestMatcher
* add constructor to AffineBestOf2NearestMatcher
* allows choosing between full affine transform and partial affine transform. Other params are the same as for BestOf2NearestMatcher
* added protected field
* samples: stitching_detailed support affine estimator and matcher
* added new flags to choose matcher and estimator
* stitching: rework affine matcher
represent transformation in homogeneous coordinates
affine matcher: remove duplicate code
rework flow to get rid of duplicate code
affine matcher: do not center points to (0, 0)
it is not needed for affine model. it should not affect estimation in any way.
affine matcher: remove unneeded cv namespacing
* stitching: add stub bundle adjuster
* adds stub bundle adjuster that does nothing
* can be used in place of standard bundle adjusters to omit bundle adjusting step
* samples: stitching detailed, support no bundle adjust
* uses new NoBundleAdjuster
* added affine warper
* uses R to get whole affine transformation and propagates rotation and translation to plane warper
* add affine warper factory class
* affine warper: compensate transformation
* samples: stitching_detailed add support for affine warper
* add Stitcher::create method
this method follows similar constructor methods and returns smart pointer. This allows constructing Stitcher according to OpenCV guidelines.
* supports multiple stitcher configurations (PANORAMA and SCANS) for convenient setup
* returns cv::Ptr
* stitcher: dynamically determine correct estimator
we need to use affine estimator for affine matcher
* preserves ABI (but add hints for ABI 4)
* uses dynamic_cast hack to inject correct estimator
* sample stitching: add support for multiple modes
shows how to use different configurations of stitcher easily (panorama stitching and scans affine model)
* stitcher: find features in parallel
use new FeatureFinder API to find features in parallel. Parallelized using TBB.
* stitching: disable parallel feature finding for OCL
it does not bring much speedup to run features finder in parallel when OpenCL is enabled, because finder needs to wait for OCL device.
Also, currently ORB is not thread-safe when OCL is enabled.
* stitching: move matcher tests
move matchers tests perf_stich.cpp -> perf_matchers.cpp
* stitching: add affine stitching integration test
test basic affine stitching (SCANS mode of stitcher) with images that have only translation between them
* enable surf for stitching tests
stitching.b12 test was failing with surf
investigated the issue, surf is producing good result. Transformation is only slightly different from ORB, so that resulting pano does not exactly match ORB's result. That caused sanity check to fail.
* added size checks similar to other tests
* sanity check will be applied only for ORB
* stitching: fix wrong estimator choice
if case was exactly wrong, estimators were chosen wrong
added logging for estimated transformation
* enable surf for matchers stitching tests
* enable SURF
* rework sanity checking. Check estimated transform instead of matches. Est. transform should be more stable and comparable between SURF and ORB.
* remove regression checking for VectorFeatures tests. It has a lot of data and the test is the same as the previous one except that it tests a different vector size for performance, so sanity checking does not add any value here. Added basic sanity asserts instead.
* stitching tests: allow relative error for transform
* allows .01 relative error for estimated homography sanity check in stitching matchers tests
* fix VS warning
stitching tests: increase relative error
increase relative error to make it pass on all platforms (results are still good).
stitching test: allow bigger relative error
transformation can differ in small values (with small absolute difference, but large relative difference). transformation output still looks usable for all platforms. This difference affects only mac and windows, linux passes fine with small difference.
* stitching: add tests for affine matcher
uses s1, s2 images. added also new sanity data.
* stitching tests: use different data for matchers tests
this data should yield more stable transformation (it has much more matches, especially for surf). Sanity data regenerated.
* stitching test: rework tests for matchers
* separated rotation and translations as they are different by scale.
* use appropriate absolute error for them separately. (relative error does not work for values near zero.)
* stitching: fix affine warper compensation
calculation of rotation and translation extracted for plane warper was wrong
* stitching test: enable surf for opencl integration tests
* enable SURF with correct guard (HAVE_OPENCV_XFEATURES2D)
* add OPENCL guard and correct namespace as usual for opencl tests
* stitching: add ocl accuracy test for affine warper
test consistent results with ocl on and off
* stitching: add affine warper ocl perf test
add affine warper to existing warper perf tests. Added new sanity data.
* stitching: do not overwrite inliers in affine matcher
* estimation is run a second time on inliers only; inliers produced in the second run will therefore not be correct for all matches
* calib3d: add Levenberg–Marquardt refining to estimateAffine2D* functions
this adds affine Levenberg–Marquardt refining to estimateAffine2D functions similar to what is done in findHomography.
implements Levenberg–Marquardt refining for both full affine and partial affine transformations.
* stitching: remove reestimation step in affine matcher
reestimation step is not needed. estimateAffine2D* functions are running their own reestimation on inliers using the Levenberg-Marquardt algorithm, which is better than simply rerunning RANSAC on inliers.
* implement partial affine bundle adjuster
bundle adjuster that expects affine transform with 4DOF. Refines parameters for all cameras together.
stitching: fix bug in BundleAdjusterAffinePartial
* use the inverse properly
* use static buffer for inverse to speed it up
* samples: add affine bundle adjuster option to stitching_detailed
* add support for using affine bundle adjuster with 4DOF
* improve logging of initial intrinsics
* stitching: add affine bundle adjuster test
* fix build warnings
* stitching: increase limit on sanity check
prevents spurious test failures on mac. values are still pretty fine.
* stitching: set affine bundle adjuster for SCANS mode
* fix bug with AffineBestOf2NearestMatcher (we want to select affine partial mode)
* select right bundle adjuster
* stitching: increase error bound for matcher tests
* this prevents failure on mac. transformation is still ok.
* stitching: implement affine bundle adjuster
* implements affine bundle adjuster that is using full affine transform
* existing test case modified to test both affinePartial and full affine bundle adjuster
* add stitching tutorial
* show basic usage of stitching api (Stitcher class)
* stitching: add more integration test for affine stitching
* added new datasets to existing testcase
* removed unused include
* calib3d: move `haveCollinearPoints` to common header
* added comment to make clear that this also checks for points that are too close
* calib3d: redone checkSubset for estimateAffine* callback
* use common function to check collinearity
* this also ensures that point will not be too close to each other
* calib3d: change estimateAffine* functions API
* more similar to `findHomography`, `findFundamentalMat`, `findEssentialMat` and similar
* follows standard recommended semantic INPUTS, OUTPUTS, FLAGS
* allows to disable refining
* supported LMEDS robust method (tests yet to come) along with RANSAC
* extended docs with some tips
* calib3d: rewrite estimateAffine2D test
* rewrite in googletest style
* parametrize to test both robust methods (RANSAC and LMEDS)
* get rid of boilerplate
* calib3d: rework estimateAffinePartial2D test
* rework in googletest style
* add testing for LMEDS
* calib3d: rework estimateAffine*2D perf test
* test for LMEDS speed
* test with/without Levenberg-Marquardt
* remove sanity checking (this is covered by accuracy tests)
* calib3d: improve estimateAffine*2D tests
* test transformations in loop
* improves test by testing more potential transformations
* calib3d: rewrite kernels for estimateAffine*2D functions
* use analytical solution instead of SVD
* this version is faster especially for smaller amount of points
* calib3d: tune up perf of estimateAffine*2D functions
* avoid copying inliers
* avoid converting input points if not necessary
* check only `from` point for collinearity, as `to` does not affect stability of transform
* tutorials: add commands examples to stitching tutorials
* add some examples how to run stitcher sample code
* mention stitching_detailed.cpp
* calib3d: change computeError for estimateAffine*2D
* do error computing in floats instead of doubles
this has the required precision, and we were storing the result in float anyway. This makes the code faster and allows auto-vectorization by smart compilers.
* documentation: mention estimateAffine*2D function
* refer to new functions on appropriate places
* prefer estimateAffine*2D over estimateRigidTransform
* stitching: add camera models documentations
* mention camera models in module documentation to give user a better overview and reduce confusion
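The "analytical solution instead of SVD" item for the estimateAffine*2D kernels can be illustrated for the 4-DOF (partial affine) case, where two point correspondences determine the transform exactly. The following is a minimal sketch using complex arithmetic; the struct and function names are hypothetical, and this is not the kernel from the commit.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// 4-DOF transform: x' = a*x - b*y + tx, y' = b*x + a*y + ty.
struct PartialAffine { double a, b, tx, ty; };

// Treat each 2D point as a complex number z = x + i*y; then z' = c*z + t
// with c = a + i*b (rotation plus uniform scale) and t = tx + i*ty.
// Requires from1 != from2. Hypothetical helper, not the OpenCV kernel.
inline PartialAffine partialAffineFromTwoPoints(std::complex<double> from1,
                                                std::complex<double> from2,
                                                std::complex<double> to1,
                                                std::complex<double> to2)
{
    const std::complex<double> c = (to2 - to1) / (from2 - from1);
    const std::complex<double> t = to1 - c * from1;
    return { c.real(), c.imag(), t.real(), t.imag() };
}
```

Because the solution is closed-form, there is no decomposition to run per sample, which is consistent with the note above that the analytical version is faster for small point counts. A robust estimator would call such a kernel once per random sample and then score the result against all points.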
Aravis several updates
* Fix addressing camera with id=0
* Aravis buffer property control & status added
* Modification of autoexposure algorithm, read frame ID from aravis + new properties
* Change of macro name
* VideoCapture now returns no frame on camera disconnection
* Allow aravis-0.4 usage, proper camera object release.
Aravis SDK: Basic software based autoexposure control
* Basic software based autoexposure control
* Aravis autoexposure: skip frame taken while changing exposure setup
Fix findContours crash for very large images (#7451)
* Cast step to size_t in order to avoid integer overflow when processing very large images
* Change assert to CV_Assert
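The overflow addressed by the `size_t` cast can be sketched as follows. The helper name is hypothetical; the point is only that the operands are widened before the multiplication, so the product of a row stride and a row index is not computed in `int`.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of row `y` in an image whose rows are `step` bytes apart.
// Widening to size_t before multiplying keeps the product from overflowing int
// on 64-bit builds. Hypothetical helper for illustration.
inline std::size_t rowOffset(int step, int y)
{
    return static_cast<std::size_t>(step) * static_cast<std::size_t>(y);
}
```

With `step = 100000` and `y = 100000` the product is 10,000,000,000, which does not fit in a 32-bit `int` but is representable in a 64-bit `size_t`.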
* use __GNUC_MINOR__ in correct place to check the version of GCC
* check processor support of FP16 at run time
* check compiler support of FP16 and pass correct compiler option
* rely on ENABLE_AVX on gcc since AVX is generated when mf16c is passed
* guard correctly using ifdef in case of various configuration
* use v_float16x4 correctly by including the right header file
- fixed uninitialized memory access and memory leaks
- extracted several code blocks to separate functions
- updated part of algorithm to use cv::Mat instead of CvMat and IplImage
1) Cameras started with Y16 (V4L2_PIX_FMT_Y16) format via v4l2 backend will now exhibit default camera behavior, i.e. convert the 16-bit image to BGR as with all other formats. 16-bit 1-channel output will now only be produced for Y16 if CV_CAP_PROP_CONVERT_RGB is set to "false" using VideoCap::set method.
2) v4l2 videoio backend now supports setting CV_CAP_PROP_FOURCC explicitly (icvSetPropertyCAM_V4L function in cap_v4l.cpp), allowing users to manually set the codec on cameras that support multiple codecs.
* Changes delegate property from assign to weak
In modern Objective-C delegate should be weak. In very rare conditions you might want delegate be strong.
Assign for delegate is sign of legacy code.
This change prevents crash when you forget nil delegate in dealloc and makes rush with nilling delegate unnecessary.
This change shouldn't break any existing code.
* Adds implementation for setters and getters for weak delegate properties for non ARC Obj-C files
For whatever reason compiler can't synthesize these.
And yes, it's time to convert all Objective-C stuff to ARC.
* seriously improved performance of blur function, especially 3x3 and 5x5 cases
* trying to fix warnings and test failures
* replaced #if 0 with #if IPP_DISABLE_BLOCK
- calculate ticksTotal instead of ticksMean
- local / global width is based on ticksTotal value
- added instrumentation for OpenCL program compilation
- added instrumentation for OpenCL kernel execution
Minor fix in MatAllocator::upload
Minor fix in MatAllocator::copy
Minor fix in setSize function
Minor fix in Mat::Mat
Minor fix in cvMatNDToMat function
Minor fix in _InputArray::getMatVector
Minor fix in _InputArray::getUMatVector
Minor fix in cv::hconcat
Minor fix in cv::vconcat
Minor fix in cv::setIdentity
Minor fix in cv::trace
Minor fix in transposeI_ template function
Minor fix in reduceC_ template function
Minor fix in sort_ template function
Minor fix in sortIdx_ template function
Minor fix in cvRange function
Minor fix in MatConstIterator::seek
Minor fix in SparseMat::create
Minor fix in SparseMat::copyTo
Minor fix in SparseMat::convertTo
Minor fix in SparseMat::convertTo
Minor fix in SparseMat::ptr
Minor fix in SparseMat::resizeHashTab
Fixes indentation
In cases where the signature string contains a terminating character,
the std::string member function size() returns a smaller value than the
allocated string. In these cases, if you then try to use substr(),
you will get an out_of_range exception. This patch remedies the problem.
This patch implements the PAM image format as defined at:
http://netpbm.sourceforge.net/doc/pam.html
The PAM format provides a generic means for storing two-dimensional information.
This is useful for OpenCV since there are cases where data gets translated into
non-standardized formats, which makes it difficult to store and load this information.
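As a rough illustration of the layout described on the Netpbm page, the sketch below serializes an 8-bit image into PAM bytes in memory. The function name `encodePam` and its parameters are hypothetical and are not part of the patch or of the OpenCV API; MAXVAL is fixed at 255 here so that each sample fits in one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not OpenCV code): builds a PAM (P7) byte stream from
// 8-bit samples stored row by row, tuple by tuple.
std::string encodePam(int width, int height, int depth,
                      const std::string& tuplType,
                      const std::vector<unsigned char>& samples)
{
    assert(width > 0 && height > 0 && depth > 0);
    assert(samples.size() == static_cast<std::size_t>(width) * height * depth);

    std::ostringstream out;
    // Text header: one keyword per line, terminated by ENDHDR.
    out << "P7\n"
        << "WIDTH " << width << "\n"
        << "HEIGHT " << height << "\n"
        << "DEPTH " << depth << "\n"
        << "MAXVAL 255\n"
        << "TUPLTYPE " << tuplType << "\n"
        << "ENDHDR\n";
    // Binary raster follows the header immediately.
    out.write(reinterpret_cast<const char*>(samples.data()),
              static_cast<std::streamsize>(samples.size()));
    return out.str();
}
```

For a 2x2 single-channel image, `encodePam(2, 2, 1, "GRAYSCALE", pixels)` would yield the seven header lines followed by four raw bytes.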
* Improve Canny by using _mm_movemask_epi8 to find the next pixel magnitude greater than the lower threshold. Added parallelized finalPass to Canny with variable gradients. Minor changes in finalPass.
* Some things fixed
* use v_float16x4 (universal intrinsic) instead of raw SSE/NEON implementation
* define v_load_f16/v_store_f16 since v_load can't be distinguished when short pointer passed
* brush up implementation on old compiler (guard correctly)
* add test for v_load_f16 and round trip conversion of v_float16x4
* fix conversion error
* use universal intrinsic for accumulate series using float/double
* accumulate, accumulateSquare, accumulateProduct and accumulateWeighted
* add v_cvt_f64_high in both SSE/NEON
* add test for conversion v_cvt_f64_high in test_intrin.cpp
* improve some existing universal intrinsic by using new instructions in Aarch64
* add workaround for Android build in intrin_neon.hpp
There's no point in logging error messages in the user’s locale.
Imagine trying to work out what's going on while deciphering logs in Hebrew, Arabic, or a Slavic language. localizedDescription is for end-user messages, not for logs.
* Use `nth_element()` to find the median instead of `sort()` in `LMeDSPointSetRegistrator::run()`
* Improves performance of this part of LMedS from `O(n log n)` to `O(n)` on average by avoiding a full sort.
* Makes LMedS 2x faster for 100 points, 4x faster for 5,000 points in `estimateAffine2D()`.
* LMedS is now never more than 2x slower than RANSAC and is faster in some cases.
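The selection idea behind this change can be sketched as follows. This is a minimal standalone example, not the code in `LMeDSPointSetRegistrator::run()`; the function name `medianByNthElement` is hypothetical, and for an even number of elements it returns the upper of the two middle values, which may differ from the convention the registrator uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: finds the median by partial selection rather than a
// full sort. The vector is taken by value because std::nth_element
// reorders its input.
double medianByNthElement(std::vector<double> values)
{
    assert(!values.empty());
    const std::size_t mid = values.size() / 2;
    // After this call values[mid] holds the element that a full sort would
    // place there; the rest is only partitioned around it.
    std::nth_element(values.begin(), values.begin() + mid, values.end());
    return values[mid];
}
```

Because only the middle element has to end up in its sorted position, the average cost is linear in the number of residuals instead of the `n log(n)` of `std::sort`.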
* Added 2-channel ops to match existing 3-channel and 4-channel ops
* v_load_deinterleave() and v_store_interleave()
* Implements float32x4 only on SSE (but all types on NEON and CPP)
* Includes tests
* Will be used to vectorize 2D functions, such as estimateAffine2D()
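For readers unfamiliar with the operation, here is a scalar reference sketch of what a 2-channel deinterleave and interleave compute. It does not use the universal intrinsics themselves; the names `deinterleave2` and `interleave2` are hypothetical and only illustrate the data movement that the vectorized versions perform on whole registers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference (not OpenCV code): splits [x0, y0, x1, y1, ...] into
// separate x and y arrays.
void deinterleave2(const std::vector<float>& xy,
                   std::vector<float>& x, std::vector<float>& y)
{
    assert(xy.size() % 2 == 0);
    const std::size_t n = xy.size() / 2;
    x.resize(n);
    y.resize(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        x[i] = xy[2 * i];
        y[i] = xy[2 * i + 1];
    }
}

// Inverse operation: merges x and y back into [x0, y0, x1, y1, ...].
std::vector<float> interleave2(const std::vector<float>& x,
                               const std::vector<float>& y)
{
    assert(x.size() == y.size());
    std::vector<float> xy(2 * x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        xy[2 * i] = x[i];
        xy[2 * i + 1] = y[i];
    }
    return xy;
}
```

Packed 2D points are a natural fit for this layout: once x and y sit in separate arrays, the same arithmetic can be applied to several points at a time.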
* expose 2 extra methods from ml::TrainData: getNames() and getVarSymbolFlags(). The first one returns text labels from CSV (if the data has been loaded from CSV); the second one returns a matrix of boolean values; its n-th element is 1 iff the corresponding column in the CSV uses symbolic names, not numbers.
* check that the dynamic_cast succeeds
* Add Grana's connected components algorithm for 8-way connectivity. That algorithm is faster than Wu's (currently implemented in OpenCV). For more details see https://github.com/prittt/YACCLAB.
* New functions signature and distance transform compatibility
* Add tests to imgproc/test/test_connectedcomponents.cpp
* Change of test_connectedcomponents.cpp for c++98 support
CvVideoWriter_AVFoundation_Mac had a serious buffer release bug.
Also made writeFrame() block until isReadyForMoreMediaData rather than
return an error.
UIImages with alpha were ending up with garbage pixels in background (random memory values). Need to initialize matrix pixels before drawing UIImage with alpha on it.
Note: didn’t fix Grayscale image with alpha stripping alpha in UIImage -> Mat conversion.
1. if a component's variation is a global minimum then it should be a local minimum
2. for the small image with invert and blur, the MSERs number should be 20
There is an issue with processing of the abs(short) function for
negative arguments.
Affected OpenCL devices:
- iGPU: Intel(R) HD Graphics 520 (OpenCL 2.0 )
- CPU: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz (OpenCL 2.0 (Build 10094))
merge() calls growHistory() too many times, so that:
1. some CompHistory nodes are created but never used
2. some CompHistory nodes have a val equal to their parent's
* Common Canny parallelization added. TBB and single thread code removed. Final pass vectorized with SSE2 intrinsics.
* wrong #ifdef replaced with #if
* Merged to actual Canny version
* Merged common parallelized Canny with actual Canny implementation
* Remove 'Mutex *mutex' and pass 'Mutex mutex' from outside to parallelCanny
* Replaced extern Mutex with internal mutable Mutex.
2. fixed parsing of "cat[range_spec]ord[range_spec]" type specification string when using ml::TrainData::loadFromCSV(). Thanks to A. Kaehler for reporting it
* raise an error when wrong bit depth passed
* raise a build error when wrong depth is specified for cvtScaleHalf_
* remove unnecessary safe check in cvtScaleHalf_
* use intrinsic instead of direct pointer access
* update the explanation
When using OCL, the results of goodFeaturesToTrack() vary slightly from
run to run. This appears to be because the order of the results from
the findCorners kernel depends on thread execution and the sorting
function that is used at the end to rank the features only enforces a
partial sort order.
This does not materially impact the quality of the results, but it
makes it hard to build regression tests and generally introduces noise
into the system that should be avoided.
An easy fix is to change the sort function to enforce a total sort on
the features, even in cases where the match quality is exactly the same
for two features.
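A minimal sketch of such a tie-breaking comparator is shown below. The `Corner` struct and the choice of (y, x) as the secondary key are assumptions made for illustration; the source does not say which tie-breaker the actual fix uses. The ordering is total as long as no two features share the same position and no quality value is NaN.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical feature record: corner-quality score plus pixel position.
struct Corner
{
    float quality;
    int x;
    int y;
};

// Higher quality first; equal qualities are ordered by row, then column,
// so the result no longer depends on the order the kernel emitted them in.
inline bool cornerBefore(const Corner& a, const Corner& b)
{
    if (a.quality != b.quality) return a.quality > b.quality;
    if (a.y != b.y) return a.y < b.y;
    return a.x < b.x;
}

inline void rankCorners(std::vector<Corner>& corners)
{
    std::sort(corners.begin(), corners.end(), cornerBefore);
    assert(std::is_sorted(corners.begin(), corners.end(), cornerBefore));
}
```

With a comparator like this, two runs that produce the same set of corners in a different order end up with identical ranked lists, which is what a regression test needs.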
The _FPU_EXTENDED definition does not exist on powerpc64 (as on Apple and ARM),
which prevents compilation when WITH_CUDA is set. Add powerpc64 as a case that
does not use these definitions.
modified: modules/cudalegacy/test/TestHaarCascadeApplication.cpp
modified: modules/cudalegacy/test/test_precomp.hpp
Signed-off-by: Thierry Fauck <tfauck@free.fr>