Hough many circles (#10232)
* Add Hui's optimization. Merge with latest changes in OpenCV.
* Use conditional compilation instead of a runtime flag.
* Whitespace.
* Create the sequence for the nonzero edge pixels only if using that approach.
* Improve performance for finding very large numbers of circles
* Return the circles with the larger accumulator values first, as per API documentation.
Use a separate step to check distance between circles. Allows circles to be sorted by strength first. Avoids locking in EstimateRadius which was slowing it down.
Return centers only if maxRadius == 0 as per API documentation.
* Sort the circles so results are deterministic. Otherwise the order of circles with the same strength depends on parallel processing completion order.
* Add test for HoughCircles.
* Add beads test.
* Wrap the non-zero points structure in a common interface so the code can use either a vector or a matrix.
* Remove the special case for skipping the radius search if maxRadius==0.
* Add performance tests.
* Use NULL instead of nullptr.
OpenCV should compile with C++98 compiler.
* Put test suite name first.
Use different test suite names for each test to avoid an error from the test runner.
* Address build bot errors and warnings.
* Skip radius search if maxRadius < 0.
* Dynamically switch to NZPointList when it will be faster than NZPointSet.
* Fix compile error: missing 'typename' prior to dependent type name.
* Fix compile error: missing 'typename' prior to dependent type name.
This time fix it the non C++ 11 way.
* Fix compile error: no type named 'const_reference' in 'class cv::NZPointList'
* Disable ManySmallCircles tests. Failing on Mac.
* Change beads image to JPEG for smaller file size.
Try enabling the ManySmallCircles tests again.
* Remove ManySmallCircles tests. They are failing on the Mac build.
* Fix expectations to check all circles.
* Changing case on a case-insensitive file system
Step 1: remove the old file names
* Changing case on a case-insensitive file system
Step 2: add them back with the new names
* Fix cmpAccum function so that it defines a strict weak ordering.
* Add tests for many small circles.
* imgproc(perf): fix HoughCircles tests
* imgproc(houghCircles): refactor code
- simplify NZPointList
- drop broken (de-synchronization of 'current'/'mi' fields) NZPointSet iterator
- NZPointSet iterator is replaced with a direct area scan
- use SIMD intrinsics
- avoid std exceptions (build for embedded systems)
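The ordering requirements mentioned above (stronger circles first, deterministic order for ties, strict weak ordering) can be sketched as follows. This is a minimal illustration, not the code from the PR: the CircleCandidate struct and the strongerFirst and sortCandidates names are hypothetical, and the tie-break on coordinates is an assumption about how determinism could be achieved.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a detected circle and its accumulator value.
struct CircleCandidate {
    float x, y, r;
    int accum;
};

// Strict weak ordering: larger accumulator first; ties are broken by the
// coordinates so equal-strength circles always come out in the same order,
// independent of the order in which parallel workers produced them.
inline bool strongerFirst(const CircleCandidate& a, const CircleCandidate& b)
{
    if (a.accum != b.accum) return a.accum > b.accum;
    if (a.x != b.x) return a.x < b.x;
    if (a.y != b.y) return a.y < b.y;
    return a.r < b.r; // never true for identical elements (irreflexive)
}

inline void sortCandidates(std::vector<CircleCandidate>& circles)
{
    std::sort(circles.begin(), circles.end(), strongerFirst);
    assert(std::is_sorted(circles.begin(), circles.end(), strongerFirst));
}
```

A comparator that returned true for two equal elements (for example one using >= on the accumulator) would violate the irreflexivity that std::sort requires.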
* Add test that fails
* Fix integer pointPolygonTest for large coordinate values
* Review fixes:
- change type from long long to int64
- move test code to test_contours.cpp, and make it C++98 compliant
* Hopefully fix compiler error by using push_back instead of emplace_back
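The kind of overflow that a switch to int64 addresses can be sketched with an edge-side test. This is an illustrative sketch only, not the pointPolygonTest implementation: the Pt struct and sideOfEdge function are hypothetical names, and the sketch assumes coordinate magnitudes below 2^30 so that the 64-bit products cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical integer point type used only for this sketch.
struct Pt { int x, y; };

// Sign of the cross product (b - a) x (p - a), computed in 64-bit arithmetic.
// Returns +1, -1, or 0 when p lies on the line through a and b.
// With 32-bit intermediates the products below overflow once coordinate
// differences exceed roughly 46340.
inline int sideOfEdge(Pt a, Pt b, Pt p)
{
    assert(a.x != b.x || a.y != b.y); // a degenerate edge has no side
    const std::int64_t abx = static_cast<std::int64_t>(b.x) - a.x;
    const std::int64_t aby = static_cast<std::int64_t>(b.y) - a.y;
    const std::int64_t apx = static_cast<std::int64_t>(p.x) - a.x;
    const std::int64_t apy = static_cast<std::int64_t>(p.y) - a.y;
    const std::int64_t cross = abx * apy - aby * apx;
    return (cross > 0) - (cross < 0);
}
```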
* fixed OpenCL functions on Mac, so that the tests pass
* fixed compile warnings; temporarily disabled OCL branch of TV L1 optical flow on mac
* fixed other few warnings on macos
The OpenCL runtime is not loaded if the application makes no OpenCL/UMat method calls.
The OpenCL subsystem is initialized when:
- haveOpenCL() is called from application
- useOpenCL() is called from application
- access to OpenCL allocator: UMat is created (empty UMat is ignored) or UMat <-> Mat conversions are called
Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
(independent from OpenCL linkage type)
* Error in the documentation for cv::getRectSubPix. #9788
The function name is corrected to GetRectSubPix, since it uses the notation
of src, dst and center. Also added the assertion criteria for the number of channels.
* Error in the documentation for cv::getRectSubPix. #9788
Replace dst with patch in the formula, reverted function name to
getRectSubPix, removed BorderTypes comment line due to no explicit call
to the function found.
* Update OpenCVCompilerOptimizations.cmake
Build fix: NEON is not supported on MSVC ARM, which was breaking the build.
* Update OpenCVCompilerOptimizations.cmake
Whitespace
* Update intrin.hpp
Many problems in MSVC ARM builds (at least on VS2017) being fixed in this PR now.
C:\Users\Gregory\DOCUME~1\MYLIBR~1\OPENCV~3\opencv\sources\modules\core\include\opencv2/core/hal/intrin.hpp(444): error C3861: '_tzcnt_u32': identifier not found
* Update hal_replacement.hpp
Passing a variadic expansion from one macro to another does not work properly in MSVC, so a well-known workaround is applied. Discussion: https://stackoverflow.com/questions/5134523/msvc-doesnt-expand-va-args-correctly
Only needed the fix for ARM builds: TEGRA_ macros are used for cv_hal_ functions in the carotene library.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): warning C4003: not enough actual parameters for macro 'TEGRA_ADD'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2143: syntax error: missing ')' before ','
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\core\src\arithm.cpp(2378): error C2059: syntax error: ')'
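The workaround referred to above is usually written as an extra expansion macro around the forwarded call. The sketch below shows the general pattern only; DEMO_ADD3, DEMO_EXPAND, DEMO_CALL and addThree are hypothetical names and are not the macros used in hal_replacement.hpp.

```cpp
#include <cassert>

// Hypothetical three-argument target macro, standing in for a TEGRA_-style macro.
#define DEMO_ADD3(a, b, c) ((a) + (b) + (c))

// Extra expansion step. The traditional MSVC preprocessor forwards
// __VA_ARGS__ as a single argument; wrapping the call forces a rescan so
// the arguments are split at the commas again.
#define DEMO_EXPAND(x) x

// Forwarding macro: passes its variadic arguments on to another macro.
#define DEMO_CALL(macro, ...) DEMO_EXPAND(macro(__VA_ARGS__))

inline int addThree(int a, int b, int c)
{
    return DEMO_CALL(DEMO_ADD3, a, b, c);
}
```

On conforming preprocessors the extra DEMO_EXPAND is a no-op, so the same definition can be used on every compiler.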
* Update hal_replacement.hpp
Every hal_replacement that uses the carotene\hal\tegra_hal.hpp TEGRA_ functions as macros passed through variadic macros should be changed in the same way as was done in core.
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): warning C4003: not enough actual parameters for macro 'TEGRA_CVTBGRTOBGR'
C:\Users\Gregory\Documents\My Libraries\opencv330\opencv\sources\modules\imgproc\src\color.cpp(9604): error C2059: syntax error: '=='
* Update OpenCVCompilerOptimizations.cmake
* Update hal_replacement.hpp
* Update hal_replacement.hpp
Adds fitEllipseDirect to imgproc: The Direct least square (Direct) method by Fitzgibbon1999.
New Tests are included for the methods.
fitEllipseAMS Tests
fitEllipseDirect Tests
Comparative examples are added to fitEllipse.cpp in Samples.
imgproc: use universal intrinsic as much as possible (#9714)
* use universal intrinsic as much as possible
* make SSE3 part as common as possible with universal intrinsic implementation
* put the reducing part out of the main loop
* follow the comment
* fix the typo
* use v_reduce_sum4
* follow the comment again
* remove all CV_SSE3 part from smooth.cpp
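As a scalar reference for the reduction step named above: to my understanding, v_reduce_sum4 takes four 4-lane float vectors and returns one vector holding the horizontal sum of each. The sketch below models that with std::array; Lane4, horizontalSum and reduceSum4 are hypothetical names, and real SIMD code may add the lanes in a different order.

```cpp
#include <array>
#include <cassert>

// Stand-in for a 4-lane float register (hypothetical type for this sketch).
typedef std::array<float, 4> Lane4;

// Sum of the four lanes of one vector.
inline float horizontalSum(const Lane4& v)
{
    return (v[0] + v[1]) + (v[2] + v[3]);
}

// Lane i of the result is the horizontal sum of the i-th input vector.
// Doing this once after the main loop, instead of reducing in every
// iteration, keeps the loop body to plain lane-wise accumulation.
inline Lane4 reduceSum4(const Lane4& a, const Lane4& b, const Lane4& c, const Lane4& d)
{
    Lane4 r = {{ horizontalSum(a), horizontalSum(b), horizontalSum(c), horizontalSum(d) }};
    assert(r[0] == horizontalSum(a));
    return r;
}
```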
The non-maximum suppression in the Hough accumulator incorrectly ignores maxima that extend over more than one cell, i.e. two neighboring cells both have the same accumulator value. This maximum is dropped completely instead of picking at least one of the entries. This frequently results in obvious circles being missed.
The behavior is now changed to be the same as for hough_lines.
See also https://github.com/opencv/opencv/issues/4440
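One common way to keep exactly one cell of a flat maximum is to compare strictly against the left and upper neighbours and non-strictly against the right and lower ones. The sketch below illustrates that idea under stated assumptions: isPeak and the row-major accumulator layout with a one-cell border are hypothetical, and the exact comparison used in OpenCV may differ in detail.

```cpp
#include <cassert>
#include <vector>

// Row-major accumulator with a one-cell border, so every interior cell has
// four neighbours. Returns true if (row, col) is a local maximum above the
// threshold. The mixed > / >= comparisons mean that, of two equal
// neighbouring cells, only the left (or upper) one is reported instead of
// both being rejected.
inline bool isPeak(const std::vector<int>& accum, int stride, int row, int col, int threshold)
{
    assert(stride > 2 && row >= 1 && col >= 1 && col < stride - 1);
    const int base = row * stride + col;
    assert(base + stride < static_cast<int>(accum.size()));
    const int v = accum[base];
    return v > threshold &&
           v >  accum[base - 1]      && v >= accum[base + 1] &&
           v >  accum[base - stride] && v >= accum[base + stride];
}
```

With strict comparisons on all four sides, both cells of a two-cell plateau fail the test, which is the missed-circle behaviour described above.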
GSoC 2017: Improve and Extend the JavaScript Bindings for OpenCV (#9466)
* Initial support for build with emscripten
mkdir build_js
cd build_js
cmake -D CMAKE_TOOLCHAIN_FILE=/path/to/emsdk/emsdk-portable/emscripten/master/cmake/Modules/Platform/Emscripten.cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
* Add js module
The output is build/bin/opencv_js.js
* Fix opencv2/calib3d.hpp not found issue
* Add module name
Usage:
var cv = cv();
* Add total memory as 128MB and allow growth
* Add compilation flags for emscripten
* Use EMSCRIPTEN build target
* Disable js module for non emscripten build
* Bind the preload file path to root
Usage:
face_cascade.load('haarcascade_frontalface_default.xml');
* add test folder
* fix test files
* Copy js module test to bin
* Support to run tests on Node.js
Fix tests to import cv Module when runtime is node.
Add tests.js to use qunit to auto run tests.
Modify UMD wrapper to support the case where Module is not defined.
Usage:
node tests.js
* Support UMD and file system
Wrap the opencv_js.js to opencv.js by UMD wrapper
Use emscripten file system API to load files instead of generating data file or
embedding them. It supports both browser and node.js usages.
* Fix incorrect module name in tests
* Add package.json to add dependence of qunit
* Add js_tutorials folder and an intro page of opencv.js
Enable BUILD_DOCS in CMakeLists.txt.
Add new folder of js_tutorials in folder opencv/doc.
Imitate the tutorials of OpenCV-Python to create an intro page of opencv.js and a setup guide
* Import and use binding gen from opencvjs project
* Modify the embindgen.py to pass the build and test
* Add classes and functions white list
* Consolidate hdr_parser.py (#31)
Use hdr_parser.py of python module
Add js flag to support js binding generator.
* Use emscripten::vecFromJSArray for input vector param
Fix part of #23
* Fix test cases after #34
Fix#39
* Expose groupRectangles and CascadeClassifier.empty
* Add js highgui tutorials
add tutorials of imread&imshow and createTrackbar in doc/js_tutorials/js_gui folder
add interactive tutorial webpage for imread&imshow and createTrackbar in doc/js_tutorials/js_interactive_tutorials folder, and some images needed.
change doc/CMakeLists.txt to copy the interactive tutorial webpage and opencv.js to the tutorials' destination folder
* rm useless annotation in doc/CMakeLists.txt
* fix some nonstandard indentation and space
* add check if canvas is valid
* Expose BackgroundSubtractorMOG2
Fix#43
* Fix build of js doc
Limit copy_js_interactive_tutorials for doxygen build
Add dep to opencv.js
Fix#53
* Implement cv.imread & cv.imshow and insert interactive pages in tutorials (#55)
* add helper.js
* delete ALL in add target copy_js_interactive_tutorials to avoid dependence error
* Insert interactive pages in tutorials
insert the old interactive pages in markdown by using \htmlonly and \endhtmlonly command.
delete the useless interactive page
rename js_interactive_tutorials to js_assets to put some images needed in
* fix the depends of the target doxygen
add opencv.js to depends and delete the useless target of copy_js_assets
* change filename helper.js to helpers.js
* disable button or trackbar before opencv.js is ready
* Expose CV_64F
Fix#65
* improve cv.imshow to display different types as native imshow
* add utils.js to reuse functions and update tutorials
* Make doxygen depend on bin/opencv.js
* Fix memory issue of matFromArray
Fix#37
* Merge pull request from ganwenyao/tutorial_18
* Add notes for ganwenyao/tutorial_18
* Modifying for ganwenyao/tutorial_18
* Change Mat constructor with data to 5 parameters
* Mat supports constructor with Scalar
Fix#60
* update cv.imread because the memory issue of matFromArray has been fixed
* fix canvas name and default input image
* Expose cv::Moments
Fix#85
* Add -Wno-missing-prototypes for emscripten build
* fix canvas name
* add tutorial of video input and output
* Expose enums as emscripten consts
Fix#72
* update the tutorial to use Mat constructor with Scalar and change lena.jpg
* Exclude cv::Mat for vecFromJSArray
Fix#82
* Add unit tests for cv.moments
* Fix the unit tests.
* add checkbox and stop button
* add adapter.js to ensure compatibility of video tutorials
* Support default parameters with function overloading
* modify enums to constants
* Use https URL for MathJax.js
Fix#109
* Comment out the debug print in embindgen.py
* Expose RotatedRect
Fix#96
* replace enum with constants and improve onload function
* delete some useless params because #105 fixed this
* Modify const name
* Modify Contour Properties
* tutorials for imgprc2 and objdec
* Expose more functions for img proc tutorials
Fix#76
* Expose polylines for video analysis tutorial
Fix#121
* Expose constants for default parameters of img proc tutorials
Fix#122
* Fix wrong parameter types of Mat.copyTo
Fix#87
* Support default parameters of mat.convertTo
Fix#123
* Support default parameters for external constructors
Fix#131
* Revert "Expose polylines for video analysis tutorial"
This reverts commit 3ce7615652e510d30e3c0014706ac38c98883189.
Fix#121
* Support cv.minMaxLoc
Fix#127
* Expose cv.minEnclosingCircle
Fix#126
* Add video analysis tutorials
add three video tutorials: Meanshift and Camshift, Optical Flow, and Background Subtraction
add cup.mp4 and box.mp4 for demo in tutorials
* improve image processing tutorials
* replace console.warn with throw to throw exception
* add try-catch to throw exception in code demo
* Change mat.size() return value to JS Array object
Fix#140
* add a note about different channels order between canvas and native opencv
* add a note about how to capture video from video files
* Binding cv.Scalar to JS array
Fix#147
* Add JS cv.Scalar object into helpers.js
* Update Install OpenCV-JavaScript tutorial page
Fix#44
* Update the OpenCV-JavaScript introduction page
Fix#44
* add cv.VideoCapture and read() function
* set the size of the hidden canvas same as the video
* Add Using OpenCV-JavaScript tutorial page
Fix#44
* fix some bad code style
* Update tutorials after 8/2 sync meeting
Changes include:
- Use OpenCV.js name instead of OpenCV-JavaScript
- Put using OpenCV.js ahead of build OpenCV.js
- Refine usage and introduction page
- Muted the video in tutorials
* Fix a typo in introduction page
* use cv.VideoCapture and its read() function to read video
* replace OpenCV-JavaScript with OpenCV.js
* Use onload of async script in js_usage tutorial
* add more info about mat.data
* Change Size to value_object
* Integrate Moh and Sajjad's editing into introduction page
* Change Point to value_object
* Change Rect to value_object with helper object
* Add helper objects for Point and Size
* Change RotatedRect to value_object with helpers
* Change MinMaxLoc and Circle to value_object
* Change TermCriteria to value_object
* Fix core_bindings.cpp for MinMaxLoc and Circle
* Remove unused types
* Change meanShift and CamShift to return Rect
* Change methods of RotatedRect to static
* Change mat.data from methods to property
Fix#75 and #77
* support img id and element in cv.imread
* Change mat.size to property and add mat.step
Fix#163
* Add matFromArray and matFromImageData as JS helpers
Fix#79, #78
* Lower camel case for Mat element getters
Fix#81
* Mat.getRoiRect and tests
Fix#86
* Support type for Mat.ptr
Fix#83
* Name changing of Mat element getters
'getUcharAt' -> 'ucharAt'
* fix code style and args names
* Fix helpers.js due to cv.Mat API update
* Fix opencv.js usage tutorial
* Fix a typo of js_setup
* Change Moments to value_object
* Add Range as value_object
Fix#171
* Support Mat.diag and Mat.isContinuous
Fix#84 and #89
* Support Mat.setTo
Fix#88
* Apply edits to js_intro
* Apply edits to js_usage
* Apply edits to js_setup
* update tutorials to apply data type change
* Modify tutorials
* add core tutorials
* delete MatVector elements and delete useless set operation
* add tutorials_objdec_camera
* Add instructions for WebAssembly
* apply tech writer's feedbacks into tutorials
* Organize white list by modules
* Change size to method and bind to MatExpr.size()
Fix#177
* improve tutorials
* Modify core tutorials
* add params list and explanations for OpenCV.js functions
* remove face_profile from Face Detection in Video Capture
* Add demos link
* Change Gui to GUI
* Update js_intro based on Moh and Sajjad's edits
* Fixup for 3.3.0 rebase
* Update js_intro per Moh's suggestion
* Update contributors list per Moh's idea
* add adapter.js in video_display tutorial
* Change Mat.getRoiRect to Mat.roi
Fix#194
* Remove unnecessary files for test
Fix#192
* Licenses updated to UC BSD 3-Clause
* Apply OpenCV coding style for C++ files
* Add OpenCV license for python and js files
* Fix coding style issue in helpers.js
* Remove unused test_commons.js
* Fix coding style of test_imgproc.js
* Fix coding style of test_mat.js
* Fix space before semicolon
* Fix coding style of test_objdetect.js
* Fix coding style of tests.js
* Fix coding style of test_utils.js
* Fix coding style of test_video.js
* Fix failures of node.js tests
* Add eslint rule config and fix eslint errors
* Add eslint config for js/src and fix eslint errors
* Clean up the opencv.js dependencies
Fix#186
* Fix build issue for python generator
* Fix doxygen buildbot failure
* delete trailing whitespace, blank line at EOF and replace tab with space
* Fix tutorial_js_root reference issue for non opencv.js build
* replace the file with small size
* Initial commit of build_js.py
* Move the js build configurations to build script
* Add wasm build support
* Update OpenCV.js build tutorial by using script
* Fix global var issue in tests
* Add a README.md for build_js.py
* Copy the haar cascade files from data dir for tutorials
* Not use memory init file
* Disable debug print for modules/js/CMakeLists.txt
* Check files when build done
* Fix image name in js_gradients tutorial
* Fix image load issue in js_trackbar tutorial
* Find the opencv source directory via relative path by default
* Make the cmake args based on build_doc option
* Fix a typo in js_setup.markdown
* Fix make failure issue on config generated by build_js.py
* Eliminate js branch of hdr_parser.py
* Extract examples from js_basic_ops tutorial
* Fix coding style of utils.js
* Improve examples error handling
Handle:
1. opencv.js loading errors
2. script errors (Error)
3. cv::Exception
Fix#217
* Add enable_exception option into build_js.py
* Support print exception for exception catching disabled build
* Extract example from js_usage tutorial
* Avoid copying .eslintrc.json when building doc
Fix#223
* Revert to use onload as opencv.js ready event
* Use 4-space indentation for js examples
* embed html in tutorials with iframe tag
* Revert to use onload as opencv.js ready event
* Extract examples from js_video_display tutorial
* Implement Utils object
* modify core imgprc and face_detection tutorials
* Fix examples of js_gui tutorials
* Fix coding style of utils.js
* Modify tutorials
* Extract example from js_face_detection_camera tutorial
* Disable new-cap check in eslint
* Extract examples from js_meanshift tutorial
* Extract examples from video tutorials
* Remove new-cap declaration and update grammar in comments
* Change textarea width to 100 to align with eslint config
* Fix printError issue when opencv.js loading fails
* Remove BUILD_opencv_js dependency for doc build
Fix#213
* Expose cv::getBuildInformation
* Dump opencv build info when opencv.js loaded for live examples
* Make the button to stand out in js live examples
Fix#235
* Style for disabled button
* Add js_imgproc_camera.html example
* Fix coding style of imgproc_camera example
* Add js_imgproc_camera tutorial
* Remove link to opencv.js demos
* doc: copy opencv.js on build, use absolute paths for assets
* doc: reuse existed file box.mp4
Added gradientSize param into goodFeaturesToTrack API (#9618)
* Added gradientSize param into goodFeaturesToTrack API
Removed the hardcoded value 3 in goodFeaturesToTrack API, and
added new param 'gradientSize' in this API so that the user can
pass any gradient size such as 3, 5 or 7.
Signed-off-by: Vipin Anand <anand.vipin@gmail.com>
Signed-off-by: Nilaykumar Patel<nilay.nilpat@gmail.com>
Signed-off-by: Prashanth Voora <prashanthx85@gmail.com>
* fixed compilation error for java test
Signed-off-by: Vipin Anand <anand.vipin@gmail.com>
* Modifying code for previous binary compatibility and fixing other warnings
fixed ABI break issue
resolved merged conflict
compilation error fix
Signed-off-by: Vipin Anand <anand.vipin@gmail.com>
Signed-off-by: Patel, Nilaykumar K <nilay.nilpat@gmail.com>
* lab_tetra squashed
* initial version is almost written
* unfinished work
* compilation fixed, to be debugged
* Lab test removed
* more fixes
* Luv2RGBinteger: channels order fixed
* Lab structs removed
* good trilinear interpolation added
* several fixes
* removed Luv2RGB interpolations, XYZ tables; 8-cell LUT added
* no_interpolate made 8-cell
* interpolations rewritten to 8-cell, minor fixes
* packed interpolation added for RGB2Luv
* tetra implemented
* removing unnecessary code
* LUT building merged
* changes ported to color.cpp
* minor fixes; try to suppress warnings
* fixed v range of Luv
* fixed incorrect src channel number
* minor fixes
* preliminary version of Luv2RGBinteger is done
* Luv2RGB_b is in progress
* XYZ color constants converted to softfloat
* Luv test: precision fixed
* Luv bit-exactness test added
* warnings fixed
* compilation fixed, error message fixed
* Luv check is limited to [0-2,0-2,0-2] by XYZ
* L->Y generation moved to LUT
* LUTs added for up and vp of Luv2RGB_b
* still works
* fixed-point is done, works at maxerr 2
* vectorized code is done, 2x slower than original
* perf improved by 10%
* extra comments removed
* code moved to color.cpp
* test_lab.cpp updated
* minor refactoring
* test added for Luv2RGB
* OCL Luv2RGB_b: XYZ are limited to [0, 2]; docs updated
* Luv2RGB_b rewritten to universal intrinsics
* test_lab.cpp moved to luv_tetra branch
* Imgproc_ColorLab_Full.accuracy test fixed
* Lab and Luv tests: rewritten, constants explained
* CV_ColorCvtBaseTest: added methods for 8u implementations
* Lab2RGB_b: bit-exactness enabled for all modes; non-vectorized code fixed to comply with vectorized
* srgb support added
* XYZ constants made softdouble
* bit-exact tests written for Lab
* ColorLab_full test fixed
* reverted: no 8u convertors for CV_ColorCvtBaseTest
* added checksum-based test for Lab bit-exactness
* extra declarations removed
* Lab test fix: stop at first mismatch
* test info output improved
* error message fixed
* test_lab.cpp removed
- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
[GSOC] Speeding-up AKAZE, part #2 (#8951)
* feature2d: instrument more functions used in AKAZE
* rework Compute_Determinant_Hessian_Response
* this takes 84% of time of Feature_Detection
* run everything in parallel
* compute Scharr kernels just once
* compute sigma more efficiently
* allocate all matrices in evolution without zeroing
* features2d: add one bigger image to tests
* the tests now have images of 600x768, 900x600 and 1385x700 to cover different resolutions
* explicitly zero Lx and Ly
* add Lflow and Lstep to evolution as in original AKAZE code
* reworked computing keypoints orientation
integrated faster function from https://github.com/h2suzuki/fast_akaze
* use standard fastAtan2 instead of getAngle
* compute keypoints orientation in parallel
* fix visual studio warnings
* replace some wrapped functions with direct calls to OpenCV functions
* improved readability for people familiar with opencv
* do not same image twice in base level
* rework diffusity stencil
* use one pass stencil for diffusity from https://github.com/h2suzuki/fast_akaze
* improve locality in Create_Scale_Space
* always compute determinant of Hessian and spatial derivatives
* this needs to be computed always as we need derivatives while computing descriptors
* fixed tests of AKAZE with KAZE descriptors which have been affected by this
Currently it computes all first and second order derivatives together with the determinant of the Hessian. For descriptors it would be enough to compute just the first order derivatives, but it is probably not worth optimizing for the scenario where descriptors and keypoints are computed separately, since that is already very inefficient. When computing keypoints and descriptors together it is faster to do it the current way (preserves locality).
* parallelize non linear diffusion computation
* do multiplication right in the nlp diffusity kernel
* rework kfactor computation
* get rid of sharing buffers when creating scale space pyramid, the performance impact is negligible
* features2d: initialize TBB scheduler in perf tests
* ensures more stable output
* more reasonable profiles, since the first call of parallel_for_ is no longer getting a big performance hit
* compute_kfactor: interleave finding of maximum and computing distance
* no need to go twice through the data
* start to use UMats in AKAZE to leverage OpenCl in the future
* fixed bug that prevented computing determinant for scale pyramid of size 1 (just the base image)
* all descriptors now support writing to uninitialized memory
* use InputArray and OutputArray for input image and descriptors, which allows us to make use of the UMat that the user passes to us
* enable use of all existing ocl paths in AKAZE
* all parts that use ocl-enabled functions should use ocl by now
* imgproc: fix dispatching of IPP version when OCL is disabled
* when OCL is disabled the IPP version should always be preferred (even when the dst is UMat)
* get rid of copy in DeterminantHessian response
* this slows CPU version considerably
* do not run in parallel when running with OCL
* store derivations as UMat in pyramid
* enables the OCL path for computing the determinant of the Hessian
* will allow to compute descriptors on GPU in the future
* port diffusivity to OCL
* diffusivity itself is not a blocker, but this saves us downloading and uploading derivations
* implement kernel for nonlinear scalar diffusion step
* download the pyramid from GPU just once
We don't want to download matrices ad hoc from the GPU whenever a function in AKAZE needs them. There is a HUGE mapping overhead and, without shared memory support, a LOT of unnecessary transfers.
This maps/downloads matrices just once.
* fix bug with uninitialized values in non linear diffusion
* this was causing spurious segfaults in stitching tests due to propagation of NaNs
* added new test, which checks for NaNs (added new debug asserts for NaNs)
* valgrind now says everything is ok
* add nonlinear diffusion step OCL implementation
* Lt in pyramid changed to UMat, it will be downloaded from GPU along with Lx, Ly
* fix bug in pm_g2 kernel. OpenCV mangles dimensions passed to OpenCL, so we need to check for boundaries in each OCL kernel.
* port computing of determinant to OCL
* computing of determinant is not a blocker, but with this change we don't need to download all spatial derivatives to CPU, we only download determinant
* make Ldet in the pyramid UMat, download it from GPU together with the other parts of the pyramid
* add profiling macros
* fix visual studio warning
* instrument non_linear_diffusion
* remove changes I have made to TEvolution
* TEvolution is used only in KAZE now
* Revert "features2d: initialize TBB scheduler in perf tests"
This reverts commit ba81e2a711.
RGB2Lab_f added, bugs fixed, moved to float
several bugs fixed
LUT fixed, no switch in tetraInterpolate()
temporary code; to be removed and rewritten
before refactoring
extra interpolations removed, some things to do left
added Lab2RGB_b +XYZ version, etc.
basic version is done, to be sped up
tetra refactored
interpolations: LUT for weights, refactor., etc.
address arithm optimized
initial version of vectorized code added (not compiling now)
compilation fixed, now segfaults
a lot of fixes, vectorization temp. disabled
fixed trilinear shift size, max error dropped from 19 to 10
fixed several bugs (255 vs 256, signed vs unsigned, bIdx)
minor changes
packed: address arithmetics fixed
shorter code
experiments with pure integer calculations
Lab2RGB max error decreased to 2; need to clean the code
ready for vectorization; need cleaning
vectorized, to be debugged
precision fixed, max error is 2
Lab->XYZ shortened
minor fixes
Lab2RGB_f version fixed, to be completely rewritten using _b code
RGB2Lab_f vectorized
minors
moved to separate file
refactored Lab2RGB to float and int versions
minor fix
Lab2RGB_f vectorized
minor refactoring
Lab2RGBint refactored: process methods, vectorize by 4 pix
Lab2RGB_f int version is done
cleanup extra code
code copied to color.cpp
fixed blue idx bug
optimizations enabled when testing; mulFracConst introduced
divConst -> mulFracConst
calc min time in perf instead of avg
minors
process() slightly sped up
Lab2RGB_f: disabled int version
reinterpret added, minor fixes in names
some warnings fixed
changes transferred to color.cpp
RGB2Lab_f code (and trilinear interpolation code) moved to rgb2lab_faster
whitespace
shift negative fixed
more warnings fixed
"constant condition" warnings fixed, little speed up
minor changes
test_photo decolor fixed
changes copied to test_lab.cpp
idx bounds checking in LUT init
several fixes
WIP: softfloat almost integrated
test_lab partially rewritten to SoftFloat
color.cpp rewritten to SoftFloat
test_lab.cpp: accuracy code added
several fixes
RGB2Lab_b testing fixed
splineBuild() rewritten to SoftFloat
accuracy control improved
rounding fixed
Luv <=> RGB: rewritten to SoftFloat
OCL cvtColor Lab and Lut rewritten to SoftFloat
minor fixes
refactored to new SoftFloat interface
round() -> cvRound, etc.
fixed OCL tests
softfloat.cpp: internal functions made static, unused ones removed
meaningful constants
extra lines removed
unused function removed
unfinished work
it works, need to fix TODOs
refactoring; more calls rewritten
mulFracConst removed
constants made bit exact; minors
changes moved to color.cpp
fixed 1 bug and 4 warnings
OCL: fixed constants
pow(x, _1_3f) replaced by cubeRoot(x)
fixed compilation on MSVC32
magic constants explained
file with internal accuracy&speed tests moved to lab_tetra branch
merge_histogram kernel only needs "BINS" threads to accumulate the
histograms; it is not efficient to directly use maxGroupSize as the
local size if maxGroupSize is far greater than BINS.
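The sizing rule described above can be sketched as a simple clamp. This is an assumption about the shape of the fix rather than the actual kernel launch code; pickLocalSize is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Pick a work-group (local) size for a kernel that only needs `bins`
// threads: never more than the device limit, never more than the number of
// threads that have work to do.
inline std::size_t pickLocalSize(std::size_t maxGroupSize, std::size_t bins)
{
    assert(maxGroupSize > 0 && bins > 0);
    return std::min(maxGroupSize, bins);
}
```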
Remove unnecessary Non-ASCII characters from source code (#9075)
* Remove unnecessary Non-ASCII characters from source code
Remove unnecessary Non-ASCII characters and replace them with ASCII
characters
* Remove dashes in the @param statement
Remove dashes and place single space in the @param statement to keep
coding style
* misc: more fixes for non-ASCII symbols
* misc: fix non-ASCII symbol in CMake file
* another round of dnn optimization:
* increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
* improved SIMD optimization of pooling layer, optimized average pooling
* cleaned up convolution layer implementation
* made activation layer "attachable" to all other layers, including fully connected and addition layer.
* fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it describes the topology.
* greatly optimized permutation layer, which improved SSD performance
* parallelized element-wise binary/ternary/... ops (sum, prod, max)
* also, added missing copyrights to many of the layer implementation files
* temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders
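For reference, 64-byte alignment on top of plain malloc can be sketched as below. This is a generic over-allocate-and-round-up scheme, not the OpenCV allocator itself; kAlign, alignedMalloc and alignedFree are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Alignment assumed for this sketch; 64 bytes covers one AVX-512 register.
const std::size_t kAlign = 64;
static_assert((kAlign & (kAlign - 1)) == 0, "alignment must be a power of two");

// Allocate `size` bytes whose address is a multiple of kAlign.
inline void* alignedMalloc(std::size_t size)
{
    // Over-allocate: room for the saved raw pointer plus worst-case padding.
    unsigned char* raw = static_cast<unsigned char*>(std::malloc(size + sizeof(void*) + kAlign));
    if (!raw) return nullptr;
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    const std::uintptr_t aligned = (start + kAlign - 1) & ~static_cast<std::uintptr_t>(kAlign - 1);
    assert(aligned % kAlign == 0);
    // Remember the pointer returned by malloc just before the aligned block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Release a block obtained from alignedMalloc; a null pointer is ignored.
inline void alignedFree(void* p)
{
    if (p) std::free(static_cast<void**>(p)[-1]);
}
```

The cost of a larger alignment in this scheme is up to kAlign extra bytes per allocation.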
Parallelize Canny with custom gradient (#8694)
* New Canny implementation. Restructuring code in parallelCanny class. Align mag buffer and map.
* Fix warnings.
* Missing SIMD check added.
* Replaced local trailingZeros in contours.cpp. Use alignSize in canny.cpp
* Fix warnings in alignSize and allocate just minimum extra columns.
* Fix another warning in map.create.
* Exchange for loop by do loop to avoid double check at the beginning.
Define extra SIMD CANNY_CHECK to avoid unnecessary continue.
Updated integrations for:
cv::split
cv::merge
cv::insertChannel
cv::extractChannel
cv::Mat::convertTo - now with scaled conversions support
cv::LUT - disabled due to performance issues
Mat::copyTo
Mat::setTo
cv::flip
cv::copyMakeBorder - currently disabled
cv::polarToCart
cv::pow - ipp pow function was removed due to performance issues
cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance
cv::countNonZero
cv::minMaxIdx
cv::norm
cv::canny - new integration. Disabled for threaded;
cv::cornerHarris
cv::boxFilter
cv::bilateralFilter
cv::integral
Added assertions to remap and warpAffine functions
As @mshabunin said, remap and warpAffine functions do not support more than 4 channels in
Bicubic and Lanczos4 interpolation modes. Assertions were added. Appropriate test was changed.
resolves #8272
Warping a matrix with more than 4 channels using BORDER_CONSTANT and
INTER_NEAREST, INTER_CUBIC or INTER_LANCZOS4 interpolation led to
undefined behaviour. This commit changes the behavior of these methods
to be similar to that of INTER_LINEAR. Changed the scope of some of the
variables to more local. Modified some tests to be able to detect the
error described.
added 64b optimization for 3 channels case
not added 64b optimization for 4 channels case since timings did not
show any improvement
split ICV_HLINE cases into inline functions instead of macro for code
size reduction, without significant speed drawback at first sight
medianBlur called with "empty" source and ksize >= 7 crashes the application with an access violation. With this extra assert this is avoided and the application can catch the thrown exception normally.
- don't use undefined flag=0. It should be CONSTANT instead.
- don't allow 'UMat* m=NULL' argument (except LOCAL/CONSTANT flags).
This case is not handled well to provide NULL __global pointers.
It is better to use '-D' macro defines instead (at least for performance)
* OpenVX HAL updated to use generic OpenVX wrappers
* vxErr class from OpenVX HAL replaced with ivx::WrapperError
* reduced usage of vxImage class from OpenVX HAL replaced with ivx::Image
* vxImage class rewritten as ivx::Image subclass that calls swapHandle prior to release
* Fix OpenVX HAL build
* Fix for review comments
OpenVX pyrDown wrappers (#7793)
* wrappers for vx_pyramid added
* initial version of pyrDown() wrapper added
* disabled for Khronos
* rewritten for new macro use; border mode added to node
Add new 5x5 gaussian blur kernel for CV_8UC1 format,
it is 50% ~ 70% faster than current ocl kernel in the perf test.
Signed-off-by: Li Peng <peng.li@intel.com>
Add new OpenCL kernels for bicubic interpolation, it is 20% faster
than current warp image kernel with bicubic interpolation.
Signed-off-by: Li Peng <peng.li@intel.com>
Add new ocl kernels for warpAffine and warpPerspective.
The average performance improvement is about 30%. The new
ocl kernels require CV_8UC1 format and support nearest
neighbor and bilinear interpolation.
Signed-off-by: Li Peng <peng.li@intel.com>
This ocl kernel is 46%~171% faster than current laplacian 3x3
ocl kernel in the perf test, with image format "CV_8UC1".
Signed-off-by: Li Peng <peng.li@intel.com>
Change contour test images to be very wide (#7464)
* Change contour test images to be very wide (#7409, #7458)
Unfortunately, slows down the tests.
* Decrease the number of contour test cases, in order to (at least partially) offset the test run duration increase caused by making the test images wider
* Don't test with very wide images on 32-bit architectures
Maximum depth limit var was added to the instrumentation structure;
Trace names output console output fix: improper tree formatting could happen;
Output in case of error was added;
Custom regions improvements;
Improved timing and weight calculation for parallel regions; New TC (threads counter) value to indicate how many different threads accessed particular node;
parallel_for, warnings fixes and ReturnAddress code from Alexander Alekhin;
This ocl kernel is for 3x3 kernel size and CV_8UC1 format
It is 115% ~ 300% faster than current ocl path in perf test
python ./modules/ts/misc/run.py -t imgproc --gtest_filter=OCL_GaussianBlurFixture*
Signed-off-by: Li Peng <peng.li@intel.com>
This kernel is for CV_8UC1 format and 3x3 kernel size,
It is about 33% ~ 55% faster than current ocl kernel with below perf test
python ./modules/ts/misc/run.py -t imgproc --gtest_filter=OCL_ErodeFixture*
python ./modules/ts/misc/run.py -t imgproc --gtest_filter=OCL_DilateFixture*
Also add accuracy test cases for this kernel, the test command is
./bin/opencv_test_imgproc --gtest_filter=OCL_Filter/MorphFilter3x3*
Signed-off-by: Li Peng <peng.li@intel.com>
* use hasSIMD128 rather than calling checkHardwareSupport
* add SIMD check in spatialgradient.cpp
* add SIMD check in stereosgbm.cpp
* add SIMD check in canny.cpp
The optimization is for CV_8UC1 format and 3x3 box filter,
it is 15%~87% faster than current ocl kernel with below perf test
./modules/ts/misc/run.py -t imgproc --gtest_filter=OCL_BlurFixture*
Also add test cases for this ocl kernel.
Signed-off-by: Li Peng <peng.li@intel.com>
Fix findContours crash for very large images (#7451)
* Cast step to size_t in order to avoid integer overflow when processing very large images
* Change assert to CV_Assert
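To illustrate why the cast to size_t in the list above matters, here is a minimal sketch of a row-offset computation. The helper name `rowOffset` is hypothetical and this is not the actual findContours code; it assumes a 32-bit `int`, where the product of two large values would not fit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte offset of row y in an image whose rows are
// `step` bytes apart. Casting to size_t before multiplying keeps the
// product from being computed (and overflowing) in int.
inline std::size_t rowOffset(int y, int step)
{
    return static_cast<std::size_t>(y) * static_cast<std::size_t>(step);
}
```

For example, row 50000 with a step of 50000 bytes gives an offset of 2,500,000,000, which exceeds the largest 32-bit signed value (2,147,483,647) and would overflow if computed as `y * step` in `int`.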
* seriously improved performance of blur function, especially 3x3 and 5x5 cases
* trying to fix warnings and test failures
* replaced #if 0 with #if IPP_DISABLE_BLOCK
* Improve Canny by using _mm_movemask_epi8 to find next pixel magnitude greater than lower threshold. Added parallelized finalPass to Canny with variable gradients. Little changes in finalPass.
* Some things fixed
* use universal intrinsic for accumulate series using float/double
* accumulate, accumulateSquare, accumulateProduct and accumulateWeighted
* add v_cvt_f64_high in both SSE/NEON
* add test for conversion v_cvt_f64_high in test_intrin.cpp
* improve some existing universal intrinsic by using new instructions in Aarch64
* add workaround for Android build in intrin_neon.hpp
* Add Grana's connected components algorithm for 8-way connectivity. That algorithm is faster than Wu's one (currently implemented in opencv). For more details see https://github.com/prittt/YACCLAB.
* New functions signature and distance transform compatibility
* Add tests to imgproc/test/test_connectedcomponents.cpp
* Change of test_connectedcomponents.cpp for c++98 support
There is an issue with processing of abs(short) function for
negative argument.
Affected OpenCL devices:
- iGPU: Intel(R) HD Graphics 520 (OpenCL 2.0 )
- CPU: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz (OpenCL 2.0 (Build 10094))
* Common Canny parallelization added. TBB and single thread code removed. Final pass vectorized with SSE2 intrinsics.
* wrong #ifdef replaced with #if
* Merged to actual Canny version
* Merged common parallelized Canny with actual Canny implementation
* Remove 'Mutex *mutex' and pass 'Mutex mutex' from outside to parallelCanny
* Replaced extern Mutex with intern mutable Mutex.
When using OCL, the results of goodFeaturesToTrack() vary slightly from
run to run. This appears to be because the order of the results from
the findCorners kernel depends on thread execution and the sorting
function that is used at the end to rank the features only enforces a
partial sort order.
This does not materially impact the quality of the results, but it
makes it hard to build regression tests and generally introduces noise
into the system that should be avoided.
An easy fix is to change the sort function to enforce a total sort on
the features, even in cases where the match quality is exactly the same
for two features.
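The idea of a total sort can be sketched as a comparator that falls back to pixel position when two features have exactly the same quality. The `Corner` struct and the function names are hypothetical, chosen for illustration only; the actual comparator used in goodFeaturesToTrack() may break ties differently.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical feature record: response strength plus pixel position.
struct Corner
{
    float val;
    int x, y;
};

// Strict total order: stronger features first; ties on strength are
// broken by row, then by column, so no two distinct pixels compare equal.
inline bool cornerBefore(const Corner& a, const Corner& b)
{
    if (a.val != b.val) return a.val > b.val;
    if (a.y != b.y) return a.y < b.y;
    return a.x < b.x;
}

// Sorting with a total order gives the same result regardless of the
// order in which the features were produced.
inline void sortCorners(std::vector<Corner>& corners)
{
    std::sort(corners.begin(), corners.end(), cornerBefore);
}
```

With such a comparator, two runs that produce the same set of features in a different order end up with identical sorted lists, which is what makes regression tests stable.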
Add OpenCL support to linearPolar & logPolar.
The OpenCL code uses float instead of double, so that it does not require
the cl_khr_fp64 extension, at the cost of a slight loss of precision.
Add explicit conversion
Add explicit conversion from double to float to eliminate warning during
compilation.
Commits:
67fe57a add fixed video
db0ae2c Restore 2.4 source branch for bug fix 6317.
97ac59c Fix a memory leak indirectly caused by cvDestroyWindow
eb40afa Add a workaround for FFmpeg's color conversion accessing past the end of the buffer
421fcf9 Rearrange CvVideoWriter_FFMPEG::writeFrame for better readability
912592d Remove "INSTALL_NAME_DIR lib" target property
bb1c2d7 fix bug on border at pyrUp
Rewrite linearPolar & logPolar so that they do not depend on the
deprecated API CvMat. Issue 6377 is resolved in this way because the two
routines do not convert UMat to CvMat anymore.
When setting a wrong kernel size, the error message only tells the user that it
must be odd, however the conditions for rejection include values > 7 which must
be communicated. Without that, the message would be incorrect and confusing if
the user is unaware that only values 3, 5, 7 are accepted.
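A check of this kind could be sketched as follows. The function name `checkKernelSize` and the message wording are hypothetical; the sketch only assumes, as stated above, that 3, 5 and 7 are the accepted values.

```cpp
#include <cassert>
#include <string>

// Hypothetical validator: returns an empty string when ksize is accepted,
// otherwise a message that states the full condition, not just "odd".
inline std::string checkKernelSize(int ksize)
{
    if (ksize == 3 || ksize == 5 || ksize == 7)
        return std::string();
    return "Kernel size should be odd and between 3 and 7";
}
```

With this wording, a caller passing 9 (odd, but out of range) gets a message that explains the rejection instead of one that appears to contradict it.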
See the below code snippet:
    while(l_counter != 0)
    {
        int mod = l_counter % LOCAL_TOTAL;
        int pix_per_thr = l_counter / LOCAL_TOTAL + ((lid < mod) ? 1 : 0);
        for (int i = 0; i < pix_per_thr; ++i)
        {
            int index = atomic_dec(&l_counter) - 1;
            ....
        }
        ....
        barrier(CLK_LOCAL_MEM_FENCE);
    }
If we don't put a barrier before the for loop, there is a possibility
that some work items enter this loop while the others have not yet done
so. l_counter is then reduced in the for loop and may reach zero, so the
other work items cannot enter the while loop. If this happens, it breaks
the barrier's rule, which requires all the work items to reach the same
barrier, and it may hang the GPU depending on the implementation of the
OpenCL platform.
This issue is raised at:
https://github.com/Itseez/opencv/issues/5175
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
- added new functions from core module: split, merge, add, sub, mul, div, ...
- added function replacement mechanism
- added example of HAL replacement library
dotProd_16s - disabled for IPP 9.0.0;
filter2D - fixed kernel preparation;
morphology - conditions fix and disabled FilterMin and FilterMax for IPP 9.0.0;
GaussianBlur - disabled for CV_8UC1 due to buffer overflow;
integral - disabled for IPP 9.0.0;
IppAutoBuffer class was added;
HAVE_IPP_ICV_ONLY will be undefined if OpenCV was linked against ICV packet from IPP9 or greater. ICV9+ packets will be aligned with IPP in OpenCV APIs
This will ease code management between IPP and ICV
IPP_VERSION_MAJOR * 100 + IPP_VERSION_MINOR*10 + IPP_VERSION_UPDATE
to manage changes between updates more easily.
IPP_DISABLE_BLOCK was added to ease tracking of disabled IPP functions;
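The version formula above can be sketched as a small function. The name `ippVersionCode` is hypothetical (the real code uses preprocessor macros), and the sketch assumes the minor and update numbers stay below 10 so that the digits do not collide.

```cpp
#include <cassert>

// Hypothetical helper mirroring the formula
// IPP_VERSION_MAJOR * 100 + IPP_VERSION_MINOR * 10 + IPP_VERSION_UPDATE.
// Assumes minor and update are single-digit values.
inline int ippVersionCode(int major, int minor, int update)
{
    return major * 100 + minor * 10 + update;
}
```

Under this encoding IPP 9.0.0 maps to 900, so a single integer comparison such as `code >= 900` can select behaviour for a given release or update.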
- IPP is disabled by default when compiler is mingw (couldn't make it
work)
- fixed some warnings
- fixed some `__GNUC__` version checks (for correctness and convenience)
- removed UTF-8 BOM from hough.cpp (fixes #5253)
Removed IPP port for tiny arithm.cpp functions
Additional warnings fix on various platforms.
Build without OPENCL and GCC warnings fixed
Fixed warnings, trailing spaces and removed unused secure_cpy.
IPP code refactored.
IPP code path implemented as separate static functions to simplify future work with IPP code and make it more readable.
int pix_per_thr = l_counter / LOCAL_TOTAL + ((lid < mod) ? 1 : 0);
The pix_per_thr * LOCAL_TOTAL may be larger than l_counter.
Thus the index into l_stack may be negative, which may cause serious
problems. Let's skip the loop when we get a negative index; we also need
to add the decrement back to l_counter to keep it balanced and avoid a
potentially negative counter.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
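The guard described in the commit message above can be sketched on the host side with `std::atomic<int>` standing in for the kernel's local counter. The function name `claimIndex` is hypothetical and this is not the OpenCL kernel itself; it only shows the "skip on negative index and add the decrement back" pattern.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical host-side analogue of claiming one entry from a shared
// stack. Returns the claimed index, or -1 if the stack was already empty.
inline int claimIndex(std::atomic<int>& counter)
{
    // fetch_sub returns the old value, like atomic_dec(&l_counter) in OpenCL C.
    int index = counter.fetch_sub(1) - 1;
    if (index < 0)
    {
        // Nothing left to claim: undo the decrement so the counter
        // stays balanced and never settles below zero.
        counter.fetch_add(1);
        return -1;
    }
    return index;
}
```

A caller would stop its loop as soon as `claimIndex` returns -1, rather than using the negative value to index the stack.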
Conflicts:
modules/gpu/perf/perf_imgproc.cpp
Cast a long integer to double explicitly.
Conflicts:
modules/python/src2/cv2.cpp
Cast some matrix sizes to type int.
Change some vector mask types to unsigned.
Conflicts:
modules/core/src/arithm.cpp