Commit Graph

2543 Commits

Author SHA1 Message Date
Rostislav Vasilikhin
f0ef7bd149 Lab and Luv tests: rewritten, constants explained 2017-09-12 17:16:30 +03:00
Rostislav Vasilikhin
0e6c335077 Imgproc_ColorLab_Full.accuracy test fixed 2017-09-12 17:16:30 +03:00
Maksim Shabunin
248e2c7d47 Fixed some issues found by static analysis 2017-09-08 12:22:12 +03:00
Alexander Alekhin
89bb028bfc imgproc(ocl): don't use doubles to process float data 2017-09-07 12:42:20 +03:00
Vitaly Tuzov
e8caa9b5c0 removed unused interpolateLinear 2017-08-31 15:34:27 +03:00
Vitaly Tuzov
b1f46b6d69 Move resize implementation to separate file 2017-08-31 14:36:19 +03:00
Vadim Pisarevsky
d743a4c969 Merge pull request #9506 from alalek:ocl_fix_canny_ub_9496 2017-08-30 13:37:44 +00:00
Alexander Alekhin
e3b12bdb59 imgproc(ocl): eliminate div by zero in Canny 2017-08-29 19:29:53 +03:00
Vladislav Sovrasov
91e56abcf1 imgproc: disable buggy inplace processing in convexHull 2017-08-29 15:28:34 +03:00
Alexander Alekhin
9c14a2f0aa Merge pull request #9395 from lupustr3:pvlasov/icv2017u3_update 2017-08-24 11:48:53 +00:00
Pavel Vlasov
a57718e1ac ICV2017u3 package update;
- Optimizations set change. Now IPP integrations will provide code for SSE42, AVX2 and AVX512 (SKX) CPUs only. For HW below SSE42 IPP code is disabled.
- Performance regressions fixes for IPP code paths;
- cv::boxFilter integration improvement;
- cv::filter2D integration improvement;
2017-08-23 14:24:43 +03:00
Alexander Alekhin
fdb3d4ff60 Merge pull request #9379 from berak:imgproc_hanning 2017-08-16 10:28:19 +00:00
berak
e7b9cfa8f2 imgproc:fix winSize in createHanningWindow() 2017-08-16 08:53:45 +02:00
Alexander Alekhin
6b2510d312 Merge pull request #9317 from sturkmen72:warpPerspective_demo 2017-08-14 13:15:48 +00:00
Suleyman TURKMEN
8bb3863f52 New example - warpPerspective_demo.cpp
An example program shows using cv::findHomography and cv::warpPerspective for image warping
2017-08-10 15:08:13 +03:00
KUANG, Fangjun
4bbe67451d fix some typos in the documentation. 2017-08-08 17:32:04 +02:00
Alexander Alekhin
c95a97389d Merge pull request #9235 from sturkmen72:patch-3 2017-08-03 17:04:28 +00:00
Jiri Horner
bb6496d9e5 Merge pull request #8951 from hrnr:akaze_part2
[GSOC] Speeding-up AKAZE, part #2 (#8951)

* feature2d: instrument more functions used in AKAZE

* rework Compute_Determinant_Hessian_Response

* this takes 84% of time of Feature_Detection
* run everything in parallel
* compute Scharr kernels just once
* compute sigma more efficiently
* allocate all matrices in evolution without zeroing

* features2d: add one bigger image to tests

* now test have images: 600x768, 900x600 and 1385x700 to cover different resolutions

* explicitly zero Lx and Ly

* add Lflow and Lstep to evolution as in original AKAZE code

* reworked computing keypoints orientation

integrated faster function from https://github.com/h2suzuki/fast_akaze

* use standard fastAtan2 instead of getAngle

* compute keypoints orientation in parallel

* fix visual studio warnings

* replace some wrapped functions with direct calls to OpenCV functions

* improved readability for people familiar with opencv
* do not same image twice in base level

* rework diffusity stencil

* use one pass stencil for diffusity from https://github.com/h2suzuki/fast_akaze
* improve locality in Create_Scale_Space

* always compute determinat od hessian and spacial derivatives

* this needs to be computed always as we need derivatives while computing descriptors
* fixed tests of AKAZE with KAZE descriptors which have been affected by this

Currently it computes all first and second order derivatives together and the determiant of the hessian. For descriptors it would be enough to compute just first order derivates, but it is not probably worth it optimize for scenario where descriptors and keypoints are computed separately, since it is already very inefficient. When computing keypoint and descriptors together it is faster to do it the current way (preserves locality).

* parallelize non linear diffusion computation

* do multiplication right in the nlp diffusity kernel

* rework kfactor computation

* get rid of sharing buffers when creating scale space pyramid, the performace impact is neglegible

* features2d: initialize TBB scheduler in perf tests

* ensures more stable output
* more reasonable profiles, since the first call of parallel_for_ is not getting big performace hit

* compute_kfactor: interleave finding of maximum and computing distance

* no need to go twice through the data

* start to use UMats in AKAZE to leverage OpenCl in the future

* fixed bug that prevented computing determinant for scale pyramid of size 1 (just the base image)
* all descriptors now support writing to uninitialized memory
* use InputArray and OutputArray for input image and descriptors, allows to make use UMAt that user passes to us

* enable use of all existing ocl paths in AKAZE

* all parts that uses ocl-enabled functions should use ocl by now

* imgproc: fix dispatching of IPP version when OCL is disabled

* when OCL is disabled IPP version should be always prefered (even when the dst is UMat)

* get rid of copy in DeterminantHessian response

* this slows CPU version considerably
* do no run in parallel when running with OCL

* store derivations as UMat in pyramid

* enables OCL path computing of determint hessian
* will allow to compute descriptors on GPU in the future

* port diffusivity to OCL

* diffusivity itself is not a blocker, but this saves us downloading and uploading derivations

* implement kernel for nonlinear scalar diffusion step

* download the pyramid from GPU just once

we don't want to downlaod matrices ad hoc from gpu when the function in AKAZE needs it. There is a HUGE mapping overhead and without shared memory support a LOT of unnecessary transfers.

This maps/downloads matrices just once.

* fix bug with uninitialized values in non linear diffusion

* this was causing spurious segfaults in stitching tests due to propagation of NaNs
* added new test, which checks for NaNs (added new debug asserts for NaNs)
* valgrind now says everything is ok

* add nonlinear diffusion step OCL implementation

* Lt in pyramid changed to UMat, it will be downlaoded from GPU along with Lx, Ly
* fix bug in pm_g2 kernel. OpenCV mangles dimensions passed to OpenCL, so we need to check for boundaries in each OCL kernel.

* port computing of determinant to OCL

* computing of determinant is not a blocker, but with this change we don't need to download all spatial derivatives to CPU, we only download determinant
* make Ldet in the pyramid UMat, download it from CPU together with the other parts of the pyramid
* add profiling macros

* fix visual studio warning

* instrument non_linear_diffusion

* remove changes I have made to TEvolution

* TEvolution is used only in KAZE now

* Revert "features2d: initialize TBB scheduler in perf tests"

This reverts commit ba81e2a711.
2017-08-01 12:46:01 +00:00
Suleyman TURKMEN
89480801b8 some improvements on tutorials 2017-07-29 20:08:19 +03:00
Alexander Alekhin
caa5e3b4c5 imgproc: fix vectorized code of accumulate 2017-07-26 17:21:46 +03:00
Rostislav Vasilikhin
70b984434d RGB2Lab_f and trilinear interpolation code are in separate branch; cubeRoot(x) instead of std::pow(x, 1.f/3.f)
file with internal accuracy&speed tests moved to lab_tetra branch
2017-07-17 21:59:54 +03:00
Alexander Alekhin
e5ed9cc612 Merge pull request #8498 from savuor:bit_exact_lab 2017-07-17 14:01:05 +00:00
Alexander Alekhin
4bb4a349c9 imgproc: fix warp optimizations 2017-07-17 15:12:41 +03:00
Shuyu Liang
c10d08f795 Fix typo in imgproc.hpp 2017-07-17 15:51:10 +08:00
Rostislav Vasilikhin
4b75be801e initial version of Lab2RGB_f tetrahedral interpolation written
RGB2Lab_f added, bugs fixed, moved to float

several bugs fixed

LUT fixed, no switch in tetraInterpolate()

temporary code; to be removed and rewritten

before refactoring

extra interpolations removed, some things to do left

added Lab2RGB_b +XYZ version, etc.

basic version is done, to be sped up

tetra refactored

interpolations: LUT for weights, refactor., etc.

address arithm optimized

initial version of vectorized code added (not compiling now)

compilation fixed, now segfaults

a lot of fixes, vectorization temp. disabled

fixed trilinear shift size, max error dropped from 19 to 10

fixed several bugs (255 vs 256, signed vs unsigned, bIdx)

minor changes

packed: address arithmetics fixed

shorter code

experiments with pure integer calculations

Lab2RGB max error decreased to 2; need to clean the code

ready for vectorization; need cleaning

vectorized, to be debugged

precision fixed, max error is 2

Lab->XYZ shortened

minor fixes

Lab2RGB_f version fixed, to be completely rewritten using _b code

RGB2Lab_f vectorized

minors

moved to separate file

refactored Lab2RGB to float and int versions

minor fix

Lab2RGB_f vectorized

minor refactoring

Lab2RGBint refactored: process methods, vectorize by 4 pix

Lab2RGB_f int version is done

cleanup extra code

code copied to color.cpp

fixed blue idx bug

optimizations enabled when testing; mulFracConst introduced

divConst -> mulFracConst

calc min time in perf instead of avg

minors

process() slightly sped up

Lab2RGB_f: disabled int version

reinterpret added, minor fixes in names

some warnings fixed

changes transferred to color.cpp

RGB2Lab_f code (and trilinear interpolation code) moved to rgb2lab_faster

whitespace

shift negative fixed

more warnings fixed

"constant condition" warnings fixed, little speed up

minor changes

test_photo decolor fixed

changes copied to test_lab.cpp

idx bounds checking in LUT init

several fixes

WIP: softfloat almost integrated

test_lab partially rewritten to SoftFloat

color.cpp rewritten to SoftFloat

test_lab.cpp: accuracy code added

several fixes

RGB2Lab_b testing fixed

splineBuild() rewritten to SoftFloat

accuracy control improved

rounding fixed

Luv <=> RGB: rewritten to SoftFloat

OCL cvtColor Lab and Lut rewritten to SoftFloat

minor fixes

refactored to new SoftFloat interface

round() -> cvRound, etc.

fixed OCL tests

softfloat.cpp: internal functions made static, unused ones removed

meaningful constants

extra lines removed

unused function removed

unfinished work

it works, need to fix TODOs

refactoring; more calls rewritten

mulFracConst removed

constants made bit exact; minors

changes moved to color.cpp

fixed 1 bug and 4 warnings

OCL: fixed constants

pow(x, _1_3f) replaced by cubeRoot(x)

fixed compilation on MSVC32

magic constants explained

file with internal accuracy&speed tests moved to lab_tetra branch
2017-07-17 00:32:30 +03:00
Alexander Alekhin
5a54acef4e Merge pull request #9130 from alalek:android_define 2017-07-14 17:17:24 +00:00
Alexander Alekhin
63e89bc326 Merge pull request #9048 from sovrasov:morph_hitmiss_fix
imgproc: fix MORPH_HITMISS operation when kernel has no negative values
2017-07-14 17:13:06 +00:00
Alexander Alekhin
11feae6631 Merge pull request #9041 from terfendail:filter_avx
AVX optimized implementation of separable filters migrated
2017-07-14 16:45:27 +00:00
Alexander Alekhin
9ef742bbf4 Merge pull request #9082 from terfendail:imgwarp_avx
AVX and SSE4.1 optimized implementation of resize and warp functions migrated
2017-07-14 16:42:42 +00:00
Alexander Alekhin
fe7fd4c312 Merge pull request #9098 from savuor:fix/luv_div 2017-07-14 15:46:03 +00:00
Alexander Alekhin
9439872a62 Merge pull request #9021 from terfendail:corner_avx 2017-07-14 14:58:06 +00:00
Alexander Alekhin
f6dd549e58 Merge pull request #9027 from terfendail:undistort_avx 2017-07-14 14:56:42 +00:00
Alexander Alekhin
8f4b534937 Merge pull request #9093 from wzw-intel:histogram 2017-07-14 14:38:55 +00:00
Alexander Alekhin
10e6491c22 Merge pull request #9024 from tomoaki0705:featureDispatchAccumulate 2017-07-14 14:30:06 +00:00
Vladislav Sovrasov
bb0f9d6bc4 core: use matlab-style in 2d fftShift 2017-07-13 13:33:02 +03:00
Vladislav Sovrasov
a683a496ea core: use matlab-style 1d fftShift in pc 2017-07-12 18:01:58 +03:00
Alexander Alekhin
a4a47b538c build: detect Android via '__ANDROID__' macro
https://sourceforge.net/p/predef/wiki/OperatingSystems
2017-07-10 12:43:59 +03:00
Tomoaki Teshima
e7d5dbfec0 dispatch accumulate series
- use universal intrinsic for base
 - dispatch for float/double version using AVX
 - AVX2 optimization not done yet
2017-07-07 18:45:30 +09:00
Vitaly Tuzov
526d1d6db1 AVX optimized implementation of undistort migrated to separate file 2017-07-06 12:08:25 +03:00
Rostislav Vasilikhin
aa621d6f3c magic constants explained 2017-07-06 00:30:53 +03:00
Rostislav Vasilikhin
704c688225 OCL code fixed, fix for NEON added 2017-07-05 22:08:49 +03:00
Rostislav Vasilikhin
6c71988c54 RGB2Luv_f: R, G, B limited to [0, 1] 2017-07-05 22:08:49 +03:00
Rostislav Vasilikhin
82811d0706 Luv: singularities fixed 2017-07-05 22:08:49 +03:00
wzw
635342ab73 ocl_calcHist1: Use proper local size for merge_histogram kernel
merge_histogram kernel only need "BINS" theads to accumulate the
histgrams, it is not efficient to directly use maxGroupSize as
local size if maxGroupSize is far greater then BINS.
2017-07-05 21:24:09 +08:00
Vitaly Tuzov
fadf25acd6 SSE4_1 optimized implementation of resize and warp functions migrated to separate file 2017-07-04 17:05:36 +03:00
Vitaly Tuzov
4d0f789e0a AVX optimized implementation of separable filters migrated to separate file 2017-07-04 13:47:47 +03:00
Vladislav Sovrasov
42936d3227 imgproc: fix MORPH_HITMISS operation when kernel has no negative values 2017-07-04 11:17:44 +03:00
Tony Lian
c8783f3e23 Merge pull request #9075 from TonyLianLong:master
Remove unnecessary Non-ASCII characters from source code (#9075)

* Remove unnecessary Non-ASCII characters from source code

Remove unnecessary Non-ASCII characters and replace them with ASCII
characters

* Remove dashes in the @param statement

Remove dashes and place single space in the @param statement to keep
coding style

* misc: more fixes for non-ASCII symbols

* misc: fix non-ASCII symbol in CMake file
2017-07-03 16:14:17 +00:00
Alexander Alekhin
7bb9237d99 Merge pull request #9060 from alalek:canny_inplace_bug 2017-07-03 16:06:50 +00:00
Vitaly Tuzov
3681dcef1a AVX optimized implementation of resize and warp functions migrated to separate file 2017-07-03 18:18:20 +03:00
Maksim Shabunin
1f23202ad8 Issues found by static analysis (5th round) 2017-07-01 18:56:24 +03:00
Alexander Alekhin
cdf2a59afa canny: disallow broken inplace arguments 2017-06-30 19:32:16 +03:00
Alexander Alekhin
a84a5e8f1a Merge pull request #8936 from terfendail:clipline_fix 2017-06-30 10:55:09 +00:00
Maksim Shabunin
e0393f8557 Fixed some issues found by static analysis (4th round) 2017-06-30 12:26:53 +03:00
Vitaly Tuzov
1ed9a58b64 AVX optimized implementation of Harris corner detector migrated to separate file 2017-06-29 15:19:23 +03:00
Maksim Shabunin
a769d69a9d Fixed several issues found by static analysis 2017-06-28 18:06:18 +03:00
Vadim Pisarevsky
8b3d6603d5 another round of dnn optimization (#9011)
* another round of dnn optimization:
* increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
* improved SIMD optimization of pooling layer, optimized average pooling
* cleaned up convolution layer implementation
* made activation layer "attacheable" to all other layers, including fully connected and addition layer.
* fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it desctibes the topology.
* greatly optimized permutation layer, which improved SSD performance
* parallelized element-wise binary/ternary/... ops (sum, prod, max)

* also, added missing copyrights to many of the layer implementation files

* temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders
2017-06-28 11:15:22 +03:00
Maksim Shabunin
32d4af36e2 Fixing some static analysis issues 2017-06-27 14:30:26 +03:00
Alexander Alekhin
006966e629 trace: initial support for code trace 2017-06-26 17:07:13 +03:00
Vitaly Tuzov
3d7fd4132b Fixed clipLine evaluation for very long lines 2017-06-26 16:00:29 +03:00
catree
a084501ecf Add a note to morphologyEx documentation to clarify the behavior when iterations > 1. 2017-06-25 00:16:16 +02:00
Maksim Shabunin
68d01972fe Merge pull request #8883 from abratchik:DNN.java.wrappers.fix 2017-06-20 09:05:53 +00:00
abratchik
037d8fbdcd Refactor OpenCV Java Wrapping 2017-06-15 20:35:12 +04:00
Alexander Alekhin
b21b694444 Merge pull request #8924 from tomoaki0705:fixWarningResize 2017-06-15 16:11:18 +00:00
Tomoaki Teshima
5ad3ddc1b6 suppress warning
- check if compiler is Intel compiler
 - remove not referenced variables
2017-06-15 18:59:48 +09:00
Alexander Alekhin
9b902adea6 Merge pull request #8832 from terfendail:perf_accumulate_fix 2017-06-14 18:49:09 +00:00
Vadim Pisarevsky
e72e34ec36 Merge pull request #8843 from terfendail:resizenn_patch 2017-06-13 17:29:30 +00:00
Vitaly Tuzov
3a5e036feb Updated fix for accumulate performance test in case of multiple iterations 2017-06-13 19:23:34 +03:00
Vitaly Tuzov
2de1aac665 Updated alignment declarations to CV_DECL_ALIGNED macro 2017-06-13 18:44:38 +03:00
Vitaly Tuzov
59373a1ae1 AVX and SSE optimizations for resize NN 2017-06-01 19:08:55 +03:00
Woody Chow
f743603b0a Fallback to single threaded version of IPP gaussian blur / bilateral filter when the mutlithreaded version cannot be called. 2017-06-01 13:34:50 +09:00
Woody Chow
d22fb5f949 Multithread IPP gaussian blur 2017-05-31 18:16:47 +09:00
Vadim Pisarevsky
ee257ffe9e Merge pull request #8455 from terfendail:ovxhal_skipsmall 2017-05-26 12:10:18 +00:00
Vitaly Tuzov
1d62a025b3 Moved size restrictions for OpenVX processed images to corresponding cpp files 2017-05-25 19:25:17 +03:00
Vadim Pisarevsky
6473018d69 eliminated trailing whitespaces 2017-05-24 16:54:12 +03:00
Vadim Pisarevsky
affb60093d Merge branch 'master' of https://github.com/MicheleCancilla/opencv into parallel_ccomp 2017-05-24 16:51:18 +03:00
mschoeneck
4a4d94f266 Merge pull request #8694 from mschoeneck:Canny
Parallelize Canny with custom gradient (#8694)

* New Canny implementation. Restructuring code in parallelCanny class. Align mag buffer and map.

* Fix warnings.

* Missing SIMD check added.

* Replaced local trailingZeros in contours.cpp. Use alignSize in canny.cpp

* Fix warnings in alignSize and allocate just minimum extra columns.

* Fix another warning in map.create.

* Exchange for loop by do loop to avoid double check at the beginning.
Define extra SIMD CANNY_CHECK to avoid unnecessary continue.
2017-05-24 16:20:25 +03:00
Vadim Pisarevsky
2e056fbe8a Merge pull request #6854 from sturkmen72:patch-8 2017-05-24 12:45:00 +00:00
Vadim Pisarevsky
9734ee13e5 Merge pull request #7865 from LaurentBerger:UserColormap 2017-05-24 12:43:55 +00:00
Vadim Pisarevsky
057c10baac Merge pull request #8377 from ottogin:interpMultichannelImg 2017-05-24 12:38:41 +00:00
Vadim Pisarevsky
19464a3ed8 Merge pull request #8780 from vpisarev:fix_boxfilter 2017-05-23 21:46:13 +00:00
Vadim Pisarevsky
883d925f59 replaced SSE2 code with universal intrinsics; improved accuracy of the box filter; it should now be bit-exact 2017-05-23 20:04:35 +03:00
Vadim Pisarevsky
a065e4b9aa Merge pull request #8769 from mshabunin:kw-fixes 2017-05-23 14:59:36 +00:00
Vadim Pisarevsky
55ee8b2917 Merge pull request #8182 from chacha21:drawing_performance 2017-05-23 14:53:12 +00:00
Alexander Alekhin
c5e9d1adae macro for static analysis tools 2017-05-23 12:35:31 +03:00
chacha21
7763a86634 restored memset optimization
when dropping optimizations in the last commit, I forgot to keep the
simplest case where a single memset can be called
2017-05-19 16:05:00 +02:00
Vadim Pisarevsky
913a2dbdaa Merge pull request #8399 from woodychow:filter_avx2 2017-05-17 15:06:24 +00:00
Vadim Pisarevsky
03aa69da99 Merge pull request #8697 from sovrasov:cvt_col_bgra_bgra_fix 2017-05-17 14:11:51 +00:00
Vitaly Tuzov
01f773b803 Fix for accumulate performance test in case of multiple iterations 2017-05-16 18:28:13 +03:00
Woody Chow
67fe820c75 Merging master to filter_avx2, and resolving conflicts 2017-05-16 15:34:11 +09:00
Vladislav Sovrasov
bfc4eb31cb imgproc: fix BGRA2BGRA conversion 2017-05-15 10:23:20 +03:00
chacha21
fa4fd48072 Drop best optimizations to reduce code size
Only keep the ICV_HLINE_X optimization to reduce code size.
2017-05-10 13:55:39 +02:00
Vadim Pisarevsky
833832ac91 Merge pull request #8391 from woodychow:warpAffine_avx2 2017-05-03 15:09:54 +00:00
Vadim Pisarevsky
e00d0528a4 Merge pull request #8397 from woodychow:initUndistortRectifyMap_avx2 2017-05-03 14:50:22 +00:00
Vadim Pisarevsky
8a16997b1e Merge pull request #8580 from terfendail:ovx_newperftest 2017-05-03 11:01:01 +00:00
Maksim Shabunin
f71f76add9 Merge pull request #8649 from saskatchewancatch:8647 2017-05-03 09:19:29 +00:00
saskatchewancatch
bbb785e43c Issue 8647: Updated doc for cv::matchTemplate to reflect current support for methods when mast template is supplied. 2017-04-30 20:22:47 -06:00
Michele Cancilla
9405c6d206 Improvement of array of equivalences’ upper bound + fix some wrong comments 2017-04-27 12:53:33 +02:00
Vitaly Tuzov
2492c299f3 Extended set of existing performance test to OpenVX HAL suitable execution modes 2017-04-27 12:32:29 +03:00
Alexander Alekhin
26be2402a3 Merge pull request #8629 from lupustr3:pvlasov/icv2017u2_update2 2017-04-26 10:45:37 +00:00