Commit Graph

2487 Commits

Author SHA1 Message Date
Vadim Pisarevsky
8b3d6603d5 another round of dnn optimization (#9011)
* another round of dnn optimization:
* increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
* improved SIMD optimization of pooling layer, optimized average pooling
* cleaned up convolution layer implementation
* made activation layer "attacheable" to all other layers, including fully connected and addition layer.
* fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it desctibes the topology.
* greatly optimized permutation layer, which improved SSD performance
* parallelized element-wise binary/ternary/... ops (sum, prod, max)

* also, added missing copyrights to many of the layer implementation files

* temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders
2017-06-28 11:15:22 +03:00
Maksim Shabunin
32d4af36e2 Fixing some static analysis issues 2017-06-27 14:30:26 +03:00
Alexander Alekhin
006966e629 trace: initial support for code trace 2017-06-26 17:07:13 +03:00
Vitaly Tuzov
3d7fd4132b Fixed clipLine evaluation for very long lines 2017-06-26 16:00:29 +03:00
catree
a084501ecf Add a note to morphologyEx documentation to clarify the behavior when iterations > 1. 2017-06-25 00:16:16 +02:00
Maksim Shabunin
68d01972fe Merge pull request #8883 from abratchik:DNN.java.wrappers.fix 2017-06-20 09:05:53 +00:00
abratchik
037d8fbdcd Refactor OpenCV Java Wrapping 2017-06-15 20:35:12 +04:00
Alexander Alekhin
b21b694444 Merge pull request #8924 from tomoaki0705:fixWarningResize 2017-06-15 16:11:18 +00:00
Tomoaki Teshima
5ad3ddc1b6 suppress warning
- check if compiler is Intel compiler
 - remove not referenced variables
2017-06-15 18:59:48 +09:00
Alexander Alekhin
9b902adea6 Merge pull request #8832 from terfendail:perf_accumulate_fix 2017-06-14 18:49:09 +00:00
Vadim Pisarevsky
e72e34ec36 Merge pull request #8843 from terfendail:resizenn_patch 2017-06-13 17:29:30 +00:00
Vitaly Tuzov
3a5e036feb Updated fix for accumulate performance test in case of multiple iterations 2017-06-13 19:23:34 +03:00
Vitaly Tuzov
2de1aac665 Updated alignment declarations to CV_DECL_ALIGNED macro 2017-06-13 18:44:38 +03:00
Vitaly Tuzov
59373a1ae1 AVX and SSE optimizations for resize NN 2017-06-01 19:08:55 +03:00
Woody Chow
f743603b0a Fallback to single threaded version of IPP gaussian blur / bilateral filter when the mutlithreaded version cannot be called. 2017-06-01 13:34:50 +09:00
Woody Chow
d22fb5f949 Multithread IPP gaussian blur 2017-05-31 18:16:47 +09:00
Vadim Pisarevsky
ee257ffe9e Merge pull request #8455 from terfendail:ovxhal_skipsmall 2017-05-26 12:10:18 +00:00
Vitaly Tuzov
1d62a025b3 Moved size restrictions for OpenVX processed images to corresponding cpp files 2017-05-25 19:25:17 +03:00
Vadim Pisarevsky
6473018d69 eliminated trailing whitespaces 2017-05-24 16:54:12 +03:00
Vadim Pisarevsky
affb60093d Merge branch 'master' of https://github.com/MicheleCancilla/opencv into parallel_ccomp 2017-05-24 16:51:18 +03:00
mschoeneck
4a4d94f266 Merge pull request #8694 from mschoeneck:Canny
Parallelize Canny with custom gradient (#8694)

* New Canny implementation. Restructuring code in parallelCanny class. Align mag buffer and map.

* Fix warnings.

* Missing SIMD check added.

* Replaced local trailingZeros in contours.cpp. Use alignSize in canny.cpp

* Fix warnings in alignSize and allocate just minimum extra columns.

* Fix another warning in map.create.

* Exchange for loop by do loop to avoid double check at the beginning.
Define extra SIMD CANNY_CHECK to avoid unnecessary continue.
2017-05-24 16:20:25 +03:00
Vadim Pisarevsky
2e056fbe8a Merge pull request #6854 from sturkmen72:patch-8 2017-05-24 12:45:00 +00:00
Vadim Pisarevsky
9734ee13e5 Merge pull request #7865 from LaurentBerger:UserColormap 2017-05-24 12:43:55 +00:00
Vadim Pisarevsky
057c10baac Merge pull request #8377 from ottogin:interpMultichannelImg 2017-05-24 12:38:41 +00:00
Vadim Pisarevsky
19464a3ed8 Merge pull request #8780 from vpisarev:fix_boxfilter 2017-05-23 21:46:13 +00:00
Vadim Pisarevsky
883d925f59 replaced SSE2 code with universal intrinsics; improved accuracy of the box filter; it should now be bit-exact 2017-05-23 20:04:35 +03:00
Vadim Pisarevsky
a065e4b9aa Merge pull request #8769 from mshabunin:kw-fixes 2017-05-23 14:59:36 +00:00
Vadim Pisarevsky
55ee8b2917 Merge pull request #8182 from chacha21:drawing_performance 2017-05-23 14:53:12 +00:00
Alexander Alekhin
c5e9d1adae macro for static analysis tools 2017-05-23 12:35:31 +03:00
chacha21
7763a86634 restored memset optimization
when dropping optimizations in the last commit, I forgot to keep the
simplest case where a single memset can be called
2017-05-19 16:05:00 +02:00
Vadim Pisarevsky
913a2dbdaa Merge pull request #8399 from woodychow:filter_avx2 2017-05-17 15:06:24 +00:00
Vadim Pisarevsky
03aa69da99 Merge pull request #8697 from sovrasov:cvt_col_bgra_bgra_fix 2017-05-17 14:11:51 +00:00
Vitaly Tuzov
01f773b803 Fix for accumulate performance test in case of multiple iterations 2017-05-16 18:28:13 +03:00
Woody Chow
67fe820c75 Merging master to filter_avx2, and resolving conflicts 2017-05-16 15:34:11 +09:00
Vladislav Sovrasov
bfc4eb31cb imgproc: fix BGRA2BGRA conversion 2017-05-15 10:23:20 +03:00
chacha21
fa4fd48072 Drop best optimizations to reduce code size
Only keep the ICV_HLINE_X optimization to reduce code size.
2017-05-10 13:55:39 +02:00
Vadim Pisarevsky
833832ac91 Merge pull request #8391 from woodychow:warpAffine_avx2 2017-05-03 15:09:54 +00:00
Vadim Pisarevsky
e00d0528a4 Merge pull request #8397 from woodychow:initUndistortRectifyMap_avx2 2017-05-03 14:50:22 +00:00
Vadim Pisarevsky
8a16997b1e Merge pull request #8580 from terfendail:ovx_newperftest 2017-05-03 11:01:01 +00:00
Maksim Shabunin
f71f76add9 Merge pull request #8649 from saskatchewancatch:8647 2017-05-03 09:19:29 +00:00
saskatchewancatch
bbb785e43c Issue 8647: Updated doc for cv::matchTemplate to reflect current support for methods when mast template is supplied. 2017-04-30 20:22:47 -06:00
Michele Cancilla
9405c6d206 Improvement of array of equivalences’ upper bound + fix some wrong comments 2017-04-27 12:53:33 +02:00
Vitaly Tuzov
2492c299f3 Extended set of existing performance test to OpenVX HAL suitable execution modes 2017-04-27 12:32:29 +03:00
Alexander Alekhin
26be2402a3 Merge pull request #8629 from lupustr3:pvlasov/icv2017u2_update2 2017-04-26 10:45:37 +00:00
Pavel Vlasov
11c2ffaf1c Update for IPP for OpenCV 2017u2 integration;
Updated integrations for:
cv::split
cv::merge
cv::insertChannel
cv::extractChannel
cv::Mat::convertTo - now with scaled conversions support
cv::LUT - disabled due to performance issues
Mat::copyTo
Mat::setTo
cv::flip
cv::copyMakeBorder - currently disabled
cv::polarToCart
cv::pow - ipp pow function was removed due to performance issues
cv::hal::magnitude32f/64f - disabled for <= SSE42, poor performance
cv::countNonZero
cv::minMaxIdx
cv::norm
cv::canny - new integration. Disabled for threaded;
cv::cornerHarris
cv::boxFilter
cv::bilateralFilter
cv::integral
2017-04-25 15:53:12 +03:00
Alexander Alekhin
41d55c5095 Merge pull request #8620 from saskatchewancatch:8457 2017-04-25 12:22:51 +00:00
saskatchewancatch
3fe18392ef Updated comments for cv::ellipse and cv::ellipse2Poly to clarify some behaviour that has confused some users.
Amend: Delete trailing whitespace to make doc tests happy
2017-04-25 14:31:46 +03:00
Alexander Alekhin
f1c8094f5f Merge pull request #8575 from lupustr3:pvlasov/icv2017u2_initial_update 2017-04-21 10:55:29 +00:00
Alessandro Gentilini
fc8f1890c0 Fix markdown format. 2017-04-21 10:29:13 +02:00
Pavel Vlasov
35c7216846 IPP for OpenCV 2017u2 initial enabling patch; 2017-04-20 20:26:30 +03:00
Vitaly Tuzov
4c0d833dec Disabled vxuConvolution call for sepFilter evaluation 2017-04-11 15:57:20 +03:00
Vitaly Tuzov
87bb74312b Disabled vxuConvolution call for Sobel, GaussianBlur and Box filter evaluation 2017-04-11 14:11:55 +03:00
Vitaly Tuzov
0f1a56da7c Changed restrictions for OpenVX HAL calls on small images 2017-04-09 01:17:57 +03:00
Vitaly Tuzov
bf5b7843e8 Extended set of OpenVX HAL calls disabled for small images 2017-04-06 18:17:32 +03:00
Maksim Shabunin
ce50df564c Fixed cvtColor OCL compilation issue (BGRA2mBGRA) 2017-04-05 11:48:29 +03:00
Alexander Alekhin
e93aa158cf Merge pull request #8496 from Sahloul:fixes/wrappers/imgproc/EMD 2017-04-03 20:55:51 +00:00
Tomoaki Teshima
6ab10fc3ac suppress warnings on GCC 4.9 series
- check boundary strictly
 - initialize the variable before using it
2017-04-01 20:53:50 +09:00
Hamdi Sahloul
6a856d677c Wraps cv::EMD for Python and Java 2017-04-01 17:20:03 +09:00
Artem Lukoyanov
84a0a91d16 Merge branch 'master' of https://github.com/opencv/opencv into interpMultichannelImg
Added assertios to remap and warpAffine functions

As @mshabunin said, remap and warpAffine functions do not support more than 4 channels in
Bicubic and Lanczos4 interpolation modes. Assertions were added. Appropriate test was chenged.
resolves #8272
2017-03-24 23:58:51 +03:00
Artem Lukoyanov
c4ae5c0ee5 Added assertios to remap and warpAffine functions
Remap and warpAffine functions do not support more than 4 channels in
Bicubic and Lanczos4 interpolation modes. Assertions were added.
resolve #8272
2017-03-24 11:19:58 +03:00
Vadim Pisarevsky
a57d144076 Merge pull request #7462 from alalek:cpu_multi_target 2017-03-21 19:51:32 +00:00
Alexander Alekhin
b3d128bb39 Merge pull request #8401 from avartenkov:multichannel_warp 2017-03-21 11:59:56 +00:00
Alexander Alekhin
741e51396b Merge pull request #8416 from berak:patch-2 2017-03-21 11:57:57 +00:00
Alexander Alekhin
e77a5d5f13 Merge pull request #8422 from berak:fix_shapematchmodes 2017-03-21 09:06:30 +00:00
vartenkov
3fbe1f8d64 Fix multichannel warping with BORDER_CONSTANT
Warping a matrix with more than 4 channels using BORDER_CONSTANT and
INTER_NEAREST, INTER_CUBIC or INTER_LANCZOS4 interpolation led to
undefined behaviour. This commit changes the behavior of these methods
to be similar to that of INTER_LINEAR. Changed the scope of some of the
variables to more local. Modified some tests to be able to detect the
error described.
2017-03-20 15:21:49 +03:00
berak
11f3c0741e imgproc: move ShapeMatchModes enum from c to c++ header 2017-03-20 09:59:19 +01:00
berak
0b31eca9c2 remove unnessecary print statement
#resolves: 7881

remove printf statement and associated variables from invMapPointSpherical() in undistort.cpp
2017-03-19 10:12:50 +01:00
Woody Chow
9eecb5a9fe Optimize RowVec_32f and SymmColumnVec_32f with AVX2 2017-03-16 15:42:58 +09:00
Woody Chow
05476d6604 Optimize initUndistortRectifyMap with AVX2 2017-03-16 13:50:24 +09:00
Woody Chow
9a29fc2ce1 Optimize WarpAffine using AVX2 2017-03-16 10:13:56 +09:00
Fangjun KUANG
246d3761ce Merge pull request #8383 from csukuangfj/patch-10
* Improve documentation.

* Update imgproc.hpp
2017-03-15 11:12:59 +00:00
Fangjun KUANG
2a30d8c9f9 Update documentation for cv::accumulate.
Make it more clear for the type of input argument.
2017-03-10 17:53:12 +01:00
chacha21
8c7d29e526 more minor changes to fix -Wunused-function warning on Apple platforms 2017-03-09 18:08:34 +01:00
chacha21
94c58e7347 minor changes to fix -Wunused-function warning on Apple platforms 2017-03-09 17:28:52 +01:00
Alexander Alekhin
ba8a6e3533 ocl: don't use vload4 for 3 channel images 2017-03-03 19:36:38 +03:00
chacha21
27cfe31b64 more ICV_HLINE specific cases
added ICV_HLINE custom implementations for element sizes up to 32
but timings show that it is not very relevant for sizes >= 12
2017-03-03 11:47:46 +01:00
Vadim Pisarevsky
b46364e436 Merge pull request #7996 from mshabunin:hal-filter-revert 2017-03-02 11:12:08 +00:00
chacha21
92a3dbe18f more ICV_HLINE optimization
added 64b optimization for 3 channels case
not added 64b optimization for 4 channels case since timings did not
show any improvement
split ICV_HLINE cases into inline functions instead of macro for code
size reduction, without significand speed drawback at first sight
2017-03-02 09:44:12 +01:00
Vadim Pisarevsky
408ef5c65b Merge pull request #8288 from Jejos:bugfix_medianBlur_accessviolation 2017-03-02 05:53:09 +00:00
Vadim Pisarevsky
5f990566c4 Merge pull request #8297 from csukuangfj:csukuangfj-patch 2017-03-02 05:47:33 +00:00
Alexander Alekhin
da0b1d8821 Merge pull request #8238 from PkLab:fix_doc_ellipse 2017-03-01 14:31:06 +00:00
Alexander Alekhin
47c4dcc8a3 Merge pull request #8204 from terfendail:ovx_tlcontext 2017-03-01 12:36:37 +00:00
Fangjun KUANG
34c70e7a1c Fix typos. 2017-03-01 11:13:46 +01:00
Jejos
5169c79978 fix medianBlur accessviolation
medianBlur called with "empty" source and ksize >= 7 crashes application with accessviolation. With this extra assert this is avoided and the application may normally catch the thrown exception.
2017-02-28 08:31:28 +01:00
Maksim Shabunin
c4c1c4c9bb Replaced several hal:: classes with functions, marked old variants deprecated 2017-02-27 12:13:31 +03:00
Maksim Shabunin
0d7666a012 Merge branch 'master' into master 2017-02-26 07:46:59 +03:00
Alexander Alekhin
dcbed8d676 Merge pull request #8250 from tomoaki0705:fixNonAsciiChar 2017-02-24 11:19:00 +00:00
Tomoaki Teshima
822c67fdee remove non ASCII character from comment 2017-02-24 01:31:32 +09:00
PkLab.net
e03c81d90a Change image e small fix to cv::ellipse() Doc 2017-02-23 09:10:48 +01:00
Tomoaki Teshima
aec59aba34 suppress warnings
- brush up the implementation
2017-02-23 09:11:12 +09:00
Vitaly Tuzov
9a4b5a4545 OpenVX calls updated to use single common OpenVX context per thread 2017-02-21 16:08:23 +03:00
chacha21
afbcc07184 Merge remote-tracking branch 'origin/drawing_performance' into drawing_performance
# Conflicts:
#	modules/imgproc/src/drawing.cpp
2017-02-21 12:03:15 +01:00
chacha21
91a0270432 try to fix Android compilation 2017-02-21 12:02:23 +01:00
Tomoaki Teshima
64cf206fb5 optimize blend using universal intrinsic
- add more channels/depth performance test for blend
2017-02-20 19:09:26 +09:00
Vadim Pisarevsky
9053839282 Merge pull request #8178 from tomoaki0705:addBayer2RGBA 2017-02-16 15:01:49 +00:00
Alexander Alekhin
4c7aa8645a ocl: validate arguments in KernelArgs constructor
- don't use undefined flag=0. It should be CONSTANT instead.
- don't allow 'UMat* m=NULL' argument (except LOCAL/CONSTANT flags).
  This case is not handled well to provide NULL __global pointers.
  It is better to use '-D' macro defines instead (at least for performance)
2017-02-14 16:10:32 +03:00
Alexander Alekhin
e16227b53c cmake: support multiple CPU targets 2017-02-13 19:52:59 +03:00
chacha21
16a9407fbf new try to adapt to iOS build bot 2017-02-11 11:26:55 +01:00
chacha21
e19000a56f adaptation for iOS buildbot 2017-02-11 11:07:00 +01:00
chacha21
7521bcc32c comment unused function
On MacOS and iOS, the unused opencvBigToHost32 is a warning for buildbot
2017-02-10 22:34:44 +01:00