Alexander Alekhin
13ecd5bb25
Merge pull request #15122 from pmur:fast-math-improvements
2019-08-14 19:28:05 +00:00
Alexander Alekhin
32772a5436
3.4: backported changes from 'master' branch
2019-08-14 16:36:08 +03:00
Paul E. Murphy
f38a61c66d
fast_math: implement optimized PPC routines
...
Implement cvRound using inline asm. No compiler support
exists today to properly optimize this. This results in
about a 4x speedup over the default rounding. Likewise,
simplify the growing number of rounding function overloads.
For P9 enabled targets, utilize the classification
testing instruction to test for Inf/Nan values. Operation
speedup is about 1.2x for FP32, and 1.5x for FP64 operands.
For P8 targets, fall back to the GCC nan inline. It provides
a 1.1/1.4x improvement for FP32/FP64 arguments.
2019-08-07 15:01:18 -05:00
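Note on f38a61c66d: the PPC inline asm itself is not shown in this log. As a rough illustration of the operation being optimized, the sketch below rounds a double to the nearest int in portable C++. It assumes the default floating-point rounding mode (ties go to the even integer), and `round_to_int_sketch` is a hypothetical name, not OpenCV's cvRound.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not OpenCV's cvRound): rounds to the nearest integer
// using the current floating-point rounding mode, which is
// round-half-to-even unless the program has changed it.
inline int round_to_int_sketch(double value)
{
    return static_cast<int>(std::lrint(value));
}
```

A hand-written asm version would be expected to return the same values for in-range inputs; the speedups quoted in the commit are not reproduced by this sketch.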
Paul E. Murphy
3f92bcc11a
fast_math: selectively use GCC rounding builtins when available
...
Add a new macro definition OPENCV_USE_FASTMATH_GCC_BUILTINS to enable
usage of GCC inline math functions, if available and requested by the
user.
Likewise, enable it for POWER. This is nearly always a substantial
improvement over using integer manipulation as most operations can
be done in several instructions with no branching. The result is a
1.5-1.8x speedup in the ceil/floor operations.
1. As tested with AT 12.0-1 (GCC 8.3.1) compiler on P9 LE.
2019-08-07 15:01:18 -05:00
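Note on 3f92bcc11a: the contrast between "integer manipulation" and a compiler-provided floor can be sketched as below. Both function names are hypothetical, and `std::floor` stands in for the GCC builtin here so the example stays portable; whether it compiles to a few branch-free instructions depends on the compiler and target.

```cpp
#include <cassert>
#include <cmath>

// Integer-manipulation floor: truncate toward zero, then subtract one
// when truncation rounded a negative non-integer upward.
inline int floor_by_truncation(double value)
{
    int i = static_cast<int>(value);
    return i - (i > value);
}

// Library floor: compilers such as GCC may expand this to a builtin
// (and a short instruction sequence) on targets that support it.
inline int floor_by_library(double value)
{
    return static_cast<int>(std::floor(value));
}
```

Both variants should agree for values that fit in an int; only their generated code differs.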
Alexander Alekhin
821f17d666
Merge pull request #15235 from pmur:vsx-v_signmask-vbpermq
2019-08-06 20:09:22 +00:00
Paul E. Murphy
1031b7f4bc
hal: vsx: further optimize v_signmask
...
Use the quadword bit permutation instruction to creatively move
the sign bits to create the mask. Note that values above 127 will
result in 0.
2019-08-05 09:00:22 -05:00
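Note on 1031b7f4bc: for readers unfamiliar with v_signmask, the scalar sketch below shows the result the vector code is expected to produce, assuming the usual meaning of a sign mask (bit i is set when lane i is negative). `signmask_reference` and the 16 x int8 lane layout are illustrative assumptions; the vbpermq-based implementation is not reproduced.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference: gather the sign bit of every lane
// into one integer, lane 0 in the least significant bit.
inline int signmask_reference(const std::array<std::int8_t, 16>& lanes)
{
    int mask = 0;
    for (std::size_t i = 0; i < lanes.size(); ++i)
    {
        if (lanes[i] < 0)
            mask |= 1 << i;
    }
    return mask;
}
```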
Alexander Alekhin
2693ed9b22
Merge tag '3.4.7'
2019-07-25 19:19:49 +00:00
Alexander Alekhin
4a7ca5a291
OpenCV version++ (3.4.7)
...
OpenCV 3.4.7
2019-07-25 19:01:19 +00:00
Alexander Alekhin
6158bd2afa
Merge pull request #15103 from alalek:simd_intrinsics_in_user_code
2019-07-25 11:36:36 +00:00
Hugo Lindström
2ee00e7f7d
Merge pull request #15059 from hugolm84:improved-support-for-wince
...
* Improve support for Windows Embedded Compact
* Remove redundant set(WINCE true) and format CMake
2019-07-24 23:12:09 +03:00
Alexander Alekhin
8bac8b513c
core: support SIMD intrinsics in user code
2019-07-19 20:33:32 +00:00
Alexander Alekhin
002904e445
Merge pull request #15050 from alalek:core_fix_base64_packed_struct
2019-07-18 19:07:06 +00:00
Alexander Alekhin
4ea8526e9f
core(persistence): fix writeRaw() / readRaw() struct support
...
- writeRaw(): support structs
- readRaw(): 'len' is buffer limit in bytes (documentation is fixed)
2019-07-16 14:03:39 +03:00
Hugo Lindström
245c256b1c
Support compilation for <=VS13
2019-07-12 19:02:36 +02:00
Alexander Alekhin
69560588fe
Merge pull request #14953 from alalek:core_static_analysis_eval_expr
2019-07-02 09:44:29 +00:00
Vitaly Tuzov
9befb7a1d7
Merge pull request #14916 from terfendail:wsignmask_deprecated
...
* Avoid using v_signmask universal intrinsic and mark it as deprecated
* Renamed v_find_negative to v_scan_forward
2019-07-01 19:53:51 +03:00
Alexander Alekhin
44836c7f78
core: evaluate CV_Error() parameters during static scans
2019-07-01 18:17:03 +03:00
Stefan Brüns
e9a2e665b2
Explicitly default operator= for Vec<T, n>
...
Due to the explicitly declared copy constructor Vec<T, n>::Vec(Vec <T,n>&)
GCC 9 warns if there is no assignment operator, as having one typically
requires the other (rule-of-three, constructor/destructor/assignment).
As the values are just a plain array the default assignment operator does
the right thing. Tell the compiler explicitly to default it.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
2019-06-29 22:11:00 +02:00
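Note on e9a2e665b2: the pattern described in the message can be sketched with a small stand-in type. `VecSketch` is hypothetical and much simpler than cv::Vec; it only shows a user-declared copy constructor paired with an explicitly defaulted copy assignment operator.

```cpp
#include <cassert>

// Hypothetical stand-in for a fixed-size vector type.
template <typename T, int n>
struct VecSketch
{
    T val[n];

    VecSketch() : val() {}  // value-initialize (zero) all elements

    // User-declared copy constructor: with this present, GCC 9 can warn
    // when the implicitly generated assignment operator is used.
    VecSketch(const VecSketch& other)
    {
        for (int i = 0; i < n; ++i)
            val[i] = other.val[i];
    }

    // The elements are a plain array, so member-wise assignment is correct.
    VecSketch& operator=(const VecSketch&) = default;
};
```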
Rostislav Vasilikhin
f2f600f807
fixed multi instrumentations
2019-06-27 01:17:26 +03:00
Alexander Alekhin
e8a703a71d
core(intrin): v_load_low() workaround for aarch64+clang
2019-06-25 17:29:04 +03:00
Alexander Alekhin
779f59da6b
pre: OpenCV 3.4.7 (version++)
2019-06-21 16:57:17 +03:00
Alexander Alekhin
aa6c66aa54
Merge pull request #14848 from alalek:build_warnings_avx512
2019-06-21 13:53:52 +00:00
Alexander Alekhin
5ac55fc132
core: eliminate AVX512 build warnings
...
from MSVS2017 and GCC8 -O1 mode
2019-06-20 20:00:09 +03:00
Alexander Alekhin
681e0323f2
core: backport toLowerCase()/toUpperCase()
2019-06-20 17:48:18 +03:00
Vitaly Tuzov
a29e59a770
Rename parameters in AVX512 implementation of v_load_deinterleave and v_store_interleave
2019-06-14 14:16:30 +03:00
Vitaly Tuzov
d2aadabc5e
Merge pull request #14743 from terfendail:wui512_fixvswarn
...
Fix for MSVS2019 build warnings (#14743)
* AVX512 arch support for MSVS
* Fix for MSVS2019 build warnings: updated integral() AVX512 implementation
* Fix for MSVS2019 build warnings: reworked v_rotate_right AVX512 implementation
* fix indentation
2019-06-11 23:07:39 +03:00
Alexander Alekhin
52644f067e
Merge pull request #14764 from alalek:core_intrin_drop_hasSIMD_checks
2019-06-09 17:11:45 +00:00
Alexander Alekhin
6d916c5bb4
Merge pull request #14440 from alalek:async_array
2019-06-08 20:57:15 +00:00
Alexander Alekhin
1e9ad5476d
core(intrin): drop hasSIMD128 checks
...
- use compile-time checks instead (`#if CV_SIMD128`)
- runtime checks are useless
2019-06-08 19:20:20 +00:00
Alexander Alekhin
4a8fd71a2e
core: fix visibility handling
2019-06-07 07:23:15 +00:00
Alexander Alekhin
aab9ef4290
Merge pull request #14667 from asashour:javadoc
2019-06-06 10:57:39 +00:00
Ahmed Ashour
5c56b8ce92
java: generated code to have javadoc
2019-06-05 12:44:03 +02:00
Ahmed Ashour
1aca1d582e
Fix some typos
2019-06-05 12:24:13 +02:00
Vitaly Tuzov
3b015dfc7d
Merge pull request #14210 from terfendail:wui_512
...
AVX512 wide universal intrinsics (#14210)
* Added implementation of 512-bit wide universal intrinsics(WIP)
* Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP)
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks
* Added implementation of 512-bit wide universal intrinsics(WIP): build fixes
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros
* Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask()
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines
* Added implementation of 512-bit wide universal intrinsics(WIP): Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable.
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask.
* Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512()
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build
* Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.
2019-06-03 18:05:35 +03:00
Vitaly Tuzov
723165f878
fix for AVX2 version of v_reduce_min intrinsic
2019-05-31 16:14:54 +03:00
Vitaly Tuzov
f0fb91f2d4
Fixed v_signmask implementation for AVX2, updated universal intrinsics tests.
2019-05-24 19:34:54 +03:00
Alexander Alekhin
9340af1a8a
core: Async API / AsyncArray
2019-05-18 19:32:23 +00:00
catree
b5e2ec4ea4
Fix typo in NormTypes documentation.
2019-05-16 19:22:41 +02:00
Vitaly Tuzov
7a55f2af3b
Updated AVX2 implementation of v_popcount for u8.
2019-05-15 19:39:25 +03:00
Vitaly Tuzov
1220dd4877
Updated v_popcount description, reference implementation and test.
2019-05-14 18:59:40 +03:00
Vitaly Tuzov
96ab78dc4f
Reworked v_popcount implementation to provide number of bits in a single lane
2019-05-14 18:59:38 +03:00
Sayed Adel
5a77f4cee3
Merge pull request #14007 from seiko2plus:core_avx512_infa
...
* core: improve AVX512 infrastructure by adding more CPU features groups
* cmake: use groups for AVX512 optimization flags
* core: remove gap in CPU flags enumeration
* cmake: restore default CPU_DISPATCH
2019-05-05 14:19:49 +03:00
Sayed Adel
afb157df67
core:vsx fix sum of v_reduce_sad
2019-04-27 02:01:24 +02:00
Alexander Alekhin
b95fdc1992
Merge pull request #14394 from alalek:build_support_memory_sanitizers
2019-04-26 16:13:52 +00:00
Alexander Alekhin
d17699363c
Merge pull request #14385 from terfendail:intrin_sad
2019-04-26 15:34:02 +00:00
Vitaly Tuzov
18d10d6b86
Fixed v_reduce_sad intrinsics implementation and added tests
2019-04-24 14:53:59 +03:00
Alexander Alekhin
c1981f28ad
build: +OPENCV_ENABLE_MEMORY_SANITIZER flag
2019-04-22 21:35:25 +00:00
Vitaly Tuzov
4a54aa3fbd
Cleared up deprecated intrinsics for FP16
2019-04-22 10:35:37 +03:00
Alexander Alekhin
b38de57f9a
ts: test tags for flexible/reliable tests filtering
...
- added functionality to collect memory usage of OpenCL subsystem
- memory usage of fastMalloc() (disabled by default):
* It is not accurate sometimes - external memory profiler is required.
- specify common `CV_TEST_TAG_` macros
- added applyTestTag() function
- write memory usage / enabled tags into Google Tests output file (.xml)
2019-04-08 19:12:49 +00:00
Alexander Alekhin
dad2247b56
Merge tag '3.4.6'
2019-04-07 11:02:40 +00:00
Alexander Alekhin
33b765d797
OpenCV version++ (3.4.6)
...
OpenCV 3.4.6
2019-04-06 21:43:23 +00:00
Alexander Alekhin
d6b82dcd65
Merge pull request #14162 from alalek:eliminate_coverity_scan_issues
...
core: eliminate coverity scan issues (#14162)
* core(hal): avoid using of r,g,b,a parameters in interleave/deinterleave
- static analysis tools blame on possible parameters reordering
- align AVX parameters with corresponding SSE/NEON/VSX/cpp code
* core: avoid "i,j" parameters in Matx methods
- static analysis tools blame on possible parameters reordering
* core: resolve coverity scan issues
2019-03-27 15:48:00 +03:00
Alexander Alekhin
55366caecd
Merge pull request #14155 from alalek:fix_macos_ocl_warnings_3.4
2019-03-26 15:34:49 +00:00
Alexander Alekhin
6686559c70
ocl: define CL_SILENCE_DEPRECATION on MacOSX
2019-03-26 13:11:53 +03:00
Alexander Alekhin
cedd78d526
Merge pull request #14142 from mshabunin:fix-c-api-3.4
2019-03-25 18:58:28 +00:00
Maksim Shabunin
41da3ef1d2
Fixed cvdef.h for MSVC C users
2019-03-25 16:44:08 +03:00
Sayed Adel
f41359688b
core:vsx Add support for VSX3 half precision conversions
2019-03-20 10:19:42 +02:00
Sayed Adel
4fe2d9bdbc
core:vsx Several improvements(3)
...
* optimize v_lut_deinterleave
* optimize v_interleave_/pairs/quads/triplets
* optimize v_lut, use vec_extract instead of aligned store
2019-03-19 12:30:50 +02:00
Sayed Adel
872e7894b4
core:vsx working around gcc aligned memory access bug
...
- allow cmake to check sanity of vsx aligned ld/st
- force universal intrinsics v_load_aligned/v_store_aligned
to fall back to unaligned ld/st if the cmake runtime vsx aligned test fails
2019-03-14 01:55:40 +02:00
Alexander Alekhin
80e5642ca2
pre: OpenCV 3.4.6 (version++)
2019-03-12 13:29:42 +03:00
Alexander Alekhin
842c58a7d6
core(intrin): NEON v_load_expand_q() support unaligned addr
2019-03-11 12:06:05 +00:00
Alexander Alekhin
8b541e450b
imgproc: dispatch color*
...
Lab/XYZ modes have been postponed (color_lab.cpp):
- need to split code for tables initialization and for pixels processing first
- no significant performance improvements for switching between SSE42 / AVX2 code generation
2019-03-07 15:45:05 +03:00
Sayed Adel
5478165e16
core:vsx Fix narrowing warning on vector splats
2019-03-01 00:48:38 +00:00
Alexander Alekhin
a9f67c2d1d
Merge pull request #13905 from terfendail:pyr_wintr2
2019-02-28 14:53:42 +00:00
berak
20afae5a14
core: fix mat matx multiplication
2019-02-28 14:22:54 +01:00
Vitaly Tuzov
9548093b46
Horizontal line processing for pyrDown() reworked using wide universal intrinsics.
2019-02-28 00:12:57 +03:00
catree
576ab3df9a
Add division operators for Matx.
2019-02-27 19:36:23 +01:00
Vitaly Tuzov
334c4d62b5
Merge pull request #13781 from terfendail:warp_wintr
...
Resize reworked using wide universal intrinsics (#13781)
* Added wide universal intrinsics optimized implementation for 3 channel bit-exact linear resize
* Reworked linear resize using new wide LUT intrinsics
* Fix for VSX intrinsics
2019-02-20 14:30:28 +03:00
Alexander Alekhin
cd66f6e3db
core: dispatch matmul
...
- gemm: keep baseline only (lapack is 10x+ faster, lets reduce binary size)
- transform / distTransform
- scaleAdd (32f/64f only)
- Mahalanobis: keep baseline only (no perf tests)
- mulTransposed: keep baseline only (no perf tests)
- dot
2019-02-18 14:36:46 +03:00
klemens
5d9c6723ee
spelling fixes
...
backport 997b7b18af
2019-02-11 15:35:10 +03:00
Namgoo Lee
fb8e652c3f
Add CV_16UC1 support for cuda::CLAHE
...
Due to size limit of shared memory, histogram is built on
the global memory for CV_16UC1 case.
The amount of memory needed for building histogram is:
65536 * 4byte = 256KB
and shared memory limit is 48KB typically.
Added test cases for CV_16UC1 and various clip limits.
Added perf tests for CV_16UC1 on both CPU and CUDA code.
There was also a bug in CV_8UC1 case when redistributing
"residual" clipped pixels. Adding the test case where clip
limit is 5.0 exposes this bug.
2019-02-06 17:21:55 +00:00
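Note on fb8e652c3f: the memory arithmetic in the message can be checked with a few constants. The names below are hypothetical, the 48 KB limit is the "typical" figure quoted in the commit, and the 4-byte bin for the 8-bit case is an assumption made only for comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One histogram bin per possible pixel value, 4 bytes per bin.
constexpr std::size_t kBins16U = std::size_t(1) << 16;                   // 65536 bins
constexpr std::size_t kHistBytes16U = kBins16U * sizeof(std::uint32_t);  // 256 KB
constexpr std::size_t kBins8U = std::size_t(1) << 8;                     // 256 bins
constexpr std::size_t kHistBytes8U = kBins8U * sizeof(std::uint32_t);    // 1 KB (assumed bin size)

// Typical shared memory limit quoted in the commit message.
constexpr std::size_t kSharedLimitBytes = 48 * 1024;

constexpr bool fits_in_shared(std::size_t bytes)
{
    return bytes <= kSharedLimitBytes;
}
```

Under these assumptions the 16-bit histogram exceeds the limit by more than a factor of five, which is consistent with building it in global memory.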
Alexander Alekhin
dc5e69b4d4
Revert "Merge pull request #13586 from eightco:Core_bugfix3"
...
This reverts commit 3721c8bb06
except changes in modules/dnn/test/test_tf_importer.cpp
2019-01-18 18:29:12 +03:00
Lee Jaehwan
3721c8bb06
Merge pull request #13586 from eightco:Core_bugfix3
...
* Add Operator override for multi-channel Mat with literal constant.
* simple test
* Operator overloading channel constraint for primitive types
* fix some test for #13586
2019-01-17 17:23:09 +03:00
Vitaly Tuzov
ea882d58c6
Added CV_ALWAYS_INLINE macro
2019-01-11 22:40:35 +03:00
Alexander Alekhin
b11566bfc7
Merge pull request #13553 from luctowers:master
2019-01-09 13:33:45 +00:00
Lucas Towers
9cc12ff0ac
Fix improper defining of CV_XADD when using Intel C++
2019-01-09 14:41:21 +03:00
Namgoo Lee
4b4874e67a
Remove build warning msg with CUDA10.0
2019-01-08 10:57:12 +09:00
Vitaly Tuzov
c8f59bf1e0
Fixed operations on Mat and Matx simultaneously
2018-12-25 19:22:09 +03:00
Alexander Alekhin
f35e043cf9
Merge tag '3.4.5'
2018-12-21 21:48:03 +03:00
Alexander Alekhin
8f1356c3c5
OpenCV version++ (3.4.5)
...
OpenCV 3.4.5
2018-12-21 17:31:20 +03:00
Vitaly Tuzov
06f32e3b3e
Reworked separable filter to use wide universal intrinsics
2018-12-19 17:50:09 +03:00
Alexander Alekhin
f605898bae
core: fix eigen2cv() - don't change fixed type of 'dst'
2018-12-16 06:43:08 +00:00
Sayed Adel
4e16ae9a1f
core:vsx fix build failure on GCC<=6 due to implementation of v_reduce_sum(v_float64x2)
2018-12-14 19:24:12 +00:00
Vitaly Tuzov
3903174f7c
Merge pull request #13334 from terfendail:histogram_wintr
...
* added performance test for compareHist
* compareHist reworked to use wide universal intrinsics
* Disabled vectorization for CV_COMP_CORREL and CV_COMP_BHATTACHARYYA if f64 is unsupported
2018-12-13 14:20:22 +03:00
Alexander Alekhin
a811059bfb
Merge pull request #13336 from sergiud:core_sse_immediates_gcc-5.4.0
2018-11-30 09:51:59 +00:00
Sergiu Deitsch
e43a5ff9be
fixed gcc 5.4.0 compilation errors
2018-11-30 08:48:19 +01:00
Vitaly Tuzov
00c9ab8c23
Merge pull request #13317 from terfendail:norm_wintr
...
* Added performance tests for hal::norm functions
* Added sum of absolute differences intrinsic
* norm implementation updated to use wide universal intrinsics
* improve and fix v_reduce_sad on VSX
2018-11-29 19:34:14 +03:00
Maksim Shabunin
89f0e0a8d1
Fixed misleading indentation in intrin_cpp.hpp
2018-11-27 15:29:37 +03:00
Etienne Brateau
736683ce2f
Fix missing check part (defined(__cplusplus)) in header types_c.h
2018-11-22 01:39:09 +01:00
Alexander Alekhin
6e67fd2752
Merge pull request #13224 from seiko2plus:core_ppc64le_infa
2018-11-20 21:26:05 +00:00
Sayed Adel
474a0dac49
core: several improvements and fixes on ppc64le infrastructure
...
- add infrastructure support for Power9/VSX3
- fix missing VSX flags on GCC4.9 and CLANG4 (#13210, #13222)
- fix disabling VSX optimization on GCC by using flag ENABLE_VSX
- flag ENABLE_VSX is deprecated now, use CPU_BASELINE, CPU_DISPATCH instead
- add VSX3 to arithmetic dispatchable flags
2018-11-20 15:28:46 +00:00
1over
b6367f5821
fixed operator- for Rect
2018-11-20 00:48:17 +01:00
Alexander Alekhin
605071e76f
Merge pull request #13146 from terfendail:bilateral_nan
2018-11-19 15:59:12 +00:00
Alexander Alekhin
183bc5c281
Merge tag '3.4.4'
...
OpenCV 3.4.4
2018-11-17 13:00:28 +00:00
Alexander Alekhin
a1fe8f754f
OpenCV version++ (3.4.4)
...
OpenCV 3.4.4
2018-11-17 10:22:17 +00:00
Alexander Alekhin
1d5a528107
Merge pull request #12354 from alalek:samples_find_file
2018-11-16 22:40:49 +03:00
Vitaly Tuzov
f5b6bea2d4
Raised bilateralFilter processing precision for CV_32F matrices containing NaNs
2018-11-16 12:07:04 +03:00
Alexander Alekhin
1c04a5ec47
Merge pull request #12965 from terfendail:medianBlur_wintr
2018-11-16 00:47:11 +00:00
Alexander Alekhin
2fa9bd221d
core: add utils::findDataFile() / samples::findFile()
2018-11-16 00:25:06 +00:00
Alexander Alekhin
96c71dd3d2
dnn: reduce set of ignored warnings
2018-11-15 13:15:59 +03:00
Vitaly Tuzov
28fd967148
Updated bilateralFilter implementations to use wide universal intrinsics
2018-11-09 15:27:30 +03:00
Alexander Alekhin
bb7cfcbcdb
Merge pull request #12064 from seiko2plus:coreUnvintrinArithm2
2018-11-08 14:02:40 +00:00
Vitaly Tuzov
e5d7f446d6
Merge pull request #13056 from terfendail:box_wintr
...
* Updated boxFilter implementations to use wide universal intrinsics
* boxFilter implementation moved to separate file
* Replaced ROUNDUP macro with roundUp() function
2018-11-07 23:59:36 +03:00
Alexander Alekhin
d4e3405db2
Merge pull request #13045 from LaurentBerger:kmeansdoc
...
typo in kmeans doc
2018-11-06 20:00:47 +03:00
LaurentBerger
5132102863
typo in kmeans doc
2018-11-04 21:30:31 +01:00
Alexander Alekhin
79dc0ed175
docs: intro formatting update, minor cleanup
2018-11-04 02:36:24 +00:00
Sayed Adel
93ffebc273
core: reimplement SIMD arithmetic, logic and comparison operations into wide universal intrinsics
...
- initialize arithmetic dispatcher
- add new universal intrinsic v_absdiffs
- add new universal intrinsic v_pack_b
- add accumulate version of universal intrinsic v_round
- fix sse/avx2:uint8 multiplication overflow
- reimplement arithmetic, logic and comparison operations into wide universal intrinsics
with full support for all types
- reimplement IPP arithmetic, logic and comparison operations in a separate file arithm_ipp.hpp
- avoid scalar multiplication if scaling factor eq 1 and use integer multiplication
- move C arithmetic operations to precomp.hpp and delete [arithm_simd|arithm_core].hpp
- add compatibility with new opencv4 divide policy
2018-10-30 12:48:31 +02:00
Rostislav Vasilikhin
daff6e6484
_mm256_zeroupper replaced by zeroall
2018-10-26 18:12:07 +03:00
Alexander Alekhin
7f608db244
core: move compiler defines from base.hpp into cvdef.h
2018-10-25 03:02:01 +00:00
Alexander Alekhin
2c029aae46
Merge pull request #12914 from seiko2plus:issue12830
2018-10-24 13:15:23 +00:00
maver1
e397434cb6
Merge pull request #12877 from maver1:3.4
...
* Updated ICV packages and IPP integration
* core(test): minMaxIdx IPP regression test
* core(ipp): workaround minMaxIdx problem
* core(ipp): workaround meanStdDev() CV_32FC3 buffer overrun
* Returned semicolon after CV_INSTRUMENT_REGION_IPP()
2018-10-24 15:02:53 +03:00
Sayed Adel
8b26906d6d
core:vsx change behavior of v_round to rounding to nearest even
2018-10-24 06:31:31 +00:00
Mansoo Kim
4d1f0ef2d9
cuda: fix build with CUDA 10.x
2018-10-17 17:35:40 +00:00
Alexander Alekhin
f185640eda
Merge pull request #12799 from alalek:update_build_js
...
* js: update build script
- support emscripten 1.38.12 (wasm is ON by default)
- verbose build messages
* js: use builtin Math functions
* js: disable tracing code completely
2018-10-15 17:35:21 +03:00
Alexander Alekhin
1cc3f7abbb
Merge pull request #12516 from seiko2plus:changeUnvMultiply16
2018-10-15 12:07:40 +00:00
Alexander Alekhin
0f41daeba5
Merge pull request #12641 from dkurt:dnn_samples_args_autofill
2018-10-13 12:28:08 +00:00
Sayed Adel
5771fd693d
Change behaviour of 16-bit multiply operator
...
- redefine 16-bit multiply operator to perform saturating multiply
instead of non-saturating multiply
- implement 8-bit multiply operator to perform saturating multiply
- implement v_mul_wrap() for 8-bit, 16-bit non-saturating multiply
- improve performance of v_mul_hi() for VSX
- update intrin tests with new changes
- replace unv 16-bit multiplication operator with v_mul_wrap due to behavior changes
- Several improvements depend on vpisarev review
* initial forward declarations for universal intrinsics
* move emulating SSE intrinsics into separate file
* implement v_mul_expand for 8-bit
* reimplement saturating multiply using v_mul_expand + v_pack
* map v_expand, v_load_expand, v_load_expand_q to sse4.1
* fix overflow avx2::v_pack(uint32)
* implement two universal intrinsics v_expand_low and v_expand_high
2018-10-11 04:35:39 +02:00
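Note on 5771fd693d: the difference between the saturating multiply operator and v_mul_wrap() can be sketched per lane in scalar code. The two function names are hypothetical and the example covers only signed 16-bit values; it is not the universal intrinsics implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Saturating multiply: compute in 32 bits, then clamp to the int16 range.
inline std::int16_t mul_saturate_s16(std::int16_t a, std::int16_t b)
{
    const std::int32_t wide = static_cast<std::int32_t>(a) * static_cast<std::int32_t>(b);
    const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
    const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
    const std::int32_t clamped = wide < lo ? lo : (wide > hi ? hi : wide);
    return static_cast<std::int16_t>(clamped);
}

// Wrapping (non-saturating) multiply: keep only the low 16 bits of the product.
inline std::int16_t mul_wrap_s16(std::int16_t a, std::int16_t b)
{
    const std::int32_t wide = static_cast<std::int32_t>(a) * static_cast<std::int32_t>(b);
    const std::uint16_t low =
        static_cast<std::uint16_t>(static_cast<std::uint32_t>(wide) & 0xFFFFu);
    // uint16 -> int16 conversion is modular on mainstream compilers
    // (and guaranteed to be so from C++20).
    return static_cast<std::int16_t>(low);
}
```

The two agree whenever the product fits in 16 bits, so code that relied on the old wrapping behaviour only changes results on overflow.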
Vitaly Tuzov
cc10e6b344
pyrDown and pyrUp SSE2 implementations replaced with wide universal intrinsics implementations
2018-10-10 21:12:47 +03:00
Alexander Alekhin
68fe37b008
Merge pull request #12755 from alalek:fix_allocSingleton
2018-10-08 15:30:17 +00:00
Alexander Alekhin
18bf91a08b
core: update allocSingleton implementation, valgrind suppression
2018-10-05 18:25:13 +03:00
Alexander Alekhin
c716e374c1
Merge pull request #12744 from alalek:issue_12736
2018-10-05 10:20:13 +00:00
Alexander Alekhin
aeec6e43eb
Merge pull request #12749 from powderluv:fix-clang-cl-tzcnt
2018-10-05 09:28:07 +00:00
Anush Elangovan
630a94b8b7
_tzcnt_u32() is undefined in clang-cl so use alternate impl
...
_tzcnt_u32() is not exported by clang-cl intrin.h so check for
clang-cl and enable an alternate for _tzcnt_u32()
Some discussions:
http://lists.llvm.org/pipermail/cfe-dev/2016-October/051329.html
https://bugs.llvm.org/show_bug.cgi?id=30506
TEST=Build with clang-cl
2018-10-04 14:04:22 -07:00
Rostislav Vasilikhin
da5e0ef461
ocl::KernelArg::Local(): added size argument
2018-10-04 17:19:09 +03:00
Alexander Alekhin
0926a84a45
cmake: define CV_ErrorNoReturn under CV_STATIC_ANALYSIS
...
to avoid build break without `__OPENCV_BUILD`
2018-10-04 14:43:43 +03:00
Maksim Shabunin
15632c6305
Added support for multi-path configuration parameter (env)
2018-10-01 17:50:47 +03:00
Alexander Alekhin
4b895a4d1f
Merge pull request #12657 from alalek:docs_repair_cuda_section
2018-09-28 09:45:50 +00:00
Rostislav Vasilikhin
be989b3b60
Merge pull request #12637 from savuor:fix/instr_ipp_ocl
...
Fixes for instrumentation of IPP and OCL (#12637)
* fixed warning about re-declaring variable when both IPP and instrumentation are enabled
* fixed segfault when no funName provided
* compilation fixed when both OCL and instrumentation are enabled
2018-09-27 22:39:06 +03:00
Dmitry Kurtaev
24ab751547
Merge pull request #12565 from dkurt:dnn_non_intel_gpu
...
* Remove isIntel check from deep learning layers
* Remove fp16->fp32 fallbacks where it's not necessary
* Fix Kernel::run to prevent localsize > globalsize
2018-09-26 16:27:00 +03:00
Alexander Alekhin
962dc21f2b
docs: fix CUDA docs section
2018-09-26 15:36:55 +03:00
Dmitry Kurtaev
ad5898224d
Add a file with preprocessing parameters for deep learning networks
2018-09-25 18:28:37 +03:00
Hamdi Sahloul
5d54def264
Add semicolons after CV_INSTRUMENT macros
2018-09-14 06:45:31 +09:00
Maksim Shabunin
78c500e97a
Removed unnecessary build-time MediaSDK detection
2018-09-13 13:43:11 +03:00
Hamdi Sahloul
03b3be0f51
MSVC: Silence external/meaningless warnings
2018-09-12 20:02:13 +09:00
Alexander Alekhin
492ef14550
Merge pull request #12494 from DEEPIR:3.4
2018-09-11 19:38:00 +00:00
cyy
8f78a1123b
fix uninitialized read errors reported by CUDA-INITCHECK
2018-09-11 14:47:39 +08:00
Alexander Alekhin
95dd4b3f27
bindings: add debug helpers for args conversions
2018-09-08 12:23:08 +00:00
cyy
286c2c236b
Merge pull request #12458 from DEEPIR:3.4
...
* may be a typo fix
* remove identical branch, may be paste error
* add parentheses around macro parameter
* simplify if condition
* check malloc fail
* change the condition of branch removed by commit 3041502861
2018-09-07 18:43:47 +03:00
Hamdi Sahloul
a39e0daacf
Utilize CV_UNUSED macro
2018-09-07 20:33:52 +09:00
Alexander Alekhin
f1f15841d7
Merge pull request #11630 from alalek:c_api_eliminate_constructors
2018-09-06 20:07:16 +00:00
Vadim Pisarevsky
80b62a41c6
Merge pull request #12411 from vpisarev:wide_convert
...
* rewrote Mat::convertTo() and convertScaleAbs() to wide universal intrinsics; added always-available and SIMD-optimized FP16<=>FP32 conversion
* fixed compile warnings
* fix some more compile errors
* slightly relaxed accuracy threshold for int->float conversion (since we now do it using single-precision arithmetics, not double-precision)
* fixed compile errors on iOS, Android and in the baseline C++ version (intrin_cpp.hpp)
* trying to fix ARM-neon builds
* trying to fix ARM-neon builds
* trying to fix ARM-neon builds
* trying to fix ARM-neon builds
2018-09-06 19:36:59 +03:00
Vadim Pisarevsky
54279523a3
Merge pull request #12437 from vpisarev:avx2_fixes
...
* trying to fix the custom AVX2 builder test failures (false alarms)
* fixed compile error with CPU_BASELINE=AVX2 on x86; raised tolerance thresholds in a couple of tests
* fixed compile error with CPU_BASELINE=AVX2 on x86; raised tolerance thresholds in a couple of tests
* fixed compile error with CPU_BASELINE=AVX2 on x86; raised tolerance thresholds in a couple of tests
* seemingly disabled false alarm warning in surf.cpp; increased tolerance thresholds in the tests for SolvePnP and in DNN/ENet
2018-09-06 18:56:55 +03:00
Alexander Alekhin
8a3c394d6a
don't use constructors for C API structures
2018-09-06 14:34:16 +03:00
Alexander Alekhin
ad146e5a6b
core: remove constructors from C API structures
...
POD structures can't have constructors.
2018-09-06 14:34:09 +03:00
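Note on ad146e5a6b: the statement that POD structures can't have constructors can be illustrated with standard type traits. Both struct names are hypothetical and unrelated to the actual C API types; the sketch only shows that a user-provided constructor makes a type non-trivial.

```cpp
#include <cassert>
#include <type_traits>

// Plain C-style structure: trivial and standard-layout, i.e. POD.
struct PointPlain
{
    int x;
    int y;
};

// Same data, but a user-provided constructor makes the type non-trivial,
// so it no longer qualifies as POD.
struct PointWithCtor
{
    int x;
    int y;
    PointWithCtor(int x_ = 0, int y_ = 0) : x(x_), y(y_) {}
};

static_assert(std::is_trivial<PointPlain>::value &&
              std::is_standard_layout<PointPlain>::value,
              "PointPlain is expected to be POD");
static_assert(!std::is_trivial<PointWithCtor>::value,
              "a user-provided constructor makes the type non-trivial");
```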
Alexander Alekhin
acce95f446
backport fixes for static analyzer warnings
...
Commits:
- 09837928d9
- 10fb88d027
Excluded changes with std::atomic (C++98 requirement)
2018-09-04 16:49:42 +03:00
Alexander Alekhin
a0f86479e0
core: wrap custom types via _RawArray (raw() call)
...
- support passing of `std::vector<KeyPoint>` via InputArray
2018-09-03 18:41:48 +00:00
Vitaly Tuzov
0f2b535fcc
Bit-exact GaussianBlur reworked to use wide intrinsics (#12073)
...
* Bit-exact GaussianBlur reworked to use wide intrinsics
* Added v_mul_hi universal intrinsic
* Removed custom SSE2 branch from bit-exact GaussianBlur
* Removed loop unrolling for gaussianBlur horizontal smoothing
2018-08-31 17:04:59 +03:00
Alexander Alekhin
e86287d8ae
cleanup: IPP Async (IPP_A)
...
except header file with conversion routines (will be removed in OpenCV 4.0)
2018-08-30 18:53:07 +03:00
Alexander Alekhin
d13db35f31
Merge tag '3.4.3'
2018-08-28 16:03:08 +03:00
Alexander Alekhin
b38c50b3d0
OpenCV 3.4.3
2018-08-28 15:58:21 +03:00
Alexander Alekhin
2c42361ecd
build: fix build with defined CV_STATIC_ANALYSIS
2018-08-22 14:19:21 +03:00
Alexander Alekhin
6acabd1fd8
Merge pull request #12256 from alalek:core_intrin_fp16_fix
2018-08-21 12:47:08 +00:00
Alexander Alekhin
5ac9a2a7d0
Merge pull request #12219 from alalek:fix_assert_messages
2018-08-21 12:46:35 +00:00
Alexander Alekhin
67d46dfc6c
core(intrin): restrict FP16 operations
...
Intrinsics must be effective, so don't declare FP16 type/operations if there is no native support.
- CV_FP16: supports load/store into/from float32
- CV_SIMD_FP16: declares FP16 types and native FP16 operations
2018-08-20 19:24:33 +03:00
Alexander Alekhin
31fef14d76
Merge pull request #12136 from sturkmen72:update_documentation
2018-08-17 14:02:20 +00:00
Suleyman TURKMEN
c61bc3a0cb
Update documentation and samples
2018-08-17 14:21:29 +03:00
Alexander Alekhin
d2e08a524e
core: repair CV_Assert() messages
...
Multi-argument CV_Assert() is accessible via CV_Assert_N() (with malformed messages).
2018-08-15 17:43:10 +03:00
Alexander Alekhin
c1df9ad456
OpenCV version++
...
OpenCV 3.4.3
2018-08-14 14:10:37 +03:00
Alexander Alekhin
a56b221559
core: cv::Range() ostream write operator
...
remove from DNN module headers
2018-08-07 20:03:21 +03:00
Alexander Alekhin
5c3880d302
core(intrin): avoid symbols duplication from SIMD128/256 cases
...
All vx_call() must be wrapped into own simd128/simd256/simd512 namespace
```
namespace CV__SIMD_NAMESPACE {
... vx_call declaration is here ...
}
```
2018-08-06 19:29:46 +00:00
Vadim Pisarevsky
23022f3ffb
Merge pull request #12121 from maver1:amatyuko/sse2_convert_with_saturation_fix
2018-08-06 14:26:37 +00:00
Alexander Alekhin
b4cea8d6d1
Merge pull request #12120 from alalek:core_test_intrin_dispatched
2018-08-03 17:07:17 +00:00
amatyuko
3ea2586a5a
Fix SSE2 intrinsics saturation arithmetic in 32s->16u packed conversion
...
For large negative values (less than -INT_MAX+32767), the sign of the number was lost due to overflow, which led to
incorrect saturation to the MAX value instead of zero.
The issue does not reproduce with CV_ENABLED_INTRINSICS=OFF
2018-08-01 16:04:08 +03:00
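For reference, the behaviour that commit 3ea2586a5a restores can be stated with a scalar sketch. The helper name `saturate_s32_to_u16` is hypothetical and is not an OpenCV function; it only illustrates the expected result of a saturating 32s->16u conversion, where every negative input, however large in magnitude, must map to zero rather than to the maximum value.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical scalar reference (not OpenCV code): saturate a signed
// 32-bit value into the unsigned 16-bit range [0, 65535].
constexpr std::uint16_t saturate_s32_to_u16(std::int32_t v)
{
    return static_cast<std::uint16_t>(v < 0 ? 0 : (v > 65535 ? 65535 : v));
}

// Large negative inputs must saturate to zero, never to the maximum.
static_assert(saturate_s32_to_u16(INT_MIN) == 0, "negative saturates to 0");
static_assert(saturate_s32_to_u16(-INT_MAX + 32766) == 0, "negative saturates to 0");
static_assert(saturate_s32_to_u16(70000) == 65535, "overflow saturates to max");
```

A vectorized implementation can be compared element by element against a scalar reference of this kind.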
Alexander Alekhin
3f302cabb8
core(test): intrinsic tests for all dispatched CPU optimizations
...
- tests for both SIMD128 / SIMD256
- different dispatched + baseline(SIMD128) intrinsics
2018-08-01 13:50:42 +03:00
luz.paz
2003eb1b9b
Misc. typos
...
Found via `codespell -q 3 -I ../opencv-whitelist.txt --skip="./3rdparty"`
2018-07-31 18:44:23 +03:00
Alexander Alekhin
dbf3362c4d
Merge pull request #12056 from seiko2plus:coreExpandTests
2018-07-30 16:23:11 +00:00
Sayed Adel
6499263b41
core:test Expand hal_intrin tests to support SIMD256
2018-07-30 08:50:50 +02:00
Sayed Adel
47202b3349
core:avx2 fix unaligned store for v_store_interleave v_uint32x8-3ch
2018-07-29 18:22:46 +02:00
Maksim Shabunin
597db69151
ts: test case list is printed after cmd line parsing, refactored
2018-07-26 16:43:43 +03:00
Vadim Pisarevsky
43820d89b4
further improvements in split & merge; started using non-temporal store instructions (#12063)
...
* 1. changed static const __m128/256 to const __m128/256 to avoid weird instructions and calls inserted by compiler.
2. added universal intrinsics that wrap MOVNTPS and other such (non-temporal or "no cache" store) instructions. v_store_interleave() and v_store() got respective flags/overloaded variants
3. rewrote split & merge to use the "no cache" store instructions. It resulted in dramatic performance improvement when processing big arrays
* hopefully, fixed some test failures where 4-channel v_store_interleave() is used
* added missing implementation of the new universal intrinsics (v_store_aligned_nocache() etc.)
* fixed silly typo in the new intrinsics in intrin_vsx.hpp
* still trying to fix VSX compiler errors
2018-07-26 12:04:28 +03:00
Vadim Pisarevsky
9c7040802c
converted split() & merge() to wide univ intrinsics (#12044)
...
* fixed/updated v_load_deinterleave and v_store_interleave intrinsics; modified split() and merge() functions to use those intrinsics
* fixed a few compile errors and bug in v_load_deinterleave(ptr, v_uint32x4& a, v_uint32x4& b)
* fixed a few more compile errors
2018-07-24 17:27:56 +03:00
Maksim Shabunin
e0603bb45f
Fixed several issues found by static analysis tools
2018-07-23 17:22:47 +03:00
Alexander Alekhin
767b31cfbf
Merge pull request #12029 from tomoaki0705:fixBuildVS2013BinarySuffix
2018-07-20 11:27:27 +00:00
Tomoaki Teshima
18abe54497
fix build error on Visual Studio 2013
...
* replace binary literal prefix with hexadecimal literal prefix
2018-07-20 18:09:17 +09:00
Maksim Shabunin
fe806878be
Enable debug assertions for static analysis builds
2018-07-18 15:53:16 +03:00
Alexander Alekhin
7cc84ce8ab
Merge pull request #11984 from mshabunin:fix-static-1
2018-07-17 15:40:48 +00:00
Alexander Alekhin
0a41b3df45
Merge pull request #11990 from alalek:clone_nodiscard_attribute
2018-07-17 15:34:08 +00:00
Maksim Shabunin
1da46fe6fb
Fixed issues found by static analysis (mostly DBZ)
2018-07-17 16:14:54 +03:00
Vadim Pisarevsky
f058b5fb1e
Wide univ intrinsics (#11953)
...
* core:OE-27 prepare universal intrinsics to expand (#11022 )
* core: Add universal intrinsics for AVX2
* updated implementation of wide univ. intrinsics; converted several OpenCV HAL functions: sqrt, invsqrt, magnitude, phase, exp to the wide universal intrinsics.
* converted log to universal intrinsics; cleaned up the code a bit; added v_lut_deinterleave intrinsics.
* fixed multiple compile errors
* fixed many more compile errors and hopefully some test failures
* fixed some more compile errors
* temporarily disabled IPP to debug exp & log; hopefully fixed Doxygen complaints
* fixed some more compile errors
* fixed v_store(short*, v_float16&) signatures
* trying to fix the test failures on Linux
* fixed some issues found by alalek
* restored IPP optimization after the patch with AVX wide intrinsics has been properly tested
2018-07-16 18:57:24 +03:00
Alexander Alekhin
481829a81b
Merge pull request #11957 from alalek:issue_11956
2018-07-16 15:54:34 +00:00
Alexander Alekhin
b0ee5d9023
core: CV_NODISCARD macro with the semantics of the [[nodiscard]] attribute
...
[[nodiscard]] is defined in C++17.
There is a fallback alias for modern GCC / Clang compilers.
2018-07-16 18:03:32 +03:00
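The pattern described in commit b0ee5d9023 can be sketched as follows. This is not OpenCV's actual definition: `MY_NODISCARD` and `Buffer` are hypothetical names, and the use of `__attribute__((warn_unused_result))` as the GCC / Clang fallback is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical macro for illustration; the real CV_NODISCARD may differ.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(nodiscard) && __cplusplus >= 201703L
#    define MY_NODISCARD [[nodiscard]]          // standard C++17 attribute
#  endif
#endif
#if !defined(MY_NODISCARD) && (defined(__GNUC__) || defined(__clang__))
#  define MY_NODISCARD __attribute__((warn_unused_result))  // assumed fallback
#endif
#ifndef MY_NODISCARD
#  define MY_NODISCARD                          // no-op on other compilers
#endif

// Hypothetical type: discarding the result of clone() is almost always a bug,
// so the function is marked to make the compiler warn about it.
struct Buffer {
    int size;
    MY_NODISCARD Buffer clone() const { return Buffer{size}; }
};
```

With such a macro, a statement like `b.clone();` whose result is ignored is expected to produce a compiler warning where the attribute is supported, while `Buffer c = b.clone();` compiles silently.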
Alexander Alekhin
a5e8ae2183
Merge pull request #11969 from alalek:core_Matx_inv_solve_templates
2018-07-16 14:18:12 +00:00
Alexander Alekhin
50751ae6ff
Merge pull request #11967 from catree:add_tutorial_ml_java_python
2018-07-16 09:24:32 +00:00
Alexander Alekhin
3c74fde349
core: eliminate 'if' logic from Matx::inv()/solve()
...
- 'if' logic is moved into templates.
- removed unnecessary cv::Mat objects creation.
- fixed inv() test (invA * A == eye)
- added more Matx tests to cover all defined template specializations
2018-07-13 20:09:01 +03:00
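Commit 3c74fde349 moves size-dependent branching into templates. A minimal sketch of that general technique, using a determinant rather than inv()/solve() and entirely hypothetical names (`SmallMat`, `DetImpl`, `det`), could look like this. It is not the code from the commit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size square matrix stored in row-major order.
template <std::size_t N>
struct SmallMat {
    double v[N * N];
};

// Primary template is left undefined: only specialized sizes compile,
// so no runtime 'if (N == 2) ... else if (N == 3) ...' is needed.
template <std::size_t N>
struct DetImpl;

template <>
struct DetImpl<2> {
    static double run(const SmallMat<2>& m) {
        return m.v[0] * m.v[3] - m.v[1] * m.v[2];
    }
};

template <>
struct DetImpl<3> {
    static double run(const SmallMat<3>& m) {
        const double* a = m.v;
        return a[0] * (a[4] * a[8] - a[5] * a[7])
             - a[1] * (a[3] * a[8] - a[5] * a[6])
             + a[2] * (a[3] * a[7] - a[4] * a[6]);
    }
};

// The size is deduced at compile time and selects the specialization.
template <std::size_t N>
double det(const SmallMat<N>& m) {
    return DetImpl<N>::run(m);
}
```

Because the choice is made during template instantiation, each instantiation contains only the code path for its own size.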
catree
4dc7e617a4
Add overloaded cv::PCACompute() that also returns the eigenvalues. Useful for Java and Python OpenCV where PCA is not available.
2018-07-13 15:05:54 +02:00
Alexander Alekhin
33b7028be2
core: use "explicit" for Matx() ctor
2018-07-12 19:50:56 +00:00
Vitaly Tuzov
850a8577b2
Fixed unreachable code warnings for Matx::solve()
2018-07-12 19:19:51 +03:00
Vitaly Tuzov
d0a3686812
Merge pull request #11904 from terfendail/matx_solve_fix
...
Fixed Matx::solve function for non-square matrices (#11904)
2018-07-11 22:00:57 +03:00
catree
d7bd662c95
Add a note in the documentation about Mat::ones and Mat::eye. With multi-channel types (e.g. CV_8UC3), only the first channel is initialized.
2018-07-10 15:35:46 +02:00
Alexander Alekhin
7ba66a1682
Merge pull request #11703 from alalek:c_api_calib3d_chessboard_detector
2018-07-09 15:37:26 +00:00
Alexander Alekhin
06fc77610b
core(hal): eliminate build warnings
2018-07-06 13:00:41 +03:00
Alexander Alekhin
c7fc563dc0
calib3d: chessboard detector - replace OpenCV C API
2018-07-05 13:09:10 +03:00
Alexander Alekhin
b09a4a98d4
opencv: Use cv::AutoBuffer<>::data()
2018-07-04 19:11:29 +03:00
Alexander Alekhin
135ea264ef
core: align cv::AutoBuffer API with std::vector/std::array
...
- added .data() methods
- added operator[] (int i)
- extend checks support to generic and debug-only cases
- deprecate existing operator* ()
2018-07-04 19:10:38 +03:00
Alexander Alekhin
bd8c8e720e
Merge tag '3.4.2'
2018-07-04 14:08:11 +03:00
Alexander Alekhin
d69a327d6d
OpenCV version++
...
OpenCV 3.4.2
2018-06-10 10:20:38 +03:00
Vadim Pisarevsky
ccbc0b91ea
Merge pull request #11654 from alalek:issue_11648
2018-06-04 10:22:44 +00:00
Paul Jurczak
bd7bad02a0
convertFp16 documentation edit (2)
...
If this seems too wordy, take into account a new user who tries to find out the extent of FP16 support in OpenCV.
2018-06-01 04:15:21 -06:00
Alexander Alekhin
03edddba47
core: drop unnecessary duplicate check
2018-06-01 12:31:48 +03:00
Vadim Pisarevsky
7d19bd6c19
Merge pull request #11634 from vpisarev:empty_mat_with_types_2
...
fixes handling of empty matrices in some functions (#11634)
* a part of PR #11416 by Yuki Takehara
* moved the empty mat check in Mat::copyTo()
* fixed some test failures
2018-05-31 16:36:39 +00:00