Commit Graph

136 Commits

Author SHA1 Message Date
Alexander Alekhin
a41394c885 core(parallel): fix plugins handling if no filesystem available 2021-03-18 23:05:12 +00:00
Alexander Alekhin
65eb946756 core: rework code locality
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat:eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above allow to reduce FFmpeg wrapper size for ~1.5Mb (initial size of OpenCV part is about 3Mb)
2021-03-02 11:27:58 +00:00
Alexander Alekhin
3dd55d284d core(libva): use dynamic loader 2021-02-19 10:32:59 +00:00
Alexander Alekhin
cc73c36e32 core(parallel): plugins support 2021-02-15 17:07:36 +00:00
Alexander Alekhin
37c12db366 Merge pull request #19365 from alalek:parallel_api 2021-01-27 18:12:15 +00:00
Alexander Alekhin
b73bf03bfc core: parallel backends API
- allow to replace parallel_for() backend
2021-01-27 14:15:33 +00:00
Alexander Alekhin
cd68cc1f46 Merge pull request #19195 from diablodale:win32AlignAlloc 2020-12-23 17:33:58 +00:00
Dale Phurrough
109255a730
add windows native aligned malloc + unit test case
* implements https://github.com/opencv/opencv/issues/19147
* CAUTION: this PR will only functions safely in the
  4+ branches that already include PR 19029
* CAUTION: this PR requires thread-safe startup of the alloc.cpp
  translation unit as implemented in PR 19029
2020-12-23 14:59:28 +01:00
Alexander Alekhin
de385009ae Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-12-09 18:09:00 +00:00
Alexander Alekhin
26e8048a0a core: update handling of allocator stats type
- don't use OPENCV_ALLOCATOR_STATS_COUNTER_TYPE definition in non C++11 builds
- don't use with MinGW
2020-12-05 20:54:47 +00:00
Odianosen Ejale
862fc06b6f Fixed and updated OpenCL-VA interoperability 2020-09-25 16:11:50 +03:00
Giles Payne
02385472b6
Merge pull request #17165 from komakai:objc-binding
Objc binding

* Initial work on Objective-C wrapper

* Objective-C generator script; update manually generated wrappers

* Add Mat tests

* Core Tests

* Imgproc wrapper generation and tests

* Fixes for Imgcodecs wrapper

* Miscellaneous fixes. Swift build support

* Objective-C wrapper build/install

* Add Swift wrappers for videoio/objdetect/feature2d

* Framework build;iOS support

* Fix toArray functions;Use enum types whenever possible

* Use enum types where possible;prepare test build

* Update test

* Add test runner scripts for iOS and macOS

* Add test scripts and samples

* Build fixes

* Fix build (cmake 3.17.x compatibility)

* Fix warnings

* Fix enum name conflicting handling

* Add support for document generation with Jazzy

* Swift/Native fast accessor functions

* Add Objective-C wrapper for calib3d, dnn, ml, photo and video modules

* Remove IntOut/FloatOut/DoubleOut classes

* Fix iOS default test platform value

* Fix samples

* Revert default framework name to opencv2

* Add converter util functions

* Fix failing test

* Fix whitespace

* Add handling for deprecated methods;fix warnings;define __OPENCV_BUILD

* Suppress cmake warnings

* Reduce severity of "jazzy not found" log message

* Fix incorrect #include of compatibility header in ios.h

* Use explicit returns in subscript/get implementation

* Reduce minimum required cmake version to 3.15 for Objective-C/Swift binding
2020-06-08 18:32:53 +00:00
Alexander Alekhin
ca23c0e630 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-03-17 13:23:33 +03:00
Alexander Alekhin
4e56c1326f core: adjust type of allocator_stats counter, allow to disable 2020-03-11 20:12:29 +03:00
Alexander Alekhin
ba7b0f4c54 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-15 11:23:46 +00:00
Alexander Alekhin
a45928045a
Merge pull request #16150 from alalek:cmake_avoid_deprecated_link_private
* cmake: avoid deprecated LINK_PRIVATE/LINK_PUBLIC

see CMP0023 (CMake 2.8.12+)

* cmake: fix 3rdparty list

- don't include OpenCV modules
2019-12-13 17:52:40 +03:00
Alexander Alekhin
b6a58818bb Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-11-11 20:25:42 +00:00
Alexander Alekhin
657c17bb8c cmake: fix ITT define condition 2019-11-01 15:07:49 +03:00
Alexander Alekhin
65573784c4 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-10-09 19:46:18 +00:00
Sayed Adel
f2fe6f40c2 Merge pull request #15510 from seiko2plus:issue15506
* core: rework and optimize SIMD implementation of dotProd

  - add new universal intrinsics v_dotprod[int32], v_dotprod_expand[u&int8, u&int16, int32], v_cvt_f64(int64)
  - add a boolean param for all v_dotprod&_expand intrinsics that change the behavior of addition order between
    pairs in some platforms in order to reach the maximum optimization when the sum among all lanes is what only matters
  - fix clang build on ppc64le
  - support wide universal intrinsics for dotProd_32s
  - remove raw SIMD and activate universal intrinsics for dotProd_8
  - implement SIMD optimization for dotProd_s16&u16
  - extend performance test data types of dotprod
  - fix GCC VSX workaround of vec_mule and vec_mulo (in little-endian it must be swapped)
  - optimize v_mul_expand(int32) on VSX

* core: remove boolean param from v_dotprod&_expand and implement v_dotprod_fast&v_dotprod_expand_fast

  this changes made depend on "terfendail" review
2019-10-07 22:01:35 +03:00
Alexander Alekhin
19a4b51371 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-08-16 18:48:08 +03:00
Alexander Alekhin
5ef548a985 cmake: update initialization 2019-08-08 15:23:16 +03:00
Alexander Alekhin
f3de2b4be7 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-06-05 19:11:52 +03:00
Vitaly Tuzov
3b015dfc7d Merge pull request #14210 from terfendail:wui_512
AVX512 wide universal intrinsics (#14210)

* Added implementation of 512-bit wide universal intrinsics(WIP)

* Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP)

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks

* Added implementation of 512-bit wide universal intrinsics(WIP): build fixes

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16

* Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros

* Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition

* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics

* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics

* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask()

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces

* Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines

* Added implementation of 512-bit wide universal intrinsics(WIP):Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable.

* Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask.

* Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512()

* Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build

* Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.
2019-06-03 18:05:35 +03:00
Alexander Alekhin
142a524d2f cmake: fix CUDA world build 2019-03-31 19:56:30 +00:00
Alexander Alekhin
c7c987843c cmake: emit error if CUDA is enabled without opencv_contrib 2019-03-28 16:51:11 +03:00
Alexander Alekhin
8c25a8eb7b Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-03-22 19:31:31 +03:00
Sayed Adel
f41359688b core:vsx Add support for VSX3 half precision conversions 2019-03-20 10:19:42 +02:00
Alexander Alekhin
c3cf35ab63 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-26 17:34:42 +03:00
Alexander Alekhin
fd49ee5f39 core: dispatch merge.cpp 2019-02-23 15:42:26 +00:00
Alexander Alekhin
91d152e2c2 core: dispatch split.cpp 2019-02-22 09:54:31 +00:00
Alexander Alekhin
8bde6aea4b Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-19 19:49:13 +00:00
Alexander Alekhin
dc84cf9914 core: dispatch mean.cpp 2019-02-19 16:58:32 +03:00
Alexander Alekhin
cd66f6e3db core: dispatch matmul
- gemm: keep baseline only (lapack is 10x+ faster, lets reduce binary size)
- transform / distTransform
- scaleAdd (32f/64f only)
- Mahalanobis: keep baseline only (no perf tests)
- mulTransposed: keep baseline only (no perf tests)
- dot
2019-02-18 14:36:46 +03:00
Alexander Alekhin
e3633ec4a2 core: dispatch count_non_zero 2019-02-14 13:16:20 +03:00
Alexander Alekhin
b40a7ffbe4 core: dispatch sum 2019-02-13 18:17:38 +03:00
Alexander Alekhin
dfef04b325 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-12 17:54:40 +03:00
Alexander Alekhin
d32d576d6d core: dispatch convert_scale 2019-02-08 18:32:10 +03:00
Alexander Alekhin
39b90ae9fb core: dispatch convert 2019-02-08 18:32:10 +03:00
Alexander Alekhin
7fa7fa0226 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-21 08:33:39 +00:00
Sayed Adel
474a0dac49 core: several improves and fixes on ppc64le infrastructure
- add infrastructure support for Power9/VSX3
  - fix missing VSX flags on GCC4.9 and CLANG4(#13210, #13222)
  - fix disable VSX optimzation on GCC by using flag ENABLE_VSX
  - flag ENABLE_VSX is deprecated now, use CPU_BASELINE, CPU_DISPATCH instead
  - add VSX3 to arithmetic dispatchable flags
2018-11-20 15:28:46 +00:00
Alexander Alekhin
22dbcf98c5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-17 14:17:35 +00:00
Alexander Alekhin
2fa9bd221d core: add utils::findDataFile() / samples::findFile() 2018-11-16 00:25:06 +00:00
Alexander Alekhin
1913482cf5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-10 20:50:26 +00:00
Sayed Adel
93ffebc273 core: reimplement SIMD arithmetic, logic and comparison operations into wide universal intrinsics
- initialize arithmetic dispatcher
  - add new universal intrinsic v_absdiffs
  - add new universal intrinsic v_pack_b
  - add accumulate version of universal intrinsic v_round
  - fix sse/avx2:uint8 multiplication overflow
  - reimplement arithmetic, logic and comparison operations into wide universal intrinsics
    with full support for all types
  - reimplement IPP arithmetic, logic and comparison operations in a sperate file arithm_ipp.hpp
  - avoid scalar multiplication if scaling factor eq 1 and use integer multiplication
  - move C arithmetic operations to precomp.hpp and delete [arithm_simd|arithm_core].hpp
  - add compatibility with new opencv4 divide policy
2018-10-30 12:48:31 +02:00
Alexander Alekhin
8f6695acc7 CUDA: drop OPENCV_TRAITS_ENABLE_DEPRECATED requirement 2018-09-29 02:47:59 +09:00
Alexander Alekhin
00cbb894ec CUDA: drop OPENCV_TRAITS_ENABLE_DEPRECATED requirement 2018-09-03 18:41:48 +00:00
Jakub Golinowski
9f1218b00b Merge pull request #11897 from Jakub-Golinowski:hpx_backend
* Add HPX backend for OpenCV implementation
Adds hpx backend for cv::parallel_for_() calls respecting the nstripes chunking parameter. C++ code for the backend is added to modules/core/parallel.cpp. Also, the necessary changes to cmake files are introduced.
Backend can operate in 2 versions (selectable by cmake build option WITH_HPX_STARTSTOP): hpx (runtime always on) and hpx_startstop (start and stop the backend for each cv::parallel_for_() call)

* WIP: Conditionally include hpx_main.hpp to tests in core module
Header hpx_main.hpp is included to both core/perf/perf_main.cpp and core/test/test_main.cpp.
The changes to cmake files for linking hpx library to above mentioned test executalbles are proposed but have issues.

* Add coditional iclusion of hpx_main.hpp to cpp cpu modules

* Remove start/stop version of hpx backend
2018-08-31 16:23:26 +03:00
Maksim Shabunin
f84eb3dde6 Fixed core headers installation in world builds 2018-08-16 17:16:02 +03:00
Alexander Alekhin
3f302cabb8 core(test): intrinsic tests for all dispatched CPU optimizations
- tests for both SIMD128 / SIMD256
- different dispatched + baseline(SIMD128) intrinsics
2018-08-01 13:50:42 +03:00