Commit Graph

141 Commits

Author SHA1 Message Date
Alexander Alekhin
d24befa0bc Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-12-11 15:18:57 +00:00
Alexander Alekhin
65392d5e6b cmake: fix OPENGL_LIBRARIES handling 2021-12-07 12:12:42 +00:00
Francesco Petrogalli
d29c7e7871
Merge pull request #20392 from fpetrogalli:aarch64-semihosting
AArch64 semihosting

* [ts] Disable filesystem support in the TS module.

Because of this change, all the tests loading data will file, but tat
least the core module can be tested with the following line:

    opencv_test_core --gtest_filter=-"*Core_InputOutput*:*Core_globbing.accuracy*"

* [aarch64] Build OpenCV for AArch64 semihosting.

This patch provide a toolchain file that allows to build the library
for semihosting applications [1]. Minimal changes have been applied to
the code to be able to compile with a baremetal toolchain.

[1] https://developer.arm.com/documentation/100863/latest

The option `CV_SEMIHOSTING` is used to guard the bits in the code that
are specific to the target.

To build the code:

    cmake ../opencv/ \
        -DCMAKE_TOOLCHAIN_FILE=../opencv/platforms/semihosting/aarch64-semihosting.toolchain.cmake \
        -DSEMIHOSTING_TOOLCHAIN_PATH=/path/to/baremetal-toolchain/bin/ \
        -DBUILD_EXAMPLES=ON -GNinja

A barematel toolchain for targeting aarch64 semihosting can be found
at [2], under `aarch64-none-elf`.

[2] https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-a/downloads

The folder `samples/semihosting` provides two example semihosting
applications.

The two binaries can be executed on the host platform with:

    qemu-aarch64 ./bin/example_semihosting_histogram
    qemu-aarch64 ./bin/example_semihosting_norm

Similarly, the test and perf executables of the modules can be run
with:

    qemu-aarch64 ./bin/opecv_[test|perf]_<module>

Notice that filesystem support is disabled by the toolchain file,
hence some of the test that depend on filesystem support will fail.

* [semihosting] Remove blank like at the end of file. [NFC]

The spurious blankline was reported by
https://pullrequest.opencv.org/buildbot/builders/precommit_docs/builds/31158.

* [semihosting] Make the raw pixel file generation OS independent.

Use the facilities provided by Cmake to generate the header file
instead of a shell script, so that the build doesn't fail on systems
that do not have a unix shell.

* [semihosting] Rename variable for semihosting compilation.

* [semihosting] Move the cmake configuration to a variable file.

* [semihosting] Make the guard macro private for the core module.

* [semihosting] Remove space. [NFC]

* [semihosting] Improve comment with information about semihosting. [NFC]

* [semihosting] Update license statement on top of sourvce file. [NFC]

* [semihosting] Replace BM_SUFFIX with SEMIHOSTING_SUFFIX. [NFC]

* [semihosting] Remove double space. [NFC]

* [semihosting] Add some text output to the sample applications.

* [semihosting] Remove duplicate entry in cmake configuration. [NFCI]

* [semihosting] Replace `long` with `int` in sample apps. [NFCI]

* [semihosting] Use `configure_file` to create the random pixels. [NFCI]

* [semihosting][bugfix] Fix name of cmakedefine variable.

* [semihosting][samples] Use CV_8UC1 for grayscale images. [NFCI]

* [semihosting] Add readme file.

* [semihosting] Remove blank like at the end of README. [NFC]

This fixes the failure at
https://pullrequest.opencv.org/buildbot/builders/precommit_docs/builds/31272.
2021-07-21 18:46:05 +03:00
Francesco Petrogalli
b928ebdd53
Merge pull request #19985 from fpetrogalli:disable_threads
* [build][option] Introduce `OPENCV_DISABLE_THREAD_SUPPORT` option.

The option forces the library to build without thread support.

* update handling of OPENCV_DISABLE_THREAD_SUPPORT

- reduce amount of #if conditions

* [to squash] cmake: apply mode vars in toolchains too

Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2021-07-08 20:21:21 +00:00
Alexander Alekhin
a41394c885 core(parallel): fix plugins handling if no filesystem available 2021-03-18 23:05:12 +00:00
Alexander Alekhin
cbfd38bd41 core: rework code locality
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat:eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above allow to reduce FFmpeg wrapper size for ~1.5Mb (initial size of OpenCV part is about 3Mb)

backport is done to improve merge experience (less conflicts)
backport of commit: 65eb946756
2021-03-02 23:24:28 +00:00
Alexander Alekhin
65eb946756 core: rework code locality
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat:eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above allow to reduce FFmpeg wrapper size for ~1.5Mb (initial size of OpenCV part is about 3Mb)
2021-03-02 11:27:58 +00:00
Alexander Alekhin
3dd55d284d core(libva): use dynamic loader 2021-02-19 10:32:59 +00:00
Alexander Alekhin
cc73c36e32 core(parallel): plugins support 2021-02-15 17:07:36 +00:00
Alexander Alekhin
37c12db366 Merge pull request #19365 from alalek:parallel_api 2021-01-27 18:12:15 +00:00
Alexander Alekhin
b73bf03bfc core: parallel backends API
- allow to replace parallel_for() backend
2021-01-27 14:15:33 +00:00
Alexander Alekhin
cd68cc1f46 Merge pull request #19195 from diablodale:win32AlignAlloc 2020-12-23 17:33:58 +00:00
Dale Phurrough
109255a730
add windows native aligned malloc + unit test case
* implements https://github.com/opencv/opencv/issues/19147
* CAUTION: this PR will only functions safely in the
  4+ branches that already include PR 19029
* CAUTION: this PR requires thread-safe startup of the alloc.cpp
  translation unit as implemented in PR 19029
2020-12-23 14:59:28 +01:00
Alexander Alekhin
de385009ae Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-12-09 18:09:00 +00:00
Alexander Alekhin
26e8048a0a core: update handling of allocator stats type
- don't use OPENCV_ALLOCATOR_STATS_COUNTER_TYPE definition in non C++11 builds
- don't use with MinGW
2020-12-05 20:54:47 +00:00
Odianosen Ejale
862fc06b6f Fixed and updated OpenCL-VA interoperability 2020-09-25 16:11:50 +03:00
Giles Payne
02385472b6
Merge pull request #17165 from komakai:objc-binding
Objc binding

* Initial work on Objective-C wrapper

* Objective-C generator script; update manually generated wrappers

* Add Mat tests

* Core Tests

* Imgproc wrapper generation and tests

* Fixes for Imgcodecs wrapper

* Miscellaneous fixes. Swift build support

* Objective-C wrapper build/install

* Add Swift wrappers for videoio/objdetect/feature2d

* Framework build;iOS support

* Fix toArray functions;Use enum types whenever possible

* Use enum types where possible;prepare test build

* Update test

* Add test runner scripts for iOS and macOS

* Add test scripts and samples

* Build fixes

* Fix build (cmake 3.17.x compatibility)

* Fix warnings

* Fix enum name conflicting handling

* Add support for document generation with Jazzy

* Swift/Native fast accessor functions

* Add Objective-C wrapper for calib3d, dnn, ml, photo and video modules

* Remove IntOut/FloatOut/DoubleOut classes

* Fix iOS default test platform value

* Fix samples

* Revert default framework name to opencv2

* Add converter util functions

* Fix failing test

* Fix whitespace

* Add handling for deprecated methods;fix warnings;define __OPENCV_BUILD

* Suppress cmake warnings

* Reduce severity of "jazzy not found" log message

* Fix incorrect #include of compatibility header in ios.h

* Use explicit returns in subscript/get implementation

* Reduce minimum required cmake version to 3.15 for Objective-C/Swift binding
2020-06-08 18:32:53 +00:00
Alexander Alekhin
ca23c0e630 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2020-03-17 13:23:33 +03:00
Alexander Alekhin
4e56c1326f core: adjust type of allocator_stats counter, allow to disable 2020-03-11 20:12:29 +03:00
Alexander Alekhin
ba7b0f4c54 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-12-15 11:23:46 +00:00
Alexander Alekhin
a45928045a
Merge pull request #16150 from alalek:cmake_avoid_deprecated_link_private
* cmake: avoid deprecated LINK_PRIVATE/LINK_PUBLIC

see CMP0023 (CMake 2.8.12+)

* cmake: fix 3rdparty list

- don't include OpenCV modules
2019-12-13 17:52:40 +03:00
Alexander Alekhin
b6a58818bb Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-11-11 20:25:42 +00:00
Alexander Alekhin
657c17bb8c cmake: fix ITT define condition 2019-11-01 15:07:49 +03:00
Alexander Alekhin
65573784c4 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-10-09 19:46:18 +00:00
Sayed Adel
f2fe6f40c2 Merge pull request #15510 from seiko2plus:issue15506
* core: rework and optimize SIMD implementation of dotProd

  - add new universal intrinsics v_dotprod[int32], v_dotprod_expand[u&int8, u&int16, int32], v_cvt_f64(int64)
  - add a boolean param for all v_dotprod&_expand intrinsics that change the behavior of addition order between
    pairs in some platforms in order to reach the maximum optimization when the sum among all lanes is what only matters
  - fix clang build on ppc64le
  - support wide universal intrinsics for dotProd_32s
  - remove raw SIMD and activate universal intrinsics for dotProd_8
  - implement SIMD optimization for dotProd_s16&u16
  - extend performance test data types of dotprod
  - fix GCC VSX workaround of vec_mule and vec_mulo (in little-endian it must be swapped)
  - optimize v_mul_expand(int32) on VSX

* core: remove boolean param from v_dotprod&_expand and implement v_dotprod_fast&v_dotprod_expand_fast

  this changes made depend on "terfendail" review
2019-10-07 22:01:35 +03:00
Alexander Alekhin
19a4b51371 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-08-16 18:48:08 +03:00
Alexander Alekhin
5ef548a985 cmake: update initialization 2019-08-08 15:23:16 +03:00
Alexander Alekhin
f3de2b4be7 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-06-05 19:11:52 +03:00
Vitaly Tuzov
3b015dfc7d Merge pull request #14210 from terfendail:wui_512
AVX512 wide universal intrinsics (#14210)

* Added implementation of 512-bit wide universal intrinsics(WIP)

* Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP)

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave

* Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks

* Added implementation of 512-bit wide universal intrinsics(WIP): build fixes

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16

* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16

* Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros

* Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition

* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics

* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics

* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask()

* Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces

* Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines

* Added implementation of 512-bit wide universal intrinsics(WIP):Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable.

* Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask.

* Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512()

* Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build

* Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.
2019-06-03 18:05:35 +03:00
Alexander Alekhin
142a524d2f cmake: fix CUDA world build 2019-03-31 19:56:30 +00:00
Alexander Alekhin
c7c987843c cmake: emit error if CUDA is enabled without opencv_contrib 2019-03-28 16:51:11 +03:00
Alexander Alekhin
8c25a8eb7b Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-03-22 19:31:31 +03:00
Sayed Adel
f41359688b core:vsx Add support for VSX3 half precision conversions 2019-03-20 10:19:42 +02:00
Alexander Alekhin
c3cf35ab63 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-26 17:34:42 +03:00
Alexander Alekhin
fd49ee5f39 core: dispatch merge.cpp 2019-02-23 15:42:26 +00:00
Alexander Alekhin
91d152e2c2 core: dispatch split.cpp 2019-02-22 09:54:31 +00:00
Alexander Alekhin
8bde6aea4b Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-19 19:49:13 +00:00
Alexander Alekhin
dc84cf9914 core: dispatch mean.cpp 2019-02-19 16:58:32 +03:00
Alexander Alekhin
cd66f6e3db core: dispatch matmul
- gemm: keep baseline only (lapack is 10x+ faster, lets reduce binary size)
- transform / distTransform
- scaleAdd (32f/64f only)
- Mahalanobis: keep baseline only (no perf tests)
- mulTransposed: keep baseline only (no perf tests)
- dot
2019-02-18 14:36:46 +03:00
Alexander Alekhin
e3633ec4a2 core: dispatch count_non_zero 2019-02-14 13:16:20 +03:00
Alexander Alekhin
b40a7ffbe4 core: dispatch sum 2019-02-13 18:17:38 +03:00
Alexander Alekhin
dfef04b325 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2019-02-12 17:54:40 +03:00
Alexander Alekhin
d32d576d6d core: dispatch convert_scale 2019-02-08 18:32:10 +03:00
Alexander Alekhin
39b90ae9fb core: dispatch convert 2019-02-08 18:32:10 +03:00
Alexander Alekhin
7fa7fa0226 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-21 08:33:39 +00:00
Sayed Adel
474a0dac49 core: several improves and fixes on ppc64le infrastructure
- add infrastructure support for Power9/VSX3
  - fix missing VSX flags on GCC4.9 and CLANG4(#13210, #13222)
  - fix disable VSX optimzation on GCC by using flag ENABLE_VSX
  - flag ENABLE_VSX is deprecated now, use CPU_BASELINE, CPU_DISPATCH instead
  - add VSX3 to arithmetic dispatchable flags
2018-11-20 15:28:46 +00:00
Alexander Alekhin
22dbcf98c5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-17 14:17:35 +00:00
Alexander Alekhin
2fa9bd221d core: add utils::findDataFile() / samples::findFile() 2018-11-16 00:25:06 +00:00
Alexander Alekhin
1913482cf5 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2018-11-10 20:50:26 +00:00
Sayed Adel
93ffebc273 core: reimplement SIMD arithmetic, logic and comparison operations into wide universal intrinsics
- initialize arithmetic dispatcher
  - add new universal intrinsic v_absdiffs
  - add new universal intrinsic v_pack_b
  - add accumulate version of universal intrinsic v_round
  - fix sse/avx2:uint8 multiplication overflow
  - reimplement arithmetic, logic and comparison operations into wide universal intrinsics
    with full support for all types
  - reimplement IPP arithmetic, logic and comparison operations in a sperate file arithm_ipp.hpp
  - avoid scalar multiplication if scaling factor eq 1 and use integer multiplication
  - move C arithmetic operations to precomp.hpp and delete [arithm_simd|arithm_core].hpp
  - add compatibility with new opencv4 divide policy
2018-10-30 12:48:31 +02:00