[GSoC] New universal intrinsic backend for RVV
* Add new rvv backend (partially implemented).
* Modify the framework of Universal Intrinsic.
* Add CV_SIMD macro guards to current UI code.
* Use vlanes() instead of nlanes.
* Modify the UI test.
* Enable the new RVV (scalable) backend.
* Remove whitespace.
* Rename and some others modify.
* Update intrin.hpp but still not work on AVX/SSE
* Update conditional compilation macros.
* Use static variable for vlanes.
* Use max_nlanes for array defining.
Add conditional compilation directives to enable uses of std::chrono on supported compilers. Use std::chrono::steady_clock as a source to retrieve current tick count and clock frequency.
Fixesopencv/opencv#6902.
Replaced sprintf with safer snprintf
* Straightforward replacement of sprintf with safer snprintf
* Trickier replacement of sprintf with safer snprintf
Some functions were changed to take another parameter: the size of the buffer, so that they can pass that size on to snprintf.
* Added NEON support in builds for Windows on ARM
* Fixed `HAVE_CPU_NEON_SUPPORT` display broken during compiler test
* Fixed a build error prior to Visual Studio 2022
Thread Sanitizer identified an incorrect implementation of double checked locking.
Replaced it with a static, which therefore can only be created once.
Per intel docs for libva, when vaDeriveImage fails vaCreateImage +
vaPutImage should be tried. This is important as mesa with AMD HW
will always fail because the image is interlaced so a indirect
method must be used to get the surface to/from and image
Fixes https://github.com/opencv/opencv/issues/21536
* Fix wrong MSAN errors.
Because Fortran is called in Lapack, MSAN does not think the memory
has been written even though it is the case.
MSAN does no support well cross-language memory analysis.
* Make a dedicated check.
* Fix compile against lapack-3.10.0
Fix compilation against lapack >= 3.9.1 and 3.10.0 while not breaking older versions
OpenCVFindLAPACK.cmake & CMakeLists.txt: determine OPENCV_USE_LAPACK_PREFIX from LAPACK_VERSION
hal_internal.cpp : Only apply LAPACK_FUNC to functions whose number of inputs depends on LAPACK_FORTRAN_STR_LEN in lapack >= 3.9.1
lapack_check.cpp : remove LAPACK_FUNC which is not OK as function are not used with input parameters (so lapack.h preprocessing of "LAPACK_xxxx(...)" is not applicable with lapack >= 3.9.1
If not removed lapack_check fails so LAPACK is deactivated in build (not want we want)
use OCV_ prefix and don't use Global, instead generate OCV_LAPACK_FUNC depending on CMake Conditions
Remove CONFIG from find_package(LAPACK) and use LAPACK_GLOBAL and LAPACK_NAME to figure out if using netlib's reference LAPACK implementation and how to #define OCV_LAPACK_FUNC(f)
* Fix typos and grammar in comments
1. Code uses PPC_FEATURE_HAS_VSX, but it's not checked similarly to
PPC_FEATURE2_ARCH_3_00 and PPC_FEATURE2_ARCH_3_00 for availability. FreeBSD has
those macros in machine/cpu.h, but I went with the way chosen for
PPC_FEATURE2_ARCH_3_00 and PPC_FEATURE2_ARCH_3_00. Other than that, FreeBSD also
has sys/auxv.h and that's where elf_aux_info() is defined.
2. getauxval() is actually Linux-only, but code checked for __unix__. It won't
work on all UNIX, so change it back to __linux__. Add another code variant
strictly for FreeBSD.
3. Update comment. This commit adds code for FreeBSD, but recently there
appeared support for powerpc64 in OpenBSD.
`PyObject*` to `std::vector<T>` conversion logic:
- If user passed Numpy Array
- If array is planar and T is a primitive type (doesn't require
constructor call) that matches with the element type of array, then
copy element one by one with the respect of the step between array
elements. If compiler is lucky (or brave enough) copy loop can be
vectorized.
For classes that require constructor calls this path is not
possible, because we can't begin an object lifetime without hacks.
- Otherwise fall-back to general case
- Otherwise - execute the general case:
If PyObject* corresponds to Sequence protocol - iterate over the
sequence elements and invoke the appropriate `pyopencv_to` function.
`std::vector<T>` to `PyObject*` conversion logic:
- If `std::vector<T>` is empty - return empty tuple.
- If `T` has a corresponding `Mat` `DataType` than return
Numpy array instance of the matching `dtype` e.g.
`std::vector<cv::Rect>` is returned as `np.ndarray` of shape `Nx4` and
`dtype=int`.
This branch helps to optimize further evaluations in user code.
- Otherwise - execute the general case:
Construct a tuple of length N = `std::vector::size` and insert
elements one by one.
Unnecessary functions were removed and code was rearranged to allow
compiler select the appropriate conversion function specialization.
AArch64 semihosting
* [ts] Disable filesystem support in the TS module.
Because of this change, all the tests loading data will file, but tat
least the core module can be tested with the following line:
opencv_test_core --gtest_filter=-"*Core_InputOutput*:*Core_globbing.accuracy*"
* [aarch64] Build OpenCV for AArch64 semihosting.
This patch provide a toolchain file that allows to build the library
for semihosting applications [1]. Minimal changes have been applied to
the code to be able to compile with a baremetal toolchain.
[1] https://developer.arm.com/documentation/100863/latest
The option `CV_SEMIHOSTING` is used to guard the bits in the code that
are specific to the target.
To build the code:
cmake ../opencv/ \
-DCMAKE_TOOLCHAIN_FILE=../opencv/platforms/semihosting/aarch64-semihosting.toolchain.cmake \
-DSEMIHOSTING_TOOLCHAIN_PATH=/path/to/baremetal-toolchain/bin/ \
-DBUILD_EXAMPLES=ON -GNinja
A barematel toolchain for targeting aarch64 semihosting can be found
at [2], under `aarch64-none-elf`.
[2] https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-a/downloads
The folder `samples/semihosting` provides two example semihosting
applications.
The two binaries can be executed on the host platform with:
qemu-aarch64 ./bin/example_semihosting_histogram
qemu-aarch64 ./bin/example_semihosting_norm
Similarly, the test and perf executables of the modules can be run
with:
qemu-aarch64 ./bin/opecv_[test|perf]_<module>
Notice that filesystem support is disabled by the toolchain file,
hence some of the test that depend on filesystem support will fail.
* [semihosting] Remove blank like at the end of file. [NFC]
The spurious blankline was reported by
https://pullrequest.opencv.org/buildbot/builders/precommit_docs/builds/31158.
* [semihosting] Make the raw pixel file generation OS independent.
Use the facilities provided by Cmake to generate the header file
instead of a shell script, so that the build doesn't fail on systems
that do not have a unix shell.
* [semihosting] Rename variable for semihosting compilation.
* [semihosting] Move the cmake configuration to a variable file.
* [semihosting] Make the guard macro private for the core module.
* [semihosting] Remove space. [NFC]
* [semihosting] Improve comment with information about semihosting. [NFC]
* [semihosting] Update license statement on top of sourvce file. [NFC]
* [semihosting] Replace BM_SUFFIX with SEMIHOSTING_SUFFIX. [NFC]
* [semihosting] Remove double space. [NFC]
* [semihosting] Add some text output to the sample applications.
* [semihosting] Remove duplicate entry in cmake configuration. [NFCI]
* [semihosting] Replace `long` with `int` in sample apps. [NFCI]
* [semihosting] Use `configure_file` to create the random pixels. [NFCI]
* [semihosting][bugfix] Fix name of cmakedefine variable.
* [semihosting][samples] Use CV_8UC1 for grayscale images. [NFCI]
* [semihosting] Add readme file.
* [semihosting] Remove blank like at the end of README. [NFC]
This fixes the failure at
https://pullrequest.opencv.org/buildbot/builders/precommit_docs/builds/31272.
Improves support for Unix non-Linux systems, including QNX
* Fixes#20395. Improves support for Unix non-Linux systems. Focus on QNX Neutrino.
Signed-off-by: promero <promero@mathworks.com>
* Update system.cpp
* [build][option] Introduce `OPENCV_DISABLE_THREAD_SUPPORT` option.
The option forces the library to build without thread support.
* update handling of OPENCV_DISABLE_THREAD_SUPPORT
- reduce amount of #if conditions
* [to squash] cmake: apply mode vars in toolchains too
Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
* Support cl_image conversion for CL_HALF_FLOAT (float16)
* Support cl_image conversion for additional channel orders:
CL_A, CL_INTENSITY, CL_LUMINANCE, CL_RG, CL_RA
* Comment on why cl_image conversion is unsupported for CL_RGB
* Predict optimal vector width for float16
* ocl::kernelToStr: support float16
* ocl::Device::halfFPConfig: drop artificial requirement for OpenCL
version >= 1.2. Even OpenCL 1.0 supports the underlying config
property, CL_DEVICE_HALF_FP_CONFIG.
* dumpOpenCLInformation: provide info on OpenCL half-float support
and preferred half-float vector width
* randu: support default range [-1.0, 1.0] for float16
* TestBase::warmup: support float16
There can be an int overflow.
cv::norm( InputArray _src, int normType, InputArray _mask ) is fine,
not cv::norm( InputArray _src1, InputArray _src2, int normType, InputArray _mask ).
Fix dynamic loading of clBLAS and clFFT (formerly, clAmdBlas and clAmdFft)
* Fix dynamic loading of clBLAS and clFFT
* Update filenames and function names for clBLAS (formerly, clAmdBlas)
* Update filenames and function names for clFFT (formerly, clAmdFft)
* Uncomment teardown of clFFT; tear down clFFT in same way as clBLAS
* Fix generators for clBLAS and clFFT headers
* Update generators to parse recent clBLAS and clFFT library headers
* Update generators to be compatible with Python 3
* Re-generate OpenCV's clBLAS and clFFT headers
* Update function calls to match names in newly generated headers
* Disable (and comment on) teardown code for clBLAS and clFFT
* Renaming *clamd* files
* Renaming *clamdblas* files to *clblas*
* Renaming *clamdfft* files to *clfft*
* Update generator for CL headers
* Update generator to be compatible with Python 3
* Add the support for riscv64 vector 0.7.1.
* fixed GCC warnings
* cleaned whitespaces
* Remove the worning by the use of internal API of compiler.
* Update the license header.
* removed trailing whitespaces
Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@me.com>
Co-authored-by: yulj <linjie.ylj@alibaba-inc.com>
Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
* Add Python Bindings for getCacheDirectory function
* Added getCacheDirectory interop test with image codecs.
Co-authored-by: Sergey Slashchinin <sergei.slashchinin@xperience.ai>
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat:eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above allow to reduce FFmpeg wrapper size for ~1.5Mb (initial size of OpenCV part is about 3Mb)
backport is done to improve merge experience (less conflicts)
backport of commit: 65eb946756
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat:eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above allow to reduce FFmpeg wrapper size for ~1.5Mb (initial size of OpenCV part is about 3Mb)