opencv

mirror of https://github.com/opencv/opencv.git synced 2024-11-27 12:40:05 +08:00

Author	SHA1	Message	Date
Maksim Shabunin	4c81e174bf	Merge pull request #25901 from mshabunin:fix-riscv-aarch-baseline RISC-V/AArch64: disable CPU features detection #25901 This PR is the first step in fixing current issues with NEON/RVV, FP16, BF16 and other CPU features on AArch64 and RISC-V platforms. On AArch64 and RISC-V platforms we usually have the platform set by default in the toolchain when we compile it or in the cmake toolchain file or in CMAKE_CXX_FLAGS by the user. Then, there are two ways to set platform options: a) "-mcpu=<some_cpu>" ; b) "-march=<arch description>" (e.g. "rv64gcv"). Furthermore, there are no similar "levels" of optimizations as for x86_64, instead we have features (RVV, FP16,...) which can be enabled or disabled. So, for example, if a user has "rv64gc" set by the toolchain and we want to enable RVV, we need to somehow parse their current feature set and append "v" (vector optimizations) to this string. This task is quite hard and the whole procedure is prone to errors. I propose to use "CPU_BASELINE=DETECT" by default on AArch64 and RISC-V platforms. And somehow remove other features or make them read-only/detect-only, so that OpenCV wouldn't add any extra "-march" flags to the default configuration. We would rely only on the flags provided by the compiler and cmake toolchain file. We can have some predefined configurations in our cmake toolchain files. Changes made by this PR: - `CMakeLists.txt`: - use `CMAKE_CROSSCOMPILING` instead of `CMAKE_TOOLCHAIN_FILE` to detect cross-compilation. This might be useful in cases of native compilation with a toolchain file - removed obsolete variables `ENABLE_NEON` and `ENABLE_VFPV3`, the first one has been turned ON by default on AArch64 platform which caused setting `CPU_BASELINE=NEON` - raise minimum cmake version allowed to 3.7 to allow using `CMAKE_CXX_FLAGS_INIT` in toolchain files - added separate files with arch flags for native compilation on AArch64 and RISC-V, these files will be used in our toolchain files and in regular cmake - use `DETECT` as default value for `CPU_BASELINE` also allow `NATIVE`, warn user if other values were used (only for AArch64 and RISC-V) - for each feature listed in `CPU_DISPATCH` check if corresponding `CPU_${opt}_FLAGS_ON` has been provided, warn user if it is empty (only for AArch64 and RISC-V) - use `CPU_BASELINE_DISABLE` variable to actually turn off macros responsible for corresponding features even if they are enabled by compiler - removed AArch64 feature merge procedure (it didn't support `-mcpu` and built-in `-march`) - reworked AArch64 and two RISC-V cmake toolchain files (does not affect Android/OSX/iOS/Win): - use `CMAKE_CXX_FLAGS_INIT` to set compiler flags - use variables `ENABLE_BF16`, `ENABLE_DOTPROD`, `ENABLE_RVV`, `ENABLE_FP16` to control `-march` - AArch64: removed other compiler and linker flags - `-fdata-sections`, `-fsigned-char`, `-Wl,--no-undefined`, `-Wl,--gc-sections` - already set by OpenCV - `-Wa,--noexecstack`, `-Wl,-z,noexecstack`, `-Wl,-z,relro`, `-Wl,-z,now` - can be enabled by OpenCV via `ENABLE_HARDENING` - `-Wno-psabi` - this option is used to disable some warnings on older ARM platforms, shouldn't harm - ARM: removed same common flags as for AArch64, but left `-mthumb` and `--fix-cortex-a8`, `-z nocopyreloc`	2024-09-12 18:07:24 +03:00
Alexander Alekhin	b79b366859	Merge pull request #25930 from opencv-pushbot:gitee/alalek/cmake_try_detect_feature_without_flags	2024-08-01 20:09:49 +00:00
Alexander Alekhin	938b9e4bb7	cmake: try baseline optimization feature check without extra flags first	2024-07-29 13:21:23 +00:00
HAN Liutong	b5ea32158a	Merge pull request #25883 from hanliutong:rvv-intrin-upgrade Upgrade RISC-V Vector intrinsic and cleanup the obsolete RVV backend. #25883 This patch upgrades the RISC-V Vector intrinsics from `v0.10` to `v0.12`/`v1.0`: - Update cmake check and options; - Upgrade RVV implementation for Universal Intrinsic; - Upgrade RVV optimized DNN kernel. - Clean up the obsolete RVV backend (`intrin_rvv.hpp`) and compatibility header file. With this patch, the RVV backend requires Clang 17+ or GCC 14+ (which means `__riscv_v_intrinsic >= 12000`, see https://godbolt.org/z/es7ncETE3). This patch was tested with Clang 17.0.6 (requires extra `-DWITH_PNG=OFF` due to ICE), Clang 18.1.8 and GCC 14.1.0 on QEMU and k230 (with `--gtest_filter="hal_"`). ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-07-19 11:41:42 +03:00
Tomoaki Teshima	cd00575257	brush up	2023-12-22 18:24:23 +09:00
Tomoaki Teshima	b6ec9b9d8c	prepare to build for ARM64 on Windows with Visual Studio	2023-12-19 09:40:35 +09:00
Alexander Smorkalov	7892517b3d	Merge pull request #24642 from tomoaki0705:merge_features_aarch64 build: merge multiple features specified at once	2023-12-18 19:57:19 +03:00
Tomoaki Teshima	bc12e4fe55	merge multiple features specified at once * follow the comment	2023-12-18 22:02:33 +09:00
Alexander Smorkalov	3893936243	Merge pull request #24565 from CNClareChen:4.x Change the lsx to baseline features.	2023-11-30 15:27:49 +03:00
Hao Chen	c19adb4953	Change the lsx to baseline features. This patch changes lsx to a baseline feature and lasx to a dispatch feature. Additionally, the runtime detection methods for lasx and lsx have been modified.	2023-11-21 11:51:22 +08:00
zihaomu	b913e73d04	DNN: add the Winograd fp16 support (#23654 ) * add Winograd FP16 implementation * fixed dispatching of FP16 code paths in dnn; use dynamic dispatcher only when NEON_FP16 is enabled in the build and the feature is present in the host CPU at runtime * fixed some warnings * hopefully fixed winograd on x64 (and maybe other platforms) --------- Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>	2023-11-20 13:45:37 +03:00
CNClareChen	d142a796d8	Merge pull request #23929 from CNClareChen:4.x Optimize some function with lasx. #23929 This patch optimizes some lasx functions and reduces the runtime of opencv_test_core from 662,238 ms to 633,603 ms on the 3A5000 platform. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-10-20 14:20:09 +03:00
Vadim Pisarevsky	ba4d6c859d	added detection & dispatching of some modern NEON instructions (NEON_FP16, NEON_BF16) (#24420 ) * added more or less cross-platform (based on POSIX signal() semantics) method to detect various NEON extensions, such as FP16 SIMD arithmetics, BF16 SIMD arithmetics, SIMD dotprod etc. It could be propagated to other instruction sets if necessary. * hopefully fixed compile errors * continue to fix CI * another attempt to fix build on Linux aarch64 * * reverted to the original method to detect special arm neon instructions without signal() * renamed FP16_SIMD & BF16_SIMD to NEON_FP16 and NEON_BF16, respectively * removed extra whitespaces	2023-10-18 22:06:20 +03:00
ashadrina	3889dcf3f8	Merge pull request #24286 from ashadrina:intel_icx_compiler_support Add Intel® oneAPI DPC++/C++ Compiler (icx) #24286 Intel® C++ Compiler Classic (icc) is deprecated and will be removed in a oneAPI release in the second half of 2023 ([deprecation notice](https://community.intel.com/t5/Intel-oneAPI-IoT-Toolkit/DEPRECATION-NOTICE-Intel-C-Compiler-Classic/m-p/1412267#:~:text=Intel%C2%AE%20C%2B%2B%20Compiler%20Classic%20(icc)%20is%20deprecated%20and%20will,the%20second%20half%20of%202023.)). This commit is intended to add support for the next-generation compiler, Intel® oneAPI DPC++/C++ Compiler (icx) (the documentation for the compiler is available on the [link](https://www.intel.com/content/www/us/en/docs/dpcpp-cpp-compiler/developer-guide-reference/2023-2/overview.html)). ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-09-22 17:09:58 +03:00
Alexander Alekhin	941d89e06d	cmake: fix RISC-V toolchains - RVV options are moved to configuration scripts instead of toolchains	2022-12-09 12:02:28 +00:00
Alexander Alekhin	762481411d	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2022-10-15 16:44:47 +00:00
Alexander Alekhin	d480e2e51b	cmake(opt): force separate targets for dispatched code - PCH may not pass compilation flags properly	2022-10-05 21:54:46 +03:00
wxsheng	4154bd0667	Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX * Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX * Add resize.lasx.cpp for Loongson SIMD acceleration * Add imgwarp.lasx.cpp for Loongson SIMD acceleration * Add LASX acceleration support for dnn/conv * Add CV_PAUSE(v) for Loongarch * Set LASX by default on Loongarch64 * LoongArch: tune test threshold for Core/HAL.mat_decomp/15 Co-authored-by: shengwenxue <shengwenxue@loongson.cn>	2022-09-10 09:39:43 +03:00
Alexander Alekhin	2ebdc04787	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2022-08-14 15:50:42 +00:00
Tomoaki Teshima	b3269b08a1	neon: add dotprod dispatch implementation * read vector at runtime * add enum	2022-07-20 19:25:39 +09:00
Alexander Alekhin	7b57df02a7	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2021-10-30 14:26:17 +00:00
Alexander Alekhin	5dfe65d53a	cmake: fix popcnt detection with Intel Compiler	2021-10-28 05:37:23 +00:00
Zhangyin	ff4c3873f2	Added cmake toolchain for RISC-V with clang. - Added cross compile cmake file for target riscv64-clang - Extended cmake for RISC-V and added instruction checks - Created intrin_rvv.hpp with C++ version universal intrinsics	2020-08-03 20:18:56 +08:00
Alexander Alekhin	6ea29a7696	cmake: prefer using CMAKE_SYSTEM_PROCESSOR / CMAKE_SIZEOF_VOID_P Drop: - discouraged CMAKE_CL_64 - MSVC64 - MINGW64	2019-12-11 00:21:10 +00:00
Alexander Alekhin	21c38bbdaf	cmake(cpu optimizations): fix cleanup of OPENCV_DEPENDANT_TARGETS_* vars	2019-11-02 10:34:54 +00:00
Fei Wu	90af2835a2	Fix issue 15730.	2019-10-19 00:36:18 +08:00
Alexander Alekhin	bdc097495a	fix avx512 detection - renamed Cascade Lake AVX512_CEL => AVX512_CLX (align with Intel SDE tool) - fixed CLX instruction sets (no IFMA/VBMI) - added flag to bypass CPU baseline check: OPENCV_SKIP_CPU_BASELINE_CHECK	2019-10-05 11:03:57 +00:00
mipsopen-fwu	b1ea91d8bd	Merge pull request #15422 from mipsopen-fwu:msa-dev * Added MSA implementations for mips platforms. Intrinsics for MSA and build scripts for MIPS platforms are added. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Removed some unused code in mips.toolchain.cmake. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Added comments for mips toolchain configuration and disabled compiling warnings for libpng. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Fixed the build error of unsupported opcode 'pause' when mips isa_rev is less than 2. Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Removed FP16 related item in MSA option defines in OpenCVCompilerOptimizations.cmake. 2. Use CV_CPU_COMPILE_MSA instead of __mips_msa for MSA feature check in cv_cpu_dispatch.h. 3. Removed hasSIMD128() in intrin_msa.hpp. 4. Define CPU_MSA as 150. Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Removed unnecessary CV_SIMD128_64F guarding in intrin_msa.hpp. 2. Removed unnecessary CV_MSA related code block in dotProd_8u(). Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Defined CPU_MSA_FLAGS_ON as "-mmsa". 2. Removed CV_SIMD128_64F guardings in intrin_msa.hpp. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Removed unused msa_mlal_u16() and msa_mlal_s16 from msa_macros.h. Signed-off-by: Fei Wu <fwu@wavecomp.com>	2019-09-20 19:52:48 +03:00
luz.paz	57ccf14952	Fix misc. source and comment typos Found via `codespell -q 3 -S ./3rdparty,./modules -L amin,ang,atleast,dof,endwhile,hist,uint` backporting of commit: `32aba5e64b`	2019-08-15 13:09:52 +03:00
Tomoaki Teshima	db6a6ccaba	re-enable CPU_BASELINE=FP16 on Armv7 platform	2019-07-02 21:57:15 +09:00
Vitaly Tuzov	d2aadabc5e	Merge pull request #14743 from terfendail:wui512_fixvswarn Fix for MSVS2019 build warnings (#14743) * AVX512 arch support for MSVS * Fix for MSVS2019 build warnings: updated integral() AVX512 implementation * Fix for MSVS2019 build warnings: reworked v_rotate_right AVX512 implementation * fix indentation	2019-06-11 23:07:39 +03:00
Alexander Alekhin	d8b42792a6	cmake: update ENABLE_FAST_MATH option	2019-05-30 19:44:35 +03:00
Alexander Alekhin	6b6222bfbb	cmake: support CPU_DISPATCH=ALL, fix misused CPU_DISPATCH - CPU_DISPATCH_FINAL should be used for filtering	2019-05-05 11:24:21 +00:00
Sayed Adel	5a77f4cee3	Merge pull request #14007 from seiko2plus:core_avx512_infa * core: improve AVX512 infrastructure by adding more CPU features groups * cmake: use groups for AVX512 optimization flags * core: remove gap in CPU flags enumeration * cmake: restore default CPU_DISPATCH	2019-05-05 14:19:49 +03:00
Alexander Alekhin	fab0eb0d75	cmake: fix compiler flags (CPU_BASELINE_REQUIRED=xxx + CPU_BASELINE=DETECT)	2018-11-28 14:04:03 +03:00
Sayed Adel	474a0dac49	core: several improvements and fixes on ppc64le infrastructure - add infrastructure support for Power9/VSX3 - fix missing VSX flags on GCC4.9 and CLANG4(#13210, #13222) - fix disabling VSX optimization on GCC by using flag ENABLE_VSX - flag ENABLE_VSX is deprecated now, use CPU_BASELINE, CPU_DISPATCH instead - add VSX3 to arithmetic dispatchable flags	2018-11-20 15:28:46 +00:00
Alexander Alekhin	c54676d625	cmake: fix supporting of legacy flags	2018-11-12 14:11:57 +03:00
Alexander Alekhin	7e2c65ce3c	Merge pull request #12925 from alalek:fix_cmake_conditions	2018-10-25 11:52:39 +00:00
Alexander Alekhin	0f07edded6	cmake: don't change baseline compiler flags in 'detection' mode	2018-10-24 03:54:31 +00:00
Alexander Alekhin	d6a8e08acc	cmake: fix variable expand in CMake conditions	2018-10-21 15:02:40 +00:00
Alexander Alekhin	3f302cabb8	core(test): intrinsic tests for all dispatched CPU optimizations - tests for both SIMD128 / SIMD256 - different dispatched + baseline(SIMD128) intrinsics	2018-08-01 13:50:42 +03:00
Maksim Shabunin	597db69151	ts: test case list is printed after cmd line parsing, refactored	2018-07-26 16:43:43 +03:00
Dmitry Kurtaev	0c4d5ffecd	Do not copy cv_cpu_helper.h to parent if OpenCV is a submodule	2018-07-24 09:36:28 +03:00
Alexander Alekhin	56222f35bb	cmake: fix CPU_BASELINE_FINAL filling - remove duplicates - restore "always on" missing entries - fix FP16 detection on MSVC	2018-04-26 17:13:42 +03:00
Alexander Alekhin	ff6ce6cd01	cmake: change CPU_BASELINE=DETECT for MacOSX	2018-04-23 19:42:49 +03:00
Alexander Alekhin	97882d03cc	core: fix FP16 conversion with CV_DISABLE_OPTIMIZATION option Reproducer: cmake -DCPU_BASELINE=AVX2 -DCV_DISABLE_OPTIMIZATION=ON ...	2018-04-18 14:13:03 +03:00
Alexander Alekhin	5b867b6f1f	cmake: fix CPU_BASELINE=NATIVE on MSVS	2018-04-17 19:34:35 +03:00
Alexander Alekhin	8388b630ac	Merge pull request #11167 from alalek:cmake_compiler_vars	2018-03-28 12:38:31 +00:00
Alexander Alekhin	08941b7890	cmake: avoid amending of CMAKE_COMPILER_IS_[GNUCXX\|CLANGCXX\|CCACHE] vars - Recommended compiler checks: - GCC: CV_GCC - Clang: CV_CLANG - fixed problem with CMAKE_CXX_COMPILER_ID=Clang/AppleClang mess on MacOSX Details: cmake --help-policy CMP0025 - do not declare Clang as GCC compiler	2018-03-27 16:16:59 +03:00
Alexander Alekhin	6c051a55e5	cmake: don't add include <module>/src directory to avoid conflicts during opencv_world builds	2018-03-19 11:14:15 +03:00

70 Commits