Commit Graph

62 Commits

Author SHA1 Message Date
Alexander Smorkalov
3893936243
Merge pull request from CNClareChen:4.x
Change the lsx to baseline features.
2023-11-30 15:27:49 +03:00
Hao Chen
c19adb4953 Change the lsx to baseline features.
This patch change lsx to baseline feature, and lasx to dispatch
feature. Additionally, the runtime detection methods for lasx and
lsx have been modified.
2023-11-21 11:51:22 +08:00
zihaomu
b913e73d04
DNN: add the Winograd fp16 support ()
* add Winograd FP16 implementation

* fixed dispatching of FP16 code paths in dnn; use dynamic dispatcher only when NEON_FP16 is enabled in the build and the feature is present in the host CPU at runtime

* fixed some warnings

* hopefully fixed winograd on x64 (and maybe other platforms)

---------

Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
2023-11-20 13:45:37 +03:00
CNClareChen
d142a796d8
Merge pull request from CNClareChen:4.x
* Optimize some function with lasx.

Optimize some function with lasx. 

This patch optimizes some lasx functions and reduces the runtime of opencv_test_core from 662,238ms to 633603ms on the 3A5000 platform.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-10-20 14:20:09 +03:00
Vadim Pisarevsky
ba4d6c859d
added detection & dispatching of some modern NEON instructions (NEON_FP16, NEON_BF16) ()
* added more or less cross-platform (based on POSIX signal() semantics) method to detect various NEON extensions, such as FP16 SIMD arithmetics, BF16 SIMD arithmetics, SIMD dotprod etc. It could be propagated to other instruction sets if necessary.

* hopefully fixed compile errors

* continue to fix CI

* another attempt to fix build on Linux aarch64

* * reverted to the original method to detect special arm neon instructions without signal()
* renamed FP16_SIMD & BF16_SIMD to NEON_FP16 and NEON_BF16, respectively

* removed extra whitespaces
2023-10-18 22:06:20 +03:00
ashadrina
3889dcf3f8
Merge pull request from ashadrina:intel_icx_compiler_support
Add Intel® oneAPI DPC++/C++ Compiler (icx) 

Intel® C++ Compiler Classic (icc) is deprecated and will be removed in a oneAPI release in the second half of 2023 ([deprecation notice](https://community.intel.com/t5/Intel-oneAPI-IoT-Toolkit/DEPRECATION-NOTICE-Intel-C-Compiler-Classic/m-p/1412267#:~:text=Intel%C2%AE%20C%2B%2B%20Compiler%20Classic%20(icc)%20is%20deprecated%20and%20will,the%20second%20half%20of%202023.)). This commit is intended to add support for the next-generation compiler, Intel® oneAPI DPC++/C++ Compiler (icx) (the documentation for the compiler is available on the [link](https://www.intel.com/content/www/us/en/docs/dpcpp-cpp-compiler/developer-guide-reference/2023-2/overview.html)). 

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-09-22 17:09:58 +03:00
Alexander Alekhin
941d89e06d cmake: fix RISC-V toolchains
- RVV options are moved to configuration scripts instead of toolchains
2022-12-09 12:02:28 +00:00
Alexander Alekhin
762481411d Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2022-10-15 16:44:47 +00:00
Alexander Alekhin
d480e2e51b cmake(opt): force separate targets for dispatched code
- PCH may not pass compilation flags properly
2022-10-05 21:54:46 +03:00
wxsheng
4154bd0667
Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX
* Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX
* Add resize.lasx.cpp for Loongson SIMD acceleration
* Add imgwarp.lasx.cpp for Loongson SIMD acceleration
* Add LASX acceleration support for dnn/conv
* Add CV_PAUSE(v) for Loongarch
* Set LASX by default on Loongarch64
* LoongArch: tune test threshold for Core/HAL.mat_decomp/15

Co-authored-by: shengwenxue <shengwenxue@loongson.cn>
2022-09-10 09:39:43 +03:00
Alexander Alekhin
2ebdc04787 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2022-08-14 15:50:42 +00:00
Tomoaki Teshima
b3269b08a1 neon: add dotprod dispatch implementation
* read vector at runtime
     * add enum
2022-07-20 19:25:39 +09:00
Alexander Alekhin
7b57df02a7 Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-10-30 14:26:17 +00:00
Alexander Alekhin
5dfe65d53a cmake: fix popcnt detection with Intel Compiler 2021-10-28 05:37:23 +00:00
Zhangyin
ff4c3873f2 Added cmake toolchain for RISC-V with clang.
- Added cross compile cmake file for target riscv64-clang
- Extended cmake for RISC-V and added instruction checks
- Created intrin_rvv.hpp with C++ version universal intrinsics
2020-08-03 20:18:56 +08:00
Alexander Alekhin
6ea29a7696 cmake: prefer using CMAKE_SYSTEM_PROCESSOR / CMAKE_SIZEOF_VOID_P
Drop:
- discouraged CMAKE_CL_64
- MSVC64
- MINGW64
2019-12-11 00:21:10 +00:00
Alexander Alekhin
21c38bbdaf cmake(cpu optmizations): fix cleanup of OPENCV_DEPENDANT_TARGETS_* vars 2019-11-02 10:34:54 +00:00
Fei Wu
90af2835a2 Fix issue 15730. 2019-10-19 00:36:18 +08:00
Alexander Alekhin
bdc097495a fix avx512 detection
- renamed Cascade Lake AVX512_CEL => AVX512_CLX (align with Intel SDE tool)
- fixed CLX instruction sets (no IFMA/VBMI)
- added flag to bypass CPU baseline check: OPENCV_SKIP_CPU_BASELINE_CHECK
2019-10-05 11:03:57 +00:00
mipsopen-fwu
b1ea91d8bd Merge pull request from mipsopen-fwu:msa-dev
* Added MSA implementations for mips platforms. Intrinsics for MSA and build scripts for MIPS platforms are added.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* Removed some unused code in mips.toolchain.cmake.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* Added comments for mips toolchain configuration and disabled compiling warnings for libpng.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* Fixed the build error of unsupported opcode 'pause' when mips isa_rev is less than 2.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* 1. Removed FP16 related item in MSA option defines in OpenCVCompilerOptimizations.cmake.
2. Use CV_CPU_COMPILE_MSA instead of __mips_msa for MSA feature check in cv_cpu_dispatch.h.
3. Removed hasSIMD128() in intrin_msa.hpp.
4. Define CPU_MSA as 150.
Signed-off-by: Fei Wu <fwu@wavecomp.com>

* 1. Removed unnecessary CV_SIMD128_64F guarding in intrin_msa.hpp.
2. Removed unnecessary CV_MSA related code block in dotProd_8u().

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* 1. Defined CPU_MSA_FLAGS_ON as "-mmsa".
2. Removed CV_SIMD128_64F guardings in intrin_msa.hpp.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

* Removed unused msa_mlal_u16() and msa_mlal_s16 from msa_macros.h.

Signed-off-by: Fei Wu <fwu@wavecomp.com>
2019-09-20 19:52:48 +03:00
luz.paz
57ccf14952 FIx misc. source and comment typos
Found via `codespell -q 3 -S ./3rdparty,./modules -L amin,ang,atleast,dof,endwhile,hist,uint`

backporting of commit: 32aba5e64b
2019-08-15 13:09:52 +03:00
Tomoaki Teshima
db6a6ccaba re-enable CPU_BASELINE=FP16 on Armv7 platform 2019-07-02 21:57:15 +09:00
Vitaly Tuzov
d2aadabc5e Merge pull request from terfendail:wui512_fixvswarn
Fix for MSVS2019 build warnings ()

* AVX512 arch support for MSVS

* Fix for MSVS2019 build warnings: updated integral() AVX512 implementation

* Fix for MSVS2019 build warnings: reworked v_rotate_right AVX512 implementation

* fix indentation
2019-06-11 23:07:39 +03:00
Alexander Alekhin
d8b42792a6 cmake: update ENABLE_FAST_MATH option 2019-05-30 19:44:35 +03:00
Alexander Alekhin
6b6222bfbb cmake: support CPU_DISPATCH=ALL, fix misused CPU_DISPATCH
- CPU_DISPATCH_FINAL should be used for filtering
2019-05-05 11:24:21 +00:00
Sayed Adel
5a77f4cee3 Merge pull request from seiko2plus:core_avx512_infa
* core: improve AVX512 infrastructure by adding more CPU features groups

* cmake: use groups for AVX512 optimization flags

* core: remove gap in CPU flags enumeration

* cmake: restore default CPU_DISPATCH
2019-05-05 14:19:49 +03:00
Alexander Alekhin
fab0eb0d75 cmake: fix compiler flags (CPU_BASELINE_REQUIRED=xxx + CPU_BASELINE=DETECT) 2018-11-28 14:04:03 +03:00
Sayed Adel
474a0dac49 core: several improves and fixes on ppc64le infrastructure
- add infrastructure support for Power9/VSX3
  - fix missing VSX flags on GCC4.9 and CLANG4(, )
  - fix disable VSX optimzation on GCC by using flag ENABLE_VSX
  - flag ENABLE_VSX is deprecated now, use CPU_BASELINE, CPU_DISPATCH instead
  - add VSX3 to arithmetic dispatchable flags
2018-11-20 15:28:46 +00:00
Alexander Alekhin
c54676d625 cmake: fix supporting of legacy flags 2018-11-12 14:11:57 +03:00
Alexander Alekhin
7e2c65ce3c Merge pull request from alalek:fix_cmake_conditions 2018-10-25 11:52:39 +00:00
Alexander Alekhin
0f07edded6 cmake: don't change baseline compiler flags in 'detection' mode 2018-10-24 03:54:31 +00:00
Alexander Alekhin
d6a8e08acc cmake: fix variable expand in CMake conditions 2018-10-21 15:02:40 +00:00
Alexander Alekhin
3f302cabb8 core(test): intrinsic tests for all dispatched CPU optimizations
- tests for both SIMD128 / SIMD256
- different dispatched + baseline(SIMD128) intrinsics
2018-08-01 13:50:42 +03:00
Maksim Shabunin
597db69151 ts: test case list is printed after cmd line parsing, refactored 2018-07-26 16:43:43 +03:00
Dmitry Kurtaev
0c4d5ffecd Do not copy cv_cpu_helper.h to parent if OpenCV is a submodule 2018-07-24 09:36:28 +03:00
Alexander Alekhin
56222f35bb cmake: fix CPU_BASELINE_FINAL filling
- remove duplicates
- restore "always on" missing entries
- fix FP16 detection on MSVC
2018-04-26 17:13:42 +03:00
Alexander Alekhin
ff6ce6cd01 cmake: change CPU_BASELINE=DETECT for MacOSX 2018-04-23 19:42:49 +03:00
Alexander Alekhin
97882d03cc core: fix FP16 conversion with CV_DISABLE_OPTIMIZATION option
Reproducer:
    cmake -DCPU_BASELINE=AVX2 -DCV_DISABLE_OPTIMIZATION=ON ...
2018-04-18 14:13:03 +03:00
Alexander Alekhin
5b867b6f1f cmake: fix CPU_BASELINE=NATIVE on MSVS 2018-04-17 19:34:35 +03:00
Alexander Alekhin
8388b630ac Merge pull request from alalek:cmake_compiler_vars 2018-03-28 12:38:31 +00:00
Alexander Alekhin
08941b7890 cmake: avoid amending of CMAKE_COMPILER_IS_[GNUCXX|CLANGCXX|CCACHE] vars
- Recommended compiler checks:
  - GCC: CV_GCC
  - Clang: CV_CLANG
- fixed problem with CMAKE_CXX_COMPILER_ID=Clang/AppleClang mess on MacOSX
  Details: cmake --help-policy CMP0025
- do not declare Clang as GCC compiler
2018-03-27 16:16:59 +03:00
Alexander Alekhin
6c051a55e5 cmake: don't add include <module>/src directory to avoid conflicts
during opencv_world builds
2018-03-19 11:14:15 +03:00
Alexander Alekhin
0b4428e92f cmake: AVX512 with clang 2018-02-27 15:46:09 +03:00
Alexander Alekhin
14032c6653 cmake: reset __content variable if file doesn't exist
Resolves CMake error after relaunch with updated source code:
Cannot find source file:
    modules/dnn/layers/layers_common.avx512_skx.cpp
2018-02-12 17:30:53 +03:00
Alexander Alekhin
5a791e6e06 cmake: update reporting of excluded dispatching files ()
* cmake: add ocv_get_smart_file_name() macro

* cmake: avoid adding files for unavailable dispatch modes
2018-02-12 14:48:20 +03:00
Alexander Alekhin
73891d619a
Merge pull request from alalek:cpu_dispatch_axv512
* cmake: enable CPU dispatching for AVX512 (SKX)

* cmake: update handling of unsupported flags/modes
2018-01-26 21:59:47 +03:00
Alexander Alekhin
7d67d60fb1 cmake(opt): AVX512_SKX 2017-12-29 07:18:11 +00:00
Alexander Alekhin
898ca38257 cmake: AVX512 -> AVX_512F 2017-12-28 15:20:27 +00:00
Arjan van de Ven
fc8e848a54 Add basic plumbing for AVX512 support
The opencv infrastructure mostly has the basics for supporting avx512 math functions,
but it wasn't hooked up (likely due to lack of users)

In order to compile the DNN functions for AVX512, a few things need to be hooked up
and this patch does that

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2017-12-25 21:06:52 +00:00
Alexander Alekhin
89d855c0b7 cmake: update optimization filter 2017-11-27 18:18:19 +03:00