opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-15 18:09:11 +08:00

Author	SHA1	Message	Date
Tomoaki Teshima	b6ec9b9d8c	prepare to build for ARM64 on Windows with Visual Studio	2023-12-19 09:40:35 +09:00
Tomoaki Teshima	b13514e33c	reenable fp16 compile in old compiler	2023-12-09 10:32:21 +09:00
zihaomu	b913e73d04	DNN: add the Winograd fp16 support (#23654) * add Winograd FP16 implementation * fixed dispatching of FP16 code paths in dnn; use dynamic dispatcher only when NEON_FP16 is enabled in the build and the feature is present in the host CPU at runtime * fixed some warnings * hopefully fixed winograd on x64 (and maybe other platforms) --------- Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>	2023-11-20 13:45:37 +03:00
Anatoliy Talamanov	0e151e3c88	Merge pull request #24060 from TolyaTalamanov:at/advanced-device-selection-onnxrt-directml G-API: Advanced device selection for ONNX DirectML Execution Provider #24060 ### Overview Extend `cv::gapi::onnx::ep::DirectML` to accept `adapter name` as `ctor` parameter in order to select execution device by `name`. E.g: ``` pp.cfgAddExecutionProvider(cv::gapi::onnx::ep::DirectML("Intel Graphics")); ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2023-11-16 08:49:53 +03:00
CNClareChen	d142a796d8	Merge pull request #23929 from CNClareChen:4.x * Optimize some function with lasx. Optimize some function with lasx. #23929 This patch optimizes some lasx functions and reduces the runtime of opencv_test_core from 662,238ms to 633603ms on the 3A5000 platform. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2023-10-20 14:20:09 +03:00
Vadim Pisarevsky	ba4d6c859d	added detection & dispatching of some modern NEON instructions (NEON_FP16, NEON_BF16) (#24420) * added more or less cross-platform (based on POSIX signal() semantics) method to detect various NEON extensions, such as FP16 SIMD arithmetics, BF16 SIMD arithmetics, SIMD dotprod etc. It could be propagated to other instruction sets if necessary. * hopefully fixed compile errors * continue to fix CI * another attempt to fix build on Linux aarch64 * * reverted to the original method to detect special arm neon instructions without signal() * renamed FP16_SIMD & BF16_SIMD to NEON_FP16 and NEON_BF16, respectively * removed extra whitespaces	2023-10-18 22:06:20 +03:00
Maksim Shabunin	b12c14514a	RISC-V: allow building scalable RVV support with GCC, LLVM 16 support	2023-04-05 14:18:58 +03:00
Xxfore	ef0fcb9238	Merge pull request #22938 from Xxfore:4.x Use reinterpret instead of c-style casting for GCC Co-authored-by: Xu Zhang <xu.zhang@hexintek.com> Co-authored-by: Maksim Shabunin <maksim.shabunin@gmail.com>	2023-01-11 14:11:16 +00:00
Yuantao Feng	a2b3acfc6e	dnn: add the CANN backend (#22634) * cann backend impl v1 * cann backend impl v2: use opencv parsers to build models for cann * adjust fc according to the new transA and transB * put cann net in cann backend node and reuse forwardLayer * use fork() to create a child process and compile cann model * remove legacy code * remove debug code * fall bcak to CPU backend if there is one layer not supoorted by CANN backend * fix netInput forward	2022-12-21 09:04:41 +03:00
Biswapriyo Nath	6cf0910842	Merge pull request #22462 from Biswa96:fix-directx-check * cmake: Fix DirectX detection in mingw The pragma comment directive is valid for MSVC only. So, the DirectX detection fails in mingw. The failure is fixed by adding the required linking library (here d3d11) in the try_compile() function in OpenCVDetectDirectX.cmake file. Also add a message if the first DirectX check fails. * gapi: Fix compilation with mingw These changes remove MSVC specific pragma directive. The compilation fails at linking time due to absence of proper linking library. The required libraries are added in corresponding CMakeLists.txt file. * samples: Fix compilation with mingw These changes remove MSVC specific pragma directive. The compilation fails at linking time due to absence of proper linking library. The required libraries are added in corresponding CMakeLists.txt file.	2022-10-03 08:37:36 +03:00
wxsheng	4154bd0667	Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX * Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX * Add resize.lasx.cpp for Loongson SIMD acceleration * Add imgwarp.lasx.cpp for Loongson SIMD acceleration * Add LASX acceleration support for dnn/conv * Add CV_PAUSE(v) for Loongarch * Set LASX by default on Loongarch64 * LoongArch: tune test threshold for Core/HAL.mat_decomp/15 Co-authored-by: shengwenxue <shengwenxue@loongson.cn>	2022-09-10 09:39:43 +03:00
Alexander Alekhin	2ebdc04787	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2022-08-14 15:50:42 +00:00
Tomoaki Teshima	b3269b08a1	neon: add dotprod dispatch implementation * read vector at runtime * add enum	2022-07-20 19:25:39 +09:00
Tatsuro Shibamura	d354ad1c34	Merge pull request #21630 from shibayan:arm64-msvc-neon * Added NEON support in builds for Windows on ARM * Fixed `HAVE_CPU_NEON_SUPPORT` display broken during compiler test * Fixed a build error prior to Visual Studio 2022	2022-02-26 17:35:03 +00:00
Hanxi Guo	1fcf7ba5bc	Merge pull request #20406 from MarkGHX:gsoc_2021_webnn [GSoC] OpenCV.js: Accelerate OpenCV.js DNN via WebNN * Add WebNN backend for OpenCV DNN Module Update dnn.cpp Update dnn.cpp Update dnn.cpp Update dnn.cpp Add WebNN head files into OpenCV 3rd partiy files Create webnn.hpp update cmake Complete README and add OpenCVDetectWebNN.cmake file add webnn.cpp Modify webnn.cpp Can successfully compile the codes for creating a MLContext Update webnn.cpp Update README.md Update README.md Update README.md Update README.md Update cmake files and update README.md Update OpenCVDetectWebNN.cmake and README.md Update OpenCVDetectWebNN.cmake Fix OpenCVDetectWebNN.cmake and update README.md Add source webnn_cpp.cpp and libary libwebnn_proc.so Update dnn.cpp Update dnn.cpp Update dnn.cpp Update dnn.cpp update dnn.cpp update op_webnn update op_webnn Update op_webnn.hpp update op_webnn.cpp & hpp Update op_webnn.hpp Update op_webnn update the skeleton Update op_webnn.cpp Update op_webnn Update op_webnn.cpp Update op_webnn.cpp Update op_webnn.hpp update op_webnn update op_webnn Solved the problems of released variables. 
Fixed the bugs in op_webnn.cpp Implement op_webnn Implement Relu by WebNN API Update dnn.cpp for better test Update elementwise_layers.cpp Implement ReLU6 Update elementwise_layers.cpp Implement SoftMax using WebNN API Implement Reshape by WebNN API Implement PermuteLayer by WebNN API Implement PoolingLayer using WebNN API Update pooling_layer.cpp Update pooling_layer.cpp Update pooling_layer.cpp Update pooling_layer.cpp Update pooling_layer.cpp Update pooling_layer.cpp Implement poolingLayer by WebNN API and add more detailed logs Update dnn.cpp Update dnn.cpp Remove redundant codes and add more logs for poolingLayer Add more logs in the pooling layer implementation Fix the indent issue and resolve the compiling issue Fix the build problems Fix the build issue FIx the build issue Update dnn.cpp Update dnn.cpp * Fix the build issue * Implement BatchNorm Layer by WebNN API * Update convolution_layer.cpp This is a temporary file for Conv2d layer implementation * Integrate some general functions into op_webnn.cpp&hpp * Update const_layer.cpp * Update convolution_layer.cpp Still have some bugs that should be fixed. * Update conv2d layer and fc layer still have some problems to be fixed. * update constLayer, conv layer, fc layer There are still some bugs to be fixed. * Fix the build issue * Update concat_layer.cpp Still have some bugs to be fixed. 
* Update conv2d layer, fully connected layer and const layer * Update convolution_layer.cpp * Add OpenCV.js DNN module WebNN Backend (both using webnn-polyfill and electron) * Delete bib19450.aux * Add WebNN backend for OpenCV DNN Module Update dnn.cpp Update dnn.cpp Update dnn.cpp Update dnn.cpp Add WebNN head files into OpenCV 3rd partiy files Create webnn.hpp update cmake Complete README and add OpenCVDetectWebNN.cmake file add webnn.cpp Modify webnn.cpp Can successfully compile the codes for creating a MLContext Update webnn.cpp Update README.md Update README.md Update README.md Update README.md Update cmake files and update README.md Update OpenCVDetectWebNN.cmake and README.md Update OpenCVDetectWebNN.cmake Fix OpenCVDetectWebNN.cmake and update README.md Add source webnn_cpp.cpp and libary libwebnn_proc.so Update dnn.cpp Update dnn.cpp Update dnn.cpp Update dnn.cpp update dnn.cpp update op_webnn update op_webnn Update op_webnn.hpp update op_webnn.cpp & hpp Update op_webnn.hpp Update op_webnn update the skeleton Update op_webnn.cpp Update op_webnn Update op_webnn.cpp Update op_webnn.cpp Update op_webnn.hpp update op_webnn update op_webnn Solved the problems of released variables. 
Fixed the bugs in op_webnn.cpp Implement op_webnn Implement Relu by WebNN API Update dnn.cpp for better test Update elementwise_layers.cpp Implement ReLU6 Update elementwise_layers.cpp Implement SoftMax using WebNN API Implement Reshape by WebNN API Implement PermuteLayer by WebNN API Implement PoolingLayer using WebNN API Update pooling_layer.cpp Update pooling_layer.cpp Update pooling_layer.cpp Update pooling_layer.cpp Update pooling_layer.cpp Update pooling_layer.cpp Implement poolingLayer by WebNN API and add more detailed logs Update dnn.cpp Update dnn.cpp Remove redundant codes and add more logs for poolingLayer Add more logs in the pooling layer implementation Fix the indent issue and resolve the compiling issue Fix the build problems Fix the build issue FIx the build issue Update dnn.cpp Update dnn.cpp * Fix the build issue * Implement BatchNorm Layer by WebNN API * Update convolution_layer.cpp This is a temporary file for Conv2d layer implementation * Integrate some general functions into op_webnn.cpp&hpp * Update const_layer.cpp * Update convolution_layer.cpp Still have some bugs that should be fixed. * Update conv2d layer and fc layer still have some problems to be fixed. * update constLayer, conv layer, fc layer There are still some bugs to be fixed. 
* Update conv2d layer, fully connected layer and const layer * Update convolution_layer.cpp * Add OpenCV.js DNN module WebNN Backend (both using webnn-polyfill and electron) * Update dnn.cpp * Fix Error in dnn.cpp * Resolve duplication in conditions in convolution_layer.cpp * Fixed the issues in the comments * Fix building issue * Update tutorial * Fixed comments * Address the comments * Update CMakeLists.txt * Offer more accurate perf test on native * Add better perf tests for both native and web * Modify per tests for better results * Use more latest version of Electron * Support latest WebNN Clamp op * Add definition of HAVE_WEBNN macro * Support group convolution * Implement Scale_layer using WebNN * Add Softmax option for native classification example * Fix comments * Fix comments	2021-11-23 21:15:31 +00:00
Zhang Yin	3a15a3821a	Update RISC-V back-end to RVV 0.10	2021-06-18 15:44:38 +08:00
Zhangyin	ff4c3873f2	Added cmake toolchain for RISC-V with clang. - Added cross compile cmake file for target riscv64-clang - Extended cmake for RISC-V and added instruction checks - Created intrin_rvv.hpp with C++ version universal intrinsics	2020-08-03 20:18:56 +08:00
Alexander Alekhin	ca9756f6a1	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2020-04-13 20:00:12 +00:00
Alexander Alekhin	f0ffc52435	fix files permissions	2020-04-13 04:29:55 +00:00
Alexander Alekhin	bf2f7b0f8b	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2020-02-01 17:26:00 +00:00
Sayed Adel	bd531bd828	core:vsx fix inline asm constraints generalize constraints to 'wa' for VSX registers	2020-01-28 15:48:00 +02:00
Alexander Alekhin	c6c8783c60	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-12-16 21:30:30 +00:00
Tatsuro Shibamura	971ae00942	Merge pull request #16027 from shibayan:arm64-windows10 * Support ARM64 Windows 10 platform * Fixed detection issue for ARM64 Windows 10 * Try enabling ARM NEON intrin * build: disable NEON with MSVC compiler * samples(directx): gdi32 dependency	2019-12-17 00:23:30 +03:00
Alexander Alekhin	65573784c4	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-10-09 19:46:18 +00:00
Alexander Alekhin	42ac089e12	build: update AVX2 check - _mm256_bslli_epi128() works in GCC 4.9.3+ only - Android NDK r10 doesn't support this instruction	2019-10-08 13:12:02 +03:00
Alexander Alekhin	626bfbf309	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-10-05 15:45:31 +00:00
Alexander Alekhin	bdc097495a	fix avx512 detection - renamed Cascade Lake AVX512_CEL => AVX512_CLX (align with Intel SDE tool) - fixed CLX instruction sets (no IFMA/VBMI) - added flag to bypass CPU baseline check: OPENCV_SKIP_CPU_BASELINE_CHECK	2019-10-05 11:03:57 +00:00
Alexander Alekhin	a74fe2ec01	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-09-20 21:11:49 +00:00
mipsopen-fwu	b1ea91d8bd	Merge pull request #15422 from mipsopen-fwu:msa-dev * Added MSA implementations for mips platforms. Intrinsics for MSA and build scripts for MIPS platforms are added. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Removed some unused code in mips.toolchain.cmake. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Added comments for mips toolchain configuration and disabled compiling warnings for libpng. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Fixed the build error of unsupported opcode 'pause' when mips isa_rev is less than 2. Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Removed FP16 related item in MSA option defines in OpenCVCompilerOptimizations.cmake. 2. Use CV_CPU_COMPILE_MSA instead of __mips_msa for MSA feature check in cv_cpu_dispatch.h. 3. Removed hasSIMD128() in intrin_msa.hpp. 4. Define CPU_MSA as 150. Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Removed unnecessary CV_SIMD128_64F guarding in intrin_msa.hpp. 2. Removed unnecessary CV_MSA related code block in dotProd_8u(). Signed-off-by: Fei Wu <fwu@wavecomp.com> * 1. Defined CPU_MSA_FLAGS_ON as "-mmsa". 2. Removed CV_SIMD128_64F guardings in intrin_msa.hpp. Signed-off-by: Fei Wu <fwu@wavecomp.com> * Removed unused msa_mlal_u16() and msa_mlal_s16 from msa_macros.h. Signed-off-by: Fei Wu <fwu@wavecomp.com>	2019-09-20 19:52:48 +03:00
Alexander Alekhin	c657c6cbac	cmake: use 'long long' for atomic check	2019-09-18 15:18:09 +00:00
Alexander Alekhin	a7b954f655	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-08-23 19:24:37 +03:00
Alexander Alekhin	464972855e	cmake: add libatomic check	2019-08-21 13:02:36 +03:00
Alexander Alekhin	ddcf388270	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-06-07 19:02:55 +03:00
Maksim Shabunin	65919ed839	AVX 512 detection: workaround for older GCC	2019-06-07 13:58:09 +03:00
Alexander Alekhin	b2abd8ca41	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-05-07 16:04:54 +00:00
Sayed Adel	5a77f4cee3	Merge pull request #14007 from seiko2plus:core_avx512_infa * core: improve AVX512 infrastructure by adding more CPU features groups * cmake: use groups for AVX512 optimization flags * core: remove gap in CPU flags enumeration * cmake: restore default CPU_DISPATCH	2019-05-05 14:19:49 +03:00
Alexander Alekhin	8c25a8eb7b	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-03-22 19:31:31 +03:00
Sayed Adel	0df607e5dd	cmake:vsx Fix compilation fail on aligned runtime test when c++11 enabled	2019-03-21 11:21:00 +02:00
Sayed Adel	f41359688b	core:vsx Add support for VSX3 half precision conversions	2019-03-20 10:19:42 +02:00
Alexander Alekhin	26087e28ad	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2019-03-15 22:42:57 +00:00
Sayed Adel	872e7894b4	core:vsx working around gcc aligned memory access bug - allow cmake to check sanity of vsx aligned ld/st - force universal intrinsics v_load_aligned/v_store_aligned to failback to unaligned ld/st if cmake runtime vsx aligned test fail	2019-03-14 01:55:40 +02:00
Kartik Mohta	80a3d7bffa	Fix comment marker in OpenCVDetectCudaArch.cu	2018-12-04 11:47:28 -08:00
Alexander Alekhin	7fa7fa0226	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2018-11-21 08:33:39 +00:00
Sayed Adel	474a0dac49	core: several improves and fixes on ppc64le infrastructure - add infrastructure support for Power9/VSX3 - fix missing VSX flags on GCC4.9 and CLANG4(#13210, #13222) - fix disable VSX optimzation on GCC by using flag ENABLE_VSX - flag ENABLE_VSX is deprecated now, use CPU_BASELINE, CPU_DISPATCH instead - add VSX3 to arithmetic dispatchable flags	2018-11-20 15:28:46 +00:00
Alexander Alekhin	5869415a57	videoio: drop obsolete backends - VFW - QuickTime/QtKit - Unicap - GPL, no active support: https://github.com/unicap/unicap - DC1394 (1st version) / CMU1394	2018-11-07 19:49:09 +03:00
WuZhiwen	6e3ea8b49d	Merge pull request #12703 from wzw-intel:vkcom * dnn: Add a Vulkan based backend This commit adds a new backend "DNN_BACKEND_VKCOM" and a new target "DNN_TARGET_VULKAN". VKCOM means vulkan based computation library. This backend uses Vulkan API and SPIR-V shaders to do the inference computation for layers. The layer types that implemented in DNN_BACKEND_VKCOM include: Conv, Concat, ReLU, LRN, PriorBox, Softmax, MaxPooling, AvePooling, Permute This is just a beginning work for Vulkan in OpenCV DNN, more layer types will be supported and performance tuning is on the way. Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com> * dnn/vulkan: Add FindVulkan.cmake to detect Vulkan SDK In order to build dnn with Vulkan support, need installing Vulkan SDK and setting environment variable "VULKAN_SDK" and add "-DWITH_VULKAN=ON" to cmake command. You can download Vulkan SDK from: https://vulkan.lunarg.com/sdk/home#linux For how to install, see https://vulkan.lunarg.com/doc/sdk/latest/linux/getting_started.html https://vulkan.lunarg.com/doc/sdk/latest/windows/getting_started.html https://vulkan.lunarg.com/doc/sdk/latest/mac/getting_started.html respectively for linux, windows and mac. To run the vulkan backend, also need installing mesa driver. On Ubuntu, use this command 'sudo apt-get install mesa-vulkan-drivers' To test, use command '$BUILD_DIR/bin/opencv_test_dnn --gtest_filter=VkCom' Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com> * dnn/Vulkan: dynamically load Vulkan runtime No compile-time dependency on Vulkan library. If Vulkan runtime is unavailable, fallback to CPU path. Use environment "OPENCL_VULKAN_RUNTIME" to specify path to your own vulkan runtime library. Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com> * dnn/Vulkan: Add a python script to compile GLSL shaders to SPIR-V shaders The SPIR-V shaders are in format of text-based 32-bit hexadecimal numbers, and inserted into .cpp files as unsigned int32 array. 
* dnn/Vulkan: Put Vulkan headers into 3rdparty directory and some other fixes Vulkan header files are copied from https://github.com/KhronosGroup/Vulkan-Docs/tree/master/include/vulkan to 3rdparty/include Fix the Copyright declaration issue. Refine OpenCVDetectVulkan.cmake * dnn/Vulkan: Add vulkan backend tests into existing ones. Also fixed some test failures. - Don't use bool variable as uniform for shader - Fix dispathed group number beyond max issue - Bypass "group > 1" convolution. This should be support in future. * dnn/Vulkan: Fix multiple initialization in one thread.	2018-10-29 17:51:26 +03:00
Hamdi Sahloul	e136c11c7c	Avoid detecting dublicate CUDA archs	2018-08-16 17:40:09 +09:00
Alexander Alekhin	000a13b6a3	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2018-05-03 14:30:38 +00:00
Alexander Alekhin	56222f35bb	cmake: fix CPU_BASELINE_FINAL filling - remove duplicates - restore "always on" missing entries - fix FP16 detection on MSVC	2018-04-26 17:13:42 +03:00
Alexander Alekhin	cd2b188c9a	Merge remote-tracking branch 'upstream/3.4' into merge-3.4	2018-04-24 18:13:06 +03:00

81 Commits