opencv

mirror of https://github.com/opencv/opencv.git synced 2024-11-27 20:50:25 +08:00

Author	SHA1	Message	Date
Maksim Shabunin	9d64e2959f	dnn: use dispatcher for Winograd	2024-11-07 10:51:16 +03:00
Maksim Shabunin	04818d6dd5	build: made environment access a separate feature	2024-10-30 18:37:22 +03:00
Dmitry Kurtaev	0e80a97f87	Hotfix ie_ngraph.cpp in Debug	2024-10-29 10:20:51 +03:00
Dmitry Kurtaev	d193554a5f	OpenVINO friendly output names from non-compiled Model	2024-10-23 09:29:05 +03:00
Maksim Shabunin	305b57e622	C-API cleanup: backport videoio changes from 5.x	2024-10-01 17:06:08 +03:00
Alexander Smorkalov	209802c9f6	Leaky RELU support for TFLite.	2024-09-09 12:40:35 +03:00
alexlyulkov	766bad0035	Merge pull request #26053 from alexlyulkov:al/opencl-conformance-tests DNN(ONNX): Enabled several OpenCL conformance tests #26053 The tests also work in 5.x ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-08-27 17:23:11 +03:00
Abduragim Shtanchaev	e5b871fa7e	Merge pull request #26059 from Abdurrahheem:ash/fix-einsum-allocation Einsum buffer allocation fix #26059 This PR fixed buffer allocation issue in Einsum layer that causes segmentation fault on 32bit platforms. Related issue #26008 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-08-23 15:25:48 +03:00
Alexander Smorkalov	6c6d5cd7b2	Merge pull request #25986 from asmorkalov:as/js_for_contrib Split Javascript white-list to support contrib modules #25986 The single whitelist was converted to several per-module JSON files. They are concatenated automatically and can be overridden by user config. Related to https://github.com/opencv/opencv/pull/25656 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-08-23 10:49:08 +03:00
Yuantao Feng	347d673a87	Merge pull request #23279 from fengyuentau:add_topk dnn: add ONNX TopK #23279 Merge with https://github.com/opencv/opencv_extra/pull/1200 Partially fixes #22890 and #20258 To-do: - [x] TopK forward impl - [x] add tests - [x] support Opset 1 & 10 if possible - [ ] ~Support other backends~ (TopK has two outputs, which is not supported by other backends, such as openvino) Perf:

M1 (time in milliseconds)

| input shape | axis | dnn | ort |
| --------------- | ---- | ---- | ---- |
| (1000, 100) | 0 | 1.68 | 4.07 |
| (1000, 100) K5 | 0 | 1.13 | 0.12 |
| (1000, 100) | 1 | 0.96 | 0.77 |
| (100, 100, 100) | 0 | 10.00 | 31.13 |
| (100, 100, 100) | 1 | 7.33 | 9.17 |
| (100, 100, 100) | 2 | 7.52 | 9.48 |

M2 (time in milliseconds)

| input shape | axis | dnn | ort |
| --------------- | ---- | ---- | ---- |
| (1000, 100) | 0 | 0.76 | 2.44 |
| (1000, 100) K5 | 0 | 0.68 | 0.07 |
| (1000, 100) | 1 | 0.41 | 0.50 |
| (100, 100, 100) | 0 | 4.83 | 17.52 |
| (100, 100, 100) | 1 | 3.60 | 5.08 |
| (100, 100, 100) | 2 | 3.73 | 5.10 |

ONNXRuntime performance testing script: https://gist.github.com/fengyuentau/a119f94fd16721ec9974b8c7b0a45d4c ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-08-21 17:03:24 +03:00
Yuantao Feng	93e0c7e53f	fix matmul crash	2024-08-15 16:10:40 +08:00
Alexander Smorkalov	67d0338c9c	Merge pull request #26004 from ericmariasis:eric-mariasis-issue-26000 got rid of std prefix	2024-08-07 10:26:24 +03:00
ericmariasis	3f92884520	got rid of std prefix	2024-08-07 00:17:04 -04:00
Aven	796974cccc	fix compilation errors caused by namespace related: #25199	2024-08-04 05:04:03 +08:00
Daniele Affinita	2a333a6c86	Merge pull request #25644 from DaniAffCH:blockwise-quantization [GSoC] dnn: Blockwise quantization support #25644 This PR introduces blockwise quantization in DNN allowing the parsing of ONNX models quantized in blockwise style. In particular it modifies the `Quantize` and `Dequantize` operations. The related PR opencv/opencv_extra#1181 contains the test data. Additional notes: - The original quantization issue has been fixed. Previously, for 1D scale and zero-point, the operation applied was $y = int8(x/s - z)$ instead of $y = int8(x/s + z)$. Note that the operation was already correctly implemented when the scale and zero-point were scalars. The previous implementation failed the ONNX test cases, but now all have passed successfully. [Reference](https://github.com/onnx/onnx/blob/main/docs/Operators.md#QuantizeLinear) - the function `block_repeat` broadcasts scale and zero-point to the input shape. It repeats all the elements of a given axis n times. This function generalizes the behavior of `repeat` from the core module which is defined just for 2 axis assuming `Mat` has 2 dimensions. If appropriate and useful, you might consider moving `block_repeat` to the core module. - Now, the scale and zero-point can be taken as layer inputs. This increases the ONNX layers' coverage and enables us to run the ONNX test cases (previously disabled) being fully compliant with ONNX standards. Since they are now supported, I have enabled the test cases for: `test_dequantizelinear`, `test_dequantizelinear_axis`, `test_dequantizelinear_blocked`, `test_quantizelinear`, `test_quantizelinear_axis`, `test_quantizelinear_blocked` just in CPU backend. All of them pass successfully. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. 
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-07-30 14:16:08 +03:00
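The blockwise quantization described in commit 2a333a6c86 can be illustrated with a small sketch. It is not the OpenCV implementation: it is a pure-Python, 1-D illustration under the assumption that each consecutive block of `block_size` elements shares one scale and one zero-point, and that the result saturates to the int8 range. The helper names `block_repeat_1d`, `quantize_blockwise` and `dequantize_blockwise` are hypothetical; only the formula $y = int8(x/s + z)$ comes from the commit message.

```python
def block_repeat_1d(values, block_size):
    """Repeat every element block_size times (hypothetical 1-D analogue of
    broadcasting per-block scale / zero-point to the input shape)."""
    return [v for v in values for _ in range(block_size)]


def quantize_blockwise(x, scales, zero_points, block_size):
    """Quantize x to int8 using y = saturate(round(x / s) + z) per block."""
    s = block_repeat_1d(scales, block_size)
    z = block_repeat_1d(zero_points, block_size)
    if len(s) != len(x) or len(z) != len(x):
        raise ValueError("len(x) must equal number of blocks * block_size")
    out = []
    for xi, si, zi in zip(x, s, z):
        q = round(xi / si) + zi            # note the '+ z', not '- z'
        out.append(max(-128, min(127, q)))  # saturate to the int8 range
    return out


def dequantize_blockwise(q, scales, zero_points, block_size):
    """Inverse mapping: x = (q - z) * s per block."""
    s = block_repeat_1d(scales, block_size)
    z = block_repeat_1d(zero_points, block_size)
    if len(s) != len(q) or len(z) != len(q):
        raise ValueError("len(q) must equal number of blocks * block_size")
    return [(qi - zi) * si for qi, si, zi in zip(q, s, z)]


if __name__ == "__main__":
    x = [1.0, 2.0, 30.0, 40.0]
    q = quantize_blockwise(x, scales=[0.5, 10.0], zero_points=[1, -2], block_size=2)
    print(q)
    print(dequantize_blockwise(q, [0.5, 10.0], [1, -2], 2))
```

With two blocks of two elements each, the first pair is quantized with scale 0.5 and zero-point 1, the second pair with scale 10.0 and zero-point -2.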
Yuantao Feng	23b244d3a3	Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp dnn: optimize activations with v_exp #25881 Merge with https://github.com/opencv/opencv_extra/pull/1191. This PR optimizes the following activations: - [x] Swish - [x] Mish - [x] Elu - [x] Celu - [x] Selu - [x] HardSwish ### Performance (Updated on 2024-07-18) #### AmLogic A311D2 (ARM Cortex A73 + A53) ``` Geometric mean (ms) Name of Test activations activations.patch activations.patch vs activations (x-factor) Celu::Layer_Elementwise::OCV/CPU 115.859 27.930 4.15 Elu::Layer_Elementwise::OCV/CPU 27.846 27.003 1.03 Gelu::Layer_Elementwise::OCV/CPU 0.657 0.602 1.09 HardSwish::Layer_Elementwise::OCV/CPU 31.885 6.781 4.70 Mish::Layer_Elementwise::OCV/CPU 35.729 32.089 1.11 Selu::Layer_Elementwise::OCV/CPU 61.955 27.850 2.22 Swish::Layer_Elementwise::OCV/CPU 30.819 26.688 1.15 ``` #### Apple M1 ``` Geometric mean (ms) Name of Test activations activations.patch activations.patch vs activations (x-factor) Celu::Layer_Elementwise::OCV/CPU 16.184 2.118 7.64 Celu::Layer_Elementwise::OCV/CPU_FP16 16.280 2.123 7.67 Elu::Layer_Elementwise::OCV/CPU 9.123 1.878 4.86 Elu::Layer_Elementwise::OCV/CPU_FP16 9.085 1.897 4.79 Gelu::Layer_Elementwise::OCV/CPU 0.089 0.081 1.11 Gelu::Layer_Elementwise::OCV/CPU_FP16 0.086 0.074 1.17 HardSwish::Layer_Elementwise::OCV/CPU 1.560 1.555 1.00 HardSwish::Layer_Elementwise::OCV/CPU_FP16 1.536 1.523 1.01 Mish::Layer_Elementwise::OCV/CPU 6.077 2.476 2.45 Mish::Layer_Elementwise::OCV/CPU_FP16 5.990 2.496 2.40 Selu::Layer_Elementwise::OCV/CPU 11.351 1.976 5.74 Selu::Layer_Elementwise::OCV/CPU_FP16 11.533 1.985 5.81 Swish::Layer_Elementwise::OCV/CPU 4.687 1.890 2.48 Swish::Layer_Elementwise::OCV/CPU_FP16 4.715 1.873 2.52 ``` #### Intel i7-12700K ``` Geometric mean (ms) Name of Test activations activations.patch activations.patch vs activations (x-factor) Celu::Layer_Elementwise::OCV/CPU 17.106 3.560 4.81 Elu::Layer_Elementwise::OCV/CPU 5.064 3.478 
1.46 Gelu::Layer_Elementwise::OCV/CPU 0.036 0.035 1.04 HardSwish::Layer_Elementwise::OCV/CPU 2.914 2.893 1.01 Mish::Layer_Elementwise::OCV/CPU 3.820 3.529 1.08 Selu::Layer_Elementwise::OCV/CPU 10.799 3.593 3.01 Swish::Layer_Elementwise::OCV/CPU 3.651 3.473 1.05 ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-07-19 16:03:19 +03:00
HAN Liutong	b5ea32158a	Merge pull request #25883 from hanliutong:rvv-intrin-upgrade Upgrade RISC-V Vector intrinsic and cleanup the obsolete RVV backend. #25883 This patch upgrades the RISC-V Vector intrinsics from `v0.10` to `v0.12`/`v1.0`: - Update cmake check and options; - Upgrade the RVV implementation of Universal Intrinsics; - Upgrade the RVV-optimized DNN kernel. - Clean up the obsolete RVV backend (`intrin_rvv.hpp`) and the compatibility header file. With this patch, the RVV backend requires Clang 17+ or GCC 14+ (which means `__riscv_v_intrinsic >= 12000`, see https://godbolt.org/z/es7ncETE3) This patch was tested with Clang 17.0.6 (requires extra `-DWITH_PNG=OFF` due to ICE), Clang 18.1.8 and GCC 14.1.0 on QEMU and k230 (with `--gtest_filter="hal_"`). ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-07-19 11:41:42 +03:00
zihaomu	1125755345	Merge pull request #25931 from zihaomu:clean_code code clean #25931 Align code and remove redundant CMake code ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-07-18 17:18:37 +03:00
Aliaksei Urbanski	35ca2f78d6	Merge pull request #25880 from Jamim:fix/cuda-no-fp16 Fix CUDA for old GPUs without FP16 support #25880 Fixes #21461 ~This is a build-time solution that reflects https://github.com/opencv/opencv/blob/4.10.0/modules/dnn/src/cuda4dnn/init.hpp#L68-L82.~ ~We shouldn't add an invalid target while building with `CUDA_ARCH_BIN` < 53.~ _(please see [this discussion](https://github.com/opencv/opencv/pull/25880#discussion_r1668074505))_ This is a run-time solution that basically reverts [these lines](`d0fe6ad109 (diff-757c5ab6ddf2f99cdd09f851e3cf17abff203aff4107d908c7ad3d0466f39604L245-R245)`). I've debugged these changes, [coupled with other fixes](https://github.com/gentoo/gentoo/pull/37479), on [Gentoo Linux](https://www.gentoo.org/) and [related tests passed](https://github.com/user-attachments/files/16135391/opencv-4.10.0.20240708-224733.log.gz) on my laptop with `GeForce GTX 960M`. Alternative solution: - #21462 _Best regards!_ ### Pull Request Readiness Checklist - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] `n/a` There is accuracy test, performance test and test data in opencv_extra repository, if applicable - [ ] `n/a` The feature is well documented and sample code can be built with the project CMake	2024-07-10 12:39:30 +03:00
Yuantao Feng	e3858cc5a3	Merge pull request #25147 from fengyuentau:dnn/elementwise_layers/speedup * added v_erf and implemented gelu acceleration via vectorization * remove anonymous v_erf and use v_erf from intrin_math * enable perf for ov and cuda backend	2024-07-08 14:24:36 +03:00
Abduragim Shtanchaev	efbc9f0b66	Merge pull request #25861 from Abdurrahheem:ash/torch-attention-export-fix-4x Support for Unflatten operation required by Attention layer - 4.x #25861 ### Pull Request Readiness Checklist All test data and models for PR are located [#1190](https://github.com/opencv/opencv_extra/pull/1190) This PR fixes an issue raised when importing a batched vanilla `Attention` layer from `PyTorch` via ONNX. Currently the batched version of the `Attention` layer in PyTorch [has unflatten operation inside](`e3b3431c42/torch/nn/functional.py (L5500C17-L5500C31)`). The `unflatten` operation causes an issue in the `reshape` layer (see the Reshape_2 in the graph below) due to incorrect output of the `slice` layer. This PR particularly fixes the `slice` and `concat` layers to handle the `unflatten` operation. <img width="673" alt="image" src="https://github.com/opencv/opencv/assets/44877829/5b612b31-657a-47f1-83a4-0ac35a950abd"> See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-07-04 16:25:31 +03:00
Yuantao Feng	5510718381	Merge pull request #25810 from fengyuentau:python/fix_parsing_3d_mat_in_dnn python: attempts to fix 3d mat parsing problem for dnn #25810 Fixes https://github.com/opencv/opencv/issues/25762 https://github.com/opencv/opencv/issues/23242 Relates https://github.com/opencv/opencv/issues/25763 https://github.com/opencv/opencv/issues/19091 Although `cv.Mat` has already been introduced to workaround this problem, people do not know it and it kind of leads to confusion with `numpy.array`. This patch adds a "switch" to turn off the auto multichannel feature when the API is from cv::dnn::Net (more specifically, `setInput`) and the parameter is of type `Mat`. This patch only leads to changes of three places in `pyopencv_generated_types_content.h`: ```.diff static PyObject* pyopencv_cv_dnn_dnn_Net_setInput(PyObject* self, PyObject* py_args, PyObject* kw) { ... - pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 0)) && + pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 8)) && ... } // I guess we also need to change this as one-channel blob is expected for param static PyObject* pyopencv_cv_dnn_dnn_Net_setParam(PyObject* self, PyObject* py_args, PyObject* kw) { ... - pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 0)) ) + pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 8)) ) ... - pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 0)) ) + pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 8)) ) ... } ``` Others are unchanged, e.g. `dnn_SegmentationModel` and stuff like that. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. 
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-07-04 08:33:20 +03:00
Alexander Smorkalov	25fb55601b	Fixed narrowing conversion warning with MSVC compiler.	2024-07-03 12:10:31 +03:00
Yuantao Feng	a7fd9446cf	Merge pull request #25630 from fengyuentau:nary-multi-thread dnn: parallelize nary elementwise forward implementation & enable related conformance tests #25630 This PR introduces the following changes: - [x] Parallelize binary forward impl - [x] Parallelize ternary forward impl (Where) - [x] Parallelize nary (Operator that can take >=1 operands) - [x] Enable conformance tests if workable ## Performance ### i7-12700K, RAM 64GB, Ubuntu 22.04 ``` Geometric mean (ms) Name of Test opencv opencv opencv perf perf perf core.x64.0606 core.x64.0606 core.x64.0606 vs opencv perf core.x64.0606 (x-factor) NCHW_C_sum::Layer_NaryEltwise::OCV/CPU 16.116 11.161 1.44 NCHW_NCHW_add::Layer_NaryEltwise::OCV/CPU 17.469 11.446 1.53 NCHW_NCHW_div::Layer_NaryEltwise::OCV/CPU 17.531 11.469 1.53 NCHW_NCHW_equal::Layer_NaryEltwise::OCV/CPU 28.653 13.682 2.09 NCHW_NCHW_greater::Layer_NaryEltwise::OCV/CPU 21.899 13.422 1.63 NCHW_NCHW_less::Layer_NaryEltwise::OCV/CPU 21.738 13.185 1.65 NCHW_NCHW_max::Layer_NaryEltwise::OCV/CPU 16.172 11.473 1.41 NCHW_NCHW_mean::Layer_NaryEltwise::OCV/CPU 16.309 11.565 1.41 NCHW_NCHW_min::Layer_NaryEltwise::OCV/CPU 16.166 11.454 1.41 NCHW_NCHW_mul::Layer_NaryEltwise::OCV/CPU 16.157 11.443 1.41 NCHW_NCHW_pow::Layer_NaryEltwise::OCV/CPU 163.459 15.234 10.73 NCHW_NCHW_ref_div::Layer_NaryEltwise::OCV/CPU 10.880 10.868 1.00 NCHW_NCHW_ref_max::Layer_NaryEltwise::OCV/CPU 10.947 11.058 0.99 NCHW_NCHW_ref_min::Layer_NaryEltwise::OCV/CPU 10.948 10.910 1.00 NCHW_NCHW_ref_mul::Layer_NaryEltwise::OCV/CPU 10.874 10.871 1.00 NCHW_NCHW_ref_sum::Layer_NaryEltwise::OCV/CPU 10.971 10.920 1.00 NCHW_NCHW_sub::Layer_NaryEltwise::OCV/CPU 17.546 11.462 1.53 NCHW_NCHW_sum::Layer_NaryEltwise::OCV/CPU 16.175 11.475 1.41 NHWC_C::Layer_NaryEltwise::OCV/CPU 11.339 11.333 1.00 NHWC_H::Layer_NaryEltwise::OCV/CPU 16.154 11.102 1.46 ``` ### Apple M1, RAM 16GB, macOS 14.4.1 ``` Geometric mean (ms) Name of Test opencv opencv opencv perf perf perf core.m1.0606 
core.m1.0606.patch core.m1.0606.patch vs opencv perf core.m1.0606 (x-factor) NCHW_C_sum::Layer_NaryEltwise::OCV/CPU 28.418 3.768 7.54 NCHW_NCHW_add::Layer_NaryEltwise::OCV/CPU 6.942 5.679 1.22 NCHW_NCHW_div::Layer_NaryEltwise::OCV/CPU 5.822 5.653 1.03 NCHW_NCHW_equal::Layer_NaryEltwise::OCV/CPU 5.751 5.628 1.02 NCHW_NCHW_greater::Layer_NaryEltwise::OCV/CPU 5.797 5.599 1.04 NCHW_NCHW_less::Layer_NaryEltwise::OCV/CPU 7.272 5.578 1.30 NCHW_NCHW_max::Layer_NaryEltwise::OCV/CPU 5.777 5.562 1.04 NCHW_NCHW_mean::Layer_NaryEltwise::OCV/CPU 5.819 5.559 1.05 NCHW_NCHW_min::Layer_NaryEltwise::OCV/CPU 5.830 5.574 1.05 NCHW_NCHW_mul::Layer_NaryEltwise::OCV/CPU 5.759 5.567 1.03 NCHW_NCHW_pow::Layer_NaryEltwise::OCV/CPU 342.260 74.655 4.58 NCHW_NCHW_ref_div::Layer_NaryEltwise::OCV/CPU 8.338 8.280 1.01 NCHW_NCHW_ref_max::Layer_NaryEltwise::OCV/CPU 8.359 8.309 1.01 NCHW_NCHW_ref_min::Layer_NaryEltwise::OCV/CPU 8.412 8.295 1.01 NCHW_NCHW_ref_mul::Layer_NaryEltwise::OCV/CPU 8.380 8.297 1.01 NCHW_NCHW_ref_sum::Layer_NaryEltwise::OCV/CPU 8.356 8.323 1.00 NCHW_NCHW_sub::Layer_NaryEltwise::OCV/CPU 6.818 5.561 1.23 NCHW_NCHW_sum::Layer_NaryEltwise::OCV/CPU 5.805 5.570 1.04 NHWC_C::Layer_NaryEltwise::OCV/CPU 3.834 4.817 0.80 NHWC_H::Layer_NaryEltwise::OCV/CPU 28.402 3.771 7.53 ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. 
- [ ] The feature is well documented and sample code can be built with the project CMake	2024-07-03 10:09:05 +03:00
Abduragim Shtanchaev	a8d1373919	Merge pull request #25794 from Abdurrahheem:ash/yolov10-support Add sample support of YOLOv9 and YOLOv10 in OpenCV #25794 This PR adds sample support of [`YOLOv9`](https://github.com/WongKinYiu/yolov9) and [`YOLOv10`](https://github.com/THU-MIG/yolov10/tree/main) in OpenCV. Models for this test are located in this [PR](https://github.com/opencv/opencv_extra/pull/1186).

Running YOLOv10 using OpenCV.

1. In order to run `YOLOv10` one needs to cut off the postprocessing with dynamic shapes from torch and then convert the model to ONNX. If someone is looking for a ready solution, there is [this forked branch](https://github.com/Abdurrahheem/yolov10/tree/ash/opencv-export) from the official YOLOv10. In particular, follow this procedure.

```bash
git clone git@github.com:Abdurrahheem/yolov10.git
conda create -n yolov10 python=3.9
conda activate yolov10
pip install -r requirements.txt
python export_opencv.py --model=<model-name> --imgsz=<input-img-size>
```

By default `model="yolov10s"` and `imgsz=(480,640)`. This will generate the file `yolov10s.onnx`, which can be used for inference in OpenCV.

2. For the inference part in OpenCV, one can use the `yolo_detector.cpp` [sample](https://github.com/opencv/opencv/blob/4.x/samples/dnn/yolo_detector.cpp). If you have followed the exporting procedure above, you can use the following command to run the model.

```bash
# build opencv from source
cd build
./bin/example_dnn_yolo_detector --model=<path-to-yolov10s.onnx-file> --yolo=yolov10 --width=640 --height=480 --input=<path-to-image> --scale=0.003921568627 --padvalue=114
```

If you do not specify the `--input` argument, OpenCV will grab the first camera that is available on your platform. For more details on how to run the `yolo_detector.cpp` file see this [guide](https://docs.opencv.org/4.x/da/d9d/tutorial_dnn_yolo.html#autotoc_md443)

Running YOLOv9 using OpenCV

1. Export the model following the [official guide](https://github.com/WongKinYiu/yolov9) of the YOLOv9 repository. In particular, you can do the following to convert it.

```bash
git clone https://github.com/WongKinYiu/yolov9.git
cd yolov9
conda create -n yolov9 python=3.9
conda activate yolov9
pip install -r requirements.txt
wget https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-t-converted.pt
python export.py --weights=./yolov9-t-converted.pt --include=onnx --img-size=(480,640)
```

This will generate the <yolov9-t-converted.onnx> file.

2. Inference in OpenCV.

```bash
# build opencv from source
cd build
./bin/example_dnn_yolo_detector --model=<path-to-yolov9-t-converted.onnx> --yolo=yolov9 --width=640 --height=480 --scale=0.003921568627 --padvalue=114 --path=<path-to-image>
```

### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-07-02 18:26:34 +03:00
Wanli	6e1864e3fc	Merge pull request #24941 from WanliZhong:v_exp Add support for v_exp (exponential) #24941 This PR aims to implement `v_exp(v_float16 x)`, `v_exp(v_float32 x)` and `v_exp(v_float64 x)`. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-07-02 12:32:49 +03:00
Alexander Smorkalov	3d74d646d8	Fixed CuDNN runtime version check for CuDNN 9+.	2024-07-01 17:33:24 +03:00
Yuantao Feng	3f13ce797b	Merge pull request #25779 from fengyuentau:dnn/fix_onnx_depthtospace dnn: add DepthToSpace and SpaceToDepth #25779 We are working on updating WeChat QRCode module. One of the new models is a fully convolutional model and hence it should be able to run with different input shapes. However, it has an operator `DepthToSpace`, which is parsed as a subgraph of `Reshape -> Permute -> Reshape` with a fixed shape obtained during parsing. The subgraph itself is not a problem; the real problem is that its input and output shapes stay fixed regardless of changes in the input. This does not allow the model to run with different input shapes. The solution is to add a dedicated layer for DepthToSpace and SpaceToDepth. Backend support: - [x] CPU - [x] CUDA - [x] OpenCL - [x] OpenVINO - [x] CANN - [x] TIMVX - ~Vulkan~ (missing fundamental tools, like permutation and reshape) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-06-21 19:28:22 +03:00
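To illustrate why a dedicated DepthToSpace layer (commit 3f13ce797b) copes with varying input shapes, here is a minimal pure-Python sketch. It is not the OpenCV layer: it assumes the ONNX "DCR" ordering, a single image stored as nested lists in `[C][H][W]` layout, and a hypothetical function name `depth_to_space_dcr`. The point is that the output shape is derived from the actual input on every call rather than frozen at parse time.

```python
def depth_to_space_dcr(x, block):
    """Rearrange channels into spatial blocks (DCR ordering, assumed here).

    x is a nested list of shape [C][H][W]; the result has shape
    [C / block^2][H * block][W * block]. All sizes are read from x itself,
    so the same function works for any input resolution.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (block * block) != 0:
        raise ValueError("channel count must be divisible by block * block")
    c_out = c_in // (block * block)

    y = [[[0] * (w * block) for _ in range(h * block)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(block):          # row offset inside each output block
            for j in range(block):      # column offset inside each output block
                src = x[(i * block + j) * c_out + c]
                for r in range(h):
                    for col in range(w):
                        y[c][r * block + i][col * block + j] = src[r][col]
    return y


if __name__ == "__main__":
    # Four 1x1 channels become one 2x2 channel.
    print(depth_to_space_dcr([[[0]], [[1]], [[2]], [[3]]], 2))
    # The same call works unchanged for a wider 1x2 input.
    print(depth_to_space_dcr([[[0, 4]], [[1, 5]], [[2, 6]], [[3, 7]]], 2))
```

The index mapping is equivalent to the `Reshape -> Permute -> Reshape` subgraph mentioned in the commit message, except that no shape is baked in ahead of time.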
Dmitry Kurtaev	3700f9e1e9	Merge pull request #25709 from dkurt:wrap_addLayer * Wrap dnn addLayer * Add typing stubs	2024-06-07 20:39:44 +03:00
Kumataro	1bd5ca1ebe	Merge pull request #25686 from Kumataro:fix25674 Suppress build warnings for GCC14 #25686 Close #25674 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-06-02 14:14:04 +03:00
CNOCycle	98b8825031	Merge pull request #25613 from CNOCycle:tflite/ops Support Global_Pool_2D ops in .tflite model #25613 ### Pull Request Readiness Checklist Merge with extra: https://github.com/opencv/opencv_extra/pull/1180 This PR adds support for `GlobalAveragePooling2D` and `GlobalMaxPool2D` on the TFlite backend. When the `keep_dims` option is enabled, the output is a 2D tensor, necessitating the inclusion of an additional flatten layer. Additionally, the names of these layers have been updated to match the output tensor names generated by `generate.py` from the opencv_extra repository. - [X] I agree to contribute to the project under Apache 2 License. - [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [X] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [X] The feature is well documented and sample code can be built with the project CMake	2024-05-31 19:31:21 +03:00
Abduragim Shtanchaev	d7f04a9d33	Merge pull request #25660 from Abdurrahheem:ash/fix-slice-empty-input Slice layer parser fix to support empty input case #25660 This PR fixes the Slice layer's parser to handle empty input cases (cases with an initializer). It fixes the issue raised in #24838 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-05-31 13:13:36 +03:00
Danial Javady	05e48605a0	Merge pull request #25412 from ZelboK:update-cudnn-to-9 Refactor DNN module to build with cudnn 9 #25412 A lot of APIs that are currently being used in the dnn module have been removed in cudnn 9. They were deprecated in 8. This PR updates said code accordingly to the newer API. Some key notes: 1) This is my first PR. I am new to OpenCV. 2) `opencv_test_core` tests pass 3) On a 3080, CUDA 12.4 (should be irrelevant since I didn't build the `opencv_modules`), gcc 11.4, WSL 2. 4) For brevity I will avoid including macro code that will allow for older versions of cudnn to build. I was unable to get the tests working for `opencv_test_dnn` and `opencv_perf_dnn`. The errors I get are of the following: ``` OpenCV tests: Can't find required data file: dnn/onnx/conformance/node/test_reduce_prod_default_axes_keepdims_example/model.onnx in function 'findData' " thrown in the test body. ``` So before I spend more time investigating I was hoping to get a maintainer to point me in the right direction here. I would like to run these tests and confirm things are working as intended. I may have missed some details. ### Pull Request Readiness Checklist Relevant issue: https://github.com/opencv/opencv/issues/24983 - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-05-28 09:54:08 +03:00
Alexander Smorkalov	0b39a51be8	pre: OpenCV 4.10.0 (version++).	2024-05-21 11:37:05 +03:00
Alexander Smorkalov	5f509e2ec1	Skip Test_Caffe_layers.Concat with Vulkan due to sporadic failures.	2024-05-17 11:54:25 +03:00
Yuantao Feng	bc0618b688	Merge pull request #25582 from fengyuentau:dnn/dump_pbtxt The current net exporters `dump` and `dumpToFile` export the network structure (and its params) to a .dot file which works with `graphviz`. This is hard to use and not friendly to new users. What's worse, the rendered picture does not look good. dnn: better net exporter that works with netron #25582 This PR introduces a new exporter `dumpToPbtxt` and uses this new exporter by default with environment variable `OPENCV_DNN_NETWORK_DUMP`. It mimics the string output of an ONNX model, with dnn-specific modifications; see below for an example. ![image](https://github.com/opencv/opencv/assets/17219438/0644bed1-da71-4019-8466-88390698e4df) ## Usage Call `cv::dnn::Net::dumpToPbtxt`: ```cpp TEST(DumpNet, dumpToPbtxt) { std::string path = "/path/to/model.onnx"; auto net = readNet(path); Mat input(std::vector<int>{1, 3, 640, 480}, CV_32F); net.setInput(input); net.dumpToPbtxt("yunet.pbtxt"); } ``` Set `export OPENCV_DNN_NETWORK_DUMP=1` ```cpp TEST(DumpNet, env) { std::string path = "/path/to/model.onnx"; auto net = readNet(path); Mat input(std::vector<int>{1, 3, 640, 480}, CV_32F); net.setInput(input); net.forward(); } ``` --- Note: - `pbtxt` is registered as one of the ONNX model suffixes in netron. So you can see `module: ai.onnx` and such in the model. - We can get the string output of an ONNX model with the following script ```python import onnx net = onnx.load("/path/to/model.onnx") net_str = str(net) file = open("/path/to/model.pbtxt", "w") file.write(net_str) file.close() ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-05-17 11:07:05 +03:00
Alexander Smorkalov	78ed6de518	Merge pull request #25594 from LaurentBerger:I25587 typo	2024-05-16 08:46:56 +03:00
CNOCycle	7713c84465	Merge pull request #25297 from CNOCycle:tflite/transpose Support Transpose op in TFlite #25297 Merge with extra: https://github.com/opencv/opencv_extra/pull/1168 The purpose of this PR is to introduce support for the Transpose op in TFlite format and to add a shape comparison between the output tensors and the references. In some cases, the shape of the output tensor is `[1,4,1,1]`, while the shape of the reference tensor is `[1,4]`. Consequently, the norm check incorrectly reports that the test has passed, as the residual is zero. Below is a Python script for generating testing data. The generated data can be integrated into the repo `opencv_extra`. ```python import numpy as np import tensorflow as tf PREFIX_TFL = '/path/to/opencv_extra/testdata/dnn/tflite/' def generator(input_tensor, model, saved_name): # convert keras model to .tflite format converter = tf.lite.TFLiteConverter.from_keras_model(model) #converter.optimizations = [tf.lite.Optimize.DEFAULT] converter.optimizations = [None] tflite_model = converter.convert() with open(f'{PREFIX_TFL}/{saved_name}.tflite', 'wb') as f: f.write(tflite_model) # save the input tensor to .npy if input_tensor.ndim == 4: opencv_tensor = np.transpose(input_tensor, (0,3,1,2)) else: opencv_tensor = input_tensor opencv_tensor = np.copy(opencv_tensor, order='C').astype(np.float32) np.save(f'{PREFIX_TFL}/{saved_name}_inp.npy', opencv_tensor) # generate output tensor and save it to .npy mat_out = model(input_tensor).numpy() mat_out = np.copy(mat_out, order='C').astype(np.float32) if mat_out.ndim == 4: mat_out = np.transpose(mat_out, (0,3,1,2)) interpreter = tf.lite.Interpreter(model_content=tflite_model) out_name = interpreter.get_output_details()[0]['name'] np.save(f'{PREFIX_TFL}/{saved_name}_out_{out_name}.npy', mat_out) def build_transpose(): model_name = "keras_permute" mat_in = np.array([[[1,2,3], [4,5,6]]], dtype=np.float32) model = tf.keras.Sequential() model.add(tf.keras.Input(shape=(2,3))) 
model.add(tf.keras.layers.Permute((2,1))) model.summary() generator(mat_in, model, model_name) if __name__ == '__main__': build_transpose() ``` ### Pull Request Readiness Checklist - [x] I agree to contribute to the project under Apache 2 License. - [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [X] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [X] The feature is well documented and sample code can be built with the project CMake	2024-05-15 20:07:25 +03:00
unknown	5009109167	typo	2024-05-15 16:16:07 +02:00
Laurent Berger	76d9f7aaeb	Merge pull request #25591 from LaurentBerger:I25589 Remove dnn::layer::allocate in doc #25591 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work #25589 - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-05-15 17:08:52 +03:00
alexlyulkov	03507e06b4	Merge pull request #25518 from alexlyulkov:al/fixed-gemm-openvino Fixed OpenVINO gemm layer #25518 Fixed OpenVINO gemm layer The problem was that our layer didn't properly handle all the possible gemm options in OpenVINO mode Fixes #25472 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-05-14 17:41:19 +03:00
Alexander Smorkalov	d8e18f4576	Made fcn-resnet50-12.onnx model optional.	2024-05-03 16:14:22 +03:00
Alexander Smorkalov	ac9a858377	Merge pull request #25524 from alexlyulkov:al/openvino-layers Added more OpenVINO layers to dnn	2024-05-03 13:16:56 +03:00
Wanli	ed47cce1c5	change fcn8s-heavy-pascal tests from caffe to onnx	2024-05-03 00:15:09 +08:00
Alexander Lyulkov	f3f29fa62c	Added more OpenVINO layers to dnn	2024-05-02 14:37:40 +03:00
alexlyulkov	f9dd20eb07	Merge pull request #25414 from alexlyulkov:al/range-fixed Fixed ONNX range layer #25414 Partially address https://github.com/opencv/opencv/issues/25363 Fixed ONNX range layer. It should support any input type. Added tests (extra [PR](https://github.com/opencv/opencv_extra/pull/1170)) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-04-17 09:38:21 +03:00
Alexander Smorkalov	ecbfc1bfd8	Merge pull request #25395 from susumu-iino:fix-dnn-plugin-build-win32 Fix dnn plugin build win32	2024-04-12 11:05:34 +03:00
Yuantao Feng	197626a5bf	Merge pull request #25387 from fengyuentau:complete-float16_t-renaming Rename remaining float16_t for future proof #25387 Resolves comment: https://github.com/opencv/opencv/pull/25217#discussion_r1547733187. `std::float16_t` and `std::bfloat16_t` are introduced since c++23: https://en.cppreference.com/w/cpp/types/floating-point. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-04-11 14:02:44 +03:00
Alexander Smorkalov	e4677fbf64	Merge pull request #25361 from hanliutong:rvv-f32 Further optimize fastDepthwiseConv for RISC-V Vector.	2024-04-09 16:04:02 +03:00
ecchen	e63690a2d9	Add a shape checker for tflite models	2024-04-08 13:28:05 +00:00

2313 Commits