Commit Graph

2429 Commits

Author SHA1 Message Date
alexlyulkov
a2fa1d49a4
Merge pull request #26208 from alexlyulkov:al/new-engine-caffe-parser
Modified Caffe parser to support the new dnn engine #26208

Now the Caffe parser supports both the old and the new engine. It can be selected using newEngine argument in PopulateNet.

All cpu Caffe tests work fine except:

- Test_Caffe_nets.Colorization
- Test_Caffe_layers.FasterRCNN_Proposal

Both these tests doesn't work because of the bug in the new net.forward function. The function takes the name of the desired target last layer, but uses this name as the name of the desired output tensor.
Also Colorization test contains a strange model with a Silence layer in the end, so it doesn't have outputs. The old parser just ignored it. I think, the proper solution is to run this model until the (number_of_layers - 2) layer using proper net.forward arguments in the test.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-10-28 11:32:07 +03:00
Alexander Lyulkov
3a4c88c33e Added exception when calling forward to specified layer with the new dnn engine 2024-10-25 13:00:15 +03:00
alexlyulkov
a40ceff215
Merge pull request #26330 from alexlyulkov:al/new-engine-tflite-parser2
Modified TFLite parser for the new dnn engine #26330

The new dnn graph is creating just by defining input and output names of each layer.
Some TFLite layers has fused activation, which doesn't have layer name and input and output names. Also some layers require additional preprocessing layers (e.g. NHWC -> NCHW). All these layers should be added to the graph with some unique layer and input and output names. 

I solve this problem by adding additionalPreLayer and additionalPostLayer layers.

If a layer has a fused activation, I add additionalPostLayer and change input and output names this way:
**original**: conv_relu(conv123, conv123_input, conv123_output)
**new**: conv(conv123, conv123_input, conv123_output_additional_post_layer) + relu(conv123_relu,  conv1_output_additional_post_layer, conv123_output)

If a layer has additional preprocessing layer, I change input and output names this way:
**original**: permute_reshape(reshape345, reshape345_input, reshape345_output)
**new**: permute(reshape345_permute, reshape345_input, reshape345_input_additional_pre_layer) + reshape(reshape345, reshape345_input_additional_pre_layer, reshape345_output)


### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-10-22 09:05:58 +03:00
Vadim Pisarevsky
3cd57ea09e
Merge pull request #26056 from vpisarev:new_dnn_engine
New dnn engine #26056

This is the 1st PR with the new engine; CI is green and PR is ready to be merged, I think.
Merge together with https://github.com/opencv/opencv_contrib/pull/3794

---

**Known limitations:**
* [solved] OpenVINO is temporarily disabled, but is probably easy to restore (it's not a deal breaker to merge this PR, I guess)
* The new engine does not support any backends nor any targets except for the default CPU implementation. But it's possible to choose the old engine when loading a model, then all the functionality is available.
* [Caffe patch is here: #26208] The new engine only supports ONNX. When a model is constructed manually or is loaded from a file of different format (.tf, .tflite, .caffe, .darknet), the old engine is used.
* Even in the case of ONNX some layers are not supported by the new engine, such as all quantized layers (including DequantizeLinear, QuantizeLinear, QLinearConv etc.), LSTM, GRU, .... It's planned, of course, to have full support for ONNX by OpenCV 5.0 gold release. When a loaded model contains unsupported layers, we switch to the old engine automatically  (at ONNX parsing time, not at `forward()` time).
* Some layers , e.g. Expat, are only partially supported by the new engine. In the case of unsupported flavours it switches to the old engine automatically (at ONNX parsing time, not at `forward()` time).
* 'Concat' graph optimization is disabled. The optimization eliminates Concat layer and instead makes the layers that generate tensors to be concatenated to write the outputs to the final destination. Of course, it's only possible when `axis=0` or `axis=N=1`. The optimization is not compatible with dynamic shapes since we need to know in advance where to store the tensors. Because some of the layer implementations have been modified to become more compatible with the new engine, the feature appears to be broken even when the old engine is used.
* Some `dnn::Net` API is not available with the new engine. Also, shape inference may return false if some of the output or intermediate tensors' shapes cannot be inferred without running the model. Probably this can be fixed by a dummy run of the model with zero inputs.
* Some overloads of `dnn::Net::getFLOPs()` and `dnn::Net::getMemoryConsumption()` are not exposed any longer in wrapper generators; but the most useful overloads are exposed (and checked by Java tests).
* [in progress] A few Einsum tests related to empty shapes have been disabled due to crashes in the tests and in Einsum implementations. The code and the tests need to be repaired.
* OpenCL implementation of Deconvolution is disabled. It's very bad and very slow anyway; need to be completely revised.
* Deconvolution3D test is now skipped, because it was only supported by CUDA and OpenVINO backends, both of which are not supported by the new engine.
* Some tests, such as FastNeuralStyle, checked that the in the case of CUDA backend there is no fallback to CPU. Currently all layers in the new engine are processed on CPU, so there are many fallbacks. The checks, therefore, have been temporarily disabled.

---

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-10-16 15:28:19 +03:00
Alexander Smorkalov
cb3af0a08f Merge branch 4.x 2024-09-23 14:18:25 +03:00
Maksim Shabunin
7bdc618697
Merge pull request #26025 from mshabunin:cpp-videoio-highgui
 Potential conflicts with #25958
C-API cleanup: highgui, videoio #26025

  Merge with: opencv/opencv_contrib#3780

This PR removes usage of C-API from highgui and videoio modules. Only source code is affected, tests were not using obsolete API.

It should be possible to backport these changes to 4.x branch preserving removed public headers and source files (`*_c.h` and `*_c.cpp`).


#### Checklist

I tried to verify as many backends as possible, though these checks were not as thorough as I'd like them to be. Below is the checklist covering all modified backends with their statuses.

> 🔹 - small changes
> 🟢 - consider working
>  - considered untested

##### highgui

Pass | Backend | Local check | CI check
-----|---------|-------------|---------
🟢 | GTK2 | build + test, plugin build | build + test  
🟢 | GTK3 | build + test, plugin build | build + test
🟢 | QT | build + test, plugin build |
 | Wayland 🔹 | |
🟢 | WIN32 🔹 | | build + test
🟢 | Cocoa 🔹 | | build + test
 | WinRT | | 

##### videoio 

Pass | Backend | Local check | CI check
-----|---------|-------------|---------
🟢 | Android Camera/MediaNDK 🔹 | | build
🟢 | Aravis | build |
🟢 | AVFoundation OSX | | build + test
 | AVFoundation iOS | | build
🟢 | DC1394 | build |
🟢 | DShow 🔹 | | build
🟢 | FFMpeg | build, plugin build | build + test
🟢 | GPhoto 🔹 | build |
🟢 | GStreamer | build, plugin build | build + test
🟢 | Images | build | build + test
🟢 | MSMF 🔹 | | build + test
🟢 | OpenNI | build |
🟢 | PVAPI | build |
🟢 | V4L | build + test | build
🟢 | XIMEA | build |
🟢 | XINE 🔹 | build |

#### Notes

- local linux build checks performed using [this framework](https://github.com/mshabunin/opencv-videoio-build-check)
- minor extra changes made in both `cap_avfoundation*.mm` to make them slightly more synchronized - it would be better to combine them into a single one in the future
- configurations with plugins have been build but not tested
- **moved unrelated changes to separate PRs** ~two issues have been fixed in separate commits:~
  - ~imgproc: missing `cv::hal::` color conversion functions has been used in MediaSDK backend~
  - ~videoio/V4L: wrong color conversion mode caused bad colors for NV12 camera input format (RGB instead of BGR)~

It would be nice to check following functionality manually:
- [ ] OSX: camera input
- [ ] iOS: camera and file input
- [ ] WinRT: build, some testing
- [x] Linux/Wayland: build
2024-09-09 16:42:44 +03:00
Yuantao Feng
ce5823c5eb
Merge pull request #26124 from fengyuentau:dnn/topk_dtype
dnn(5.x): handle topk data type #26124

Resolves https://github.com/opencv/opencv/issues/26076

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-09-09 15:15:04 +03:00
Alexander Smorkalov
209802c9f6 Leaky RELU support for TFLite. 2024-09-09 12:40:35 +03:00
Abduragim Shtanchaev
8263c804de
Merge pull request #26106 from Abdurrahheem:ash/add-gatherND
Support for GatherND layer #26106

This PR adds support for GatherND layer. The layer was in comformance deny list initially.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-09-09 11:37:33 +03:00
Abduragim Shtanchaev
a18d793dbd
Merge pull request #26110 from Abdurrahheem:ash/hardmax-backend-fix
Switch off OpenVINO backend for Hardmax #26110

Fix Hardmax Issue with backend mentioned in #26107 

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-09-06 13:14:25 +03:00
Gursimar Singh
f8fb3a7f55
Merge pull request #25515 from gursimarsingh:improved_edge_detection_sample
#25006 #25314 
This pull request removes hed_pretrained caffe model to the SOTA dexined onnx model for edge detection. Usage of conventional methods like canny has also been added

The obsolete cpp and python sample has been removed

TODO:
- [  ]  Remove temporary hack for quantized models. Refer issue https://github.com/opencv/opencv_zoo/issues/273

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-09-06 12:47:04 +03:00
Abduragim Shtanchaev
0f8bbf4677
Merge pull request #26079 from Abdurrahheem:ash/hardmax-support
Add Support for Hardmax Layer #26079

This PR add support for `Hardmax` layer, which as previously listed in conformance deny list. 

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-09-02 12:32:55 +03:00
Alexander Smorkalov
100db1bc0b Merge branch 4.x 2024-08-28 15:06:19 +03:00
alexlyulkov
766bad0035
Merge pull request #26053 from alexlyulkov:al/opencl-conformance-tests
DNN(ONNX): Enabled several OpenCL conformance tests #26053

The tests also work in 5.x

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-08-27 17:23:11 +03:00
Abduragim Shtanchaev
e5b871fa7e
Merge pull request #26059 from Abdurrahheem:ash/fix-einsum-allocation
Einsum buffer allocation fix #26059

This PR fixed buffer allocation issue in Einsum layer that causes segmentation fault on 32bit platforms. Related issue #26008 

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-08-23 15:25:48 +03:00
Alexander Smorkalov
6c6d5cd7b2
Merge pull request #25986 from asmorkalov:as/js_for_contrib
Split Javascript white-list to support contrib modules #25986

Single whitelist converted to several per-module json files. They are concatenated automatically and can be overriden by user config.

Related to https://github.com/opencv/opencv/pull/25656

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-08-23 10:49:08 +03:00
Yuantao Feng
347d673a87
Merge pull request #23279 from fengyuentau:add_topk
dnn: add ONNX TopK #23279

Merge with https://github.com/opencv/opencv_extra/pull/1200

Partially fixes #22890 and #20258

To-do:

- [x] TopK forward impl
- [x] add tests
- [x] support Opset 1 & 10 if possible
- [ ] ~Support other backends~ (TopK has two outputs, which is not supported by other backends, such as openvino)


Perf:

M1 (time in millisecond)

| input shape     | axis | dnn  | ort  |
| --------------- | ---- | ---- | ---- |
| (1000, 100)     | 0    | 1.68 | 4.07 |
| (1000, 100) K5  | 0    | 1.13 | 0.12 |
| (1000, 100)     | 1    | 0.96 | 0.77 |
| (100, 100, 100) | 0    | 10.00 | 31.13 |
| (100, 100, 100) | 1    | 7.33 | 9.17 |
| (100, 100, 100) | 2    | 7.52 | 9.48 |

M2 (time in milisecond)

| input shape     | axis | dnn  | ort  |
| --------------- | ---- | ---- | ---- |
| (1000, 100)     | 0    | 0.76 | 2.44 |
| (1000, 100) K5 | 0 | 0.68 | 0.07 |
| (1000, 100)     | 1    | 0.41 | 0.50 |
| (100, 100, 100) | 0    | 4.83 | 17.52|
| (100, 100, 100) | 1    | 3.60 | 5.08 |
| (100, 100, 100) | 2    | 3.73 | 5.10 |

ONNXRuntime performance testing script: https://gist.github.com/fengyuentau/a119f94fd16721ec9974b8c7b0a45d4c

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-08-21 17:03:24 +03:00
Alexander Smorkalov
9dcac41c66
Merge pull request #26043 from Abdurrahheem:ash/fix-nll-layer
Support Matrices with axes > 6 CUDA
2024-08-21 11:12:43 +03:00
Alexander Smorkalov
7d31463fea
Merge pull request #26048 from alexlyulkov:al/dnn-opencl-int
Added integer and bool support to dnn OpenCL layers
2024-08-21 09:39:59 +03:00
Abduragim Shtanchaev
79a731aff0 Fix for support matrixes with number of axes = 6. 2024-08-21 09:06:46 +03:00
Alexander Smorkalov
d69db7aac4 Supress more ONNX conformance test cases for quant-dequant with CUDA. 2024-08-20 17:12:26 +03:00
Abduragim Shtanchaev
45fd4d8217
Merge pull request #26026 from Abdurrahheem:ash/python_bool_binding
Add support for boolan input/outputs in python bindings #26026

This PR add support boolean input/output binding in python. The issue what mention in ticket https://github.com/opencv/opencv/issues/26024 and the PR soleves it. Data and models are located in [here](https://github.com/opencv/opencv_extra/pull/1201)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-08-20 16:38:21 +03:00
Alexander Lyulkov
a69cd7d6ba Added integer and bool support to dnn OpenCL layers 2024-08-20 12:33:08 +03:00
Yuantao Feng
93e0c7e53f fix matmul crash 2024-08-15 16:10:40 +08:00
Alexander Smorkalov
67d0338c9c
Merge pull request #26004 from ericmariasis:eric-mariasis-issue-26000
got rid of std prefix
2024-08-07 10:26:24 +03:00
ericmariasis
3f92884520 got rid of std prefix 2024-08-07 00:17:04 -04:00
Alexander Smorkalov
7e8f2a1bc4 Merge branch 4.x 2024-08-06 15:31:30 +03:00
Aven
796974cccc fix compilation errors caused by namespace
related: #25199
2024-08-04 05:04:03 +08:00
Daniele Affinita
2a333a6c86
Merge pull request #25644 from DaniAffCH:blockwise-quantization
[GSoC] dnn: Blockwise quantization support #25644

This PR introduces blockwise quantization in DNN allowing the parsing of ONNX models quantized in blockwise style. In particular it modifies the `Quantize` and `Dequantize` operations. The related PR opencv/opencv_extra#1181 contains the test data.

Additional notes:
- The original quantization issue has been fixed. Previously, for 1D scale and zero-point, the operation applied was  $y = int8(x/s - z)$ instead of $y = int8(x/s + z)$. Note that the operation was already correctly implemented when the scale and zero-point were scalars. The previous implementation failed the ONNX test cases, but now all have passed successfully.  [Reference](https://github.com/onnx/onnx/blob/main/docs/Operators.md#QuantizeLinear)
- the function `block_repeat` broadcasts scale and zero-point to the input shape. It repeats all the elements of a given axis n times. This function generalizes the behavior of `repeat` from the core module which is defined just for 2 axis assuming `Mat` has 2 dimensions. If appropriate and useful, you might consider moving `block_repeat` to the core module.
- Now, the scale and zero-point can be taken as layer inputs. This increases the ONNX layers' coverage and enables us to run the ONNX test cases (previously disabled) being fully compliant with ONNX standards. Since they are now supported, I have enabled the test cases for: `test_dequantizelinear`, `test_dequantizelinear_axis`, `test_dequantizelinear_blocked`, `test_quantizelinear`, `test_quantizelinear_axis`, `test_quantizelinear_blocked` just in CPU backend. All of them pass successfully.
   
### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-30 14:16:08 +03:00
Alexander Smorkalov
672a662dff Merge branch 4.x 2024-07-26 09:10:36 +03:00
Yuantao Feng
23b244d3a3
Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp
dnn: optimize activations with v_exp #25881

Merge with https://github.com/opencv/opencv_extra/pull/1191.

This PR optimizes the following activations:

- [x] Swish
- [x] Mish
- [x] Elu
- [x] Celu
- [x] Selu
- [x] HardSwish

### Performance (Updated on 2024-07-18)

#### AmLogic A311D2 (ARM Cortex A73 + A53)

```
Geometric mean (ms)

            Name of Test              activations activations.patch activations.patch
                                                                              vs
                                                                         activations
                                                                          (x-factor)
Celu::Layer_Elementwise::OCV/CPU        115.859          27.930              4.15
Elu::Layer_Elementwise::OCV/CPU          27.846          27.003              1.03
Gelu::Layer_Elementwise::OCV/CPU         0.657           0.602               1.09
HardSwish::Layer_Elementwise::OCV/CPU    31.885          6.781               4.70
Mish::Layer_Elementwise::OCV/CPU         35.729          32.089              1.11
Selu::Layer_Elementwise::OCV/CPU         61.955          27.850              2.22
Swish::Layer_Elementwise::OCV/CPU        30.819          26.688              1.15
```

#### Apple M1

```
Geometric mean (ms)

               Name of Test                activations activations.patch activations.patch
                                                                                   vs
                                                                              activations
                                                                               (x-factor)
Celu::Layer_Elementwise::OCV/CPU              16.184          2.118               7.64
Celu::Layer_Elementwise::OCV/CPU_FP16         16.280          2.123               7.67
Elu::Layer_Elementwise::OCV/CPU               9.123           1.878               4.86
Elu::Layer_Elementwise::OCV/CPU_FP16          9.085           1.897               4.79
Gelu::Layer_Elementwise::OCV/CPU              0.089           0.081               1.11
Gelu::Layer_Elementwise::OCV/CPU_FP16         0.086           0.074               1.17
HardSwish::Layer_Elementwise::OCV/CPU         1.560           1.555               1.00
HardSwish::Layer_Elementwise::OCV/CPU_FP16    1.536           1.523               1.01
Mish::Layer_Elementwise::OCV/CPU              6.077           2.476               2.45
Mish::Layer_Elementwise::OCV/CPU_FP16         5.990           2.496               2.40
Selu::Layer_Elementwise::OCV/CPU              11.351          1.976               5.74
Selu::Layer_Elementwise::OCV/CPU_FP16         11.533          1.985               5.81
Swish::Layer_Elementwise::OCV/CPU             4.687           1.890               2.48
Swish::Layer_Elementwise::OCV/CPU_FP16        4.715           1.873               2.52
```

#### Intel i7-12700K

```
Geometric mean (ms)

            Name of Test              activations activations.patch activations.patch
                                                                    vs
                                                               activations
                                                                (x-factor)
Celu::Layer_Elementwise::OCV/CPU        17.106       3.560         4.81
Elu::Layer_Elementwise::OCV/CPU          5.064       3.478         1.46
Gelu::Layer_Elementwise::OCV/CPU         0.036       0.035         1.04
HardSwish::Layer_Elementwise::OCV/CPU    2.914       2.893         1.01
Mish::Layer_Elementwise::OCV/CPU         3.820       3.529         1.08
Selu::Layer_Elementwise::OCV/CPU        10.799       3.593         3.01
Swish::Layer_Elementwise::OCV/CPU        3.651       3.473         1.05
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-19 16:03:19 +03:00
HAN Liutong
b5ea32158a
Merge pull request #25883 from hanliutong:rvv-intrin-upgrade
Upgrade RISC-V Vector intrinsic and cleanup the obsolete RVV backend. #25883

This patch upgrade RISC-V Vector intrinsic from `v0.10` to `v0.12`/`v1.0`:
- Update cmake check and options;
- Upgrade RVV implement for Universal Intrinsic;
- Upgrade RVV optimized DNN kernel.
- Cleanup the obsolete RVV backend (`intrin_rvv.hpp`) and compatable header file.

With this patch, RVV backend require Clang 17+ or GCC 14+ (which means `__riscv_v_intrinsic >= 12000`, see https://godbolt.org/z/es7ncETE3)

This patch is test with Clang 17.0.6 (require extra `-DWITH_PNG=OFF` due to ICE), Clang 18.1.8 and GCC 14.1.0 on QEMU and k230 (with `--gtest_filter="*hal_*"`).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-07-19 11:41:42 +03:00
zihaomu
1125755345
Merge pull request #25931 from zihaomu:clean_code
code clean #25931

Align code and remove redundant CMake code

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-07-18 17:18:37 +03:00
Abduragim Shtanchaev
88f05e49be
Merge pull request #25868 from Abdurrahheem:ash/add-gpt2-sample
Add sample for GPT2 inference #25868

### Pull Request Readiness Checklist

This PR adds sample for inferencing GPT-2 model. More specificly implementation of GPT-2 from [this repository](https://github.com/karpathy/build-nanogpt). Currently inference in OpenCV is only possible to do with fixed window size due to not supported dynamic shapes. 

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-18 16:47:12 +03:00
Abduragim Shtanchaev
060c24bec9
Merge pull request #25101 from Abdurrahheem:ash/1D-reduce-test
1D test for Reduce layer #25101

This PR introduces test for `Reduce` layer to test its functionality for 1D arrays

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-17 19:19:18 +03:00
Abduragim Shtanchaev
d05047ae41
Merge pull request #25917 from Abdurrahheem:ash/reduce-parser-fix
Fix Reduce layer for cosnt inputs #25917

### Pull Request Readiness Checklist

This PR adds support for const inputs for reducing the layer. Particularly, it fixes the following case. The test model and data are located in [1194](https://github.com/opencv/opencv_extra/pull/1194)

<img width="190" alt="image" src="https://github.com/user-attachments/assets/45a90f0a-b798-4529-bece-24c7bfc9e7ba">


See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-17 19:17:44 +03:00
Alexander Smorkalov
fc9208cff5 Merge branch 4.x 2024-07-17 10:08:16 +03:00
Yuantao Feng
420663498f
Merge pull request #25900 from fengyuentau:dnn/nary_elementwise_multi_thread
dnn: merge #25630 to 5.x #25900

Sync changes from https://github.com/opencv/opencv/pull/25630 to 5.x.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-15 08:57:50 +03:00
Aliaksei Urbanski
35ca2f78d6
Merge pull request #25880 from Jamim:fix/cuda-no-fp16
Fix CUDA for old GPUs without FP16 support #25880

Fixes #21461

~This is a build-time solution that reflects https://github.com/opencv/opencv/blob/4.10.0/modules/dnn/src/cuda4dnn/init.hpp#L68-L82.~
~We shouldn't add an invalid target while building with `CUDA_ARCH_BIN` < 53.~
_(please see [this discussion](https://github.com/opencv/opencv/pull/25880#discussion_r1668074505))_

This is a run-time solution that basically reverts [these lines](d0fe6ad109 (diff-757c5ab6ddf2f99cdd09f851e3cf17abff203aff4107d908c7ad3d0466f39604L245-R245)).

I've debugged these changes, [coupled with other fixes](https://github.com/gentoo/gentoo/pull/37479), on [Gentoo Linux](https://www.gentoo.org/) and [related tests passed](https://github.com/user-attachments/files/16135391/opencv-4.10.0.20240708-224733.log.gz) on my laptop with `GeForce GTX 960M`.

Alternative solution:
  - #21462

_Best regards!_

### Pull Request Readiness Checklist

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] `n/a` There is accuracy test, performance test and test data in opencv_extra repository, if applicable
- [ ] `n/a` The feature is well documented and sample code can be built with the project CMake
2024-07-10 12:39:30 +03:00
Yuantao Feng
e3858cc5a3
Merge pull request #25147 from fengyuentau:dnn/elementwise_layers/speedup
* added v_erf and implemented gelu acceleration via vectorization

* remove anonymous v_erf and use v_erf from intrin_math

* enable perf for ov and cuda backend
2024-07-08 14:24:36 +03:00
Abduragim Shtanchaev
efbc9f0b66
Merge pull request #25861 from Abdurrahheem:ash/torch-attention-export-fix-4x
Merge pull request #25861 from Abdurrahheem:ash/torch-attention-export-fix-4x

Support for Unflatten operation requred by Attention layer - 4.x #25861

### Pull Request Readiness Checklist

All test data and models for PR are located [#1190](https://github.com/opencv/opencv_extra/pull/1190)

This PR fixes issue reised when importing batched  vanilla `Attention` layer from `PyTorch` via ONNX. Currently batched version of `Attention` layer in PyTorch [has unflatten operation inside](e3b3431c42/torch/nn/functional.py (L5500C17-L5500C31)). `unflatten` operation causes issue in `reshape` layer (see the Reshape_2 in the graph below) due to incorrect output of `slice` layer. This PR particularly fixes `slice` and `concat` layers to handle `unflatten` operation. 


<img width="673" alt="image" src="https://github.com/opencv/opencv/assets/44877829/5b612b31-657a-47f1-83a4-0ac35a950abd">


See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-04 16:25:31 +03:00
Yuantao Feng
5510718381
Merge pull request #25810 from fengyuentau:python/fix_parsing_3d_mat_in_dnn
python: attempts to fix 3d mat parsing problem for dnn #25810

Fixes https://github.com/opencv/opencv/issues/25762 https://github.com/opencv/opencv/issues/23242
Relates https://github.com/opencv/opencv/issues/25763 https://github.com/opencv/opencv/issues/19091

Although `cv.Mat` has already been introduced to workaround this problem, people do not know it and it kind of leads to confusion with `numpy.array`. This patch adds a "switch" to turn off the auto multichannel feature when the API is from cv::dnn::Net (more specifically, `setInput`) and the parameter is of type `Mat`. This patch only leads to changes of three places in `pyopencv_generated_types_content.h`:

```.diff
static PyObject* pyopencv_cv_dnn_dnn_Net_setInput(PyObject* self, PyObject* py_args, PyObject* kw)
{
...
- pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 0)) &&
+ pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 8)) &&
...
}

// I guess we also need to change this as one-channel blob is expected for param
static PyObject* pyopencv_cv_dnn_dnn_Net_setParam(PyObject* self, PyObject* py_args, PyObject* kw)
{
...
- pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 0)) )
+ pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 8)) )
...
- pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 0)) )
+ pyopencv_to_safe(pyobj_blob, blob, ArgInfo("blob", 8)) )
...
}
```

Others are unchanged, e.g. `dnn_SegmentationModel` and stuff like that.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-04 08:33:20 +03:00
Alexander Smorkalov
25fb55601b Fixed narrowing conversion warning with MSVC compiler. 2024-07-03 12:10:31 +03:00
Yuantao Feng
a7fd9446cf
Merge pull request #25630 from fengyuentau:nary-multi-thread
dnn: parallelize nary elementwise forward implementation & enable related conformance tests #25630

This PR introduces the following changes:

- [x] Parallelize binary forward impl
- [x] Parallelize ternary forward impl (Where)
- [x] Parallelize nary (Operator that can take >=1 operands)
- [x] Enable conformance tests if workable

## Performance

### i7-12700K, RAM 64GB, Ubuntu 22.04

```
Geometric mean (ms)

                Name of Test                     opencv        opencv        opencv
                                                  perf          perf          perf
                                              core.x64.0606 core.x64.0606 core.x64.0606
                                                                               vs
                                                                             opencv
                                                                              perf
                                                                          core.x64.0606
                                                                           (x-factor)
NCHW_C_sum::Layer_NaryEltwise::OCV/CPU           16.116        11.161         1.44
NCHW_NCHW_add::Layer_NaryEltwise::OCV/CPU        17.469        11.446         1.53
NCHW_NCHW_div::Layer_NaryEltwise::OCV/CPU        17.531        11.469         1.53
NCHW_NCHW_equal::Layer_NaryEltwise::OCV/CPU      28.653        13.682         2.09
NCHW_NCHW_greater::Layer_NaryEltwise::OCV/CPU    21.899        13.422         1.63
NCHW_NCHW_less::Layer_NaryEltwise::OCV/CPU       21.738        13.185         1.65
NCHW_NCHW_max::Layer_NaryEltwise::OCV/CPU        16.172        11.473         1.41
NCHW_NCHW_mean::Layer_NaryEltwise::OCV/CPU       16.309        11.565         1.41
NCHW_NCHW_min::Layer_NaryEltwise::OCV/CPU        16.166        11.454         1.41
NCHW_NCHW_mul::Layer_NaryEltwise::OCV/CPU        16.157        11.443         1.41
NCHW_NCHW_pow::Layer_NaryEltwise::OCV/CPU        163.459       15.234         10.73
NCHW_NCHW_ref_div::Layer_NaryEltwise::OCV/CPU    10.880        10.868         1.00
NCHW_NCHW_ref_max::Layer_NaryEltwise::OCV/CPU    10.947        11.058         0.99
NCHW_NCHW_ref_min::Layer_NaryEltwise::OCV/CPU    10.948        10.910         1.00
NCHW_NCHW_ref_mul::Layer_NaryEltwise::OCV/CPU    10.874        10.871         1.00
NCHW_NCHW_ref_sum::Layer_NaryEltwise::OCV/CPU    10.971        10.920         1.00
NCHW_NCHW_sub::Layer_NaryEltwise::OCV/CPU        17.546        11.462         1.53
NCHW_NCHW_sum::Layer_NaryEltwise::OCV/CPU        16.175        11.475         1.41
NHWC_C::Layer_NaryEltwise::OCV/CPU               11.339        11.333         1.00
NHWC_H::Layer_NaryEltwise::OCV/CPU               16.154        11.102         1.46
```

### Apple M1, RAM 16GB, macOS 14.4.1

```
Geometric mean (ms)

                Name of Test                     opencv          opencv             opencv      
                                                  perf            perf               perf       
                                              core.m1.0606 core.m1.0606.patch core.m1.0606.patch
                                                                                      vs        
                                                                                    opencv      
                                                                                     perf       
                                                                                 core.m1.0606   
                                                                                  (x-factor)    
NCHW_C_sum::Layer_NaryEltwise::OCV/CPU           28.418          3.768               7.54       
NCHW_NCHW_add::Layer_NaryEltwise::OCV/CPU        6.942           5.679               1.22       
NCHW_NCHW_div::Layer_NaryEltwise::OCV/CPU        5.822           5.653               1.03       
NCHW_NCHW_equal::Layer_NaryEltwise::OCV/CPU      5.751           5.628               1.02       
NCHW_NCHW_greater::Layer_NaryEltwise::OCV/CPU    5.797           5.599               1.04       
NCHW_NCHW_less::Layer_NaryEltwise::OCV/CPU       7.272           5.578               1.30       
NCHW_NCHW_max::Layer_NaryEltwise::OCV/CPU        5.777           5.562               1.04       
NCHW_NCHW_mean::Layer_NaryEltwise::OCV/CPU       5.819           5.559               1.05       
NCHW_NCHW_min::Layer_NaryEltwise::OCV/CPU        5.830           5.574               1.05       
NCHW_NCHW_mul::Layer_NaryEltwise::OCV/CPU        5.759           5.567               1.03       
NCHW_NCHW_pow::Layer_NaryEltwise::OCV/CPU       342.260          74.655              4.58       
NCHW_NCHW_ref_div::Layer_NaryEltwise::OCV/CPU    8.338           8.280               1.01       
NCHW_NCHW_ref_max::Layer_NaryEltwise::OCV/CPU    8.359           8.309               1.01       
NCHW_NCHW_ref_min::Layer_NaryEltwise::OCV/CPU    8.412           8.295               1.01       
NCHW_NCHW_ref_mul::Layer_NaryEltwise::OCV/CPU    8.380           8.297               1.01       
NCHW_NCHW_ref_sum::Layer_NaryEltwise::OCV/CPU    8.356           8.323               1.00       
NCHW_NCHW_sub::Layer_NaryEltwise::OCV/CPU        6.818           5.561               1.23       
NCHW_NCHW_sum::Layer_NaryEltwise::OCV/CPU        5.805           5.570               1.04       
NHWC_C::Layer_NaryEltwise::OCV/CPU               3.834           4.817               0.80       
NHWC_H::Layer_NaryEltwise::OCV/CPU               28.402          3.771               7.53
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-07-03 10:09:05 +03:00
Abduragim Shtanchaev
a8d1373919
Merge pull request #25794 from Abdurrahheem:ash/yolov10-support
Add sample support of YOLOv9 and YOLOv10 in OpenCV #25794

This PR adds sample support of  [`YOLOv9`](https://github.com/WongKinYiu/yolov9) and [`YOLOv10`](https://github.com/THU-MIG/yolov10/tree/main)) in OpenCV. Models for this test are located in this [PR](https://github.com/opencv/opencv_extra/pull/1186). 

**Running YOLOv10 using OpenCV.** 
1. In oder to run `YOLOv10` one needs to cut off postporcessing with dynamic shapes from torch and then convert it to ONNX. If someone is looking for ready solution, there is [this forked branch](https://github.com/Abdurrahheem/yolov10/tree/ash/opencv-export) from official YOLOv10.  Particularty follow this proceduce. 

```bash
git clone git@github.com:Abdurrahheem/yolov10.git
conda create -n yolov10 python=3.9
conda activate yolov10
pip install -r requirements.txt
python export_opencv.py --model=<model-name> --imgsz=<input-img-size>
```
By default `model="yolov10s"` and `imgsz=(480,640)`. This will generate file `yolov10s.onnx`, which can be use for inference in OpenCV

2. For inference part on OpenCV.  one can use `yolo_detector.cpp` [sample](https://github.com/opencv/opencv/blob/4.x/samples/dnn/yolo_detector.cpp). If you have followed above exporting procedure, then you can use following command to run the model. 

``` bash
build opencv from source 
cd build 
./bin/example_dnn_yolo_detector --model=<path-to-yolov10s.onnx-file> --yolo=yolov10 --width=640 --height=480 --input=<path-to-image> --scale=0.003921568627 --padvalue=114
```
If you do not specify `--input` argument, OpenCV will grab first camera that is avaliable on your platform. 
For more deatils on how to run the `yolo_detector.cpp` file see this [guide](https://docs.opencv.org/4.x/da/d9d/tutorial_dnn_yolo.html#autotoc_md443) 


**Running YOLOv9 using OpenCV**

1. Export model following [official guide](https://github.com/WongKinYiu/yolov9)of the YOLOv9 repository. Particularly you can do following for converting.

```bash
git clone https://github.com/WongKinYiu/yolov9.git
cd yolov9
conda create -n yolov9 python=3.9
conda activate yolov9
pip install -r requirements.txt
wget https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-t-converted.pt
python export.py --weights=./yolov9-t-converted.pt --include=onnx --img-size=(480,640) 
```

This will generate <yolov9-t-converted.onnx> file.

2.  Inference on OpenCV.

```bash
build opencv from source 
cd build 
./bin/example_dnn_yolo_detector --model=<path-to-yolov9-t-converted.onnx> --yolo=yolov9 --width=640 --height=480 --scale=0.003921568627 --padvalue=114 --path=<path-to-image>
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-02 18:26:34 +03:00
Wanli
6e1864e3fc
Merge pull request #24941 from WanliZhong:v_exp
Add support for v_exp (exponential) #24941

This PR aims to implement `v_exp(v_float16 x)`, `v_exp(v_float32 x)` and `v_exp(v_float64 x)`.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-07-02 12:32:49 +03:00
Alexander Smorkalov
3d74d646d8 Fixed CuDNN runtime version check for CuDNN 9+. 2024-07-01 17:33:24 +03:00
Alexander Smorkalov
3abd9f2a28 Merge branch 4.x 2024-07-01 15:59:43 +03:00
alexlyulkov
12b8ed1443
Merge pull request #25755 from alexlyulkov:al/more-types
Added more types support to dnn layers #25755

Added support of more types to dnn layers for CPU, CUDA and OpenVINO backends.
Now most of the multi-type layers support uint8, int8, int32, int64, float32, float16, bool types.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-06-28 09:02:15 +03:00
alexlyulkov
fd7cb1be85
Merge pull request #25739 from alexlyulkov:al/openvino2022-fixed-blank
Fixed blank layer for OpenVINO 2022.1 #25739

Changed blank layer because it didn't work with old OpenVINO versions(2022.1). The blank layer was implemented using ConvertLike layer, now it is implemented using ShapeOf and Reshape layers.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-06-27 18:51:35 +03:00