Commit Graph

1093 Commits

Author SHA1 Message Date
Yuantao Feng
23b244d3a3
Merge pull request #25881 from fengyuentau:dnn/cpu/optimize_activations_with_v_exp
dnn: optimize activations with v_exp #25881

Merge with https://github.com/opencv/opencv_extra/pull/1191.

This PR optimizes the following activations:

- [x] Swish
- [x] Mish
- [x] Elu
- [x] Celu
- [x] Selu
- [x] HardSwish

### Performance (Updated on 2024-07-18)

#### AmLogic A311D2 (ARM Cortex A73 + A53)

```
Geometric mean (ms)

            Name of Test              activations activations.patch activations.patch
                                                                              vs
                                                                         activations
                                                                          (x-factor)
Celu::Layer_Elementwise::OCV/CPU        115.859          27.930              4.15
Elu::Layer_Elementwise::OCV/CPU          27.846          27.003              1.03
Gelu::Layer_Elementwise::OCV/CPU         0.657           0.602               1.09
HardSwish::Layer_Elementwise::OCV/CPU    31.885          6.781               4.70
Mish::Layer_Elementwise::OCV/CPU         35.729          32.089              1.11
Selu::Layer_Elementwise::OCV/CPU         61.955          27.850              2.22
Swish::Layer_Elementwise::OCV/CPU        30.819          26.688              1.15
```

#### Apple M1

```
Geometric mean (ms)

               Name of Test                activations activations.patch activations.patch
                                                                                   vs
                                                                              activations
                                                                               (x-factor)
Celu::Layer_Elementwise::OCV/CPU              16.184          2.118               7.64
Celu::Layer_Elementwise::OCV/CPU_FP16         16.280          2.123               7.67
Elu::Layer_Elementwise::OCV/CPU               9.123           1.878               4.86
Elu::Layer_Elementwise::OCV/CPU_FP16          9.085           1.897               4.79
Gelu::Layer_Elementwise::OCV/CPU              0.089           0.081               1.11
Gelu::Layer_Elementwise::OCV/CPU_FP16         0.086           0.074               1.17
HardSwish::Layer_Elementwise::OCV/CPU         1.560           1.555               1.00
HardSwish::Layer_Elementwise::OCV/CPU_FP16    1.536           1.523               1.01
Mish::Layer_Elementwise::OCV/CPU              6.077           2.476               2.45
Mish::Layer_Elementwise::OCV/CPU_FP16         5.990           2.496               2.40
Selu::Layer_Elementwise::OCV/CPU              11.351          1.976               5.74
Selu::Layer_Elementwise::OCV/CPU_FP16         11.533          1.985               5.81
Swish::Layer_Elementwise::OCV/CPU             4.687           1.890               2.48
Swish::Layer_Elementwise::OCV/CPU_FP16        4.715           1.873               2.52
```

#### Intel i7-12700K

```
Geometric mean (ms)

            Name of Test              activations activations.patch activations.patch
                                                                    vs
                                                               activations
                                                                (x-factor)
Celu::Layer_Elementwise::OCV/CPU        17.106       3.560         4.81
Elu::Layer_Elementwise::OCV/CPU          5.064       3.478         1.46
Gelu::Layer_Elementwise::OCV/CPU         0.036       0.035         1.04
HardSwish::Layer_Elementwise::OCV/CPU    2.914       2.893         1.01
Mish::Layer_Elementwise::OCV/CPU         3.820       3.529         1.08
Selu::Layer_Elementwise::OCV/CPU        10.799       3.593         3.01
Swish::Layer_Elementwise::OCV/CPU        3.651       3.473         1.05
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-19 16:03:19 +03:00
Aliaksei Urbanski
35ca2f78d6
Merge pull request #25880 from Jamim:fix/cuda-no-fp16
Fix CUDA for old GPUs without FP16 support #25880

Fixes #21461

~This is a build-time solution that reflects https://github.com/opencv/opencv/blob/4.10.0/modules/dnn/src/cuda4dnn/init.hpp#L68-L82.~
~We shouldn't add an invalid target while building with `CUDA_ARCH_BIN` < 53.~
_(please see [this discussion](https://github.com/opencv/opencv/pull/25880#discussion_r1668074505))_

This is a run-time solution that basically reverts [these lines](d0fe6ad109 (diff-757c5ab6ddf2f99cdd09f851e3cf17abff203aff4107d908c7ad3d0466f39604L245-R245)).

I've debugged these changes, [coupled with other fixes](https://github.com/gentoo/gentoo/pull/37479), on [Gentoo Linux](https://www.gentoo.org/) and [related tests passed](https://github.com/user-attachments/files/16135391/opencv-4.10.0.20240708-224733.log.gz) on my laptop with `GeForce GTX 960M`.

Alternative solution:
  - #21462

_Best regards!_

### Pull Request Readiness Checklist

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] `n/a` There is accuracy test, performance test and test data in opencv_extra repository, if applicable
- [ ] `n/a` The feature is well documented and sample code can be built with the project CMake
2024-07-10 12:39:30 +03:00
Yuantao Feng
e3858cc5a3
Merge pull request #25147 from fengyuentau:dnn/elementwise_layers/speedup
* added v_erf and implemented gelu acceleration via vectorization

* remove anonymous v_erf and use v_erf from intrin_math

* enable perf for ov and cuda backend
2024-07-08 14:24:36 +03:00
Abduragim Shtanchaev
efbc9f0b66
Merge pull request #25861 from Abdurrahheem:ash/torch-attention-export-fix-4x
Merge pull request #25861 from Abdurrahheem:ash/torch-attention-export-fix-4x

Support for Unflatten operation requred by Attention layer - 4.x #25861

### Pull Request Readiness Checklist

All test data and models for PR are located [#1190](https://github.com/opencv/opencv_extra/pull/1190)

This PR fixes issue reised when importing batched  vanilla `Attention` layer from `PyTorch` via ONNX. Currently batched version of `Attention` layer in PyTorch [has unflatten operation inside](e3b3431c42/torch/nn/functional.py (L5500C17-L5500C31)). `unflatten` operation causes issue in `reshape` layer (see the Reshape_2 in the graph below) due to incorrect output of `slice` layer. This PR particularly fixes `slice` and `concat` layers to handle `unflatten` operation. 


<img width="673" alt="image" src="https://github.com/opencv/opencv/assets/44877829/5b612b31-657a-47f1-83a4-0ac35a950abd">


See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-04 16:25:31 +03:00
Alexander Smorkalov
25fb55601b Fixed narrowing conversion warning with MSVC compiler. 2024-07-03 12:10:31 +03:00
Yuantao Feng
a7fd9446cf
Merge pull request #25630 from fengyuentau:nary-multi-thread
dnn: parallelize nary elementwise forward implementation & enable related conformance tests #25630

This PR introduces the following changes:

- [x] Parallelize binary forward impl
- [x] Parallelize ternary forward impl (Where)
- [x] Parallelize nary (Operator that can take >=1 operands)
- [x] Enable conformance tests if workable

## Performance

### i7-12700K, RAM 64GB, Ubuntu 22.04

```
Geometric mean (ms)

                Name of Test                     opencv        opencv        opencv
                                                  perf          perf          perf
                                              core.x64.0606 core.x64.0606 core.x64.0606
                                                                               vs
                                                                             opencv
                                                                              perf
                                                                          core.x64.0606
                                                                           (x-factor)
NCHW_C_sum::Layer_NaryEltwise::OCV/CPU           16.116        11.161         1.44
NCHW_NCHW_add::Layer_NaryEltwise::OCV/CPU        17.469        11.446         1.53
NCHW_NCHW_div::Layer_NaryEltwise::OCV/CPU        17.531        11.469         1.53
NCHW_NCHW_equal::Layer_NaryEltwise::OCV/CPU      28.653        13.682         2.09
NCHW_NCHW_greater::Layer_NaryEltwise::OCV/CPU    21.899        13.422         1.63
NCHW_NCHW_less::Layer_NaryEltwise::OCV/CPU       21.738        13.185         1.65
NCHW_NCHW_max::Layer_NaryEltwise::OCV/CPU        16.172        11.473         1.41
NCHW_NCHW_mean::Layer_NaryEltwise::OCV/CPU       16.309        11.565         1.41
NCHW_NCHW_min::Layer_NaryEltwise::OCV/CPU        16.166        11.454         1.41
NCHW_NCHW_mul::Layer_NaryEltwise::OCV/CPU        16.157        11.443         1.41
NCHW_NCHW_pow::Layer_NaryEltwise::OCV/CPU        163.459       15.234         10.73
NCHW_NCHW_ref_div::Layer_NaryEltwise::OCV/CPU    10.880        10.868         1.00
NCHW_NCHW_ref_max::Layer_NaryEltwise::OCV/CPU    10.947        11.058         0.99
NCHW_NCHW_ref_min::Layer_NaryEltwise::OCV/CPU    10.948        10.910         1.00
NCHW_NCHW_ref_mul::Layer_NaryEltwise::OCV/CPU    10.874        10.871         1.00
NCHW_NCHW_ref_sum::Layer_NaryEltwise::OCV/CPU    10.971        10.920         1.00
NCHW_NCHW_sub::Layer_NaryEltwise::OCV/CPU        17.546        11.462         1.53
NCHW_NCHW_sum::Layer_NaryEltwise::OCV/CPU        16.175        11.475         1.41
NHWC_C::Layer_NaryEltwise::OCV/CPU               11.339        11.333         1.00
NHWC_H::Layer_NaryEltwise::OCV/CPU               16.154        11.102         1.46
```

### Apple M1, RAM 16GB, macOS 14.4.1

```
Geometric mean (ms)

                Name of Test                     opencv          opencv             opencv      
                                                  perf            perf               perf       
                                              core.m1.0606 core.m1.0606.patch core.m1.0606.patch
                                                                                      vs        
                                                                                    opencv      
                                                                                     perf       
                                                                                 core.m1.0606   
                                                                                  (x-factor)    
NCHW_C_sum::Layer_NaryEltwise::OCV/CPU           28.418          3.768               7.54       
NCHW_NCHW_add::Layer_NaryEltwise::OCV/CPU        6.942           5.679               1.22       
NCHW_NCHW_div::Layer_NaryEltwise::OCV/CPU        5.822           5.653               1.03       
NCHW_NCHW_equal::Layer_NaryEltwise::OCV/CPU      5.751           5.628               1.02       
NCHW_NCHW_greater::Layer_NaryEltwise::OCV/CPU    5.797           5.599               1.04       
NCHW_NCHW_less::Layer_NaryEltwise::OCV/CPU       7.272           5.578               1.30       
NCHW_NCHW_max::Layer_NaryEltwise::OCV/CPU        5.777           5.562               1.04       
NCHW_NCHW_mean::Layer_NaryEltwise::OCV/CPU       5.819           5.559               1.05       
NCHW_NCHW_min::Layer_NaryEltwise::OCV/CPU        5.830           5.574               1.05       
NCHW_NCHW_mul::Layer_NaryEltwise::OCV/CPU        5.759           5.567               1.03       
NCHW_NCHW_pow::Layer_NaryEltwise::OCV/CPU       342.260          74.655              4.58       
NCHW_NCHW_ref_div::Layer_NaryEltwise::OCV/CPU    8.338           8.280               1.01       
NCHW_NCHW_ref_max::Layer_NaryEltwise::OCV/CPU    8.359           8.309               1.01       
NCHW_NCHW_ref_min::Layer_NaryEltwise::OCV/CPU    8.412           8.295               1.01       
NCHW_NCHW_ref_mul::Layer_NaryEltwise::OCV/CPU    8.380           8.297               1.01       
NCHW_NCHW_ref_sum::Layer_NaryEltwise::OCV/CPU    8.356           8.323               1.00       
NCHW_NCHW_sub::Layer_NaryEltwise::OCV/CPU        6.818           5.561               1.23       
NCHW_NCHW_sum::Layer_NaryEltwise::OCV/CPU        5.805           5.570               1.04       
NHWC_C::Layer_NaryEltwise::OCV/CPU               3.834           4.817               0.80       
NHWC_H::Layer_NaryEltwise::OCV/CPU               28.402          3.771               7.53
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-07-03 10:09:05 +03:00
Abduragim Shtanchaev
a8d1373919
Merge pull request #25794 from Abdurrahheem:ash/yolov10-support
Add sample support of YOLOv9 and YOLOv10 in OpenCV #25794

This PR adds sample support of  [`YOLOv9`](https://github.com/WongKinYiu/yolov9) and [`YOLOv10`](https://github.com/THU-MIG/yolov10/tree/main)) in OpenCV. Models for this test are located in this [PR](https://github.com/opencv/opencv_extra/pull/1186). 

**Running YOLOv10 using OpenCV.** 
1. In oder to run `YOLOv10` one needs to cut off postporcessing with dynamic shapes from torch and then convert it to ONNX. If someone is looking for ready solution, there is [this forked branch](https://github.com/Abdurrahheem/yolov10/tree/ash/opencv-export) from official YOLOv10.  Particularty follow this proceduce. 

```bash
git clone git@github.com:Abdurrahheem/yolov10.git
conda create -n yolov10 python=3.9
conda activate yolov10
pip install -r requirements.txt
python export_opencv.py --model=<model-name> --imgsz=<input-img-size>
```
By default `model="yolov10s"` and `imgsz=(480,640)`. This will generate file `yolov10s.onnx`, which can be use for inference in OpenCV

2. For inference part on OpenCV.  one can use `yolo_detector.cpp` [sample](https://github.com/opencv/opencv/blob/4.x/samples/dnn/yolo_detector.cpp). If you have followed above exporting procedure, then you can use following command to run the model. 

``` bash
build opencv from source 
cd build 
./bin/example_dnn_yolo_detector --model=<path-to-yolov10s.onnx-file> --yolo=yolov10 --width=640 --height=480 --input=<path-to-image> --scale=0.003921568627 --padvalue=114
```
If you do not specify `--input` argument, OpenCV will grab first camera that is avaliable on your platform. 
For more deatils on how to run the `yolo_detector.cpp` file see this [guide](https://docs.opencv.org/4.x/da/d9d/tutorial_dnn_yolo.html#autotoc_md443) 


**Running YOLOv9 using OpenCV**

1. Export model following [official guide](https://github.com/WongKinYiu/yolov9)of the YOLOv9 repository. Particularly you can do following for converting.

```bash
git clone https://github.com/WongKinYiu/yolov9.git
cd yolov9
conda create -n yolov9 python=3.9
conda activate yolov9
pip install -r requirements.txt
wget https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-t-converted.pt
python export.py --weights=./yolov9-t-converted.pt --include=onnx --img-size=(480,640) 
```

This will generate <yolov9-t-converted.onnx> file.

2.  Inference on OpenCV.

```bash
build opencv from source 
cd build 
./bin/example_dnn_yolo_detector --model=<path-to-yolov9-t-converted.onnx> --yolo=yolov9 --width=640 --height=480 --scale=0.003921568627 --padvalue=114 --path=<path-to-image>
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-02 18:26:34 +03:00
Yuantao Feng
3f13ce797b
Merge pull request #25779 from fengyuentau:dnn/fix_onnx_depthtospace
dnn: add DepthToSpace and SpaceToDepth #25779

We are working on updating WeChat QRCode module. One of the new models is a fully convolutional model and hence it should be able to run with different input shapes. However,  it has an operator `DepthToSpace`, which is parsed as a subgraph of `Reshape -> Permute -> Reshape` with a fixed shape getting during parsing. The subgraph itself is not a problem, but the true problem is the subgraph with a fixed input and output shape regardless input changes. This does not allow the model to run with different input shapes.

Solution is to add a dedicated layer for DepthtoSpace and SpaceToDepth.

Backend support:

- [x] CPU
- [x] CUDA
- [x] OpenCL
- [x] OpenVINO
- [x] CANN
- [x] TIMVX
-  ~Vulkan~ (missing fundamental tools, like permutation and reshape)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-06-21 19:28:22 +03:00
CNOCycle
98b8825031
Merge pull request #25613 from CNOCycle:tflite/ops
Support Global_Pool_2D ops in .tflite model #25613

### Pull Request Readiness Checklist

**Merge with extra**: https://github.com/opencv/opencv_extra/pull/1180

This PR adds support for `GlobalAveragePooling2D` and `GlobalMaxPool2D` on the TFlite backend. When the k`eep_dims` option is enabled, the output is a 2D tensor, necessitating the inclusion of an additional flatten layer. Additionally, the names of these layers have been updated to match the output tensor names generated by `generate.py` from the opencv_extra repository.

- [X] I agree to contribute to the project under Apache 2 License.
- [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [X] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [X] The feature is well documented and sample code can be built with the project CMake
2024-05-31 19:31:21 +03:00
Alexander Smorkalov
5f509e2ec1 Skip Test_Caffe_layers.Concat with Vulkan due to sporadic failures. 2024-05-17 11:54:25 +03:00
CNOCycle
7713c84465
Merge pull request #25297 from CNOCycle:tflite/transpose
Support Transpose op in TFlite #25297

**Merge with extra**: https://github.com/opencv/opencv_extra/pull/1168

The purpose of this PR is to introduce support for the Transpose op in TFlite format and to add a shape comparison between the output tensors and the references. In some occasional cases, the shape of the output tensor is `[1,4,1,1]`, while the shape of the reference tensor is `[1,4]`. Consequently, the norm check incorrectly reports that the test has passed, as the residual is zero.

Below is a Python script for generating testing data. The generated data can be integrated into the repo `opencv_extra`.

```python
import numpy as np
import tensorflow as tf

PREFIX_TFL = '/path/to/opencv_extra/testdata/dnn/tflite/'

def generator(input_tensor, model, saved_name):

    # convert keras model to .tflite format
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    #converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.optimizations = [None]
    tflite_model = converter.convert()
    with open(f'{PREFIX_TFL}/{saved_name}.tflite', 'wb') as f:
        f.write(tflite_model)

    # save the input tensor to .npy
    if input_tensor.ndim == 4:
        opencv_tensor = np.transpose(input_tensor, (0,3,1,2))
    else:
        opencv_tensor = input_tensor
    opencv_tensor = np.copy(opencv_tensor, order='C').astype(np.float32)
    np.save(f'{PREFIX_TFL}/{saved_name}_inp.npy', opencv_tensor)

    # generate output tenosr and save it to .npy
    mat_out = model(input_tensor).numpy()
    mat_out = np.copy(mat_out, order='C').astype(np.float32)
    if mat_out.ndim == 4:
        mat_out = np.transpose(mat_out, (0,3,1,2))
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    out_name = interpreter.get_output_details()[0]['name']
    np.save(f'{PREFIX_TFL}/{saved_name}_out_{out_name}.npy', mat_out)

def build_transpose():

    model_name = "keras_permute"
    mat_in = np.array([[[1,2,3], [4,5,6]]], dtype=np.float32)

    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(2,3)))
    model.add(tf.keras.layers.Permute((2,1)))
    model.summary()

    generator(mat_in, model, model_name)

if __name__ == '__main__':
    build_transpose()
```

### Pull Request Readiness Checklist

- [x] I agree to contribute to the project under Apache 2 License.
- [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [X] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [X] The feature is well documented and sample code can be built with the project CMake
2024-05-15 20:07:25 +03:00
alexlyulkov
03507e06b4
Merge pull request #25518 from alexlyulkov:al/fixed-gemm-openvino
Fixed OpenVINO gemm layer #25518

Fixed OpenVINO gemm layer
The problem was that our layer didn't properly handle all the possible gemm options in OpenVINO mode
Fixes #25472

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-05-14 17:41:19 +03:00
Alexander Smorkalov
d8e18f4576 Made fcn-resnet50-12.onnx model optional. 2024-05-03 16:14:22 +03:00
Wanli
ed47cce1c5 change fcn8s-heavy-pascal tests from caffe to onnx 2024-05-03 00:15:09 +08:00
alexlyulkov
f9dd20eb07
Merge pull request #25414 from alexlyulkov:al/range-fixed
Fixed ONNX range layer #25414

Partially address https://github.com/opencv/opencv/issues/25363
Fixed ONNX range layer. It should support any input type.
Added tests (extra [PR](https://github.com/opencv/opencv_extra/pull/1170))

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-04-17 09:38:21 +03:00
ecchen
e63690a2d9 Add a shape checker for tflite models 2024-04-08 13:28:05 +00:00
Yuantao Feng
55d7e3f8cc
Merge pull request #1165 from fengyuentau:gold_yolo
[BugFix] dnn (ONNX): Foce dropping constant inputs in parseClip if they are shared #25319

Resolves https://github.com/opencv/opencv/issues/25278
Merge with https://github.com/opencv/opencv_extra/pull/1165

In Gold-YOLO ,`Div` has a constant input `B=6` which is then parsed into a `Const` layer in the ONNX importer, but `Clip` also has the shared constant input `max=6` which is already a `Const` layer and then connected to `Elementwise` layer. This should not happen because in the `forward()` of `Elementwise` layer, the legacy code goes through and apply activation to each input. More details on https://github.com/opencv/opencv/issues/25278#issuecomment-2032199630.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-04-03 15:56:59 +03:00
Dmitry Kurtaev
13c95efa74
Merge pull request #25312 from dkurt:dnn_hotfix_tflite
Ownership check in TFLite importer #25312

### Pull Request Readiness Checklist

resolves https://github.com/opencv/opencv/issues/25310

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-04-03 09:41:40 +03:00
Yuantao Feng
b758897c29
Merge pull request #25271 from fengyuentau:matmul_bias
Merge with https://github.com/opencv/opencv_extra/pull/1158

Todo:

- [x] Fix Attention pattern recognition.
- [x] Handle other backends.

Benchmark:

"VIT_B_32 OCV/CPU", M1, results in milliseconds.

| Model | 4.x | This PR |
| - | - | - |
| VIT_B_32 OCV/CPU | 87.66 | **83.83** |


### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-03-29 17:35:23 +03:00
Alexander Smorkalov
9fc4b61074
Merge pull request #25291 from dkurt:einsum_openvino
Einsum OpenVINO backend
2024-03-29 15:54:26 +03:00
Dmitry Kurtaev
cfa42e4338 Einsum OpenVINO backend 2024-03-29 14:29:45 +03:00
Dmitry Kurtaev
01dc010436
Merge pull request #25273 from dkurt:tflite_new_layers
TFLite new layers #25273

### Pull Request Readiness Checklist

resolves https://github.com/opencv/opencv/issues/25272, https://github.com/opencv/opencv/issues/24965

**Merge with extra**: https://github.com/opencv/opencv_extra/pull/1160

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-03-29 11:21:13 +03:00
Dmitry Kurtaev
0b6c9a2123
Merge pull request #25181 from dkurt:release_conv_weights
Release convolution weightsMat after usage #25181

### Pull Request Readiness Checklist

related (but not resolved): https://github.com/opencv/opencv/issues/24134

Minor memory footprint improvement. Also, adds a test for VmHWM.

RAM top memory usage (-230MB)

| YOLOv3 (237MB file) |   4.x   |    PR   |
|---------------------|---------|---------|
| no winograd         | 808 MB  | 581 MB  |
| winograd            | 1985 MB | 1750 MB |

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-03-25 09:03:28 +03:00
Oleg Pipikin
6da2ddcf0e Fix for OpenVINO 2024.0
Remove support OpenVINO lower than 2022.1 release
Remove legacy InferenceEngine wrappers
2024-03-18 15:05:50 +04:00
Laurent Berger
5fe3933346
Merge pull request #25120 from LaurentBerger:I25103
Fixed ReduceMean layer behaviour #25120

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

a93c31e3c9/onnxruntime/core/providers/cpu/reduction/reduction_ops.cc (L433-L443)
2024-03-04 09:36:53 +03:00
Alexander Alekhin
efc9837df1
Merge pull request #24892 from opencv-pushbot:gitee/alalek/dnn_avoid_16s_usage
DNN: avoid CV_16S usage for FP16 #24892

**Merge after**: #24918

TODO:
- [x] measure performance changes
- [x] optimize convertTo for OpenCL: #24918

12700K iGPU:

|Name of Test|0|1|1 vs 0 (x-factor)|
|---|:-:|:-:|:-:|
|AlexNet::DNNTestNetwork::OCV/OCL_FP16|7.441|7.480|0.99|
|CRNN::DNNTestNetwork::OCV/OCL_FP16|10.776|10.736|1.00|
|DenseNet_121::DNNTestNetwork::OCV/OCL_FP16|52.762|52.833|1.00|
|EAST_text_detection::DNNTestNetwork::OCV/OCL_FP16|60.694|60.721|1.00|
|EfficientNet::DNNTestNetwork::OCV/OCL_FP16|33.373|33.173|1.01|
|FastNeuralStyle_eccv16::DNNTestNetwork::OCV/OCL_FP16|81.840|81.724|1.00|
|GoogLeNet::DNNTestNetwork::OCV/OCL_FP16|20.965|20.927|1.00|
|Inception_5h::DNNTestNetwork::OCV/OCL_FP16|22.204|22.173|1.00|
|Inception_v2_SSD_TensorFlow::DNNTestNetwork::OCV/OCL_FP16|47.115|47.460|0.99|
|MPHand::DNNTestNetwork::OCV/OCL_FP16|6.760|6.670|1.01|
|MPPalm::DNNTestNetwork::OCV/OCL_FP16|10.188|10.171|1.00|
|MPPose::DNNTestNetwork::OCV/OCL_FP16|12.510|12.561|1.00|
|MobileNet_SSD_Caffe::DNNTestNetwork::OCV/OCL_FP16|17.290|17.072|1.01|
|MobileNet_SSD_v1_TensorFlow::DNNTestNetwork::OCV/OCL_FP16|19.473|19.306|1.01|
|MobileNet_SSD_v2_TensorFlow::DNNTestNetwork::OCV/OCL_FP16|22.874|23.404|0.98|
|OpenFace::DNNTestNetwork::OCV/OCL_FP16|9.568|9.517|1.01|
|OpenPose_pose_mpi_faster_4_stages::DNNTestNetwork::OCV/OCL_FP16|539.899|539.845|1.00|
|PPHumanSeg::DNNTestNetwork::OCV/OCL_FP16|18.015|18.769|0.96|
|PPOCRv3::DNNTestNetwork::OCV/OCL_FP16|63.122|63.540|0.99|
|ResNet_50::DNNTestNetwork::OCV/OCL_FP16|34.947|34.925|1.00|
|SFace::DNNTestNetwork::OCV/OCL_FP16|10.249|10.206|1.00|
|SSD::DNNTestNetwork::OCV/OCL_FP16|213.068|213.108|1.00|
|SqueezeNet_v1_1::DNNTestNetwork::OCV/OCL_FP16|4.867|4.878|1.00|
|VIT_B_32::DNNTestNetwork::OCV/OCL_FP16|200.563|190.788|1.05|
|VitTrack::DNNTestNetwork::OCV/OCL_FP16|7.528|7.173|1.05|
|YOLOX::DNNTestNetwork::OCV/OCL_FP16|132.858|132.701|1.00|
|YOLOv3::DNNTestNetwork::OCV/OCL_FP16|209.559|208.809|1.00|
|YOLOv4::DNNTestNetwork::OCV/OCL_FP16|221.357|220.924|1.00|
|YOLOv4_tiny::DNNTestNetwork::OCV/OCL_FP16|24.446|24.382|1.00|
|YOLOv5::DNNTestNetwork::OCV/OCL_FP16|43.922|44.080|1.00|
|YOLOv8::DNNTestNetwork::OCV/OCL_FP16|64.159|63.842|1.00|
|YuNet::DNNTestNetwork::OCV/OCL_FP16|10.177|10.231|0.99|
|opencv_face_detector::DNNTestNetwork::OCV/OCL_FP16|15.121|15.445|0.98|

Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2024-01-26 16:34:17 +03:00
Sean McBride
e64857c561
Merge pull request #23736 from seanm:c++11-simplifications
Removed all pre-C++11 code, workarounds, and branches #23736

This removes a bunch of pre-C++11 workrarounds that are no longer necessary as C++11 is now required.
It is a nice clean up and simplification.

* No longer unconditionally #include <array> in cvdef.h, include explicitly where needed
* Removed deprecated CV_NODISCARD, already unused in the codebase
* Removed some pre-C++11 workarounds, and simplified some backwards compat defines
* Removed CV_CXX_STD_ARRAY
* Removed CV_CXX_MOVE_SEMANTICS and CV_CXX_MOVE
* Removed all tests of CV_CXX11, now assume it's always true. This allowed removing a lot of dead code.
* Updated some documentation consequently.
* Removed all tests of CV_CXX11, now assume it's always true
* Fixed links.

---------

Co-authored-by: Maksim Shabunin <maksim.shabunin@gmail.com>
Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>
2024-01-19 16:53:08 +03:00
Abduragim
d30bf1bc3c added test for yolo nas 2024-01-17 13:01:43 +03:00
Alexander Smorkalov
26cf82a56c Normalize axis parameter in DNN Concat to handle negative values. 2024-01-16 12:22:22 +03:00
jimmylaw21
a7fa1e6f4b
Merge pull request #24610 from jimmylaw21:dnn-onnx-add-group-norm-layer
dnn onnx: add group norm layer #24610

dnn onnx: add group norm layer

Todo:

- [x] speed up by multi-threading
- [x] add perf
- [x] add backend: OpenVINO
- [x] add backend: CUDA
- [x] add backend: OpenCL (no fp16)
- [ ] add backend: CANN

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

Co-authored-by: fengyuentau <yuantao.feng@opencv.org.cn>
2024-01-12 15:13:26 +03:00
Yuantao Feng
e7ccff9805
Merge pull request #24834 from fengyuentau:cuda_naryeltwise_broadcast
dnn (cuda): support broadcasting if a.rank() != b.rank() #24834

Inspired by https://github.com/opencv/opencv/pull/24786. This PR keeps the fusion of `NaryEltwise` and `Concat` while addressed the data missing problem via supporting broadcasting if a.rank() != b.rank().

Resolves https://github.com/opencv/opencv/issues/23977
Resolves https://github.com/opencv/opencv/issues/24606
Resolves https://github.com/opencv/opencv/issues/24635
Resolves https://github.com/opencv/opencv/issues/24721 

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-01-11 10:04:46 +03:00
Yuantao Feng
7fb336322d
Merge pull request #24808 from fengyuentau:fix_layernorm
dnn: no layer norm fusion if axes.back() is not the axis of last dimension #24808

Merge with https://github.com/opencv/opencv_extra/pull/1137

Resolves https://github.com/opencv/opencv/issues/24797

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-01-10 13:01:00 +03:00
Yuantao Feng
c955564cb3
Merge pull request #24765 from fengyuentau:mod_operator
dnn onnx: add mod #24765

Resolves https://github.com/opencv/opencv/issues/23174

TODO:

- [x] enable some conformance tests
- [x] add backends
    - [x] CANN
    - [x] OpenVINO
    - [x] CUDA

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-01-09 19:00:17 +03:00
Abduragim Shtanchaev
3b26e183cb changed weights of yolov7 2023-12-28 23:03:47 +03:00
Yuantao Feng
f978c99523
Merge pull request #24753 from fengyuentau:einsum_importer
dnn onnx: support constaint inputs in einsum importer #24753 

Merge with https://github.com/opencv/opencv_extra/pull/1132.

Resolves https://github.com/opencv/opencv/issues/24697

Credits to @LaurentBerger.

---

This is a workaround. I suggest to get input shapes and calculate the output shapes in `getMemoryShapes` so as to keep the best compatibility. It is not always robust getting shapes during the importer stage and we should avoid that as much as possible.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-25 14:42:05 +03:00
Alexander Alekhin
f49b26182b dnn(test): skip very long debug tests, reduce test time 2023-12-25 08:44:06 +00:00
Alexander Alekhin
96b894e0e1 Merge pull request #24761 from opencv-pushbot:gitee/alalek/test_skip_update_win32 2023-12-25 08:27:30 +00:00
Alexander Alekhin
f8502d45f9 dnn(test): skip tests on 32-bit Windows 2023-12-25 07:23:45 +00:00
Alexander Smorkalov
953dddd26b
Merge pull request #24747 from asmorkalov:as/tune_vitb_cuda
Increate Vit_b test threshold a bit for CUDA FP16.
2023-12-22 17:04:46 +03:00
Dhanwanth1803
027aee8ad4
Merge pull request #24384 from Dhanwanth1803:feat-crop
Fixes #22747. Support [crop] configuration for DarkNet #24384

Request for comments. This is my first PR. 

**Merge with extra**: https://github.com/opencv/opencv_extra/pull/1112

resolves https://github.com/opencv/opencv/issues/22747

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-12-22 14:55:01 +03:00
Alexander Smorkalov
53cd921ab4 Increate Vit_b test threshold a bit for CUDA FP16. 2023-12-22 13:37:44 +03:00
Alexander Alekhin
c9bb92d58b dnn(test): tune FP16 test tolerance 2023-12-21 13:39:05 +00:00
Yuantao Feng
0521a3a384
Merge pull request #24476 from fengyuentau:attention_layer
dnn: add attention layer #24476

Resolves #24609

Merge with: https://github.com/opencv/opencv_extra/pull/1128.

Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention.

TODO:
- [x] benchmark (before this PR vs. with this PR vs. ORT).
- [x] Layer fusion: Take care Slice with end=INT64_MAX.
- [x] Layer fusion: match more potential attention (VIT) patterns.
    - [x] Single-head attention is supported.
- [x] Test AttentionSubgraph fusion.
- [x] Add acc tests for VIT_B_32 and VitTrack
- [x] Add perf tests for VIT_B_32 and VitTrack

## Benchmarks

Platform: Macbook Air M1.

### Attention Subgraph

Input scale: [1, 197, 768].

|                        | mean (ms) | median (ms) | min (ms) |
| ---------------------- | --------- | ----------- | -------- |
| w/ Attention (this PR) | 3.75      | 3.68        | 3.22     |
| w/o Attention          | 9.06      | 9.01        | 8.24     |
| ORT (python)           | 4.32      | 2.63        | 2.50     |

### ViTs

All data in millisecond (ms).

| ViTs     | With Attention | Without Attention | ORT    |
| -------- | -------------- | ----------------- | ------ |
| vit_b_16 | 302.77         | 365.35            | 109.70 |
| vit_b_32 | 89.92          | 116.22            | 30.36  |
| vit_l_16 | 1593.32        | 1730.74           | 419.92 |
| vit_l_32 | 468.11         | 577.41            | 134.12 |
| VitTrack | 3.80           | 3.87              | 2.25   |

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-20 19:35:07 +03:00
Laurent Berger
3e6dcdc0a4
Merge pull request #24539 from LaurentBerger:blobrecttoimage
Add blobrecttoimage #24539

### Pull Request Readiness Checklist

resolves https://github.com/opencv/opencv/issues/14659

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work #14659
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-19 20:00:04 +03:00
Wanli
6ae1709c6a
Merge pull request #24613 from WanliZhong:softmax_default_axis
Make default axis of softmax in onnx "-1" without opset option #24613

Try to solve problem: https://github.com/opencv/opencv/pull/24476#discussion_r1404821158

**ONNX**
`opset <= 11` use 1
`else` use -1

**TensorFlow**
`TF version = 2.x` use -1
`else` use 1

**Darknet, Caffe, Torch**
use 1 by definition
2023-12-15 10:41:42 +03:00
Abduragim Shtanchaev
d3dd2e463c
Merge pull request #24611 from Abdurrahheem:ash/add_yolov6_test
Add test for YoloX Yolo v6 and Yolo v8 #24611

This PR adds test for YOLOv6 model (which was absent before)
The onnx weights for the test are located in this PR [ #1126](https://github.com/opencv/opencv_extra/pull/1126)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-11 16:42:51 +03:00
Abduragim Shtanchaev
5278560252
Merge pull request #24569 from Abdurrahheem:ash/padding_value_fix
Add support for custom padding in DNN preprocessing #24569

This PR add functionality for specifying value in padding.
It is required in many preprocessing pipelines in DNNs such as Yolox object detection model

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-28 11:54:09 +03:00
Dmitry Kurtaev
332748dd55
Merge pull request #24577 from dkurt:dnn_graph_match_stack
Fix graph fusion with commutative ops #24577

### Pull Request Readiness Checklist

resolves https://github.com/opencv/opencv/issues/24568

**Merge with extra**: https://github.com/opencv/opencv_extra/pull/1125

TODO:
- [x]  replace recursive function to sequential

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-24 10:40:32 +03:00
skycat8
848dd12a1f
Merge pull request #24553 from skycat8:yolov5
Add yolov5n to tests #24553

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ X] I agree to contribute to the project under Apache 2 License.
- [ X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ X] The PR is proposed to the proper branch
- [ X] There is a reference to the original bug report and related work
- [ X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ X] The feature is well documented and sample code can be built with the project CMake
2023-11-24 10:36:06 +03:00
Yuantao Feng
d05fb709f9
Merge pull request #24552 from fengyuentau:layernorm_backends
dnn: add openvino, opencl and cuda backends for layer normalization layer #24552

Merge after https://github.com/opencv/opencv/pull/24544.

Todo:

- [x] openvino
- [x] opencl
- [x] cuda

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-21 15:33:01 +03:00