Commit Graph

2225 Commits

Author SHA1 Message Date
Alexander Smorkalov
84bb1cda4e
Merge pull request #24865 from asmorkalov:as/dnn_concat_assert
Normalize axis parameter in DNN Concat to handle negative values
2024-01-16 14:39:28 +03:00
Alexander Smorkalov
26cf82a56c Normalize axis parameter in DNN Concat to handle negative values. 2024-01-16 12:22:22 +03:00
Alexander Smorkalov
99c86bb40c
Merge pull request #24556 from plctlab:rvp
Optimization based on RISC-V P Packed SIMD Extension v0.5.2
2024-01-16 11:36:31 +03:00
Alexander Smorkalov
68dc02e302
Merge pull request #24858 from Dhanwanth1803:avx-fix
Use AVX2 overload instread on AVX in AVX2 scope
2024-01-16 09:14:31 +03:00
Dhanwanth1803
a289eba357 Fixes #24677 2024-01-13 09:56:56 +05:30
Stefan Dragnev
2791bb7062
Merge pull request #24773 from tailsu:sd/pathlike
python: accept path-like objects wherever file names are expected #24773

Merry Christmas, all 🎄

Implements #15731

Support is enabled for all arguments named `filename` or `filepath` (case-insensitive), or annotated with `CV_WRAP_FILE_PATH`.

Support is based on `PyOS_FSPath`, which is available in Python 3.6+. When running on older Python versions the arguments must have a `str` value as before.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-01-12 16:23:05 +03:00
jimmylaw21
a7fa1e6f4b
Merge pull request #24610 from jimmylaw21:dnn-onnx-add-group-norm-layer
dnn onnx: add group norm layer #24610

dnn onnx: add group norm layer

Todo:

- [x] speed up by multi-threading
- [x] add perf
- [x] add backend: OpenVINO
- [x] add backend: CUDA
- [x] add backend: OpenCL (no fp16)
- [ ] add backend: CANN

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

Co-authored-by: fengyuentau <yuantao.feng@opencv.org.cn>
2024-01-12 15:13:26 +03:00
Alexander Smorkalov
97c418ab86
Merge pull request #24840 from fengyuentau:ocl_innerproduct
dnn (opencl): integrate bias handling in the inner product opencl kernel
2024-01-12 15:10:16 +03:00
Abduragim Shtanchaev
c923c59833
Merge pull request #24812 from Abdurrahheem:ash/einsum_bachedGemm
Replace interactive batched Matrix Multiply. #24812

This PR replaces iterative batch matrix multiplication which `FastGemmBatch` in Einsum layer.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-01-12 14:23:43 +03:00
Yuantao Feng
e7ccff9805
Merge pull request #24834 from fengyuentau:cuda_naryeltwise_broadcast
dnn (cuda): support broadcasting if a.rank() != b.rank() #24834

Inspired by https://github.com/opencv/opencv/pull/24786. This PR keeps the fusion of `NaryEltwise` and `Concat` while addressed the data missing problem via supporting broadcasting if a.rank() != b.rank().

Resolves https://github.com/opencv/opencv/issues/23977
Resolves https://github.com/opencv/opencv/issues/24606
Resolves https://github.com/opencv/opencv/issues/24635
Resolves https://github.com/opencv/opencv/issues/24721 

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-01-11 10:04:46 +03:00
fengyuentau
83acb656f1 integrate bias handling in ocl kernel 2024-01-11 11:15:17 +08:00
Yuantao Feng
7fb336322d
Merge pull request #24808 from fengyuentau:fix_layernorm
dnn: no layer norm fusion if axes.back() is not the axis of last dimension #24808

Merge with https://github.com/opencv/opencv_extra/pull/1137

Resolves https://github.com/opencv/opencv/issues/24797

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-01-10 13:01:00 +03:00
Yuantao Feng
c955564cb3
Merge pull request #24765 from fengyuentau:mod_operator
dnn onnx: add mod #24765

Resolves https://github.com/opencv/opencv/issues/23174

TODO:

- [x] enable some conformance tests
- [x] add backends
    - [x] CANN
    - [x] OpenVINO
    - [x] CUDA

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-01-09 19:00:17 +03:00
Abduragim Shtanchaev
3b26e183cb changed weights of yolov7 2023-12-28 23:03:47 +03:00
cudawarped
7d681cf80d build: first class cuda support 2023-12-26 09:39:18 +03:00
Alexander Smorkalov
62f1a7410d
Merge pull request #24766 from asmorkalov:update_version_4.9.0-pre
pre: OpenCV 4.9.0 (version++)
2023-12-25 16:04:53 +03:00
Alexander Smorkalov
b407c58b96 pre: OpenCV 4.9.0 (version++). 2023-12-25 15:20:10 +03:00
Yuantao Feng
f978c99523
Merge pull request #24753 from fengyuentau:einsum_importer
dnn onnx: support constaint inputs in einsum importer #24753 

Merge with https://github.com/opencv/opencv_extra/pull/1132.

Resolves https://github.com/opencv/opencv/issues/24697

Credits to @LaurentBerger.

---

This is a workaround. I suggest to get input shapes and calculate the output shapes in `getMemoryShapes` so as to keep the best compatibility. It is not always robust getting shapes during the importer stage and we should avoid that as much as possible.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-25 14:42:05 +03:00
Alexander Alekhin
f49b26182b dnn(test): skip very long debug tests, reduce test time 2023-12-25 08:44:06 +00:00
Alexander Alekhin
96b894e0e1 Merge pull request #24761 from opencv-pushbot:gitee/alalek/test_skip_update_win32 2023-12-25 08:27:30 +00:00
Alexander Alekhin
f8502d45f9 dnn(test): skip tests on 32-bit Windows 2023-12-25 07:23:45 +00:00
Alexander Smorkalov
953dddd26b
Merge pull request #24747 from asmorkalov:as/tune_vitb_cuda
Increate Vit_b test threshold a bit for CUDA FP16.
2023-12-22 17:04:46 +03:00
Dmitry Kurtaev
938bc4d503 [CUDA] Hotfix Scale with 1 parameter 2023-12-22 15:49:27 +03:00
Dhanwanth1803
027aee8ad4
Merge pull request #24384 from Dhanwanth1803:feat-crop
Fixes #22747. Support [crop] configuration for DarkNet #24384

Request for comments. This is my first PR. 

**Merge with extra**: https://github.com/opencv/opencv_extra/pull/1112

resolves https://github.com/opencv/opencv/issues/22747

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-12-22 14:55:01 +03:00
Alexander Smorkalov
53cd921ab4 Increate Vit_b test threshold a bit for CUDA FP16. 2023-12-22 13:37:44 +03:00
Vadim Pisarevsky
853e5dfcdf
Merge pull request #24709 from vpisarev:winograd_mode
Try to enable Winograd by default in FP32 mode and disable it by default in FP16 mode #24709

Hopefully, it will resolve regressions since 4.8.1 (see also https://github.com/opencv/opencv/pull/24587)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-12-22 09:22:31 +03:00
Alexander Smorkalov
35e2ef8019
Merge pull request #24740 from opencv-pushbot:gitee/alalek/ocl_fix_kernel_compilation
ocl: fix kernels compilation
2023-12-22 09:20:47 +03:00
Alexander Smorkalov
f5d8245801
Merge pull request #24736 from opencv-pushbot:gitee/alalek/issue_24734
dnn(ocl): don't try KERNEL_TYPE_GEMM_LIKE with kernel_w > 16
2023-12-21 20:01:01 +03:00
Alexander Alekhin
3340c71a2a ocl: fix kernels compilation 2023-12-21 14:29:23 +00:00
Alexander Alekhin
c9bb92d58b dnn(test): tune FP16 test tolerance 2023-12-21 13:39:05 +00:00
Alexander Alekhin
99c94d3d83 dnn(ocl): don't try KERNEL_TYPE_GEMM_LIKE with kernel_w > 16
- OpenCL kernel code doesn't support that
2023-12-21 13:30:57 +00:00
llh721113
a30c987f87 feat: RVP052 Optimization for DNN int8layers 2023-12-21 14:51:41 +08:00
Yuantao Feng
0521a3a384
Merge pull request #24476 from fengyuentau:attention_layer
dnn: add attention layer #24476

Resolves #24609

Merge with: https://github.com/opencv/opencv_extra/pull/1128.

Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention.

TODO:
- [x] benchmark (before this PR vs. with this PR vs. ORT).
- [x] Layer fusion: Take care Slice with end=INT64_MAX.
- [x] Layer fusion: match more potential attention (VIT) patterns.
    - [x] Single-head attention is supported.
- [x] Test AttentionSubgraph fusion.
- [x] Add acc tests for VIT_B_32 and VitTrack
- [x] Add perf tests for VIT_B_32 and VitTrack

## Benchmarks

Platform: Macbook Air M1.

### Attention Subgraph

Input scale: [1, 197, 768].

|                        | mean (ms) | median (ms) | min (ms) |
| ---------------------- | --------- | ----------- | -------- |
| w/ Attention (this PR) | 3.75      | 3.68        | 3.22     |
| w/o Attention          | 9.06      | 9.01        | 8.24     |
| ORT (python)           | 4.32      | 2.63        | 2.50     |

### ViTs

All data in millisecond (ms).

| ViTs     | With Attention | Without Attention | ORT    |
| -------- | -------------- | ----------------- | ------ |
| vit_b_16 | 302.77         | 365.35            | 109.70 |
| vit_b_32 | 89.92          | 116.22            | 30.36  |
| vit_l_16 | 1593.32        | 1730.74           | 419.92 |
| vit_l_32 | 468.11         | 577.41            | 134.12 |
| VitTrack | 3.80           | 3.87              | 2.25   |

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-20 19:35:07 +03:00
Laurent Berger
3e6dcdc0a4
Merge pull request #24539 from LaurentBerger:blobrecttoimage
Add blobrecttoimage #24539

### Pull Request Readiness Checklist

resolves https://github.com/opencv/opencv/issues/14659

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work #14659
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-19 20:00:04 +03:00
Yuantao Feng
fa5ed62a66
Merge pull request #24694 from fengyuentau:matmul_refactor
dnn: refactor ONNX MatMul with fastGemm #24694

Done:
- [x] add backends
    - [x] CUDA
    - [x] OpenVINO
    - [x] CANN
    - [x] OpenCL
    - [x] Vulkan
- [x] add perf tests
- [x] const B case

### Benchmark

Tests are done on M1. All data is in milliseconds (ms).

| Configuration | MatMul (Prepacked) | MatMul | InnerProduct |
| - | - | - | - |
| A=[12, 197, 197], B=[12, 197, 64], trans_a=0, trans_b=0 | **0.39** | 0.41 | 1.33 |
| A=[12, 197, 64], B=[12, 64, 197], trans_a=0, trans_b=0  | **0.42** | 0.42 | 1.17 |
| A=[12, 50, 64], B=[12, 64, 50], trans_a=0, trans_b=0    | **0.13** | 0.15 | 0.33 |
| A=[12, 50, 50], B=[12, 50, 64], trans_a=0, trans_b=0    | **0.11** | 0.13 | 0.22 |
| A=[16, 197, 197], B=[16, 197, 64], trans_a=0, trans_b=0 | **0.46** | 0.54 | 1.46 |
| A=[16, 197, 64], B=[16, 64, 197], trans_a=0, trans_b=0  | **0.46** | 0.95 | 1.74 |
| A=[16, 50, 64], B=[16, 64, 50], trans_a=0, trans_b=0    | **0.18** | 0.32 | 0.43 |
| A=[16, 50, 50], B=[16, 50, 64], trans_a=0, trans_b=0    | **0.15** | 0.25 | 0.25 |

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-19 19:36:41 +03:00
Wanli
6ae1709c6a
Merge pull request #24613 from WanliZhong:softmax_default_axis
Make default axis of softmax in onnx "-1" without opset option #24613

Try to solve problem: https://github.com/opencv/opencv/pull/24476#discussion_r1404821158

**ONNX**
`opset <= 11` use 1
`else` use -1

**TensorFlow**
`TF version = 2.x` use -1
`else` use 1

**Darknet, Caffe, Torch**
use 1 by definition
2023-12-15 10:41:42 +03:00
Wanli
9bbc890d96
Merge pull request #24681 from WanliZhong:err_armv8
Fixed armv8 compilation warnings #24681 

Fixes the following warning on  armv8:
```
warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
```
Buildbot: https://pullrequest.opencv.org/buildbot/builders/4_x_ARMv8-lin
2023-12-12 15:38:07 +03:00
Wanli
6ee71fee88
Merge pull request #24547 from WanliZhong:refactor_conv_perf_test
Classify and extend convolution and depthwise performance tests #24547

This PR aims to:
1. Extend the test cases from models: `YOLOv5`, `YOLOv8`, `EfficientNet`, `YOLOX`, `YuNet`, `SFace`, `MPPalm`, `MPHand`, `MPPose`, `ViTTrack`, `PPOCRv3`, `CRNN`, `PPHumanSeg`. (371 new test cases are added)

2. Classify the existing convolution performance test to below cases
    - CONV_1x1
    - CONV_3x3_S1_D1 (winograd)
    - CONV
    - DEPTHWISE

3. Reduce unnecessary test cases by follow 3 rules (366 test cases are pruned):
(i). For all tests, except for pad and bias related parameters, all other parameters are the same. Only one case can be reserved.
(ii). When the only difference is the channel of input shape, and other parameters are the same. Only one case can be reserved in each range `[1, 3], [4, 7], [8, 15], [16, 31], [32, 63], [64, 127], [128, 255], [256, 511], [512, 1023], [1024, 2047], [2048, 4095]`
(iii). When the only difference is the width and height of input shape, and other parameters are the same. Only one case can be reserved in each range `[1, 31], [32, 63], [64, 95]... `

> **Reproduced**: 1. follow step in https://github.com/alalek/opencv/commit/dnn_dump_conv_kernels to dump all convolution cases from new models. (declared flops may not right, need to be checked manually) 2 and 3. Use the script from python code [classify conv.txt](https://github.com/opencv/opencv/files/13522228/classify.conv.txt)


**Performance test result on Apple M2**

**Test result details**:  [M2.md](https://github.com/opencv/opencv/files/13379189/M2.md)

**Additional test result details with FP16**:  [m2_results_with_fp16.zip](https://github.com/opencv/opencv/files/13491070/m2_results_with_fp16.zip)


**Brief summary for 4.8.1 vs 4.7.0 or 4.6.0**: 
1. `CONV_1x1_S1_D1` dropped significant with small or large input shape.
2. `DEPTHWISE_5x5 ` dropped a little compared with 4.7.0. 

---

**Performance test result on [Intel Core i7-12700K](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html)**: 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads.

**Test result details**: [INTEL.md](https://github.com/opencv/opencv/files/13374093/INTEL.md)
**Brief summary for 4.8.1 vs 4.5.5**: 
1. `CONV_5x5_S1_D1` dropped significant. 
2. `CONV_1x1_S1_D1`, `CONV_3x3_S1_D1`, `DEPTHWISE_3x3_S1_D1`, `DEPTHWISW_3x3_S2_D1` dropped with small input shape.

---

TODO:
- [x] Perform tests on arm with each opencv version
- [x] Perform tests on x86 with each opencv version
- [x] Split each test classification with single test config
- [x] test enable fp16
2023-12-11 21:35:33 +03:00
Abduragim Shtanchaev
d3dd2e463c
Merge pull request #24611 from Abdurrahheem:ash/add_yolov6_test
Add test for YoloX Yolo v6 and Yolo v8 #24611

This PR adds test for YOLOv6 model (which was absent before)
The onnx weights for the test are located in this PR [ #1126](https://github.com/opencv/opencv_extra/pull/1126)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-11 16:42:51 +03:00
Dmitry Kurtaev
ac4b26a561 Replace Slice optional inputs removal to adjustment 2023-12-08 23:29:52 +03:00
Yuantao Feng
a2edf4d929
Merge pull request #24647 from fengyuentau:cuda_sub
dnn cuda: support Sub #24647

Related https://github.com/opencv/opencv/issues/24606#issuecomment-1837390257

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-06 13:46:24 +03:00
Yuantao Feng
f5ec92e4ca
Merge pull request #24655 from fengyuentau:graph_simplifier_optional_input
dnn onnx graph simplifier: handle optional inputs of Slice #24655

Resolves https://github.com/opencv/opencv/issues/24609

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-06 13:43:54 +03:00
Alexander Smorkalov
7b1a5fb3de Migrate Android Face Detection sample to DNN. 2023-11-29 11:02:44 +03:00
Abduragim Shtanchaev
5278560252
Merge pull request #24569 from Abdurrahheem:ash/padding_value_fix
Add support for custom padding in DNN preprocessing #24569

This PR add functionality for specifying value in padding.
It is required in many preprocessing pipelines in DNNs such as Yolox object detection model

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-28 11:54:09 +03:00
Dmitry Kurtaev
332748dd55
Merge pull request #24577 from dkurt:dnn_graph_match_stack
Fix graph fusion with commutative ops #24577

### Pull Request Readiness Checklist

resolves https://github.com/opencv/opencv/issues/24568

**Merge with extra**: https://github.com/opencv/opencv_extra/pull/1125

TODO:
- [x]  replace recursive function to sequential

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-24 10:40:32 +03:00
skycat8
848dd12a1f
Merge pull request #24553 from skycat8:yolov5
Add yolov5n to tests #24553

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ X] I agree to contribute to the project under Apache 2 License.
- [ X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ X] The PR is proposed to the proper branch
- [ X] There is a reference to the original bug report and related work
- [ X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ X] The feature is well documented and sample code can be built with the project CMake
2023-11-24 10:36:06 +03:00
Yuantao Feng
d05fb709f9
Merge pull request #24552 from fengyuentau:layernorm_backends
dnn: add openvino, opencl and cuda backends for layer normalization layer #24552

Merge after https://github.com/opencv/opencv/pull/24544.

Todo:

- [x] openvino
- [x] opencl
- [x] cuda

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-21 15:33:01 +03:00
zihaomu
b913e73d04
DNN: add the Winograd fp16 support (#23654)
* add Winograd FP16 implementation

* fixed dispatching of FP16 code paths in dnn; use dynamic dispatcher only when NEON_FP16 is enabled in the build and the feature is present in the host CPU at runtime

* fixed some warnings

* hopefully fixed winograd on x64 (and maybe other platforms)

---------

Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
2023-11-20 13:45:37 +03:00
Yuantao Feng
a478757483
Merge pull request #24544 from fengyuentau:layernorm_conformance
dnn test: move layer norm tests into conformance tests #24544

Merge with https://github.com/opencv/opencv_extra/pull/1122

## Motivation

Some ONNX operators, such as `LayerNormalization`, `BatchNormalization` and so on, produce outputs for training (mean, stdev). So they have reference outputs of conformance tests for those training outputs as well. However, when it comes to inference, we do not need and produce those outputs for training here in dnn. Hence, output size does not match if we use dnn to infer those conformance models. This has become the barrier if we want to test these operators using their conformance tests.

<!--
| Operator                | Inference needed                    | Outputs (required - total) | Optional outputs for training? |
| ----------------------- | ----------------------------------- | -------------------------- | ------------------------------ |
| BatchNormalization      | Yes                                 | 1 - 3                      | Yes                            |
| Dropout                 | Maybe, can be eliminated via fusion | 1 - 2                      | Yes                            |
| GRU                     | Yes                                 | 0 - 2                      | No                             |
| LSTM                    | Yes                                 | 0 - 3                      | No                             |
| LayerNormalization      | Yes                                 | 1 - 3                      | Yes                            |
| MaxPool                 | Yes                                 | 1 - 2                      | Yes                            |
| RNN                     | Yes                                 | 0 - 2                      | No                             |
| SoftmaxCrossEntropyLoss | No                                  | 1 - 2                      | --                             |
-->

**I checked all ONNX operators with optional outputs. Turns out there are only `BatchNormalization`, `Dropout`, `LayerNormalization` and `MaxPool` has optional outputs for training. All except `LayerNormalization` have models set for training mode and eval mode. Blame ONNX for that.**

## Solution

In this pull request, we remove graph outputs if the graph looks like the following:

```
    [X]   [Scale]  [Bias]                      [X]   [Scale]  [Bias]
      \      |      /         this patch         \      |      /
     LayerNormalization      ----------->       LayerNormalization
      /      |      \                                   |
    [Y]    [Mean]  [Stdev]                             [Y]
```

We can update conformance tests and turn on some cases as well if extending to more layers.

Notes:
1. This workaround does not solve expanded function operators if they are fused into a single operator, such as `$onnx/onnx/backend/test/data/node/test_layer_normalization_2d_axis1_expanded`, but they can be run without fusion. Note that either dnn or onnxruntime does not fuse those expanded function operators.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-20 11:19:24 +03:00
Abduragim Shtanchaev
8c10545d3c
Merge pull request #24509 from Abdurrahheem:ash/dev_einsum_fast_gemm
Fast gemm for einsum #24509

## This PR adds performance tests for Einsum Layer with FastGemm. See below results of performance test on different inputs

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-16 16:20:17 +03:00