Change opencv_face_detector related tests and samples from caffe to onnx #25463
Part of https://github.com/opencv/opencv/issues/25314
This PR aims to change the tests related to opencv_face_detector from caffe framework to onnx. Tests in `test_int8_layer.cpp` and `test_caffe_importer.cpp` will be removed in https://github.com/opencv/opencv/pull/25323
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Added int32, int64 support and type inference to dnn #24411
**Added a type inference to dnn similar to the shape inference, added int32 and int64 support.**
- Added getTypes method for layers that calculates layer outputs types and internals types from inputs types (Similar to getMemoryShapes). By default outputs and internals types = input[0] type
- Added type inference pipeline similar to shape inference pipeline. LayersShapes struct (that is used in shape inference pipeline) now contains both shapes and types
- All layers output blobs are now allocated using the calculated types from the type inference.
- Inputs and constants with int32 and int64 types are not automatically converted into float32 now.
- Added int32 and int64 support for all the layers with indexing and for all the layers required in tests.
Added int32 and int64 support for CUDA:
- Added host<->device data moving for int32 and int64
- Added int32 and int64 support for several layers (just slightly modified CUDA C++ templates)
Passed all the accuracy tests on CPU, OCL, OCL_FP16, CUDA, CUDA_FP16. (except RAFT model)
**CURRENT PROBLEMS**:
- ONNX parser always converts int64 constants and layers attributes to int32, so some models with int64 constants doesn't work (e.g. RAFT). The solution is to disable int64->int32 conversion and fix attributes reading in a lot of ONNX layers parsers (https://github.com/opencv/opencv/issues/25102)
- I didn't add type inference and int support to VULCAN, so it doesn't work at all now.
- Some layers don't support int yet, so some unknown models may not work.
**CURRENT WORKAROUNDS**:
- CPU arg_layer indides are implemented in int32 followed by a int32->int64 conversion (the master branch has the same workaround with int32->float conversion)
- CPU and OCL pooling_layer indices are implemented in float followed by a float->int64 conversion
- CPU gather_layer indices are implemented in int32, so int64 indices are converted to int32 (the master branch has the same workaround with float->int32 conversion)
**DISABLED TESTS**:
- RAFT model
**REMOVED TESTS**:
- Greater_input_dtype_int64 (because it doesn't fit ONNX rules, the whole test is just comparing float tensor with int constant)
**TODO IN NEXT PULL REQUESTS**:
- Add int64 support for ONNX parser
- Add int support for more layers
- Add int support for OCL (currently int layers just run on CPU)
- Add int tests
- Add int support for other backends
dnn: add attention layer #24476Resolves#24609
Merge with: https://github.com/opencv/opencv_extra/pull/1128.
Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention.
TODO:
- [x] benchmark (before this PR vs. with this PR vs. ORT).
- [x] Layer fusion: Take care Slice with end=INT64_MAX.
- [x] Layer fusion: match more potential attention (VIT) patterns.
- [x] Single-head attention is supported.
- [x] Test AttentionSubgraph fusion.
- [x] Add acc tests for VIT_B_32 and VitTrack
- [x] Add perf tests for VIT_B_32 and VitTrack
## Benchmarks
Platform: Macbook Air M1.
### Attention Subgraph
Input scale: [1, 197, 768].
| | mean (ms) | median (ms) | min (ms) |
| ---------------------- | --------- | ----------- | -------- |
| w/ Attention (this PR) | 3.75 | 3.68 | 3.22 |
| w/o Attention | 9.06 | 9.01 | 8.24 |
| ORT (python) | 4.32 | 2.63 | 2.50 |
### ViTs
All data in millisecond (ms).
| ViTs | With Attention | Without Attention | ORT |
| -------- | -------------- | ----------------- | ------ |
| vit_b_16 | 302.77 | 365.35 | 109.70 |
| vit_b_32 | 89.92 | 116.22 | 30.36 |
| vit_l_16 | 1593.32 | 1730.74 | 419.92 |
| vit_l_32 | 468.11 | 577.41 | 134.12 |
| VitTrack | 3.80 | 3.87 | 2.25 |
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Remove torch (old torch7) from dnn in 5.x #24294
Merge with https://github.com/opencv/opencv_extra/pull/1097
Completely removed torch (old torch7) from dnn:
- removed modules/dnn/src/torch directory that contained torch7 model parser
- removed readNetFromTorch() and readTorchBlob() public functions
- removed torch7 references from comments and help texts
- replaced links to t7 models by links to similar onnx models in js_style_transfer turtorial (similar to https://github.com/opencv/opencv/pull/24245/files)
dnn: cleanup of halide backend for 5.x #24231
Merge with https://github.com/opencv/opencv_extra/pull/1092.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
OpenVINO backend for INT8 models #23987
### Pull Request Readiness Checklist
TODO:
- [x] DetectionOutput layer (https://github.com/opencv/opencv/pull/24069)
- [x] Less FP32 fallbacks (i.e. Sigmoid, eltwise sum)
- [x] Accuracy, performance tests (https://github.com/opencv/opencv/pull/24039)
- [x] Single layer tests (convolution)
- [x] ~~Fixes for OpenVINO 2022.1 (https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100334)~~
Performace results for object detection model `coco_efficientdet_lite0_v1_1.0_quant_2021_09_06.tflite`:
| backend | performance (median time) |
|---|---|
| OpenCV | 77.42ms |
| OpenVINO 2023.0 | 10.90ms |
CPU: `11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz`
Serialized model per-layer stats (note that Convolution should use `*_I8` primitives if they are quantized correctly): https://gist.github.com/dkurt/7772bbf1907035441bb5454f19f0feef
---
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
OCL_FP16 MatMul with large batch
* Workaround FP16 MatMul with large batch
* Fix OCL reinitialization
* Higher thresholds for INT8 quantization
* Try fix gemm_buffer_NT for half (columns)
* Fix GEMM by rows
* Add batch dimension to InnerProduct layer test
* Fix Test_ONNX_conformance.Layer_Test/test_basic_conv_with_padding
* Batch 16
* Replace all vload4
* Version suffix for MobileNetSSD_deploy Caffe model
DNN: reduce the memory used in convolution layer
* reduce the memory in winograd and disabel the test when usage memory is larger than 2gb.
* remove VERY_LOG tag
* Fix precision in tests for MyriadX
* Fix ONNX tests
* Add output range in ONNX tests
* Skip tests on Myriad OpenVINO 2018R5
* Add detect MyriadX
* Add detect MyriadX on OpenVINO R5
* Skip tests on Myriad next version of OpenVINO
* dnn(ie): VPU type from environment variable
* dnn(test): validate VPU type
* dnn(test): update DLIE test skip conditions
* Added ResizeBilinear op for tf
Combined ResizeNearestNeighbor and ResizeBilinear layers into Resize (with an interpolation param).
Minor changes to tf_importer and resize layer to save some code lines
Minor changes in init.cpp
Minor changes in tf_importer.cpp
* Replaced implementation of a custom ResizeBilinear layer to all layers
* Use Mat::ptr. Replace interpolation flags