Commit Graph

24087 Commits

Author SHA1 Message Date
Yuantao Feng
fa5ed62a66
Merge pull request #24694 from fengyuentau:matmul_refactor
dnn: refactor ONNX MatMul with fastGemm #24694

Done:
- [x] add backends
    - [x] CUDA
    - [x] OpenVINO
    - [x] CANN
    - [x] OpenCL
    - [x] Vulkan
- [x] add perf tests
- [x] const B case

### Benchmark

Tests are done on M1. All data is in milliseconds (ms).

| Configuration | MatMul (Prepacked) | MatMul | InnerProduct |
| - | - | - | - |
| A=[12, 197, 197], B=[12, 197, 64], trans_a=0, trans_b=0 | **0.39** | 0.41 | 1.33 |
| A=[12, 197, 64], B=[12, 64, 197], trans_a=0, trans_b=0  | **0.42** | 0.42 | 1.17 |
| A=[12, 50, 64], B=[12, 64, 50], trans_a=0, trans_b=0    | **0.13** | 0.15 | 0.33 |
| A=[12, 50, 50], B=[12, 50, 64], trans_a=0, trans_b=0    | **0.11** | 0.13 | 0.22 |
| A=[16, 197, 197], B=[16, 197, 64], trans_a=0, trans_b=0 | **0.46** | 0.54 | 1.46 |
| A=[16, 197, 64], B=[16, 64, 197], trans_a=0, trans_b=0  | **0.46** | 0.95 | 1.74 |
| A=[16, 50, 64], B=[16, 64, 50], trans_a=0, trans_b=0    | **0.18** | 0.32 | 0.43 |
| A=[16, 50, 50], B=[16, 50, 64], trans_a=0, trans_b=0    | **0.15** | 0.25 | 0.25 |

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-19 19:36:41 +03:00
Alexander Smorkalov
465e601e10
Merge pull request #24713 from MaximSmolskiy:improve-icvSmoothHistogram256
Improve icvSmoothHistogram256
2023-12-19 18:39:34 +03:00
zzuliys
dfc61fbfaa
Merge pull request #24666 from zzuliys:4.x
Add support for Orbbec Gemini2 and Gemini2 XL camera #24666

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
2023-12-19 18:34:21 +03:00
Alexander Smorkalov
509c1afb8d
Merge pull request #24659 from MaximSmolskiy:fix-bug-in-ChessBoardDetector-cleanFoundConnectedQuads
Fix bug in ChessBoardDetector::cleanFoundConnectedQuads
2023-12-19 16:05:29 +03:00
MaximSmolskiy
398611b7e8 Improve icvSmoothHistogram256 2023-12-18 16:56:05 +03:00
Vincent Rabaud
915e39cdf0 Empty vectors before filling them in ChessBoardDetector::processQuads
It seems the port in https://github.com/opencv/opencv/pull/11703 lost
the initialization.
2023-12-15 14:48:14 +01:00
Alexander Smorkalov
0735d7b328
Merge pull request #24701 from dodo920306:4.x
Fix typo
2023-12-15 14:48:51 +03:00
Wanli
6ae1709c6a
Merge pull request #24613 from WanliZhong:softmax_default_axis
Make default axis of softmax in onnx "-1" without opset option #24613

Try to solve problem: https://github.com/opencv/opencv/pull/24476#discussion_r1404821158

**ONNX**
`opset <= 11` use 1
`else` use -1

**TensorFlow**
`TF version = 2.x` use -1
`else` use 1

**Darknet, Caffe, Torch**
use 1 by definition
2023-12-15 10:41:42 +03:00
Kirin Chu
fb9f75c5ba
Fix typo
Changed "shough" to "should" for better clarity.
2023-12-15 09:21:23 +08:00
Anatoliy Talamanov
9a47e1764a
Merge pull request #24068 from TolyaTalamanov:at/add-onnx-coreml-execution-provider
G-API: Support CoreML Execution Providers for ONNXRT Backend #24068

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-12-13 21:22:15 +03:00
Alexey Smirnov
14688e95ea
Merge pull request #24658 from smirnov-alexey:as/gapi_ov_get_model_layout
G-API: Get input model layout from the IR if possible in OV 2.0 backend #24658

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-13 18:40:08 +03:00
Wanli
9bbc890d96
Merge pull request #24681 from WanliZhong:err_armv8
Fixed armv8 compilation warnings #24681 

Fixes the following warning on  armv8:
```
warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
```
Buildbot: https://pullrequest.opencv.org/buildbot/builders/4_x_ARMv8-lin
2023-12-12 15:38:07 +03:00
Alexander Smorkalov
3c1423c970
Merge pull request #24685 from AleksandrPanov:fix_build_grandle
fix build.gradle
2023-12-12 09:11:22 +03:00
AleksandrPanov
9814a514fa fix build.gradle 2023-12-12 02:57:06 +03:00
Wanli
6ee71fee88
Merge pull request #24547 from WanliZhong:refactor_conv_perf_test
Classify and extend convolution and depthwise performance tests #24547

This PR aims to:
1. Extend the test cases from models: `YOLOv5`, `YOLOv8`, `EfficientNet`, `YOLOX`, `YuNet`, `SFace`, `MPPalm`, `MPHand`, `MPPose`, `ViTTrack`, `PPOCRv3`, `CRNN`, `PPHumanSeg`. (371 new test cases are added)

2. Classify the existing convolution performance test to below cases
    - CONV_1x1
    - CONV_3x3_S1_D1 (winograd)
    - CONV
    - DEPTHWISE

3. Reduce unnecessary test cases by follow 3 rules (366 test cases are pruned):
(i). For all tests, except for pad and bias related parameters, all other parameters are the same. Only one case can be reserved.
(ii). When the only difference is the channel of input shape, and other parameters are the same. Only one case can be reserved in each range `[1, 3], [4, 7], [8, 15], [16, 31], [32, 63], [64, 127], [128, 255], [256, 511], [512, 1023], [1024, 2047], [2048, 4095]`
(iii). When the only difference is the width and height of input shape, and other parameters are the same. Only one case can be reserved in each range `[1, 31], [32, 63], [64, 95]... `

> **Reproduced**: 1. follow step in https://github.com/alalek/opencv/commit/dnn_dump_conv_kernels to dump all convolution cases from new models. (declared flops may not right, need to be checked manually) 2 and 3. Use the script from python code [classify conv.txt](https://github.com/opencv/opencv/files/13522228/classify.conv.txt)


**Performance test result on Apple M2**

**Test result details**:  [M2.md](https://github.com/opencv/opencv/files/13379189/M2.md)

**Additional test result details with FP16**:  [m2_results_with_fp16.zip](https://github.com/opencv/opencv/files/13491070/m2_results_with_fp16.zip)


**Brief summary for 4.8.1 vs 4.7.0 or 4.6.0**: 
1. `CONV_1x1_S1_D1` dropped significant with small or large input shape.
2. `DEPTHWISE_5x5 ` dropped a little compared with 4.7.0. 

---

**Performance test result on [Intel Core i7-12700K](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html)**: 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads.

**Test result details**: [INTEL.md](https://github.com/opencv/opencv/files/13374093/INTEL.md)
**Brief summary for 4.8.1 vs 4.5.5**: 
1. `CONV_5x5_S1_D1` dropped significant. 
2. `CONV_1x1_S1_D1`, `CONV_3x3_S1_D1`, `DEPTHWISE_3x3_S1_D1`, `DEPTHWISW_3x3_S2_D1` dropped with small input shape.

---

TODO:
- [x] Perform tests on arm with each opencv version
- [x] Perform tests on x86 with each opencv version
- [x] Split each test classification with single test config
- [x] test enable fp16
2023-12-11 21:35:33 +03:00
Maxim Smolskiy
b1b59c87b9
Merge pull request #24605 from MaximSmolskiy:speed-up-ChessBoardDetector-findQuadNeighbors
Speed up ChessBoardDetector::findQuadNeighbors #24605

### Pull Request Readiness Checklist

Replaced brute-force algorithm with O(N^2) time complexity with kd-tree with something like O(N * log N) time complexity (maybe only in average case).

For example, on image from #23558 without quads filtering (by using `CALIB_CB_FILTER_QUADS` flag) finding chessboards corners took ~770 seconds on my laptop, of which finding quads neighbors took ~620 seconds.

Now finding chessboards corners takes ~155-160 seconds, of which finding quads neighbors takes only ~5-10 seconds.

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-12-11 19:11:58 +03:00
Abduragim Shtanchaev
d3dd2e463c
Merge pull request #24611 from Abdurrahheem:ash/add_yolov6_test
Add test for YoloX Yolo v6 and Yolo v8 #24611

This PR adds test for YOLOv6 model (which was absent before)
The onnx weights for the test are located in this PR [ #1126](https://github.com/opencv/opencv_extra/pull/1126)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-11 16:42:51 +03:00
Alexander Smorkalov
06ff35c9af
Merge pull request #24662 from asmorkalov:as/android_native_camera
Added experimental NativeCameraView class for Android 24+.
2023-12-11 16:40:57 +03:00
Alexander Smorkalov
308f00f158
Merge pull request #24634 from jubinchheda:deprecated-ios-api-patches
The AVVideoCodecJPEG symbol was deprecated in iOS 11.0. We may want to use AVVideoCodecTypeJPEG instead
2023-12-10 21:01:34 +03:00
Alexander Smorkalov
e5468a88e6 Added experimental NativeCameraView class for Android 24+. 2023-12-10 19:51:55 +03:00
Dmitry Kurtaev
ac4b26a561 Replace Slice optional inputs removal to adjustment 2023-12-08 23:29:52 +03:00
Alexander Alekhin
850ebec135 Merge pull request #24224 from AsyaPronina:asyadev/port_vas_ot_to_opencv 2023-12-08 11:41:25 +00:00
Alexander Alekhin
55923c8dc7 Merge pull request #24665 from opencv-pushbot:gitee/alalek/fix_winpack_vc14 2023-12-08 11:39:14 +00:00
Vincent Rabaud
b7348e1b65 Get code to compile without DNN 2023-12-08 10:54:59 +01:00
Alexander Smorkalov
0bf519dd05
Merge pull request #24657 from asmorkalov:as/ffmpeg_timeout_warning
Added warning, if FFmpeg pipeline was interrupted by timeout
2023-12-08 10:30:42 +03:00
Alexander Smorkalov
1cfd5acb41
Merge pull request #24640 from asmorkalov:as/android_info_lib_cleanup
Removed info lib handling in OpenCV initialization on Android
2023-12-08 08:57:38 +03:00
Alexander Alekhin
13c2320e38 cmake: use /INCREMENTAL:NO with MSVS 2015 2023-12-07 19:46:27 +00:00
MaximSmolskiy
2f0de10120 Fix bug in ChessBoardDetector::cleanFoundConnectedQuads 2023-12-06 22:46:09 +03:00
Alexander Smorkalov
ad079ea5da Added warning, if FFmpeg pipeline was interrupted by timeout. 2023-12-06 20:15:08 +03:00
Alexander Smorkalov
dc0c59fdc6
Merge pull request #24649 from asmorkalov:as/android_camera2_extact_request
Refactor JavaCamera2View to add option to override Camera2 session request options
2023-12-06 17:33:32 +03:00
Yuantao Feng
a2edf4d929
Merge pull request #24647 from fengyuentau:cuda_sub
dnn cuda: support Sub #24647

Related https://github.com/opencv/opencv/issues/24606#issuecomment-1837390257

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-06 13:46:24 +03:00
Yuantao Feng
f5ec92e4ca
Merge pull request #24655 from fengyuentau:graph_simplifier_optional_input
dnn onnx graph simplifier: handle optional inputs of Slice #24655

Resolves https://github.com/opencv/opencv/issues/24609

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-12-06 13:43:54 +03:00
Alexander Smorkalov
938951f4f0 Refactor JavaCamera2View to add option to override Camera2 session request options.
Co-authored-by: Maksim Shabunin <maksim.shabunin@gmail.com>
2023-12-06 13:25:43 +03:00
Alexander Smorkalov
22edfd2628
Merge pull request #24648 from vrabaud:compilation
Fix stereoRectify image boundaries again.
2023-12-06 11:52:06 +03:00
Alexander Smorkalov
b1441c9d6a Report resolution together with FPS in JavaCamera2View. 2023-12-05 18:46:23 +03:00
Anastasiya Pronina
e70c8e5f0e Ported VAS Object Tracking into OpenCV G-API 2023-12-05 14:58:00 +00:00
Vincent Rabaud
7f0a094e4e Fix stereoRectify image boundaries again.
This should have been fixed in https://github.com/opencv/opencv/pull/24035
2023-12-05 13:36:17 +01:00
Alexander Smorkalov
1bf4f2386a Removed info lib handling in OpenCV initialization on Android. 2023-12-04 15:00:36 +03:00
JUBIN CHHEDA
48e6be822c
The AVVideoCodecJPEG symbol was deprecated in iOS 11.0. We may want to use AVVideoCodecTypeJPEG instead 2023-12-03 06:51:40 -05:00
Tomoaki Teshima
c7ed293484 typo fix 2023-12-02 13:30:01 +09:00
Alexander Smorkalov
408730b7ab
Merge pull request #24618 from vrabaud:compilation
Fix compilation on some 32-bit windows
2023-12-01 09:10:30 +03:00
Alexander Smorkalov
21d5a41e92
Merge pull request #24599 from asmorkalov:as/android_face_detect_dnn
Migrate Android Face Detection sample to DNN.
2023-11-30 17:43:26 +03:00
Alexander Smorkalov
4cfbc5af08
Merge pull request #24625 from asmorkalov:as/mjpeg_encoder_status
Report correct open status from Bitstream
2023-11-30 16:42:01 +03:00
Alexander Smorkalov
3893936243
Merge pull request #24565 from CNClareChen:4.x
Change the lsx to baseline features.
2023-11-30 15:27:49 +03:00
Alexander Smorkalov
1db23e0f12 Report correct open status from Bitstream. 2023-11-30 15:16:27 +03:00
Alexander Smorkalov
e20250139a
Merge pull request #24582 from hanliutong:rvv-lut
Optimize the v_lut* functions for RISC-V Vector(RVV).
2023-11-30 10:59:51 +03:00
Maxim Smolskiy
10c43e5642
Merge pull request #24597 from MaximSmolskiy:fix-bug-in-ChessBoardDetector-findQuadNeighbors
Fix bug in ChessBoardDetector::findQuadNeighbors #24597

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-11-30 10:59:06 +03:00
Vincent Rabaud
0812659e92 Fix compilation on some 32-bit windows
I do not have more info on the platform as it is internal.

Without this fix, the error is:
core/src/arithm.simd.hpp:868:1: error: too few arguments provided to function-like macro invocation
  868 | DEFINE_SIMD_ALL(cmp)
      | ^
./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:93:5: note: expanded from macro 'DEFINE_SIMD_ALL'
   93 |     DEFINE_SIMD_NSAT(fun, __VA_ARGS__)
      |     ^
./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:89:5: note: expanded from macro 'DEFINE_SIMD_NSAT'
   89 |     DEFINE_SIMD_F64(fun, __VA_ARGS__)
      |     ^
./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:77:9: note: expanded from macro 'DEFINE_SIMD_F64'
   77 |         DEFINE_NOSIMD(__CV_CAT(fun, 64f), double, __VA_ARGS__)
      |         ^
./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:47:56: note: expanded from macro 'DEFINE_NOSIMD'
   47 |         DEFINE_NOSIMD_FUN(fun_name, c_type, __VA_ARGS__)
      |                                                        ^
./third_party/OpenCV/public/modules/./core/src/arithm.simd.hpp:860:9: note: macro 'DEFINE_NOSIMD_FUN' defined here
  860 | #define DEFINE_NOSIMD_FUN(fun, _T1, _Tvec, ...)     \
2023-11-29 16:27:11 +01:00
Alexander Smorkalov
5df28f1eaa
Merge pull request #24615 from smirnov-alexey:as/infer2_assert_soften
G-API: Soften the argument check in infer2
2023-11-29 17:45:02 +03:00
Anatoliy Talamanov
79797a3eb6
Merge pull request #24584 from TolyaTalamanov:at/implement-inference-only-mode-for-ov-backend
G-API: Implement inference only mode for OV backend #24584

### Changes overview

Introduced `cv::gapi::wip::ov::benchmark_mode{}` compile argument which if enabled force `OpenVINO` backend to run only inference without populating input and copying back output tensors. 

This mode is only relevant for measuring the performance of pure inference without data transfers. Similar approach is using on OpenVINO side in `benchmark_app`: https://github.com/openvinotoolkit/openvino/blob/master/samples/cpp/benchmark_app/benchmark_app.hpp#L134-L139



### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-11-29 17:40:45 +03:00