Commit Graph

23936 Commits

Author SHA1 Message Date
刘佩其
631f229827 Fix the issue of missing imshow icons when linking OpenCV as a static library (https://github.com/opencv/opencv-python/issues/585) 2023-10-07 11:18:32 +08:00
Sean McBride
5fb3869775
Merge pull request #23109 from seanm:misc-warnings
* Fixed clang -Wnewline-eof warnings
* Fixed all trivial clang -Wextra-semi and -Wc++98-compat-extra-semi warnings
* Removed trailing semi from various macros
* Fixed various -Wunused-macros warnings
* Fixed some trivial -Wdocumentation warnings
* Fixed some -Wdocumentation-deprecated-sync warnings
* Fixed incorrect indentation
* Suppressed some clang warnings in 3rd party code
* Fixed QRCodeEncoder::Params documentation.

---------

Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>
2023-10-06 13:33:21 +03:00
jvuillaumier
24fd39538e
Merge pull request #24233 from jvuillaumier:rotate_flip_hal_hooks
Add HAL implementation hooks to cv::flip() and cv::rotate() functions from core module #24233

Hello,

This change proposes the addition of HAL hooks for cv::flip() and cv::rotate() functions from OpenCV core module.
Flip and rotation are functions commonly available from 2D hardware accelerators. This is convenient provision to enable custom optimized implementation of image flip/rotation on systems embedding such accelerator.

Thank you

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-10-06 12:31:53 +03:00
HAN Liutong
07bf9cb013
Merge pull request #24325 from hanliutong:rewrite
Rewrite Universal Intrinsic code: float related part #24325

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

The series of PRs is listed below:
#23885 First patch, an example
#23980 Core module
#24058 ImgProc module, part 1
#24132 ImgProc module, part 2
#24166 ImgProc module, part 3
#24301 Features2d and calib3d module
#24324 Gapi module

This patch (hopefully) is the last one in the series. 

This patch mainly involves 3 parts
1. Add some modifications related to float (CV_SIMD_64F)
2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, 
    then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD`
3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments
    - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....`
    - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here)
      - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct)
      - ./modules/imgproc/src/color_lab.cpp (Array of vector type)
      - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type)
      - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`)
      These algorithms will need to be redesigned to accommodate scalable backends.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-10-05 17:57:25 +03:00
Dmitry Kurtaev
2c92eb3175 Enable more tests for OpenVINO 2023.0 2023-10-05 12:51:55 +03:00
Alexander Smorkalov
7b6d65cf20
Merge pull request #24337 from mshabunin:bump-ade-012c
3rdparty: update ade version
2023-10-04 14:59:16 +03:00
Wanli
62b5470b78
Merge pull request #24298 from WanliZhong:extend_perf_net_test
Extend performance test models #24298

**Merged With https://github.com/opencv/opencv_extra/pull/1095**

This PR aims to extend the performance tests. 

- **YOLOv5** for object detection
- **YOLOv8** for object detection
- **EfficientNet** for classification

Models from OpenCV Zoo:

- **YOLOX** for object detection
- **YuNet** for face detection
- **SFace** for face recognization
- **MPPalm** for palm detection
- **MPHand** for hand landmark
- **MPPose** for pose estimation
- **ViTTrack** for object tracking
- **PPOCRv3** for text detection
- **CRNN** for text recognization
- **PPHumanSeg** for human segmentation

If other models should be added, **please leave some comments**. Thanks!



Build opencv with script:
```shell
-DBUILD_opencv_python2=OFF
-DBUILD_opencv_python3=OFF
-DBUILD_opencv_gapi=OFF
-DINSTALL_PYTHON_EXAMPLES=OFF
-DINSTALL_C_EXAMPLES=OFF
-DBUILD_DOCS=OFF
-DBUILD_EXAMPLES=OFF
-DBUILD_ZLIB=OFF
-DWITH_FFMPEG=OFF
```



Performance Test on **Apple M2 CPU**
```shell
MacOS 14.0
8 threads
```

**1 thread:**
| Name of Test | 4.5.5-1th | 4.6.0-1th | 4.7.0-1th | 4.8.0-1th | 4.8.1-1th |
|--------------|:---------:|:---------:|:---------:|:---------:|:---------:|
| CRNN         |  76.244   |  76.611   |  62.534   |  57.678   |  57.238   |
| EfficientNet |    ---    |    ---    |  109.224  |  130.753  |  109.076  |
| MPHand       |    ---    |    ---    |  19.289   |  22.727   |  27.593   |
| MPPalm       |  47.150   |  47.061   |  41.064   |  65.598   |  40.109   |
| MPPose       |    ---    |    ---    |  26.592   |  32.022   |  26.956   |
| PPHumanSeg   |  41.672   |  41.790   |  27.819   |  27.212   |  30.461   |
| PPOCRv3      |    ---    |    ---    |  140.371  |  187.922  |  170.026  |
| SFace        |  43.830   |  43.834   |  27.575   |  30.653   |  26.387   |
| ViTTrack     |    ---    |    ---    |    ---    |  14.617   |  15.028   |
| YOLOX        | 1060.507  | 1061.361  |  495.816  |  533.309  |  549.713  |
| YOLOv5       |    ---    |    ---    |    ---    |  191.350  |  193.261  |
| YOLOv8       |    ---    |    ---    |  198.893  |  218.733  |  223.142  |
| YuNet        |  27.084   |  27.095   |  26.238   |  30.512   |  34.439   |
| MobileNet_SSD_Caffe         |  44.742   |  44.565   |  33.005   |  29.421   |  29.286   |
| MobileNet_SSD_v1_TensorFlow |  49.352   |  49.274   |  35.163   |  32.134   |  31.904   |
| MobileNet_SSD_v2_TensorFlow |  83.537   |  83.379   |  56.403   |  42.947   |  42.148   |
| ResNet_50                   |  148.872  |  148.817  |  77.331   |  67.682   |  67.760   |


**n threads:**
| Name of Test | 4.5.5-nth | 4.6.0-nth | 4.7.0-nth | 4.8.0-nth | 4.8.1-nth |
|--------------|:---------:|:---------:|:---------:|:---------:|:---------:|
| CRNN         |  44.262   |  44.408   |  41.540   |  40.731   |  41.151   |
| EfficientNet |    ---    |    ---    |  28.683   |  42.676   |  38.204   |
| MPHand       |    ---    |    ---    |   6.738   |  13.126   |   8.155   |
| MPPalm       |  16.613   |  16.588   |  12.477   |  31.370   |  17.048   |
| MPPose       |    ---    |    ---    |  12.985   |  19.700   |  16.537   |
| PPHumanSeg   |  14.993   |  15.133   |  13.438   |  15.269   |  15.252   |
| PPOCRv3      |    ---    |    ---    |  63.752   |  85.469   |  76.190   |
| SFace        |  10.685   |  10.822   |   8.127   |   8.318   |   7.934   |
| ViTTrack     |    ---    |    ---    |    ---    |  10.079   |   9.579   |
| YOLOX        |  417.358  |  422.977  |  230.036  |  234.662  |  228.555  |
| YOLOv5       |    ---    |    ---    |    ---    |  74.249   |  75.480   |
| YOLOv8       |    ---    |    ---    |  63.762   |  88.770   |  70.927   |
| YuNet        |   8.589   |   8.731   |  11.269   |  16.466   |  14.513   |
| MobileNet_SSD_Caffe         |  12.575   |  12.636   |  11.529   |  12.114   |  12.236   |
| MobileNet_SSD_v1_TensorFlow |  13.922   |  14.160   |  13.078   |  12.124   |  13.298   |
| MobileNet_SSD_v2_TensorFlow |  25.096   |  24.836   |  22.823   |  20.238   |  20.319   |
| ResNet_50                   |  41.561   |  41.296   |  29.092   |  30.412   |  29.339   |


Performance Test on [Intel Core i7-12700K](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html)
```shell
Ubuntu 22.04.2 LTS
8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz)
4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz)
20 threads
```


**1 thread:**
| Name of Test | 4.5.5-1th | 4.6.0-1th | 4.7.0-1th | 4.8.0-1th | 4.8.1-1th |
|--------------|:---------:|:---------:|:---------:|:---------:|:---------:|
| CRNN         |  16.752   |  16.851   |  16.840   |  16.625   |  16.663   |
| EfficientNet |    ---    |    ---    |  61.107   |  76.037   |  53.890   |
| MPHand       |    ---    |    ---    |   8.906   |   9.969   |   8.403   |
| MPPalm       |  24.243   |  24.638   |  18.104   |  35.140   |  18.387   |
| MPPose       |    ---    |    ---    |  12.322   |  16.515   |  12.355   |
| PPHumanSeg   |  15.249   |  15.303   |  10.203   |  10.298   |  10.353   |
| PPOCRv3      |    ---    |    ---    |  87.788   |  144.253  |  90.648   |
| SFace        |  15.583   |  15.884   |  13.957   |  13.298   |  13.284   |
| ViTTrack     |    ---    |    ---    |    ---    |  11.760   |  11.710   |
| YOLOX        |  324.927  |  325.173  |  235.986  |  253.653  |  254.472  |
| YOLOv5       |    ---    |    ---    |    ---    |  102.163  |  102.621  |
| YOLOv8       |    ---    |    ---    |  87.013   |  103.182  |  103.146  |
| YuNet        |  12.806   |  12.645   |  10.515   |  12.647   |  12.711   |
| MobileNet_SSD_Caffe         |  23.556   |  23.768   |  24.304   |  22.569   |  22.602   |
| MobileNet_SSD_v1_TensorFlow |  26.136   |  26.276   |  26.854   |  24.828   |  24.961   |
| MobileNet_SSD_v2_TensorFlow |  43.521   |  43.614   |  46.892   |  44.044   |  44.682   |
| ResNet_50                   |  73.588   |  73.501   |  75.191   |  66.893   |  65.144   |


**n thread:**
| Name of Test | 4.5.5-nth | 4.6.0-nth | 4.7.0-nth | 4.8.0-nth | 4.8.1-nth | 
|--------------|:---------:|:---------:|:---------:|:---------:|:---------:|
| CRNN         |   8.665   |   8.827   |  10.643   |   7.703   |   7.743   | 
| EfficientNet |    ---    |    ---    |  16.591   |  12.715   |   9.022   |   
| MPHand       |    ---    |    ---    |   2.678   |   2.785   |   1.680   |           
| MPPalm       |   5.309   |   5.319   |   3.822   |  10.568   |   4.467   |       
| MPPose       |    ---    |    ---    |   3.644   |   6.088   |   4.608   |        
| PPHumanSeg   |   4.756   |   4.865   |   5.084   |   5.179   |   5.148   |        
| PPOCRv3      |    ---    |    ---    |  32.023   |  50.591   |  32.414   |      
| SFace        |   3.838   |   3.980   |   4.629   |   3.145   |   3.155   |       
| ViTTrack     |    ---    |    ---    |    ---    |  10.335   |  10.357   |   
| YOLOX        |  68.314   |  68.081   |  82.801   |  74.219   |  73.970   |      
| YOLOv5       |    ---    |    ---    |    ---    |  47.150   |  47.523   |    
| YOLOv8       |    ---    |    ---    |  32.195   |  30.359   |  30.267   |    
| YuNet        |   2.604   |   2.644   |   2.622   |   3.278   |   3.349   |    
| MobileNet_SSD_Caffe         |  13.005   |   5.935   |   8.586   |   4.629   |   4.713   |
| MobileNet_SSD_v1_TensorFlow |   7.002   |   7.129   |   9.314   |   5.271   |   5.213   |
| MobileNet_SSD_v2_TensorFlow |  11.939   |  12.111   |  22.688   |  12.038   |  12.086   |
| ResNet_50                   |  18.227   |  18.600   |  26.150   |  15.584   |  15.706   |
2023-10-04 13:05:32 +03:00
Alexander Smorkalov
670c52f75e
Merge pull request #24356 from VadimLevin:dev/vlevin/typing-re-export
feat: re-export cv2.typing module as typing
2023-10-04 09:14:51 +03:00
Alexander Smorkalov
6bcf3d4311
Merge pull request #24354 from asmorkalov:as/charuco_ub
Removed invalid reference usage in charuco detector
2023-10-03 17:45:42 +03:00
Dmitry Kurtaev
d752bac43f
Merge pull request #24234 from dkurt:distanceTransform_max_dist
Change max distance at distanceTransform #24234

### Pull Request Readiness Checklist

resolves https://github.com/opencv/opencv/issues/23895
related: https://github.com/opencv/opencv/pull/12278

* DIST_MASK_3 and DIST_MASK_5 maximal distance increased from 8192 to 65533 +/- 1
* Fix squares processing at DIST_MASK_PRECISE
* - [ ] TODO: Check with IPP

```cpp
    cv::Mat gray = cv::imread("opencv/samples/data/stuff.jpg", cv::ImreadModes::IMREAD_GRAYSCALE);

    cv::Mat gray_resize;
    cv::resize(gray, gray_resize, cv::Size(70000,70000), 0.0, 0.0, cv::INTER_LINEAR);

    gray_resize = gray_resize >= 100;

    cv::Mat dist;
    cv::distanceTransform(gray_resize, dist, cv::DIST_L2, cv::DIST_MASK_5, CV_32F);

    double minVal, maxVal;
    minMaxLoc(dist, &minVal, &maxVal);
    dist = 255 * (dist - minVal) / (maxVal - minVal);
    std::cout << minVal << " " << maxVal << std::endl;

    cv::Mat dist_resize;
    cv::resize(dist, dist_resize, cv::Size(1024,1024), 0.0, 0.0, cv::INTER_LINEAR);

    cv::String outfilePath = "test_mask_5.png";
    cv::imwrite(outfilePath, dist_resize);
```

mask | 4.x | PR |
----------|--------------|--------------
DIST_MASK_3 | <img src="https://github.com/opencv/opencv/assets/25801568/23e5de76-a8ba-4eb8-ab03-fa55672834be" width="128"> | <img src="https://github.com/opencv/opencv/assets/25801568/e1149f6a-49d6-47bd-a2a8-20bb7e4dafa4" width="128"> |
DIST_MASK_5 | <img src="https://github.com/opencv/opencv/assets/25801568/98aba29b-8865-4b9a-8066-669b16d175c9" width="128"> | <img src="https://github.com/opencv/opencv/assets/25801568/54f62ed2-9ef6-485f-bd63-48cc96accd7d" width="128"> |
DIST_MASK_PRECISE | <img src="https://github.com/opencv/opencv/assets/25801568/c4d79451-fd7a-461f-98fc-13060c63f473" width="128"> | <img src="https://github.com/opencv/opencv/assets/25801568/b5bfcaf5-bc48-40ba-b8e3-d000e5ab48db" width="128">|

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-10-03 17:23:32 +03:00
Maksim Shabunin
1bccc14e05
Merge pull request #24343 from mshabunin:fix-test-writes
Fix tests writing to current work dir #24343

Several tests were writing files in the current work directory and did not clean up after test. Moved all temporary files to the `/tmp` dir and added a cleanup code.
2023-10-03 16:34:25 +03:00
Yuriy Chernyshov
9f74982a54
Merge pull request #24323 from georgthegreat:akaze-variadic 2023-10-03 16:16:41 +03:00
Vadim Levin
4708d1aed7 feat: re-export cv2.typing module as typing
Import Python typing module as `_typing` to avoid name clashes.
2023-10-03 14:12:55 +03:00
alexlyulkov
9bd14d5417
Merge pull request #24353 from alexlyulkov:al/fixed-cumsum-layer
Fixed CumSum dnn layer #24353

Fixes #20110

The algorithm had several errors, so I rewrote it.
Also the layer didn't work with non constant axis tensor. Fixed it.
Enabled CumSum layer tests from ONNX conformance.
2023-10-03 13:58:25 +03:00
Alexander Smorkalov
c497fe05a0 Removed invalid reference usage in charuco detector. 2023-10-03 11:31:48 +03:00
Alexander Smorkalov
4e60392040
Merge pull request #24349 from AleksandrPanov:aruco_check_board_separation
add aruco board separation check
2023-10-03 10:00:03 +03:00
Alex
e5b114e5b8 Added ArUco marker size check for Aruco and Charuco boards. 2023-10-03 09:16:07 +03:00
Alexander Smorkalov
63819c1e1f
Merge pull request #24342 from asmorkalov:as/java_test_status
Fail Java test suite, execution, if one of test failed.
2023-10-02 09:17:11 +03:00
Alexander Smorkalov
2af5815d47 Fail Java test suite, execution, if one of test failed. 2023-10-01 18:31:04 +03:00
Alexander Smorkalov
5caee5cc64 Fixed OpenCL PF16 fallback in Einsum layer. 2023-09-29 15:52:23 +03:00
Dmitry Kurtaev
c7ec0d599a
Merge pull request #23987 from dkurt:openvino_int8_backend
OpenVINO backend for INT8 models #23987

### Pull Request Readiness Checklist

TODO:
- [x] DetectionOutput layer (https://github.com/opencv/opencv/pull/24069)
- [x] Less FP32 fallbacks (i.e. Sigmoid, eltwise sum)
- [x] Accuracy, performance tests (https://github.com/opencv/opencv/pull/24039)
- [x] Single layer tests (convolution)
- [x] ~~Fixes for OpenVINO 2022.1 (https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100334)~~


Performace results for object detection model `coco_efficientdet_lite0_v1_1.0_quant_2021_09_06.tflite`:
| backend | performance (median time) |
|---|---|
| OpenCV | 77.42ms |
| OpenVINO 2023.0 | 10.90ms |

CPU: `11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz`

Serialized model per-layer stats (note that Convolution should use `*_I8` primitives if they are quantized correctly): https://gist.github.com/dkurt/7772bbf1907035441bb5454f19f0feef

---

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-09-28 16:24:43 +03:00
Alexander Smorkalov
b8d4ac589d
Merge pull request #24334 from fengyuentau:fix_24319
dnn onnx: fix not-found constant indices for Gather if shared
2023-09-28 13:08:26 +03:00
Maksim Shabunin
0433fe539c 3rdparty: update ade version 2023-09-28 12:45:34 +03:00
fengyuentau
7fa0493ca0 init commit 2023-09-28 11:50:21 +08:00
casualwinds
7b399c4248
Merge pull request #24280 from casualwind:parallel_opt
Optimization for parallelization when large core number #24280

**Problem description:**
When the number of cores is large, OpenCV’s thread library may reduce performance when processing parallel jobs.

**The reason for this problem:**
When the number of cores (the thread pool initialized the threads, whose number is as same as the number of cores) is large, the main thread will spend too much time on waking up unnecessary threads.
When a parallel job needs to be executed, the main thread will wake up all threads in sequence, and then wait for the signal for the  job completion after waking up all threads. When the number of threads is larger than the parallel number of a job slices, there will be a situation where the main thread wakes up the threads in sequence and the awakened threads have completed the job, but the main thread is still waking up the other threads. The threads woken up by the main thread after this have nothing to do, and the broadcasts made by the waking threads take a lot of time, which reduce the performance.

**Solution:**
Reduce the time for the process of main thread waking up the worker threads through the following two methods:

•	The number of threads awakened by the main thread should be adjusted according to the parallel number of a job slices. If the number of threads is greater than the number of the parallel number of job slices, the total number of threads awakened should be reduced.
•	In the process of waking up threads in sequence, if the main thread finds that all parallel job slices have been allocated, it will jump out of the loop in time and wait for the signal for the job completion.

**Performance Test:**
The tests were run in the manner described by https://github.com/opencv/opencv/wiki/HowToUsePerfTests.
At core number =  160, There are big performance gain in some cases.

Take the following cases in the video module as examples:

OpticalFlowPyrLK_self::Path_Idx_Cn_NPoints_WSize_Deriv::("cv/optflow/frames/VGA_%02d.png", 2, 1, (9, 9), 11, true)
Performance improves 191%:0.185405ms ->0.0636496ms
perf::DenseOpticalFlow_VariationalRefinement::(320x240, 10, 10)
Performance improves 112%:23.88938ms -> 11.2562ms  
Among all the modules, the performance improvement is greatest on module video, and there are also certain improvements on other modules.

At core number = 160, the times labeled below are the geometric mean of the average time of all cases for one module. The optimization is available on each module.

overall | time(ms) |   |   |   |   |   |   |  
-- | -- | -- | -- | -- | -- | -- | -- | --
module   name | gapi | dnn | features2d | objdetect | core | imgproc | stitching | video
original | 0.185 | 1.586 | 9.998 | 11.846 | 0.205 | 0.215 | 164.409 | 0.803
optimized | 0.174 | 1.353 | 9.535 | 11.105 | 0.199 | 0.185 | 153.972 | 0.489
Performance   improves | 6% | 17% | 5% | 7% | 3% | 16% | 7% | 64%

Meanwhile, It is found that adjusting the order of test cases will have an impact on some test cases. For example, we used option --gtest-shuffle to run opencv_perf_gapi, the performance of TestPerformance::CmpWithScalarPerfTestFluid/CmpWithScalarPerfTest::(compare_f, CMP_GE, 1920x1080, 32FC1, { gapi.kernel_package })  case had 30% changes compared to the case without shuffle. I would like to ask if you have also encountered such a situation and could you share your experience?

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-09-27 16:21:20 +03:00
Yuantao Feng
307324f4ac
Merge pull request #24283 from fengyuentau:halide_tests
dnn: merge tests from test_halide_layers to test_backends #24283

Context: https://github.com/opencv/opencv/pull/24231#pullrequestreview-1628649980

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-09-27 14:09:47 +03:00
Dmitry Kurtaev
2b6d0f36f0
Merge pull request #24309 from dkurt:gemm_ov_hotfix
Update OpenVINO init of new GEMM layer #24309

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

CI validation:

- [x] 2022.1.0: https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100368
- [ ] 2021.4.2: https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100373

Checklist:
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-09-27 10:25:45 +03:00
Yuantao Feng
bb171a0c05
dnn: expand refactor with cv::broadcast for onnx models (#24295)
* add expand impl with cv::broadcast

* remove expandMid

* deduce shape from -1

* add constant folding

* handle input constant; handle input constant 1d

* add expand conformance tests; add checks to disallow shape of neg values; add early copy for unchanged total elements

* fix ExpandSubgraph

* dummy commit to trigger build

* dummy commit to trigger build 1

* remove conformance from test names
2023-09-27 09:28:52 +03:00
Alexander Smorkalov
9942757bab
Merge pull request #24316 from alexlyulkov:al/fix-caffe-read-segfault
Fixed segfault when reading Caffe model
2023-09-25 17:53:54 +03:00
HAN Liutong
aa143a3dd1
Merge pull request #24301 from hanliutong:rewrite-stereo-sift
Rewrite Universal Intrinsic code: features2d and calib3d module. #24301

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

This is the modification to the features2d module and calib3d module.

Test with clang 16 and QEMU v7.0.0. `AP3P.ctheta1p_nan_23607` failed beacuse of a small calculation error. But this patch does not touch the relevant code, and this error always reproduce on QEMU, regardless of whether the patch is applied or not. I think we can ignore it
```
[ RUN      ] AP3P.ctheta1p_nan_23607
/home/hanliutong/project/opencv/modules/calib3d/test/test_solvepnp_ransac.cpp:2319: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[  FAILED  ] AP3P.ctheta1p_nan_23607 (26 ms)

...

[==========] 148 tests from 64 test cases ran. (1147114 ms total)
[  PASSED  ] 147 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] AP3P.ctheta1p_nan_23607
```

Note: There are 2 test cases failed with GCC 13.2.1 without this patch, seems like there are someting wrong with RVV part on GCC.
```
[----------] Global test environment tear-down
[==========] 148 tests from 64 test cases ran. (1511399 ms total)
[  PASSED  ] 146 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] Calib3d_StereoSGBM.regression
[  FAILED  ] Calib3d_StereoSGBM_HH4.regression
```

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-09-25 13:03:25 +03:00
Alexander Lyulkov
72e7672a6c Fixed segfault when reading Caffe model 2023-09-25 12:55:11 +07:00
Abduragim Shtanchaev
865e7cacca
Merge pull request #24037 from Abdurrahheem:ash/dev_einsum
Add Support for Einsum Layer #24037

### This PR adding support for [Einsum Layer](https://pytorch.org/docs/stable/generated/torch.einsum.html) (in progress). 

This PR is currently not to be merged but only reviewed. Test cases are located in [#1090](https://github.com/opencv/opencv_extra/pull/1090)RP in OpenCV extra

**DONE**: 
 - [x] 2-5D GMM support added
 - [x] Matrix transpose support added
 - [x] Reduction type comupte  'ij->j'
 - [x] 2nd shape computation - during forward 

**Next PRs**:
- [ ] Broadcasting reduction "...ii ->...i"
- [ ] Add lazy shape deduction. "...ij, ...jk->...ik"
- [ ] Add implicit output computation support. "bij,bjk ->" (output subscripts should be "bik")
- [ ] Add support for CUDA backend 
- [ ] BatchWiseMultiply optimize

**Later in 5.x version (requires support for 1D matrices)**: 
- [ ] Add 1D vector multiplication support 
- [ ] Inter product "i, i" (problems with 1D shapes)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-09-22 11:25:02 +03:00
Alexander Smorkalov
b51a78d439
Merge pull request #24302 from dkurt:ts_setup_skip
Skip test cases in case of SkipTestException in SetUp
2023-09-21 12:41:08 +03:00
Alexander Smorkalov
6295f7d05b
Merge pull request #24303 from asmorkalov:as/vittack_warning_fix
Warnings fix on Windows.
2023-09-20 22:08:02 +03:00
Alexander Smorkalov
219a34261f Warnings fix on Windows. 2023-09-20 16:53:40 +03:00
Alexander Smorkalov
224dac9427
Merge pull request #24126 from AleksandrPanov:fix_charuco_checkBoard
fix charuco checkBoard
2023-09-20 14:38:14 +03:00
Dmitry Kurtaev
d78637102c Skip test cases in case of SkipTestException in SetUp 2023-09-20 13:27:06 +03:00
Alex
60ae973142 fix charuco checkBoard 2023-09-20 12:27:07 +03:00
Alexander Smorkalov
799bb0cd18
Merge pull request #24291 from visitorckw:fix-memory-leak
Fix memory leak and handle realloc failure
2023-09-20 08:49:56 +03:00
Yuantao Feng
8a96e34e33
dnn: add gemm_layer in place of fully_connected_layer for onnx models (#23897)
* first commit

* turned C from input to constant; force C constant in impl; better handling 0d/1d cases

* integrate with gemm from ficus nn

* fix const inputs

* adjust threshold for int8 tryQuantize

* adjust threshold for int8 quantized 2

* support batched gemm and matmul; tune threshold for rcnn_ilsvrc13; update googlenet

* add gemm perf against innerproduct

* add perf tests for innerproduct with bias

* fix perf

* add memset

* renamings for next step

* add dedicated perf gemm

* add innerproduct in perf_gemm

* remove gemm and innerproduct perf tests from perf_layer

* add perf cases for vit sizes; prepack constants

* remove batched gemm; fix wrong trans; optimize KC

* remove prepacking for const A; several fixes for const B prepacking

* add todos and gemm expression

* add optimized branch for avx/avx2

* trigger build

* update macros and signature

* update signature

* fix macro

* fix bugs for neon aarch64 & x64

* add backends: cuda, cann, inf_ngraph and vkcom

* fix cuda backend

* test commit for cuda

* test cuda backend

* remove debug message from cuda backend

* use cpu dispatcher

* fix neon macro undef in dispatcher

* fix dispatcher

* fix inner kernel for neon aarch64

* fix compiling issue on armv7; try fixing accuracy issue on other platforms

* broadcast C with beta multiplied; improve func namings

* fix bug for avx and avx2

* put all platform-specific kernels in dispatcher

* fix typos

* attempt to fix compile issues on x64

* run old gemm when neon, avx, avx2 are all not available; add kernel for armv7 neon

* fix typo

* quick fix: add macros for pack4

* quick fix: use vmlaq_f32 for armv7

* quick fix for missing macro of fast gemm pack f32 4

* disable conformance tests when optimized branches are not supported

* disable perf tests when optimized branches are not supported

* decouple cv_try_neon and cv_neon_aarch64

* drop googlenet_2023; add fastGemmBatched

* fix step in fastGemmBatched

* cpu: fix initialization ofb; gpu: support batch

* quick followup fix for cuda

* add default kernels

* quick followup fix to avoid macro redef

* optmized kernels for lasx

* resolve mis-alignment; remove comments

* tune performance for x64 platform

* tune performance for neon aarch64

* tune for armv7

* comment time consuming tests

* quick follow-up fix
2023-09-20 00:53:34 +03:00
lpylpy0514
70d7e83dca
Merge pull request #24201 from lpylpy0514:4.x
VIT track(gsoc realtime object tracking model) #24201

Vit tracker(vision transformer tracker) is a much better model for real-time object tracking. Vit tracker can achieve speeds exceeding nanotrack by 20% in single-threaded mode with ARM chip, and the advantage becomes even more pronounced in multi-threaded mode. In addition, on the dataset, vit tracker demonstrates better performance compared to nanotrack. Moreover, vit trackerprovides confidence values during the tracking process, which can be used to determine if the tracking is currently lost.
opencv_zoo: https://github.com/opencv/opencv_zoo/pull/194
opencv_extra: [https://github.com/opencv/opencv_extra/pull/1088](https://github.com/opencv/opencv_extra/pull/1088)

# Performance comparison is as follows:
NOTE: The speed below is tested by **onnxruntime** because opencv has poor support for the transformer architecture for now.

ONNX speed test on ARM platform(apple M2)(ms):
| thread nums | 1| 2| 3| 4|
|--------|--------|--------|--------|--------|
| nanotrack| 5.25| 4.86| 4.72| 4.49|
| vit tracker| 4.18| 2.41| 1.97| **1.46 (3X)**|

ONNX speed test on x86 platform(intel i3 10105)(ms):
| thread nums | 1| 2| 3| 4|
|--------|--------|--------|--------|--------|
| nanotrack|3.20|2.75|2.46|2.55|
| vit tracker|3.84|2.37|2.10|2.01|

opencv speed test on x86 platform(intel i3 10105)(ms):
| thread nums | 1| 2| 3| 4|
|--------|--------|--------|--------|--------|
| vit tracker|31.3|31.4|31.4|31.4|

preformance test on lasot dataset(AUC is the most important data. Higher AUC means better tracker):

|LASOT | AUC| P| Pnorm|
|--------|--------|--------|--------|
| nanotrack| 46.8| 45.0| 43.3|
| vit tracker| 48.6| 44.8| 54.7|

[https://youtu.be/MJiPnu1ZQRI](https://youtu.be/MJiPnu1ZQRI)
 In target tracking tasks, the score is an important indicator that can indicate whether the current target is lost. In the video, vit tracker can track the target and display the current score in the upper left corner of the video. When the target is lost, the score drops significantly. While nanotrack will only return 0.9 score in any situation, so that we cannot determine whether the target is lost.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-09-19 15:36:38 +03:00
HAN Liutong
320c0bf419
Merge pull request #24166 from hanliutong:rewrite-remaining
Rewrite Universal Intrinsic code: ImgProc (CV_SIMD_WIDTH related Part) #24166

Related PR: #24058, #24132. The goal of this series of PRs is to modify the SIMD code blocks in the opencv/modules/imgproc folder by using the new Universal Intrinsic API.

The modification of this PR mainly focuses on the code that uses the `CV_SIMD_WIDTH` macro. This macro is sometimes used for loop tail processing, such as `box_filter.simd.hpp` and `morph.simd.hpp`.

```cpp
#if CV_SIMD
int i = 0;
for (i < n - v_uint16::nlanes; i += v_uint16::nlanes) {
// some universal intrinsic code
// e.g. v_uint16...
}
#if CV_SIMD_WIDTH > 16
for (i < n - v_uint16x8::nlanes; i += v_uint16x8::nlanes) {
// handle loop tail by 128 bit SIMD
// e.g. v_uint16x8
}
#endif //CV_SIMD_WIDTH 
#endif// CV_SIMD
```
The main contradiction is that the variable-length Universal Intrinsic backend cannot use 128bit fixed-length data structures. Therefore, this PR uses the scalar loop to handle the loop tail.

This PR is marked as draft because the modification of the `box_filter.simd.hpp` file caused a compilation error. The cause of the error is initially believed to be due to an internal error in the GCC compiler.

```bash
box_filter.simd.hpp:1162:5: internal compiler error: Segmentation fault
 1162 |     }
      |     ^
0xe03883 crash_signal
        /wafer/share/gcc/gcc/toplev.cc:314
0x7ff261c4251f ???
        ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x6bde48 hash_set<rtl_ssa::set_info*, false, default_hash_traits<rtl_ssa::set_info*> >::iterator::operator*()
        /wafer/share/gcc/gcc/hash-set.h:125
0x6bde48 extract_single_source
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:1184
0x6bde48 extract_single_source
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:1174
0x119ad9e pass_vsetvl::propagate_avl() const
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4087
0x119ceaf pass_vsetvl::execute(function*)
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4344
0x119ceaf pass_vsetvl::execute(function*)
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4325
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
```

This PR can be compiled with Clang 16, and `opencv_test_imgproc` is passed on QEMU.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-09-19 15:12:52 +03:00
Kumataro
b870ad46bf
Merge pull request #24074 from Kumataro/fix24057
Python: support tuple src for cv::add()/subtract()/... #24074

fix https://github.com/opencv/opencv/issues/24057

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ x The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-09-19 10:32:47 +03:00
HAN Liutong
f617fbe166
Merge pull request #24132 from hanliutong:rewrite-imgproc2
Rewrite Universal Intrinsic code by using new API: ImgProc module Part 2 #24132

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro in the opencv/modules/imgproc folder: rewrite them by using the new Universal Intrinsic API.

This is the second part of the modification to the Imgproc module ( Part 1: #24058 ), And I tested this patch on RVV (QEMU) and AVX devices, `opencv_test_imgproc` is passed.

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-09-19 08:52:42 +03:00
Alexander Smorkalov
8f2e6640e3
Merge pull request #24288 from tailsu:sd/emscripten-3.1.45-fixes
build fixes for emscripten 3.1.45
2023-09-19 08:09:18 +03:00
Kuan-Wei Chiu
e16ca08b33 Fix memory leak and handle realloc failure
In the previous code, there was a memory leak issue where the
previously allocated memory was not freed upon a failed realloc
operation. This commit addresses the problem by releasing the old
memory before setting the pointer to NULL in case of a realloc failure.
This ensures that memory is properly managed and avoids potential
memory leaks.
2023-09-18 22:43:44 +08:00
Stefan Dragnev
9b5a719d80 build fixes for emscripten 3.1.45 2023-09-18 15:38:31 +02:00
Alexander Smorkalov
157b0e7760
Merge pull request #24275 from alexlyulkov:al/fix-tf-graph-simplifier
Fixed removePhaseSwitches in tf_graph_simplifier
2023-09-18 11:02:44 +03:00
Alexander Smorkalov
0a53afe8ba
Merge pull request #24278 from georgthegreat:compat-fixes
More fixes for iterators-are-pointers case
2023-09-18 10:47:53 +03:00
Dmitry Kurtaev
6bc369fc56
Merge pull request #24250 from dkurt:ts_fixture_constructor_skip_2
Skip test on SkipTestException at fixture's constructor (version 2) #24250

### Pull Request Readiness Checklist

Another version of https://github.com/opencv/opencv/pull/24186 (reverted by https://github.com/opencv/opencv/pull/24223). Current implementation cannot handle skip exception at `static void SetUpTestCase` but works on `virtual void SetUp`.

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2023-09-18 10:23:24 +03:00