Commit Graph

3069 Commits

Author SHA1 Message Date
kallaballa
63b5dee274 fixed bug: variable shadowing 2024-10-10 06:35:42 +02:00
kallaballa
8ba7389b21 properly size the devices array 2024-10-10 06:32:22 +02:00
kallaballa
885bbc643f renaming 2024-10-10 06:30:33 +02:00
kallaballa
dceeb47cd3 rewrote clgl device discovery 2024-10-10 00:02:56 +02:00
Alexander Smorkalov
307dc2a298 Excluded nullptr leak to arithmetic HAL got from empty Mat. 2024-09-06 16:49:14 +03:00
Alexander Smorkalov
5b4d1ce6a0
Merge pull request #26080 from asmorkalov:as/HAL_minMaxIdx_ND_offset
Added offset for HAL as ofs2idx expects 1-based index #26080

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-08-30 13:10:24 +03:00
Alexander Smorkalov
76bf17a248 Removed duplicated code in Pow implementation that triggers wrong assert on Intel iGPU. 2024-08-23 17:44:58 +03:00
penghuiho
f4c2e4f872
Merge pull request #26061 from penghuiho:fix-pow-bug
Fixed the simd bugs of iPow8u and iPow16u #26061

Add the following cases in opencv_perf_core:

* OCL_PowFixture_iPow.iPow/0, where GetParam() = (640x480, 8UC1)
* OCL_PowFixture_iPow.iPow/2, where GetParam() = (640x480, 16UC1)

iPow8u and iPow16u failed to call to simd accelerating while executing.

Fix the bug by changing the input type of iPow_SIMD function.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-08-23 17:12:19 +03:00
Kumataro
da3debda6d
Merge pull request #25981 from Kumataro:fix25971
imgproc: add specific error code when cvtColor is used on an image with an invalid number of channels #25981

close #25971

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-08-09 14:22:02 +03:00
James Choi
582a7f32d5
Merge pull request #25832 from chachoi-world:4.x
Add support for QNX #25832

Build and test instruction for QNX:
https://github.com/chachoi-world/qnx-ports/blob/main/opencv/README.md

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-08-06 20:25:39 +03:00
Alexander Smorkalov
ab99f87b6a
Merge pull request #25979 from asmorkalov:as/custom_allocator
Set and check allocator pointer for all cv::Mat instances
2024-08-05 11:52:00 +03:00
Alexander Smorkalov
9de2ebbec1
Merge pull request #25978 from chacha21:cuda_stdallocator
Adding getStdAllocator() to cv::cuda::GpuMat
2024-08-05 10:58:33 +03:00
Alexander Smorkalov
a15cd4b63d Set and check allocator pointer for all cv::Mat instances. 2024-08-05 10:07:14 +03:00
chacha21
f67d4852bf Added no-imp placeholder when HAVE_CUDA is false 2024-08-01 10:00:31 +02:00
chacha21
2db7f8e827 Adding getStdAllocator() to cv::cuda::GpuMat
To be on par with `cv::Mat`, let's add `cv::cuda::GpuMat::getStdAllocator()`
This is useful anyway, because when a user wants to use custom allocators, he might want to resort to the standard default allocator behaviour, not some other allocator that could have been set by `setDefaultAllocator()`
2024-08-01 09:36:08 +02:00
Kumataro
be3c519956 core: FileStorage: detect invalid attribute value 2024-07-26 05:55:00 +09:00
Vincent Rabaud
e1b57057bf Avoid future integer overflow in _OutputArray::create
This fix is useless in 4.x and fixes harmless overflows in 5.x
This belongs to 4.x as it is closer to the intended meaning.
2024-07-23 16:22:55 +02:00
Rostislav Vasilikhin
44c814e334
Merge pull request #25936 from savuor:rv/hal_dot
HAL for dot product added #25936

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-07-23 08:06:15 +03:00
Richard Barnes
d1505693dd throw() -> noexcept 2024-07-16 06:36:52 -07:00
Alexander Smorkalov
a6b8ea892b Post-merge fixes for algorithm hint API. 2024-07-15 14:44:03 +03:00
Alexander Smorkalov
15783d6598
Merge pull request #25792 from asmorkalov:as/HAL_fast_GaussianBlur
Added flag to GaussianBlur for faster but not bit-exact implementation #25792

Rationale:
Current implementation of GaussianBlur is almost always bit-exact. It helps to get predictable results according platforms, but prohibits most of approximations and optimization tricks.

The patch converts `borderType` parameter to more generic `flags` and introduces `GAUSS_ALLOW_APPROXIMATIONS` flag to allow not bit-exact implementation. With the flag IPP and generic HAL implementation are called first. The flag naming and location is a subject for discussion.

Replaces https://github.com/opencv/opencv/pull/22073
Possibly related issue: https://github.com/opencv/opencv/issues/24135

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-07-12 15:03:33 +03:00
Vincent Rabaud
3ff97c5580
Merge pull request #25899 from vrabaud:move_no_except
Mark cv::Mat(Mat&&) as noexcept #25899

This fixes https://github.com/opencv/opencv/issues/25065

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-07-12 14:41:17 +03:00
Alexander Smorkalov
445022682e
Merge pull request #25789 from asmorkalov:as/HAL_meanStdDev_tails
Fill mean and stdDev tails with zeros for HAL branch in meanStdDev #25789

as it's done for other branches.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-06-27 19:11:05 +03:00
Alexander Smorkalov
a102b24285 Added LUT for FP16 and accuracy test. 2024-06-19 16:16:11 +03:00
Rostislav Vasilikhin
a7e53aa184
Merge pull request #25671 from savuor:rv/arithm_extend_tests
Tests added for mixed type arithmetic operations #25671

### Changes
* added accuracy tests for mixed type arithmetic operations
    _Note: div-by-zero values are removed from checking since the result is implementation-defined in common case_
* added perf tests for the same cases
* fixed a typo in `getMulExtTab()` function that lead to dead code

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-06-02 14:28:06 +03:00
Alexander Smorkalov
9ed1d6730f Fixed offset computation for ND case in MinMaxIdx HAL. 2024-05-31 10:09:34 +03:00
Alexander Smorkalov
f85014534f Fixed width and height order in HAL call for LUT. 2024-05-23 10:30:33 +03:00
Yuantao Feng
49f80cb3c4
Merge pull request #24804 from fengyuentau:fix_lapack_warnings
core: try to solve warnings caused by Apple's new LAPACK interface #24804

Resolves https://github.com/opencv/opencv/issues/24660

Apple's BLAS documentation: https://developer.apple.com/documentation/accelerate/blas?language=objc

New interface since macOS >= 13.3, iOS >= 16.4.

Todo:

- [x] Detect macOS version.
- [x] ~Detect iOS versions (major and minor version).~ No calling of Accelerate New LAPACK on iOS.
- [x] Solve calling `cblas_cgemm` and `cblas_zgemm`.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-05-21 16:58:16 +03:00
Rostislav Vasilikhin
d95ff3ac04 HAL for sub8x32f added 2024-05-20 10:48:56 +03:00
Rostislav Vasilikhin
69af621ef6
Merge pull request #25506 from savuor:rv/hal_mul16
HAL mul8x8to16 added #25506

Fixes #25034

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-05-20 10:43:18 +03:00
Alexander Smorkalov
0044047782
Merge pull request #25598 from asmorkalov:as/tables_range_check_core
Check range for type-dependant function tables #25598

Address https://github.com/opencv/opencv/issues/24703

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-05-17 10:48:40 +03:00
Alexander Smorkalov
1f1ba7e402
Merge pull request #25563 from asmorkalov:as/HAL_min_max_idx
Transform offset to indeces for MatND in minMaxIdx HAL #25563

Address comments in https://github.com/opencv/opencv/pull/25553

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-05-08 18:57:02 +03:00
Rostislav Vasilikhin
5bd64e09a3
Merge pull request #25554 from savuor:rv/hal_lut
Merge pull request #25554 from savuor:rv/hal_lut

HAL for LUT added #25554

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-05-08 17:45:08 +03:00
Alexander Smorkalov
faa259ab34
Merge pull request #25553 from asmorkalov:as/HAL_min_max_idx
Fix HAL interface for hal_ni_minMaxIdx #25553

Fixes https://github.com/opencv/opencv/issues/25540

The original implementation call HAL with the same parameters independently from amount of channels. The patch uses HAL correctly for the case cn > 1.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-05-07 20:46:17 +03:00
Alexander Smorkalov
b94cb5bc68 HAL interface for meanStdDev. 2024-05-03 16:36:43 +03:00
Rostislav Vasilikhin
12e2cc9502
Merge pull request #25491 from savuor:rv/hal_norm_hamming
HAL for Hamming norm added #25491

fixes #25474

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-04-27 14:38:44 +03:00
Alexander Smorkalov
5da17a4b03
Merge pull request #25454 from fengyuentau:fix_core_gemm_acc
core: fix `Core_GEMM.accuracy` failure on recent macOS
2024-04-19 11:36:31 +03:00
fengyuentau
4ef5986d4d remove manual unrolling that causes problem 2024-04-19 14:24:26 +08:00
Kumataro
b14ea19466
Merge pull request #25351 from Kumataro:fix25073_format_g
core: persistence: output reals as human-friendly expression. #25351

Close #25073
Related https://github.com/opencv/opencv/pull/25087

This patch is need to merge same time with https://github.com/opencv/opencv_contrib/pull/3714

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-04-10 15:17:15 +03:00
Alexander Smorkalov
e1ed422bdb HALL interface for transpose2d. 2024-04-05 14:12:36 +03:00
Alexander Smorkalov
9f123f8d74
Merge pull request #25285 from johnteslade:cgroupsv2-support
core: Add cgroupsv2 support to parallel.cpp
2024-03-30 11:26:23 +03:00
Alexander Smorkalov
7945f2cf40 Fixed HAL invocation for DCT. 2024-03-29 11:01:42 +03:00
John Slade
7f1140b48b core: Add cgroupsv2 support to parallel.cpp
The parallel code works out how many CPUs are on the system by checking
the quota it has been assigned in the Linux cgroup. The existing code
works under cgroups v1 but the file structure changed in cgroups v2.
From [1]:

    "cpu.cfs_quota_us" and "cpu.cfs_period_us" are replaced by "cpu.max"
  which contains both quota and period.

This commit add support to parallel so it will read from the cgroups v2
location. v1 support is still retained.

Resolves #25284

[1] 0d5936344f
2024-03-28 11:52:47 +00:00
Pierre Chatelier
1a537ab98f
Merge pull request #24893 from chacha21:cart_polar_inplace
Added in-place support for cartToPolar and polarToCart #24893

- a fused hal::cartToPolar[32|64]f() is used instead of sequential hal::magnitude[32|64]f/hal::fastAtan[32|64]f
- ipp_polarToCart is skipped for in-place processing (it seems not to support it correctly)

relates to #24891
### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [X] I agree to contribute to the project under Apache 2 License.
- [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [X] The PR is proposed to the proper branch
- [X] There is a reference to the original bug report and related work
- [X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-03-26 15:38:17 +03:00
Yuantao Feng
3afe8ddaf8
core: Rename cv::float16_t to cv::hfloat (#25217)
* rename cv::float16_t to cv::fp16_t

* add typedef fp16_t float16_t

* remove zero(), bits() from fp16_t class

* fp16_t -> hfloat

* remove cv::float16_t::fromBits; add hfloatFromBits

* undo changes in conv_winograd_f63.simd.hpp and conv_block.simd.hpp

* undo some changes in dnn
2024-03-21 23:44:19 +03:00
Alexander Smorkalov
daa8f7dfc6 Partially back-port #25075 to 4.x 2024-03-05 12:15:39 +03:00
Vincent Rabaud
f8aa2896a1
Merge pull request #25024 from vrabaud:neon
Replace legacy __ARM_NEON__ by __ARM_NEON #25024

Even ACLE 1.1 referes to __ARM_NEON
https://developer.arm.com/documentation/ihi0053/b/?lang=en

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2024-02-20 11:29:23 +03:00
ryanking13
422d519703 Enable file system on Emscripten 2024-01-31 11:28:59 -08:00
Alexander Alekhin
40533dbf69
Merge pull request #24918 from opencv-pushbot:gitee/alalek/core_convertfp16_replacement
core(OpenCL): optimize convertTo() with CV_16F (convertFp16() replacement) #24918

relates #24909
relates #24917
relates #24892

Performance changes:

- [x] 12700K (1 thread) + Intel iGPU

|Name of Test|noOCL|convertFp16|convertTo BASE|convertTo PATCH|
|---|:-:|:-:|:-:|:-:|
|ConvertFP16FP32MatMat::OCL_Core|3.130|3.152|3.127|3.136|
|ConvertFP16FP32MatUMat::OCL_Core|3.030|3.996|3.007|2.671|
|ConvertFP16FP32UMatMat::OCL_Core|3.010|3.101|3.056|2.854|
|ConvertFP16FP32UMatUMat::OCL_Core|3.016|3.298|2.072|2.061|
|ConvertFP32FP16MatMat::OCL_Core|2.697|2.652|2.723|2.721|
|ConvertFP32FP16MatUMat::OCL_Core|2.752|4.268|2.662|2.947|
|ConvertFP32FP16UMatMat::OCL_Core|2.706|2.601|2.603|2.528|
|ConvertFP32FP16UMatUMat::OCL_Core|2.704|3.215|1.999|1.988|

Patched version is not worse than convertFp16 and convertTo baseline (except MatUMat 32->16, baseline uses CPU code+dst buffer map).
There are still gaps against noOpenCL(CPU only) mode due to T-API implementation issues (unnecessary synchronization).


- [x] 12700K + AMD dGPU

|Name of Test|noOCL|convertFp16 dGPU|convertTo BASE dGPU|convertTo PATCH dGPU|
|---|:-:|:-:|:-:|:-:|
|ConvertFP16FP32MatMat::OCL_Core|3.130|3.133|3.172|3.087|
|ConvertFP16FP32MatUMat::OCL_Core|3.030|1.713|9.559|1.729|
|ConvertFP16FP32UMatMat::OCL_Core|3.010|6.515|6.309|4.452|
|ConvertFP16FP32UMatUMat::OCL_Core|3.016|0.242|23.597|0.170|
|ConvertFP32FP16MatMat::OCL_Core|2.697|2.641|2.713|2.689|
|ConvertFP32FP16MatUMat::OCL_Core|2.752|4.076|6.483|4.191|
|ConvertFP32FP16UMatMat::OCL_Core|2.706|9.042|16.481|1.834|
|ConvertFP32FP16UMatUMat::OCL_Core|2.704|0.229|15.730|0.176|

convertTo-baseline can't compile OpenCL kernel for FP16 properly - FIXED.
dGPU has much more power, so results are x16-17 better than single cpu core. 
Patched version is not worse than convertFp16 and convertTo baseline.
There are still gaps against noOpenCL(CPU only) mode due to T-API implementation issues (unnecessary synchronization) and required memory transfers.

Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2024-01-26 12:56:52 +03:00
Sean McBride
e64857c561
Merge pull request #23736 from seanm:c++11-simplifications
Removed all pre-C++11 code, workarounds, and branches #23736

This removes a bunch of pre-C++11 workrarounds that are no longer necessary as C++11 is now required.
It is a nice clean up and simplification.

* No longer unconditionally #include <array> in cvdef.h, include explicitly where needed
* Removed deprecated CV_NODISCARD, already unused in the codebase
* Removed some pre-C++11 workarounds, and simplified some backwards compat defines
* Removed CV_CXX_STD_ARRAY
* Removed CV_CXX_MOVE_SEMANTICS and CV_CXX_MOVE
* Removed all tests of CV_CXX11, now assume it's always true. This allowed removing a lot of dead code.
* Updated some documentation consequently.
* Removed all tests of CV_CXX11, now assume it's always true
* Fixed links.

---------

Co-authored-by: Maksim Shabunin <maksim.shabunin@gmail.com>
Co-authored-by: Alexander Smorkalov <alexander.smorkalov@xperience.ai>
2024-01-19 16:53:08 +03:00