G-API: Introduce a Queue Source #24178
- Added a new IStreamSource class: in fact, a wrapper over a concurrent queue;
- Added minimal example on how it can be used;
- Extended IStreamSource with optional "halt" interface to break the blocking calls in the emitter threads when required to stop.
- Introduced a QueueInput class which allows to pass the whole graph's input vector at once. In fact it is a thin wrapper atop of individual Queue Sources.
There is a hidden trap found with our type system as described in https://github.com/orgs/g-api-org/discussions/2
While it works even in this form, it should be addressed somewhere in the 5.0 timeframe.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
* add broadcast_to with tests
* change name
* fix test
* fix implicit type conversion
* replace type of shape with InputArray
* add perf test
* add perf tests which takes care of axis
* v2 from ficus expand
* rename to broadcast
* use randu in place of declare
* doc improvement; smaller scale in perf
* capture get_index by reference
If building with -mcpu=native or any other setting which implies the current
CPU has FP16 but with intrinsics disabled, we mistakenly try to use it even
though convolution.hpp conditionally defines it correctly based on whether
we should *use it*. convolution.cpp on the other hand was mismatched and
trying to use it if the CPU supported it, even if not enabled in the build
system.
Make the guards match.
Bug: https://bugs.gentoo.org/913031
Signed-off-by: Sam James <sam@gentoo.org>
Skip test on SkipTestException at fixture's constructor
* Skip test on SkipTestException at fixture's constructor
* Add warning supression
* Skip Python tests if no test file found
* Skip instances of test fixture with exception at SetUpTestCase
* Skip test with exception at SetUp method
* Try remove warning disable
* Add CV_NORETURN
* Remove FAIL assertion
* Use findDataFile to throw Skip exception
* Throw exception conditionally
* core:add OPENCV_IPP_MEAN/MINMAX/SUM option to enable IPP optimizations
* fix: to use guard HAVE_IPP and ocv_append_source_file_compile_definitions() macro.
* support OPENCV_IPP_ENABLE_ALL
* add document for OPENCV_IPP_ENABLE_ALL
* fix OPENCV_IPP_ENABLE_ALL comment
Fixed an off-by-1 buffer resize, the space for the null termination was forgotten.
Prefer snprintf, which can never overflow (if given the right size).
In one case I cheated and used strcpy, because I cannot figure out the buffer size at that point in the code.
OCL_FP16 MatMul with large batch
* Workaround FP16 MatMul with large batch
* Fix OCL reinitialization
* Higher thresholds for INT8 quantization
* Try fix gemm_buffer_NT for half (columns)
* Fix GEMM by rows
* Add batch dimension to InnerProduct layer test
* Fix Test_ONNX_conformance.Layer_Test/test_basic_conv_with_padding
* Batch 16
* Replace all vload4
* Version suffix for MobileNetSSD_deploy Caffe model
Rewrite Universal Intrinsic code by using new API: Core module. #23980
The goal of this PR is to match and modify all SIMD code blocks guarded by `CV_SIMD` macro in the `opencv/modules/core` folder and rewrite them by using the new Universal Intrinsic API.
The patch is almost auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR #23885.
Most of the files have been rewritten, but I marked this PR as draft because, the `CV_SIMD` macro also exists in the following files, and the reasons why they are not rewrited are:
1. ~~code design for fixed-size SIMD (v_int16x8, v_float32x4, etc.), need to manually rewrite.~~ Rewrited
- ./modules/core/src/stat.simd.hpp
- ./modules/core/src/matrix_transform.cpp
- ./modules/core/src/matmul.simd.hpp
2. Vector types are wrapped in other class/struct, that are not supported by the compiler in variable-length backends. Can not be rewrited directly.
- ./modules/core/src/mathfuncs_core.simd.hpp
```cpp
struct v_atan_f32
{
explicit v_atan_f32(const float& scale)
{
...
}
v_float32 compute(const v_float32& y, const v_float32& x)
{
...
}
...
v_float32 val90; // sizeless type can not used in a class
v_float32 val180;
v_float32 val360;
v_float32 s;
};
```
3. The API interface does not support/does not match
- ./modules/core/src/norm.cpp
Use `v_popcount`, ~~waiting for #23966~~ Fixed
- ./modules/core/src/has_non_zero.simd.hpp
Use illegal Universal Intrinsic API: For float type, there is no logical operation `|`. Further discussion needed
```cpp
/** @brief Bitwise OR
Only for integer types. */
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n> operator|(const v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n>& operator|=(v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
```
```cpp
#if CV_SIMD
typedef v_float32 v_type;
const v_type v_zero = vx_setzero_f32();
constexpr const int unrollCount = 8;
int step = v_type::nlanes * unrollCount;
int len0 = len & -step;
const float* srcSimdEnd = src+len0;
int countSIMD = static_cast<int>((srcSimdEnd-src)/step);
while(!res && countSIMD--)
{
v_type v0 = vx_load(src);
src += v_type::nlanes;
v_type v1 = vx_load(src);
src += v_type::nlanes;
....
src += v_type::nlanes;
v0 |= v1; //Illegal ?
....
//res = v_check_any(((v0 | v4) != v_zero));//beware : (NaN != 0) returns "false" since != is mapped to _CMP_NEQ_OQ and not _CMP_NEQ_UQ
res = !v_check_all(((v0 | v4) == v_zero));
}
v_cleanup();
#endif
```
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Fix python sample code (tst_scene_render) #24116
Fix bug of python sample code (samples/python/tst_scene_render.py) when backGr or fgr is None (#24114)
1) pass shape tuple to np.zeros arguments instead of integers
2) change np.int to int
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [o] I agree to contribute to the project under Apache 2 License.
- [o] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [o] The PR is proposed to the proper branch
- [o] There is a reference to the original bug report and related work
- [o] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [o] The feature is well documented and sample code can be built with the project CMake
dnn: cleanup of tengine backend #24122🚀 Cleanup for OpenCV 5.0. Tengine backend is added for convolution layer speedup on ARM CPUs, but it is not maintained and the convolution layer on our default backend has reached similar performance to that of Tengine.
Tengine backend related PRs:
- https://github.com/opencv/opencv/pull/16724
- https://github.com/opencv/opencv/pull/18323
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake