G-API: Fix Journal usage in Fluid backend (#15238)
* Fix Journal usage in Fluid backend
* Delete dumpDotRequired(): invalid check
* Update mem consumption test
* Test that new test works
* Debug memory consumption function
* Increase iterations in test
* Re-write memory consumption measurement part
* Restore correct fix for Fluid journals
* G-API: rename ArgKind OPAQUE to GOPAQUE
Rename ArgKind value to GOPAQUE to fix conflict in the
user code when wingdi.h is included: it defines OPAQUE
macro that (for some reason) is chosen instead of ArgKind
value
* Add compatibility with existing API
* Renamed GOPAQUE to OPAQUE_VAL
Convert HOG from SSE SIMD to HAL - 35-45% faster on Power (VSX) (#15199)
* Convert SSE SIMD to HAL. 35-45% improvement for Power (VSX)
* Remove CV_NEON code. Use v_floor instead of 3 lines of code.
* Invert comparison logic to simplify code.
* Change initialization from v_load to constructor type.
* Remove unavoidable print of CV error
The return value covers whether the device exists.
This might be better hidden behind a debug flag, but I couldn't work out how to do that nicely.
* Use `CV_LOG_WARNING` macro to log rather than removing it entirely
* add -Wno-psabi when using GCC 6
* add -Wundef for CUDA 10
* add -Wdeprecated-declarations when using GCC 7
* add -Wstrict-aliasing and -Wtautological-compare for GCC 7
* replace cudaThreadSynchronize with cudaDeviceSynchronize
Implement cvRound using inline asm. No compiler support
exists today to properly optimize this. This results in
about a 4x speedup over the default rounding. Likewise,
simplify the growing number of rounding function overloads.
For P9 enabled targets, utilize the classification
testing instruction to test for Inf/Nan values. Operation
speedup is about 1.2x for FP32, and 1.5x for FP64 operands.
For P8 targets, fallback to the GCC nan inline. It provides
a 1.1/1.4x improvement for FP32/FP64 arguments.
Add a new macro definition OPENCV_USE_FASTMATH_GCC_BUILTINS to enable
usage of GCC inline math functions, if available and requested by the
user.
Likewise, enable it for POWER. This is nearly always a substantial
improvement over using integer manipulation as most operations can
be done in several instructions with no branching. The result is a
1.5-1.8x speedup in the ceil/floor operations.
1. As tested with AT 12.0-1 (GCC 8.3.1) compiler on P9 LE.
Add a basic sanity test to verify the rounding functions
work as expected.
Likewise, extend the rounding performance test to cover the
additional float -> int fast math functions.
Support new IE API (#15184)
* Add support OpenVINO R2 for layers
* Add Core API
* Fix tests
* Fix expectNoFallbacksFromIE for ONNX nets
* Remove deprecated API
* Remove td
* Remove TargetDevice
* Fix Async
* Add test
* Fix detectMyriadX
* Fix test
* Fix warning
* G-API-NG/API: Introduced inference API and IE-based backend
- Very quick-n-dirty implementation
- OpenCV's own DNN module is not used
- No tests so far
* G-API-NG/IE: Refined IE backend, added more tests
* G-API-NG/IE: Fixed various CI warnings & build issues + tests
- Added tests on multi-dimensional own::Mat
- Added tests on GMatDesc with dimensions
- Documentation on infer.hpp
- Fixed more warnings + added a ROI list test
- Fix descr_of clash for vector<Mat> & standalone mode
- Fix build issue with gcc-4.8x
- Addressed review comments
* G-API-NG/IE: Addressed review comments
- Pass `false` to findDataFile()
- Add deprecation warning suppression macros for IE