Improve performance on Arm64
* Improve performance on Apple silicon
This patch will
- Enable dot product intrinsics for macOS arm64 builds
- Enable for macOS arm64 builds
- Improve HAL primitives
- reduction (sum, min, max, sad)
- signmask
- mul_expand
- check_any / check_all
Results on a M1 Macbook Pro
* Updates to #20011 based on feedback
- Removes Apple Silicon specific workarounds
- Makes #ifdef sections smaller for v_mul_expand cases
- Moves dot product optimization to compiler optimization check
- Adds 4x4 matrix transpose optimization
* Remove dotprod and fix v_transpose
Based on the latest, we've removed dotprod entirely and will revisit in a future PR.
Added explicit cats with v_transpose4x4()
This should resolve all opens with this PR
* Remove commented out lines
Remove two extraneous comments
* Add Neon optimised RGB2Lab conversion
* Fix compile errors, change lambda to macro
* Change NEON optimised RGB2Lab to just use HAL
* Change [] to v_extract_n in RGB2Lab
* RGB2LAB Code quality, change to nlane agnostic
* Change RGB2Lab to use function rather than macro
* Remove whitespace
Co-authored-by: Francesco Petrogalli <25690309+fpetrogalli@users.noreply.github.com>
- Added missing documentation for the CALIB_FIX_FOCAL_LENGTH flag
- Removed erroneous information about the number of distortion coefficients
returned
- Added some missing @ref tags
Fix unsigned int bug in computeECC
* address issue with unsigned ints in computeEcc
* remove additional logic checking firstOctave
* use swap instead of same src/dst
* simplify the unsigned check logic
fix a build warning:
```
C:\Slave\workspace\precommit\windows10\opencv\modules\photo\src\contrast_preserve.hpp(289): warning C4244: '=': conversion from 'double' to '_Tp', possible loss of data
with
[
_Tp=float
]
C:\Slave\workspace\precommit\windows10\opencv\modules\photo\src\contrast_preserve.hpp(361): warning C4244: '=': conversion from 'double' to '_Tp', possible loss of data
with
[
_Tp=float
]
```
(from https://build.opencv.org.cn/job/precommit/job/windows10/1633/console)
Currently, the LOADER_DIR is set as os.path.dirname(os.path.abspath(__file__)). This does not point to the true library path if the cv2 folder is symlinked into the Python package directory such that importing cv2 under Python fails. The proposed change only resolves symbolic links correctly by calling os.path.realpath(__file__) first and does not change anything if __file__ contains no symbolic link.
Fix bug with predictions in RTrees/Boost
* address bug where predict functions with invalid feature count in rtrees/boost models
* compact matrix rep in tests
* check 1..n-1 and n+1 in feature size validation test
Fix Single ThresholdBug in Simple Blob Detector
* address bug with using min dist between blobs in blob detector
cast type in comparison and remove docs
address bug with using min dist between blobs in blob detector
use scalar instead of int
address bug with using min dist between blobs in blob detector
* fix namespace and formatting
Also bring perf_imgproc CornerMinEigenVal accuracy requirements in line with
the test_imgproc accuracy requirements on that test and fix indentation on
the latter.
Partially addresses issue #9821
This commit passes the parameter maxIters that represent
the maximum number of iterations, that can be passed to findFundamentalMat
to the method LMeDS.
This parameter were added to the function findFundamentalMat and
were passed just for the RANSAC method, but should be passed to
both methods to be consistent.
The MinEigenVal path through the corner.cl kernel makes use of native_sqrt,
a math builtin function which has implementation defined accuracy.
Partially addresses issue #9821
* Aligned OpenCV DNN and TF sum op behaviour
Support Mat (shape: [1, m, k, n] ) + Vec (shape: [1, 1, 1, n]) operation
by vec to mat expansion
* Added code corrections: backend, minor refactoring
Added OpenVINO ARM target
* Added IE ARM target
* Added OpenVINO ARM target
* Delete ARM target
* Detect ARM platform
* Changed device name in ArmPlugin
* Change ARM detection