opencv/modules/core
Francesco Petrogalli 6ee23c9b85
Merge pull request #19486 from fpetrogalli:dotprod_fast-3.4
* [hal][neon] Optimize the v_dotprod_fast intrinsics for aarch64.

On Armv8 in AArch64 execution mode, we can skip the sequence

   v<op>_<ty>(vget_high_<ty>(x), vget_high_<ty>(y))

in favour of

   v<op>_high_<ty>(x, y)

This gives recent compilers a better chance of using fewer data movement
operations and of allocating registers better. See for example:

   https://godbolt.org/z/bPq7vd
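
As a minimal sketch of the two sequences above (not the code from this PR), the helper below instantiates them with `<op>` = `mull` and `<ty>` = `s16`. The name `mul_high_halves` and the scalar fallback are assumptions added for illustration, so the snippet also builds on non-Arm targets; the `__aarch64__` check stands in for whatever macro the PR actually defines.

```cpp
#include <cstdint>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical helper (not part of OpenCV): widening multiply of the upper
// four int16 lanes of a and b, storing the four int32 products in out.
static void mul_high_halves(const int16_t a[8], const int16_t b[8], int32_t out[4])
{
#if defined(__aarch64__) && (defined(__ARM_NEON) || defined(__ARM_NEON__))
    // AArch64: the *_high_* form reads the upper halves directly, so no
    // separate vget_high extraction is needed.
    const int16x8_t va = vld1q_s16(a);
    const int16x8_t vb = vld1q_s16(b);
    vst1q_s32(out, vmull_high_s16(va, vb));
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
    // Armv7 / AArch32: extract the upper halves first, then multiply.
    const int16x8_t va = vld1q_s16(a);
    const int16x8_t vb = vld1q_s16(b);
    vst1q_s32(out, vmull_s16(vget_high_s16(va), vget_high_s16(vb)));
#else
    // Scalar fallback (assumption, for non-Arm builds only).
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<int32_t>(a[i + 4]) * static_cast<int32_t>(b[i + 4]);
#endif
}
```

All three branches compute the same result; only the AArch64 branch lets the compiler emit a single high-half instruction without an intermediate extract.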

* [hal][neon] Fix build failure on armv7.

* [hal][neon] Address review comments in PR.

PR: https://github.com/opencv/opencv/pull/19486

* [hal][neon] Define macro to check for the AArch64 execution state of Armv8.

* [hal][neon] Fix macro definition for AArch64.

The fix is needed to prevent warnings when building for Armv7.
2021-02-11 13:24:09 +00:00
3rdparty/SoftFloat Add install component for 3rdparty libraries licenses 2018-03-06 16:32:30 +03:00
doc docs: intro formatting update, minor cleanup 2018-11-04 02:36:24 +00:00
include/opencv2 Merge pull request #19486 from fpetrogalli:dotprod_fast-3.4 2021-02-11 13:24:09 +00:00
misc bindings: basic support for #if preprocessor directives 2019-12-04 18:42:31 +03:00
perf core(ocl): options to control buffer access flags 2020-04-02 11:11:06 +00:00
src Disable thread sanitization when CV_USE_GLOBAL_WORKERS_COND_VAR is not set. 2021-02-09 14:12:39 +01:00
test Optimize calls to std::string::find() and friends for a single char. 2020-12-17 09:39:23 +01:00
CMakeLists.txt core: update handling of allocator stats type 2020-12-05 20:54:47 +00:00