opencv/modules/core
Developer-Ecosystem-Engineering 814550d2a6
Merge pull request #20011 from Developer-Ecosystem-Engineering:3.4
Improve performance on Arm64

* Improve performance on Apple silicon

This patch will
- Enable dot product intrinsics for macOS arm64 builds
- Enable for macOS arm64 builds
- Improve HAL primitives
  - reduction (sum, min, max, sad)
  - signmask
  - mul_expand
  - check_any / check_all
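
For readers unfamiliar with these primitives, here is a minimal scalar sketch of what they compute, lane by lane. It is not the NEON code from this patch. `Vec4i` and the function names are hypothetical stand-ins for a 4-lane register and for OpenCV's `v_reduce_sum`, `v_reduce_min`, `v_reduce_max`, `v_reduce_sad`, `v_signmask`, `v_check_all` and `v_check_any`. `mul_expand` is left out for brevity.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-lane vector standing in for a 128-bit SIMD register.
using Vec4i = std::array<std::int32_t, 4>;

// Horizontal sum of all lanes (assumes the result fits in int32).
inline std::int32_t reduce_sum(const Vec4i& v) {
    return v[0] + v[1] + v[2] + v[3];
}

// Smallest and largest lane.
inline std::int32_t reduce_min(const Vec4i& v) {
    return std::min(std::min(v[0], v[1]), std::min(v[2], v[3]));
}
inline std::int32_t reduce_max(const Vec4i& v) {
    return std::max(std::max(v[0], v[1]), std::max(v[2], v[3]));
}

// Sum of absolute differences between matching lanes.
inline std::uint32_t reduce_sad(const Vec4i& a, const Vec4i& b) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        // Widen first so the subtraction cannot overflow.
        std::int64_t d = static_cast<std::int64_t>(a[i]) - b[i];
        total += static_cast<std::uint64_t>(d < 0 ? -d : d);
    }
    return static_cast<std::uint32_t>(total);
}

// Bit i of the result is set when lane i is negative (sign bit set).
inline int signmask(const Vec4i& v) {
    int mask = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        if (v[i] < 0) mask |= 1 << i;
    }
    return mask;
}

// True when every lane / at least one lane is negative.
inline bool check_all(const Vec4i& v) { return signmask(v) == 0xF; }
inline bool check_any(const Vec4i& v) { return signmask(v) != 0; }
```

A scalar reference like this is mainly useful for checking a vectorized implementation against known inputs.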

Results on an M1 MacBook Pro

* Updates to #20011 based on feedback

  - Removes Apple Silicon specific workarounds
  - Makes #ifdef sections smaller for v_mul_expand cases
  - Moves dot product optimization to compiler optimization check
  - Adds 4x4 matrix transpose optimization
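
To illustrate the 4x4 transpose mentioned in the last item, here is a scalar sketch of the operation. It shows the intended result only, not the vectorized implementation in this patch. `Row4`, `Mat4` and `transpose4x4` are hypothetical names chosen for this example.

```cpp
#include <array>
#include <cstddef>

// Hypothetical types: four rows of four floats, one row per SIMD register.
using Row4 = std::array<float, 4>;
using Mat4 = std::array<Row4, 4>;

// Returns the transpose: element (r, c) of the input becomes (c, r).
inline Mat4 transpose4x4(const Mat4& m) {
    Mat4 t{};
    for (std::size_t r = 0; r < 4; ++r) {
        for (std::size_t c = 0; c < 4; ++c) {
            t[c][r] = m[r][c];
        }
    }
    return t;
}
```

Transposing twice returns the original matrix, which makes a convenient sanity check for any optimized version.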

* Remove dotprod and fix v_transpose

Based on the latest feedback, we've removed dotprod entirely and will revisit it in a future PR.

Added explicit casts with v_transpose4x4()

This should resolve all open items on this PR

* Remove commented out lines

Remove two extraneous comments
2021-06-01 09:39:55 +03:00
3rdparty/SoftFloat Add install component for 3rdparty libraries licenses 2018-03-06 16:32:30 +03:00
doc docs: intro formatting update, minor cleanup 2018-11-04 02:36:24 +00:00
include/opencv2 Merge pull request #20011 from Developer-Ecosystem-Engineering:3.4 2021-06-01 09:39:55 +03:00
misc Fix/optimize Android put/get functions 2021-02-19 17:10:11 +09:00
perf Relax accuracy requirements in the OpenCL sqrt perf arithmetic test. 2021-04-06 17:32:48 +01:00
src Merge pull request #20172 from alalek:fixup_19334 2021-05-28 14:09:52 +00:00
test Merge pull request #20011 from Developer-Ecosystem-Engineering:3.4 2021-06-01 09:39:55 +03:00
CMakeLists.txt core: rework code locality 2021-03-02 23:24:28 +00:00