Commit Graph

9 Commits

Author SHA1 Message Date
Maksim Shabunin
f37924796f
Merge pull request #25364 from mshabunin:fix-unaligned-filter
imgproc: fix unaligned memory access in filters and Gaussian blur #25364

* filter/SIMD: removed parts which casted 8u pointers to int causing unaligned memory access on RISC-V platform.
* GaussianBlur/fixed_point: replaced casts from s16 to u32 with union operations

Performance comparison:
- [x] check performance on x86_64 - (4 threads, `-DCPU_BASELINE=AVX2`, GCC 11.4, Ubuntu 22) - [report_imgproc_x86_64.ods](https://github.com/opencv/opencv/files/14904702/report_x86_64.ods)
- [x] check performance on AArch64 - (4 cores of RK3588, GCC 11.4 aarch64, Raspbian) - [report_imgproc_aarch64.ods](https://github.com/opencv/opencv/files/14908437/report_aarch64.ods)

Note: for some reason my performance results are quite unstable, unaffected functions show speedups and slowdowns in many cases. Filter2D and GaussianBlur seem to be OK.

Slightly related PR: https://github.com/opencv/ci-gha-workflow/pull/165
2024-04-09 17:44:36 +03:00
HAN Liutong
5e9191558d
Merge pull request #24058 from hanliutong:rewrite-imgporc
Rewrite Universal Intrinsic code by using new API: ImgProc module. #24058

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro in the `opencv/modules/imgproc` folder: rewrite them by using the new Universal Intrinsic API. 

For easier review, this PR includes a part of the rewritten code, and another part will be brought in the next PR (coming soon). I tested this patch on RVV (QEMU) and AVX devices, `opencv_test_imgproc` is passed.

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR https://github.com/opencv/opencv/pull/23885 and https://github.com/opencv/opencv/pull/23980.



### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
2023-09-14 20:37:46 +03:00
Tomoaki Teshima
1e74f5850b suppress GaussianBlur to generate empty images
* sharp Gaussian kernel causes over flow and ends up in blank image
2021-10-01 23:17:02 +09:00
Xinguang Bian
7499a15c92 fix data overflow problem in GaussianBlur 2021-05-21 15:17:20 +08:00
Yosshi999
fdeac73a59
Merge pull request #18983 from Yosshi999:bitexact-gaussian-16U-faster
support SIMD for larger symmetric Bit-exact 16U gaussian blur

* support SIMD for bit-exact 16U symmetric gaussian blur

* use tighter SIMD registers
2020-12-11 10:14:15 +00:00
Yosshi999
698b2bf729
Merge pull request #18167 from Yosshi999:bit-exact-gaussian
Bit exact gaussian blur for 16bit unsigned int

* bit-exact gaussian kernel for CV_16U

* SIMD optimization

* template GaussianBlurFixedPoint

* remove template specialization

* simd support for h3N121 uint16

* test for u16 gaussian blur

* remove unnecessary comments

* fix return type of raw()

* add typedef of native internal type in fixedpoint

* update return type of raw()
2020-09-01 10:28:25 +00:00
Vitaly Tuzov
894ad33bf4 Fix pixel value evaluation overflow in bit-exact GaussianBlur implementation 2019-07-12 18:11:51 +03:00
Alexander Alekhin
b99c9145bf imgproc: dispatch smooth 2019-03-11 13:54:12 +00:00
Alexander Alekhin
6eac8f78b9 imgproc: copy .simd.hpp 2019-03-11 13:53:59 +00:00