Added flag to GaussianBlur for faster but not bit-exact implementation #25792
Rationale:
Current implementation of GaussianBlur is almost always bit-exact. It helps to get predictable results according platforms, but prohibits most of approximations and optimization tricks.
The patch converts `borderType` parameter to more generic `flags` and introduces `GAUSS_ALLOW_APPROXIMATIONS` flag to allow not bit-exact implementation. With the flag IPP and generic HAL implementation are called first. The flag naming and location is a subject for discussion.
Replaces https://github.com/opencv/opencv/pull/22073
Possibly related issue: https://github.com/opencv/opencv/issues/24135
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Bit exact gaussian blur for 16bit unsigned int
* bit-exact gaussian kernel for CV_16U
* SIMD optimization
* template GaussianBlurFixedPoint
* remove template specialization
* simd support for h3N121 uint16
* test for u16 gaussian blur
* remove unnecessary comments
* fix return type of raw()
* add typedef of native internal type in fixedpoint
* update return type of raw()
- removed tr1 usage (dropped in C++17)
- moved includes of vector/map/iostream/limits into ts.hpp
- require opencv_test + anonymous namespace (added compile check)
- fixed norm() usage (must be from cvtest::norm for checks) and other conflict functions
- added missing license headers
* Bit-exact implementation of GaussianBlur smoothing
* Added universal intrinsics based implementation for bit-exact CV_8U GaussianBlur smoothing.
* Added parallel_for to evaluation of bit-exact GaussianBlur
* Added custom implementations for 3x3 and 5x5 bit-exact GaussianBlur