Improve solveCubic accuracy #27347
### Pull Request Readiness Checklist
Fix#27323
```
2e-13 * x^3 + x^2 - 2 * x + 1 = 0 -> x^3 + 5e12 * x^2 - 1e13 * x + 5e12 = 0
```
The problem that coefficients have quite big magnitudes and current calculations are subject to round-off error
```
Q = (a1 * a1 - 3 * a2) * (1./9)
R = (2 * a1 * a1 * a1 - 9 * a1 * a2 + 27 * a3) * (1./54)
Qcubed = Q * Q * Q = a1^6/729 - (a1^4 a2)/81 + (a1^2 a2^2)/27 - a2^3/27
R * R = R^2 = a1^6/729 - (a1^4 a2)/81 + (a1^2 a2^2)/36 + (a1^3 a3)/27 - (a1 a2 a3)/6 + a3^2/4
d = Qcubed - R * R
```
Let `a1`, `a2`, `a3` have quite big same magnitudes, then we see that `Qcubed` and `R * R` have same terms `a1^6/729` and `-(a1^4 a2)/81` (which will be reduced in `d`), but they level out the other terms (these terms have `6`th and `5`th degree and other terms - less or equal than `4`th degree).
So, if these terms will participate in the calculation, this will lead to a huge round-off error.
But if we expand the expression, then round-off error should be less
```
d = Qcubed - R * R = 1/108 (a1^2 a2^2 - 4 a2^3 - 4 a1^3 a3 + 18 a1 a2 a3 - 27 a3^2)
```
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Add tests for solveCubic #27331
### Pull Request Readiness Checklist
Related to #27323
I found only randomized tests with number of roots always equal to `1` or `3`, `x^3 = 0` and some simple test for Java and Swift.
Obviously, they don't cover all cases (implementation has strong branching and number of roots can be equal to `-1`, `0` and `2` additionally).
So, I think it will be useful to try explicitly cover more cases (and implementation branches correspondingly)
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Add a basic sanity test to verify the rounding functions
work as expected.
Likewise, extend the rounding performance test to cover the
additional float -> int fast math functions.
* rewrote Mat::convertTo() and convertScaleAbs() to wide universal intrinsics; added always-available and SIMD-optimized FP16<=>FP32 conversion
* fixed compile warnings
* fix some more compile errors
* slightly relaxed accuracy threshold for int->float conversion (since we now do it using single-precision arithmetics, not double-precision)
* fixed compile errors on iOS, Android and in the baseline C++ version (intrin_cpp.hpp)
* trying to fix ARM-neon builds
* trying to fix ARM-neon builds
* trying to fix ARM-neon builds
* trying to fix ARM-neon builds
* core:OE-27 prepare universal intrinsics to expand (#11022)
* core:OE-27 prepare universal intrinsics to expand (#11022)
* core: Add universal intrinsics for AVX2
* updated implementation of wide univ. intrinsics; converted several OpenCV HAL functions: sqrt, invsqrt, magnitude, phase, exp to the wide universal intrinsics.
* converted log to universal intrinsics; cleaned up the code a bit; added v_lut_deinterleave intrinsics.
* core: Add universal intrinsics for AVX2
* fixed multiple compile errors
* fixed many more compile errors and hopefully some test failures
* fixed some more compile errors
* temporarily disabled IPP to debug exp & log; hopefully fixed Doxygen complains
* fixed some more compile errors
* fixed v_store(short*, v_float16&) signatures
* trying to fix the test failures on Linux
* fixed some issues found by alalek
* restored IPP optimization after the patch with AVX wide intrinsics has been properly tested
* restored IPP optimization after the patch with AVX wide intrinsics has been properly tested
- 'if' logic is moved into templates.
- removed unnecessary cv::Mat objects creation.
- fixed inv() test (invA * A == eye)
- added more Matx tests to cover all defined template specializations
- removed tr1 usage (dropped in C++17)
- moved includes of vector/map/iostream/limits into ts.hpp
- require opencv_test + anonymous namespace (added compile check)
- fixed norm() usage (must be from cvtest::norm for checks) and other conflict functions
- added missing license headers
* add accuracy test and performance check for matmul
* add performance tests for transform and dotProduct
* add test Core_TransformLargeTest for 8u version of transform
* remove raw SSE2/NEON implementation from matmul.cpp
* use universal intrinsic instead of raw intrinsic
* remove unused templated function
* add v_matmuladd which multiply 3x3 matrix and add 3x1 vector
* add v_rotate_left/right in universal intrinsic
* suppress intrinsic on some function and platform
* add pure SW implementation of new universal intrinsics
* add test for new universal intrinsics
* core: prevent memory access after the end of buffer
* fix perf tests
This fixes all problems from the article "Checking OpenCV with PVS-Studio"
<http://www.viva64.com/en/b/0191/> that are not already fixed and are
not in 3rdparty or the legacy module.
The problems fixed are two instances of useless code and one instance
of unspecified behavior (right-shifting a negative number).