Commit Graph

90 Commits

Author SHA1 Message Date
Vadim Pisarevsky
2f35847960
Merge pull request #26321 from vpisarev:better_bfloat
2x more accurate float => bfloat conversion #26321

There is a magic trick to make float => bfloat conversion more accurate (_original reference needed, is it done this way in PyTorch?_). In simplified form it looks like:

```
uint16_t f2bf(float x) {
    union {
        unsigned u;
        float f;
    } u;
    u.f = x;
    // return (uint16_t)(u.u >> 16); <== the old method before this patch
    return (uint16_t)((u.u + 0x8000) >> 16);
} 
```

it works correctly for almost all valid floating-point values, positive, zero or negative, and even for some extreme cases, like `+/-inf`, `nan` etc. The addition of `0x8000` to integer representation of 32-bit float before retrieving the highest 16 bits reduces the rounding error by ~2x.

The slight problem with this improved method is that the numbers very close to or equal to `+/-FLT_MAX` are mistakenly converted to `+/-inf`, respectively.

This patch implements improved algorithm for `float => bfloat` conversion in scalar and vector form; it fixes the above-mentioned problem using some extra bit magic, i.e. 0x8000 is not added to very big (by absolute value) numbers:

```
// the actual implementation is more efficient,
// without conditions or floating-point operations, see the source code
return (uint16_t)(u.u + (fabsf(x) <= big_threshold ? 0x8000 : 0)) >> 16);
```

The corresponding test has been added as well and this is output from the test:

```
[----------] 1 test from Core_BFloat
[ RUN      ] Core_BFloat.convert
maxerr0 = 0.00774842, mean0 = 0.00190643, stddev0 = 0.00186063
maxerr1 = 0.00389057, mean1 = 0.000952614, stddev1 = 0.000931268
[       OK ] Core_BFloat.convert (7 ms)
```

Here `maxerr0, mean0, stddev0` are for the original method and `maxerr1, mean1, stddev1` are for the new method. As you can see, there is a significant improvement in accuracy.

**Note:**

_Actually, on ~32,000,000 random FP32 numbers with uniformly distributed sign, exponent and mantissa the new method is always at least as accurate as the old one._

The test also checks all the corner cases, where we see no degradation either vs the original method.

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
2024-10-18 14:46:40 +03:00
Maksim Shabunin
d0e410da93 C-API cleanup: rework ArrayTest to use new arrays only 2024-10-09 22:36:20 +03:00
Maksim Shabunin
26ea34c4cb Merge branch '4.x' into '5.x' 2024-06-26 19:01:34 +03:00
Maksim Shabunin
ef3303716e test: use cv::theRNG instead of own generator 2024-06-07 13:36:11 +03:00
Maksim Shabunin
8cbdd0c833
Merge pull request #25075 from mshabunin:cleanup-imgproc-1
C-API cleanup: apps, imgproc_c and some constants #25075

Merge with https://github.com/opencv/opencv_contrib/pull/3642

* Removed obsolete apps - traincascade and createsamples (please use older OpenCV versions if you need them). These apps relied heavily on C-API
* removed all mentions of imgproc C-API headers (imgproc_c.h, types_c.h) - they were empty, included core C-API headers
* replaced usage of several C constants with C++ ones (error codes, norm modes, RNG modes, PCA modes, ...) - most part of this PR (split into two parts - all modules and calib+3d - for easier backporting)
* removed imgproc C-API headers (as separate commit, so that other changes could be backported to 4.x)

Most of these changes can be backported to 4.x.
2024-03-05 12:18:31 +03:00
Alexander Smorkalov
5af40a0269 Merge branch 4.x 2023-07-05 15:51:10 +03:00
Pierre Chatelier
6dd8a9b6ad
Merge pull request #13879 from chacha21:REDUCE_SUM2
add REDUCE_SUM2 #13879 

proposal to add REDUCE_SUM2 to cv::reduce, an operation that sums up the square of elements
2023-04-28 20:42:52 +03:00
Alexander Alekhin
1d530eb2e2 core(test_math): replace the_rng() => cv::theRNG() 2023-01-29 19:51:18 +00:00
Alexander Alekhin
f33598f55e Merge branch 4.x 2023-01-28 17:31:32 +00:00
Alexander Alekhin
18cbfa4a4f Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2023-01-23 00:11:12 +00:00
Rostislav Vasilikhin
f3a03aefad cvIsInf(double) fix + regression test 2023-01-17 23:06:39 +01:00
Alexander Alekhin
899b4d1452 Merge branch 4.x 2022-02-22 19:55:26 +00:00
Suleyman TURKMEN
0e6a2c0491 fix legacy constants 2022-01-03 15:08:10 +03:00
Alexander Alekhin
a0d5277e0d Merge branch 4.x 2021-12-30 21:43:45 +00:00
Alexander Alekhin
8b4fa2605e Merge remote-tracking branch 'upstream/3.4' into merge-3.4 2021-12-03 12:32:49 +00:00
yuki takehara
a6277370ca
Merge pull request #21107 from take1014:remove_assert_21038
resolves #21038

* remove C assert

* revert C header

* fix several points in review

* fix test_ds.cpp
2021-11-27 18:34:52 +00:00
Suleyman TURKMEN
178240ccf1 Clean up C API backport ready changes 2021-03-03 14:37:45 +03:00
Alexander Alekhin
9d2eabaaa2 Merge remote-tracking branch 'upstream/master' into merge-4.x 2020-11-27 18:15:28 +00:00
Alexander Alekhin
5c987e4c75
Merge pull request #18924 from alalek:4.x-xcode12
(4.x) build: Xcode 12 support

* build: xcode 12 support, cmake fixes

* ts: eliminate clang 11 warnigns

* 3rdparty: clang 11 warnings

* features2d: eliminate build warnings

* test: warnings

* gapi: warnings from 18928
2020-11-26 22:56:59 +00:00
Vadim Pisarevsky
2ee9d21dae
Merge pull request #18571 from vpisarev:add_lapack
Added clapack

* bring a small subset of Lapack, automatically converted to C, into OpenCV

* added missing lsame_ prototype

* * small fix in make_clapack script
* trying to fix remaining CI problems

* fixed character arrays' initializers

* get rid of F2C_STR_MAX

* * added back single-precision versions for QR, LU and Cholesky decompositions. It adds very little extra overhead.
* added stub version of sdesdd.
* uncommented calls to all the single-precision Lapack functions from opencv/core/src/hal_internal.cpp.

* fixed warning from Visual Studio + cleaned f2c runtime a bit

* * regenerated Lapack w/o forward declarations of intrinsic functions (such as sqrt(), r_cnjg() etc.)
* at once, trailing whitespaces are removed from the generated sources, just in case
* since there is no declarations of intrinsic functions anymore, we could turn some of them into inline functions

* trying to eliminate the crash on ARM

* fixed API and semantics of s_copy

* * CLapack has been tested successfully. It's now time to restore the standard LAPACK detection procedure
* removed some more trailing whitespaces

* * retained only the essential stuff in CLapack
* added checks to lapack calls to gracefully return "not implemented" instead of returning invalid results with "ok" status

* disabled warning when building lapack

* cmake: update LAPACK detection

Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2020-11-05 21:46:51 +00:00
Alexander Alekhin
3266ac7667 core(kmeans): bailout if can't select cluster center 2019-11-22 14:40:02 +00:00
Paul E. Murphy
b2135be594 fast_math: add extra perf/unit tests
Add a basic sanity test to verify the rounding functions
work as expected.

Likewise, extend the rounding performance test to cover the
additional float -> int fast math functions.
2019-08-07 14:59:46 -05:00
Alexander Alekhin
d6b82dcd65
Merge pull request #14162 from alalek:eliminate_coverity_scan_issues
core: eliminate coverity scan issues (#14162)

* core(hal): avoid using of r,g,b,a parameters in interleave/deinterleave

- static analysis tools blame on possible parameters reordering
- align AVX parameters with corresponding SSE/NEO/VSX/cpp code

* core: avoid "i,j" parameters in Matx methods

- static analysis tools blame on possible parameters reordering

* core: resolve coverity scan issues
2019-03-27 15:48:00 +03:00
Alexander Alekhin
f1f15841d7 Merge pull request #11630 from alalek:c_api_eliminate_constructors 2018-09-06 20:07:16 +00:00
Vadim Pisarevsky
80b62a41c6 Merge pull request #12411 from vpisarev:wide_convert
* rewrote Mat::convertTo() and convertScaleAbs() to wide universal intrinsics; added always-available and SIMD-optimized FP16<=>FP32 conversion

* fixed compile warnings

* fix some more compile errors

* slightly relaxed accuracy threshold for int->float conversion (since we now do it using single-precision arithmetics, not double-precision)

* fixed compile errors on iOS, Android and in the baseline C++ version (intrin_cpp.hpp)

* trying to fix ARM-neon builds

* trying to fix ARM-neon builds

* trying to fix ARM-neon builds

* trying to fix ARM-neon builds
2018-09-06 19:36:59 +03:00
Alexander Alekhin
8a3c394d6a don't use constructors for C API structures 2018-09-06 14:34:16 +03:00
Vadim Pisarevsky
f058b5fb1e
Wide univ intrinsics (#11953)
* core:OE-27 prepare universal intrinsics to expand (#11022)

* core:OE-27 prepare universal intrinsics to expand (#11022)

* core: Add universal intrinsics for AVX2

* updated implementation of wide univ. intrinsics; converted several OpenCV HAL functions: sqrt, invsqrt, magnitude, phase, exp to the wide universal intrinsics.

* converted log to universal intrinsics; cleaned up the code a bit; added v_lut_deinterleave intrinsics.

* core: Add universal intrinsics for AVX2

* fixed multiple compile errors

* fixed many more compile errors and hopefully some test failures

* fixed some more compile errors

* temporarily disabled IPP to debug exp & log; hopefully fixed Doxygen complains

* fixed some more compile errors

* fixed v_store(short*, v_float16&) signatures

* trying to fix the test failures on Linux

* fixed some issues found by alalek

* restored IPP optimization after the patch with AVX wide intrinsics has been properly tested

* restored IPP optimization after the patch with AVX wide intrinsics has been properly tested
2018-07-16 18:57:24 +03:00
Alexander Alekhin
3c74fde349 core: eliminate 'if' logic from Matx::inv()/solve()
- 'if' logic is moved into templates.
- removed unnecessary cv::Mat objects creation.
- fixed inv() test (invA * A == eye)
- added more Matx tests to cover all defined template specializations
2018-07-13 20:09:01 +03:00
Vitaly Tuzov
850a8577b2 Fixed unreachable code warnings for Matx::solve() 2018-07-12 19:19:51 +03:00
Vitaly Tuzov
d0a3686812 Merge pull request #11904 from terfendail/matx_solve_fix
Fixed Matx::solve function for non-square matrixes (#11904)
2018-07-11 22:00:57 +03:00
Alexander Alekhin
4a297a2443 ts: refactor OpenCV tests
- removed tr1 usage (dropped in C++17)
- moved includes of vector/map/iostream/limits into ts.hpp
- require opencv_test + anonymous namespace (added compile check)
- fixed norm() usage (must be from cvtest::norm for checks) and other conflict functions
- added missing license headers
2018-02-03 19:39:47 +00:00
Rostislav Vasilikhin
7d18f49a49 SoftFloat tests: assert => expect 2017-12-14 21:03:25 +03:00
Vitaly Tuzov
86b128dbb3 Added implementation of softdouble rounding to int64_t 2017-12-11 14:29:32 +03:00
Tomoaki Teshima
3cbe60cca2 Merge pull request #9753 from tomoaki0705:universalMatmul
* add accuracy test and performance check for matmul
  * add performance tests for transform and dotProduct
  * add test Core_TransformLargeTest for 8u version of transform

* remove raw SSE2/NEON implementation from matmul.cpp
  * use universal intrinsic instead of raw intrinsic
  * remove unused templated function
  * add v_matmuladd which multiply 3x3 matrix and add 3x1 vector
  * add v_rotate_left/right in universal intrinsic
  * suppress intrinsic on some function and platform
  * add pure SW implementation of new universal intrinsics
  * add test for new universal intrinsics

* core: prevent memory access after the end of buffer

* fix perf tests
2017-11-20 15:56:53 +03:00
Vladislav Sovrasov
32bf712102 cmake: disable implicit-fallthrough by default 2017-09-11 16:04:00 +03:00
Rostislav Vasilikhin
66b0651607 Merge pull request #9329 from savuor:softfloat_sincos
SoftFloat: added sin, cos and docs (#9329)

* softfloat: comparison operators made inline, min() max() eps() isSubnormal() added

* softfloat: get/set sign/exp

* softfloat: get/set frac

* softfloat: tests rewritten with new tools

* softfloat: added pi(), sin(), cos()

* softfloat: more comments

* softfloat: updated sincos arg reduction

* softfloat: initial tests for sincos added

* softfloat: test works, code cleanup is pending

* softfloat: sincos argreduce rewritten

* softfloat: sincos refactored and simplified

* softfloat sincos: epsilons calibrated

* softfloat: junk code removed from tests

* softfloat: docs added

* inline comparisons undone; warning fixed
2017-08-15 09:23:26 +00:00
Alexander Alekhin
71517a910a build: fix errors for MSVS2010-2013, reduce default softfloat scope 2017-06-08 01:09:21 +00:00
Rostislav Vasilikhin
c6a3a18894 SoftFloat integrated (#8668)
* everything is put into softfloat.cpp and softfloat.hpp

* WIP: try to integrate softfloat into OpenCV

* extra functions removed

* softfloat made stateless

* CV_EXPORTS added

* operators fixed

* exp added, log: WIP

* log32 fixed

* shorter names; a lot of TODOs

* log64 rewritten

* cbrt32 added

* minors, refactoring

* "inline" -> "CV_INLINE"

* cast to bool warnings fixed

* several warnings fixed

* fixed warning about unsigned unary minus

* fixed warnings on type cast

* inline -> CV_INLINE

* special cases processing added (NaNs, Infs, etc.)

* constants for NaN and Inf added

* more macros and helper functions added

* added (or fixed) tests for pow32, pow64, cbrt32

* exp-like functions fixed

* minor changes

* fixed random number generation for tests

* tests for exp32 and exp64: values are compared to SoftFloat-based naive implementation

* minor warning fix

* pow(f, i) 32/64: special cases handling added

* unused functions removed

* refactoring is in progress (not compiling)

* CV_inline added

* unions {uint_t, float_t} removed

* tests compilation fixed

* static const members -> static methods returning const

* reinterpret_cast

* warning fixed

* const-ness fixed

* all FP calculations (even compile-time) are done in SoftFloat + minor fixes

* pow(f, i) removed from interface (can cause incorrect cast) to internals of pow(f, f), tests fixed

* CV_INLINE -> inline

* internal constants moved to .cpp file

* toInt_minMag() methods merged into toInt() methods

* macros moved to .cpp file

* refactoring: types renamed to softfloat and softdouble; explicit constructors, etc.

* toFloat(), toDouble() -> operator float(), operator double()

* removed f32/f64 prefixes from functions names

* toType() methods removed, round() and trunc() functions added

* minor change

* minors

* MSVC: warnings fixed

* added int cvRound(), cvFloor, cvCeil, cvTrunc, saturate_cast<T>()

* typo fixed

* type cast fixed
2017-05-29 17:07:25 +03:00
Maksim Shabunin
b417b4dbee KMeans improvement
- fixed returned compactness value
- added centers drawing to the example app
- added compactness test
2017-01-31 12:05:08 +03:00
Vladislav Sovrasov
dfe4519c07 Add QR decomposition to HAL 2016-09-05 18:20:04 +03:00
LaurentBerger
b75bac7975 Solve Issue 7063
consequences of changes

accuracy test

Solve issue 7063
2016-08-11 10:56:50 +02:00
Alexander Alekhin
3844ee780c build: fix compiler warnings (GCC 5.3.1) 2016-07-01 20:17:16 +03:00
Maksim Shabunin
1e667de1f3 HAL math interfaces: fastAtan2, magnitude, sqrt, invSqrt, log, exp 2016-05-31 11:54:52 +03:00
Alexander Alekhin
b26580cc7b checkRange fixes
1) fix multichannel support
2) remove useless bad_value, read value from original Mat directly
3) add more tests
4) fix docs for cvCeil and checkRange
2015-12-09 18:31:27 +03:00
Vadim Pisarevsky
d19897b734 Merge pull request #5651 from hoangviet1985:fix_solvePoly_3.0.0 2015-12-07 10:12:54 +00:00
hoangviet1985
3e96b724c2 squash 2015-11-20 15:03:32 -05:00
hoangviet1985
b96def885f squash 2015-11-20 14:48:29 -05:00
Vadim Pisarevsky
73f760fdf0 some more compile warnings fixed 2015-05-05 18:03:40 +03:00
Vadim Pisarevsky
9fbd1d68ad refactored div & pow funcs; added tests for special cases in pow() function.
fixed http://code.opencv.org/issues/3935
possibly fixed http://code.opencv.org/issues/3594
2015-05-01 21:49:11 +03:00
Adil Ibragimov
8a4a1bb018 Several type of formal refactoring:
1. someMatrix.data -> someMatrix.prt()
2. someMatrix.data + someMatrix.step * lineIndex -> someMatrix.ptr( lineIndex )
3. (SomeType*) someMatrix.data -> someMatrix.ptr<SomeType>()
4. someMatrix.data -> !someMatrix.empty() ( or !someMatrix.data -> someMatrix.empty() ) in logical expressions
2014-08-13 15:21:35 +04:00