opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-14 00:39:13 +08:00

Vadim Pisarevsky 2f35847960 Merge pull request #26321 from vpisarev:better_bfloat	2024-10-18 14:46:40 +03:00

2x more accurate float => bfloat conversion #26321

There is a magic trick to make float => bfloat conversion more accurate (_original reference needed, is it done this way in PyTorch?_). In simplified form it looks like:

```
uint16_t f2bf(float x)
{
    union { unsigned u; float f; } u;
    u.f = x;
    // return (uint16_t)(u.u >> 16); <== the old method before this patch
    return (uint16_t)((u.u + 0x8000) >> 16);
}
```

It works correctly for almost all valid floating-point values, whether positive, zero or negative, and even for some extreme cases such as `+/-inf` and `nan`. Adding `0x8000` to the integer representation of the 32-bit float before taking the highest 16 bits reduces the rounding error by ~2x. The slight problem with this improved method is that numbers very close to or equal to `+/-FLT_MAX` are mistakenly converted to `+/-inf`, respectively.

This patch implements the improved algorithm for `float => bfloat` conversion in scalar and vector form. It fixes the above-mentioned problem using some extra bit magic, i.e. `0x8000` is not added to numbers that are very big by absolute value:

```
// the actual implementation is more efficient,
// without conditions or floating-point operations, see the source code
return (uint16_t)((u.u + (fabsf(x) <= big_threshold ? 0x8000 : 0)) >> 16);
```

The corresponding test has been added as well, and this is the output from the test:

```
[----------] 1 test from Core_BFloat
[ RUN      ] Core_BFloat.convert
maxerr0 = 0.00774842, mean0 = 0.00190643, stddev0 = 0.00186063
maxerr1 = 0.00389057, mean1 = 0.000952614, stddev1 = 0.000931268
[       OK ] Core_BFloat.convert (7 ms)
```

Here `maxerr0, mean0, stddev0` are for the original method and `maxerr1, mean1, stddev1` are for the new method. The maximum error, mean error and standard deviation are all roughly halved.

Note: _Actually, on ~32,000,000 random FP32 numbers with uniformly distributed sign, exponent and mantissa the new method is always at least as accurate as the old one._ The test also checks all the corner cases, where we see no degradation either vs the original method.

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
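The rounding trick described in the commit message above can be sketched as a small self-contained C++ example. This is an illustration under stated assumptions, not the code from the patch: the helper names `f2bf_trunc`, `f2bf_round` and `bf2f` are hypothetical, the threshold `0x7f7f8000` is chosen here for illustration, and the patch itself is described as using a branch-free form.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstdint>
#include <cstring>

// Old method: keep the top 16 bits of the float, discarding the rest.
static uint16_t f2bf_trunc(float x)
{
    uint32_t u;
    std::memcpy(&u, &x, sizeof u);   // well-defined way to read the bits
    return (uint16_t)(u >> 16);
}

// Improved method: add 0x8000 (half of the discarded range) before shifting,
// except for very large magnitudes, inf and nan.
static uint16_t f2bf_round(float x)
{
    uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    uint32_t a = u & 0x7fffffffu;    // bits of |x|
    // Assumption (illustrative threshold): 0x7f7f8000 is the smallest
    // magnitude for which adding 0x8000 would carry into the all-ones
    // exponent, turning a finite value into inf. inf and nan bit patterns
    // are larger than this, so they are left untouched as well.
    uint32_t bias = a < 0x7f7f8000u ? 0x8000u : 0u;
    return (uint16_t)((u + bias) >> 16);
}

// Convert bfloat bits back to float by restoring the low 16 zero bits.
static float bf2f(uint16_t h)
{
    uint32_t u = (uint32_t)h << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

For example, `1.005859375f` has the bit pattern `0x3f80c000`: `f2bf_trunc` returns `0x3f80` (which decodes to 1.0), while `f2bf_round` returns `0x3f81` (1.0078125), the closer of the two bfloat neighbours. `f2bf_round(FLT_MAX)` returns `0x7f7f`, the largest finite bfloat, instead of overflowing to `+inf`.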
..
ocl	Merge pull request #26256 from vpisarev:expanded_tests_for_norm	2024-10-07 17:07:59 +03:00
ref_reduce_arg.impl.hpp	Merge pull request #20733 from rogday:argmaxnd	2021-11-28 16:17:46 +00:00
test_allocator.cpp	Merge branch 4.x	2024-08-06 15:31:30 +03:00
test_arithm.cpp	Merge branch 4.x	2024-09-23 14:18:25 +03:00
test_async.cpp	Merge pull request #23736 from seanm:c++11-simplifications	2024-01-19 16:53:08 +03:00
test_concatenation.cpp	Added more strict checks for empty inputs to compare, meanStdDev and RNG::fill	2018-07-26 18:06:38 +03:00
test_conjugate_gradient.cpp	ts: refactor OpenCV tests	2018-02-03 19:39:47 +00:00
test_countnonzero.cpp	removed unnecessary check (checked many times in other tests)	2024-10-04 23:57:24 +03:00
test_cuda.cpp	add cuda::Stream constructor with cuda flags	2021-01-28 16:14:01 +01:00
test_downhill_simplex.cpp	Misc. modules/ typos	2018-02-12 07:09:43 -05:00
test_ds.cpp	Merge pull request #25075 from mshabunin:cleanup-imgproc-1	2024-03-05 12:18:31 +03:00
test_dxt.cpp	C-API cleanup: rework ArrayTest to use new arrays only	2024-10-09 22:36:20 +03:00
test_eigen.cpp	Merge pull request #26101 from mshabunin:cpp-error-ts	2024-09-06 12:05:47 +03:00
test_hal_core.cpp	Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX	2022-09-10 09:39:43 +03:00
test_hasnonzero.cpp	Extended several core functions to support new types (#24962)	2024-02-11 10:42:41 +03:00
test_intrin128.simd.hpp	Merge pull request #22179 from hanliutong:new-rvv	2022-07-19 20:02:00 +03:00
test_intrin256.simd.hpp	core(test): intrinsic tests for all dispatched CPU optimizations	2018-08-01 13:50:42 +03:00
test_intrin512.simd.hpp	Merge pull request #14210 from terfendail:wui_512	2019-06-03 18:05:35 +03:00
test_intrin_emulator.cpp	Merge pull request #16236 from alalek:fix_core_simd_emulator	2020-01-10 21:31:02 +03:00
test_intrin_utils.hpp	Merge pull request #26023 from WanliZhong:vfunc_hfloat	2024-09-02 12:37:57 +03:00
test_intrin.cpp	core: fix F16C compilation check	2020-11-17 12:22:49 +00:00
test_io.cpp	Merge branch 4.x	2024-08-06 15:31:30 +03:00
test_logtagconfigparser.cpp	Merge pull request #13909 from kinchungwong:logging_20190220	2019-04-22 00:01:10 +03:00
test_logtagmanager.cpp	build: reduce usage of constexpr	2019-04-22 15:41:27 +03:00
test_lpsolver.cpp	Merge remote-tracking branch 'origin/3.4' into merge-3.4	2023-06-20 09:56:57 +03:00
test_main.cpp	Merge pull request #11897 from Jakub-Golinowski:hpx_backend	2018-08-31 16:23:26 +03:00
test_mat.cpp	Merge pull request #25945 from vrabaud:02_fix	2024-07-30 12:44:12 +03:00
test_math.cpp	Merge pull request #26321 from vpisarev:better_bfloat	2024-10-18 14:46:40 +03:00
test_misc.cpp	Merge branch 4.x	2024-08-28 15:06:19 +03:00
test_opencl.cpp	UMat usageFlags fixes opencv/opencv#19807	2021-06-03 16:33:03 +02:00
test_operations.cpp	Merge branch 4.x	2024-08-28 15:06:19 +03:00
test_precomp.hpp	Merge branch 4.x	2024-01-23 17:06:52 +03:00
test_ptr.cpp	3.4: backported changes from 'master' branch	2019-08-14 16:36:08 +03:00
test_quaternion.cpp	Merge pull request #19026 from chargerKong:dualquat	2021-02-17 17:05:08 +00:00
test_rand.cpp	Merge pull request #25075 from mshabunin:cleanup-imgproc-1	2024-03-05 12:18:31 +03:00
test_rotatedrect.cpp	ts: refactor OpenCV tests	2018-02-03 19:39:47 +00:00
test_umat.cpp	Merge pull request #25075 from mshabunin:cleanup-imgproc-1	2024-03-05 12:18:31 +03:00
test_utils_tls.impl.hpp	Merge pull request #23736 from seanm:c++11-simplifications	2024-01-19 16:53:08 +03:00
test_utils.cpp	replace lena.jpg in find-existing-file tests	2024-05-25 08:53:33 +01:00