Speed up line merging in INTER_AREA #24412
This provides a 10 to 20% speed-up.
Related perf test fix: https://github.com/opencv/opencv/pull/24417
This is split out of https://github.com/opencv/opencv/pull/23525, which will be updated to deal only with column merging.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Rewrite Universal Intrinsic code by using new API: ImgProc module. #24058
The goal of this series of PRs is to rewrite the SIMD code blocks guarded by the CV_SIMD macro in the `opencv/modules/imgproc` folder using the new Universal Intrinsic API.
For easier review, this PR includes only part of the rewritten code; the rest will follow in the next PR (coming soon). I tested this patch on RVV (QEMU) and AVX devices, and `opencv_test_imgproc` passes.
The patch is partially auto-generated using the [rewriter](https://github.com/hanliutong/rewriter); related PRs: https://github.com/opencv/opencv/pull/23885 and https://github.com/opencv/opencv/pull/23980.
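For context, a minimal sketch of the kind of mechanical change this rewrite involves. The function and buffer names below are made up for illustration and are not from the patch; roughly, fixed-width vector types and overloaded operators give way to size-agnostic types, `VTraits` lane counts, and named operations so the same code also builds on scalable backends such as RVV.

```cpp
#include "opencv2/core/hal/intrin.hpp"
using namespace cv;

// Hypothetical helper, not from the patch: scale a float row in place.
static void scaleRow(float* row, int width, float alpha)
{
#if (CV_SIMD || CV_SIMD_SCALABLE)
    const int vlanes = VTraits<v_float32>::vlanes();  // replaces fixed v_float32x4::nlanes
    const v_float32 v_alpha = vx_setall_f32(alpha);
    int x = 0;
    for (; x + vlanes <= width; x += vlanes)
    {
        v_float32 v = vx_load(row + x);
        v_store(row + x, v_mul(v, v_alpha));          // named ops instead of overloaded operators
    }
    for (; x < width; x++)                            // scalar tail
        row[x] *= alpha;
#else
    for (int x = 0; x < width; x++)
        row[x] *= alpha;
#endif
}
```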
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Fix even input dimensions for INTER_NEAREST_EXACT #23634
### Pull Request Readiness Checklist
resolves https://github.com/opencv/opencv/issues/22204
related: https://github.com/opencv/opencv/issues/9096#issuecomment-1551306017
/cc @Yosshi999
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
* Replaced most remaining sprintf with snprintf
* Deprecated encodeFormat and introduced new method that takes the buffer length
* Also increased the buffer size at call sites, in case int is 64-bit
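The pattern behind the first two items is the usual bounded-formatting change; a minimal sketch (the helper below is illustrative only and is not the new encodeFormat overload itself):

```cpp
#include <cstdio>

// Illustrative helper only; the real change touches encodeFormat and its call sites.
static void formatInt(char* buf, std::size_t bufLen, int value)
{
    // Before: sprintf(buf, "%d", value);       // no bound, can overrun buf
    std::snprintf(buf, bufLen, "%d", value);    // bounded: never writes past bufLen bytes
}
```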
Bit-exact Nearest Neighbor Resizing
* bit exact resizeNN
* change the value of method enum
* add bitexact-nn to ResizeExactTest
* test to compare with non-exact version
* add perf for bit-exact resizenn
* use cvFloor-equivalent
* 1/3 scaling is not stable in floating-point calculation
* stricter test
* bugfix: broken data in case of 6 or 12bytes elements
* bugfix: broken data in default pix_size
* stricter threshold
* use raw() for floor
* use double instead of int
* follow code reviews
* fewer cases in perf test
* center pixel convention
* Fix NN resize with dimensions > 4
* add test check for nn resize with channels > 4
* Change types from float to double
* Del unnecessary test file. Move nn test to test_imgwarp. Add 5 channels test only.
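For reference, a scalar sketch of the center-pixel convention the bit-exact path is meant to reproduce. The real implementation does this with integer/fixed-point arithmetic so results are deterministic across platforms; the rounding below is a simplified reference, not the production formula.

```cpp
#include <cmath>
#include <algorithm>

// Map a destination column to a source column using the pixel-center convention.
// Simplified reference only; the bit-exact code avoids this floating-point round trip.
static int nearestSrcIndex(int dx, double scale, int srcSize)
{
    int sx = (int)std::floor((dx + 0.5) * scale);   // destination pixel center into source space
    return std::min(std::max(sx, 0), srcSize - 1);  // clamp to the valid source range
}
```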
Actually, we can do this in constant time. xofs always contains the same or increasing offset values, so we can instead find the most extreme value used and never attempt to load it. Similarly, we can note that for all dx >= 0 and dx < (dwidth - cn), xofs[dx] + cn < xofs[dwidth - cn] implies dx < (dwidth - cn). Thus, we can use this to control our loop termination optimally.
This fixes #16137 with little or no performance impact. I have also added a debug check as a sanity check.
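A sketch of how such a bound can be derived once, before the loop, from the non-decreasing xofs table. The names follow the resize code, but the byte arithmetic is simplified and illustrative, not the exact expression used in the patch.

```cpp
// Because xofs[] is non-decreasing, only the last few destination columns can reference
// source offsets near the end of the row. Compute the vector-loop stop once instead of
// range-checking every iteration.
static int vectorLoopStop(const int* xofs, int dwidth, int cn,
                          int srcRowBytes, int vecLoadBytes)
{
    int stop = dwidth;              // exclusive bound for the vector loop
    while (stop >= cn && xofs[stop - cn] + vecLoadBytes > srcRowBytes)
        stop -= cn;                 // back off one pixel (cn elements) at a time
    return stop;                    // the scalar tail handles dx in [stop, dwidth)
}
```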
* imgproc: Prevent 1B overrun of 8C3 SIMD optimization
The fourth value read via v_load_q is essentially ignored,
but can cause trouble if it happens to cross page boundaries.
The final few iterations may attempt to read the most extreme elements of S, which will read 1B beyond the array in most alignment cases. Dynamically compute the stop. This could be hoisted from the loop, but that would require a more extensive change.
Likewise, clean up the iteration increment statements to make it more obvious that they advance by the channel count (3) elements per pass.
This should resolve #16137
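A scalar model of the hazard and of the in-loop stop condition. Variable names are illustrative; a 4-byte memcpy stands in for the SIMD quad load described above.

```cpp
#include <cstring>
#include <cstdint>

// The SIMD path loads 4 bytes per 3-channel pixel and discards the 4th, so without a
// dynamically computed stop the last pixel of a row can read 1 byte past the source.
static void copyRow8UC3(const uint8_t* S, int srcRowBytes,
                        uint8_t* D, const int* xofs, int dwidthElems)
{
    int dx = 0;
    // "Vector" body: only while a 4-byte read at xofs[dx] stays inside the row.
    for (; dx < dwidthElems && xofs[dx] + 4 <= srcRowBytes; dx += 3)
    {
        uint32_t quad;
        std::memcpy(&quad, S + xofs[dx], 4);   // 4 bytes read, 3 used
        std::memcpy(D + dx, &quad, 3);
    }
    for (; dx < dwidthElems; dx += 3)          // tail: copy exactly 3 bytes per pixel
        std::memcpy(D + dx, S + xofs[dx], 3);
}
```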
* imgproc(resize): extra check
* resize: HResizeLinear reduce duplicate work
There appears to be a 2x unroll of HResizeLinear against k; however, the k value is only incremented by 1 during the unroll. This results in k - 1 duplicate passes when k > 1.
Likewise, the final pass may not respect the work done by the vector
loop. Start it with the offset returned by the vector op if
implemented. Note, no vector ops are implemented today.
The performance is most noticeable on a linear downscale. A set of performance tests is added to characterize this. The performance improvement is 10-50% depending on the scaling.
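A reduced scalar sketch of the corrected loop structure (simplified from the real HResizeLinear: types and the interpolation are pared down, and dx0 stands in for the column count the vector kernel reports as already handled, 0 when no vector kernel is implemented):

```cpp
// Two rows are interpolated per pass, so k advances by 2; the scalar columns resume at
// dx0, where the (optional) vector kernel stopped. Illustrative only.
static void hresizeLinearSketch(const float** src, float** dst, int count,
                                const int* xofs, const float* alpha,
                                int dwidth, int cn, int dx0)
{
    int k = 0;
    for (; k <= count - 2; k += 2)             // advance by 2: each pass handles two rows
    {
        const float *S0 = src[k], *S1 = src[k + 1];
        float *D0 = dst[k], *D1 = dst[k + 1];
        for (int dx = dx0; dx < dwidth; dx++)  // resume where the vector kernel stopped
        {
            int sx = xofs[dx];
            float a0 = alpha[dx * 2], a1 = alpha[dx * 2 + 1];
            D0[dx] = S0[sx] * a0 + S0[sx + cn] * a1;
            D1[dx] = S1[sx] * a0 + S1[sx + cn] * a1;
        }
    }
    for (; k < count; k++)                     // odd leftover row
    {
        const float* S = src[k];
        float* D = dst[k];
        for (int dx = dx0; dx < dwidth; dx++)
        {
            int sx = xofs[dx];
            D[dx] = S[sx] * alpha[dx * 2] + S[sx + cn] * alpha[dx * 2 + 1];
        }
    }
}
```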
* imgproc: vectorize HResizeLinear
Performance is mostly gated by the gather operations
for x inputs.
Likewise, provide a 2x unroll against k; this halves the number of alpha gathers for larger k.
While not a 4x improvement, it still performs substantially better under P9, for a 1.4x improvement. The P8 baseline is 1.05-1.10x due to its reduced VSX instruction set.
For float types, this results in a more modest
1.2x improvement.
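A rough illustration of the kernel shape using the fixed-width universal intrinsics for single-channel float data. This is not the VSX kernel from the patch: the xofs1 table (right-neighbor offsets) is introduced here only for the sketch, and the real code differs in types, channels, and unrolling.

```cpp
#include "opencv2/core/hal/intrin.hpp"
using namespace cv;

// Two rows processed per pass so the alpha gather is done once and reused; the source
// gathers (v_lut) remain the dominant cost, as noted above. Illustrative only.
static void hresizeLinearVecSketch(const float* S0, const float* S1,
                                   float* D0, float* D1,
                                   const int* xofs, const int* xofs1,  // xofs1[i] = xofs[i] + cn
                                   const float* alpha, int dwidth, int cn)
{
    const int vlen = v_float32x4::nlanes;
    int dx = 0;
    for (; dx + vlen <= dwidth; dx += vlen)
    {
        v_float32x4 a0, a1;
        v_load_deinterleave(alpha + dx * 2, a0, a1);   // one alpha gather shared by both rows
        v_float32x4 l0 = v_lut(S0, xofs + dx), r0 = v_lut(S0, xofs1 + dx);
        v_float32x4 l1 = v_lut(S1, xofs + dx), r1 = v_lut(S1, xofs1 + dx);
        v_store(D0 + dx, v_muladd(r0, a1, v_mul(l0, a0)));
        v_store(D1 + dx, v_muladd(r1, a1, v_mul(l1, a0)));
    }
    for (; dx < dwidth; dx++)                          // scalar tail
    {
        int sx = xofs[dx];
        float a0 = alpha[dx * 2], a1 = alpha[dx * 2 + 1];
        D0[dx] = S0[sx] * a0 + S0[sx + cn] * a1;
        D1[dx] = S1[sx] * a0 + S1[sx + cn] * a1;
    }
}
```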
* Update U8 processing for non-bitexact linear resize
* core: hal: vsx: improve v_load_expand_q
With a little help, we can do this quickly without GPRs on all VSX-enabled targets.
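For readers unfamiliar with the intrinsic, a scalar reference of what v_load_expand_q(const uchar*) produces; the VSX change keeps this entirely in vector registers instead of going through GPRs, and this snippet only documents the semantics, not the VSX instruction sequence.

```cpp
#include <cstdint>

// Reference behavior: 4 bytes loaded and zero-extended to four 32-bit lanes.
static void load_expand_q_ref(const uint8_t* ptr, uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = ptr[i];   // widen each byte to a 32-bit lane
}
```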
* resize: Fix cn == 3 step per feedback
Per feedback, ensure we don't overrun. This was caught via the
failure observed in Test_TensorFlow.inception_accuracy.
Resize reworked using wide universal intrinsics (#13781)
* Added wide universal intrinsics optimized implementation for 3 channel bit-exact linear resize
* Reworked linear resize using new wide LUT intrinsics
* Fix for VSX intrinsics
Exceptions caught by value incur needless cost in C++; most of them can be caught by const reference, especially as nearly none are actually used. This could allow the compiler to generate slightly more efficient code (a minimal illustration follows at the end of this list).
* Bit-exact resize reworked to use wide intrinsics
* Reworked bit-exact resize row data loading
* Added bit-exact resize row data loaders for SIMD256 and SIMD512
* Fixed type punned pointer dereferencing warning
* Reworked loading of source data for SIMD256 and SIMD512 bit-exact resize
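Regarding the note above about exceptions caught by value, a minimal illustration of the change; the function names here are made up.

```cpp
#include <stdexcept>
#include <cstdio>

static void runStep()
{
    throw std::runtime_error("resize step failed");
}

static void caller()
{
    try {
        runStep();
    }
    // Before: catch (std::exception e)   -- copies (and slices) the thrown object
    catch (const std::exception& e)       // after: bind to the thrown object directly
    {
        std::printf("%s\n", e.what());
    }
}
```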