Commit Graph

28653 Commits

Author SHA1 Message Date
Andrew Ryrie
ea7d4be3f8
Merge pull request #20658 from smbz:lstm_optimisation
* dnn: LSTM optimisation

This uses the AVX-optimised fastGEMM1T for matrix multiplications where available, instead of the standard cv::gemm.

fastGEMM1T is already used by the fully-connected layer.  This commit involves two minor modifications:
 - Use unaligned access.  I don't believe this involves any performance hit in on modern CPUs (Nehalem and Bulldozer onwards) in the case where the address is actually aligned.
 - Allow for weight matrices where the number of columns is not a multiple of 8.

I have not enabled AVX-512 as I don't have an AVX-512 CPU to test on.

* Fix warning about initialisation order

* Remove C++11 syntax

* Fix build when AVX(2) is not available

In this case the CV_TRY_X macros are defined to 0, rather than being undefined.

* Minor changes as requested:

 - Don't check hardware support for AVX(2) when dispatch is disabled for these
 - Add braces

* Fix out-of-bounds access in fully connected layer

The old tail handling in fastGEMM1T implicitly rounded vecsize up to the next multiple of 8, and the fully connected layer implements padding up to the next multiple of 8 to cope with this.  The new tail handling does not round the vecsize upwards like this but it does require that the vecsize is at least 8.  To adapt to the new tail handling, the fully connected layer now rounds vecsize itself at the same time as adding the padding(which makes more sense anyway).

This also means that the fully connected layer always passes a vecsize of at least 8 to fastGEMM1T, which fixes the out-of-bounds access problems.

* Improve tail mask handling

 - Use static array for generating tail masks (as requested)
 - Apply tail mask to the weights as well as the input vectors to prevent spurious propagation of NaNs/Infs

* Revert whitespace change

* Improve readability of conditions for using AVX

* dnn(lstm): minor coding style changes, replaced left aligned load
2021-11-29 21:43:00 +00:00
yuki takehara
a6277370ca
Merge pull request #21107 from take1014:remove_assert_21038
resolves #21038

* remove C assert

* revert C header

* fix several points in review

* fix test_ds.cpp
2021-11-27 18:34:52 +00:00
Alexander Alekhin
b55d8f46f4 Merge pull request #21133 from alalek:dnn_test_ie_update_3.4 2021-11-26 20:35:58 +00:00
Alexander Alekhin
985aa0423d dnn(test): update InferenceEngine tests 2021-11-26 18:46:26 +00:00
Alexander Alekhin
c14a8dce93 Merge pull request #21131 from cclauss:codespell 2021-11-26 18:32:54 +00:00
Alexander Alekhin
f5d45221ca Merge pull request #21130 from cclauss:print-function 2021-11-26 18:31:32 +00:00
Christian Clauss
ebe4ca6b60 Fix typos discovered by codespell 2021-11-26 12:29:56 +01:00
Alexander Alekhin
f159ed20c2 Merge pull request #21128 from cclauss:patch-3 2021-11-26 11:11:58 +00:00
Christian Clauss
cdbb042ce4 Use print() function in both Python 2 and Python 3 2021-11-26 11:57:54 +01:00
Christian Clauss
23bbe511fe
CMakeLists.txt: Fix typo discovered by codespell
https://pypi.org/project/codespell/
2021-11-26 11:07:14 +01:00
Alexander Alekhin
444218e755 Merge pull request #21123 from cclauss:patch-3 2021-11-25 18:06:33 +00:00
Christian Clauss
9cc60c9dd3 Use ==/!= to compare constant literals (str, bytes, int, float, tuple)
Avoid `SyntaxWarning` on Python >= 3.8
```
>>> "convolutional" == "convolutional"
True
>>> "convolutional" is "convolutional"
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
True
```
Related to #21121
2021-11-25 15:39:58 +01:00
Alexander Alekhin
2c226d597d Merge pull request #21110 from alalek:update_libjpeg-turbo 2021-11-24 20:08:27 +00:00
Alexander Alekhin
c6ab32ffb9 3rdparty: libjpeg-turbo 2.1.0 => 2.1.2
https://github.com/libjpeg-turbo/libjpeg-turbo/releases/tag/2.1.2
2021-11-24 04:00:47 +00:00
Alexander Alekhin
101be77d9d Merge pull request #21092 from alalek:core_logger_show_timestamp 2021-11-22 22:44:31 +00:00
Alexander Alekhin
61f1ee2d2d core(logger): dump timestamp information with message 2021-11-20 15:34:23 +00:00
Alexander Alekhin
93b6e80cd7 Merge pull request #21063 from vrabaud:3.4_h_clamping 2021-11-19 20:28:25 +00:00
Vincent Rabaud
d4741eece1 Fix H clamping for very small negative values.
In case of very small negative h (e.g. -1e-40), with the current implementation,
you will go through the first condition and end up with h = 6.f, and will miss
the second condition.
2021-11-19 16:34:09 +01:00
Alexander Alekhin
585484cb06 Merge pull request #21067 from NickJackolson:nickjackolson/imread-warning 2021-11-18 22:48:40 +00:00
nickjackolson
b696928a5b add !empty assertion in seamlessClone()
issue #20617 addresses lack of warnings on
seamlessClone() function when src is None.
This commit adds source check using CV_Assert
therefore debugging would be easier.

Signed-off-by: nickjackolson <metedurlu@gmail.com>
2021-11-18 21:19:05 +01:00
nickjackolson
79d4e865fe Add warning message to imread()
Add a warning message using CV_LOG__WARNING().
This way api behaviour is preserved. Outputs are
the same but user gets an extra warning in case
fopen() fails to access image file for some reason.
This would help new users and also debugging
complex apps which use imread()

Signed-off-by: nickjackolson <metedurlu@gmail.com>
2021-11-18 21:19:05 +01:00
Alexander Alekhin
8380879804 Merge pull request #21077 from alalek:js_test_pin_cli_table 2021-11-18 18:34:44 +00:00
Alexander Alekhin
de7f8eec04 js(test): pin cli-table dependency 2021-11-18 05:40:16 +00:00
Alexander Alekhin
49432009fe Merge pull request #21064 from alalek:doc_videoio_api_preference_3.4 2021-11-16 20:05:59 +00:00
Alexander Alekhin
473f10877c doc(videoio): fix apiPreference note, replace DSHOW(deprecated)->MSMF 2021-11-16 17:08:11 +00:00
Qiushi Zheng
3e51448ef0
Merge pull request #17889 from ZhengQiushi:my_3.4
QR code (encoding process)

* add qrcode encoder

* qr encoder fixes

* qr encoder: fix api and realization

* fixed qr encoder, added eci and kanji modes

* trigger CI

* qr encoder constructor fixes

Co-authored-by: APrigarina <ann73617@gmail.com>
2021-11-15 17:15:39 +00:00
Alexander Alekhin
8041ab8a61
Merge pull request #21025 from alalek:issue_21004
* dnn(ocl4dnn): fix LRN layer accuracy problems

- FP16 intermediate computation is not accurate and may provide NaN values

* dnn(test): update tolerance for FP16
2021-11-12 01:54:07 +03:00
tv3141
cb286a66be
Merge pull request #21030 from tv3141:fix_seg_fault_houghlinespointset
Fix seg fault houghlinespointset

* Clarify parameter doc for HoughLinesPointSet

* Fix seg fault.

* Add regression test.

* Fix latex typo
2021-11-10 22:15:38 +03:00
Alexander Alekhin
f33828a1ca Merge pull request #20870 from pkubaj:master 2021-11-10 16:08:20 +00:00
Piotr Kubaj
68e425f869 Add support for runtime CPU feature check on POWER on FreeBSD.
1. Code uses PPC_FEATURE_HAS_VSX, but it's not checked similarly to
PPC_FEATURE2_ARCH_3_00 and PPC_FEATURE2_ARCH_3_00 for availability. FreeBSD has
those macros in machine/cpu.h, but I went with the way chosen for
PPC_FEATURE2_ARCH_3_00 and PPC_FEATURE2_ARCH_3_00. Other than that, FreeBSD also
has sys/auxv.h and that's where elf_aux_info() is defined.
2. getauxval() is actually Linux-only, but code checked for __unix__. It won't
work on all UNIX, so change it back to __linux__. Add another code variant
strictly for FreeBSD.
3. Update comment. This commit adds code for FreeBSD, but recently there
appeared support for powerpc64 in OpenBSD.
2021-11-10 13:28:09 +00:00
ZaKiiiiiiiii
98b6ce353c
Merge pull request #20904 from Crayon-new:fix_bug_in_maxLayer
fix bug: wrong output dimension when "keep_dims" is false in pooling layer.

* fix bug in max layer

* code align

* delete permute layer and add test case

* add name assert

* check other cases

* remove c++11 features

* style:add "const" remove assert

* style:sanitize file names
2021-11-09 19:24:04 +03:00
Alexander Alekhin
1ac7baceff Merge pull request #21005 from nikpappas:bug-samples-falsecolor-trackbar 2021-11-06 14:19:58 +00:00
Nikolaos Pappas
968d94d417 Fix trackbar in falsecolor cpp sample 2021-11-06 10:11:58 +00:00
Alexander Alekhin
2ce47fda88 Merge pull request #21011 from vrabaud:3.4 2021-11-05 09:25:50 +00:00
Vincent Rabaud
ffd010767f Only use fma functions when CV_FMA3 is set.
In practice, processors offering AVX2/AVX512 also FMA, that is why it got unnoticed.
2021-11-04 23:15:49 +01:00
Alexander Alekhin
edf533c83e Merge pull request #21007 from alalek:cmake_dnn_fix_wrong_tengine_order 2021-11-04 12:28:27 +00:00
Alexander Alekhin
c1d61c88e9 dnn(cmake): don't hijack OpenCL options with Tengine 2021-11-04 09:59:19 +00:00
Alexander Alekhin
6360b846c6 Merge pull request #21003 from APrigarina:add_test_qrdetect_fix 2021-11-03 22:32:00 +00:00
Alexander Alekhin
57900d07f0 Merge pull request #20882 from flytogcp:flytogcp-patch-1 2021-11-03 20:30:24 +00:00
APrigarina
8e72e1ed88 add test case for QR detect fix 2021-11-03 19:56:54 +00:00
Alexander Alekhin
10c547396d Merge pull request #20990 from alalek:fix_warnings_msvc_clang_dshow_3.4 2021-11-03 19:51:31 +00:00
cpengu
66dd871288 Update qrcode.cpp
Fixed issue #20880, QRDetect::searchHorizontalLines() boundary condition will skip the matched qrcode near the end
2021-11-03 19:45:24 +00:00
Alexander Alekhin
d484939c02
Merge pull request #20999 from alalek:dnn_replace_deprecated_calls
dnn(protobuf): replace deprecated calls

* dnn: replace deprecated ByteSize() => ByteSizeLong()

* dnn: replace deprecated calls, use GetRepeatedFieldRef
2021-11-03 15:59:36 +00:00
Alexander Alekhin
85fd8729ce Merge pull request #20970 from s-trinh:update_Bayer_naming 2021-11-02 20:41:17 +00:00
Alexander Alekhin
b3e16c6423 videoio(dshow): eliminate build warnings from MSVC-Clang 2021-11-02 19:22:47 +00:00
Souriya Trinh
30d6766db4 Add conventional Bayer naming. 2021-11-02 20:17:57 +01:00
Alexander Alekhin
bce76a7977 Merge pull request #20980 from alalek:highgui_fix_cvGetWindowImageRect_3.4 2021-10-30 13:59:08 +00:00
Alexander Alekhin
0ee61d178f highgui: drop invalid cvGetWindowImageRect
- return type is C++ template
- removal from 'extern "C"' scope broke ABI anyway, so this symbols is removed completelly
2021-10-30 12:40:20 +00:00
Alexander Alekhin
0e9453a395 Merge pull request #20971 from alalek:cmake_build_type_use_release 2021-10-29 17:45:09 +00:00
Alexander Alekhin
e5647cf70d cmake: use CMAKE_BUILD_TYPE=Release by default 2021-10-29 15:15:17 +00:00