mirror/opencv: Open Source Computer Vision Library

mirror of https://github.com/opencv/opencv.git synced 2025-06-07 09:25:45 +08:00

Open Source Computer Vision Library

c-plus-plus computer-vision deep-learning image-processing opencv

Andrew Ryrie ea7d4be3f8 Merge pull request #20658 from smbz:lstm_optimisation		2021-11-29 21:43:00 +00:00

* dnn: LSTM optimisation
  This uses the AVX-optimised fastGEMM1T for matrix multiplications where available, instead of the standard cv::gemm. fastGEMM1T is already used by the fully-connected layer. This commit involves two minor modifications:
  - Use unaligned access. I don't believe this involves any performance hit on modern CPUs (Nehalem and Bulldozer onwards) in the case where the address is actually aligned.
  - Allow for weight matrices where the number of columns is not a multiple of 8.
  I have not enabled AVX-512 as I don't have an AVX-512 CPU to test on.
* Fix warning about initialisation order
* Remove C++11 syntax
* Fix build when AVX(2) is not available
  In this case the CV_TRY_X macros are defined to 0, rather than being undefined.
* Minor changes as requested:
  - Don't check hardware support for AVX(2) when dispatch is disabled for these
  - Add braces
* Fix out-of-bounds access in fully connected layer
  The old tail handling in fastGEMM1T implicitly rounded vecsize up to the next multiple of 8, and the fully connected layer implements padding up to the next multiple of 8 to cope with this. The new tail handling does not round the vecsize upwards like this, but it does require that the vecsize is at least 8. To adapt to the new tail handling, the fully connected layer now rounds vecsize itself at the same time as adding the padding (which makes more sense anyway). This also means that the fully connected layer always passes a vecsize of at least 8 to fastGEMM1T, which fixes the out-of-bounds access problems.
* Improve tail mask handling
  - Use static array for generating tail masks (as requested)
  - Apply tail mask to the weights as well as the input vectors to prevent spurious propagation of NaNs/Infs
* Revert whitespace change
* Improve readability of conditions for using AVX
* dnn(lstm): minor coding style changes, replaced left aligned load

.github	Updated more links to forum.opencv.org	2021-01-19 21:55:45 +03:00
3rdparty	3rdparty: libjpeg-turbo 2.1.0 => 2.1.2	2021-11-24 04:00:47 +00:00
apps	Fix typo 'Applicatioin'	2020-11-17 15:02:55 +00:00
cmake	cmake: fix popcnt detection with Intel Compiler	2021-10-28 05:37:23 +00:00
data	fix files permissions	2020-04-13 04:29:55 +00:00
doc	Fix typos discovered by codespell	2021-11-26 12:29:56 +01:00
include	add missing DNN header to opencv2/opencv.hpp	2018-02-15 15:59:14 +01:00
modules	Merge pull request #20658 from smbz:lstm_optimisation	2021-11-29 21:43:00 +00:00
platforms	Fix typos discovered by codespell	2021-11-26 12:29:56 +01:00
samples	Fix typos discovered by codespell	2021-11-26 12:29:56 +01:00
.editorconfig	add .editorconfig	2018-10-11 17:57:51 +00:00
.gitattributes	cmake: generate and install ffmpeg-download.ps1	2018-06-09 13:19:48 +03:00
.gitignore	Ignore Visual Studio cmake configuration file	2020-03-17 21:12:54 +03:00
CMakeLists.txt	CMakeLists.txt: Fix typo discovered by codespell	2021-11-26 11:07:14 +01:00
CONTRIBUTING.md	migration: github.com/opencv/opencv	2016-07-12 12:51:12 +03:00
LICENSE	copyright: 2021	2021-01-01 13:40:32 +00:00
README.md	update forum link	2021-01-11 18:39:46 +00:00
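
The tail handling described in the commit message for pull request #20658 can be illustrated with a small scalar sketch. This is not the code from the pull request: the names `roundUp8`, `kTailMask` and `maskedDot` are hypothetical, the real fastGEMM1T uses AVX intrinsics, and the exact mask layout there may differ. The sketch only shows the idea as the message states it: full blocks of 8 are processed first, the final partial block re-reads the last 8 elements (hence the requirement that vecsize is at least 8), and a mask taken from a static array zeroes the lanes that were already counted, on both the weights and the input.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round n up to the next multiple of 8, as the
// fully connected layer is described as doing for vecsize and padding.
inline std::size_t roundUp8(std::size_t n) { return (n + 7) / 8 * 8; }

// Hypothetical static mask table: 8 zeros followed by 8 ones.
// Reading 8 entries starting at offset `tail` gives (8 - tail) zeros
// followed by `tail` ones.
static const int kTailMask[16] = {0, 0, 0, 0, 0, 0, 0, 0,
                                  1, 1, 1, 1, 1, 1, 1, 1};

// Scalar stand-in for one row of a masked-tail dot product.
// Assumption: vecsize >= 8, mirroring the precondition in the commit message.
float maskedDot(const float* w, const float* x, std::size_t vecsize)
{
    assert(vecsize >= 8);
    float acc = 0.f;

    // Full blocks of 8 lanes.
    std::size_t i = 0;
    for (; i + 8 <= vecsize; i += 8)
        for (std::size_t j = 0; j < 8; ++j)
            acc += w[i + j] * x[i + j];

    // Tail: reload the last 8 elements and mask off the lanes that the
    // full blocks already covered, so nothing is read past the end.
    const std::size_t tail = vecsize % 8;
    if (tail != 0)
    {
        const int* mask = kTailMask + tail;
        const std::size_t base = vecsize - 8;
        for (std::size_t j = 0; j < 8; ++j)
        {
            // Mask both operands: masking only one side would still let
            // 0 * Inf or 0 * NaN produce a NaN in a masked-off lane.
            const float wj = mask[j] ? w[base + j] : 0.f;
            const float xj = mask[j] ? x[base + j] : 0.f;
            acc += wj * xj;
        }
    }
    return acc;
}
```

With 11 elements, for example, the first 8 go through the full-block loop and the tail pass re-reads elements 3 to 10 with only the last 3 lanes enabled. Selecting zero for both operands, rather than multiplying by a 0/1 mask, is what keeps a NaN or Inf in a masked-off lane from reaching the accumulator.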

README.md

OpenCV: Open Source Computer Vision Library

Resources

Homepage: http://opencv.org
Docs: http://docs.opencv.org/3.4/
Q&A forum: https://forum.opencv.org
- previous forum (read only): http://answers.opencv.org
Issue tracking: https://github.com/opencv/opencv/issues

Contributing

Please read the contribution guidelines before starting work on a pull request.

Summary of the guidelines:

One pull request per issue;
Choose the right base branch;
Include tests and documentation;
Clean up "oops" commits before submitting;
Follow the coding style guide.