Commit Graph

62 Commits

Author SHA1 Message Date
Andrew Ryrie
ea7d4be3f8
Merge pull request #20658 from smbz:lstm_optimisation
* dnn: LSTM optimisation

This uses the AVX-optimised fastGEMM1T for matrix multiplications where available, instead of the standard cv::gemm.

fastGEMM1T is already used by the fully-connected layer.  This commit involves two minor modifications:
 - Use unaligned access.  I don't believe this involves any performance hit on modern CPUs (Nehalem and Bulldozer onwards) in the case where the address is actually aligned.
 - Allow for weight matrices where the number of columns is not a multiple of 8.

I have not enabled AVX-512 as I don't have an AVX-512 CPU to test on.

* Fix warning about initialisation order

* Remove C++11 syntax

* Fix build when AVX(2) is not available

In this case the CV_TRY_X macros are defined to 0, rather than being undefined.

* Minor changes as requested:

 - Don't check hardware support for AVX(2) when dispatch is disabled for these
 - Add braces

* Fix out-of-bounds access in fully connected layer

The old tail handling in fastGEMM1T implicitly rounded vecsize up to the next multiple of 8, and the fully connected layer implements padding up to the next multiple of 8 to cope with this.  The new tail handling does not round the vecsize upwards like this, but it does require that the vecsize is at least 8.  To adapt to the new tail handling, the fully connected layer now rounds vecsize itself at the same time as adding the padding (which makes more sense anyway).

This also means that the fully connected layer always passes a vecsize of at least 8 to fastGEMM1T, which fixes the out-of-bounds access problems.
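
As an illustration of the rounding described above, here is a minimal sketch, not the code from this commit. The helper names `roundUpTo8` and `padToMultipleOf8` are hypothetical, and it is assumed that zero padding is acceptable for the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): round n up to a multiple of 8.
static std::size_t roundUpTo8(std::size_t n)
{
    return (n + 7) & ~static_cast<std::size_t>(7);
}

// Copy n values into a zero-padded buffer whose length is a multiple of 8.
// A kernel that requires vecsize >= 8 can then be given padded.size().
static std::vector<float> padToMultipleOf8(const float* src, std::size_t n)
{
    assert(n > 0); // an empty vector would round to 0, which is below 8
    std::vector<float> padded(roundUpTo8(n), 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        padded[i] = src[i];
    return padded;
}
```

With this scheme any non-empty input yields a length of at least 8, which matches the precondition the message attributes to the new tail handling.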

* Improve tail mask handling

 - Use static array for generating tail masks (as requested)
 - Apply tail mask to the weights as well as the input vectors to prevent spurious propagation of NaNs/Infs
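
The two points above can be sketched in scalar C++. This is an assumption-laden illustration rather than the AVX kernel from the commit: the names `kTailMask`, `maskFloat`, and `maskedDot` are hypothetical, and the choice to read the final partial block from the last 8 elements is a guess that is merely consistent with the "vecsize is at least 8" requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// 8 zero words followed by 8 all-ones words. Reading 8 words starting at
// index `tail` gives a mask whose last `tail` lanes are set.
static const std::uint32_t kTailMask[16] = {
    0, 0, 0, 0, 0, 0, 0, 0,
    0xffffffffu, 0xffffffffu, 0xffffffffu, 0xffffffffu,
    0xffffffffu, 0xffffffffu, 0xffffffffu, 0xffffffffu
};

// Bitwise AND of a float's bit pattern with a 32-bit mask.
static float maskFloat(float v, std::uint32_t m)
{
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    bits &= m;
    std::memcpy(&v, &bits, sizeof v);
    return v;
}

// Dot product over n >= 8 elements in blocks of 8. The final partial block
// is read from the last 8 elements, with already-consumed lanes masked off.
static float maskedDot(const float* w, const float* x, std::size_t n)
{
    assert(n >= 8);
    float sum = 0.0f;
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)
        for (std::size_t k = 0; k < 8; ++k)
            sum += w[i + k] * x[i + k];

    const std::size_t tail = n - i; // 0..7 elements left over
    if (tail > 0)
    {
        const std::uint32_t* mask = kTailMask + tail;
        const float* wt = w + n - 8;
        const float* xt = x + n - 8;
        // Mask both operands: masking only one side would leave 0 * Inf = NaN.
        for (std::size_t k = 0; k < 8; ++k)
            sum += maskFloat(wt[k], mask[k]) * maskFloat(xt[k], mask[k]);
    }
    return sum;
}
```

Masking by bitwise AND rather than multiplying by zero is what keeps a non-finite value in a masked-off lane from reaching the sum.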

* Revert whitespace change

* Improve readability of conditions for using AVX

* dnn(lstm): minor coding style changes, replaced left aligned load
2021-11-29 21:43:00 +00:00
Alexander Alekhin
1aacb9bb15 dnn(perf): update convolution tests 2021-09-10 13:11:02 +00:00
Alexander Alekhin
28aab134db dnn(test): update tests for OpenVINO 2021.2 2020-12-17 07:53:35 +00:00
Sergei Slashchinin
61144f935e
Merge pull request #18783 from sl-sergei:fix_conv1d
Add support for Conv1D on OpenCV backend

* Add support for Conv1D on OpenCV backend

* disable tests on other targets/backends

* Fix formatting

* Restore comment

* Remove unnecessary flag and fix test logic

* Fix perf test

* fix braces

* Fix indentation, assert check and remove unnecessary condition

* Remove unnecessary changes

* Add test cases for variable weights and bias

* dnn(conv): fallback on OpenCV+CPU instead of failures

* coding style
2020-11-13 22:22:10 +00:00
Alexander Alekhin
6da05f7086 dnn(test): update tests for OpenVINO 2021.1 2020-10-08 10:22:31 +00:00
Alexander Alekhin
b2ebd37ee2 Merge pull request #17856 from alalek:dnn_openvino_2020.4.0 2020-07-16 20:08:00 +00:00
Alexander Alekhin
81e027eef7 dnn: fix OpenCL implementation of Slice layer 2020-07-16 04:33:52 +00:00
Alexander Alekhin
1c371d07b5 dnn(test): adjust tests for OpenVINO 2020.4 2020-07-15 23:47:40 +00:00
Alexander Alekhin
99c4b76a6d dnn(test): add YOLOv4-tiny tests 2020-07-06 21:36:19 +00:00
Alexander Alekhin
e58e545584 Merge pull request #17392 from alalek:dnn_test_yolov4 2020-05-28 22:52:21 +00:00
Dmitry Kurtaev
d9bada9867 dnn: EfficientDet 2020-05-28 17:23:42 +03:00
Alexander Alekhin
6b89154afd dnn(test): add YOLOv4 tests 2020-05-28 13:27:40 +00:00
Dmitry Kurtaev
d8e10f3a8d Enable MaxPooling with indices in Inference Engine 2019-12-04 19:14:55 +03:00
Lubov Batanina
7523c777c5 Merge pull request #15537 from l-bat:ngraph
* Support nGraph

* Fix resize
2019-12-02 16:16:06 +03:00
Dmitry Kurtaev
6193e403e7 Enable some tests for 2019R2 2019-08-07 09:07:53 +03:00
Dmitry Kurtaev
a0c3bb70a9 Modify SSD from TensorFlow graph generation script to enable MyriadX 2019-07-26 13:57:08 +03:00
Alexander Alekhin
416c693b3f dnn(test): OpenVINO 2019R2 2019-07-25 19:01:16 +03:00
Lubov Batanina
8bcd7e122a Merge pull request #14842 from l-bat:ocv_conv3d
* Support Conv3D on OCV backend

* Add header

* Add perf tests

* Support pool3d

* Enable Resnet34_kinetics on OCV backend

* Add test

* Fix conv

* Optimize Conv2D
2019-07-11 20:13:52 +03:00
Alexander Alekhin
13a782c039 test: fix usage of findDataFile()
misused 'optional' mode
2019-06-20 18:20:14 +03:00
Dmitry Kurtaev
9c0af1f675 Enable more deconvolution layer configurations with IE backend 2019-06-03 08:15:52 +03:00
Dmitry Kurtaev
44d21e5a79 Enable Slice layer on Inference Engine backend 2019-05-27 16:28:01 +03:00
Alexander Alekhin
cafa010389 dnn(test): skip tests 2019-04-03 17:49:05 +03:00
Alexander Alekhin
fcb07c64f3 cmake: fix build of dnn tests with shared common code
- don't share .cpp files (PCH support is broken)
2019-03-31 08:52:25 +00:00
Lubov Batanina
7d3d6bc4e2 Merge pull request #13932 from l-bat:MyriadX_master_dldt
* Fix precision in tests for MyriadX

* Fix ONNX tests

* Add output range in ONNX tests

* Skip tests on Myriad OpenVINO 2018R5

* Add MyriadX detection

* Add MyriadX detection on OpenVINO R5

* Skip tests on Myriad next version of OpenVINO

* dnn(ie): VPU type from environment variable

* dnn(test): validate VPU type

* dnn(test): update DLIE test skip conditions
2019-03-29 16:42:58 +03:00
Dmitry Kurtaev
ed710eaa1c Make Inference Engine R3 the minimum supported version 2019-02-21 09:32:26 +03:00
Liubov Batanina
183c0fcab1 Changed condition for resize and lrn layers 2019-02-14 13:11:14 +03:00
Dmitry Kurtaev
f0ddf302b2 Move Inference Engine to new API 2019-01-17 14:28:48 +03:00
Maksim Shabunin
fe459c82e5 Merge pull request #13332 from mshabunin:dnn-backends
DNN backends registry (#13332)

* Added dnn backends registry

* dnn: process DLIE/FPGA target
2018-12-05 18:11:45 +03:00
Dmitry Kurtaev
0d117312c9 DNN_TARGET_FPGA using Intel's Inference Engine 2018-11-19 11:41:43 +03:00
Alexander Alekhin
96c71dd3d2 dnn: reduce set of ignored warnings 2018-11-15 13:15:59 +03:00
tompollok
0b77600718 change area() emptiness checks to empty() 2018-10-13 21:35:10 +02:00
Alexander Alekhin
c557193b8c dnn(test): use dnnBackendsAndTargets() param generator 2018-08-31 15:11:58 +03:00
Alexander Alekhin
3e6b3a6856 dnn(perf): fix and merge Convolution tests
- OpenCL tests didn't run any OpenCL kernels
- use real configurations from existing models (the first 100 cases)
- batch size = 1
2018-08-31 15:02:19 +03:00
Dmitry Kurtaev
8e034053af Faster-RCNN from TensorFlow on CPU with Intel's Inference Engine backend 2018-08-01 11:29:58 +03:00
Dmitry Kurtaev
2c291bc2fb Enable FastNeuralStyle and OpenFace networks with IE backend 2018-06-09 15:57:12 +03:00
Dmitry Kurtaev
40765c5f8d Enable SSD models from TensorFlow with OpenCL plugin of Intel's Inference Engine 2018-06-08 16:55:21 +03:00
David
7175f257b5 Added ResizeBilinear op for tf (#11050)
* Added ResizeBilinear op for tf

Combined ResizeNearestNeighbor and ResizeBilinear layers into Resize (with an interpolation param).

Minor changes to tf_importer and resize layer to save some code lines

Minor changes in init.cpp

Minor changes in tf_importer.cpp

* Replaced implementation of a custom ResizeBilinear layer to all layers

* Use Mat::ptr. Replace interpolation flags
2018-06-07 16:29:04 +03:00
Dmitry Kurtaev
f3a6ae5f00 Wrap Inference Engine init to try-catch 2018-06-07 12:55:52 +03:00
Vadim Pisarevsky
3cbd2e2764 Merge pull request #11650 from dkurt:dnn_default_backend 2018-06-06 09:30:39 +00:00
Alexander Alekhin
6816495bee dnn(test): reuse test/test_common.hpp, eliminate dead code warning 2018-06-05 12:52:53 +03:00
Dmitry Kurtaev
b781ac7346 Make Intel's Inference Engine backend the default if no preferable backend is specified. 2018-06-04 18:31:46 +03:00
Dmitry Kurtaev
f96f934426 Update Intel's Inference Engine deep learning backend (#11587)
* Update Intel's Inference Engine deep learning backend

* Remove cpu_extension dependency

* Update Darknet accuracy tests
2018-05-31 14:05:21 +03:00
Li Peng
1b517a45ae add fp16 accuracy and perf test
Signed-off-by: Li Peng <peng.li@intel.com>
2018-05-16 22:45:07 +08:00
Dmitry Kurtaev
bd77d100e1 Enable some tests for clDNN plugin from Intel's Inference Engine 2018-04-20 10:47:46 +03:00
Dmitry Kurtaev
97fec07d96 Support YOLOv3 model from Darknet 2018-04-16 18:44:12 +03:00
Dmitry Kurtaev
709cf5d038 OpenCL GPU target for Inference Engine deep learning backend
Enable FP16 GPU target for DL Inference Engine backend.
2018-04-09 17:21:35 +03:00
Dmitry Kurtaev
7972f47ed4 Load networks from intermediate representation of Intel's Deep learning deployment toolkit. 2018-03-26 07:24:21 +03:00
Dmitry Kurtaev
7fe97376c2 MobileNet-SSD from TensorFlow 1.3 and Inception-V2-SSD using Inference Engine backend 2018-02-09 13:45:45 +03:00
Dmitry Kurtaev
ed94136548 OpenCV face detection network using Inference Engine backend 2018-02-06 17:53:24 +03:00
Alexander Alekhin
2a1f46c42d Merge pull request #9770 from alalek:refactor_test_files 2018-02-06 09:33:58 +00:00