Alexander Alekhin
8041ab8a61
Merge pull request #21025 from alalek:issue_21004
...
* dnn(ocl4dnn): fix LRN layer accuracy problems
- FP16 intermediate computation is not accurate and may provide NaN values
* dnn(test): update tolerance for FP16
2021-11-12 01:54:07 +03:00
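Note on the FP16 accuracy problem above: a minimal sketch, not the ocl4dnn kernel, of why accumulating LRN's sum of squares in half precision can go wrong. The helper names (`to_half`, `sum_squares`) are hypothetical, and half-precision rounding is emulated with Python's `struct` format `e`. Any square above the FP16 maximum (65504) overflows to infinity, and later arithmetic on infinities can turn into NaN.

```python
import math
import struct

def to_half(x):
    """Round x to IEEE half precision; overflow becomes +/-inf (emulation)."""
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)

def sum_squares(values, half_intermediate):
    """Sum of squares, optionally rounding every intermediate to FP16."""
    total = 0.0
    for v in values:
        sq = v * v
        if half_intermediate:
            sq = to_half(sq)
            total = to_half(total + sq)
        else:
            total += sq
    return total

activations = [300.0, 300.0]          # each value is representable in FP16
wide = sum_squares(activations, half_intermediate=False)
narrow = sum_squares(activations, half_intermediate=True)
```

Here the inputs fit in FP16 but their squares (90000) do not, so only the half-precision accumulation overflows.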
Alexander Alekhin
d934bb15b0
Merge pull request #20998 from alalek:update_protobuf_3.19.1
...
3rdparty(protobuf): upgrade 3.5.2 => 3.19.1
* 3rdparty(protobuf): upgrade 3.5.2 => 3.19.1
* dnn: update protobuf files (3.19.1)
* 3rdparty(protobuf): re-apply OpenCV patch for custom fields (3.19.1)
* protobuf: suppress new build warnings
* protobuf: remove unused files
2021-11-10 12:03:45 +00:00
ZaKiiiiiiiii
98b6ce353c
Merge pull request #20904 from Crayon-new:fix_bug_in_maxLayer
...
fix bug: wrong output dimension when "keep_dims" is false in pooling layer.
* fix bug in max layer
* code align
* delete permute layer and add test case
* add name assert
* check other cases
* remove c++11 features
* style:add "const" remove assert
* style:sanitize file names
2021-11-09 19:24:04 +03:00
Alexander Alekhin
7842181b47
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-11-05 09:27:46 +00:00
Alexander Alekhin
562f2375c5
dnn(test): skip tests with high memory usage
...
- 32-bit configuration may fail due to memory fragmentation
2021-11-04 13:26:33 +00:00
Alexander Alekhin
edf533c83e
Merge pull request #21007 from alalek:cmake_dnn_fix_wrong_tengine_order
2021-11-04 12:28:27 +00:00
Alexander Alekhin
c1d61c88e9
dnn(cmake): don't hijack OpenCL options with Tengine
2021-11-04 09:59:19 +00:00
Alexander Alekhin
d484939c02
Merge pull request #20999 from alalek:dnn_replace_deprecated_calls
...
dnn(protobuf): replace deprecated calls
* dnn: replace deprecated ByteSize() => ByteSizeLong()
* dnn: replace deprecated calls, use GetRepeatedFieldRef
2021-11-03 15:59:36 +00:00
Alexander Alekhin
ec10f2e72b
Merge pull request #20877 from rogday:simple_layers
2021-10-20 17:00:38 +00:00
rogday
b3f966e2ca
Merge pull request #20883 from rogday:eltwise_refactoring
...
* backport elementwise_layers refactor
* keep NULL
2021-10-19 13:29:22 +00:00
Alexander Alekhin
1926e919be
dnn(int8): fix use of incorrect UMat constructor
2021-10-18 04:46:00 +00:00
Alexander Alekhin
31c40fa4cc
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-10-15 13:35:03 +00:00
Smirnov Egor
1feb3838b5
add Ceil, Floor, Log, Round, Sqrt, Not, Equal, Less, Greater
2021-10-15 16:02:46 +03:00
Alexander Alekhin
53d6c9b9c0
Merge pull request #20860 from rogday:sum_fix
2021-10-12 15:36:32 +00:00
Smirnov Egor
238dbffb48
change asserts for Sum
2021-10-11 20:59:44 +03:00
Smirnov Egor
a9d7b6eab7
fix const - input and remove unimplemented function
2021-10-11 18:58:10 +03:00
Alexander Alekhin
4672dbda2a
Merge pull request #20818 from rogday:yolov4x_mish_cuda
2021-10-08 19:12:43 +00:00
Smirnov Egor
9c84749e2c
backport YOLOv4x-mish new_coords CUDA implementation
2021-10-08 14:14:49 +03:00
Alexander Alekhin
cca4c47781
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-10-08 11:05:45 +00:00
Alexander Alekhin
81e7988eb9
Merge pull request #20840 from alalek:dnn_ocl_cleanup_code
2021-10-08 05:07:51 +00:00
Alexander Alekhin
8c2dd5fb9a
dnn(ocl4dnn): cleanup dead code, improve logging
2021-10-08 00:39:40 +00:00
Alexander Alekhin
724e04e979
dnn(ocl4dnn): add extra checks to convolution layer
...
- prevent running code over unsupported/non-tested configurations
- prevent integer div by zero
2021-10-07 23:18:32 +00:00
Alexander Alekhin
03a08435e2
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-10-07 04:27:22 +00:00
Alexander Alekhin
1a29ea1038
Merge pull request #20829 from alalek:dnn_ocl_skip_int8_tests
2021-10-06 22:46:21 +00:00
Alexander Alekhin
94e92cd6c0
dnn(ocl): skip int8 tests due to memory access issues
2021-10-06 21:27:18 +00:00
Alexander Alekhin
822d468232
Merge pull request #20813 from rogday:soft_nms
2021-10-06 20:20:34 +00:00
Smirnov Egor
2221dcc9f2
add SoftNMS implementation
2021-10-06 21:31:45 +03:00
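Note on the SoftNMS commit above: a minimal sketch of Gaussian Soft-NMS in plain Python, assuming boxes are given as (x1, y1, x2, y2). It illustrates the general technique only and is not the code added by this commit; the function names and default parameters are hypothetical.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: returns (kept indices, their updated scores)."""
    scores = list(scores)                 # do not modify the caller's list
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        if scores[best] < score_threshold:
            break                         # everything left is below threshold
        kept.append(best)
        remaining.remove(best)
        # Decay, rather than discard, the boxes overlapping the selected one.
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap * overlap) / sigma)
    return kept, [scores[i] for i in kept]

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
kept, new_scores = soft_nms(boxes, [0.9, 0.8, 0.7])
```

Unlike hard NMS, the heavily overlapping second box is kept with a reduced score instead of being removed outright.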
Oliver Kuckertz
a3d7811f24
Merge pull request #20725 from mologie:fix-dnn-tf-on-arm
...
* dnn: fix unaligned memory access crash on armv7
The getTensorContent function would return a Mat pointing to some
member of a Protobuf-encoded message. Protobuf does not make any
alignment guarantees, which results in a crash on armv7 when loading
models while bit 2 is set in /proc/cpu/alignment (or the relevant
kernel feature for alignment compatibility is disabled). Any read
attempt from the previously unaligned data member would send SIGBUS.
As a workaround, this commit makes an aligned copy via existing clone
functionality in getTensorContent. The unsafe copy=false option is
removed. Unfortunately, a rather crude hack in PReLUSubgraph in fact
writes(!) to the Protobuf message. We limit ourselves to fixing the
alignment issues in this commit, and add getTensorContentRefUnaligned
to cover the write case with a safe memcpy. A FIXME marks the issue.
* dnn: reduce amount of .clone() calls
* dnn: update FIXME comment
Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2021-10-06 16:41:05 +00:00
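Note on the alignment fix above: a minimal sketch of the "copy into aligned storage before reading" idea, written in Python rather than the C++ of the actual change. The payload layout and the helper name `read_floats_aligned` are hypothetical; the point is only that the floats are never interpreted in place at their unaligned offset.

```python
import array
import struct

def read_floats_aligned(message, offset, count):
    """Copy `count` float32 values out of `message`, starting at `offset`."""
    out = array.array("f")                # fresh, properly aligned storage
    nbytes = count * out.itemsize
    out.frombytes(bytes(message[offset:offset + nbytes]))
    return out

# Hypothetical encoded message: one header byte pushes the floats to an
# odd (unaligned) offset inside the buffer.
payload = b"\x01" + struct.pack("=3f", 1.0, 2.0, 3.0)
values = read_floats_aligned(payload, offset=1, count=3)
```

The copy costs extra memory traffic, which matches the follow-up item in the commit about reducing the number of `.clone()` calls.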
Alexander Alekhin
646924fce8
dnn(pytest/test_input_3d): reload model between switching targets
2021-10-05 23:23:08 +00:00
HAN Liutong
e5fb50476c
Merge pull request #20521 from hanliutong:dev-rvv-multiVLEN
...
Make the implementation of optimization in DNN adjustable to different vector sizes with RVV intrinsics.
* Update fastGEMM for multi VLEN.
* Update fastGEMM1T for multi VLEN.
* Update fastDepthwiseConv for multi VLEN.
* Update fastConv for multi VLEN.
* Replace malloc with cv::AutoBuffer.
2021-10-05 15:35:00 +00:00
Alexander Alekhin
3e6f27522b
pre: OpenCV 4.5.4 (version++)
2021-10-04 22:35:47 +00:00
Alexander Alekhin
1b70f94282
Merge pull request #20782 from YashasSamaga:cuda4dnn-eltwise-broadcast
2021-10-04 22:35:00 +00:00
Alexander Alekhin
ebef84e9ea
pre: OpenCV 3.4.16 (version++)
2021-10-04 20:47:07 +00:00
Jebastin Nadar
cce78cc5e2
Merge pull request #20535 from SamFC10:onnx-q
...
dnn : int8 quantized layers support in onnx importer
* added quantized layers support in onnx importer
* added more cases in eltwise node, some more checks
* added tests for quantized nodes
* relax thresholds for failed tests, address review comments
* refactoring based on review comments
* added support for unsupported cases and pre-quantized resnet50 test
* relax thresholds due to int8 resize layer
2021-10-04 18:07:38 +00:00
Zihao Mu
9085b933d8
Merge pull request #20702 from zihaomu:tf_expand_dim_layer
...
Add ExpandDims layer of tf_importer.cpp
* Add ExpandDims to tf_importer.
* add -1 expand test case.
* Support different dimensions of input.
* Compatible with 5-dimensional NDHWC data
* Code align
* support 3-dim input.
* 3-dim bug fixed.
* fixing error of code format.
2021-10-04 16:37:38 +00:00
YashasSamaga
505dde09de
support broadcasting in eltwise ops
2021-10-04 12:38:45 +05:30
SamFC10
87ebf2e50b
fix illegal memory access in int8 convolution
2021-10-03 15:16:01 +05:30
Alexander Alekhin
37c3f0d8a0
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-10-02 17:57:18 +00:00
Alexander Alekhin
f977d10a19
dnn(ocl): fix conv DWCONV workgroup
2021-10-01 18:52:07 +00:00
Alexander Alekhin
846317ef37
dnn(ocl): fix conv BASIC workgroup
2021-09-29 14:55:46 +00:00
Alexander Alekhin
24fcb7f813
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-09-25 17:50:00 +00:00
rogday
38b9ec7a18
Merge pull request #20682 from rogday:min
...
* Add Min layer to CPU, OpenCL, Halide, Inference Engine, NGraph and CUDA
* fix indentation
* add min to fusion and halide tests; fix doc
2021-09-22 15:17:37 +03:00
SamFC10
9c5d7716e2
fix for unsqueeze opset version 13
2021-09-17 17:40:57 +05:30
Alexander Alekhin
46fd26e366
Merge pull request #20699 from alalek:dnn_perf_update_convolution_tests
2021-09-16 17:11:32 +00:00
rogday
c410d7a97d
Merge pull request #20671 from rogday:yolov4x-mish
...
Add support for YOLOv4x-mish
* backport to 3.4 for supporting yolov4x-mish
* add YOLOv4x-mish test
* address review comments
Co-authored-by: Guo Xu <guoxu@1school.com.cn>
2021-09-14 17:49:49 +00:00
YashasSamaga
50462dcdc6
fix effrank assert to allow input effrank <= output effrank
2021-09-13 20:44:33 +05:30
Alexander Alekhin
6e66a9222a
dnn(onnx): fix format specifier
2021-09-11 22:26:52 +00:00
Alexander Alekhin
c3ac834526
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-09-11 21:27:26 +00:00
Zihao Mu
51b03b87e6
BiasAdd could load Const from second place.
2021-09-11 15:34:41 +00:00
Alexander Alekhin
1aacb9bb15
dnn(perf): update convolution tests
2021-09-10 13:11:02 +00:00
Alexander Alekhin
6ace801418
Merge pull request #20661 from alalek:dnn_ocl_fix_gemm_like_kernel
2021-09-10 11:58:52 +00:00
rogday
d31b93b513
Merge pull request #20674 from rogday:prelu_slope
...
Fix PReLU negative slope access pattern
* fix prelu negative slope access pattern
* change begin() to ptr()
2021-09-10 11:07:16 +00:00
rogday
4807cd8a6e
Merge pull request #20605 from rogday:split_slice_shenanigans
...
Add Normalize subgraph, fix Slice, Mul and Expand
* Add Normalize subgraph, support for starts<0 and axis<0 in Slice, Mul broadcasting in the middle and fix Expand's unsqueeze
* remove todos
* remove range-based for loop
* address review comments
* change >> to > > in template
* fix indexation
* fix expand that does nothing
2021-09-09 14:41:40 +03:00
Alexander Alekhin
35e824c287
dnn(ocl): fix out of bound access in GEMM-like kernels
...
- dropped usage of CreateSubBuffer() - buffers lifetime management issue
- fixed elementwise offset
- avoid out of bounds read access
2021-09-06 18:17:21 +00:00
Alexander Alekhin
5578ad5e14
dnn(ocl): fix automatic globalsize adjusting
...
- if kernel code doesn't support that
2021-09-06 03:11:29 +00:00
Alexander Alekhin
0a43b23275
Merge pull request #20651 from alalek:issue_18361
2021-09-04 18:22:12 +00:00
Alexander Alekhin
7967683296
Merge pull request #20648 from alalek:issue_20615
2021-09-04 18:21:58 +00:00
Alexander Alekhin
5b2c016834
dnn(ocl): avoid out of buffer access in copyWeightsSwizzled
2021-09-04 15:45:59 +00:00
Alexander Alekhin
407adc7061
dnn(ocl): fix buffer offsets in IDLF kernel
...
- drop CreateSubBuffer
- fix FUSED_CONV_ELTWISE mode
2021-09-04 15:28:35 +00:00
rogday
d0e612dc36
Merge pull request #20647 from rogday:resize_concat_optimization
...
Fix resize+concat optimization
* fix resize+concat optimization
* add comment and fix indentation
2021-09-03 12:32:29 +00:00
Alexander Alekhin
5aa7435d25
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-09-02 15:24:04 +00:00
Alexander Alekhin
060a76dc3e
Merge pull request #20573 from rogday:onnx_scale_fix
2021-09-01 14:09:17 +00:00
WJJ1995
edc442afdb
Merge pull request #20511 from wjj19950828:add_humanseg_support_0806
...
* support PPSeg model for dnn module
* fixed README for CI
* add test case
* fixed bug
* deal with comments
* rm dnn_model_runner
* update test case
* fixed bug for testcase
* update testcase
2021-09-01 10:10:05 +00:00
Alexander Alekhin
ae6fabc6fe
dnn(ocl): drop CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE check
...
- it is a hint and it should not block kernel execution
2021-08-30 20:40:14 +00:00
Alexander Alekhin
4c05a697fa
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-08-28 21:30:28 +00:00
Vincent Rabaud
38d0063c36
Do not use deprecated ReleaseCleared in protobuf library.
...
This is to make code work with protobuf arenas for memory
management (ReleaseCleared is incompatible).
The cleaning of the memory is also simpler.
2021-08-26 15:36:22 +02:00
Alexander Alekhin
6fbfc58602
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-08-21 17:25:18 +00:00
Alexander Alekhin
77a5c43d50
Merge pull request #20586 from alalek:issue_20585
2021-08-21 17:22:58 +00:00
Alexander Alekhin
f28e4b86fb
dnn(ocl): fix top initialization in verifyResult
2021-08-21 16:04:13 +00:00
rogday
6801dd043d
Merge pull request #20494 from rogday:onnx_diagnostic_fix
...
fix ONNXImporter diagnostic mode layer registration issue
* fix layer registration, thread-unsafe access and align the behavior of DNN_DIAGNOSTICS_RUN between onnx and tf importers
* move skipModelInput
* print all missing layers
* address TF issue
2021-08-20 14:43:47 +00:00
Alexander Alekhin
a9817e9127
Merge pull request #20556 from rogday:onnx_split_sum_fix
2021-08-20 08:10:18 +00:00
Vincent Rabaud
9cfa84313c
Use the one argument version of SetTotalBytesLimit.
...
The two-argument version has been deprecated, cf.
https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream
2021-08-19 14:31:29 +02:00
SamFC10
fa90e14b06
int8 layers and 8-bit quantization support
2021-08-19 09:56:47 +05:30
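Note on 8-bit quantization above: a minimal sketch of the common affine (scale and zero-point) int8 scheme, assuming a signed range of [-128, 127]. The commit does not spell out its exact rounding or range rules, so the function names and details here are illustrative assumptions.

```python
def quantize(x, scale, zero_point):
    """Map a real value to int8: round(x / scale) + zero_point, then clamp."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """Approximate inverse of quantize()."""
    return (q - zero_point) * scale

q = quantize(1.0, scale=0.25, zero_point=0)          # in range
saturated = quantize(100.0, scale=0.25, zero_point=0)  # clamped to the int8 max
restored = dequantize(q, scale=0.25, zero_point=0)
```

Values outside the representable range saturate, which is one reason quantized tests typically need looser thresholds than their FP32 counterparts.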
Smirnov Egor
fe625a558e
fix hasDynamicShapes for batch_size and fix axis selection in Scale layer
2021-08-18 19:22:24 +03:00
thezane
210bfaf8d6
Merge pull request #20483 from thezane:support-cumsum-layer-for-onnx
...
* Support cumsum layer for onnx
* Add unit tests
* Address review comments
2021-08-17 20:09:25 +03:00
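Note on the cumsum layer above: a minimal 1-D sketch of cumulative sum with the `exclusive` and `reverse` flags defined by the ONNX CumSum operator. Whether the layer in this commit handles both flags is not stated in the log, so treat this as an illustration of the operator's semantics only.

```python
def cumsum(values, exclusive=False, reverse=False):
    """Cumulative sum of a 1-D sequence with ONNX-style flags."""
    seq = list(reversed(values)) if reverse else list(values)
    out, running = [], 0
    for v in seq:
        if exclusive:
            out.append(running)   # emit the sum of the elements before v
            running += v
        else:
            running += v
            out.append(running)   # emit the sum up to and including v
    return list(reversed(out)) if reverse else out
```

For the input `[1, 2, 3]`, the four flag combinations give `[1, 3, 6]`, `[0, 1, 3]` (exclusive), `[6, 5, 3]` (reverse) and `[5, 3, 0]` (both).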
Smirnov Egor
9ef41f68fb
fix Split partial sum
2021-08-16 15:44:54 +03:00
Alexander Alekhin
05d733e707
Merge pull request #20524 from yichenj:dnn_text_recognition_enhance
2021-08-15 12:30:25 +00:00
Alexander Alekhin
0c01cf7c85
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-08-14 18:24:00 +00:00
Julia Bareeva
cfb36443fb
Merge pull request #20506 from JulieBar:lstm_activations
...
* Support activations(Sigmoid, Tanh) for LSTM
* fix warning
2021-08-13 15:41:00 +03:00
Alexander Alekhin
9d3826c676
Merge pull request #20525 from SamFC10:fix-prior-variances
2021-08-13 10:06:55 +00:00
JIANG Yichen
955cf35d5f
Implement ctc prefix beam search decode for TextRecognitionModel.
...
The algorithm is based on Hannun's paper: First-Pass Large Vocabulary
Continuous Speech Recognition using Bi-Directional Recurrent DNNs
2021-08-12 20:33:31 +08:00
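Note on the CTC decoder above: a minimal sketch of CTC prefix beam search without a language model, assuming per-frame probabilities (not log-probabilities) and blank at index 0. It is not the TextRecognitionModel code; the function name and parameters are hypothetical.

```python
from collections import defaultdict

def ctc_prefix_beam_search(probs, vocab, blank=0, beam_width=4):
    """Decode a T x V probability matrix into the most probable label string."""
    # Each prefix tracks (prob. of ending in blank, prob. of ending in non-blank).
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        nxt = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for c, p in enumerate(frame):
                if c == blank:
                    nxt[prefix][0] += (p_b + p_nb) * p
                elif prefix and prefix[-1] == c:
                    # A repeated symbol extends the prefix only after a blank...
                    nxt[prefix + (c,)][1] += p_b * p
                    # ...otherwise it collapses into the same prefix.
                    nxt[prefix][1] += p_nb * p
                else:
                    nxt[prefix + (c,)][1] += (p_b + p_nb) * p
        ranked = sorted(nxt.items(), key=lambda kv: kv[1][0] + kv[1][1],
                        reverse=True)
        beams = {k: (v[0], v[1]) for k, v in ranked[:beam_width]}
    best = max(beams, key=lambda k: beams[k][0] + beams[k][1])
    return "".join(vocab[c] for c in best)

# Two frames where blank is the single most likely symbol in each frame.
frames = [[0.6, 0.4], [0.6, 0.4]]
decoded = ctc_prefix_beam_search(frames, vocab=["-", "a"])
```

In this example greedy decoding would pick blank in both frames and return an empty string (path probability 0.36), while summing over all alignments of "a" gives 0.64, so the prefix search returns "a".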
HAN Liutong
aaca4987c9
Merge pull request #20287 from hanliutong:dev-rvv-0.10
...
Optimization of DNN using native RISC-V vector intrinsics.
* Use RVV to optimize fastGEMM (FP32) in DNN.
* Use RVV to optimize fastGEMM1T in DNN.
* Use RVV to optimize fastConv in DNN.
* Use RVV to optimize fastDepthwiseConv in DNN.
* Vectorize tails using vl.
* Use "vl" instead of scalar to handle small block in fastConv.
* Fix memory access out of bound in "fastGEMM1T".
* Remove setvl.
* Remove useless initialization.
* Use loop unrolling to handle tail part instead of switch.
2021-08-11 01:16:03 +03:00
Smirnov Egor
739ff84732
add Max layer to TFImporter
2021-08-09 14:01:51 +03:00
SamFC10
2a177052de
fix bug in prior-box variances
2021-08-09 12:08:55 +05:30
Alexander Alekhin
424eaba4c5
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-08-07 17:25:06 +00:00
Julia Bareeva
e1cafa3834
Merge pull request #20442 from JulieBar:gru_layer
...
* Add initialization and inference for GRU layer
* fix issues found on review
2021-08-07 10:07:37 +03:00
Julia Bareeva
633fedaa96
Merge pull request #20480 from JulieBar:lstm_pytest
...
Add Python's test for LSTM layer
* Add Python's test for LSTM layer
* Set different test threshold for FP16 target
* rename test to test_input_3d
Co-authored-by: Julie Bareeva <julia.bareeva@xperience.ai>
2021-08-05 18:13:17 +03:00
Alexander Alekhin
907743eee7
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-07-30 14:50:36 +00:00
Smirnov Egor
27392f832d
reimplement onnx refactor for master
2021-07-30 13:00:13 +03:00
rogday
cff0168f3a
Merge pull request #20453 from rogday:onnx_importer_fix
...
Split layer dispatch into functions in ONNXImporter
* split layer dispatch into functions
* fixes
* indentation and comment fixes
* fix constness
2021-07-28 18:06:24 +03:00
Alexander Alekhin
f4d6a3ec4e
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-07-24 15:34:58 +00:00
Julia Bareeva
4e5699fa71
Merge pull request #20450 from JulieBar:lstm_inside
...
Support non-zero hidden state for LSTM
* fully support non-zero hidden state for LSTM
* check dims of hidden state for LSTM
* fix failed test Test_Model.TextRecognition
* add new tests for LSTM w/ non-zero hidden params
Co-authored-by: Julie Bareeva <julia.bareeva@xperience.ai>
2021-07-23 17:11:50 +03:00
Smirnov Egor
024b43ca06
implement asymmetric padding for conv2d, max_pool and conv2d_backprop_input
2021-07-22 16:58:40 +03:00
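Note on asymmetric padding above: a minimal sketch of how TensorFlow-style "SAME" padding ends up asymmetric along one spatial axis, assuming that rule is the source of the asymmetry handled by this commit. The helper name `same_padding` is hypothetical.

```python
def same_padding(in_size, kernel, stride, dilation=1):
    """Return (pad_before, pad_after) for TensorFlow-style SAME padding."""
    effective_kernel = (kernel - 1) * dilation + 1
    out_size = -(-in_size // stride)          # ceil(in_size / stride)
    total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
    before = total // 2
    return before, total - before             # any odd remainder goes after
```

For a 224-wide input with a 3x3 kernel and stride 2, this gives `(0, 1)`: no padding on the leading side and one element on the trailing side, which a single symmetric pad value cannot express.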
Alexander Alekhin
b61a55eebf
Merge pull request #20402 from rogday:tf_diag_dummy
2021-07-16 15:44:29 +00:00
Smirnov Egor
c30078c5a3
add NotImplemented layer
2021-07-16 15:39:54 +03:00
Alexander Alekhin
39b91c97f0
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-07-16 10:35:42 +00:00
Alexander Alekhin
8334ee18e6
Merge pull request #20394 from SamFC10:conv-asymmetric-pads
2021-07-16 10:33:42 +00:00
SamFC10
96d35f7c54
Fix convolution asymmetric padding bug in onnx importer
2021-07-16 09:39:41 +05:30
Alexander Alekhin
fbde0c6c96
dnn(ie): fix handling of 1D and non-32F outputs of InferenceEngine
2021-07-15 21:47:05 +00:00
Alexander Alekhin
602e7c83e2
dnn(test): add extra IR models, more checks in IE testing code
2021-07-15 21:47:05 +00:00