Alexander Alekhin
c722625f28
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2020-04-28 16:53:19 +00:00
Alexander Alekhin
9181ecfc7b
cmake: fix protobuf handling
2020-04-27 02:11:19 +00:00
Alexander Alekhin
2cef100303
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2020-04-16 18:28:27 +00:00
Ilya Lavrenov
91b0100287
Fixed compilation when NN builder is not built
2020-04-14 15:05:01 +03:00
Alexander Alekhin
e661ad2a67
eliminate build warnings
2020-03-27 11:39:07 +00:00
Alexander Alekhin
b4b4d21212
eliminate build warnings
2020-03-26 19:18:09 +00:00
Alexander Alekhin
d00e58cdb0
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2020-03-10 22:49:51 +00:00
Alexander Alekhin
510a8520c7
Merge pull request #16746 from alalek:dnn_switch_ie_backend_ngraph
2020-03-10 13:52:33 +00:00
Alexander Alekhin
db95aec4a7
dnn(ie): switch to nGraph backend by default
2020-03-10 14:33:22 +03:00
Alexander Alekhin
9b3be01b83
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2020-03-09 20:27:34 +00:00
NesQl
0bcdf7d03e
Merge pull request #16724 from liqi-c:3.4-tengine
...
* Add Tengine support
* Modify printf to CV_LOG_WARNING
* a few minor fixes in the code
* Renew Tengine version
* Add header file for CV_LOG_WARNING
* Add #ifdef HAVE_TENGINE in tengine_graph_convolution.cpp
* remove trailing whitespace
* Remove trailing whitespace
* Modify for compile problem
* Modify some code style error
* remove whitespace
* Move some code style problem
* test
* add ios limit and build problem
* Modified as alalek suggested
* Add cmake 2.8 support
* modify cmake 3.5.1 problem
* test and set BUILD_ANDROID_PROJECTS OFF
* remove some compile error
* remove some extra code in tengine
* close test.
* Test again
* disable android.
* delete ndk version judgement
* Remove setenv() call and add license information
* Set Tengine default to OFF. Close test.
Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
2020-03-09 14:59:23 +00:00
Alexander Alekhin
124bf8339f
dnn(IE): use HAVE_DNN_IE_NN_BUILDER_2019 for NN Builder API code
...
- CMake option: OPENCV_DNN_IE_NN_BUILDER_2019
2020-03-03 08:07:54 +00:00
Alexander Alekhin
29d214474f
dnn(IE): use HAVE_DNN_IE_NN_BUILDER_2019 for NN Builder API code
...
- CMake option: OPENCV_DNN_IE_NN_BUILDER_2019
2020-03-03 07:45:09 +00:00
Julien
4e2ef8c8f5
Merge pull request #16218 from JulienMaille:cuda-dnn-for-older-gpus
...
Enable cuda4dnn on hardware without support for __half
* Enable cuda4dnn on hardware without support for half (i.e. compute capability < 5.3)
Update CMakeLists.txt
Lowered minimum CC to 3.0
* UPD: added ifdef on new copy kernel
* added fp16 support detection at runtime
* Clarified #if condition on atomicAdd definition
* More explicit CMake error message
2020-01-15 18:28:37 +03:00
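Note on the runtime FP16 detection in 4e2ef8c8f5: the commit message puts the `__half` cutoff at compute capability 5.3 and the lowered build minimum at 3.0. The sketch below shows only that decision, in Python for illustration. The function names are hypothetical and this is not the code from the pull request.

```python
# Illustrative only: the thresholds come from the commit message above;
# the function names are hypothetical and not part of OpenCV.
MIN_SUPPORTED_CC = (3, 0)  # "Lowered minimum CC to 3.0"
MIN_FP16_CC = (5, 3)       # half is unavailable below compute capability 5.3


def is_device_supported(major, minor):
    """True if the device meets the minimum compute capability."""
    return (major, minor) >= MIN_SUPPORTED_CC


def supports_fp16(major, minor):
    """True if the device can use the half-precision code path."""
    return (major, minor) >= MIN_FP16_CC


def pick_precision(major, minor):
    """Pick a precision for a device, or raise if the device is too old."""
    if not is_device_supported(major, minor):
        raise ValueError(f"compute capability {major}.{minor} is below 3.0")
    return "fp16" if supports_fp16(major, minor) else "fp32"
```

Comparing `(major, minor)` tuples keeps the ordering correct across major versions, which a comparison on a single float such as 5.3 would not guarantee for values like 5.10.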
Alexander Alekhin
92b9888837
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-12-12 13:02:19 +03:00
Alexander Alekhin
5ee7abbe3c
Merge pull request #16088 from alalek:dnn_eltwise_layer_different_src_channels
...
dnn(eltwise): fix handling of different number of channels
* dnn(test): reproducer for Eltwise layer issue from PR16063
* dnn(eltwise): rework support for inputs with different channels
* dnn(eltwise): get rid of finalize(), variableChannels
* dnn(eltwise): update input sorting by number of channels
- do not swap inputs if the number of channels is the same after truncation
* dnn(test): skip "shortcut" with batch size 2 on MYRIAD targets
2019-12-11 20:16:58 +03:00
Alexander Alekhin
4b0132ed7a
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-12-02 16:26:52 +03:00
Lubov Batanina
7523c777c5
Merge pull request #15537 from l-bat:ngraph
...
* Support nGraph
* Fix resize
2019-12-02 16:16:06 +03:00
Yashas Samaga B L
613c12e590
Merge pull request #14827 from YashasSamaga:cuda4dnn-csl-low
...
CUDA backend for the DNN module
* stub cuda4dnn design
* minor fixes for tests and doxygen
* add csl public api directory to module headers
* add low-level CSL components
* add high-level CSL components
* integrate csl::Tensor into backbone code
* switch to CPU iff unsupported; otherwise, fail on error
* add fully connected layer
* add softmax layer
* add activation layers
* support arbitrary rank TensorDescriptor
* pass input wrappers to `initCUDA()`
* add 1d/2d/3d-convolution
* add pooling layer
* reorganize and refactor code
* fixes for gcc, clang and doxygen; remove cxx14/17 code
* add blank_layer
* add LRN layer
* add rounding modes for pooling layer
* split tensor.hpp into tensor.hpp and tensor_ops.hpp
* add concat layer
* add scale layer
* add batch normalization layer
* split math.cu into activations.cu and math.hpp
* add eltwise layer
* add flatten layer
* add tensor transform api
* add asymmetric padding support for convolution layer
* add reshape layer
* fix rebase issues
* add permute layer
* add padding support for concat layer
* refactor and reorganize code
* add normalize layer
* optimize bias addition in scale layer
* add prior box layer
* fix and optimize normalize layer
* add asymmetric padding support for pooling layer
* add event API
* improve pooling performance for some padding scenarios
* avoid over-allocation of compute resources to kernels
* improve prior box performance
* enable layer fusion
* add const layer
* add resize layer
* add slice layer
* add padding layer
* add deconvolution layer
* fix channelwise ReLU initialization
* add vector traits
* add vectorized versions of relu, clipped_relu, power
* add vectorized concat kernels
* improve concat_with_offsets performance
* vectorize scale and bias kernels
* add support for multi-billion element tensors
* vectorize prior box kernels
* fix address alignment check
* improve bias addition performance of conv/deconv/fc layers
* restructure code for supporting multiple targets
* add DNN_TARGET_CUDA_FP64
* add DNN_TARGET_FP16
* improve vectorization
* add region layer
* improve tensor API, add dynamic ranks
1. use ManagedPtr instead of a Tensor in backend wrapper
2. add new methods to tensor classes
- size_range: computes the combined size for a given axis range
- tensor span/view can be constructed from a raw pointer and shape
3. the tensor classes can change their rank at runtime (previously rank was fixed at compile-time)
4. remove device code from tensor classes (as they are unused)
5. enforce strict conditions on tensor class APIs to improve debugging ability
* fix parametric relu activation
* add squeeze/unsqueeze tensor API
* add reorg layer
* optimize permute and enable 2d permute
* enable 1d and 2d slice
* add split layer
* add shuffle channel layer
* allow tensors of different ranks in reshape primitive
* patch SliceOp to allow Crop Layer
* allow extra shape inputs in reshape layer
* use `std::move_backward` instead of `std::move` for insert in resizable_static_array
* improve workspace management
* add spatial LRN
* add nms (cpu) to region layer
* add max pooling with argmax (and a fix to limits.hpp)
* add max unpooling layer
* rename DNN_TARGET_CUDA_FP32 to DNN_TARGET_CUDA
* update supportBackend to be more rigorous
* remove stray include preventing non-CUDA build
* include op_cuda.hpp outside condition #if
* refactoring, fixes and many optimizations
* drop DNN_TARGET_CUDA_FP64
* fix gcc errors
* increase max. tensor rank limit to six
* add Interp layer
* drop custom layers; use BackendNode
* vectorize activation kernels
* fixes for gcc
* remove wrong assertion
* fix broken assertion in unpooling primitive
* fix build errors in non-CUDA build
* completely remove workspace from public API
* fix permute layer
* enable accuracy and perf. tests for DNN_TARGET_CUDA
* add asynchronous forward
* vectorize eltwise ops
* vectorize fill kernel
* fixes for gcc
* remove CSL headers from public API
* remove csl header source group from cmake
* update min. cudnn version in cmake
* add numerically stable FP32 log1pexp
* refactor code
* add FP16 specialization to cudnn based tensor addition
* vectorize scale1 and bias1 + minor refactoring
* fix doxygen build
* fix invalid alignment assertion
* clear backend wrappers before allocateLayers
* ignore memory lock failures
* do not allocate internal blobs
* integrate NVTX
* add numerically stable half precision log1pexp
* fix indentation, following coding style, improve docs
* remove accidental modification of IE code
* Revert "add asynchronous forward"
This reverts commit 1154b9da9da07e9b52f8a81bdcea48cf31c56f70.
* [cmake] throw error for unsupported CC versions
* fix rebase issues
* add more docs, refactor code, fix bugs
* minor refactoring and fixes
* resolve warnings/errors from clang
* remove haveCUDA() checks from supportBackend()
* remove NVTX integration
* changes based on review comments
* avoid exception when no CUDA device is present
* add color code for CUDA in Net::dump
2019-10-21 14:28:00 +03:00
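Note on the "numerically stable log1pexp" items in 613c12e590: log1pexp(x) = log(1 + exp(x)) overflows when evaluated directly for large x. The sketch below shows one common stable rewrite, in Python for illustration. It is not necessarily the formulation used in the CUDA kernels of this pull request.

```python
import math


def log1pexp(x):
    """Compute log(1 + exp(x)) without overflow for large |x|."""
    # Identity: log(1 + e^x) = max(x, 0) + log(1 + e^(-|x|)).
    # The exponent passed to exp() is never positive, so exp() cannot overflow,
    # and log1p() keeps precision when e^(-|x|) is tiny.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

For comparison, the direct form `math.log(1 + math.exp(1000.0))` raises `OverflowError` in Python, while the rewrite above returns a finite value for the same input.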
Alexander Alekhin
2ad0487cec
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-08-13 18:32:29 +00:00
Tomoaki Teshima
40c71a2463
suppress noisy warning
...
* add -Wno-psabi when using GCC 6
* add -Wundef for CUDA 10
* add -Wdeprecated-declarations when using GCC 7
* add -Wstrict-aliasing and -Wtautological-compare for GCC 7
* replace cudaThreadSynchronize with cudaDeviceSynchronize
2019-08-08 21:49:32 +09:00
Yashas Samaga B L
ae279966c2
Merge pull request #14660 from YashasSamaga:dnn-cuda-build
...
add cuDNN dependency and setup build for cuda4dnn (#14660)
* update cmake for cuda4dnn
- Adds FindCUDNN
- Adds new options:
* WITH_CUDA
* OPENCV_DNN_CUDA
- Adds CUDA4DNN preprocessor symbol for the DNN module
* FIX: append EXCLUDE_CUDA instead of overwrite
* remove cuDNN dependency for user apps
* fix unused variable warning
2019-06-02 14:47:15 +03:00
Alexander Alekhin
fcb07c64f3
cmake: fix build of dnn tests with shared common code
...
- don't share .cpp files (PCH support is broken)
2019-03-31 08:52:25 +00:00
Sayed Adel
de22442046
dnn:perf add missing definition __OPENCV_TEST to fix pch
2019-03-31 03:28:33 +02:00
Lubov Batanina
7d3d6bc4e2
Merge pull request #13932 from l-bat:MyriadX_master_dldt
...
* Fix precision in tests for MyriadX
* Fix ONNX tests
* Add output range in ONNX tests
* Skip tests on Myriad OpenVINO 2018R5
* Add detect MyriadX
* Add detect MyriadX on OpenVINO R5
* Skip tests on Myriad next version of OpenVINO
* dnn(ie): VPU type from environment variable
* dnn(test): validate VPU type
* dnn(test): update DLIE test skip conditions
2019-03-29 16:42:58 +03:00
Alexander Alekhin
96c71dd3d2
dnn: reduce set of ignored warnings
2018-11-15 13:15:59 +03:00
Dmitry Kurtaev
c8f3579f93
Fix #12542 (#12603)
...
* Fix #12542
* Remove ignore of non-virtual-dtor error
2018-09-26 16:08:51 +03:00
Alexander Alekhin
29bee6f07e
cmake: move Matlab scripts to opencv_contrib (#12541)
...
* matlab: move to opencv_contrib
* cmake: preserve variables scope for processing modules
- use macro instead of function to avoid scope resets
2018-09-17 14:55:42 +03:00
Lubov Batanina
0c8590027f
Merge pull request #12071 from l-bat/l-bat:onnx_parser
...
* Add Squeezenet support in ONNX
* Add AlexNet support in ONNX
* Add Googlenet support in ONNX
* Add CaffeNet and RCNN support in ONNX
* Add VGG16 and VGG16 with batch normalization support in ONNX
* Add RCNN, ZFNet, ResNet18v1 and ResNet50v1 support in ONNX
* Add ResNet101_DUC_HDC
* Add Tiny Yolov2
* Add CNN_MNIST, MobileNetv2 and LResNet100 support in ONNX
* Add ONNX models for emotion recognition
* Add DenseNet121 support in ONNX
* Add Inception v1 support in ONNX
* Refactoring
* Fix tests
* Fix tests
* Skip unstable test
* Modify Reshape operation
2018-09-10 21:07:51 +03:00
Dmitry Kurtaev
50bceea038
Include preprocessing nodes to object detection TensorFlow networks (#12211)
...
* Include preprocessing nodes to object detection TensorFlow networks
* Enable more fusion
* faster_rcnn_resnet50_coco_2018_01_28 test
2018-08-31 15:41:56 +03:00
Maksim Shabunin
7cf52de47e
dnn: modified IE search, R2 compatibility fixed
2018-07-31 14:48:06 +03:00
Dmitry Kurtaev
28e08ae0bd
Add a sample which tests OpenVINO models
2018-07-23 19:08:51 +03:00
Alexander Alekhin
e2b5d11290
dnn: allow to use external protobuf
...
"custom layers" feature will not work properly in these builds.
2018-07-09 17:28:45 +03:00
Vadim Pisarevsky
dc27d52221
temporarily disabled OpenCL use in DNN module on Mac (#11828)
...
* temporarily disabled OpenCL use in DNN module on Mac, since some of the tests fail
* disable OpenCL in DNN on Mac at CMake level, not source level (thanks to alalek for the advice)
2018-06-26 09:35:18 +03:00
Maksim Shabunin
020ad1ac76
dnn: allow setting IE paths via command line
2018-05-22 14:40:03 +03:00
Alexander Alekhin
6c8014e7d1
cmake: disable checks for protobuf generated files
2018-03-28 18:43:28 +03:00
Dmitry Kurtaev
2f3a9ba1d4
Update OpenCVDetectInferenceEngine.cmake
2018-03-28 16:34:37 +03:00
Alexander Alekhin
1b83bc48a1
dnn: make OpenCL DNN code optional
2018-03-01 12:12:40 +03:00
luz.paz
5718d09e39
Misc. modules/ typos
...
Found via `codespell`
2018-02-12 07:09:43 -05:00
Alexander Alekhin
5a791e6e06
cmake: update reporting of excluded dispatching files (#10711)
...
* cmake: add ocv_get_smart_file_name() macro
* cmake: avoid adding files for unavailable dispatch modes
2018-02-12 14:48:20 +03:00
Dmitry Kurtaev
10e1de74d2
Intel Inference Engine deep learning backend (#10608)
...
* Intel Inference Engine deep learning backend.
* OpenFace network using Inference Engine backend
2018-02-06 11:57:35 +03:00
Alexander Alekhin
3d6659112f
cmake: fix includes processing
2018-02-02 21:52:54 +03:00
Maksim Shabunin
e56d6054aa
Do not build protobuf without dnn (#10689)
...
* Do not build protobuf if dnn is disabled
* Added BUILD_LIST cmake option to the cache
* Moved protobuf to the top level
* Fixed static build
* Fixed world build
* fixup! Fixed world build
2018-02-01 16:30:23 +03:00
Alexander Alekhin
4d84999452
dnn: protobuf build warnings
2018-01-15 21:15:23 +00:00
Alexander Alekhin
6674a024fc
dnn: add OPENCV_DNN_DISABLE_MEMORY_OPTIMIZATIONS runtime option
...
replaces REUSE_DNN_MEMORY compile-time option
2018-01-07 18:38:14 +00:00
Alexander Alekhin
7d67d60fb1
cmake(opt): AVX512_SKX
2017-12-29 07:18:11 +00:00
Alexander Alekhin
8e7af7f089
Merge pull request #10456 from dkurt:dnn_allocate_mem_for_optimized_concat
2017-12-28 16:04:51 +00:00
Alexander Alekhin
898ca38257
cmake: AVX512 -> AVX_512F
2017-12-28 15:20:27 +00:00
Dmitry Kurtaev
a9807d8f54
Allocate new memory for optimized concat to prevent collisions.
...
Add a flag to disable memory reusing in dnn module.
2017-12-28 16:45:53 +03:00
Arjan van de Ven
2938860b3f
Provide a few AVX512 optimized functions for the DNN module
...
This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.
AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2017-12-26 16:00:17 +00:00