opencv

mirror of https://github.com/opencv/opencv.git synced 2024-12-16 02:19:12 +08:00

Author	SHA1	Message	Date
Dmitry Kurtaev	3700f9e1e9	Merge pull request #25709 from dkurt:wrap_addLayer * Wrap dnn addLayer * Add typing stubs	2024-06-07 20:39:44 +03:00
Kumataro	1bd5ca1ebe	Merge pull request #25686 from Kumataro:fix25674 Suppress build warnings for GCC14 #25686 Close #25674 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-06-02 14:14:04 +03:00
CNOCycle	98b8825031	Merge pull request #25613 from CNOCycle:tflite/ops Support Global_Pool_2D ops in .tflite model #25613 ### Pull Request Readiness Checklist Merge with extra: https://github.com/opencv/opencv_extra/pull/1180 This PR adds support for `GlobalAveragePooling2D` and `GlobalMaxPool2D` on the TFLite backend. When the `keep_dims` option is enabled, the output is a 2D tensor, necessitating the inclusion of an additional flatten layer. Additionally, the names of these layers have been updated to match the output tensor names generated by `generate.py` from the opencv_extra repository. - [X] I agree to contribute to the project under Apache 2 License. - [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [X] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [X] The feature is well documented and sample code can be built with the project CMake	2024-05-31 19:31:21 +03:00
Abduragim Shtanchaev	d7f04a9d33	Merge pull request #25660 from Abdurrahheem:ash/fix-slice-empty-input Slice layer parser fix to support empty input case #25660 This PR fixes the Slice layer's parser to handle empty input cases (cases with an initializer). It fixes the issue raised in #24838 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-05-31 13:13:36 +03:00
Danial Javady	05e48605a0	Merge pull request #25412 from ZelboK:update-cudnn-to-9 Refactor DNN module to build with cudnn 9 #25412 Many of the APIs currently used in the dnn module have been removed in cudnn 9. They were deprecated in 8. This PR updates that code to the newer API. Some key notes: 1) This is my first PR. I am new to OpenCV. 2) `opencv_test_core` tests pass 3) On a 3080, CUDA 12.4 (should be irrelevant since I didn't build the `opencv_modules`), gcc 11.4, WSL 2. 4) For brevity I will avoid including macro code that will allow for older versions of cudnn to build. I was unable to get the tests working for `opencv_test_dnn` and `opencv_perf_dnn`. The errors I get are of the following form: ``` OpenCV tests: Can't find required data file: dnn/onnx/conformance/node/test_reduce_prod_default_axes_keepdims_example/model.onnx in function 'findData' " thrown in the test body. ``` So before I spend more time investigating, I was hoping to get a maintainer to point me in the right direction here. I would like to run these tests and confirm things are working as intended. I may have missed some details. ### Pull Request Readiness Checklist relevant issue (https://github.com/opencv/opencv/issues/24983) - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-05-28 09:54:08 +03:00
Alexander Smorkalov	0b39a51be8	pre: OpenCV 4.10.0 (version++).	2024-05-21 11:37:05 +03:00
Alexander Smorkalov	5f509e2ec1	Skip Test_Caffe_layers.Concat with Vulkan due to sporadic failures.	2024-05-17 11:54:25 +03:00
Yuantao Feng	bc0618b688	Merge pull request #25582 from fengyuentau:dnn/dump_pbtxt The current net exporters `dump` and `dumpToFile` export the network structure (and its params) to a .dot file that works with `graphviz`. This is hard to use and unfriendly to new users. What's worse, the resulting picture does not look good. dnn: better net exporter that works with netron #25582 This PR introduces a new exporter, `dumpToPbtxt`, and uses it by default with the environment variable `OPENCV_DNN_NETWORK_DUMP`. It mimics the string output of an ONNX model, with dnn-specific modifications; see below for an example. ![image](https://github.com/opencv/opencv/assets/17219438/0644bed1-da71-4019-8466-88390698e4df) ## Usage Call `cv::dnn::Net::dumpToPbtxt`: ```cpp TEST(DumpNet, dumpToPbtxt) { std::string path = "/path/to/model.onnx"; auto net = readNet(path); Mat input(std::vector<int>{1, 3, 640, 480}, CV_32F); net.setInput(input); net.dumpToPbtxt("yunet.pbtxt"); } ``` Set `export OPENCV_DNN_NETWORK_DUMP=1` ```cpp TEST(DumpNet, env) { std::string path = "/path/to/model.onnx"; auto net = readNet(path); Mat input(std::vector<int>{1, 3, 640, 480}, CV_32F); net.setInput(input); net.forward(); } ``` --- Note: - `pbtxt` is registered as one of the ONNX model suffixes in netron, so you can see `module: ai.onnx` and such in the model. - We can get the string output of an ONNX model with the following script ```python import onnx net = onnx.load("/path/to/model.onnx") net_str = str(net) file = open("/path/to/model.pbtxt", "w") file.write(net_str) file.close() ``` ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-05-17 11:07:05 +03:00
Alexander Smorkalov	78ed6de518	Merge pull request #25594 from LaurentBerger:I25587 typo	2024-05-16 08:46:56 +03:00
CNOCycle	7713c84465	Merge pull request #25297 from CNOCycle:tflite/transpose Support Transpose op in TFlite #25297 Merge with extra: https://github.com/opencv/opencv_extra/pull/1168 The purpose of this PR is to introduce support for the Transpose op in TFlite format and to add a shape comparison between the output tensors and the references. Occasionally, the shape of the output tensor is `[1,4,1,1]`, while the shape of the reference tensor is `[1,4]`. Consequently, the norm check incorrectly reports that the test has passed, as the residual is zero. Below is a Python script for generating testing data. The generated data can be integrated into the repo `opencv_extra`. ```python import numpy as np import tensorflow as tf PREFIX_TFL = '/path/to/opencv_extra/testdata/dnn/tflite/' def generator(input_tensor, model, saved_name): # convert keras model to .tflite format converter = tf.lite.TFLiteConverter.from_keras_model(model) #converter.optimizations = [tf.lite.Optimize.DEFAULT] converter.optimizations = [None] tflite_model = converter.convert() with open(f'{PREFIX_TFL}/{saved_name}.tflite', 'wb') as f: f.write(tflite_model) # save the input tensor to .npy if input_tensor.ndim == 4: opencv_tensor = np.transpose(input_tensor, (0,3,1,2)) else: opencv_tensor = input_tensor opencv_tensor = np.copy(opencv_tensor, order='C').astype(np.float32) np.save(f'{PREFIX_TFL}/{saved_name}_inp.npy', opencv_tensor) # generate output tensor and save it to .npy mat_out = model(input_tensor).numpy() mat_out = np.copy(mat_out, order='C').astype(np.float32) if mat_out.ndim == 4: mat_out = np.transpose(mat_out, (0,3,1,2)) interpreter = tf.lite.Interpreter(model_content=tflite_model) out_name = interpreter.get_output_details()[0]['name'] np.save(f'{PREFIX_TFL}/{saved_name}_out_{out_name}.npy', mat_out) def build_transpose(): model_name = "keras_permute" mat_in = np.array([[[1,2,3], [4,5,6]]], dtype=np.float32) model = tf.keras.Sequential() model.add(tf.keras.Input(shape=(2,3))) model.add(tf.keras.layers.Permute((2,1))) model.summary() generator(mat_in, model, model_name) if __name__ == '__main__': build_transpose() ``` ### Pull Request Readiness Checklist - [x] I agree to contribute to the project under Apache 2 License. - [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [X] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [X] The feature is well documented and sample code can be built with the project CMake	2024-05-15 20:07:25 +03:00
unknown	5009109167	typo	2024-05-15 16:16:07 +02:00
Laurent Berger	76d9f7aaeb	Merge pull request #25591 from LaurentBerger:I25589 Remove dnn::layer::allocate in doc #25591 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work #25589 - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-05-15 17:08:52 +03:00
alexlyulkov	03507e06b4	Merge pull request #25518 from alexlyulkov:al/fixed-gemm-openvino Fixed OpenVINO gemm layer #25518 Fixed OpenVINO gemm layer The problem was that our layer didn't properly handle all the possible gemm options in OpenVINO mode Fixes #25472 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-05-14 17:41:19 +03:00
Alexander Smorkalov	d8e18f4576	Made fcn-resnet50-12.onnx model optional.	2024-05-03 16:14:22 +03:00
Alexander Smorkalov	ac9a858377	Merge pull request #25524 from alexlyulkov:al/openvino-layers Added more OpenVINO layers to dnn	2024-05-03 13:16:56 +03:00
Wanli	ed47cce1c5	change fcn8s-heavy-pascal tests from caffe to onnx	2024-05-03 00:15:09 +08:00
Alexander Lyulkov	f3f29fa62c	Added more OpenVINO layers to dnn	2024-05-02 14:37:40 +03:00
alexlyulkov	f9dd20eb07	Merge pull request #25414 from alexlyulkov:al/range-fixed Fixed ONNX range layer #25414 Partially address https://github.com/opencv/opencv/issues/25363 Fixed ONNX range layer. It should support any input type. Added tests (extra [PR](https://github.com/opencv/opencv_extra/pull/1170)) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-04-17 09:38:21 +03:00
Alexander Smorkalov	ecbfc1bfd8	Merge pull request #25395 from susumu-iino:fix-dnn-plugin-build-win32 Fix dnn plugin build win32	2024-04-12 11:05:34 +03:00
Yuantao Feng	197626a5bf	Merge pull request #25387 from fengyuentau:complete-float16_t-renaming Rename remaining float16_t for future proof #25387 Resolves comment: https://github.com/opencv/opencv/pull/25217#discussion_r1547733187. `std::float16_t` and `std::bfloat16_t` are introduced since c++23: https://en.cppreference.com/w/cpp/types/floating-point. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-04-11 14:02:44 +03:00
Alexander Smorkalov	e4677fbf64	Merge pull request #25361 from hanliutong:rvv-f32 Further optimize fastDepthwiseConv for RISC-V Vector.	2024-04-09 16:04:02 +03:00
ecchen	e63690a2d9	Add a shape checker for tflite models	2024-04-08 13:28:05 +00:00
Susumu IINO	a0b28f8b06	Add Definition "_USE_MATH_DEFINES" for dnn plugin on Win32 build	2024-04-07 21:08:09 +09:00
Liutong HAN	5be158a2b6	Further optimize fastDepthwiseConv for RVV.	2024-04-07 11:34:41 +08:00
Yuantao Feng	55d7e3f8cc	Merge pull request #1165 from fengyuentau:gold_yolo [BugFix] dnn (ONNX): Force dropping constant inputs in parseClip if they are shared #25319 Resolves https://github.com/opencv/opencv/issues/25278 Merge with https://github.com/opencv/opencv_extra/pull/1165 In Gold-YOLO, `Div` has a constant input `B=6` which is then parsed into a `Const` layer in the ONNX importer, but `Clip` also has the shared constant input `max=6`, which is already a `Const` layer and is then connected to the `Elementwise` layer. This should not happen because in the `forward()` of the `Elementwise` layer, the legacy code goes through each input and applies the activation to it. More details at https://github.com/opencv/opencv/issues/25278#issuecomment-2032199630. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-04-03 15:56:59 +03:00
Dmitry Kurtaev	13c95efa74	Merge pull request #25312 from dkurt:dnn_hotfix_tflite Ownership check in TFLite importer #25312 ### Pull Request Readiness Checklist resolves https://github.com/opencv/opencv/issues/25310 See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-04-03 09:41:40 +03:00
HAN Liutong	eba158fb0c	Merge pull request #25230 from hanliutong:rvv-conv Optimize int8 layers in DNN modules by using RISC-V Vector intrinsic. #25230 This patch optimizes 3 functions in the int8 layer by using RVV Native Intrinsic. This patch was tested on QEMU using VLEN=128 and VLEN=256 on `./bin/opencv_test_dnn --gtest_filter="Int8"`. On the real device (k230, VLEN=128), `EfficientDet_int8` in `opencv_perf_dnn` showed a performance improvement of 1.46x. \| Name of Test \| Original \| Optimized \| Speed-up \| \| ------------------------------------------ \| -------- \| ---------- \| -------- \| \| EfficientDet_int8::DNNTestNetwork::OCV/CPU \| 2843.467 \| 1947.013 \| 1.46 \| ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-03-31 16:47:06 +03:00
Yuantao Feng	b758897c29	Merge pull request #25271 from fengyuentau:matmul_bias Merge with https://github.com/opencv/opencv_extra/pull/1158 Todo: - [x] Fix Attention pattern recognition. - [x] Handle other backends. Benchmark: "VIT_B_32 OCV/CPU", M1, results in milliseconds. \| Model \| 4.x \| This PR \| \| - \| - \| - \| \| VIT_B_32 OCV/CPU \| 87.66 \| 83.83 \| ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-03-29 17:35:23 +03:00
Alexander Smorkalov	9fc4b61074	Merge pull request #25291 from dkurt:einsum_openvino Einsum OpenVINO backend	2024-03-29 15:54:26 +03:00
Dmitry Kurtaev	cfa42e4338	Einsum OpenVINO backend	2024-03-29 14:29:45 +03:00
Dmitry Kurtaev	01dc010436	Merge pull request #25273 from dkurt:tflite_new_layers TFLite new layers #25273 ### Pull Request Readiness Checklist resolves https://github.com/opencv/opencv/issues/25272, https://github.com/opencv/opencv/issues/24965 Merge with extra: https://github.com/opencv/opencv_extra/pull/1160 See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-03-29 11:21:13 +03:00
Yuantao Feng	accf200408	Merge pull request #25238 from fengyuentau:optimized_const dnn: avoid const layer forwarding in layer norm layer and attention layer #25238 While profiling ViTs with dnn, I found that `ConstLayer` can take a noticeable proportion of the inference time, which is weird. This comes from the data copy during the inference of `ConstLayer`. We may be able to make the data copy more efficient, but the easiest and most convenient way is to avoid `ConstLayer`. This PR changes how we handle constants in the layer normalization and attention layers: they are stored in the layer blobs instead of being turned into constant layers. Checklists: - [x] Backend compatibility in layer normalization layer. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-03-26 15:09:51 +03:00
Alexander Smorkalov	fc34554475	Merge pull request #25184 from dkurt:avoid_extra_memset Avoid extra memset	2024-03-25 13:07:49 +03:00
Yuantao Feng	025e7602b9	Merge pull request #25166 from fengyuentau:fix_cann_gemm dnn (CANN): Fix incorrect shape of 1d bias in Gemm #25166 Gemm layer was refactored some time ago. Users found that the mobilenet example in https://github.com/opencv/opencv/wiki/Huawei-CANN-Backend does not work because of incorrect shape set for 1d bias in Gemm. This PR resolves this issue. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-03-25 09:47:28 +03:00
Dmitry Kurtaev	0b6c9a2123	Merge pull request #25181 from dkurt:release_conv_weights Release convolution weightsMat after usage #25181 ### Pull Request Readiness Checklist related (but not resolved): https://github.com/opencv/opencv/issues/24134 Minor memory footprint improvement. Also, adds a test for VmHWM. RAM top memory usage (-230MB) \| YOLOv3 (237MB file) \| 4.x \| PR \| \|---------------------\|---------\|---------\| \| no winograd \| 808 MB \| 581 MB \| \| winograd \| 1985 MB \| 1750 MB \| See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-03-25 09:03:28 +03:00
Oleg Pipikin	6da2ddcf0e	Fix for OpenVINO 2024.0 Remove support OpenVINO lower than 2022.1 release Remove legacy InferenceEngine wrappers	2024-03-18 15:05:50 +04:00
Dmitry Kurtaev	6a370ba9e7	Avoid extra memset in convolution initialization	2024-03-08 10:46:07 +03:00
Dmitry Kurtaev	98aed21dd4	Avoid copy of ONNX graph during import	2024-03-05 18:22:46 +03:00
Alexander Smorkalov	daa8f7dfc6	Partially back-port #25075 to 4.x	2024-03-05 12:15:39 +03:00
Laurent Berger	5fe3933346	Merge pull request #25120 from LaurentBerger:I25103 Fixed ReduceMean layer behaviour #25120 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake `a93c31e3c9/onnxruntime/core/providers/cpu/reduction/reduction_ops.cc (L433-L443)`	2024-03-04 09:36:53 +03:00
CSBVision	e8582f2cf8	Update net_impl.cpp See issue #25112	2024-03-01 14:56:00 +01:00
Yuantao Feng	5aa5c39210	Merge pull request #25076 from fengyuentau:improve_attention dnn: try improving performance of Attention layer #25076 Checklist: - [x] Use `Mat` over `Mat::zeros` for temporary buffer in forward - [x] Use layer internal buffer over temporary Mat buffer - [x] Try a single fastGemmBatch on the Q/K/V calculation Performance: Performance test case is `Layer_Attention.VisionTransformer/0`, which has input of shape {1, 197, 768}, weight of shape {768, 2304} and bias {2304}. Data is in millisecond. \| \| macOS 14.2.1, Apple M1 \| Ubuntu 22.04.2, Intel i7 12700K \| \| - \| - \| - \| \| Current \| 10.96 \| 1.58 \| \| w/ Mat \| 6.27 \| 1.41 \| \| w/ Internals \| 5.87 \| 1.38 \| \| w/ fastGemmBatch \| 6.12 \| 2.14 \| ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-02-28 16:47:08 +03:00
Laurent Berger	3c712cf77d	Merge pull request #25100 from LaurentBerger:I25077 Fix issue #25077 #25100 Fixes https://github.com/opencv/opencv/issues/25077 ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake	2024-02-27 14:15:11 +03:00
Dhanwanth1803	12aa0fe898	Merge pull request #24985 from Dhanwanth1803:hardswish Fixes #24974 support HardSwishInt8 #24985 As given very clearly in the issue #24974 I made the required 2 changes to implement HardSwish Layer in INT8. Requesting comments. resolves https://github.com/opencv/opencv/issues/24974 - [X] I agree to contribute to the project under Apache 2 License. - [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [X] The PR is proposed to the proper branch - [X] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake Co-authored-by: Dhanwanth1803 <dhanwanthvarala@gmail,com>	2024-02-16 18:19:29 +03:00
Alexander Smorkalov	4b35b2f968	Merge pull request #24973 from asmorkalov:as/fix_weigths_proto_mess Fix proto and weights mess in dnn performance tests	2024-02-07 11:10:32 +03:00
Alexander Smorkalov	77af137285	Fix proto and weights mess in dnn performance tests.	2024-02-07 09:16:09 +03:00
fengyuentau	fcaa8ce3c2	fix incorrect steps and elemsize when dtype changes	2024-02-06 16:27:25 +08:00
Haosonn	87f749277d	Merge pull request #24768 from Haosonn:pre-pr-2 Vulkan backend for NaryEltwiseLayer in DNN module #24768 We improve Vulkan backend for ``NaryEltwiseLayer`` in DNN module by: - add a basic framework for Vulkan backend in ``NaryEltwiseLayer`` - add a compute shader for binary forwarding (an imitation of what has been done in native OpenCV backend including broadcasting and eltwise-operation) - typo fixed: - Wrong info output in ``context.cpp`` Currently, our implementation (or all layers supporting Vulkan backend) runs pretty slow on discrete GPUs basically due to IO cost in function ``copyToHost``, and we are going to fix that by - find out the best ``VkMemoryProperty`` for various discrete GPUs - prevent ``copyToHost`` in middle layers during forwarding, (i.e keep data in GPU memory) ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake Co-authored-by: IskXCr <IskXCr@outlook.com>	2024-01-29 18:41:49 +03:00
Alexander Alekhin	efc9837df1	Merge pull request #24892 from opencv-pushbot:gitee/alalek/dnn_avoid_16s_usage DNN: avoid CV_16S usage for FP16 #24892 Merge after: #24918 TODO: - [x] measure performance changes - [x] optimize convertTo for OpenCL: #24918 12700K iGPU:

|Name of Test|0|1|1 vs 0 (x-factor)|
|---|:-:|:-:|:-:|
|AlexNet::DNNTestNetwork::OCV/OCL_FP16|7.441|7.480|0.99|
|CRNN::DNNTestNetwork::OCV/OCL_FP16|10.776|10.736|1.00|
|DenseNet_121::DNNTestNetwork::OCV/OCL_FP16|52.762|52.833|1.00|
|EAST_text_detection::DNNTestNetwork::OCV/OCL_FP16|60.694|60.721|1.00|
|EfficientNet::DNNTestNetwork::OCV/OCL_FP16|33.373|33.173|1.01|
|FastNeuralStyle_eccv16::DNNTestNetwork::OCV/OCL_FP16|81.840|81.724|1.00|
|GoogLeNet::DNNTestNetwork::OCV/OCL_FP16|20.965|20.927|1.00|
|Inception_5h::DNNTestNetwork::OCV/OCL_FP16|22.204|22.173|1.00|
|Inception_v2_SSD_TensorFlow::DNNTestNetwork::OCV/OCL_FP16|47.115|47.460|0.99|
|MPHand::DNNTestNetwork::OCV/OCL_FP16|6.760|6.670|1.01|
|MPPalm::DNNTestNetwork::OCV/OCL_FP16|10.188|10.171|1.00|
|MPPose::DNNTestNetwork::OCV/OCL_FP16|12.510|12.561|1.00|
|MobileNet_SSD_Caffe::DNNTestNetwork::OCV/OCL_FP16|17.290|17.072|1.01|
|MobileNet_SSD_v1_TensorFlow::DNNTestNetwork::OCV/OCL_FP16|19.473|19.306|1.01|
|MobileNet_SSD_v2_TensorFlow::DNNTestNetwork::OCV/OCL_FP16|22.874|23.404|0.98|
|OpenFace::DNNTestNetwork::OCV/OCL_FP16|9.568|9.517|1.01|
|OpenPose_pose_mpi_faster_4_stages::DNNTestNetwork::OCV/OCL_FP16|539.899|539.845|1.00|
|PPHumanSeg::DNNTestNetwork::OCV/OCL_FP16|18.015|18.769|0.96|
|PPOCRv3::DNNTestNetwork::OCV/OCL_FP16|63.122|63.540|0.99|
|ResNet_50::DNNTestNetwork::OCV/OCL_FP16|34.947|34.925|1.00|
|SFace::DNNTestNetwork::OCV/OCL_FP16|10.249|10.206|1.00|
|SSD::DNNTestNetwork::OCV/OCL_FP16|213.068|213.108|1.00|
|SqueezeNet_v1_1::DNNTestNetwork::OCV/OCL_FP16|4.867|4.878|1.00|
|VIT_B_32::DNNTestNetwork::OCV/OCL_FP16|200.563|190.788|1.05|
|VitTrack::DNNTestNetwork::OCV/OCL_FP16|7.528|7.173|1.05|
|YOLOX::DNNTestNetwork::OCV/OCL_FP16|132.858|132.701|1.00|
|YOLOv3::DNNTestNetwork::OCV/OCL_FP16|209.559|208.809|1.00|
|YOLOv4::DNNTestNetwork::OCV/OCL_FP16|221.357|220.924|1.00|
|YOLOv4_tiny::DNNTestNetwork::OCV/OCL_FP16|24.446|24.382|1.00|
|YOLOv5::DNNTestNetwork::OCV/OCL_FP16|43.922|44.080|1.00|
|YOLOv8::DNNTestNetwork::OCV/OCL_FP16|64.159|63.842|1.00|
|YuNet::DNNTestNetwork::OCV/OCL_FP16|10.177|10.231|0.99|
|opencv_face_detector::DNNTestNetwork::OCV/OCL_FP16|15.121|15.445|0.98|

Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>	2024-01-26 16:34:17 +03:00
Yuantao Feng	37156a4719	Merge pull request #24925 from fengyuentau:loongarch_handle_warnings Handle warnings in loongson-related code #24925 See https://github.com/fengyuentau/opencv/actions/runs/7665377694/job/20891162958#step:14:16 Warnings need to be handled before we add the loongson server to our CI. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake	2024-01-26 13:38:00 +03:00

2285 Commits